I have a PCIe 3.0 x4 NVMe SSD hooked up to a compatible motherboard. The entire OS, including user files, uses 35.5 GB out of the 498.4 GB total.
I started noticing problems when I had a C++ program generating several thousand files. At first, each batch of 1000 would go smoothly. Over time, 1000 files would get deleted and another 1000 created, and this happened at least 10 times. Eventually, copying and pasting a batch of 1000 files would take anywhere from several seconds to half a minute, versus the few seconds at most it took at the start.
I then needed to restore something from the trash, which probably held over 10,000 files. It took about a minute for the trash to load, and then another 10 to 30 seconds to select an item and restore it. I then emptied the trash to see if things would get faster, and they did, although the emptying itself took over a minute. I can now copy and paste 1000 files and have them load in under a second. What is going on?
If I am going to be dealing with large quantities of files like this, how can I keep the Files app from slowing down? My guess is to automate deleting old items once the trash bin exceeds a certain number of items, or something like that. Any better ideas? Is there some setting I can use to prevent the slowdown?
Files in the trash occupy space on the file system. Failure to empty the trash will over time fill the file system, cause fragmentation, and in general slow things down.
You should make it a practice to always empty the trash, and if you are doing tasks that frequently send that many files to the trash, then empty the trash frequently.
You can easily write a script to empty the trash, though I do not know offhand if there is already an app for it.
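On a GNOME desktop the `gio` utility can already do this from a terminal (`gio trash --empty`). If you would rather build it into your own tooling, here is a minimal C++17 sketch that assumes the standard trash location under ~/.local/share/Trash; treat it as a starting point rather than a polished tool:

```cpp
// Minimal trash-emptying sketch, assuming the standard XDG layout
// (~/.local/share/Trash with files/ and info/ subdirectories).
#include <cstdlib>
#include <filesystem>
#include <iostream>
#include <system_error>

namespace fs = std::filesystem;

int main() {
    const char* home = std::getenv("HOME");
    if (!home) return 1;

    const fs::path trash = fs::path(home) / ".local/share/Trash";

    // Delete the contents of files/ and info/, but keep the directories themselves.
    for (const char* sub : {"files", "info"}) {
        std::error_code ec;
        for (const auto& entry : fs::directory_iterator(trash / sub, ec)) {
            fs::remove_all(entry.path(), ec);  // ignore failures on individual entries
        }
    }
    std::cout << "Emptied " << trash << '\n';
}
```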
The disk wasn’t close to full though. It is also an SSD, and "unlike hard drives, SSDs have no moving parts. The data is read directly from the drive’s non-volatile memory cells into RAM, and that process is just as fast for a fragmented file as it is for a contiguous file. Therefore, no defragging is required."
I did not say ‘defragging is required’. I said that the drive getting full leads to fragmentation, and that statement applies to the great majority of machines.
There are two parts to filling up a drive. One is the amount of data actually stored. The other is the number of inodes actually used. A file can hold a single bit of data and it still consumes a whole inode, even though the data does not come remotely close to filling it. A lot of tiny files means a lot of inodes used without a lot of data stored.
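As a rough illustration (my sketch, not anything from your setup): on Linux, statvfs() reports both kinds of usage, with f_blocks/f_bfree for data blocks and f_files/f_ffree for inodes. Note that Btrfs creates inodes on demand, so it may report zero for the inode totals:

```cpp
// Sketch: report data usage vs. inode usage for a mounted file system.
// f_blocks/f_bfree count data blocks; f_files/f_ffree count inodes.
#include <sys/statvfs.h>
#include <cstdio>

int main(int argc, char** argv) {
    const char* path = (argc > 1) ? argv[1] : "/";
    struct statvfs vfs;
    if (statvfs(path, &vfs) != 0) {
        std::perror("statvfs");
        return 1;
    }
    double used_gib = (double)vfs.f_frsize * (vfs.f_blocks - vfs.f_bfree) / (1 << 30);
    std::printf("%s: %.1f GiB of data used, %llu of %llu inodes used\n",
                path, used_gib,
                (unsigned long long)(vfs.f_files - vfs.f_ffree),
                (unsigned long long)vfs.f_files);
    // Btrfs allocates inodes dynamically, so it typically reports 0 total inodes here.
    return 0;
}
```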
While I cannot point to the actual cause, if you look at ~/.local/share/Trash you will find 3 subdirectories that hold the actual ‘trash’, with at least two entries for each trashed file (the file itself under files/ and a .trashinfo record under info/). That means, for example, that if there are 10,000 files in the trash you have used a minimum of 20,000 inodes just for the trash.
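If you want to see that overhead for yourself, a quick sketch that just counts the entries under the trash (again assuming the standard location) looks like this:

```cpp
// Sketch: count the directory entries (and therefore inodes) the trash is holding,
// assuming the standard ~/.local/share/Trash location.
#include <cstdint>
#include <cstdlib>
#include <filesystem>
#include <iostream>
#include <system_error>

namespace fs = std::filesystem;

int main() {
    const char* home = std::getenv("HOME");
    if (!home) return 1;

    std::uintmax_t entries = 0;
    std::error_code ec;
    for (const auto& entry :
         fs::recursive_directory_iterator(fs::path(home) / ".local/share/Trash", ec)) {
        (void)entry;  // only counting, not inspecting
        ++entries;
    }
    std::cout << "Trash currently holds " << entries << " entries\n";
}
```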
Accounting for the files and for the info needed to potentially restore them takes overhead and memory for management. If multiple runs push files with the same names and location into the trash, the management overhead multiplies, because the trash now has to track multiple copies of the ‘same’ file.
You stated that emptying the trash restored the normal speed seen before the trash filled up. Ergo, the overhead needed to manage a large quantity of files in the trash seems to be a major factor in the slowdown.
I think you’ll generally find that if a directory holds a large number of files, the time to open a file in that directory grows, because the lookup is effectively a linear search. The trash is just such a directory.
If you are looking to create a large number of files, consider doing what a number of caching applications do. Suppose lotsoffiles is the directory where you’re going to keep these files. Instead of saving file ‘abc123456’ as lotsoffiles/abc123456, save it as lotsoffiles/a/b/abc123456, using the first two letters to create a ‘hash bucket’ for that file. There is a cost to this method: checking for a subdirectory ‘a’ in the top-level lotsoffiles, and for a sub-subdirectory ‘b’. But now, when you later want to read that file, you won’t be forcing the system to do a linear search of the top level; instead (presuming your file names are spread somewhat uniformly over the alphabet) the system will be searching a much smaller directory.
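Since the files are already being generated from a C++ program, here is a small C++17 sketch of that layout; the names bucketed_path and lotsoffiles are placeholders I made up for illustration:

```cpp
// Two-level "hash bucket" layout: lotsoffiles/a/b/abc123456 instead of
// lotsoffiles/abc123456, bucketing on the first two characters of the name.
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

fs::path bucketed_path(const fs::path& root, const std::string& name) {
    fs::path dir = root / name.substr(0, 1) / name.substr(1, 1);
    fs::create_directories(dir);   // cheap no-op once the bucket already exists
    return dir / name;
}

int main() {
    const fs::path root = "lotsoffiles";
    for (int i = 0; i < 1000; ++i) {
        std::string name = "abc" + std::to_string(100000 + i);
        std::ofstream(bucketed_path(root, name)) << "payload\n";
    }
}
```

With roughly uniform names, each bucket ends up holding only a small fraction of the total, so later lookups (whether from your program or from the Files app) only ever scan a small directory.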
You may see samples of this method in use in your home directory if there’s a .cache or .ccache directory there.
I am wiping the Fedora drive due to leftovers from Eclipse installs, and will reformat to ext3 or ext4 to see if it helps at all. Current install uses Btrfs.