So, I have recently learned that when you copy files from your SSD to a USB drive, the “Copying finished” message doesn’t actually mean the process is done. Apparently, there is some unwritten cache that needs to finish flushing in the background. Therefore, instead of physically pulling out the USB drive, one should use the “Eject” button and wait for the drive to disappear from the list of drives.
Today, I had to copy a small assortment of PDF files, around 700 MiB in total. I copied them normally and clicked on “Eject” next to the USB drive name. The “waiting” icon persisted for a few seconds, and then the drive disappeared. Upon plugging it back in, I realized that only 100 MiB of data had been written. I tried again, but this time I waited a full minute before using the “Eject” button. It copied the full 700 MiB. However, some files gave an error when I tried to open them.
I tried one more time by waiting a few extra minutes on top of the previous minute. Just like last time, I still had to wait an additional minute after clicking “Eject”.
Fortunately, after mounting the drive again and opening every single file one by one, I confirmed that the process had finally finished successfully.
It is not the waiting that is problematic, but the system straight up “misinforming” me twice about something that had not taken place: once during copying, and once during ejection.
Will I have to check each file one by one to make sure? There has to be a better way, and, more importantly, why is the default setup so notoriously unreliable for such a basic function?
People seem to use “hdparm” and “udev” rules to disable the write cache for specific USB devices. I am not a Linux user, but disabling the write cache for “affected” devices is probably the most sensible option. There also seems to be a mount option “sync”!
Nowadays, programs like file managers make use of “smart” copy routines from the operating system. These handle all the caching, trigger “server side copy” and so on. It makes sense to use them most of the time, but with a slow device, a large amount of system memory and the write cache enabled, the result is a somewhat bad experience.
Otherwise, yes, run “sync” as already pointed out. Maybe you can create a hotkey to run it; just make sure you can see when it’s done.
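As a rough illustration of the “run sync and see when it’s done” idea, here is a minimal Python sketch. It assumes a Linux system where `/proc/meminfo` exposes the `Dirty` and `Writeback` fields, and it relies on `os.sync()` blocking until the kernel has handed the data to the device, which is the Linux behaviour but is not guaranteed elsewhere. The function names are made up for this example.

```python
import os
import time


def parse_pending_kib(meminfo_text):
    """Return the sum of the Dirty and Writeback fields, in KiB."""
    pending = 0
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("Dirty", "Writeback"):
            pending += int(rest.split()[0])
    return pending


def flush_and_report(meminfo_path="/proc/meminfo"):
    """Show how much data is waiting to be written, then flush it."""
    with open(meminfo_path) as f:
        before = parse_pending_kib(f.read())
    print(f"Pending write-back before sync: {before} KiB")
    start = time.monotonic()
    os.sync()  # assumption: blocks until write-out completes (Linux behaviour)
    print(f"sync returned after {time.monotonic() - start:.1f} s")
```

Calling `flush_and_report()` after a copy and waiting for the second message gives at least a visible “done” signal. It still cannot detect a drive that acknowledges data it has not yet committed to flash.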
If Dolphin allowed running commands against both sides of the split view pane, as was already possible with any basic file manager in the late 80s, you could run a script from there that double-checks the MD5 checksums of all the files on the left (source) and right (destination). Since Dolphin can’t do this, you need to run that script from a terminal and copy and paste your source and destination paths manually.
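Such a checksum script could look roughly like the sketch below. It is only an illustration: the function names are invented here, and it assumes both paths are ordinary directories. Note that reading the destination straight after copying may be served from the page cache rather than the drive, so the check is only meaningful after ejecting and re-mounting the USB drive.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source, destination):
    """Return (relative path, problem) pairs for files missing or different on the destination."""
    source, destination = Path(source), Path(destination)
    problems = []
    for src_file in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src_file.relative_to(source)
        dst_file = destination / rel
        if not dst_file.is_file():
            problems.append((str(rel), "missing"))
        elif file_md5(src_file) != file_md5(dst_file):
            problems.append((str(rel), "checksum mismatch"))
    return problems
```

An empty list from `compare_trees(source_dir, usb_dir)` means every source file exists on the destination with identical content. Files that exist only on the destination are ignored.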
The problem is usually (cheaper) USB drives that report “all written” when they really haven’t.
I often move many (70+ million) relatively small files (20 - 200 KB each, usually PDFs), amounting to multiple TB of data, from server to server and from server to NAS drive for data migrations.
On Windows I use robocopy a great deal as anything else is frankly unusable at this scale.
On Linux, it’s usually rsync or rclone.
I also produce a manifest file to go with the delivery: filepath and filename, filesize and SHA256 of the content. I create it at my end, send it with the data, and the recipient runs the same code at their end to ensure that they have all the files, that they are all the same size and that the contents are identical. Overkill for your use case, but it’s the only way to provide a reliable audit of what left and what arrived.
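A manifest along those lines might be produced and checked as in the following sketch. It is not the code actually used for those migrations. The tab-separated layout, the column names and the function names are assumptions made for this example.

```python
import csv
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Return (relative path, size in bytes, SHA256) for every file under root."""
    root = Path(root)
    files = sorted(p for p in root.rglob("*") if p.is_file())
    return [(p.relative_to(root).as_posix(), p.stat().st_size, sha256_of(p)) for p in files]


def write_manifest(rows, manifest_path):
    """Write the manifest as tab-separated text with a header row."""
    with open(manifest_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["path", "size", "sha256"])
        writer.writerows(rows)


def verify_manifest(manifest_path, root):
    """Return (path, problem) pairs for files that do not match the manifest."""
    root = Path(root)
    failures = []
    with open(manifest_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            target = root / row["path"]
            if not target.is_file():
                failures.append((row["path"], "missing"))
            elif target.stat().st_size != int(row["size"]):
                failures.append((row["path"], "size differs"))
            elif sha256_of(target) != row["sha256"]:
                failures.append((row["path"], "content differs"))
    return failures
```

The sender would call `write_manifest(build_manifest(source_dir), "manifest.tsv")` and the recipient `verify_manifest("manifest.tsv", received_dir)`. Checking the size before hashing lets truncated files fail quickly. Keep the manifest file outside the directory being scanned so it does not list itself.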
Give rclone a try, and let the USB drive sit for a minute or two after the clone has finished before running any ejects or sync commands.
Buying quality drives, if you’re using cheapies as I usually do, can also make a big difference to throughput and reliability.
As much as possible I use rsync for copying files and avoid the desktop file manager or the cp command. Rsync provides a list of the files being copied and does not return a prompt until the last file has been written and confirmed on the destination.
This habit avoids underestimating the inherent lag when writing to a slow USB device, as well as the potentially misleading messages you (@hellishexperience) are referring to.
When you copy a file, it pretends that you can open the files from the usb directory. However, it actually opens them from the source directory. I thought if I eject, it will wait till the data is written and then let me eject?
I find it hard to believe that Fedora would fail at such a basic task. Does that mean that with a “normal” USB drive, this should not happen? That is, “Eject” would wait till all the data gets written?
It’s the kernel and the filesystem that take care of sending data to a device and removing that device when the device says “I have written all the data you gave me to the flash on the board”; Fedora is about as guilty of failure as the “cp” program or whatever you were using to copy the data.
As much as I wanna try this, I find it a little bit uncomfortable that I need to run a script for something as basic as copying a file… I would have guessed that something like this would be fixed long ago.
I was using Dolphin (default KDE file manager) to copy the files.
The USB was formatted as exFat, since I may use it occasionally on Windows. Would ext4 or similar formats make these issues less likely?
I would assume unlikely - the data is sent to a buffer to be written to the device and the device reports when it has done so - the filesystem format should make no difference.
I also cannot replicate this scenario using Dolphin to write to a USB with a single large file (operation not permitted as file in use). I’ll whip up a few hundred files of garbage and see if that makes any difference.
Did you take your 700 MB of files and drag them onto the USB, or copy them in chunks, or use some other method?
There’s some discussion here on the KDE forum about how Dolphin might be able to interrogate the kernel and filesystem to provide more information to the user. To my non-expert eyes it seems difficult to do in a fully reliable way.
I mean, Windows has a system where you can see copying in progress. Is it because they don’t use caching? I am given to understand that not having caching causes unnecessary writes to the target drive?
Caching does not cause any additional writes to the target - you have 1GB to write, you have to write 1GB. Caching just means you can put that 1 GB into a chunk of memory and carry on with other work whilst that cache/chunk of ram is really written to the target. Windows caches too - all operating systems use caches ubiquitously.
Why can’t the system show me the real progress? Otherwise, I’d rather wait and copy my files without caching. I don’t wanna “pray” every time I am copying something.
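For what it’s worth, the “wait and copy without relying on the cache” approach can be approximated in user space. The sketch below copies a file in chunks and calls `os.fsync()` after each chunk, so the reported progress tracks data the kernel has pushed to the device instead of data sitting in memory. The function name and the `on_progress` callback are invented for this example. It will be slower than a cached copy, and it cannot help with a drive that acknowledges writes before it has really stored them.

```python
import os


def copy_with_real_progress(source, destination, chunk_size=8 * 1024 * 1024, on_progress=None):
    """Copy source to destination, flushing each chunk to the device before reporting it."""
    total = os.path.getsize(source)
    written = 0
    with open(source, "rb") as src, open(destination, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            dst.flush()             # empty Python's own buffer
            os.fsync(dst.fileno())  # ask the kernel to write this file's data out
            written += len(chunk)
            if on_progress is not None:
                on_progress(written, total)
        # Final flush so that an empty file is also committed.
        dst.flush()
        os.fsync(dst.fileno())
    return written
```

For example, `copy_with_real_progress("report.pdf", "/run/media/user/USB/report.pdf", on_progress=lambda done, total: print(done, "of", total))` prints a line per chunk; both paths here are placeholders.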
At least historically, the kernel didn’t provide much transparency to the application layer on this.
Since kernel 6.5 it’s apparently better (I’m just linking information that was quoted on that KDE thread). So it sounds like some improvements could be made in applications, but I don’t know if the means are there that could tell you 100% that all operations were complete.
The bug that does stand out to me in your OP is that you apparently successfully ejected the drive, but the copy hadn’t completed. That’s very bad and breaks the system’s contract with you.