Quality of Life question about "copying files"


This question doesn’t require a large explanation:

Is there a way to “queue” the copying of files?

.

If a bunch of files gets copied together, then "the queue is internal to the same operation"; but if two groups (of one or more files each) get copied and pasted separately, the computer will try to execute the operations in parallel.

This isn’t ideal even for an SSD, and HDDs have it even worse.

Is there a way to make it so that, when one copy-pastes separate groups of files, the computer will first complete the first group, then the second, then the third… ?

Windows never had this option (as far as I know); I wonder if someone coded it in Linux…

Why not? What are they thinking needs to be prevented?

I am not that knowledgeable about Linux file systems, but from the little I’ve read, “they read and write data in a way which is more betterer than Windows has ever done”.

That said, when pasting with two or more separate operations, they “fight” each other and take more time.
THAT is the main problem.

How do you do it? Do you first select a bunch of files and paste them, then, while they are being copied, select another bunch of files and paste them somewhere else? I don’t know how graphical file managers handle two operations there, but you can use the command below to perform the first copy and, only after it finishes, start the second:

cp "files to copy" "paste location 1" ; cp "other files to copy" "paste location 2"
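A note on the separator: `;` starts the second `cp` no matter what, while `&&` starts it only if the first copy finished without errors. A minimal self-contained sketch, using throwaway paths under /tmp in place of your real sources and destinations:

```shell
# Placeholder paths for illustration; substitute your own directories.
mkdir -p /tmp/src1 /tmp/src2 /tmp/dst1 /tmp/dst2
echo one > /tmp/src1/a.txt
echo two > /tmp/src2/b.txt
# "&&" starts the second copy only after the first one succeeds;
# ";" would start it unconditionally, even after a failure.
cp -r /tmp/src1/. /tmp/dst1 && cp -r /tmp/src2/. /tmp/dst2
```

Either way the copies run one after the other, never in parallel.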
1 Like

Can you quantify how much of a slowdown you see?

I wonder how much of any slowdown would actually be down to the time spent on disk I/O (as opposed to the time the file manager app spends animating progress bars or whatever).

With a spinning HDD, clearly there would be more efficient and less efficient ways to physically write data to a bunch of files, but wouldn’t the Linux filesystem cache and the disk controller’s cache provide most of the intelligence to do this efficiently? Would an extra layer of “optimisation” at the application level achieve much here?

I cast “sudo dnf update --refresh”!

Jokes aside,
it’s interesting/nice if a Konsole version of “my request” already exists, but I am looking for a “User Interface” version, since it would also be easier to use.

Most, if not all, people prefer to first do things the intuitive way (which is also usually easier and/or more convenient), and only afterwards go for the “more inconvenient way”.
When I update my computers here on Fedora KDE I mainly use Konsole because it is faster, I can see all the different packages (and possible related problems), and if there’s an error, a failure, or if the power cuts out, I believe it “may be the safest way to update or prevent corrupted data”.

.

What comes easiest to humans when using a computer (desktop, laptop or phone) is to “select multiple files through the UI, and then decide to cut, copy or delete them”.
It’s so “normal and convenient” for us that basically all companies have already added a UI function that allows this (it’s present in basically all Linux distros too).

It also comes easier to people to “learn how to play with the UI” rather than “being asked to memorise and/or use magic spells whose meaning they don’t know or care about”…

I am somewhere between “the grandpa trying to figure out how folders work” and “those people who code software and build hardware”.
I am an enthusiast with special interests, and honestly I’ll take the first chance available to do something through the UI when Konsole is not the only option available (I have to work with Windows 98 for a completely separate project; I don’t like DOS man :frowning:, I want me .ico files on le desktop…).

The only issue I see here is the coding work that would go into making this (assuming it does not already exist).

.

.

The functioning of such operation would be this:
A) “if group_1 of files and group_2 of files are being copied from Drive A to Drive B, then first finish copying g_1 before starting copying g_2”.

B) “if multiple files are being copied from Drive 1 to Drive 2 and 3, then finish operation with D 2 and then do D 3”.

C) “if D 1 and D 2 are pasting files to D 3, then finish D 1’s files first and then D 2’s”.

I am sure that at sizes around 10 GB there really wouldn’t be an issue, but in cases where far more and far larger batches of files are being copied (I have just solved an issue of mine, so now I have to re-copy my files to my 10 TB backup drive), the slowdown may rise from what would overall have been a minute at worst to as much as an extra hour.

I think the answer is that the computer will do what you tell it to do. If you try to start a bunch of copy operations at once, then the computer will try to run them in parallel.

If you want things to run in sequence, then you should wait for one operation to finish before starting the next. Or you could list the operations in a file and then tell the computer to run them. If the computer is reading the commands from a file, it will (by default) wait for one operation to finish before starting the next.

For example:

cp -r /mnt/a/g1 /mnt/b/g1
cp -r /mnt/a/g2 /mnt/b/g2
cp -r /mnt/d1/* /mnt/d2
cp -r /mnt/d1/* /mnt/d3
cp -r /mnt/d1/* /mnt/d3
cp -r /mnt/d2/* /mnt/d3

If you want to do that sort of thing, you just list your commands in a plain-text file and then use the source command to execute them (e.g. source my-commands.txt).
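As a self-contained sketch of that workflow (throwaway paths under /tmp stand in for the real drives):

```shell
# Set up some placeholder source and destination directories.
mkdir -p /tmp/queue-demo/a/g1 /tmp/queue-demo/a/g2 /tmp/queue-demo/b
echo data1 > /tmp/queue-demo/a/g1/file1
echo data2 > /tmp/queue-demo/a/g2/file2
# Write the queued copy operations to a plain-text file...
cat > /tmp/queue-demo/my-commands.txt << 'EOF'
cp -r /tmp/queue-demo/a/g1 /tmp/queue-demo/b/g1
cp -r /tmp/queue-demo/a/g2 /tmp/queue-demo/b/g2
EOF
# ...then run them; each line waits for the previous one to finish.
source /tmp/queue-demo/my-commands.txt
```

That is your "queue": the file is simply read top to bottom, one copy at a time.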

P.S. If you decide you need to cancel the operations, use Ctrl+C.


Running the commands in sequence like that probably won’t matter much with SSDs. It might make a small difference for HDDs due to their larger seek times.

If you want to do other things on your computer while a large copy operation is running in the background, you might try prefixing all those cp (copy) commands with ionice -c 3. (It means “Input/Output (be) nice”. The -c 3 is “class 3”, the idle scheduling class: the copy only gets disk I/O time when no other program needs the disk.)
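A minimal sketch of that prefixing, again with placeholder paths (ionice ships with util-linux, so it should already be on a Fedora install):

```shell
# "-c 3" puts this copy in the idle I/O class: it only touches the disk
# when no other process is using it, so the desktop stays responsive.
mkdir -p /tmp/nice-demo/src /tmp/nice-demo/dst
echo hello > /tmp/nice-demo/src/file.txt
ionice -c 3 cp -r /tmp/nice-demo/src/. /tmp/nice-demo/dst
```

The copy itself behaves exactly like a plain cp; only its disk-scheduling priority changes.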

2 Likes

Thank you for this explanation.

.

I am sure that this isn’t really a problem for “the enterprise guys”, or even the coders, or the guys who run a server, because they’d just use these Konsole commands.

For guys like me instead, if I were to copy two 1 TB batches of files to (in this case) the 10 TB HDD at the same time, then this “parallel copying conflict” would raise the overall time needed to write the copied data by hours.

.

It’s not a deadly necessity. End Users like me can, as said before (and as we already did in Windows), just copy-paste these batches one at a time, and wait for the first operation to finish before starting the second.

.

In the end, what I am really asking is if it even exists, really.
I don’t believe it does, but I can’t be sure unless and until I ask.

It looks like someone else has already filed this request.

However, the request is still open/pending. So it probably isn’t implemented yet.

Hmm, one of the replies in the above report claims that the Nemo file manager can do it. You might try using that file manager.

1 Like

Yes, it does look like what I am talking about here,
even if “network” makes me think of computers connected to a local server (browsers already have the option to download one file at the time).

Still, you helped me a lot these 2 last weeks alone. Thank you.

.

I will leave this page open for now. It was made only a couple of hours ago, so there’s no reason to already point to an old source as “the answer” when it may be just a part of the answer.

1 Like

No, they do not fight.

The way this works is that the writes are placed into kernel buffers.
The kernel optimises getting these buffers written to disk.
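You can actually watch those buffers at work on a Linux box: the kernel reports how much written-but-not-yet-flushed data it is currently holding.

```shell
# "Dirty" = data sitting in kernel buffers, not yet written to disk;
# "Writeback" = data currently being flushed. Both grow during a big copy.
grep -E '^(Dirty|Writeback):' /proc/meminfo
```

Run this during a large copy and you will see the Dirty figure climb, then drain as the kernel schedules the writes.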

1 Like

Even if we assume that “speed does not get reduced”, I believe that “keeping a program’s data close together is always best” (when talking HDDs).

It doesn’t matter if it’s a video game, a movie or some software: making the HDD’s head jump around as little as possible improves performance.

You can try a different file browser.

I use Thunar and it copies files one after another.

sudo dnf install thunar

The kernel’s file systems know how to do this as well when there are multiple streams of file data being written. Again, not a problem in practice.

Taking this Seagate 10TB drive as an example, you could expect write speeds of about 220 MB/s.

That translates to 2.5 hours to write 2 TB of files to the drive. They say their benchmark is about “transferring files of various sizes”, so it’s probably a little faster if your 2 TB consists of a small number of very large files, and slower if it consists of a large number of very small files.

So “hours” is a normal level of performance for this volume of data.
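The arithmetic behind that estimate, using decimal units as drive vendors do:

```shell
# 2 TB = 2,000,000,000,000 bytes; 220 MB/s = 220,000,000 bytes per second.
seconds=$(( 2000000000000 / 220000000 ))
echo "$seconds seconds"   # 9090 seconds, i.e. roughly 2.5 hours
```

So even a single, perfectly sequential 2 TB copy to that drive takes about two and a half hours at the drive's rated speed.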

Have you been able to make a direct comparison? i.e. how long does it take when you manually do two successive 1 TB batches, vs how long does it take when you start two 1 TB batches simultaneously?

Microsoft Research looked into why a Windows Explorer copy was slower than using the terminal COPY command. What they found was that the cost of updating the GUI progress windows took up most of the time.

They tested a version of the GUI with less feedback and people did not like it. They liked to see that progress was being made and the speed was secondary.

Anyway, when you are benchmarking something like this, take into account the cost of the GUI and its effect on the speed.

3 Likes

Damn that is interesting.

@isaac0clarke ask on discuss.kde.org; this is a KIO thing, which is more abstract than the cp command.

When copying to USB sticks this is really annoying. If two copy operations run in parallel, it takes much longer than running them serially. The flash drive simply cannot cope with it.
I’m not sure if this requires doing something about it and how, but it’s worth a discussion.

Tip: Use Collector first to collect all the files to be copied, then copy.

I absolutely could, but the reason why I chose Fedora, and why I recommend it to others, is “because I don’t want to touch things; let the OS’ devs do the devving”.

As I said, Windows never seems to have had it either; I was just asking if Linux did.
I am fine with just “doing things the old way, one at a time”.

.

Maybe, I don’t know tho.
Let’s just say that “my gut tells me” there’s a good reason why files usually get copied “one at a time” instead of copy-pasting random chunks of 0s and 1s fragmented everywhere
(it functionally never happens because all OSs are coded not to do this, but such a case is closest to reality when two different copy-paste operations are done at the same time).

.

I’d really rather not pick examples at all, because be it a dying HDD, a USB 3.x pen drive, or even an SSD, when copying multiple small files the speeds always tank, and with larger files they stay fast only if “these larger files are not really just many small files in a trenchcoat”.

Functionally speaking,
my 10 TB HDD copied 1 TB of games in around 2.5 hours (don’t care about the specifics, I am just making an example);
if I were to copy anything else larger than a couple of GB at the same time to the same 10 TB drive, the time would nearly double, because that’s just what computers do.

.

(No matter how I write this down, it may just sound like I am being very rude. I am not; please read this coldly and analytically. I mean no insult.)
I am so gigantically positive that my Ryzen 5600X, 16 GB DDR4 3200 MHz, RTX 2070 PC was experiencing no “slowdown” when copying data from a 4 TB HDD to a 10 TB HDD just because I was “watching YouTube” or “checking out the speed of copying files”.

I can see the old one having such problems, but not modern ones.

.

I do not understand this.

I really don’t want to make a dozen more accounts for a dozen more websites. I just don’t have it as a priority.

I was just curious if this stuff existed. If it doesn’t AND the discussion is to be “brought higher upstream”, then I am not really interested in taking that path…

.

Indeed, this is the real issue here. No matter if it’s TB upon TB of data copied over 20 hours or 50 GB of data onto a USB stick, when done in parallel it just-

BRO, IS THIS WHAT I AM THINKING IT IS?!

If its page is accurate AND it doesn’t randomly crash, THEN THIS IS LITERALLY WHAT I WAS LOOKING FOR!!!

I am too tired now and I have to go away from the PC, so answer me this:
Is this it?
Because if it is, YOU JUST WON THE SOLUTION BADGE!!!

1 Like