I have had this problem for a long time (I don’t quite recall at which version it started).
I started with the very first version of FC 1.0 many years ago. I upgraded to every new version of course by installing inside a VM (either VirtualBox or VMware).
I regularly defragment my guest VM disks, then run dd if=/dev/zero of=zzz bs=1M to zero-fill the free disk space, then shut down the VM and compact the physical disk (VHD or VMDK).
This had been working for years, until some version broke the routine. I am not sure of the exact version, maybe starting at version 10?
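For reference, the full routine is roughly the following. The compaction step depends on the hypervisor and disk format, so treat these as illustrative examples rather than exact invocations:

    # inside the guest: fill free space with zeros, flush, remove the file, shut down
    sudo dd if=/dev/zero of=/zzz bs=1M status=progress
    sync
    sudo rm /zzz
    sudo poweroff

    # on the host, e.g. VirtualBox (compacting works on VDI images; VHD support may vary):
    VBoxManage modifymedium disk Fedora.vdi --compact
    # or VMware:
    vmware-vdiskmanager -k Fedora.vmdk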
The problem is that dd writes data directly to the host OS disk, like fire burning through a wall.
To emphasize what I mean: dd actually zero-fills all the free disk space of the host OS disk.
For example, the host disk has 400GB of free space out of 1TB. When dd finishes, I can see that the resulting zzz file in the guest Fedora system is about 400GB, while the virtual disk size is only 20GB.
I have to attach Fedora’s VHD/VMDK to Ubuntu and do the routine there. I believe it’s a problem specific to Fedora itself, because this “burn through” problem does not happen on Ubuntu.
You are zeroing out the free space so that the VM software will eliminate the blocks of 0’s in the virtual disk. More information may help us understand what is going on.
I understand that. I just thought it might have to do with expandable virtual disks.
But after running some tests myself, I think the issue is different, possibly related to BTRFS file systems.
I created a similar zero-filled file in my home folder (a BTRFS subvolume), and it kept growing until it was already larger than the entire disk of the host system, at which point I interrupted the process. While the dd command was running, I noticed the free disk space on the host system (macOS) decreasing, but at a much slower pace.
Then I ran the same command under /boot (ext4 file system), and the command stopped successfully when the partition got full.
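For anyone who wants to reproduce the comparison, a minimal sketch, assuming a default Fedora layout where /home is btrfs and /boot is ext4:

    findmnt -no FSTYPE /home    # btrfs on a default install
    findmnt -no FSTYPE /boot    # ext4
    dd if=/dev/zero of=$HOME/zzz bs=1M status=progress
    # in another terminal, compare the file’s apparent size with the actual free space
    du -h --apparent-size $HOME/zzz; df -h /home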
@neoinmatrix, do you run the command with the output file on a BTRFS subvolume? If yes, would you say the issue started happening around the time Fedora introduced BTRFS as a standard file system[1] for root and home?
And there was a typo when I stated the problem began at version 10. Maybe I meant to type version 30, which corresponds to the Fedora 33 release with btrfs.
If not, then it is not eating into that. Indeed, with btrfs and compression (the default), that VM with a 200GB file inside may use up just 20GB in total.
If it’s a default installation, it’s using Btrfs with compression. Zero-filled files will be highly compressible. It’ll take a lot of writes to fill the free space, and it won’t really be full of zeros.
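A quick way to see the compression in action, assuming the compsize tool is installed (sudo dnf install compsize):

    dd if=/dev/zero of=zzz bs=1M count=1024    # 1GiB of zeros
    sync
    sudo compsize zzz    # the on-disk size will be a small fraction of the 1GiB apparent size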
What’s the problem? And what’s the goal? And is the drive spinning or flash?
Because Btrfs defaults to discards, which “punch holes” (i.e. zero out, or more correctly unmap) for unused blocks. On qemu/kvm using raw backing files for the VM this results in sparse files, and with unmap enabled the discard passes through the storage stack to the physical media as “trim”, if supported.
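Two generic ways to check this from inside the guest:

    findmnt -no OPTIONS /    # look for a discard= option on the btrfs root
    lsblk --discard          # non-zero DISC-GRAN/DISC-MAX means the virtual disk advertises trim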
So… I don’t know if VirtualBox supports unmap/trim in the guest, but I’d look into that instead.
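If I recall correctly, VirtualBox can expose trim to the guest, but only for VDI images attached with discard enabled on the controller; something along these lines, with the VM and controller names as placeholders:

    VBoxManage storageattach "Fedora" --storagectl "SATA" --port 0 --device 0 \
        --type hdd --medium Fedora.vdi --nonrotational on --discard on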
Update: OK, so for the shrink-a-disk-image use case, yeah, you want to make sure the hypervisor sees the unmap/trim issued by the guest via the btrfs discard mount option, thereby marking blocks in the backing file as unused. Then it’s less work when shrinking the backing file.
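With discard wired up end to end, the guest can trim on demand and the backing file stays sparse without any zero-filling; a rough sketch, with file names as placeholders:

    # inside the guest
    sudo fstrim -av
    # on the host, compare allocated vs. apparent size of the backing file
    du -h Fedora.img; du -h --apparent-size Fedora.img
    qemu-img info Fedora.img    # also reports “disk size” vs. “virtual size”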
As for defragmenting, this is file system specific, because why fragmentation happens is file system specific. Btrfs supports defragmenting, but the defaults aren’t the best for compression: the target extent size is set with the -t flag, and there’s no point using a value higher than 128KiB because compressed extents are limited to 128KiB; anything bigger just adds time and wear. I myself use 64KiB. The other form of fragmentation is free space fragmentation, which is maintained by btrfs balance filtered for the least-used data block groups. I never do this, because I see no benefit from it. For file systems more than 80% full, maybe there’s a point.
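In command form, roughly what I mean (mount points are just examples):

    # defragment recursively; -t caps the target extent size (compressed extents max out at 128KiB)
    sudo btrfs filesystem defragment -r -t 64K /home
    # compact mostly-empty data block groups (free space fragmentation); I don’t normally do this
    sudo btrfs balance start -dusage=20 /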