# You can set a different recordsize for each dataset based on its purpose
liviu@bobdenaut:~$ sudo zfs set recordsize=128k zroot/home
liviu@bobdenaut:~$ sudo zfs get compress zroot/home
NAME PROPERTY VALUE SOURCE
zroot/home compression lz4 inherited from zroot
liviu@bobdenaut:~$ ./test.sh
sysctl: cannot stat /proc/sys/vfs/zfs/txg/timeout: No such file or directory
1 K size -> 1 K alloc
2 K size -> 1 K alloc
3 K size -> 5 K alloc
4 K size -> 1 K alloc
5 K size -> 1 K alloc
6 K size -> 1 K alloc
7 K size -> 1 K alloc
8 K size -> 9 K alloc
9 K size -> 1 K alloc
11 K size -> 1 K alloc
12 K size -> 1 K alloc
13 K size -> 1 K alloc
15 K size -> 17 K alloc
16 K size -> 1 K alloc
17 K size -> 1 K alloc
23 K size -> 1 K alloc
24 K size -> 1 K alloc
25 K size -> 29 K alloc
31 K size -> 1 K alloc
32 K size -> 1 K alloc
33 K size -> 1 K alloc
63 K size -> 1 K alloc
64 K size -> 65 K alloc
65 K size -> 1 K alloc
127 K size -> 1 K alloc
128 K size -> 1 K alloc
129 K size -> 1 K alloc
254 K size -> 261 K alloc
255 K size -> 1 K alloc
256 K size -> 1 K alloc
257 K size -> 1 K alloc
512 K size -> 1 K alloc
1024 K size -> 1029 K alloc
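The test.sh script itself isn’t shown in the thread. Here is a minimal sketch of how a size-versus-allocation check like this could be written, assuming GNU coreutils. The function name report_alloc and the temporary file naming are made up for illustration, and the real script may well differ:

```shell
# Hypothetical sketch: compare a file's apparent size with the space allocated for it.
# report_alloc DIR SIZE_KIB
report_alloc() {
    local dir=$1 kib=$2 file alloc
    file="$dir/alloc_test_${kib}k"
    # Random data, so compression cannot shrink the file.
    head -c "$((kib * 1024))" /dev/urandom > "$file"
    # Ask the kernel to flush dirty data before measuring.
    sync
    # du -k reports allocated space in KiB (first tab-separated field).
    alloc=$(du -k "$file" | cut -f1)
    printf '%s K size -> %s K alloc\n' "$kib" "$alloc"
    rm -f "$file"
}

# Example (writes and removes a temporary file in the given directory):
#   for k in 1 4 128 1024; do report_alloc "$HOME" "$k"; done
```

On ZFS the allocated figure may lag behind until the current transaction group is committed, so numbers taken right after a write are not necessarily final.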
If the filesystems are in the same pool that is backed by the same device(s), I would expect you would want the same block size for them.
As for snapshots, sometimes you do want to separate things (typically you don’t want your user data reverted when you revert your OS, or vice versa). But for package-owned files like those under /etc and /usr (and some under /var), you probably do want those to revert “in sync” when you do a rollback, or else you risk breaking or destabilizing the software when the versions of its files end up mismatched.
/etc is a bit of a mixed bag. I typically keep it together with the rest of the OS because the changes I make are small and easily reproducible. I have experimented with making a local git repo to track changes to /etc. For example:
$ zfs list root/0/etc.git
NAME USED AVAIL REFER MOUNTPOINT
root/0/etc.git 5.69M 90.9G 5.69M /srv/etc.git
In theory (and if I keep up with pushing the changes I make to /etc to my “backup” repo), I should be able to find and restore my intentional changes after a rollback. (But I haven’t had a good occasion to put this into practice yet.)
Edit: The recordsize is a little different from the block size (which is set on the pool). You can get significant performance improvements by matching it to the record size used by a database, but for most programs it probably won’t matter much. It also affects how well the compression algorithms work, so there are disk space trade-offs to changing it.
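To make the database case concrete, here is a small dry-run sketch. It only prints the zfs command instead of running it. The helper name suggest_recordsize and the dataset zroot/var/lib/pgsql are assumptions for illustration; the sizes are the default page sizes of PostgreSQL (8 KiB) and InnoDB (16 KiB), so check what your database actually uses:

```shell
# Dry-run helper (hypothetical name): print the zfs command that would match
# a dataset's recordsize to a database engine's default page size.
suggest_recordsize() {
    local engine=$1 dataset=$2 size
    case "$engine" in
        postgresql) size=8k ;;   # PostgreSQL default block size: 8 KiB
        innodb)     size=16k ;;  # InnoDB default page size: 16 KiB
        *) echo "unknown engine: $engine" >&2; return 1 ;;
    esac
    echo "zfs set recordsize=$size $dataset"
}

# The dataset name is an example, not one from this thread.
suggest_recordsize postgresql zroot/var/lib/pgsql
```

Note that a changed recordsize only applies to data written afterwards; existing files keep the record size they were written with.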
Indeed, that is my concern with snapshots too. But when you have the chance to set different tuning parameters per dataset, I think separate datasets are the best way to go.
For example, for /var/log, /var/cache, /var/crash, and maybe others, I would set “sync=disabled”, like this:
zfs set sync=standard zroot
zfs set sync=disabled zroot/var/cache
zfs set sync=disabled zroot/var/crash
zfs set sync=disabled zroot/var/log
zfs set sync=disabled zroot/var/spool
zfs set sync=disabled zroot/var/tmp
zfs set sync=disabled zroot/var/www
The system becomes faster, and I don’t care about data corruption on these datasets (well, maybe log), but we will see …
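To double-check which datasets ended up with sync disabled, something like the following could be used. It is a sketch: the helper name sync_disabled is made up, and it expects the tab-separated "name, value" lines that zfs get prints in scripted mode (-H):

```shell
# Hypothetical helper: read "name<TAB>value" lines on stdin and print the
# names of the datasets whose value is "disabled".
sync_disabled() {
    awk -F'\t' '$2 == "disabled" { print $1 }'
}

# On a live system (assuming the pool is called zroot, as above):
#   zfs get -H -r -o name,value sync zroot | sync_disabled
```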
Also, you can exclude them from snapshotting:
zfs set com.sun:auto-snapshot=false zroot/var/cache
zfs set com.sun:auto-snapshot=false zroot/var/crash
zfs set com.sun:auto-snapshot=false zroot/var/lib/containers
zfs set com.sun:auto-snapshot=false zroot/var/lib/libvirt
zfs set com.sun:auto-snapshot=false zroot/var/log
zfs set com.sun:auto-snapshot=false zroot/var/spool
zfs set com.sun:auto-snapshot=false zroot/var/tmp
zfs set com.sun:auto-snapshot=false zroot/var/www
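Since it is the same command repeated for every dataset, a loop saves some typing. A minimal sketch; the function name auto_snapshot_off is made up, and it only prints the commands so they can be reviewed before being applied:

```shell
# Hypothetical helper: print one "zfs set" command per dataset given as argument.
auto_snapshot_off() {
    local ds
    for ds in "$@"; do
        printf 'zfs set com.sun:auto-snapshot=false %s\n' "$ds"
    done
}

# Review the output first; to apply it, pipe it to a root shell, e.g. "| sudo sh".
auto_snapshot_off zroot/var/cache zroot/var/crash zroot/var/log
```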
I’m still experimenting with the git repo idea myself. FWIW, I just added the following to my setup which I think might help keep my local repo up-to-date.
I still have to remember to run cd /etc && git commit -a --amend --no-edit after rpmconf -a though. I think I’ll try creating a /usr/local/bin/rpmconf “wrapper” script to take care of that automatically …
/usr/local/bin/rpmconf:
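#!/usr/bin/bash
# Run the real rpmconf; command -p searches the default PATH rather than the current one.
command -p rpmconf "$@"
# Fold the resulting /etc changes into the last commit and push; stop at the first failure.
set -e
cd /etc; git commit -a --amend --no-edit; git push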
$ modinfo zfs | grep zfs_txg_timeout
parm: zfs_txg_timeout:Max seconds worth of delta per txg
If so, create a conf file under /etc/modprobe.d and a corresponding dracut conf file under /etc/dracut.conf.d to add it to the initramfs. Then run dracut -f to update your initramfs.
liviu@bobdenaut:~$ cat /etc/dracut.conf.d/modprobe_files.conf
install_items+=" /etc/modprobe.d/nvidia_params.conf "
install_items+=" /etc/modprobe.d/zfs_arc.conf "
liviu@bobdenaut:~$ cat /etc/modprobe.d/zfs_arc.conf
# Set Max ARC size => 16GB
options zfs zfs_arc_max=16106127360
# Set Min ARC size => 2GB
options zfs zfs_arc_min=2147483648
# force commit Transaction Group (TXG) at 120 secs, increase to aggregated more data (default 5 sec)
options zfs vfs.zfs.txg.timeout=120
# force commit Transaction Group (TXG) if dirty_data reaches 95% of dirty_data_max (default 20%, FreeBSD 12.1)
options zfs vfs.zfs.dirty_data_sync_pct=95
# max gap between any two aggregated writes, 0 to minimize frags (default 4096, 4KB)
options zfs vfs.zfs.vdev.write_gap_limit=0
liviu@bobdenaut:~$ sysctl -n vfs.zfs.txg.timeout
sysctl: cannot stat /proc/sys/vfs/zfs/txg/timeout: No such file or directory
The module option wouldn’t have the vfs. prefix. Also, it would use _ (or -) instead of . in the option name, and it would be a global setting that would affect all zfs pools and filesystems.
If there is a per-filesystem setting that can be set with sysctl, I’m not familiar with it.
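Before putting an option name into a modprobe conf file, it can be checked against what the module actually advertises, in the same way as the modinfo | grep check earlier. A rough sketch; the helper name has_parm is made up, and it assumes the usual "parm: name:description" lines that modinfo prints:

```shell
# Hypothetical helper: succeed if NAME is listed as a "parm:" entry in the
# modinfo output supplied on stdin.
has_parm() {
    awk -v name="$1" '
        $1 == "parm:" { split($2, a, ":"); if (a[1] == name) found = 1 }
        END { exit !found }
    '
}

# On a live system:
#   modinfo zfs | has_parm zfs_txg_timeout && echo "zfs_txg_timeout is a module option"
#   modinfo zfs | has_parm vfs.zfs.txg.timeout || echo "not a module option name"
```

Unlike a plain grep, this compares the whole name, so a partial match on a longer option name does not count.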
FYI, you can check the current/active value of any module parameter with, e.g., cat /sys/module/zfs/parameters/zfs_txg_timeout.
Some of those can also be changed at runtime (without rebooting) by “echoing” a new value to the file. You’ll have to check the documentation to find out what settings can be changed on the fly though.
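As a sketch of what that looks like in practice, here are two small helpers around the files under /sys/module/zfs/parameters. The names show_param and set_param and the ZFS_PARAM_DIR variable are made up for illustration. Writing needs root, and not every parameter accepts changes at runtime:

```shell
# Directory holding the module parameters; overridable, e.g. for testing.
ZFS_PARAM_DIR=${ZFS_PARAM_DIR:-/sys/module/zfs/parameters}

# Print the current value of a parameter.
show_param() {
    cat "$ZFS_PARAM_DIR/$1"
}

# Write a new value; fails if the parameter file is missing or not writable.
set_param() {
    local file="$ZFS_PARAM_DIR/$1"
    [ -w "$file" ] || { echo "cannot write $1" >&2; return 1; }
    echo "$2" > "$file"
}

# Example, as root:
#   show_param zfs_txg_timeout
#   set_param zfs_txg_timeout 120
```

From a normal user account, the equivalent one-liner is echo 120 | sudo tee /sys/module/zfs/parameters/zfs_txg_timeout, because a plain sudo echo … > file performs the redirection as the unprivileged user. A value set this way lasts only until the module is reloaded or the machine reboots; the modprobe conf file is what makes it persistent.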