A couple of notes on the above:
- Fedora lacks OOTB SELinux policies to fully support homed. Community members have provided custom SELinux policies to fill the gap.
- Tried homed with MD RAID1 (2 SSDs, 1.92 TB) as a block device for LUKS - it ran enormously long TRIM activity on each login, lasting 6-8 minutes. And that was a practically empty disk. I believe better LUKS/fs defaults are needed in the block-device case. On the other hand, homed worked OK with LUKS on a loopback file. And having FIDO2 support for the YubiKey 5 was a blessing.
Probably homed just needs to add the persistent discard flag to the LUKS2 metadata when creating containers. It's easy to add, and discard then gets enabled automatically on any subsequent device activation.
Maybe, I don’t know LUKS2 that well.
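For reference, a sketch of how that flag can be set by hand with cryptsetup today - the mapping name home-myuser is a hypothetical example (homed names mappings its own way; list yours with dmsetup ls), and /dev/md0 is the device from this thread:

```shell
# Store the allow-discards flag persistently in the LUKS2 header, so it is
# applied automatically on every subsequent activation of the container.
# Run against an already-activated mapping:
sudo cryptsetup refresh --allow-discards --persistent home-myuser

# Verify the persistent flag landed in the LUKS2 metadata:
sudo cryptsetup luksDump /dev/md0 | grep -A1 '^Flags:'
```

This only records the flag; whether enabling it is a good default for homed is exactly the open question in this thread.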
I experimented with both --luks-discard and --luks-offline-discard parameters; if either of them was set to 1, discard took 6-8 minutes on the 1.92 TB MD RAID1 block device. I had to set both to 0 to make my homed-managed user work as intended. But discard is needed for SSDs, I guess. It would be better to keep it enabled, but in some other, cleverer way.
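For completeness, these two knobs can also be set at creation time rather than toggled afterwards (a sketch; the user name is a placeholder):

```shell
# Create a homed-managed LUKS user with both discard modes disabled up front.
sudo homectl create myuser --storage=luks \
    --luks-discard=0 --luks-offline-discard=0
```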
That sounds like something worth reporting to the linux-raid mailing list. I know MD raid10’s discard support was pretty inefficient. But linux.git commit d30588b2731fb (“md/raid10: improve raid10 discard request”) addressed that.
raid0 was optimized back in 2017, see linux.git commit 29efc390b946 (“md/md0: optimize raid0 discard handling”)
As for raid1, Xiao Ni offered: “raid1 doesn’t have the chunk concept, so it just needs to submit the discard directly to disk too”
Should it then be addressed in the default/OOTB MD or LUKS configs when initiated/managed by systemd-homed?
It needs to be triaged. It's unlikely to be specific to the way Fedora sets up the IO stack. But regardless, the logical next step would be to work with an MD developer to further isolate the cause of the discard slowdown on your setup. I've made Xiao Ni aware of this thread.
Thank you for raising awareness with the MD developers. "My setup" is gone now, replaced by a plain "home dir on a separate MD mirror", as I had neither the knowledge nor the time to experiment with my "workhorse" and risk "bricking" it. But I'll share what I did with Xiao.
I tried to reproduce it myself. I encountered a failure when creating the /home/xiao directory used for the test.
homectl create test
Operation on home xiao failed: Access denied
systemctl status systemd-homed
systemd-homework: Formatting tool for compiled-in default file system btrfs not available, falling back to ext4 instead.
systemd-homework: Failed to create home image /home/.#homeworkxiao.homec722fcc89bedf9b4: Permission denied
systemd-homed: Operation on xiao failed: Permission denied
Do you know how to fix this?
And are the two options used when creating the systemd-homed directory? Can you give a command so I can try to reproduce it myself?
Thank you for reaching out.
These are the steps I used to create my homed-managed user on an MD RAID1 block device:
- I created an MD RAID1 block device /dev/md0 using two Micron 5400 1.92 TB SATA SSDs:
sudo mdadm -C /dev/md0 --level=raid1 --raid-devices=2 /dev/sdc1 /dev/sdd1
- I configured systemd-homed following this excellent tutorial - Building a new home with systemd-homed on fedora. The important part is configuring custom SELinux policies for systemd-homed, as Fedora's OOTB policies still do not cover what homed requires to function properly.
- Next, I created my homed-managed user:
sudo homectl create <my user name> --storage=luks --disk-size=500G --fs-type=ext4 --luks-extra-mount-options=defcontext=system_u:object_r:user_home_dir_t:s0 -G wheel --real-name="<real name of my user>" --image-path=/dev/md0
- I rebooted and tried to log in as my new user. The GUI login timed out. I found out that discard took 6-8 minutes to complete by activating my new user via the homectl CLI while logged in as my old user and analyzing the system log.
- My next attempt was to turn discard off:
sudo homectl update <my user name> --luks-offline-discard=0
sudo homectl update <my user name> --luks-discard=0
- Now I could log in normally. But I didn't like having to disable discard.
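After steps like the above, the state of each layer can be sanity-checked with something like the following (device and user names are the examples from this thread):

```shell
# MD layer: array state and member health
cat /proc/mdstat
sudo mdadm --detail /dev/md0

# homed layer: effective LUKS discard settings in the user record
sudo homectl inspect myuser | grep -i discard
```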
Please let me know if you need anything else.
Note - I had to drop the idea of using a homed-managed user this time, and I converted my normal user's home dir to a simple "home on MD RAID1" without systemd-homed, until the homed implementation gets more mature.
OK, this command does not add the discard flag to the LUKS2 metadata by default. Perhaps let's open a bug for it and track it properly elsewhere.
edit: I'm not immediately saying this is the culprit, but it means the active dm-crypt device will not pass discards through to the lower layers.
edit2: it seems the discard flag gets added manually when dm-crypt is activated, which is fine.
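Both claims can be checked against a live setup. The mapping name home-myuser below is a hypothetical example (list active mappings with dmsetup ls); /dev/md0 is the device from this thread:

```shell
# Persistent flags stored in the LUKS2 header (empty if none were ever set):
sudo cryptsetup luksDump /dev/md0 | grep -A1 '^Flags:'

# Runtime state of the active mapping: a "flags: discards" line here means
# discards are enabled for this activation only, not persistently.
sudo cryptsetup status home-myuser | grep -i flags
```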
It seems it does: with all other parameters left at their defaults, this one was on:
LUKS Discard: online=yes offline=no
Changing it to online=no offline=yes let me log in immediately, but I got the same 6-8 minute discard delay on logoff. I had to set both to 0 to stop the discard activity.
Interestingly, on my 2014 MacBook Pro the homed user got these default discard settings:
LUKS Discard: online=no offline=yes
But that has no impact there, as my user's home dir is a LUKS-encrypted loopback file. In that case the homed-managed user works nicely.
Should I do that? And where - LUKS, MD, or systemd-homed?
Note that since kernel 6.2, the discard=async mount option is enabled by default on btrfs.
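A quick way to see whether a mounted btrfs filesystem actually carries that option (the mount point / is just an example):

```shell
# Print the mount options of the filesystem at /; on kernel >= 6.2 with
# btrfs you would expect to find "discard=async" among them.
findmnt -no OPTIONS /
```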
I tried to configure systemd-homed, but it fails. I already asked at Building a new home with systemd-homed on fedora - #40 by xiao. Did you encounter this problem?
No, with F37 everything was smooth, except discard with MD raid1.
I haven't reproduced this problem in my environment. I used the same commands from step 1 to step 3. For step 4, I used ssh $user_name@$host, and it took several seconds to log in successfully.
These are my NVMe drives:
/dev/nvme1n1 /dev/ng1n1 PHLJ134500BZ1P0UGN INTEL SSDPE2KX010T8 1 1.00 TB / 1.00 TB 512 B + 0 B VDV10170
/dev/nvme0n1 /dev/ng0n1 PHLJ135005Q71P0UGN INTEL SSDPE2KX010T8 1 1.00 TB / 1.00 TB 512 B + 0 B VDV10170
Can you please try this command instead, while logged in as a non-homed-managed user?
sudo homectl activate <your homed-managed user name goes here>
This does the opposite - deactivates/logs out:
sudo homectl deactivate <your homed-managed user name goes here>
date && homectl activate xiao && date
Mon Apr 17 08:34:02 PM EDT 2023
Please enter password for user xiao:
Mon Apr 17 08:34:16 PM EDT 2023
date && homectl deactivate xiao && date
Mon Apr 17 08:33:45 PM EDT 2023
Mon Apr 17 08:33:45 PM EDT 2023
Hmm, these commands still work well.
Yes, this looks normal - I mean the timing. The only major difference in our MD RAID1 setups is that mine were SATA SSDs and yours are NVMe drives…
Can you please check the discard settings of your setup with
sudo homectl inspect <your homed user> ?
My discard settings looked like this after creating my user with homectl:
LUKS Discard: online=yes offline=no
This is the output:
User name: xiao
Last Change: Tue 2023-04-18 02:42:19 EDT
Login OK: yes
Password OK: yes
GID: 60206 (xiao)
Aux. Groups: wheel
Storage: luks (strong encryption)
Image Path: /dev/disk/by-uuid/b0725c73-3825-477a-8fe9-0ea8ce264e4f
LUKS Discard: online=yes offline=yes
LUKS UUID: b0725c73-3825-477a-8fe9-0ea8ce264e4f
Part UUID: 44685141-a085-459d-91a8-c2b65a5f4a1a
FS UUID: e3c62524-8058-49ee-8a7f-7224c30304c2
File System: ext4
LUKS MntOpts: defcontext=system_u:object_r:user_home_dir_t:s0
LUKS Cipher: aes
Cipher Mode: xts-plain64
Volume Key: 256bit
Mount Flags: nosuid nodev exec
Disk Size: 931.3G
Disk Floor: 5.0M
Disk Ceiling: 5.0T
Auth. Limit: 30 attempts per 1min
Local Sig.: yes
OK then. The only major difference I see between your setup and mine is the SSD technology and disk vendor… But I'm not sure whether that could contribute to the problem. Also, you tried it on a fresh install (if I understood correctly), while I tried it on an existing install of Fedora, which had been upgraded twice since F35…
So, maybe it's just bad luck on my side.