Hello everyone,
I’m here because I have an issue with my fresh Fedora 33 install.
Here is my laptop: Lenovo Yoga 530-14ARR (81H9) with an AMD Ryzen 2500U and integrated Vega 8 graphics; the BIOS is up to date.
With this fresh Fedora 33 install I get IOMMU issues. Here is what dmesg gives me:
[ 6.119668] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 6.119670] #PF: supervisor instruction fetch in kernel mode
[ 6.119671] #PF: error_code(0x0010) - not-present page
[ 6.119672] PGD 0 P4D 0
[ 6.119675] Oops: 0010 [#2] SMP NOPTI
[ 6.119677] CPU: 3 PID: 137 Comm: irq/25-AMD-Vi Tainted: G D 5.9.12-200.fc33.x86_64 #1
[ 6.119678] Hardware name: LENOVO 81H9/LNVNB161216, BIOS 8MCN58WW 03/26/2020
[ 6.119681] RIP: 0010:0x0
[ 6.119684] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[ 6.119685] RSP: 0018:ffffb41f0037fec0 EFLAGS: 00010246
[ 6.119687] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 6.119688] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffb41f0037fed0
[ 6.119690] RBP: ffff938720418000 R08: ffffffffaaa5a9a0 R09: ffffb41f0037fb58
[ 6.119691] R10: 0000000000000000 R11: ffffb41f0037fb5d R12: ffff938720418bbc
[ 6.119693] R13: 0000000000000001 R14: 0000000000000001 R15: ffff938720418000
[ 6.119695] FS: 0000000000000000(0000) GS:ffff9387232c0000(0000) knlGS:0000000000000000
[ 6.119696] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6.119698] CR2: ffffffffffffffd6 CR3: 000000031a95c000 CR4: 00000000003506e0
[ 6.119699] Call Trace:
[ 6.119703] task_work_run+0x65/0xa0
[ 6.119706] do_exit+0x352/0xae0
[ 6.119709] ? kthread+0x11b/0x140
[ 6.119712] rewind_stack_do_exit+0x17/0x20
[ 6.119714] RIP: 0000:0x0
[ 6.119716] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[ 6.119717] RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000
[ 6.119719] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 6.119720] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 6.119722] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 6.119723] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 6.119725] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 6.119727] Modules linked in: cmac bnep joydev sunrpc hid_multitouch iwlmvm wacom snd_hda_codec_realtek snd_hda_codec_generic uvcvideo ledtrig_audio snd_hda_codec_hdmi mac80211 snd_hda_intel snd_intel_dspcfg edac_mce_amd snd_hda_codec videobuf2_vmalloc videobuf2_memops kvm_amd videobuf2_v4l2 snd_hda_core libarc4 videobuf2_common vfat snd_hwdep kvm btusb fat videodev iwlwifi btrtl btbcm btintel snd_seq irqbypass mc bluetooth snd_seq_device hid_sensor_accel_3d rapl hid_sensor_trigger snd_pcm hid_sensor_iio_common cfg80211 industrialio_triggered_buffer kfifo_buf industrialio ecdh_generic ecc snd_timer sp5100_tco pcspkr wmi_bmof k10temp ideapad_laptop i2c_piix4 snd soundcore sparse_keymap rfkill i2c_amd_mp2_plat i2c_amd_mp2_pci acpi_cpufreq binfmt_misc zram ip_tables amdgpu hid_sensor_hub crct10dif_pclmul iommu_v2 crc32_pclmul gpu_sched crc32c_intel i2c_algo_bit ttm ghash_clmulni_intel serio_raw drm_kms_helper cec drm nvme ccp nvme_core wmi video pinctrl_amd i2c_hid fuse
[ 6.119754] CR2: 0000000000000000
[ 6.119756] ---[ end trace d200887f2f7aa7bc ]---
[ 6.119758] RIP: 0010:amd_iommu_int_thread+0x16c/0x410
[ 6.119761] Code: d2 31 ff 66 44 89 54 24 14 0f b6 ec 45 0f b7 e4 89 ee e8 b7 8b ef ff 49 89 c7 48 85 c0 0f 84 2a 01 00 00 48 8b 80 90 03 00 00 <48> 8b 78 38 48 85 ff 74 18 48 83 c7 48 48 c7 c6 10 fc 0e aa e8 8b
[ 6.119762] RSP: 0018:ffffb41f0037fe38 EFLAGS: 00010286
[ 6.119764] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffa9671800
[ 6.119766] RDX: ffff938720f099b8 RSI: ffff938720024000 RDI: 0000000000000000
[ 6.119767] RBP: 0000000000000000 R08: ffff93872083b6a0 R09: 0000000000000000
[ 6.119769] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 6.119770] R13: 00000000fff20b40 R14: 0000000000000050 R15: ffff938720024000
[ 6.119772] FS: 0000000000000000(0000) GS:ffff9387232c0000(0000) knlGS:0000000000000000
[ 6.119774] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 6.119775] CR2: ffffffffffffffd6 CR3: 000000031a95c000 CR4: 00000000003506e0
[ 6.119777] Fixing recursive fault but reboot is needed!
I already filed a bug report on Bugzilla, but I have two questions I would like to ask:
First, would it be possible to get more logs? Or does any of you have an idea of what I can look for?
Moreover, despite this error, the only problem I see when I use the laptop is that I can’t resume from suspend (so the issue may come from amdgpu?).
The only solution I have found so far is to boot with “amd_iommu=off”. With this change I don’t get any error, since the IOMMU is deactivated, and I can resume from suspend.
I tried a lot of modifications (disabling the IOMMU in the BIOS, several IOMMU configurations) but none worked.
My second question: is it better to leave the IOMMU as it is by default, or to use “amd_iommu=off”? I understand that the IOMMU is important from a security point of view, but despite all my research I’m not sure whether it’s critical or not. Moreover, I’m not even sure that it is working when I get the kernel panic (see the message “Fixing recursive fault but reboot is needed!”).
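One way to check whether the IOMMU is actually in use after boot is sketched below. It rests on two assumptions that are not from my logs: that the kernel exposes active IOMMU groups as numbered directories under /sys/kernel/iommu_groups, and that an empty or missing directory means no device is being translated. The function names are made up for this example.

```python
from pathlib import Path


def count_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return the number of IOMMU group directories under `root` (0 if missing)."""
    base = Path(root)
    if not base.is_dir():
        return 0
    return sum(1 for entry in base.iterdir() if entry.is_dir())


def iommu_cmdline_params(cmdline):
    """Return the IOMMU-related parameters found in a kernel command line string."""
    prefixes = ("amd_iommu=", "iommu=", "ivrs_ioapic")
    return [tok for tok in cmdline.split() if tok.startswith(prefixes)]


if __name__ == "__main__":
    print(f"IOMMU groups found: {count_iommu_groups()}")
    cmdline_file = Path("/proc/cmdline")
    if cmdline_file.is_file():
        # Show which IOMMU options the running kernel was booted with.
        print("IOMMU boot parameters:", iommu_cmdline_params(cmdline_file.read_text()))
```

Comparing the group count between a default boot and a boot with “amd_iommu=off” should give a rough indication of whether the IOMMU is doing anything in the default case. It does not say whether the IOMMU is still healthy after the oops.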
Edit 1: using “iommu=soft” also seems to solve the issue, but when I look at dmesg I see the same thing as with “amd_iommu=off”: an error initializing IOMMUv2, and the device “1002:15dd” (I guess the IOMMU device) is not added due to errors.
Edit 3: When booting with ivrs_ioapic[4]=00:14.0 ivrs_ioapic[5]=00:00.2
the issue doesn’t happen anymore, but my touchpad doesn’t work. dmesg gives:
[ +0,005670] iommu ivhd0: AMD-Vi: Event logged [INVALID_DEVICE_REQUEST device=00:00.1 pasid=0x00000 address=0xfffffffdf8250200 flags=0x0a00]
[ +0,027567] i2c_amd_mp2 AMDI0011:00: initial bus enable failed
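To pull the fields out of such AMD-Vi event lines when comparing several boots, something like the following sketch could help. The regular expression is modeled only on the single event line above, so other kernel versions or event types may format the line differently. The names EVENT_RE and parse_amd_vi_events are made up for this example.

```python
import re

# Pattern modeled on the one "Event logged" line quoted in this post (an assumption,
# not an official format description).
EVENT_RE = re.compile(
    r"AMD-Vi: Event logged \[(?P<type>\w+) device=(?P<device>[0-9a-f:.]+) "
    r"pasid=(?P<pasid>0x[0-9a-f]+) address=(?P<address>0x[0-9a-f]+) "
    r"flags=(?P<flags>0x[0-9a-f]+)\]"
)


def parse_amd_vi_events(text):
    """Return one dict per AMD-Vi event found in `text` (e.g. saved dmesg output)."""
    return [match.groupdict() for match in EVENT_RE.finditer(text)]


SAMPLE = (
    "[ +0,005670] iommu ivhd0: AMD-Vi: Event logged [INVALID_DEVICE_REQUEST "
    "device=00:00.1 pasid=0x00000 address=0xfffffffdf8250200 flags=0x0a00]"
)

if __name__ == "__main__":
    for event in parse_amd_vi_events(SAMPLE):
        print(event["type"], event["device"], event["address"])
```

Running the same function over the saved dmesg output of each boot configuration makes it easier to see which device ID the events point to.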
Edit 4: I narrowed the issue down to the module “i2c_amd_mp2_plat”. When I blacklist this module, I don’t have the issue anymore (this module is also what prevents resuming from suspend). However, my touchpad and touchscreen stop working when it is blacklisted. So I think my issue comes from my touchpad driver (in combination with the kernel IOMMU configuration).
Thank you in advance for your answers,
See you,
Rémy