Fedora 37 kernel 6.1 issue with Lenovo dock

This thread, about a regression affecting MST hubs (DisplayPort Multi-Stream Transport), seems likely to describe the root cause.

I have a similar issue with the Lenovo YOGA Slim 7 Pro after upgrading today. I am currently running kernel 6.1.6-200.fc37.x86_64 on Fedora Linux 37 (Workstation Edition). I didn’t have these problems this morning, before the upgrade.

Thanks for this. I first assumed that #2171 was only about AMD GPUs, but when looking through https://gitlab.freedesktop.org/superm1/linux/-/commit/2145b4de3fea9908cda6bef0693a797cc7f4ddfc, which was referenced in Displays behind MST hubs non-functional (regression in kernel 6.1) (#2171) · Issues · drm / amd · GitLab, I found that i915-related code was also changed. I then searched for the string MST in /var/log/messages. Result:
Lines containing this string occurred only while the system was running kernel 6.1.
Sample lines:

Jan 18 08:32:46 kernel: i915 0000:00:02.0: [drm] *ERROR* Step 2 of creating MST payload for 00000000e179ad29 failed: -5
Jan 19 17:31:19 kernel: i915 0000:00:02.0: [drm] Failed to create MST payload for port 00000000f6eacae7: -110
Jan 19 17:31:19 kernel: i915 0000:00:02.0: [drm] *ERROR* Failed to create MST payload for DP-6: -110

Lines of the first type (“Step 2 of creating MST payload…”) appeared most often (total 15 times), whereas each of the other two lines appeared only once.
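For anyone who wants to repeat this count on their own machine, here is a minimal sketch. It assumes syslog-style lines containing " kernel: " as in the samples above; the function name and the `<id>` / `DP-<n>` placeholders are made up for illustration. It groups MST lines by message type so that differing port IDs and connector numbers do not split the counts.

```python
import re
from collections import Counter

# Parts of the message that differ from line to line (assumed patterns).
_HEX_ID = re.compile(r"\b[0-9a-f]{16}\b")   # e.g. 00000000e179ad29
_CONNECTOR = re.compile(r"\bDP-\d+\b")      # e.g. DP-6


def count_mst_messages(lines):
    """Count kernel log lines mentioning MST, grouped by normalized message."""
    counts = Counter()
    for line in lines:
        if " kernel: " not in line or "MST" not in line:
            continue
        message = line.split(" kernel: ", 1)[1].strip()
        message = _HEX_ID.sub("<id>", message)
        message = _CONNECTOR.sub("DP-<n>", message)
        counts[message] += 1
    return counts


# Demo on three of the lines quoted in this thread.
sample = [
    "Jan 18 08:32:46 kernel: i915 0000:00:02.0: [drm] *ERROR* Step 2 of creating MST payload for 00000000e179ad29 failed: -5",
    "Jan 19 17:31:19 kernel: i915 0000:00:02.0: [drm] Failed to create MST payload for port 00000000f6eacae7: -110",
    "Jan 19 17:31:19 kernel: i915 0000:00:02.0: [drm] *ERROR* Failed to create MST payload for DP-6: -110",
]
for message, n in count_mst_messages(sample).most_common():
    print(f"{n:4d}  {message}")
```

To run it against the real log, pass an open file instead of the sample list, e.g. `count_mst_messages(open("/var/log/messages", errors="replace"))`; reading that file usually needs root.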

I installed 6.1.7-200.fc37.x86_64 this morning:
reboot system boot 6.1.7-200.fc37.x Mon Jan 23 09:47 still running
So far I have had no problems with the external monitors connected to the Thunderbolt dock. I still had some MST lines in /var/log/messages, though:

Jan 23 09:48:46 kernel: i915 0000:00:02.0: [drm] Failed to create MST payload for port 00000000835ea903: -110
Jan 23 09:48:46 kernel: i915 0000:00:02.0: [drm] *ERROR* Failed to create MST payload for DP-6: -110
Jan 23 15:05:45 kernel: i915 0000:00:02.0: [drm] Failed to create MST payload for port 00000000f0045e1e: -110
Jan 23 15:05:45 kernel: i915 0000:00:02.0: [drm] *ERROR* Failed to create MST payload for DP-8: -110

Regarding the ACPI: Added _OSI lines: With kernel 6.0.18-300.fc37, there were:

Jan 23 09:32:15 kernel: ACPI: Added _OSI(Module Device)
Jan 23 09:32:15 kernel: ACPI: Added _OSI(Processor Device)
Jan 23 09:32:15 kernel: ACPI: Added _OSI(3.0 _SCP Extensions)
Jan 23 09:32:15 kernel: ACPI: Added _OSI(Processor Aggregator Device)
Jan 23 09:32:15 kernel: ACPI: Added _OSI(Linux-Dell-Video)
Jan 23 09:32:15 kernel: ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)
Jan 23 09:32:15 kernel: ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)

But with kernel 6.1.7-200.fc37, there are fewer:

Jan 23 09:46:48 kernel: ACPI: Added _OSI(Module Device)
Jan 23 09:46:48 kernel: ACPI: Added _OSI(Processor Device)
Jan 23 09:46:48 kernel: ACPI: Added _OSI(3.0 _SCP Extensions)
Jan 23 09:46:48 kernel: ACPI: Added _OSI(Processor Aggregator Device)

So these lines are similar to those with kernel versions 6.1.5 and 6.1.6.
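The comparison of the two boots can be done mechanically. This is only a sketch: it assumes the "ACPI: Added _OSI(...)" line format shown above, and the helper name is hypothetical. It collects the _OSI strings from each boot and prints the ones that are missing in the second.

```python
import re

# Matches the text inside "ACPI: Added _OSI(...)" (format taken from the logs above).
_OSI = re.compile(r"ACPI: Added _OSI\((.+)\)\s*$")


def osi_strings(lines):
    """Return the set of _OSI strings announced in the given log lines."""
    found = set()
    for line in lines:
        match = _OSI.search(line)
        if match:
            found.add(match.group(1))
    return found


boot_6_0_18 = [
    "Jan 23 09:32:15 kernel: ACPI: Added _OSI(Module Device)",
    "Jan 23 09:32:15 kernel: ACPI: Added _OSI(Processor Device)",
    "Jan 23 09:32:15 kernel: ACPI: Added _OSI(3.0 _SCP Extensions)",
    "Jan 23 09:32:15 kernel: ACPI: Added _OSI(Processor Aggregator Device)",
    "Jan 23 09:32:15 kernel: ACPI: Added _OSI(Linux-Dell-Video)",
    "Jan 23 09:32:15 kernel: ACPI: Added _OSI(Linux-Lenovo-NV-HDMI-Audio)",
    "Jan 23 09:32:15 kernel: ACPI: Added _OSI(Linux-HPI-Hybrid-Graphics)",
]
boot_6_1_7 = [
    "Jan 23 09:46:48 kernel: ACPI: Added _OSI(Module Device)",
    "Jan 23 09:46:48 kernel: ACPI: Added _OSI(Processor Device)",
    "Jan 23 09:46:48 kernel: ACPI: Added _OSI(3.0 _SCP Extensions)",
    "Jan 23 09:46:48 kernel: ACPI: Added _OSI(Processor Aggregator Device)",
]

# Strings present with 6.0.18 but absent with 6.1.7.
for name in sorted(osi_strings(boot_6_0_18) - osi_strings(boot_6_1_7)):
    print(name)
```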

However, after reading Displays behind MST hubs non-functional (regression in kernel 6.1) (#2171) · Issues · drm / amd · GitLab, I tried the following:

  • disconnect the dock from the laptop and wait some seconds
  • connect the dock again
    Result: The monitors went to sleep and even when disconnecting the monitor cables from the dock and connecting again, they did not come back from sleep.

So what is working with this kernel?

  • lock screen with l
  • switch off the two monitors
  • wait some minutes (e.g. until returning from a break)
  • log in again
  • turn on the monitors
    This will move all windows to their previous locations on all three screens.

Saw this thread after posting my own; it looks like I have the same issue. But it’s not limited to Lenovo’s dock, as I have the same issues with an i-tec dock, too.

see also https://discussion.fedoraproject.org/t/lenovo-t14s-issue-with-multiple-monitors-after-update-last-night/75854

same issues also with Kernel 6.1.7

Same problem here. ThinkPad Z13 and Lenovo Docking Station. I also get a kernel warning:

WARNING: CPU: 7 PID: 2875 at drivers/gpu/drm/amd/amdgpu/…/display/dc/core/dc_link.c:3533 update_mst_stream_alloc_table+0x129/0x130 [amdgpu]

Kernel 6.1.7.200

same problem:

kernel 6.1.5 2/4 ext displays broken
kernel 6.1.6 4/4 ext displays working
kernel 6.1.7 2/4 ext displays broken

[  309.658012] [drm:dc_link_allocate_mst_payload [amdgpu]] *ERROR* Failure: pbn_per_slot==0 not allowed. Cannot continue, returning DC_UNSUPPORTED_VALUE.
[  309.936290] [drm:dc_link_allocate_mst_payload [amdgpu]] *ERROR* Failure: pbn_per_slot==0 not allowed. Cannot continue, returning DC_UNSUPPORTED_VALUE
[   28.279047] ------------[ cut here ]------------
[   28.279053] WARNING: CPU: 8 PID: 2184 at drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link.c:3533 update_mst_stream_alloc_table+0x129/0x130 [amdgpu]
[   28.279472] Modules linked in: rfcomm snd_seq_dummy snd_hrtimer michael_mic nft_objref nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink bnep sunrpc qrtr_mhi vfat fat btusb btrtl btbcm uvcvideo btintel btmtk videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 bluetooth videobuf2_common snd_usb_audio videodev snd_usbmidi_lib snd_rawmidi mc intel_rapl_msr qrtr ath11k_pci snd_soc_dmic snd_soc_acp6x_mach snd_acp6x_pdm_dma intel_rapl_common ath11k snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_hda_codec_realtek snd_sof_pci snd_hda_codec_generic snd_hda_codec_hdmi snd_sof edac_mce_amd qmi_helpers snd_hda_intel snd_sof_utils snd_soc_core snd_intel_dspcfg snd_intel_sdw_acpi snd_compress mac80211 ac97_bus kvm_amd snd_hda_codec snd_pcm_dmaengine snd_hda_core snd_pci_ps snd_rpl_pci_acp6x
[   28.279526]  libarc4 snd_pci_acp6x snd_hwdep kvm snd_seq asus_wmi snd_seq_device ledtrig_audio irqbypass sparse_keymap cfg80211 snd_pcm rapl platform_profile snd_timer joydev snd_pci_acp5x wmi_bmof snd_rn_pci_acp3x snd snd_acp_config rfkill snd_soc_acpi k10temp i2c_piix4 thunderbolt snd_pci_acp3x mhi soundcore amd_pmc acpi_cpufreq acpi_tad zram r8152 cdc_mbim cdc_wdm cdc_ncm cdc_ether usbnet mii amdgpu drm_ttm_helper ttm rtsx_pci_sdmmc iommu_v2 gpu_sched mmc_core nvme video uas ucsi_acpi drm_buddy hid_multitouch crct10dif_pclmul crc32_pclmul nvme_core crc32c_intel polyval_clmulni polyval_generic drm_display_helper ghash_clmulni_intel sha512_ssse3 typec_ucsi serio_raw usb_storage sp5100_tco ccp rtsx_pci cec typec nvme_common wmi i2c_hid_acpi i2c_hid ip6_tables ip_tables fuse
[   28.279587] CPU: 8 PID: 2184 Comm: gnome-shell Not tainted 6.1.7-200.fc37.x86_64 #1
[   28.279591] Hardware name: OriginPC Voyager a1600/Voyager a1600, BIOS 1.30 08/02/2022
[   28.279592] RIP: 0010:update_mst_stream_alloc_table+0x129/0x130 [amdgpu]
[   28.279953] Code: e8 03 89 c1 f3 48 a5 48 81 c4 90 00 00 00 5b 5d 41 5c c3 cc cc cc cc 41 0f b7 40 04 4d 89 19 49 89 59 08 66 41 89 41 10 eb 87 <0f> 0b e9 14 ff ff ff 0f 1f 44 00 00 55 48 89 fd 53 bb 0a 00 00 00
[   28.279955] RSP: 0018:ffffb09889fb7608 EFLAGS: 00010202
[   28.279959] RAX: 0000000000000002 RBX: 0000000000000000 RCX: 0000000000000000
[   28.279961] RDX: 0000000000000000 RSI: ffffb09889fb7608 RDI: ffffb09889fb7698
[   28.279962] RBP: ffff8f1d29980aa0 R08: ffffb09889fb76c0 R09: ffffb09889fb7440
[   28.279964] R10: ffff8f1cddca1c00 R11: ffff8f1cd44ce9c0 R12: 0000000000000002
[   28.279965] R13: ffff8f1ceed73000 R14: ffffffffc0e858e0 R15: ffff8f1ccdd04080
[   28.279967] FS:  00007fdfd047f5c0(0000) GS:ffff8f25d6800000(0000) knlGS:0000000000000000
[   28.279969] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   28.279971] CR2: 000055c41912515c CR3: 0000000107760000 CR4: 0000000000750ee0
[   28.279973] PKRU: 55555554
[   28.279974] Call Trace:
[   28.279979]  <TASK>
[   28.279985]  dc_link_allocate_mst_payload+0x85/0x280 [amdgpu]
[   28.280382]  core_link_enable_stream+0x780/0x930 [amdgpu]
[   28.280722]  dce110_apply_ctx_to_hw+0x649/0x6f0 [amdgpu]
[   28.281061]  dc_commit_state_no_check+0x37e/0xc70 [amdgpu]
[   28.281388]  ? dc_validate_global_state+0x2b0/0x3e0 [amdgpu]
[   28.281746]  dc_commit_state+0x92/0x110 [amdgpu]
[   28.282076]  amdgpu_dm_atomic_commit_tail+0x4a0/0x2a90 [amdgpu]
[   28.282427]  ? load_balance+0x17d/0xdd0
[   28.282435]  ? update_load_avg+0x7e/0x780
[   28.282438]  ? __cgroup_account_cputime+0x4c/0x70
[   28.282444]  ? psi_group_change+0x15f/0x380
[   28.282449]  ? _raw_spin_unlock+0x15/0x30
[   28.282454]  ? finish_task_switch.isra.0+0x9b/0x300
[   28.282458]  ? __switch_to+0x106/0x420
[   28.282463]  ? __schedule+0x367/0x1360
[   28.282469]  ? get_nohz_timer_target+0x18/0x190
[   28.282473]  ? schedule+0x67/0xe0
[   28.282477]  ? schedule_timeout+0xfa/0x140
[   28.282479]  ? preempt_count_add+0x6a/0xa0
[   28.282482]  ? preempt_count_add+0x6a/0xa0
[   28.282485]  ? _raw_spin_lock_irq+0x19/0x40
[   28.282488]  ? _raw_spin_unlock_irq+0x1b/0x40
[   28.282490]  ? wait_for_completion_timeout+0x12a/0x140
[   28.282494]  ? wait_for_completion_interruptible+0x111/0x1b0
[   28.282498]  ? __bpf_trace_dma_fence+0x10/0x10
[   28.282507]  commit_tail+0x94/0x130
[   28.282512]  drm_atomic_helper_commit+0x112/0x140
[   28.282516]  drm_atomic_commit+0x67/0xd0
[   28.282521]  ? drm_plane_get_damage_clips.cold+0x1c/0x1c
[   28.282525]  drm_mode_atomic_ioctl+0x93d/0xb80
[   28.282532]  ? drm_atomic_set_property+0xbb0/0xbb0
[   28.282535]  drm_ioctl_kernel+0xa9/0x150
[   28.282541]  drm_ioctl+0x1e7/0x450
[   28.282545]  ? drm_atomic_set_property+0xbb0/0xbb0
[   28.282550]  amdgpu_drm_ioctl+0x4a/0x80 [amdgpu]
[   28.282819]  __x64_sys_ioctl+0x90/0xd0
[   28.282825]  do_syscall_64+0x5b/0x80
[   28.282830]  ? exit_to_user_mode_prepare+0x18f/0x1f0
[   28.282835]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   28.282839] RIP: 0033:0x7fdfd3f23d6f
[   28.282874] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
[   28.282876] RSP: 002b:00007ffd2c7fd770 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   28.282879] RAX: ffffffffffffffda RBX: 000055c41b9dcc40 RCX: 00007fdfd3f23d6f
[   28.282881] RDX: 00007ffd2c7fd810 RSI: 00000000c03864bc RDI: 000000000000000a
[   28.282882] RBP: 00007ffd2c7fd810 R08: 0000000000000013 R09: 0000000000000013
[   28.282884] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c03864bc
[   28.282885] R13: 000000000000000a R14: 000055c4199d5d00 R15: 000055c41be3c7b0
[   28.282889]  </TASK>
[   28.282890] ---[ end trace 0000000000000000 ]---
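When reading a trace like this, it can help to drop the frames prefixed with "?", since those are addresses the kernel found on the stack but could not confirm as part of the actual call chain. A rough sketch, assuming the dmesg line format shown above (the function name is made up):

```python
import re

# "[ time]  [? ]symbol+0xoff/0xsize [module]" as printed in the trace above.
_FRAME = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(\?\s+)?([A-Za-z_][\w.]*)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?\s*$"
)


def reliable_frames(trace_lines):
    """Return (symbol, module) for call-trace frames not marked with '?'."""
    frames = []
    for line in trace_lines:
        match = _FRAME.match(line)
        if match and not match.group(1):
            frames.append((match.group(2), match.group(3)))
    return frames


# A few lines copied from the trace in this post.
trace = [
    "[   28.279979]  <TASK>",
    "[   28.279985]  dc_link_allocate_mst_payload+0x85/0x280 [amdgpu]",
    "[   28.280382]  core_link_enable_stream+0x780/0x930 [amdgpu]",
    "[   28.281388]  ? dc_validate_global_state+0x2b0/0x3e0 [amdgpu]",
    "[   28.282454]  ? finish_task_switch.isra.0+0x9b/0x300",
    "[   28.282507]  commit_tail+0x94/0x130",
]
for symbol, module in reliable_frames(trace):
    print(symbol, f"[{module}]" if module else "")
```

Applied to the full trace, this leaves the path from the ioctl entry down to dc_link_allocate_mst_payload, which is the part worth quoting in an upstream report.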

RX 6800M on corsair voyager a1600

03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 22 [Radeon RX 6700/6700 XT/6750 XT / 6800M] (rev c3)
68:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Rembrandt [Radeon 680M] (rev c8)

6.1.8-200 Problem is still present … So when can we expect a fix for this problem?

The fix will come from upstream, most likely in the kernel. To install a newer kernel, you have a few options, ordered from most to least likely to be supported:

This all comes with the caveat that none of these methods is guaranteed to fix your problem. You might be more likely to introduce new problems by installing kernels that haven’t gone through full community testing, and you might get limited community support for a custom kernel if you go down the compile-your-own route. I’m not discouraging you from doing this, as I’ve definitely done it all myself when waiting for some fix or patch to land for my hardware, but it’s good to have a sort of “this hiking trail doesn’t have guard rails” sign up front.

Does someone maybe have a proposal for which diagnostics information we can/should collect for solving problems with external monitors connected to docking stations, and where to report such data?
BTW - Even with the 6.0.18-300 kernel, the two external monitors sometimes do not display content after the system comes back from energy saving (this can be solved by disconnecting the docking station and connecting it again). Unfortunately, I don’t remember, and don’t have any data about, which kernel was the last one without such problems. I believe these problems started some time in the second half of 2022.
I’d be happy to spend some time on analyzing these types of problems and to work on proposals for kernel regression tests, as it seems that such tests are not yet part of KernelRegressionTests - Fedora Project Wiki.
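As one possible starting point for collecting such data, the kernel exposes the state of every display connector under /sys/class/drm. The sketch below (function name hypothetical, and only an assumption about what would be useful to attach to a report) records the status, enabled flag, and EDID size of each connector, so that snapshots taken before and after a dock replug can be compared.

```python
from pathlib import Path


def _read_text(path):
    """Return the stripped file content, or None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def connector_snapshot(drm_root="/sys/class/drm"):
    """Map each connector directory (e.g. card0-DP-6) to its current state."""
    root = Path(drm_root)
    if not root.is_dir():
        return {}
    snapshot = {}
    for entry in sorted(root.glob("card*-*")):
        try:
            edid_bytes = len((entry / "edid").read_bytes())
        except OSError:
            edid_bytes = None
        snapshot[entry.name] = {
            "status": _read_text(entry / "status"),    # connected / disconnected
            "enabled": _read_text(entry / "enabled"),  # enabled / disabled
            "edid_bytes": edid_bytes,                  # 0 when no EDID was read
        }
    return snapshot


if __name__ == "__main__":
    for name, info in connector_snapshot().items():
        print(name, info)
```

A connector that reports "connected" with an EDID size of 0, or one that never leaves "disconnected" after the dock is plugged back in, would be worth noting alongside the kernel messages from the same moment.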

I just tested it with the kernel 6.1.9. Everything seems to work.

https://bodhi.fedoraproject.org/updates/FEDORA-2023-4006357f7e

Same issue here with a Lenovo Slim 7 and a Dell Dock, running 6.1.7-200.fc37.x86_64.
Sometimes things work kinda, but other times it just won’t work.

This alone is one of the most annoying productivity bugs I have noticed during the last 3 years of using Fedora.

@siamsensei It is really very tedious. But as I understand it, it is not a Fedora-specific problem but a problem in the kernel.
For now I am using the vanilla kernel until the problem is completely fixed. With it, everything runs without problems.

https://fedoraproject.org/wiki/Kernel_Vanilla_Repositories

With kernel 6.1.9-200, the situation appears to be similar to what it was with kernel 6.0.18-300, meaning:

  • If I lock the system and switch off the monitors which are connected to the docking station, and then (after a break) switch on the monitors, the two monitors are recognized and the windows are moved back to the previous positions on the external monitors
  • If I lock the system and do not switch off the monitors, the monitors sometimes remain black after unlocking the screen. In this case, I disconnect the docking station and reconnect it again, which causes one of the following:
    • the windows are moved back to the previous positions on the external monitors
    • the monitors remain black and are not recognized at all, even after disconnecting the monitors from the docking station. In this case, a reboot does not help: After a reboot, the two external monitors are still not recognized. The only action which helps in this case is to shut down and power off the system and start it again with the power button.

If anyone can think of any useful data I could collect to analyze such behavior, I’d be more than happy to do that.

I can postulate the reason why the monitors are recognized after being powered off but not when left energized.

The system configures a monitor when it sees the EDID data that is sent by the monitor at initial configuration, which happens when the monitor is powered on. That data may not be sent if the monitor is left powered on and just disconnected. The monitor itself may be at issue.
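One way to test this hypothesis is to check whether the kernel actually holds EDID data for the monitor after a replug. The sketch below validates a base EDID block using three facts from the EDID format: the fixed 8-byte header, the checksum over the first 128 bytes, and the 3-letter manufacturer ID packed into bytes 8 and 9. The function name is hypothetical.

```python
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])


def describe_edid(blob):
    """Summarize the 128-byte base EDID block, or return None if it is missing."""
    if len(blob) < 128:
        return None  # an empty blob means the kernel has no EDID for this connector
    base = blob[:128]
    # Bytes 8-9 hold three 5-bit letters (1 = 'A'), most significant first.
    word = (base[8] << 8) | base[9]
    letters = [(word >> shift) & 0x1F for shift in (10, 5, 0)]
    manufacturer = "".join(
        chr(ord("A") + value - 1) if 1 <= value <= 26 else "?" for value in letters
    )
    return {
        "header_ok": base[:8] == EDID_HEADER,
        "checksum_ok": sum(base) % 256 == 0,  # all 128 bytes must sum to 0 mod 256
        "manufacturer": manufacturer,
    }


# Usage (the connector name is only an example, yours will differ):
#   with open("/sys/class/drm/card0-DP-6/edid", "rb") as f:
#       print(describe_edid(f.read()))
```

If this returns None for a monitor that is plugged in and switched on, the EDID never arrived at the GPU driver, which would point at the dock or MST path rather than at the desktop environment.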

I tried out 6.2.0-0.rc7.249.vanilla.fc37, and with that one most of the problems that occur when the monitors and the docking station are connected disappeared. One exception occurred just a few minutes ago, when I started the laptop while the docking station and the two monitors were connected and the monitors were switched on: after logging in, the monitors went into power-saving mode, but the mouse did not stop at the top of the laptop monitor, meaning all three screens were still managed by the window system.
Also, earlier today, with the same kernel, the following always happened: right after disconnecting the docking station from the laptop (with two monitors connected and switched on), some of the windows were shown partially on the laptop screen, and the mouse pointer disappeared and did not appear again for at least a minute. So I had to hard power off the laptop.

Is there maybe a list of hardware combinations of laptop and docking station for two monitors for kernel 6.1 or 6.2 which does not have such problems? Or a docking station which is known to work with Lenovo ThinkPad X1 when running these kernels?

I just ran some docking station unplug/plug tests with kernels 6.2.0-0.rc7.249.vanilla.fc37 and 6.0.18-300.fc37, using the logger command to mark the start and end events in the /var/log/messages file.
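For anyone who wants to repeat this kind of test, here is a small sketch of how the marked section can be pulled out of the log afterwards. The marker strings are made up for this example (they would be written with something like `logger DOCK-UNPLUG-START` before and `logger DOCK-UNPLUG-END` after the test), and the function name is hypothetical.

```python
def lines_between_markers(lines, start_marker, end_marker):
    """Return the log lines strictly between the first start and next end marker."""
    segment = []
    inside = False
    for line in lines:
        if not inside:
            if start_marker in line:
                inside = True
            continue
        if end_marker in line:
            break
        segment.append(line.rstrip("\n"))
    return segment


# Demo with made-up marker lines around one real line from this thread.
log = [
    "Feb  7 23:43:00 user[1234]: DOCK-UNPLUG-START\n",
    "Feb  7 23:43:09 kernel: i915 0000:00:02.0: [drm] *ERROR* [ENCODER:350:DDI TC3/PHY TC3][DPRX] Failed to enable link training\n",
    "Feb  7 23:43:20 user[1234]: DOCK-UNPLUG-END\n",
    "Feb  7 23:43:21 kernel: unrelated line\n",
]
for line in lines_between_markers(log, "DOCK-UNPLUG-START", "DOCK-UNPLUG-END"):
    print(line)
```

Extracting the same marked section under each kernel gives two short files that can be compared with an ordinary diff tool.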

After unplugging the cable to the docking station, most of the error messages were identical, but with kernel 6.2.0-0.rc7.249.vanilla.fc37, the following additional lines occurred after several kded5[<PID>]: colord: EDID ICC Profile already exists messages and before several kded5[<PID>]: colord: Failed to register device: "device id 'xrandr-California Institute of Technology' already exists" messages:

165155 Feb  7 23:43:09 kernel: i915 0000:00:02.0: [drm] *ERROR* [ENCODER:350:DDI TC3/PHY TC3][DPRX] Failed to enable link training
165156 Feb  7 23:43:10 kernel: i915 0000:00:02.0: [drm] *ERROR* Step 2 of creating MST payload for 00000000dffa0fa4 failed: -5

After plugging in the cable, there were more differences. The most significant difference for me was that with kernel 6.2.0-0.rc7.249.vanilla.fc37, near the end of the reconfiguration, some lines after this block (which was identical for both kernels):

346 Feb  8 00:12:02 kernel: thunderbolt 1-1: new device found, vendor=0x108 device=0x1630
347 Feb  8 00:12:02 kernel: thunderbolt 1-1: Lenovo ThinkPad Thunderbolt 3 Dock

the following additional lines appeared:

389 Feb  7 23:43:47 kernel: i915 0000:00:02.0: [drm] *ERROR* Failed to get ACT after 3000ms, last status: 00
390 Feb  7 23:43:50 kernel: i915 0000:00:02.0: [drm] *ERROR* Failed to get ACT after 3000ms, last status: 00
391 Feb  7 23:43:54 kernel: i915 0000:00:02.0: [drm] *ERROR* Failed to get ACT after 3000ms, last status: 00
392 Feb  7 23:43:54 kernel: i915 0000:00:02.0: [drm] *ERROR* Step 2 of creating MST payload for 000000003b027738 failed: -5
393 Feb  7 23:43:57 kernel: i915 0000:00:02.0: [drm] *ERROR* Failed to get ACT after 3000ms, last status: 00

Of note, 6.1.9 has brought in patches that have resolved my issues above.

Yeah, 6.1.9 fixed it as well on my T14s gen2 using 3 Len-T24 FHD monitors on a Lenovo USB 3 dock gen2.