Fedora 34 freezes

Got it, @chrismurphy. I’ll play around with the timezone settings in Windows.

Do you mean just install memtest and leave it running overnight, or first overclock the CPU/RAM and then start memtest?

As for today’s freeze: I saw that the clock had stopped updating, but I could still move the pointer, and at that moment I could switch to tty3, which prompted for login as expected. But by the time I started typing, it was completely frozen. I’m trying to give as much information about the freeze behavior as possible.

Returning to netconsole: it seems I’ve set it up properly. It starts at boot and runs successfully.

Note: Firewall is temporarily disabled.

Here’s the output of dmesg | grep netcon:

[   10.535605] netpoll: netconsole: local port 6666
[   10.535610] netpoll: netconsole: local IPv4 address 192.168.1.100
[   10.535612] netpoll: netconsole: interface 'wlo1'
[   10.535613] netpoll: netconsole: remote port 6666
[   10.535614] netpoll: netconsole: remote IPv4 address 192.168.1.163
[   10.535616] netpoll: netconsole: remote ethernet address 8c:89:...
[   10.535875] Modules linked in: netconsole(+) overlay bnep sunrpc squashfs vfat fat loop snd_sof_pci_intel_cnl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_soc_skl snd_soc_hdac_hda snd_hda_ext_core snd_soc_sst_ipc snd_soc_sst_dsp snd_soc_acpi_intel_match snd_soc_acpi snd_soc_core snd_hda_codec_realtek intel_tcc_cooling x86_pkg_temp_thermal snd_hda_codec_generic intel_powerclamp snd_compress coretemp ledtrig_audio iwlmvm snd_hda_codec_hdmi snd_pcm_dmaengine ac97_bus iTCO_wdt kvm_intel intel_pmc_bxt snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi mac80211 ee1004 iTCO_vendor_support mei_hdcp ucsi_ccg intel_rapl_msr snd_hda_codec libarc4 kvm uvcvideo snd_hda_core btusb btrtl snd_hwdep videobuf2_vmalloc iwlwifi irqbypass videobuf2_memops snd_seq btbcm videobuf2_v4l2 btintel rapl videobuf2_common snd_seq_device dcdbas intel_cstate bluetooth i2c_i801 intel_uncore cfg80211 pcspkr
[   10.535935]  write_msg+0xd8/0xf0 [netconsole]
[   10.535936]  init_netconsole+0x20e/0x1000 [netconsole]
[   10.545436] printk: console [netcon0] enabled
[   10.545440] netconsole: network logging started

So in this case the remote PC is running Windows with nc -l -u -p 6666 > netconsole.txt, but the file stays empty when I plug a drive into the problematic laptop (which triggers kernel logs in /var/log/messages). I can see these messages on the host laptop, but they aren’t transferred to my remote PC, even though I can ping it.

I’ve also tried this on my other laptop running F33, changing the IP accordingly before testing. Still no luck…
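Two things may be worth ruling out here; this is a sketch of checks, not a diagnosis. First, netconsole is a kernel console, so it only receives messages that pass the console log level, and messages that show up in /var/log/messages can still be filtered out before they reach the network. Second, the grep output above contains fragments of a kernel trace through write_msg on the wlo1 interface, and netconsole over Wi-Fi may simply not work with some wireless drivers, so repeating the test over the Ethernet adapter could be informative. The helper name console_loglevel below is made up for illustration.

```shell
#!/bin/sh
# Sketch only: console_loglevel is a hypothetical helper name.

# Print the current console log level, which is the first field of the
# printk sysctl. The file path is a parameter so it can be tried on a copy.
console_loglevel() {
    awk '{ print $1; exit }' "${1:-/proc/sys/kernel/printk}"
}

# Typical use on the laptop (run manually, needs root):
#   console_loglevel                               # a low value hides info-level messages
#   sudo dmesg -n 8                                # let every message reach the consoles
#   echo "netconsole test" | sudo tee /dev/kmsg    # emit a kernel log line on demand
```

If the test line written to /dev/kmsg still does not arrive on the remote PC after raising the log level, the problem is more likely in the network path or the driver than in the log level.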

I found why mine was freezing. I don’t get any freezes on Fedora 34 over an Ethernet connection. But when I unplug the cable and connect to Wi-Fi, it freezes after about 15 minutes. That tells me there is a problem with the Wi-Fi driver in the kernel. My driver was only recently added to the kernel; it wasn’t there for a long while. As long as I don’t use Wi-Fi, my system doesn’t freeze.

Yeah, interesting case you had.

I use Wi-Fi most of the time. Sometimes I get no freezes for a whole day, and my workflow is pretty much the same from day to day. The problem is that we can’t trace the source of the freezes in my case. The logs don’t seem to give much information to debug with.

I also use an Ethernet connection through an adapter, and the laptop doesn’t freeze after any specific time. It’s very random at this point, and we don’t have enough information.

I don’t know why, but I’m linking the freezes with the USB ports losing power, as far as I understand from the logs. There are also errors about Nvidia drivers not being installed, and one more error saying that this machine isn’t a Dell (something related to the BIOS). I can’t tell what the point is here…

I’ll contact Razer support in my free time and try to install the Nvidia drivers.

Well, it took me two days to figure out what it is. But I’m not done yet: I’ll see what I can do to use my Wi-Fi so that I can connect my other computer to the Ethernet port and share internet with it. So more work ahead for me.

When I look at the logs, I find this line for my Wi-Fi driver:
rtw_8821ce 0000:03:00.0 wlp3s0: renamed from wlan0
I’ve never seen that on Linux before. On other computers where I’ve used Wi-Fi, the interface was always named wlan0.

Hmmm, maybe the driver is doing weird things??

I came across this article by chance, not while searching specifically for my problem:
https://www.phoronix.com/scan.php?page=news_item&px=Samsung-860-870-More-Quirks

Maybe this is a very dumb connection to my issue, but I have a Samsung 980 SSD (M.2 NVMe). Maybe I should disable trim or something else and see the result.

Can this be the source of the freeze issue? @ankursinha, do you know any tool/command I can run to see whether the laptop freezes while testing?

Yeah, worth trying:

https://fedoraproject.org/wiki/Changes/EnableFSTrimTimer#User_Experience

No, without knowing what the issue is it’s hard to think of a command that will simulate the freeze :frowning:

Yeah, I see.

I’ll disable fstrim and see the result.

I believe that is called Consistent Network Device Naming and Predictable Network Interface Names. Also see the biosdevname man page.
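To make the rename less mysterious: in the path-based form of these names, wlp3s0 can be read as "wl" (wireless LAN), "p3" (PCI bus 3), and "s0" (slot 0), which lines up with the PCI address 0000:03:00.0 in the log line. A minimal sketch of that mapping follows, assuming the simple case only. The helper pci_to_ifname is hypothetical; the real logic lives in udev and also covers PCI domains, multi-function devices, and other name sources.

```shell
#!/bin/sh
# Simplified sketch of the path-based naming scheme: <prefix>p<bus>s<slot>.
# pci_to_ifname is a hypothetical helper, not part of any real tool.
pci_to_ifname() (
    prefix=$1
    addr=$2                 # e.g. 0000:03:00.0 (domain:bus:slot.function)
    rest=${addr#*:}         # drop the domain       -> 03:00.0
    bus=${rest%%:*}         # bus in hexadecimal    -> 03
    devfn=${rest#*:}        # slot.function         -> 00.0
    slot=${devfn%%.*}       # slot in hexadecimal   -> 00
    # The PCI address is hexadecimal; the interface name uses decimal.
    printf '%sp%ds%d\n' "$prefix" "0x$bus" "0x$slot"
)

pci_to_ifname wl 0000:03:00.0   # prints wlp3s0
```

So the rename is expected behavior rather than a sign that the driver is misbehaving.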

Nope, trim didn’t help. My laptop still freezes…

I ran into a similar issue where I’d get freezes with no logs etc. It looks like there’s a known bug in the 5.14.x kernels related to the I/O scheduler.

Here are the bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=2008529

https://bugzilla.kernel.org/show_bug.cgi?id=214503

As the kernel bug suggests, it could be a scheduler bug. So, can you check what your scheduler is currently?

cat /sys/block/*/queue/scheduler

If it’s bfq, that could be causing it. Try changing it to mq-deadline:

echo mq-deadline | sudo tee /sys/block/sd*/queue/scheduler

Change the sd* bit here depending on what your drives are called.
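Since cat with a wildcard prints the scheduler lines without saying which device each one belongs to, a small loop can pair them up. This is a minimal sketch; the function names active_scheduler and list_schedulers are made up for illustration, and the sysfs root is a parameter only so the loop can be tried on a test directory.

```shell
#!/bin/sh
# Print the scheduler shown in square brackets on a sysfs scheduler line,
# e.g. "[mq-deadline] kyber bfq none" -> "mq-deadline".
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# List every block device under a sysfs root together with its active scheduler.
list_schedulers() (
    root=${1:-/sys/block}
    for f in "$root"/*/queue/scheduler; do
        [ -r "$f" ] || continue
        dev=${f#"$root"/}
        dev=${dev%%/*}
        line=$(cat "$f")
        sched=$(active_scheduler "$line")
        # Some devices report a bare value with no brackets; show it as-is.
        [ -n "$sched" ] || sched=$line
        printf '%s: %s\n' "$dev" "$sched"
    done
)
```

Running list_schedulers with no argument reads /sys/block and prints one "device: scheduler" line per drive. Note that a value written with tee does not survive a reboot.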

For me, it started around 5.13.x. I hope this issue gets fixed in 5.14.x.

cat /sys/block/*/queue/scheduler:

[mq-deadline] kyber bfq none
[mq-deadline] kyber bfq none
[mq-deadline] kyber bfq none
[mq-deadline] kyber bfq none
[mq-deadline] kyber bfq none
[mq-deadline] kyber bfq none
[mq-deadline] kyber bfq none
[none] mq-deadline kyber bfq 
[none] mq-deadline kyber bfq 
none

After running echo mq-deadline | sudo tee /sys/block/nvme1n1/queue/scheduler, cat /sys/block/*/queue/scheduler returns:

[mq-deadline] kyber bfq none
[mq-deadline] kyber bfq none
[mq-deadline] kyber bfq none
[mq-deadline] kyber bfq none
[mq-deadline] kyber bfq none
[mq-deadline] kyber bfq none
[mq-deadline] kyber bfq none
[none] mq-deadline kyber bfq 
[mq-deadline] kyber bfq none
none

Only the second line from the bottom changed.

Sometimes, after I close the lid and the system goes to sleep, opening the lid wakes it for only 5-10 seconds before it goes back to sleep. No matter how many times I press a key to wake it up, the system keeps re-entering sleep mode until I reboot the machine.

Sometimes this happens when I connect the power cord.

Hard to say if that’s related to the freezes. Let’s see if changing your scheduler fixes the freezes, and then we can look at the suspend etc. issues :+1:

I’ve moved to kernel 5.14.9-200.fc34

Let’s see if I’ll still get freeze issues…

Caught two freezes in one hour yesterday :///

This is the log from the last freeze (journalctl -b 0 -e -p err):

-- Journal begins at Fri 2021-10-08 21:26:42 +04, ends at Sat 2021-10-09 01:26:41 +04. --
Oct 09 01:26:39 Fedora-RazerBlade15 kernel: x86/cpu: SGX disabled by BIOS.
Oct 09 01:26:39 Fedora-RazerBlade15 kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.I2C2.TPD0], AE_NOT_FOUND (20210604/dswload2-162)
Oct 09 01:26:39 Fedora-RazerBlade15 kernel: ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20210604/psobject-220)
Oct 09 01:26:39 Fedora-RazerBlade15 kernel: ACPI Error: No handler for Region [VRTC] (000000000531e353) [SystemCMOS] (20210604/evregion-130)
Oct 09 01:26:39 Fedora-RazerBlade15 kernel: ACPI Error: Region SystemCMOS (ID=5) has no handler (20210604/exfldio-261)
Oct 09 01:26:39 Fedora-RazerBlade15 kernel: ACPI Error: Aborting method \_SB.PCI0.LPCB.EC0.RTEC due to previous error (AE_NOT_EXIST) (20210604/psparse-529)
Oct 09 01:26:39 Fedora-RazerBlade15 kernel: ACPI Error: Aborting method \_SB.PCI0.LPCB.EC0._REG due to previous error (AE_NOT_EXIST) (20210604/psparse-529)
Oct 08 21:26:42 Fedora-RazerBlade15 kernel: dell_smbios: Unable to run on non-Dell system
Oct 08 21:26:44 Fedora-RazerBlade15 kernel: nvidia-gpu 0000:01:00.3: i2c timeout error e0000000
Oct 08 21:26:44 Fedora-RazerBlade15 kernel: ucsi_ccg 24-0008: i2c_transfer failed -110
Oct 08 21:26:44 Fedora-RazerBlade15 kernel: ucsi_ccg 24-0008: ucsi_ccg_init failed - -110
Oct 08 21:26:44 Fedora-RazerBlade15 alsactl[978]: alsa-lib parser.c:242:(error_node) UCM is not supported for this HDA model (HDA Intel PCH at 0x6014118000 irq 171)
Oct 08 21:26:44 Fedora-RazerBlade15 alsactl[978]: alsa-lib main.c:1405:(snd_use_case_mgr_open) error: failed to import hw:0 use case configuration -6
Oct 08 21:26:44 Fedora-RazerBlade15 alsactl[978]: alsa-lib parser.c:242:(error_node) UCM is not supported for this HDA model (HDA NVidia at 0xa1080000 irq 17)
Oct 08 21:26:44 Fedora-RazerBlade15 alsactl[978]: alsa-lib main.c:1405:(snd_use_case_mgr_open) error: failed to import hw:1 use case configuration -6
Oct 08 21:26:44 Fedora-RazerBlade15 /usr/sbin/irqbalance[982]: libcap-ng used by "/usr/sbin/irqbalance" failed due to not having CAP_SETPCAP in capng_apply
Oct 08 21:26:44 Fedora-RazerBlade15 sssd[1004]: Could not open file [/var/log/sssd/sssd.log]. Error: [2][No such file or directory]
Oct 08 21:26:51 Fedora-RazerBlade15 setroubleshoot[1973]: SELinux is preventing nginx from write access on the sock_file valet.sock. For complete SELinux messages run: sealert -l fb2619b3-3a11-463c-b745-afe49dddbe59
Oct 08 21:26:53 Fedora-RazerBlade15 systemd[1]: Failed to start Crash recovery kernel arming.
Oct 08 21:26:53 Fedora-RazerBlade15 systemd[1]: Failed to start Initializes network console logging of kernel messages.
Oct 08 21:26:56 Fedora-RazerBlade15 systemd-coredump[3345]: [🡕] Process 1098 (dnsmasq) of user 983 dumped core.
                                                            
                                                            Stack trace of thread 1098:
                                                            #0  0x000055bff7609256 lookup_domain (dnsmasq + 0x53256)
                                                            #1  0x000055bff75d9a3a forward_query.lto_priv.0 (dnsmasq + 0x23a3a)
                                                            #2  0x000055bff75de5d0 check_dns_listeners (dnsmasq + 0x285d0)
                                                            #3  0x000055bff75c2b00 main (dnsmasq + 0xcb00)
                                                            #4  0x00007f2ca010bb75 __libc_start_main (libc.so.6 + 0x27b75)
                                                            #5  0x000055bff75c348e _start (dnsmasq + 0xd48e)
Oct 08 21:26:57 Fedora-RazerBlade15 abrt-notification[3410]: [🡕] Process 1092 (dnsmasq) crashed in lookup_domain()
Oct 08 21:27:00 Fedora-RazerBlade15 lightdm[3193]: gkr-pam: unable to locate daemon control file
Oct 08 21:27:03 Fedora-RazerBlade15 sssd_kcm[4306]: Could not open file [/var/log/sssd/sssd_kcm.log]. Error: [2][No such file or directory]
Oct 08 21:27:11 Fedora-RazerBlade15 bluetoothd[980]: src/profile.c:record_cb() Unable to get Hands-Free Voice gateway SDP record: Host is down

I have changed the Windows clock to UTC, but as you can see from the logs, it still shows a “future” date.
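The size of the jump is a useful clue. The first kernel lines are stamped Oct 09 01:26:39, while the rest of the same boot is stamped Oct 08 21:26:42. The gap is almost exactly the +04 offset shown in the journal header, which suggests (but does not prove) that the hardware clock still held local time at that boot and was read as UTC. A minimal sketch of the arithmetic, assuming GNU date; ts_gap_seconds is a hypothetical helper name.

```shell
#!/bin/sh
# Difference in seconds between two timestamps, both interpreted as UTC
# so that the local timezone of the machine does not affect the result.
# Requires GNU date for the -d option.
ts_gap_seconds() (
    a=$(date -u -d "$1" +%s)
    b=$(date -u -d "$2" +%s)
    echo $(( a - b ))
)

ts_gap_seconds "2021-10-09 01:26:39" "2021-10-08 21:26:42"   # prints 14397, about 4 hours

# To see how Linux currently treats the hardware clock (run manually):
#   timedatectl                        # look at the "RTC in local TZ" line
#   sudo timedatectl set-local-rtc 0   # keep the hardware clock in UTC
```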

So yesterday I got four freezes, and when they occurred I switched to another tty and saw interesting messages. See attachments:

So I quickly googled around this topic:
https://gitlab.freedesktop.org/drm/intel/-/issues/3068
https://bbs.archlinux.org/viewtopic.php?id=257893

But I didn’t find a solution. @ankursinha, maybe you have suggestions, or maybe these screens contain useful information. I do see that there is some problem with the GPU; I assume it’s the integrated GPU.

One of the screens shows an error in Xorg, although as far as I remember I’m using Wayland…

There are numerous results when searching for rcs0 on Google. Maybe that will lead you the right way. Most seem to hint at i915 driver issues for the Intel GPU.
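The same search terms can be applied to the kernel log itself. One caveat: after a freeze and a forced reboot, journalctl -b 0 shows the boot you are in now, so the messages from the boot that froze are usually under -b -1. A minimal sketch follows; gpu_hang_lines is a made-up helper name, and the pattern is only a guess at what is worth looking for.

```shell
#!/bin/sh
# Keep only the lines that mention the Intel GPU driver, a GPU hang, or rcs0.
# Reads from standard input so it can sit at the end of any pipeline.
gpu_hang_lines() {
    grep -iE 'i915|gpu hang|rcs0'
}

# Typical use after rebooting from a freeze (run manually):
#   journalctl --list-boots                          # find the boot that froze
#   journalctl -k -b -1 --no-pager | gpu_hang_lines  # kernel messages from the previous boot
```

If nothing from the frozen boot made it to disk, this prints nothing, which is itself a hint that the log was not flushed before the hang.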

Hey, @computersavvy thanks for the suggestion. I’ll google with rcs0.

That’s exactly what I thought, because I haven’t installed the Nvidia drivers yet.