NVIDIA GPU fans not running on Fedora 37

I have a dual-boot computer and usually connect my screen to the HDMI port of my NVIDIA GPU. When I use Fedora, the GPU fans don’t run; eventually the GPU overheats, the screen goes black, and the computer crashes. I’m very new to this and not sure how to solve it. (Technically I could use the motherboard’s video port every time I use Linux, but that is impractical and cumbersome.)

System information:

System:
  Kernel: 6.0.9-300.fc37.x86_64 arch: x86_64 bits: 64 compiler: gcc
    v: 2.38-24.fc37 Desktop: KDE Plasma v: 5.26.3 tk: Qt v: 5.15.7 wm: kwin_x11
    dm: SDDM Distro: Fedora release 37 (Thirty Seven)
Machine:
  Type: Desktop Mobo: ASUSTeK model: H170-PRO v: Rev X.0x
    serial: <superuser required> UEFI: American Megatrends v: 3805
    date: 05/16/2018
CPU:
  Info: quad core model: Intel Core i5-7400 bits: 64 type: MCP arch: Kaby Lake
    rev: 9 cache: L1: 256 KiB L2: 1024 KiB L3: 6 MiB
  Speed (MHz): avg: 3000 min/max: 800/3500 cores: 1: 3000 2: 3000 3: 3000
    4: 3000 bogomips: 24000
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3
Graphics:
  Device-1: NVIDIA TU106 [GeForce RTX 2070 Rev. A] vendor: ASUSTeK
    driver: nouveau v: kernel arch: Turing pcie: speed: 2.5 GT/s lanes: 16
    ports: active: HDMI-A-1 empty: DP-1, DP-2, DP-3, HDMI-A-2 bus-ID: 01:00.0
    chip-ID: 10de:1f07 temp: 45.0 C
  Display: x11 server: X.Org v: 1.20.14 with: Xwayland v: 22.1.5
    compositor: kwin_x11 driver: X: loaded: modesetting unloaded: fbdev,vesa
    dri: nouveau gpu: nouveau display-ID: :0 screens: 1
  Screen-1: 0 s-res: 1920x1080 s-dpi: 96
  Monitor-1: HDMI-A-1 mapped: HDMI-1 model: Acer GN246HL res: 1920x1080
    dpi: 92 diag: 609mm (24")
  API: OpenGL v: 4.3 Mesa 22.2.3 renderer: NV166 direct render: Yes
Audio:
  Device-1: Intel 100 Series/C230 Series Family HD Audio vendor: ASUSTeK
    driver: snd_hda_intel v: kernel bus-ID: 00:1f.3 chip-ID: 8086:a170
  Device-2: NVIDIA TU106 High Definition Audio vendor: ASUSTeK
    driver: snd_hda_intel v: kernel pcie: speed: 2.5 GT/s lanes: 16
    bus-ID: 01:00.1 chip-ID: 10de:10f9
  Sound API: ALSA v: k6.0.9-300.fc37.x86_64 running: yes
  Sound Server-1: PulseAudio v: 16.1 running: no
  Sound Server-2: PipeWire v: 0.3.61 running: yes
Network:
  Device-1: Broadcom vendor: ASUSTeK driver: brcmfmac v: kernel pcie:
    speed: 5 GT/s lanes: 1 port: N/A bus-ID: 02:00.0 chip-ID: 14e4:43c3
  IF: wlp2s0 state: down mac: <filter>
  Device-2: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
    vendor: ASUSTeK PRIME B450M-A driver: r8169 v: kernel pcie: speed: 2.5 GT/s
    lanes: 1 port: d000 bus-ID: 05:00.0 chip-ID: 10ec:8168
  IF: enp5s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
Bluetooth:
  Device-1: Cambridge Silicon Radio Bluetooth Dongle (HCI mode) type: USB
    driver: btusb v: 0.8 bus-ID: 1-7:2 chip-ID: 0a12:0001
  Report: rfkill ID: hci0 rfk-id: 0 state: up address: see --recommends
Drives:
  Local Storage: total: 3.64 TiB used: 154.22 GiB (4.1%)
  ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 970 EVO 1TB size: 931.51 GiB
    speed: 31.6 Gb/s lanes: 4 serial: <filter> temp: 38.9 C
  ID-2: /dev/sda vendor: Western Digital model: WD10EZEX-22MFCA0
    size: 931.51 GiB speed: 6.0 Gb/s serial: <filter> temp: 30 C
  ID-3: /dev/sdb vendor: Seagate model: ST2000DM008-2FR102 size: 1.82 TiB
    speed: 6.0 Gb/s serial: <filter> temp: 31 C
Partition:
  ID-1: / size: 878.89 GiB used: 153.94 GiB (17.5%) fs: btrfs dev: /dev/dm-0
    mapped: luks-fbe3e44d-337a-4c1c-81b4-664bd06d85c7
  ID-2: /boot size: 973.4 MiB used: 267.9 MiB (27.5%) fs: ext4
    dev: /dev/sda2
  ID-3: /boot/efi size: 598.8 MiB used: 17.4 MiB (2.9%) fs: vfat
    dev: /dev/sda1
  ID-4: /home size: 878.89 GiB used: 153.94 GiB (17.5%) fs: btrfs
    dev: /dev/dm-0 mapped: luks-fbe3e44d-337a-4c1c-81b4-664bd06d85c7
Swap:
  ID-1: swap-1 type: zram size: 8 GiB used: 0 KiB (0.0%) priority: 100
    dev: /dev/zram0
Sensors:
  System Temperatures: cpu: 29.0 C mobo: N/A gpu: nouveau temp: 45.0 C
  Fan Speeds (RPM): N/A gpu: nouveau fan: 0
Info:
  Processes: 273 Uptime: 6m Memory: 31.27 GiB used: 2.78 GiB (8.9%)
  Init: systemd v: 251 target: graphical (5) default: graphical Compilers:
  gcc: 12.2.1 Packages: pm: rpm pkgs: N/A note: see --rpm Shell: Bash v: 5.2.9
  running-in: kitty inxi: 3.3.23

Thank you for your attention,
G.

The only time I have had that scenario was when the fans on the GPU had failed. Replacing the fans fixed it.

However, if I understand your post correctly, either the fans do run under Windows or the GPU is under so little load there that it never overheats.

Is that correct or am I misunderstanding?

Are you using the default nouveau driver for the GPU or have you installed the nvidia driver from rpmfusion? The nvidia driver usually controls the GPU fan speeds and temps.
The nouveau driver does not fully support the newer nvidia GPUs.

Please provide the output of inxi -Fzxx as preformatted text.

Yes, that’s the case: on Windows the fans are easily managed through GeForce Experience. On Fedora, however, the fans just don’t move, even though I installed the non-free NVIDIA driver RPM.

Hi - I see you edited in your system information, and in there it looks like it is actively using the nouveau driver for your NVIDIA card? I have the RPM non-free NVIDIA driver installed, and this is what my inxi -Fzxx shows in the Graphics section:

Graphics:
  Device-1: NVIDIA TU117M vendor: Hewlett-Packard driver: nvidia v: 520.56.06
    arch: Turing pcie: speed: 2.5 GT/s lanes: 8 ports: active: none
    empty: HDMI-A-1 bus-ID: 01:00.0 chip-ID: 10de:1f99
  Device-2: AMD Cezanne [Radeon Vega Series / Radeon Mobile Series]
    vendor: Hewlett-Packard driver: amdgpu v: kernel arch: GCN-5.1 pcie:
    speed: 8 GT/s lanes: 16 ports: active: eDP-1 empty: none bus-ID: 05:00.0
    chip-ID: 1002:1638 temp: 48.0 C
  Device-3: Luxvisions Innotech HP TrueVision HD Camera type: USB
    driver: uvcvideo bus-ID: 3-3:3 chip-ID: 30c9:0035
  Display: wayland server: X.org v: 1.20.14 with: Xwayland v: 22.1.5
    compositor: gnome-shell driver: gpu: amdgpu display-ID: 0
  Monitor-1: eDP-1 model: BOE Display 0x094d res: 1920x1080 dpi: 142
    diag: 395mm (15.5")
  API: OpenGL v: 4.6 Mesa 22.2.3 renderer: AMD Radeon Graphics (renoir LLVM
    15.0.0 DRM 3.48 6.0.10-300.fc37.x86_64) direct render: Yes

The fact that yours still says driver: nouveau tells me that the NVIDIA drivers either aren’t fully installed or aren’t active. Just to check, have you tried removing and reinstalling them using the RPMFusion guide?

https://rpmfusion.org/Howto/NVIDIA?highlight=(\bCategoryHowto\b)#Installing_the_drivers
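
For anyone who wants to make the same check without reading the whole report, the driver: field can be pulled out of the Graphics section. This is only a rough sketch: gpu_drivers is a made-up helper name, and it assumes the inxi layout shown in this thread, where device lines carry "driver: <name>" and the Display line carries "driver: X:" or "driver: gpu:".

```shell
# Hypothetical helper: print the kernel driver named on each device line of
# inxi Graphics output read from stdin, one driver per line.
gpu_drivers() {
    # "driver: X:" and "driver: gpu:" belong to the Display line; their
    # second token ends in ":" and is skipped.
    grep -o 'driver: [^ ]*' | awk '$2 !~ /:$/ { print $2 }'
}

# Possible use (not run here): inxi -Gxx | gpu_drivers
```

If this prints nouveau for the NVIDIA card, the proprietary driver is not the one in use.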

As noted, this clearly shows the nouveau driver in use.
Please show us the installed nvidia drivers.
Running dnf list installed *nvidia* will show all the needed info if you installed from the rpmfusion repo.
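
A small sketch of how that listing could be checked automatically. The helper name missing_nvidia_packages is invented for illustration, and the two package names it looks for (akmod-nvidia and xorg-x11-drv-nvidia) are assumed to be the core of an rpmfusion install; a working setup may well include more.

```shell
# Hypothetical helper: read "dnf list installed" output on stdin and print
# each core driver package that does not appear in it.
missing_nvidia_packages() {
    list=$(cat)
    for pkg in akmod-nvidia xorg-x11-drv-nvidia; do
        # Package lines start with "<name>.<arch>", so anchor on "name.";
        # this keeps xorg-x11-drv-nvidia-cuda from counting as a match.
        printf '%s\n' "$list" | grep -q "^$pkg\." || echo "$pkg"
    done
}

# Possible use (not run here): dnf list installed '*nvidia*' | missing_nvidia_packages
```

No output would mean both packages are present; any name printed is one that still needs installing.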

This is the result of dnf list installed *nvidia*

nvidia-gpu-firmware.noarch                           20221109-144.fc37                           @updates

Hmm, just as @computersavvy suspected, looks like something didn’t actually execute in the RPMFusion driver installation - here’s what dnf list installed *nvidia* on mine looks like after installing from rpmfusion:

Installed Packages
akmod-nvidia.x86_64                       3:520.56.06-1.fc37 @rpmfusion-nonfree-nvidia-driver
kmod-nvidia-6.0.10-300.fc37.x86_64.x86_64 3:520.56.06-1.fc37 @@commandline      
kmod-nvidia-6.0.8-300.fc37.x86_64.x86_64  3:520.56.06-1.fc37 @@commandline      
kmod-nvidia-6.0.9-300.fc37.x86_64.x86_64  3:520.56.06-1.fc37 @@commandline      
nvidia-gpu-firmware.noarch                20221109-144.fc37  @updates           
nvidia-settings.x86_64                    3:520.56.06-1.fc37 @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia.x86_64                3:520.56.06-1.fc37 @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia-cuda-libs.x86_64      3:520.56.06-1.fc37 @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia-kmodsrc.x86_64        3:520.56.06-1.fc37 @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia-libs.i686             3:520.56.06-1.fc37 @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia-libs.x86_64           3:520.56.06-1.fc37 @rpmfusion-nonfree-nvidia-driver
xorg-x11-drv-nvidia-power.x86_64          3:520.56.06-1.fc37 @rpmfusion-nonfree-nvidia-driver

Think you might need to try that installation again, using the RPMFusion site’s instructions (perhaps you enabled the right repository, but the install process just didn’t complete afterward?)

Hope that helps!

Right, and here is mine after I freshly installed the nvidia driver following an upgrade from F36 to F37.

$ dnf list installed *nvidia*
Installed Packages
akmod-nvidia.x86_64                                                      3:520.56.06-1.fc37                                @rpmfusion-nonfree
kmod-nvidia-6.0.10-300.fc37.x86_64.x86_64                                3:520.56.06-1.fc37                                @@commandline     
nvidia-gpu-firmware.noarch                                               20221109-144.fc37                                 @updates          
nvidia-persistenced.x86_64                                               3:520.56.06-1.fc37                                @rpmfusion-nonfree
nvidia-settings.x86_64                                                   3:520.56.06-1.fc37                                @rpmfusion-nonfree
xorg-x11-drv-nvidia.x86_64                                               3:520.56.06-1.fc37                                @rpmfusion-nonfree
xorg-x11-drv-nvidia-cuda.x86_64                                          3:520.56.06-1.fc37                                @rpmfusion-nonfree
xorg-x11-drv-nvidia-cuda-libs.x86_64                                     3:520.56.06-1.fc37                                @rpmfusion-nonfree
xorg-x11-drv-nvidia-kmodsrc.x86_64                                       3:520.56.06-1.fc37                                @rpmfusion-nonfree
xorg-x11-drv-nvidia-libs.x86_64                                          3:520.56.06-1.fc37                                @rpmfusion-nonfree

I don’t know why, but after the upgrade the driver would not function, so I removed everything and reinstalled, and then it worked.
To avoid removing the firmware package I used dnf remove *nvidia* --exclude=nvidia-gpu-firmware and then reinstalled with dnf install akmod-nvidia xorg-x11-drv-nvidia-cuda
Yes, I use apps that require cuda and have for a long time.
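
The two steps above could be wrapped like this. It is a sketch, not a tested procedure: run and reinstall_nvidia are invented names, sudo is my addition, and by default the commands are only printed so nothing is removed by accident.

```shell
# Hypothetical wrapper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reinstall_nvidia() {
    # Quote the glob so the shell hands it to dnf unexpanded.
    run sudo dnf remove '*nvidia*' --exclude=nvidia-gpu-firmware
    run sudo dnf install akmod-nvidia xorg-x11-drv-nvidia-cuda
}

# Preview:  reinstall_nvidia
# Execute:  DRY_RUN=0 reinstall_nvidia
```

After a real install the kernel module still has to be built, which can take a few minutes. The RPMFusion guide suggests waiting until modinfo -F version nvidia prints a version number before rebooting.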

Hi, I applied the recommended
dnf remove *nvidia* --exclude=nvidia-gpu-firmware
and then reinstalled with
dnf install akmod-nvidia xorg-x11-drv-nvidia-cuda
This is the new result. The fans are still not working; inxi -Fzxx | grep fan shows:

Fan Speeds (RPM): N/A gpu: nvidia fan: 0%

General result:

System:
  Kernel: 6.0.10-300.fc37.x86_64 arch: x86_64 bits: 64 compiler: gcc
    v: 2.38-25.fc37 Desktop: KDE Plasma v: 5.26.4 tk: Qt v: 5.15.7 wm: kwin_x11
    dm: SDDM Distro: Fedora release 37 (Thirty Seven)
Machine:
  Type: Desktop Mobo: ASUSTeK model: H170-PRO v: Rev X.0x
    serial: <superuser required> UEFI: American Megatrends v: 3805
    date: 05/16/2018
CPU:
  Info: quad core model: Intel Core i5-7400 bits: 64 type: MCP arch: Kaby Lake
    rev: 9 cache: L1: 256 KiB L2: 1024 KiB L3: 6 MiB
  Speed (MHz): avg: 3249 high: 3500 min/max: 800/3500 cores: 1: 3000 2: 3500
    3: 3000 4: 3499 bogomips: 24000
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3
Graphics:
  Device-1: NVIDIA TU106 [GeForce RTX 2070 Rev. A] vendor: ASUSTeK
    driver: nvidia v: 520.56.06 arch: Turing pcie: speed: 8 GT/s lanes: 16
    ports: active: none off: HDMI-A-2 empty: DP-1, DP-2, HDMI-A-1, Unknown-1
    bus-ID: 01:00.0 chip-ID: 10de:1f07
  Display: x11 server: X.Org v: 1.20.14 with: Xwayland v: 22.1.5
    compositor: kwin_x11 driver: X: loaded: nvidia
    unloaded: fbdev,modesetting,nouveau,vesa alternate: nv
    gpu: nvidia,nvidia-nvswitch display-ID: :0 screens: 1
  Screen-1: 0 s-res: 1920x1080 s-dpi: 92
  Monitor-1: HDMI-A-2 mapped: HDMI-1 note: disabled model: Acer GN246HL
    res: 1920x1080 dpi: 92 diag: 609mm (24")
  API: OpenGL v: 4.6.0 NVIDIA 520.56.06 renderer: NVIDIA GeForce RTX
    2070/PCIe/SSE2 direct render: Yes
Audio:
  Device-1: Intel 100 Series/C230 Series Family HD Audio vendor: ASUSTeK
    driver: snd_hda_intel v: kernel bus-ID: 00:1f.3 chip-ID: 8086:a170
  Device-2: NVIDIA TU106 High Definition Audio vendor: ASUSTeK
    driver: snd_hda_intel v: kernel pcie: speed: 8 GT/s lanes: 16
    bus-ID: 01:00.1 chip-ID: 10de:10f9
  Sound API: ALSA v: k6.0.10-300.fc37.x86_64 running: yes
  Sound Server-1: PulseAudio v: 16.1 running: no
  Sound Server-2: PipeWire v: 0.3.61 running: yes
Network:
  Device-1: Broadcom vendor: ASUSTeK driver: brcmfmac v: kernel pcie:
    speed: 5 GT/s lanes: 1 port: N/A bus-ID: 02:00.0 chip-ID: 14e4:43c3
  IF: wlp2s0 state: down mac: <filter>
  Device-2: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
    vendor: ASUSTeK PRIME B450M-A driver: r8169 v: kernel pcie: speed: 2.5 GT/s
    lanes: 1 port: d000 bus-ID: 05:00.0 chip-ID: 10ec:8168
  IF: enp5s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
Bluetooth:
  Device-1: Cambridge Silicon Radio Bluetooth Dongle (HCI mode) type: USB
    driver: btusb v: 0.8 bus-ID: 1-7:2 chip-ID: 0a12:0001
  Report: rfkill ID: hci0 rfk-id: 0 state: up address: see --recommends
Drives:
  Local Storage: total: 3.64 TiB used: 156.78 GiB (4.2%)
  ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 970 EVO 1TB size: 931.51 GiB
    speed: 31.6 Gb/s lanes: 4 serial: <filter> temp: 33.9 C
  ID-2: /dev/sda vendor: Western Digital model: WD10EZEX-22MFCA0
    size: 931.51 GiB speed: 6.0 Gb/s serial: <filter>
  ID-3: /dev/sdb vendor: Seagate model: ST2000DM008-2FR102 size: 1.82 TiB
    speed: 6.0 Gb/s serial: <filter>
Partition:
  ID-1: / size: 878.89 GiB used: 156.5 GiB (17.8%) fs: btrfs dev: /dev/dm-0
    mapped: luks-fbe3e44d-337a-4c1c-81b4-664bd06d85c7
  ID-2: /boot size: 973.4 MiB used: 267.9 MiB (27.5%) fs: ext4
    dev: /dev/sda2
  ID-3: /boot/efi size: 598.8 MiB used: 17.4 MiB (2.9%) fs: vfat
    dev: /dev/sda1
  ID-4: /home size: 878.89 GiB used: 156.5 GiB (17.8%) fs: btrfs
    dev: /dev/dm-0 mapped: luks-fbe3e44d-337a-4c1c-81b4-664bd06d85c7
Swap:
  ID-1: swap-1 type: zram size: 8 GiB used: 0 KiB (0.0%) priority: 100
    dev: /dev/zram0
Sensors:
  System Temperatures: cpu: 33.0 C mobo: N/A gpu: nvidia temp: 52 C
  Fan Speeds (RPM): N/A gpu: nvidia fan: 0%
Info:
  Processes: 303 Uptime: 9m Memory: 31.27 GiB used: 3.27 GiB (10.5%)
  Init: systemd v: 251 target: graphical (5) default: graphical Compilers:
  gcc: 12.2.1 Packages: pm: rpm pkgs: N/A note: see --rpm pm: flatpak pkgs: 5
  Shell: Bash v: 5.2.9 running-in: kitty inxi: 3.3.23

Also, here is the output of dnf list installed *nvidia*:

Installed Packages
akmod-nvidia.x86_64                                    3:520.56.06-1.fc37              @rpmfusion-nonfree
kmod-nvidia-6.0.10-300.fc37.x86_64.x86_64              3:520.56.06-1.fc37              @@commandline     
nvidia-gpu-firmware.noarch                             20221109-144.fc37               @updates          
nvidia-persistenced.x86_64                             3:520.56.06-1.fc37              @rpmfusion-nonfree
nvidia-settings.x86_64                                 3:520.56.06-1.fc37              @rpmfusion-nonfree
xorg-x11-drv-nvidia.x86_64                             3:520.56.06-1.fc37              @rpmfusion-nonfree
xorg-x11-drv-nvidia-cuda.x86_64                        3:520.56.06-1.fc37              @rpmfusion-nonfree
xorg-x11-drv-nvidia-cuda-libs.x86_64                   3:520.56.06-1.fc37              @rpmfusion-nonfree
xorg-x11-drv-nvidia-kmodsrc.x86_64                     3:520.56.06-1.fc37              @rpmfusion-nonfree
xorg-x11-drv-nvidia-libs.x86_64                        3:520.56.06-1.fc37              @rpmfusion-nonfree
xorg-x11-drv-nvidia-power.x86_64                       3:520.56.06-1.fc37              @rpmfusion-nonfree

Glad to see that you now have the nvidia driver installed and functional.
What is the info you see with the nvidia-settings app related to fan and temp?
How do you know the fan is not running? Have you looked at it while the system is running, or just going by the reported speed in inxi? The reported speed may or may not be correct.

On mine, a GTX 1050, I do not get a fan speed reported in inxi, but the thermal temp remains near 60 C while running a GPU process from boinc using cuda. I can see the temp with lm_sensors and gkrellm as well as both fan speed and temp within the nvidia-settings app. Note that I leave the fan in auto control and it keeps the temp near where shown in the image below.

Hi, right now I have the case open, and unfortunately I can see that the fans on the GPU are not moving.

And what does the nvidia-settings app tell you?

When I launch NVIDIA Settings, the fan speed is always set to 0. If I tick Enable GPU Fan Settings, the fans start immediately, but that is not the default, meaning that on every launch the fans stay off until I set it here.

I have found some alternatives and will report results:

I see you have manually set the speeds. That is fine. I can do so as well if I choose.

The only drawback is that the fan is now locked to the speed you select. You may need more tweaking to maintain temps, or you may have to run the fan faster than actually required to make sure the card does not overheat when the load changes.
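
If you want a fixed speed applied without opening the GUI each time, the same setting can in principle be made from the command line, for example from a login autostart script. Treat this as a sketch under assumptions: set_gpu_fans is an invented name, it assumes an X11 session where manual fan control is already available in nvidia-settings, and the gpu and fan index numbers may differ on your card. GPUFanControlState and GPUTargetFanSpeed are the nvidia-settings attributes behind the fan controls in the GUI.

```shell
# Hypothetical helper: set_gpu_fans SPEED FAN_INDEX...
# Builds one nvidia-settings call that enables manual fan control on GPU 0
# and gives every listed fan the same target percentage.
# Only prints the command unless DRY_RUN=0 is set.
set_gpu_fans() {
    speed=$1
    shift
    case $speed in
        ''|*[!0-9]*) echo "speed must be a number from 0 to 100" >&2; return 1 ;;
    esac
    [ "$speed" -le 100 ] || { echo "speed must be a number from 0 to 100" >&2; return 1; }
    fans="$*"
    set -- nvidia-settings -a "[gpu:0]/GPUFanControlState=1"
    for fan in $fans; do
        set -- "$@" -a "[fan:$fan]/GPUTargetFanSpeed=$speed"
    done
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Preview for a two-fan card at 40%:  set_gpu_fans 40 0 1
```

The drawback described above still applies: the fans stay at that percentage until something changes it.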

With a single GPU the automatic control keeps the temp at about 60 C for me and I am happy with that. Note that the reported temp with nvidia-settings is for card 0 only.

Using lm_sensors and gkrellm probably will allow you to monitor temps on multiple cards at the same time.
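
Another option with the nvidia driver is nvidia-smi, which prints one line per card. Below is a sketch that flags any card sitting at 0% fan while warm. The name check_fans and the default limit of 50 C are arbitrary choices of mine, and the input format assumed is what nvidia-smi --query-gpu=index,temperature.gpu,fan.speed --format=csv,noheader,nounits produces (for example "0, 52, 0").

```shell
# Hypothetical helper: check_fans [LIMIT_C]
# Reads "index, temperature, fan%" lines on stdin and reports every GPU whose
# fan reads 0% at or above LIMIT_C. Exit status is non-zero if any is found.
check_fans() {
    limit=${1:-50}
    awk -F', *' -v limit="$limit" '
        # Only judge rows where the fan field is a plain number; some cards
        # report "[N/A]" there.
        $3 ~ /^[0-9]+$/ && $3 == 0 && $2 + 0 >= limit {
            printf "GPU %s: %s C with fan at 0%%\n", $1, $2
            bad = 1
        }
        END { exit bad }
    '
}

# Possible use (not run here):
#   nvidia-smi --query-gpu=index,temperature.gpu,fan.speed \
#       --format=csv,noheader,nounits | check_fans 50
```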

How do you get the option to manage the fan? I don’t have this option on my system.

Fedora 37 and Fedora 41 are drastically different, as is using Wayland instead of X11 for the DE.
You are reviving a necro thread that already had a solution posted.

I am closing this one. If you have a related question that you want answered, please open your own topic and provide details so that we may be able to assist.