This post explains how to modify the script below for your specific AMDGPU-based graphics card; in its current form it is set up for my RX6800XT.
The first thing to explain is why I use this script rather than just letting the card's BIOS do the right thing.
I found that running Stable Diffusion (or in fact any sustained, full-on compute load) on the AMDGPU pushed the junction temperature of the graphics card beyond AMD's maximum temperature when the GPU alone was left to govern the fan speed.
Other GUI controls were not working as well as I would have liked, so I decided to tailor-make this to
my own needs.
Of course, if you are happier with one of the alternative GUI tools, then don't let me put you off.
This script governs the TDP limit and the fan at the same time, because under full load the card's
fans alone are in some situations (warm room, hot season, etc.) not enough.
I push the fans to maximum as we reach 100 degrees, but if the temperature continues to rise
despite the fans being at maximum, I start to pull down the max TDP, limiting the heat generated.
You are free to adjust the curve of these two to suit your card, by simply adjusting the numbers
used for each temperature band and the fan/power values written for it.
But before we get to all of that, with your card installed we need to gather the card's hardware monitoring (hwmon) settings as follows.
Run the following command on your system.
find /sys/class/drm/card[0-9]/device/hwmon/hwmon[0-9]
This will list a tree of files, including files like
/sys/class/drm/card1/device/hwmon/hwmon2/power1_cap_max
and
/sys/class/drm/card1/device/hwmon/hwmon2/temp3_input
/sys/class/drm/card1/device/hwmon/hwmon2/temp1_input
/sys/class/drm/card1/device/hwmon/hwmon2/temp2_input
Some cards have the above three temperature sensors.
temp2 equates to junction in the output of sensors, temp1 is the overall (edge) temperature and temp3 is the memory temperature.
Values are in thousandths of a degree Celsius, so 29 degrees is represented by the number 29000.
I chose to use the junction temperature, as the other two will be influenced by the core GPU junction temperature anyway, and the junction temperature is the first to rise, so it gives the quickest reading.
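If you have more than one card (an integrated GPU, for example), the numbers in the paths above will differ. As a rough sketch, and not part of my script, something like the following could locate the amdgpu hwmon directory and print each sensor in whole degrees. The function names find_amdgpu_hwmon and print_temps are made up for this example, and it assumes the driver exposes the usual name and temp*_label files next to the temp*_input files.

```shell
#!/bin/sh
# Print the first hwmon directory under ROOT whose "name" file says amdgpu.
# ROOT defaults to /sys/class/drm; it is a parameter only so the function
# can be tried against a test directory.
find_amdgpu_hwmon() {
    root="${1:-/sys/class/drm}"
    for dir in "$root"/card[0-9]*/device/hwmon/hwmon[0-9]*; do
        # Skip the literal pattern when nothing matched
        [ -r "$dir/name" ] || continue
        if [ "$(cat "$dir/name")" = "amdgpu" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Print every temperature sensor in HWMON as "label degrees".
print_temps() {
    hwmon="$1"
    for input in "$hwmon"/temp[0-9]*_input; do
        [ -r "$input" ] || continue
        label_file="${input%_input}_label"
        if [ -r "$label_file" ]; then
            label="$(cat "$label_file")"
        else
            # Fall back to the file name (temp1, temp2, ...) if no label exists
            label="$(basename "${input%_input}")"
        fi
        millideg="$(cat "$input")"
        # Sensor values are millidegrees Celsius, so 29000 prints as 29
        printf '%s %s\n' "$label" "$((millideg / 1000))"
    done
}

# Only prints anything on a machine that actually has an amdgpu hwmon directory.
if hwmon_dir="$(find_amdgpu_hwmon)"; then
    print_temps "$hwmon_dir"
fi
```

On a card with all three sensors I would expect lines for edge, junction and mem, but check what your own card reports before relying on the labels.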
As every card's power1_cap_max is different, you need to find out yours.
The RX6800XT is 293 watts, and this is represented in microwatts by the number 293000000:
cat /sys/class/drm/card1/device/hwmon/hwmon2/power1_cap_max
293000000
I think the RX5700 will be 180000000.
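Since the power files count in microwatts, it is easy to drop or add a zero by mistake. Here is a small sketch of helpers for converting between watts and the hwmon values, and for showing the range your card allows. The helper names are my own invention for this example, and show_cap_range assumes your card exposes power1_cap_min alongside power1_cap and power1_cap_max.

```shell
#!/bin/sh
# Convert a hwmon power value in microwatts to whole watts.
uw_to_watts() {
    echo "$(( $1 / 1000000 ))"
}

# Convert whole watts to the microwatt value expected by power1_cap.
watts_to_uw() {
    echo "$(( $1 * 1000000 ))"
}

# Print the power cap files that exist in one hwmon directory, in watts.
show_cap_range() {
    hwmon="$1"
    for f in power1_cap_min power1_cap power1_cap_max; do
        if [ -r "$hwmon/$f" ]; then
            printf '%s: %s W\n' "$f" "$(uw_to_watts "$(cat "$hwmon/$f")")"
        fi
    done
}
```

For example, show_cap_range /sys/class/drm/card1/device/hwmon/hwmon2 would list the limits for the card used in this post, and watts_to_uw 293 gives the number to use in the script.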
But it could be lower than that if amdgpu.ppfeaturemask is not set
on the kernel command line in the GRUB configuration, as follows.
cat /etc/default/grub
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="rhgb quiet amd_iommu=on iommu=pt psi=1 amdgpu.ppfeaturemask=0xffffffff"
GRUB_DISABLE_RECOVERY="true"
GRUB_ENABLE_BLSCFG=true
The other settings, like psi=1, help things like OBS Studio work with the AMD OpenCL settings.
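A change to /etc/default/grub only takes effect once the GRUB configuration has been regenerated and the machine rebooted, so it is worth checking what the running kernel was actually booted with. The following is a minimal sketch of such a check; the function name ppfeaturemask_from_cmdline is just something I made up for it.

```shell
#!/bin/sh
# Print the amdgpu.ppfeaturemask value found in a kernel command line string,
# or return 1 if the option is not present.
ppfeaturemask_from_cmdline() {
    case " $1 " in
        *" amdgpu.ppfeaturemask="*)
            # Drop everything up to and including the option name,
            # then everything after the value
            rest="${1#*amdgpu.ppfeaturemask=}"
            printf '%s\n' "${rest%% *}"
            return 0
            ;;
    esac
    return 1
}

# Check the command line of the running kernel, if it is readable.
if [ -r /proc/cmdline ]; then
    if mask="$(ppfeaturemask_from_cmdline "$(cat /proc/cmdline)")"; then
        echo "amdgpu.ppfeaturemask is set to $mask"
    else
        echo "amdgpu.ppfeaturemask is not on the kernel command line"
    fi
fi
```

If it reports that the option is missing after a reboot, the edited GRUB_CMDLINE_LINUX line has probably not been written into the boot configuration yet.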
So, as root, create the directory /opt/amdgpu.
Create an empty file for now, /opt/amdgpu/setamdgpu, and make it executable so that systemd can run it,
with the commands
touch /opt/amdgpu/setamdgpu
chmod +x /opt/amdgpu/setamdgpu
Then, again as root, create the following file
/etc/systemd/system/setamdgpu.service
and put the following in it:
[Unit]
Description=Set AMDGPU power to 293 watts and control fans
[Service]
ExecStart=/opt/amdgpu/setamdgpu
[Install]
WantedBy=default.target
Then run the following commands. The service will not do anything useful until the script has been populated in the next step.
systemctl daemon-reload
systemctl enable setamdgpu
systemctl start setamdgpu
Now populate the file /opt/amdgpu/setamdgpu
with the script below, adjusted using the data you have gathered above.
#!/bin/sh
####### version for RX6800XT
####### set up a variable to make the code look cleaner; the glob is expanded
####### on each use, so this assumes only one card matches it
export CARD="/sys/class/drm/card[0-9]/device/hwmon/hwmon[0-9]"
####### enable the fan control
echo 1 > $(find ${CARD}/pwm1_enable)
####### set the initial powercap to 293 watts
echo 293000000 > $(find ${CARD}/power1_cap)
while true
do
sleep 1
AMDTEMP=$(cat ${CARD}/temp2_input)
case ${AMDTEMP} in
12[6-9]000)
echo 250 > $(find ${CARD}/pwm1);echo 240000000 > $(find ${CARD}/power1_cap);;
12[0-5]000)
echo 250 > $(find ${CARD}/pwm1);echo 240000000 > $(find ${CARD}/power1_cap);;
11[6-9]000)
echo 250 > $(find ${CARD}/pwm1);echo 250000000 > $(find ${CARD}/power1_cap);;
11[0-5]000)
echo 250 > $(find ${CARD}/pwm1);echo 250000000 > $(find ${CARD}/power1_cap);;
10[6-9]000)
echo 250 > $(find ${CARD}/pwm1);echo 260000000 > $(find ${CARD}/power1_cap);;
10[0-5]000)
echo 250 > $(find ${CARD}/pwm1);echo 260000000 > $(find ${CARD}/power1_cap);;
9[6-9]000)
echo 240 > $(find ${CARD}/pwm1);echo 270000000 > $(find ${CARD}/power1_cap);;
9[0-5]000)
echo 230 > $(find ${CARD}/pwm1);echo 270000000 > $(find ${CARD}/power1_cap);;
8[6-9]000)
echo 220 > $(find ${CARD}/pwm1);echo 280000000 > $(find ${CARD}/power1_cap);;
8[0-5]000)
echo 210 > $(find ${CARD}/pwm1);echo 280000000 > $(find ${CARD}/power1_cap);;
7[6-9]000)
echo 200 > $(find ${CARD}/pwm1);echo 290000000 > $(find ${CARD}/power1_cap);;
7[0-5]000)
echo 190 > $(find ${CARD}/pwm1);echo 293000000 > $(find ${CARD}/power1_cap);;
6[6-9]000)
echo 180 > $(find ${CARD}/pwm1);echo 293000000 > $(find ${CARD}/power1_cap);;
6[0-5]000)
echo 170 > $(find ${CARD}/pwm1);echo 293000000 > $(find ${CARD}/power1_cap);;
5[6-9]000)
echo 160 > $(find ${CARD}/pwm1);echo 293000000 > $(find ${CARD}/power1_cap);;
5[0-5]000)
echo 150 > $(find ${CARD}/pwm1);echo 293000000 > $(find ${CARD}/power1_cap);;
4[6-9]000)
echo 140 > $(find ${CARD}/pwm1);echo 293000000 > $(find ${CARD}/power1_cap);;
4[0-5]000)
echo 130 > $(find ${CARD}/pwm1);echo 293000000 > $(find ${CARD}/power1_cap);;
3[6-9]000)
echo 120 > $(find ${CARD}/pwm1);echo 293000000 > $(find ${CARD}/power1_cap);;
3[0-5]000)
echo 110 > $(find ${CARD}/pwm1);echo 293000000 > $(find ${CARD}/power1_cap);;
*)
echo 100 > $(find ${CARD}/pwm1);echo 293000000 > $(find ${CARD}/power1_cap);;
esac
done
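If you find the long case table awkward to retune, the same curve can be written with number comparisons instead of patterns. This is only a sketch of an alternative, not what I run myself: fan_pwm_for and power_cap_for are names invented for the example, and the values are copied from the RX6800XT table above. One deliberate difference is that anything from 100 degrees upwards keeps the fan at 250, whereas the case table falls through to the default entry above 129 degrees.

```shell
#!/bin/sh
# Fan PWM value for a junction temperature given in millidegrees Celsius.
fan_pwm_for() {
    deg=$(( $1 / 1000 ))
    if [ "$deg" -ge 100 ]; then
        echo 250
    elif [ "$deg" -lt 30 ]; then
        echo 100
    else
        # 110 at 30 degrees, then +10 for each band (x0-x5 and x6-x9)
        step=$(( (deg / 10 - 3) * 2 ))
        if [ $(( deg % 10 )) -ge 6 ]; then
            step=$(( step + 1 ))
        fi
        echo $(( 110 + step * 10 ))
    fi
}

# Power cap in microwatts for the same temperature (RX6800XT values).
power_cap_for() {
    deg=$(( $1 / 1000 ))
    if   [ "$deg" -ge 120 ]; then echo 240000000
    elif [ "$deg" -ge 110 ]; then echo 250000000
    elif [ "$deg" -ge 100 ]; then echo 260000000
    elif [ "$deg" -ge 90 ];  then echo 270000000
    elif [ "$deg" -ge 80 ];  then echo 280000000
    elif [ "$deg" -ge 76 ];  then echo 290000000
    else echo 293000000
    fi
}
```

Inside the loop you would then write the result of fan_pwm_for "$AMDTEMP" to pwm1 and the result of power_cap_for "$AMDTEMP" to power1_cap. For another card, only the numbers in power_cap_for should need changing.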
Now run
systemctl restart setamdgpu
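One thing to be aware of: once pwm1_enable is set to 1 the fan stays at whatever value was last written, so if the service is stopped the card no longer adjusts the fan itself. A possible safeguard, sketched below under the assumption that writing 2 to pwm1_enable hands control back to the card (as it does on my understanding of the amdgpu driver), is to restore automatic mode when the script exits. The function names and the FAN_HWMON variable are made up for this example.

```shell
#!/bin/sh
# Hand fan control back to the card for the given hwmon directory
# (on amdgpu, 1 = manual and 2 = automatic).
restore_auto_fan() {
    if [ -w "$1/pwm1_enable" ]; then
        echo 2 > "$1/pwm1_enable"
    fi
}

# Arrange for restore_auto_fan to run whenever the script exits,
# including when systemd stops the service with SIGTERM.
install_fan_restore_trap() {
    FAN_HWMON="$1"
    trap 'restore_auto_fan "$FAN_HWMON"' EXIT
    trap 'exit 0' INT TERM
}
```

To use it, call install_fan_restore_trap with your hwmon directory, for example /sys/class/drm/card1/device/hwmon/hwmon2, near the top of the script before the while loop starts.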
Other things you might want to do are as follows.
dnf install 'rocm*'
This installs ROCm, AMD's equivalent to CUDA, which will give you access to OpenCL, HIP, etc.
Replace Fedora's codecs, which have been stripped of any H.264 support, with the non-free ones by first installing the RPM Fusion repos and then running the following commands.
dnf remove mesa-vdpau-drivers-23.1.1-1.fc38.x86_64 mesa-va-drivers-23.1.1-1.fc38.x86_64
dnf install qt5-qtwebengine-freeworld.x86_64 audacious-plugins-freeworld.x86_64 gstreamer1-plugins-bad-freeworld.x86_64 libavcodec-freeworld.x86_64 mesa-va-drivers-freeworld.x86_64 mesa-vdpau-drivers-freeworld.x86_64 xpra-codecs-freeworld.x86_64 --allowerasing
regards peter