I’m trying to follow what you did, so bear with me here, but did you do any of these:
You created the toolbox: toolbox create my-nvidia-toolbox
then entered the toolbox: toolbox enter my-nvidia-toolbox
then installed the NVIDIA GPU drivers inside: sudo dnf install akmod-nvidia xorg-x11-drv-nvidia-cuda
and then ran nvidia-smi inside the container?
but in this case nvidia-smi is not recognized as a command (presumably because the NVIDIA Container Toolkit mounts it somehow, and toolbox is not triggering the toolkit).
I am not sure how to install the NVIDIA drivers in an Ubuntu toolbox (though given that podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable localhost/toolbox/ubuntu:22.04 nvidia-smi -L works just fine, I am not sure it is necessary at all).
I’m not able to test this for you, I’m literally in a shop on a laptop... although I do have an AMD laptop with an NVIDIA GPU.
From memory, when I did this in the past I needed to add the drivers inside the container.
If it’s Ubuntu, you need to run sudo apt update and then sudo apt install nvidia-driver-Xxx
nvidia-smi is a command that comes with the driver... I believe.
Sorry I can’t test this for you. Worst case scenario, you rebuild your container, right?!
Is having an Ubuntu container mission critical here? I’m just asking for the sake of compatibility, although I think 545.29.06 should be available for Ubuntu?
After much tinkering I decided to give up on toolbox in favour of using the NVIDIA Container Toolkit.
The biggest problem with this driver installation approach is that the driver version has to exactly match the host, and with enough containers it’s gonna be impossible to keep juggling these versions and keep them in sync.
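To make the version-matching problem concrete, here is a rough sketch of the check you would end up repeating for every container. The helper names (first_version, versions_match, check_toolbox_driver) are made up for illustration, and it assumes nvidia-smi is installed on both the host and inside the toolbox. If the versions really are out of sync, nvidia-smi inside the container will most likely fail outright, which this sketch simply treats as a mismatch.

```shell
#!/usr/bin/env bash
# Sketch only: compare the host's NVIDIA driver version with the one seen
# inside a toolbox container. All function names here are hypothetical.

# Read `nvidia-smi --query-gpu=driver_version --format=csv,noheader` output
# on stdin. It prints one line per GPU, so keep the first line and strip
# any whitespace.
first_version() {
    head -n 1 | tr -d '[:space:]'
}

# Succeed only when both version strings are non-empty and identical.
versions_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Query the host and the named toolbox container, then compare.
# An empty container version (nvidia-smi missing or failing) counts as a mismatch.
check_toolbox_driver() {
    local container="$1" host guest
    host=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | first_version)
    guest=$(toolbox run --container "$container" \
        nvidia-smi --query-gpu=driver_version --format=csv,noheader | first_version)
    if versions_match "$host" "$guest"; then
        echo "match: $host"
    else
        echo "mismatch: host=$host container=$guest" >&2
        return 1
    fi
}
```

Something like check_toolbox_driver my-nvidia-toolbox would then have to be re-run for each container after every host driver update, which is exactly the juggling I wanted to avoid.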
The NVIDIA Container Toolkit seems to let me bypass these shenanigans: it only cares about the driver on the host.
I don’t think toolbox currently leverages NVCT, but maybe I’ll reach out and see if they’d be interested in me adding it.
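For anyone wanting to try the same route, a minimal sketch follows. It assumes the nvidia-container-toolkit package is already installed on the host and that podman picks up CDI specs from /etc/cdi. The CDI generation step is not described anywhere in this thread, so treat it as my assumption about the setup; setup_cdi and gpu_run_cmd are hypothetical helper names. The podman flags are the ones from the command quoted earlier.

```shell
#!/usr/bin/env bash
# Sketch only: the container-toolkit route, driver installed on the host only.

# One-time host setup (assumed, not taken from the thread): generate the CDI
# spec that lets podman resolve nvidia.com/gpu devices, then list the
# device names it defines.
setup_cdi() {
    sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
    nvidia-ctk cdi list
}

# Print (not run) the podman command for a given image and command, so it
# can be inspected before use. Flags are copied from the earlier post.
gpu_run_cmd() {
    local image="$1"
    shift
    local cmd=(podman run --rm --device nvidia.com/gpu=all
               --security-opt=label=disable "$image" "$@")
    printf '%s\n' "${cmd[*]}"
}
```

Calling gpu_run_cmd localhost/toolbox/ubuntu:22.04 nvidia-smi -L prints the same command as in the earlier post. No driver package goes into the image, so there is nothing inside the container to keep in sync with the host.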
Thank you for sharing your attempt at setting up the NVIDIA Container Toolkit (NVCT) for toolbox. I’ve followed your steps and found that one can use NVCT via toolbox by commenting out the following line in your podman command:
# --volume /dev:/dev:rslave \
So far I haven’t found any problems using my toolbox without the option above, but I have no idea what risks, if any, come with removing that mount.