I’ve been hitting some kernel oopses recently while trying to get a virtual machine working how I like.
I was going to report these but it says my kernel is tainted so I can’t report it. Setting aside the fact that I’m told not to report a kernel bug that halts the system, I’m not even sure why it is tainted in the first place.
I tried checking the entries in /sys/module/<module>/taint, but nothing was returned besides a newline character.
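Rather than checking each module by hand, a loop over every taint file can help. Here is a minimal sketch, assuming the usual sysfs layout where each loadable module has a /sys/module/<module>/taint file and an empty file means that module sets no taint flag. The function name and its parameter are my own, for illustration only.

```python
from pathlib import Path


def tainting_modules(sys_module_dir="/sys/module"):
    """Return {module_name: flags} for modules whose taint file is non-empty."""
    found = {}
    for taint_file in sorted(Path(sys_module_dir).glob("*/taint")):
        try:
            flags = taint_file.read_text().strip()
        except OSError:
            # The entry vanished or is unreadable; skip it.
            continue
        if flags:
            found[taint_file.parent.name] = flags
    return found


if __name__ == "__main__":
    result = tainting_modules()
    if not result:
        print("No loaded module reports a taint flag.")
    for name, flags in result.items():
        print(f"{name}: {flags}")
```

If this prints nothing but the "no module" message, the taint most likely did not come from a module at all, which matches seeing only a newline in each file.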
I’d appreciate some suggestions on how to find out what is tainting my kernel and why.
Thanks in advance!
journalctl -k | grep taint
Often the first oops is not tainted, but it taints the kernel if you keep running afterwards. We don’t want to know about bugs that happen on a kernel that has already had an issue; we want to know about the first bug, the one that caused the initial issue. The reasoning is that anything that happens after a kernel has oopsed may not be a real issue, and could be triggered by the state the kernel was left in after that first oops.
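As for what to look for in that output: oops and warning reports normally contain either "Not tainted" or "Tainted:" followed by flag letters in a fixed-width field. A rough sketch of picking out the earliest tainted report follows, assuming that line format. The regular expression and both function names are mine, not anything the kernel or journalctl provides.

```python
import re

# Reports carry either "Not tainted" or "Tainted: <flags>".
# Unset flag positions appear as spaces, so spaces are allowed inside the field.
TAINTED = re.compile(r"Tainted: ([A-Z ]+)")


def taint_flags(line):
    """Return the taint letters found on one log line, or None if absent."""
    match = TAINTED.search(line)
    if match is None:
        return None
    return match.group(1).replace(" ", "")


def first_tainted_line(lines):
    """Return (line, flags) for the first line reporting a taint, else None."""
    for line in lines:
        flags = taint_flags(line)
        if flags:
            return line.rstrip("\n"), flags
    return None
```

Feeding it the lines of `journalctl -k` (for example via `first_tainted_line(sys.stdin)` in a small script) should point at the earliest report that carries flags. Whatever was logged just before that line is the part worth reading, since the first problem is the one that matters.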
Thanks but can you provide a bit more info? What exactly am I looking for? What logs the message, is it kernel or something else?
Well, that reasoning makes sense, I’m fine with that. Thanks for explaining! I’m not sure I agree it is often the case though.
For example, I’m looking at taint flags of GW in my message. According to my research this means that a module is loaded which the kernel developers don’t like even though it is licensed (G) and there was a previous warning issued (W).
Unfortunately, I still don’t know what that module is, i.e. what’s triggering the
Well, modules that taint are not a standard part of Fedora, nor are they in the official Fedora repositories, so it was either something you installed from a 3rd party repo or something you installed from a vendor. VMware, VirtualBox, and others all have non-upstream modules that they refuse to upstream and still want to install. Drop the output of lsmod and I can probably pinpoint which module(s) are tainting.
Hmm, this gets more and more interesting. I re-installed Fedora only a few days ago, so I haven’t enabled any 3rd party repos yet. I’ve only installed a few applications so far.
I really appreciate your help and offer to look through the modules for me. I would also like to learn to do it myself in the future, though. I suppose the average user may not care, but I’d like a big clear warning before something taints the kernel, with an easy way to find what it is.
Anyway here’s the output of lsmod. I didn’t see anything suspicious, but it’s hardly my area of expertise.
Oops, you are reading that wrong. ‘G’ means all of your modules are GPL licensed, it doesn’t mean there is a problem with any of them, but it does verify that you are not running proprietary modules. The ‘W’ tells us you had a previous warning which tainted the kernel. If that doesn’t show up in your logs because the buffer has filled up, it might be worth a fresh boot and seeing if you can find it.
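To read the letters without guessing, the current taint state is also exposed as a number in /proc/sys/kernel/tainted, one bit per flag. Below is a minimal sketch of decoding it, using the bit positions as I understand them from the kernel's tainted-kernels documentation. The function names are made up for illustration and the meanings are paraphrased.

```python
# Bit position -> (letter, paraphrased meaning).
TAINT_BITS = {
    0: ("P", "proprietary module was loaded"),
    1: ("F", "module was force loaded"),
    2: ("S", "kernel running on an out-of-spec system"),
    3: ("R", "module was force unloaded"),
    4: ("M", "machine check exception occurred"),
    5: ("B", "bad page referenced"),
    6: ("U", "taint requested by userspace"),
    7: ("D", "kernel died recently (oops or BUG)"),
    8: ("A", "ACPI table overridden by user"),
    9: ("W", "kernel issued a warning"),
    10: ("C", "staging driver was loaded"),
    11: ("I", "workaround for a platform firmware bug applied"),
    12: ("O", "out-of-tree module was loaded"),
    13: ("E", "unsigned module was loaded"),
    14: ("L", "soft lockup occurred"),
    15: ("K", "kernel has been live patched"),
    16: ("X", "auxiliary taint, defined and used by distros"),
    17: ("T", "kernel built with the struct randomization plugin"),
}


def decode_taint(value):
    """Return (letter, meaning) pairs for a taint bitmask; empty if untainted."""
    if value == 0:
        return []
    reasons = [TAINT_BITS[bit] for bit in sorted(TAINT_BITS) if value & (1 << bit)]
    if not value & 1:
        # 'G' is shown when bit 0 is clear: no proprietary module was loaded.
        reasons.insert(0, ("G", "no proprietary module was loaded"))
    return reasons


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the current taint bitmask from procfs (Linux only)."""
    with open(path) as handle:
        return int(handle.read().strip())
```

On a running system, `decode_taint(read_taint())` gives the list for the current boot. A value of 512, for instance, has only bit 9 set, which decodes to G and W: no proprietary modules, and a warning was issued earlier.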
Thank you Justin! I appreciate your time and all the info you’ve provided.
I’ll go through dmesg closely and figure out what exactly is going on.
If this does end up being reportable, which I expect it will as I can reproduce it at will, do you (or anyone) have any recommendations on where or to whom I should report it? I’ve never dealt with a problem this severe before which is completely reproducible.
Bugzilla would be the first place to start, then we can get some idea as to what is actually happening and figure out where to send it upstream.
In my opinion, one should use KVM exclusively for that purpose. However, it is not very clear to me why the ‘Virtual Machine Manager’ was marked deprecated, although not all functions are yet available via ‘Cockpit’.
I used many other solutions before, but KVM is the most compatible one when we are talking about type 2 hypervisors. In general, I have not had the best experiences with VMware virtualization as software on top of a regular Linux host.
For professional use cases, VMware is of course one of the best solutions available and is used by many people on the RHEL staff. One can try oVirt-based solutions, but the setup is not that easy for clustered solutions and will take some time.
That is certainly worth a look: OpenShift Virtualization, and I think a single-node, non-clustered solution will be available soon:
https://www.openshift.com/learn/whats-new (one may want to follow the news here)
I’ve bookmarked those links to read more at a later time.
I actually used to work for VMware, and I am a big fan of some of their technology. Sorry to hear you haven’t had the best experiences with it. I generally try to use the packages provided by the distro before going elsewhere, so at this time I am using KVM.
I didn’t realize Virtual Machine Manager is marked deprecated. I’ve been using that because it offers so much more than Boxes, although I’ve done some work with virsh directly as well.
Yes. VMware Workstation is problematic since you need to install the kernel headers and related packages whenever you update your machine to a new kernel version, which can happen quite frequently depending on the distribution you use. Support for new OS releases is a couple of versions behind, so I assume Fedora 34 is not listed in the compatibility matrix, but I haven’t checked that.
Command-line tools are fantastic for servers without a GUI, and often it’s easier to spin up a VM with virsh in an SSH session.
Makes sense! I’m not too familiar with Workstation, as I worked on ESX. It sounds like Workstation could be improved for Linux and CLI use.
I’m going to try to stick with KVM and virsh for now.
I can’t help with a virsh setup, I assume.
Why do you assume that?
Regardless, I don’t think there should be any setup possible leading to a kernel oops / system halt. I have a way to work around it, or at least not halt until log out / shut down, but that isn’t a good long term solution.