A few days ago I updated the kernel to 6.9.x, and since then Fedora only sees 1 CPU core. I tried the 6.10.x kernel, but it’s the same, so I went back to kernel 6.8.x, which works fine.
What could be the problem and how do I know when I can safely update to the new/current kernel?
pr_info_once("Ignoring hot-pluggable APIC ID %x in present package.\n",
apic_id);
topo_info.nr_rejected_cpus++;
pr_err_once("APIC ID %x exceeds kernel limit of: %x\n", apic_id, MAX_LOCAL_APIC - 1);
topo_info.nr_rejected_cpus++;
pr_warn_once("CPU limit of %d reached. Ignoring further CPUs\n", nr_cpu_ids);
topo_info.nr_rejected_cpus++;
pr_warn("Enumerated BSP APIC %x is not marked in APICBASE MSR\n", apic_id);
pr_warn("Assuming crash kernel. Limiting to one CPU to prevent machine INIT\n");
set_nr_cpu_ids(1);
goto fwbug;
pr_warn("Boot CPU APIC ID not the first enumerated APIC ID: %x != %x\n",
topo_info.boot_cpu_apic_id, apic_id);
pr_warn("Crash kernel detected. Disabling real BSP to prevent machine INIT\n");
pr_warn(FW_BUG "APIC enumeration order not specification compliant\n");
In the terminal, run sudo dmesg --level=err,warn. Do you see any of the above messages?
It would be worth checking whether there is a firmware update for the system’s BIOS.
The BIOS is responsible for setting up the CPU, and a mistake in the BIOS could be involved.
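Looking for the messages quoted above can be partly automated. The following is only a minimal sketch in Python: the fragments are copied from the kernel messages listed above with the format specifiers removed, and the names `find_topology_warnings` and `read_kernel_log` are made up for illustration.

```python
import subprocess

# Fragments of the kernel messages quoted above (format specifiers removed).
TOPOLOGY_MESSAGES = (
    "Ignoring hot-pluggable APIC ID",
    "exceeds kernel limit of",
    "CPU limit of",
    "is not marked in APICBASE MSR",
    "Assuming crash kernel",
    "Boot CPU APIC ID not the first enumerated APIC ID",
    "Crash kernel detected",
    "APIC enumeration order not specification compliant",
)


def find_topology_warnings(log_text):
    """Return the log lines that contain one of the known message fragments."""
    return [
        line
        for line in log_text.splitlines()
        if any(fragment in line for fragment in TOPOLOGY_MESSAGES)
    ]


def read_kernel_log():
    """Return the error and warning lines of the kernel log as one string.

    Reading the kernel log usually needs root, so run the script with sudo.
    """
    result = subprocess.run(
        ["dmesg", "--level=err,warn"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Calling `find_topology_warnings(read_kernel_log())` returns only the lines that match one of the fragments. An empty list suggests that none of these particular messages were logged, not that the firmware is fine.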
Thanks for the reply. I ran the command but can’t really recognize anything that helps…
This is probably important, but I don’t know what to do with it:
CPU topo: CPU limit of 1 reached. Ignoring further CPUs
[ 0.000000] Malformed early option 'acpi'
[ 0.042322] CPU topo: CPU limit of 1 reached. Ignoring further CPUs
[ 0.108881] ACPI: setting ELCR to 0200 (from 0000)
[ 0.379285] ACPI: \_SB_.LNKA: BIOS reported IRQ 0, using IRQ 11
[ 0.584319] ACPI: \_SB_.LNKB: BIOS reported IRQ 1, using IRQ 10
[ 0.586334] hpet_acpi_add: no address or irqs in _CRS
[ 0.598796] intel-lpss 0000:00:15.0: can't derive routing for PCI INT A
[ 0.598797] intel-lpss 0000:00:15.0: PCI INT A: not connected
[ 0.598823] intel-lpss 0000:00:15.0: probe with driver intel-lpss failed with error -2147483648
[ 0.611091] intel-lpss 0000:00:15.1: can't derive routing for PCI INT B
[ 0.611092] intel-lpss 0000:00:15.1: PCI INT B: not connected
[ 0.611114] intel-lpss 0000:00:15.1: probe with driver intel-lpss failed with error -2147483648
[ 0.623407] intel-lpss 0000:00:15.3: can't derive routing for PCI INT D
[ 0.623408] intel-lpss 0000:00:15.3: PCI INT D: not connected
[ 0.623433] intel-lpss 0000:00:15.3: probe with driver intel-lpss failed with error -2147483648
[ 0.649531] usb: port power management may be unreliable
[ 0.650721] device-mapper: core: CONFIG_IMA_DISABLE_HTABLE is disabled. Duplicate IMA measurements will not be recorded in the IMA log.
[ 0.654894] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
[ 2.997347] sd 8:0:0:0: [sdd] No Caching mode page found
[ 2.997349] sd 8:0:0:0: [sdd] Assuming drive cache: write through
[ 3.010643] GPT:Primary header thinks Alt. header is not at the end of the disk.
[ 3.010647] GPT:3715423 != 31494143
[ 3.010649] GPT:Alternate GPT header not at the end of the disk.
[ 3.010650] GPT:3715423 != 31494143
[ 3.010651] GPT: Use GNU Parted to correct GPT errors.
[ 3.189571] sd 6:0:0:0: [sdb] No Caching mode page found
[ 3.189574] sd 6:0:0:0: [sdb] Assuming drive cache: write through
[ 3.200233] GPT:Primary header thinks Alt. header is not at the end of the disk.
[ 3.200236] GPT:3691383 != 122138623
[ 3.200238] GPT:Alternate GPT header not at the end of the disk.
[ 3.200240] GPT:3691383 != 122138623
[ 3.200241] GPT: Use GNU Parted to correct GPT errors.
[ 4.316246] r8169 0000:02:00.0: can't disable ASPM; OS doesn't have ASPM control
[ 4.478237] i801_smbus 0000:00:1f.4: Transaction timeout
[ 4.685973] i801_smbus 0000:00:1f.4: Transaction timeout
[ 5.647392] block nvme0n1: No UUID available providing old NGUID
[ 6.801433] thermal thermal_zone2: failed to read out thermal zone (-61)
[ 8.185216] nvidia: loading out-of-tree module taints kernel.
[ 8.185222] nvidia: module license 'NVIDIA' taints kernel.
[ 8.185222] Disabling lock debugging due to kernel taint
[ 8.185225] nvidia: module license taints kernel.
[ 8.425393] snd_hda_intel 0000:01:00.1: azx_get_response timeout, switching to polling mode: last cmd=0x000f0000
[ 8.509434] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 550.90.07 Fri May 31 09:35:42 UTC 2024
[ 8.602195] nvidia_uvm: module uses symbols nvUvmInterfaceDisableAccessCntr from proprietary module nvidia, inheriting taint.
[ 11.559206] Bluetooth: hci0: Malformed MSFT vendor event: 0x02
[ 11.570078] Bluetooth: hci0: HCI LE Coded PHY feature bit is set, but its usage is not supported.
[ 57.076396] warning: `ThreadPoolForeg' uses wireless extensions which will stop working for Wi-Fi 7 hardware; use nl80211
After trying some other stuff I’m back to Fedora.
The problem still persists (even with the latest kernel) but I found a way to make it work.
noapic: the device boots but uses only 1 CPU
no boot parameter: the device freezes and doesn’t boot at all
pci=nobar: the device boots and everything seems to work
After some more research I found the parameter pci=nobar and it seems to work fine. The system boots and I can use all cores and the NVIDIA graphics card. Everything seems to work.
Are there any disadvantages to using this boot parameter? Maybe some bad side effects I haven’t noticed yet?
Thanks for all the helpful replies. I hope this piece of information helps other people with a similar problem.
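For anyone who wants to confirm the result after changing boot parameters, here is a rough Python sketch. It assumes the usual Linux locations `/sys/devices/system/cpu/online` and `/proc/cmdline`; the function names are hypothetical. It reports how many CPUs are online and whether `nobar` is among the `pci=` options, so the count can be compared with the number of logical CPUs the processor is known to have.

```python
def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-3,8' into [0, 1, 2, 3, 8]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def has_pci_option(cmdline, option):
    """Check whether a kernel command line has the given option in a pci= token."""
    for token in cmdline.split():
        if token.startswith("pci="):
            # Several pci options can be combined with commas.
            if option in token[len("pci="):].split(","):
                return True
    return False


def report(online_path="/sys/devices/system/cpu/online",
           cmdline_path="/proc/cmdline"):
    """Print the number of online CPUs and whether pci=nobar is active."""
    with open(online_path) as f:
        online = parse_cpu_list(f.read())
    with open(cmdline_path) as f:
        cmdline = f.read()
    print(f"online CPUs: {len(online)}")
    print(f"pci=nobar set: {has_pci_option(cmdline, 'nobar')}")
```

Running `report()` on the affected kernel without the parameter should show a single online CPU, matching the "CPU limit of 1 reached" line in the log above.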
So “nobar” means “No Base Address Register”. I was not aware of what that was. After looking around on the internet, that is the best explanation I found.
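The kernel parameter documentation describes `pci=nobar` as not assigning address space to BARs that the BIOS left unassigned. To see which regions that might affect, the address ranges of each PCI device can be read from its `resource` file in sysfs, where every line holds a start address, an end address and flags in hexadecimal. The sketch below is only an illustration: the function names are invented, and treating "flags set but start address zero" as "no address assigned" is an assumption, not something confirmed in this thread.

```python
import glob


def parse_pci_resources(text):
    """Parse a sysfs PCI 'resource' file into (index, start, end, flags) tuples."""
    entries = []
    for index, line in enumerate(text.splitlines()):
        fields = line.split()
        if len(fields) != 3:
            continue
        start, end, flags = (int(field, 16) for field in fields)
        entries.append((index, start, end, flags))
    return entries


def possibly_unassigned(entries):
    """Return entries that have flags set but no start address.

    Assumption: such an entry describes a region the device asks for
    that has not been given an address.
    """
    return [entry for entry in entries if entry[3] != 0 and entry[1] == 0]


def scan(pattern="/sys/bus/pci/devices/*/resource"):
    """Print every device region that looks unassigned."""
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            entries = parse_pci_resources(f.read())
        for index, start, end, flags in possibly_unassigned(entries):
            print(f"{path}: region {index} flags={flags:#x} has no start address")
```

If `scan()` prints nothing while booted with `pci=nobar`, that is a hint (not proof) that no device is missing an address range because of the parameter.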