At what point should you stop trying to fix the system?

Title, basically. When stuff goes wrong and you get an unbootable machine with no rollback options available, what is the turning point at which reinstalling the system becomes the better choice? Anyone can reinstall their system, but it takes knowledge and skill to actually fix stuff.

My feeling is that you should try to avoid reinstalling.

If you reinstall whenever there is a roadblock, there is no learning.

I have learned a tremendous amount over the years by helping other people solve their broken system issues. You don't have to be some kind of Linux guru to solve most of these problems.

In most cases, it takes patience, the ability to read and the willingness to ask for help.

Hello! This is the post about an issue that I have encountered: Amdgpu-install nuked my system
I think I messed up badly this time and I don't have a lot of experience fixing stuff, but I think you are right about this one…

For me personally, it depends on where I am on the learning curve.

When I started with Linux, everything was alien and tough. I re-installed every time.
That led me to appreciate live bootable images, and I also learned to back up the files I need before a re-install (and how and where to find them).

In the next phase I was still reinstalling most of the time, but sometimes I was able to fix the system.
The longer I used the system, the more precious files I accumulated.
I learned to split the system and user data onto separate partitions.
I also found out that copying the whole /home/ to the new installation works miraculously well. (It was much later that I found out the caveats and limitations.)

Then I got a job that (more or less) required me to use Linux and understand it.
But I also got a lot of colleagues willing to help me fix the system.
That was when I started to re-install only when upgrading the hardware (work laptop).

I found that reconstructing a fully functional system after several years of intensive use is tremendously hard.
That's when I learned to log all essential changes and how-tos (how to back up Wi-Fi connections, install a VPN, back up and restore DE settings, …).
I started to maintain a set of 'setup scripts', which were able to turn a freshly installed system into a customized workstation.

Soon I noticed that I also prefer a specific package set, and that was just a step away from creating a custom set of 'installation scripts', which install only a very basic system with the packages I want, without any bloat.

I use many computers with vastly different use cases (work laptop, gaming PC, volunteering work laptops, …), and soon the installation scripts had to differentiate between BIOS and EFI, MBR and GPT, with and without LUKS, different GPU and Wi-Fi drivers, …
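As an illustration of the BIOS/EFI differentiation mentioned above, here is a minimal sketch, not the poster's actual script. It relies on the fact that Linux exposes the /sys/firmware/efi directory only when the machine was booted via UEFI. The function name `firmware_mode` and the `sysfs_root` parameter are made up for this example.

```python
from pathlib import Path


def firmware_mode(sysfs_root: str = "/sys") -> str:
    """Return "efi" if the running system was booted via UEFI, else "bios".

    sysfs_root is a hypothetical parameter that exists only so the
    check can be pointed at a fake directory tree for testing.
    """
    # The kernel creates <sysfs>/firmware/efi only on UEFI boots.
    efi_dir = Path(sysfs_root) / "firmware" / "efi"
    return "efi" if efi_dir.is_dir() else "bios"


if __name__ == "__main__":
    print(f"Booted in {firmware_mode()} mode")
```

An install script could branch on the returned value to choose, for example, which bootloader packages to install.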

I started using BTRFS.
Snapshots allowed me to experiment with the installed system without fear. I've tweaked my setup so that booting from a different snapshot is a one-line command change. Nowadays I make a snapshot before every bigger change. On my gaming PC, even before every package update.
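A "snapshot before every bigger change" habit could look roughly like the sketch below. It only wraps the standard `btrfs subvolume snapshot` command. The subvolume path `/`, the snapshot directory `/.snapshots`, the label, and the helper names are assumptions for illustration and not the poster's layout. It defaults to a dry run so nothing is changed unless you opt in.

```python
import subprocess
from datetime import datetime
from typing import List, Optional


def snapshot_command(source: str, snapshot_dir: str, label: str,
                     now: Optional[datetime] = None,
                     readonly: bool = False) -> List[str]:
    """Build a `btrfs subvolume snapshot` command with a timestamped name."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    dest = f"{snapshot_dir.rstrip('/')}/{label}-{stamp}"
    cmd = ["btrfs", "subvolume", "snapshot"]
    if readonly:
        cmd.append("-r")  # -r makes the snapshot read-only
    return cmd + [source, dest]


def take_snapshot(source: str, snapshot_dir: str, label: str,
                  dry_run: bool = True) -> List[str]:
    """Print the command (dry run) or execute it (requires root)."""
    cmd = snapshot_command(source, snapshot_dir, label)
    if dry_run:
        print(" ".join(cmd))
    else:
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Hypothetical layout: root subvolume at "/", snapshots under "/.snapshots".
    take_snapshot("/", "/.snapshots", "pre-update")
```

Running it with `dry_run=False` needs root privileges and a snapshot directory that lives on the same BTRFS filesystem as the source subvolume.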


Now I have a git repo that combines the 'OS installation' and 'OS configuration' scripts.
I can install any combination I need and configure the system in an automated way, and maintain a "fleet" of ~20 vastly different devices that I run Fedora on.

I just hit "./autorun.sh" and I can go make a coffee.

And every time I learn something new or tweak my system, I add it back to the repo.


That took me nearly a decade and helped me to gain a vast and deep understanding of many things.

And the wisdom I gathered on the way is this:

  • It's fine to re-install every time, especially if you are a person who doesn't want to learn these particular things. We all have lives full of stuff. I like tinkering with my system(s), you might not.
  • It takes time and a lot of patience, especially when you don't have any hardware to learn on other than your "production" machine.
  • The Arch Linux wiki knows a lot. Communities know even more.
  • Try to tackle your problems starting with the one that consumes the most of your time or energy.
  • Focus on one thing at a time.
  • Back up, back up, back up everything you don't want to lose. Especially on systems you're tinkering with :)

And finally to answer your question:

At what point should you stop trying to fix the system?

Whenever you feel like it. It's your time and energy :)


My general rule on this is: if rpm/dnf stop working, reinstall. At that stage, trying to fix the system is going to be very very difficult because one cannot even easily install/remove/modify system packages. If rpm/dnf are still functional, I look at fixing the system first.
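That rule of thumb can be written down as a small check. This is only a sketch: it treats "rpm/dnf still work" as "`rpm --version` and `dnf --version` exit successfully", which is a much weaker probe than actually installing or removing a package. The function names are made up for this example.

```python
import subprocess
from typing import Callable, Sequence


def tool_works(cmd: Sequence[str]) -> bool:
    """Return True if the command can be started and exits with status 0."""
    try:
        result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL, timeout=60)
    except (OSError, subprocess.TimeoutExpired):
        # Missing binary, broken interpreter, or a hang all count as "broken".
        return False
    return result.returncode == 0


def fix_or_reinstall(check: Callable[[Sequence[str]], bool] = tool_works) -> str:
    """Apply the rule: if rpm or dnf is broken, reinstall; otherwise try to fix."""
    probes = (["rpm", "--version"], ["dnf", "--version"])
    return "fix" if all(check(p) for p in probes) else "reinstall"


if __name__ == "__main__":
    print(fix_or_reinstall())
```

On an unbootable machine you would run something like this from a chroot into the installed system, since the live image's own rpm/dnf says nothing about the broken install.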

Reinstalling using the live image takes me less time now than doing an upgrade, so I don't mind. I have a separate /home partition, so I don't lose any data/configuration, and I have scripts to install the stuff I need:


Unpopular opinion: I reinstall whenever I can, because I've spent a lot of time and effort on configuration management automation for my servers and I store 95% of all data on remote servers. Thus, getting a fresh Fedora Workstation install back to the way I like it often involves just running an Ansible playbook and a few other scripts.

To put it in cloud buzzword style: I treat my workstations like cattle too :)

One of the Quick Docs articles, 'how to report bugs', saved me from the frustration of re-installing when apps opened from the application launcher locked up for about a minute: the cursor, windows, everything stopped.

Bug reporting used to be daunting to me because providing relevant information took some practice and patience.

The issue was kwin wayland in KDE Plasma, which was detected by the built-in alert system called 'Crashed Processes Viewer'. One of the most reported issues on any support forum or subreddit is a frozen screen.

'Crashed Processes Viewer' in KDE Plasma has an interactive debugger (a coredumpctl front end) that automatically opens a gdb prompt for debugging and saving stack trace logs.

The steps I take for troubleshooting are:

  • Search Ask Fedora and Quick Docs
  • Scan through the upstream issue board (in the case above, the KDE Plasma repo in GitLab)
  • Check for already filed bugs (follow the guide in Quick Docs). If nothing has been reported, it is likely to be a unique issue.
  • Try gdb debugging
  • If not fixed, report a bug with stack trace logs and reproducible steps

Couldn't agree more. I used to have that mentality of reinstalling on a whim. I've tried to focus more on how to troubleshoot before throwing in the towel and starting fresh. It's been really beneficial in learning more about how to maintain systems.
