Persistent Server Stability, Resource Management, and Deployment Issues for a Restaurant Menu Website on Fedora Server

I am hosting a restaurant menu website on a Fedora Server installation, and I am running into several server-side issues that are becoming increasingly difficult to diagnose as the project grows. The website itself is not especially complex from a feature standpoint, but it serves dynamic content, static media assets, and receives frequent updates. While the server generally works as expected under light load, I occasionally encounter sudden spikes in CPU and memory usage that cause slow responses or temporary unavailability. These spikes do not always correspond clearly to traffic increases, making it hard to determine whether the root cause is the web server configuration, background services, or Fedora-specific defaults related to system resource management.

One ongoing challenge involves configuring and tuning the web server stack on Fedora. I am using a standard setup with Nginx and a backend application service, but default configuration values do not always seem well suited for this workload. For example, worker processes, file descriptor limits, and connection handling settings sometimes appear too restrictive, leading to dropped connections when multiple users access the site simultaneously. Adjusting these values improves performance in some cases, but I worry about introducing instability or deviating too far from recommended Fedora practices. I am looking for guidance on how to properly tune a Fedora-based web server for a content-heavy website without compromising system stability or security.
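For illustration, a minimal sketch of the knobs usually involved, assuming the stock Fedora nginx package; the values here are placeholders, not recommendations, and should be validated against your own load tests:

```nginx
# /etc/nginx/nginx.conf -- illustrative values only
worker_processes auto;          # one worker per CPU core
worker_rlimit_nofile 8192;      # raise the per-worker file descriptor limit

events {
    worker_connections 2048;    # simultaneous connections per worker
}

http {
    sendfile on;                # kernel-side transfer for static assets
    keepalive_timeout 30s;      # drop idle keep-alive connections sooner
}
```

Since Fedora runs nginx under systemd, the service-level file descriptor cap may also need raising (`systemctl edit nginx`, then `LimitNOFILE=` under `[Service]`) for the higher `worker_rlimit_nofile` to take full effect.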

System updates and package management add another layer of complexity. Fedora’s frequent updates are generally a benefit, but on a production server they sometimes introduce unexpected behavior changes. After certain system or library updates, the website has failed to start correctly, or services have required manual intervention to recover. Rolling back updates is not always straightforward, and I am unsure how best to balance keeping the system secure and up to date while minimizing the risk of downtime. Advice on update strategies, staging, or using Fedora tools to manage updates more safely would be very helpful.
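One possible middle ground is letting dnf-automatic apply only security updates unattended while you run feature updates manually at a supervised time. A sketch (option names are from dnf-automatic; values are illustrative):

```ini
# /etc/dnf/automatic.conf -- security-only unattended updates (a sketch)
[commands]
upgrade_type = security   # restrict automatic runs to security advisories
apply_updates = yes       # install them, not just download

[emitters]
emit_via = stdio          # results land in the journal via the systemd timer
```

This is enabled with `systemctl enable --now dnf-automatic.timer`. If an update does break a service, `dnf history list` shows recent transactions and `dnf history undo <id>` can roll one back, though a clean rollback is not guaranteed for every package.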

Storage and file system performance also present challenges. The website relies on a growing number of image assets and cached files, and over time disk I/O appears to become a bottleneck. At times, page loads slow down noticeably even though CPU and memory usage remain within reasonable limits. I am using default file system settings and basic caching mechanisms, but I am unsure whether additional tuning or alternative storage configurations would improve performance. Understanding how Fedora handles disk caching and file system performance for web workloads would help me make more informed decisions.
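Before tuning anything, it may help to confirm the disk really is the bottleneck. A few read-only checks (Linux-generic, nothing Fedora-specific; the sysstat package's `iostat -xz 1` gives a nicer per-second view of the same counters):

```shell
# Kernel writeback thresholds: these control when dirty page-cache data
# is flushed to disk, which can cause bursty I/O stalls
cat /proc/sys/vm/dirty_background_ratio  # background flushing starts at this % of RAM
cat /proc/sys/vm/dirty_ratio             # writers block once dirty pages reach this %

# Cumulative per-device I/O counters; compare a snapshot taken before and
# after a slow page load to see how much real disk traffic occurred
cat /proc/diskstats
```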

Logging, monitoring, and troubleshooting are additional pain points. Fedora provides robust logging through systemd and journald, but correlating logs from the web server, application services, and the operating system can be overwhelming. When an issue occurs, there is often no single clear error message that points to the cause. I would like to know what tools or practices Fedora users recommend for monitoring server health, tracking resource usage over time, and quickly identifying the source of performance or stability issues in a production environment.
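For the correlation problem specifically, journalctl can already merge several units into one time-ordered stream, which often helps before reaching for heavier tooling. A few illustrative invocations (`myapp.service` is a placeholder for your backend unit):

```shell
# One time-ordered stream from the web server and the backend together
journalctl -u nginx -u myapp.service --since "-1 hour"

# Only errors and worse, from the previous boot (useful after a crash)
journalctl -b -1 -p err

# Follow everything live while reproducing the problem
journalctl -f
```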

Finally, I am thinking ahead about scalability and long-term maintainability. As the website expands to support more content and potentially more locations, the server workload will increase accordingly. I want to ensure that the current Fedora Server setup can scale reliably, whether through vertical scaling, containerization, or other approaches supported by the Fedora ecosystem. Any insights from the Fedora community on best practices for running and maintaining web servers on Fedora, especially for dynamic and frequently updated sites, would be greatly appreciated. Sorry for the long post.

The tuning values you use will be highly dependent on your hardware resources and the workload you have.

If you are not doing so already, look at adding Prometheus instrumentation to your server to get a view of how the system performs over time.

Start with setting up node_exporter to get a view into general resources like CPU, disk IO, paging etc.
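If it helps, a minimal systemd unit for node_exporter might look like this (the binary path and user are assumptions; adjust to however you install it):

```ini
# /etc/systemd/system/node_exporter.service -- a minimal sketch
[Unit]
Description=Prometheus node_exporter
After=network-online.target

[Service]
User=node_exporter
ExecStart=/usr/local/bin/node_exporter
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

After `systemctl daemon-reload`, enable it with `systemctl enable --now node_exporter`; node_exporter listens on port 9100 by default.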

Look into Prometheus exporters for nginx-specific data, and look at adding metrics to your own application code.

Now when you notice a spike, you can look at a dashboard in Grafana and see what other activity on the server correlates with the spike.
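The Prometheus side is a short scrape-config fragment; the ports below are the conventional defaults for node_exporter and nginx-prometheus-exporter (the latter also needs nginx's `stub_status` endpoint enabled):

```yaml
# prometheus.yml -- scrape config sketch (default exporter ports assumed)
scrape_configs:
  - job_name: node
    static_configs:
      - targets: ['localhost:9100']   # node_exporter
  - job_name: nginx
    static_configs:
      - targets: ['localhost:9113']   # nginx-prometheus-exporter
```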

nginx is easy to tune on any OS (last night I fixed slow page loading through WSL by caching static files on win32 nginx :stuck_out_tongue: even though the no-store responses had been fine for years)

Make your own solutions as-needed :smiling_face_with_sunglasses:

Here’s everything I did for nginx/PHP-FPM for Friendica (an ActivityPub instance); I have specific nginx and FPM pool settings and custom systemd services/timers.

More specifics might be helpful (I’ve done daily unattended updates for years on rolling openSUSE TW and Fedora latest, across 5+ websites)

One idea might be to keep images and other static assets on a RAM disk (via a symlink, perhaps) and have nginx’s root pull from the RAM disk.
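A rough sketch of that idea using tmpfs (paths are placeholders; tmpfs contents vanish on reboot, so the assets must be repopulated at boot or deploy time):

```shell
# Create a tmpfs mount for the static assets (requires root)
mkdir -p /var/www/ramdisk
mount -t tmpfs -o size=256m,mode=0755 tmpfs /var/www/ramdisk

# Repopulate from persistent storage, then point nginx's root, an alias,
# or a symlink at /var/www/ramdisk
cp -r /var/www/site/static/. /var/www/ramdisk/

# To make the mount definition survive reboots, an /etc/fstab line such as:
# tmpfs  /var/www/ramdisk  tmpfs  size=256m,mode=0755  0 0
```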

I’d either have 3+ machines and some kind of load-balancing setup so one can reboot seamlessly while the others keep serving; or have fast reboots and reboot daily :smiling_face_with_sunglasses: (for a restaurant page the reboot could be at an odd morning hour while closed)
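The scheduled-reboot variant can be done with a systemd timer pair; the unit names and the 04:30 time below are made up for illustration:

```ini
# /etc/systemd/system/nightly-reboot.service
[Unit]
Description=Reboot during closed hours

[Service]
Type=oneshot
ExecStart=/usr/bin/systemctl reboot

# /etc/systemd/system/nightly-reboot.timer
[Unit]
Description=Trigger the nightly reboot

[Timer]
OnCalendar=*-*-* 04:30:00

[Install]
WantedBy=timers.target
```

Enabling the timer (`systemctl enable --now nightly-reboot.timer`) rather than the service is what makes it fire on schedule.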

$3.50 for a grilled cheese sandwich is interesting :stuck_out_tongue:

  • Manual check (middle-click open all websites in browser tabs; check response times/etc)
  • If anything’s broken: dmesg
  • If anything’s slow: htop
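The manual tab-opening check can also be scripted; a small sketch using curl (the example URLs are hypothetical):

```shell
# check_urls -- scripted version of the manual browser-tab check.
# Prints the HTTP status code and total response time for each URL given.
check_urls() {
    for url in "$@"; do
        # -o /dev/null discards the body; -w formats the code and timing
        if out=$(curl -sS -o /dev/null -w '%{http_code} %{time_total}s' "$url"); then
            echo "OK   $url  $out"
        else
            echo "FAIL $url"
        fi
    done
}

# Example (hypothetical domain):
# check_urls https://example.com/ https://example.com/menu
```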