Practical Linux Performance Tuning for Production Servers


Out of the box, Linux distributions are tuned for a balance of power savings, desktop responsiveness, and general compatibility. If you are running high-traffic web servers, massive databases, or high-throughput network appliances, the default settings leave significant performance on the table.

Here is a guide to tuning critical subsystems in Linux.

1. CPU Power Management

Modern processors aggressively scale down their frequency to save power when idle. However, the time it takes to ramp up the CPU frequency when a sudden burst of requests arrives can add measurable latency.

Linux controls this via cpufreq “governors”:

  • powersave: Prioritizes low power consumption.
  • performance: Locks the CPU at its maximum frequency.
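
Which governors are actually available depends on the frequency-scaling driver; on recent Intel hardware, for example, the intel_pstate driver typically exposes only powersave and performance. Check before switching:

cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors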

For database servers and high-frequency trading applications, lock the CPU to maximum performance:

# Apply to all cores instantly
echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

To make this persistent, install cpufrequtils and configure it in /etc/default/cpufrequtils.
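
On Debian-family systems, that file is a plain shell-variable config; a minimal sketch (the GOVERNOR variable is the one that matters here):

# /etc/default/cpufrequtils
GOVERNOR="performance"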

2. Networking (sysctl Tuning)

The Linux network stack is highly tunable via /etc/sysctl.conf. The defaults are generally designed for modest hardware and varied network conditions.

If you are dealing with thousands of concurrent connections (e.g., Nginx or HAProxy), you need to increase system limits.

TCP Connection Queue

Increase the number of connections allowed to queue, preventing drops during sudden traffic spikes. somaxconn caps the accept queue (completed handshakes waiting for the application to accept() them), while tcp_max_syn_backlog caps the queue of half-open connections still mid-handshake:

net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535
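
Note that somaxconn is only a ceiling: the application must also request a large backlog in its own listen() call. Nginx, for example, accepts a backlog parameter on its listen directive (a sketch; 65535 simply matches the sysctl above):

# nginx.conf, inside a server block — request a backlog up to the somaxconn ceiling
listen 80 backlog=65535;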

Ephemeral Ports

If your server makes many outgoing connections (like a reverse proxy), it might run out of local ports. Expand the ephemeral range and allow sockets lingering in TIME_WAIT to be reused for new outgoing connections:

net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_tw_reuse = 1
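
To see whether port exhaustion is actually a risk, count the sockets currently stuck in TIME_WAIT (ss ships with iproute2):

# Count TIME_WAIT sockets; subtract 1 for the header line.
# Compare against the ~64,000 ports in the range above.
ss -tan state time-wait | wc -l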

BBR Congestion Control

Google developed the BBR (Bottleneck Bandwidth and Round-trip propagation time) congestion control algorithm, available in mainline kernels since 4.9. Compared to the default CUBIC algorithm, it drastically improves throughput and reduces latency over long-distance, lossy links.

net.ipv4.tcp_congestion_control = bbr
net.core.default_qdisc = fq

Apply all changes instantly with sudo sysctl -p.
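
Before relying on BBR, verify that your kernel actually offers it and that the setting took effect:

# bbr should appear in the list; if not, try loading the module: sudo modprobe tcp_bbr
sysctl net.ipv4.tcp_available_congestion_control
# Confirm the active algorithm
sysctl net.ipv4.tcp_congestion_control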

3. Storage I/O Schedulers

The kernel uses I/O schedulers to order disk read/write requests.

Historically, for spinning hard drives (HDDs), schedulers like cfq grouped requests by physical location to minimize mechanical arm movement; cfq and the other legacy single-queue schedulers were removed entirely in Linux 5.0. For modern NVMe and SSD drives, mechanical movement is irrelevant, and running a heavyweight scheduler on them only wastes CPU cycles.
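
A quick way to confirm what the kernel thinks of a disk (0 means non-rotational, i.e. SSD/NVMe; 1 means a spinning drive):

cat /sys/block/sda/queue/rotational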

Check your current scheduler (replace sda with your device; NVMe drives show up as nvme0n1):

cat /sys/block/sda/queue/scheduler

For NVMe/SSD drives, ensure you are using none (pass requests straight to the device) or mq-deadline (the lightweight multi-queue deadline scheduler). To change it temporarily:

echo none | sudo tee /sys/block/sda/queue/scheduler

The old elevator= boot parameter no longer works here: it belonged to the legacy block layer that was removed in Linux 5.0. Make the change permanent with a udev rule instead.
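
A minimal udev rule sketch (the file name 60-ioschedulers.rules is just a convention; adjust the scheduler to taste):

# /etc/udev/rules.d/60-ioschedulers.rules
# Non-rotational SATA devices get "none"
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="none"
# NVMe devices likewise
ACTION=="add|change", KERNEL=="nvme[0-9]n[0-9]", ATTR{queue/scheduler}="none"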

4. File Descriptors

In Linux, “everything is a file,” including network sockets. A busy web server handling 10,000 concurrent users needs at least 10,000 open file descriptors.

Check the system-wide limit:

cat /proc/sys/fs/file-max

Increase it in /etc/sysctl.conf if necessary:

fs.file-max = 2097152
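
To see how close you actually are to the ceiling, /proc/sys/fs/file-nr reports current usage:

# Three fields: allocated handles, allocated-but-unused, system-wide maximum
cat /proc/sys/fs/file-nr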

More importantly, check the per-user limits defined in /etc/security/limits.conf:

*       soft    nofile  100000
*       hard    nofile  100000
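
One caveat: limits.conf only applies to PAM login sessions. Services started by systemd ignore it entirely; set the limit in the unit instead (nginx here is just an example service name):

# Create a drop-in override with: sudo systemctl edit nginx
[Service]
LimitNOFILE=100000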

Measurement Before Modification

The golden rule of performance tuning is: Never change a parameter without measuring its impact. Use tools like perf, iostat, ss, and bcc to identify your actual bottleneck. Tuning network queues won’t help if your application is bottlenecked by CPU cache misses!
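
As a sketch, a baseline pass with those tools might look like this (package names vary by distribution; perf usually lives in linux-tools, iostat in sysstat, and bcc tool names differ, e.g. tcpconnect-bpfcc on Debian/Ubuntu):

# CPU: which functions are actually burning cycles?
sudo perf top
# Disk: per-device latency (await) and utilization, refreshed every second
iostat -x 1
# Network: socket counts by state, including TIME_WAIT
ss -s
# bcc: trace new outbound TCP connections as they happen
sudo tcpconnect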