Slow performance: how to diagnose

Topic: Linux servers

Summary

When a server is slow, check load average, CPU, memory, disk I/O, and network, using top, iostat, and vmstat to find the bottleneck. The result tells you whether to scale, optimize, or fix a bug.

Intent: Troubleshooting

Quick answer

  • Load average (uptime): compare it with the CPU count (nproc); a load that stays above the CPU count means tasks are queuing. Use top for CPU and memory, iostat -x 2 for disk wait, and vmstat 1 for si/so (swap in/out).
  • CPU-bound: high load and high %CPU; I/O-bound: high iowait or high disk util; memory: high si/so or OOM; network: check throughput and errors (ip -s link, sar).
  • Fix: optimize app, add resources, or fix the leak/bug; set baselines and alert when load or utilisation exceeds threshold.

Steps

  1. Check load and CPU

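    Run uptime for the 1-, 5-, and 15-minute load averages, nproc for the CPU count, and top -o %CPU to sort processes by CPU use. If load is well above nproc and CPU utilisation is high, the server is CPU-bound; if load is high and iowait (wa in top and vmstat) is high, it is I/O-bound. On Linux the load average also counts tasks blocked in uninterruptible I/O wait, which is why load can be high while the CPUs are mostly idle.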

  2. Check memory and swap

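    Run free -h (read the "available" column, not "free") and vmstat 1 (si/so = memory swapped in/out per second). Sustained non-zero si/so, or OOM-killer messages in the kernel log, indicate memory pressure; see the OOM guide.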

  3. Check disk I/O

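    Run iostat -x 2 and look at %util and await (reported as r_await and w_await in recent sysstat versions). The first report shows averages since boot, so judge from the later ones. If the values stay high, the disk or storage is the bottleneck; consider SSDs, more IOPS, or reducing I/O.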

  4. Act on findings

    Scale (vertical or horizontal), fix leak, optimize query or code, or add cache; document baseline and set alerts.

Summary

You will diagnose slow performance using load, top, vmstat, and iostat to identify CPU, memory, or I/O as the bottleneck, then fix or scale accordingly. Use this when the server is slow and you need a data-driven fix.

Prerequisites

  • A shell account that can run top, vmstat, and iostat (root is not normally required). iostat and the optional sar, used for historical data, come from the sysstat package.

Steps

Step 1: Check load and CPU

uptime
nproc
top -o %CPU
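
The comparison of load against CPU count can be scripted. The following is a minimal sketch, not a standard tool: classify_load is a hypothetical helper, and the per-core thresholds (1.0 and 0.7) are illustrative assumptions that should be tuned to your own baseline.

```shell
#!/bin/sh
# Hypothetical helper: classify a load average against a CPU count.
# The thresholds (1.0 and 0.7 per core) are illustrative assumptions.
classify_load() {
    # $1 = load average, $2 = number of CPUs
    awk -v load="$1" -v cpus="$2" 'BEGIN {
        per_core = load / cpus
        if (per_core > 1.0)      state = "saturated"
        else if (per_core > 0.7) state = "busy"
        else                     state = "ok"
        printf "%.2f %s\n", per_core, state
    }'
}

# On Linux, the first field of /proc/loadavg is the 1-minute load average.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r load1 rest < /proc/loadavg
    classify_load "$load1" "$(nproc)"
fi
```

A "saturated" result only says that tasks are queuing; top and the iowait figure are still needed to tell CPU-bound from I/O-bound.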

Step 2: Check memory and swap

free -h
vmstat 1 5
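
To summarise the swap columns instead of reading them by eye, the vmstat output can be piped through a small filter. This is a sketch under assumptions: swap_activity is a hypothetical helper, it finds the si, so, and wa columns by their header names, and it skips the first sample because vmstat's first report gives averages since boot.

```shell
#!/bin/sh
# Hypothetical helper: average si, so and wa over the vmstat samples on stdin.
# Usage: vmstat 1 5 | swap_activity
swap_activity() {
    awk '
        # Header row: remember each column position by name.
        $1 == "r" { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Ignore anything that is not a numeric data row after a header.
        !("si" in col) || $1 !~ /^[0-9]+$/ { next }
        # The first sample holds averages since boot, so skip it.
        ++rows == 1 { next }
        { si += $(col["si"]); so += $(col["so"]); wa += $(col["wa"]); n++ }
        END {
            if (n == 0) { print "no samples"; exit 1 }
            printf "si=%.1f so=%.1f wa=%.1f\n", si / n, so / n, wa / n
        }
    '
}
```

Averages of si and so that stay above zero across several runs point to memory pressure; occasional single spikes usually do not.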

Step 3: Check disk I/O

iostat -x 2 5
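
The iostat report can be filtered the same way to list only busy devices. A minimal sketch, assuming the sysstat layout in which the header row starts with "Device" and contains a %util column; busy_disks and its default threshold of 80 are hypothetical choices for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print devices whose %util exceeds a threshold.
# Usage: iostat -dxy 2 5 | busy_disks 80
#   -d device report only, -x extended stats, -y omit the since-boot report
busy_disks() {
    # $1 = %util threshold (default 80, an illustrative value)
    awk -v limit="${1:-80}" '
        # Header row: first field is "Device" (older sysstat prints "Device:").
        $1 ~ /^Device:?$/ { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Device rows: only after a header with a %util column has been seen.
        ("%util" in col) && NF >= col["%util"] && $(col["%util"]) + 0 > limit {
            print $1, $(col["%util"])
        }
    '
}
```

A device appears once per report in which it exceeds the threshold, so repeated lines for the same device indicate sustained load rather than a single burst.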

Step 4: Act on findings

  • CPU: optimize the hot code path or add CPU; memory: add RAM or fix the leak; I/O: move to faster storage or reduce I/O.

Verification

  • Bottleneck identified and addressed; load and utilisation return to acceptable levels; alerts in place.
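
Alert thresholds are easier to choose once a baseline exists. The sketch below is one possible way to record one, assuming a Linux host with /proc/loadavg and a kernel that reports MemAvailable in /proc/meminfo; baseline_line and its output format are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: emit one timestamped baseline record.
baseline_line() {
    # $1 = loadavg file, $2 = meminfo file (defaults are the Linux procfs paths)
    loadavg="${1:-/proc/loadavg}"
    meminfo="${2:-/proc/meminfo}"
    read -r l1 l5 l15 rest < "$loadavg"
    avail_kb=$(awk '$1 == "MemAvailable:" { print $2 }' "$meminfo")
    printf '%s load1=%s load5=%s load15=%s mem_available_kb=%s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$l1" "$l5" "$l15" "$avail_kb"
}

if [ -r /proc/loadavg ] && [ -r /proc/meminfo ]; then
    baseline_line
fi
```

Appending this line to a log file at a fixed interval (for example from cron) gives a record of normal load and memory to compare against during the next slowdown.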

Troubleshooting

All metrics look normal: check network latency and interface errors (ip -s link) or a slow external dependency; run the app under load and re-check; consider profiling the application.
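
When the host metrics look normal, interface error and drop counters are a quick network check. A minimal sketch, assuming the Linux sysfs layout /sys/class/net/<interface>/statistics/, which exposes the same counters that ip -s link prints; iface_errors is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print error and drop counters for each interface.
iface_errors() {
    # $1 = sysfs network directory (defaults to /sys/class/net)
    base="${1:-/sys/class/net}"
    for dir in "$base"/*; do
        stats="$dir/statistics"
        [ -r "$stats/rx_errors" ] || continue
        printf '%s rx_errors=%s tx_errors=%s rx_dropped=%s tx_dropped=%s\n' \
            "${dir##*/}" \
            "$(cat "$stats/rx_errors")" "$(cat "$stats/tx_errors")" \
            "$(cat "$stats/rx_dropped")" "$(cat "$stats/tx_dropped")"
    done
}

iface_errors
```

The counters are cumulative since the interface came up, so run the check twice a few minutes apart and compare; only counters that keep rising indicate a current problem.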
