You log in, run a simple command, and it takes a weirdly long time. Pages load slower than they used to. SSH feels sticky. Your monitoring says CPU is fine, RAM is fine, disk is fine. So… what’s going on?

Ubuntu servers can get slow in ways that don’t show up as obvious 100 percent CPU graphs. The “hidden” stuff. The slow bleed.

This post is a practical checklist. Seven common causes that make an Ubuntu server feel sluggish, how to confirm each one, and how to fix it fast without turning this into a weekend project.

Quick note: run commands with sudo where needed. And if this is production, do the obvious thing first. Take a snapshot or have a rollback plan. Some of these changes are safe, some are “safe if you know what you’re changing”.

Before you touch anything, get a 2 minute baseline

Do this once, save the output, then you can compare after each fix.

date
uptime
uname -a
lsb_release -a 2>/dev/null || cat /etc/os-release
top -b -n1 | head -40
free -h
df -hT
iostat -xz 1 3 2>/dev/null || true
vmstat 1 5
ss -s
dmesg -T | tail -80
journalctl -p warning -n 80 --no-pager

If you can install anything, install sysstat and iotop. They save hours.

sudo apt update
sudo apt install -y sysstat iotop

Ok. Now the hidden causes.

1. DNS problems that make everything “randomly slow”

This one is sneaky because CPU and disk look fine. But anything that does a hostname lookup pauses. Package installs. Curl. Git. Apt. Even sudo can feel slow if it’s doing reverse lookups or waiting on DNS timeouts.

How to confirm

Test raw DNS latency.

resolvectl status 2>/dev/null || systemd-resolve --status 2>/dev/null
cat /etc/resolv.conf
time getent hosts ubuntu.com
time getent hosts your-internal-service.local

If getent takes a second or more, that’s already bad. If it sometimes takes 5 to 10 seconds, you found a big chunk of the “server is slow” feeling.
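A single lookup can hide an intermittent timeout, so it can help to repeat the test and look at the worst run. Here is a minimal sketch of that idea. The helper names time_ms and dns_worst_ms are made up for this post, and it assumes GNU date (the one Ubuntu ships) for nanosecond timestamps.

```shell
# Print how long a command takes, in whole milliseconds.
# Assumes GNU date (supports %N), which Ubuntu ships.
time_ms() (
    start=$(date +%s%N)
    # Ignore the command's output and exit status; only the timing matters.
    "$@" >/dev/null 2>&1 || true
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
)

# Run the same lookup several times and print the slowest run in ms.
# The defaults (ubuntu.com, 5 runs) are illustrative only.
dns_worst_ms() (
    host=${1:-ubuntu.com}
    runs=${2:-5}
    worst=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        ms=$(time_ms getent hosts "$host")
        if [ "$ms" -gt "$worst" ]; then worst=$ms; fi
        i=$((i + 1))
    done
    echo "$worst"
)

# Example: dns_worst_ms ubuntu.com 10
```

If the worst run is in the thousands of milliseconds while most runs are fast, that points at a resolver that times out now and then rather than one that is uniformly slow.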

Also check whether systemd-resolved is struggling.

journalctl -u systemd-resolved -n 100 --no-pager

Fixes (pick what matches your setup)

1. Use known good resolvers (for a quick test, not always a forever solution):

    On systems using systemd-resolved, edit:

    sudo nano /etc/systemd/resolved.conf

    Set something like:

    [Resolve]
    DNS=1.1.1.1 8.8.8.8
    FallbackDNS=9.9.9.9
    DNSStubListener=yes

    Then restart:

    sudo systemctl restart systemd-resolved

    2. If /etc/resolv.conf is not pointing to the stub and is messy, check if it’s a symlink:

    ls -l /etc/resolv.conf

    On many Ubuntu versions it should link to:

    • /run/systemd/resolve/stub-resolv.conf (stub listener)
    • or /run/systemd/resolve/resolv.conf (direct)

    If it’s not, you might be fighting a config management tool or cloud-init. Fixing that depends on the platform, but the point is this: make DNS deterministic and fast.

    3. If SSH login is slow, also check for reverse DNS and GSSAPI delays in SSH (the UseDNS and GSSAPIAuthentication options in /etc/ssh/sshd_config), but DNS is usually the root cause.

    Retest:

    time getent hosts ubuntu.com

    You want it basically instant.

2. You’re swapping… but it doesn’t look dramatic in graphs

    A server can be “not out of RAM” and still swap enough to feel awful. Especially if you have bursts, Java, databases, Docker, or memory fragmentation. And the default swappiness can be too eager on some workloads.

    How to confirm

    free -h
    swapon --show
    vmstat 1 5

    Look at:

    • si and so in vmstat (swap in/out). If they’re non zero during slowness, that’s it.
    • In top, press M to sort by memory. Also check RES and VIRT.
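Eyeballing si and so works, but during an incident it is easy to misread columns. As a rough sketch, the check can be scripted by summing both columns across samples. The function name swap_activity is made up for this post. It finds the columns by header name and skips the first sample, because vmstat reports averages since boot on that line.

```shell
# Sum swap-in (si) and swap-out (so) across vmstat samples read from stdin.
# Columns are located by header name. The first data row is skipped because
# vmstat reports averages since boot there, not current activity.
swap_activity() {
    awk '
        $0 ~ /(^| )si( |$)/ && !si {      # header row holding column names
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si = i
                if ($i == "so") so = i
            }
            next
        }
        si && ++seen > 1 { in_sum += $si; out_sum += $so }
        END { printf "si=%d so=%d\n", in_sum, out_sum }
    '
}

# Example: vmstat 1 5 | swap_activity
```

Anything other than si=0 so=0 while the server feels slow is worth chasing.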

    Also check if you have memory pressure warnings:

    dmesg -T | grep -Ei "oom|out of memory|memory pressure|kswapd" | tail -50

    Fast fixes

    1. Reduce swappiness (common fix for general servers):

    cat /proc/sys/vm/swappiness
    sudo sysctl vm.swappiness=10

    Persist it:

    echo "vm.swappiness=10" | sudo tee /etc/sysctl.d/99-swappiness.conf
    sudo sysctl --system

    2. If swap is huge and constantly used, you probably need more RAM or fewer services. But as a quick “make it responsive now” move, you can clear swap. Be careful on production: do it during low traffic, and only when free RAM exceeds the swap in use, because swapoff has to pull every swapped page back into memory:

    sudo swapoff -a
    sudo swapon -a

    If the server goes from sluggish to snappy right after, you confirmed the diagnosis.

    3. On small VMs, adding a bit of swap can prevent OOM kills, but too much swap on slow disk can feel like death. Balance it.

3. Disk I/O wait: the server is “idle” but actually stuck

    This one fools people. CPU usage looks low, but requests are slow. That’s often high I/O wait. The CPU is waiting on storage. Especially on shared cloud disks, overloaded SSDs, or when logs are going crazy.

    How to confirm

    Check iowait in top (the %wa field). Then go deeper:

    iostat -xz 1 5

    Look for:

    • %util near 100% on a device
    • high await (r_await and w_await on newer sysstat versions)
    • high svctm, if your iostat still prints it (the field is deprecated and newer sysstat releases drop it)
    • low throughput but high latency, classic shared storage pain

    Find the process hitting disk:

    sudo iotop -oPa

    Also check for disk full or near full. That can make writes much slower.

    df -h
    df -hi

    If inodes are 100 percent, that’s a nasty one. Disk can be “50% free” but no inodes left, and then everything breaks in weird ways.
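Inode exhaustion is easy to miss in a long df listing, so here is a small sketch that prints only the mount points above a threshold. The name inode_hogs and the default of 90 percent are arbitrary choices for illustration. It reads df -Pi output and assumes mount points contain no spaces.

```shell
# Print mount points whose inode usage is at or above a threshold (percent).
# Reads `df -Pi` output on stdin; the default of 90 is an arbitrary example.
# Assumes mount points contain no spaces.
inode_hogs() {
    awk -v limit="${1:-90}" '
        NR > 1 {
            pct = $5
            sub(/%/, "", pct)    # "97%" becomes "97"; "-" stays non-numeric
            if (pct ~ /^[0-9]+$/ && pct + 0 >= limit + 0) print $6, pct "%"
        }
    '
}

# Example: df -Pi | inode_hogs 90
```

No output means no filesystem is near its inode limit.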

    Fast fixes

    1. Stop the bleeding: identify the heaviest I/O process and decide if it’s expected.

    Common culprits:

    • runaway logging
    • backup jobs
    • database checkpoints
    • Docker overlay2 churn
    • antivirus scans
    • a broken app writing endlessly

    2. If logs are the culprit, rotate and limit them.

    Check journal size:

    journalctl --disk-usage

    Limit it:

    sudo nano /etc/systemd/journald.conf

    Set, for example:

    [Journal]
    SystemMaxUse=500M
    RuntimeMaxUse=200M

    Restart:

    sudo systemctl restart systemd-journald

    Also check /var/log:

    sudo du -sh /var/log/* | sort -h | tail -20

    If the disk itself is the bottleneck and you’re on a cloud VM, sometimes the real fix is upgrading the disk type or provisioned IOPS. No command will magically turn slow storage into fast storage. But you can at least confirm it with iostat so you stop guessing.

4. One noisy neighbor: CPU throttling, steal time, or power settings

    On virtualized hosts, your server can be “slow” because it’s not actually getting CPU when it needs it. This is very common on oversold VPS plans.

    How to confirm

    Check steal time:

    top

    If you see %st (steal) above a couple percent during load, the hypervisor is taking CPU away. On busy periods it might spike to 10, 20, more. That will feel like lag.
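If you would rather get a number over a fixed window than watch top, steal can be computed from two readings of the aggregate cpu line in /proc/stat. This is a sketch only, and steal_pct is a name made up for this post. It relies on the documented field order of that line: user, nice, system, idle, iowait, irq, softirq, steal.

```shell
# Compute steal time as a percentage of all CPU time between two readings
# of the aggregate "cpu" line from /proc/stat.
# Fields after the label: user nice system idle iowait irq softirq steal ...
steal_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total[NR] = 0
            # Fields 2..9 cover user through steal; guest time is already
            # counted inside user, so it is left out of the total.
            for (i = 2; i <= 9; i++) total[NR] += $i
            steal[NR] = $9
        }
        END {
            dt = total[2] - total[1]
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (steal[2] - steal[1]) / dt
        }
    '
}

# Example:
#   a=$(head -n1 /proc/stat); sleep 5; b=$(head -n1 /proc/stat)
#   steal_pct "$a" "$b"
```

Run it during a slow period. A result that stays above a couple of percent matches what top shows in %st.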

    You can also check:

    vmstat 1 5

    And inspect CPU frequency and throttling:

    lscpu | grep -Ei "model name|mhz|cpu\(s\)"
    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor 2>/dev/null || true
    dmesg -T | grep -Ei "throttl|thermal|overheat" | tail -50

    Fixes

    1. If it’s steal time on a VPS, the fix is mostly business, not technical:

    • move to a less oversold provider
    • upgrade plan
    • move to dedicated CPU instances

    But you can still reduce spikes: cap background jobs, stagger cron, avoid heavy compactions during peak.

    2. If it’s thermal throttling on bare metal, clean fans, fix airflow, check BIOS settings. On servers, this is boring but real.

    3. If CPU governor is set to powersave on a machine that should be fast, change it. On servers it’s usually performance already, but check.

    Install tools:

    sudo apt install -y linux-tools-common linux-tools-generic

    Then use cpupower if available:

    sudo apt install -y linux-tools-$(uname -r)
    sudo cpupower frequency-set -g performance

    Persisting depends on your environment, so treat this as a “confirm it helps” step first.

5. Network issues: packet loss, MTU problems, or NIC buffers

    Sometimes the server is fine. The network path is not. You notice it as slow API calls, slow SSH, slow database connections, timeouts, retransmits.

    How to confirm

    Basic link and error stats:

    ip -s link
    ethtool -S eth0 2>/dev/null | head -60

    Ping with loss and latency:

    ping -c 50 your-gateway-ip
    ping -c 50 1.1.1.1

    Look for packet loss. Even 1 to 2 percent can wreck throughput and make apps feel “sticky”.

    Check TCP retransmits:

    nstat -az | grep -Ei "Retrans|Timeout|ListenOverflows|ListenDrops"
    ss -ti | head -50

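    MTU mismatch is classic in VPNs, tunnels, some cloud networking. Quick test with a do-not-fragment ping. The 1472 here is the payload size: add 28 bytes of IPv4 and ICMP headers and you get a standard 1500-byte MTU (adjust the size for your path):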

    ping -M do -s 1472 -c 3 1.1.1.1

    If it fails, try smaller sizes (1464, 1452, 1412) until it works. If only smaller works, you might need to adjust MTU on the interface or tunnel.

    Fixes

    1. If errors are increasing on ip -s link, you likely have:
    • bad virtual NIC config
    • MTU mismatch
    • driver issue
    • physical issues (on bare metal)

    Try lowering MTU as a test (example 1400), but do this carefully and consistently across the path:

    sudo ip link set dev eth0 mtu 1400

    If it immediately improves VPN or overlay traffic, you’ve found it.

    2. If you’re seeing retransmits to one destination, trace the path:

    mtr -rw your-destination-host

    You might discover it’s not your server at all.

6. Too many background services, timers, and cron jobs you forgot existed

    Ubuntu tends to accumulate stuff. Snap refreshes, unattended upgrades, logrotate, backup agents, monitoring agents, Docker prune scripts, database maintenance. Individually fine. Together, they create periodic slowdowns that feel mysterious.

    How to confirm

    Look at what’s running right now:

    systemctl --type=service --state=running
    ps aux --sort=-%cpu | head -20
    ps aux --sort=-%mem | head -20

    Check timers (big one):

    systemctl list-timers --all | head -80

    If your server slows down at roughly the same time daily or weekly, it’s usually here.

    Also check apt activities:

    systemctl status unattended-upgrades --no-pager
    journalctl -u unattended-upgrades -n 200 --no-pager

    Fixes

    1. Reschedule heavy jobs off peak. Don’t disable security updates blindly, but you can control timing.

    For unattended upgrades, you can adjust periodic settings:

    sudo nano /etc/apt/apt.conf.d/20auto-upgrades

    And the main config:

    sudo nano /etc/apt/apt.conf.d/50unattended-upgrades

    2. If snap is causing CPU or disk spikes, check refresh timing:

    snap refresh --time

    You can set a maintenance window (example):

    sudo snap set system refresh.timer=sun,02:00-04:00

    3. If logrotate runs and makes disk go crazy, tune specific logs rather than disabling it.

    The goal is not “turn everything off”. It’s “stop surprise heavy work during busy hours”.

7. File descriptor limits, conntrack tables, or backlog saturation (it looks like slowness)

    This is the kind of issue where the app is “up” but starts stalling. New connections hang. Requests queue. You restart the service and it magically fixes it for a while.

    Often it’s one of these:

    • file descriptors too low
    • listen backlog too low
    • conntrack table full (NAT, firewall heavy servers)
    • ephemeral port exhaustion on clients or reverse proxies

    How to confirm

    File descriptors:

    ulimit -n
    cat /proc/sys/fs/file-max
    cat /proc/sys/fs/file-nr
    sudo lsof | wc -l

    Per process (replace PID):

    grep -Ei "open files|max processes" /proc/PID/limits

    Network backlog and drops:

    ss -s
    netstat -s 2>/dev/null | grep -Ei "listen|overflow|drop" | head -40

    Conntrack:

    cat /proc/sys/net/netfilter/nf_conntrack_max 2>/dev/null || true
    cat /proc/sys/net/netfilter/nf_conntrack_count 2>/dev/null || true
    dmesg -T | grep -Ei "conntrack.*full" | tail -20

    If you see conntrack full messages, that’s a smoking gun. Connections get dropped, retries happen, everything feels slow and unreliable.
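The raw count and max values are more useful as a fill percentage, because you can watch it climb before the table is actually full. Here is a rough sketch. The helper names usage_pct, conntrack_pct and filenr_pct are invented for this post, and the conntrack files only exist when the nf_conntrack module is loaded.

```shell
# Print how full a kernel table is, as a whole percentage.
usage_pct() {
    awk -v used="$1" -v max="$2" 'BEGIN {
        if (max <= 0) { print "n/a"; exit }
        printf "%d\n", 100 * used / max
    }'
}

# Conntrack fill level. The /proc files are only present when the
# nf_conntrack module is loaded, so fall back to "n/a" otherwise.
conntrack_pct() (
    dir=/proc/sys/net/netfilter
    if [ -r "$dir/nf_conntrack_count" ] && [ -r "$dir/nf_conntrack_max" ]; then
        usage_pct "$(cat "$dir/nf_conntrack_count")" "$(cat "$dir/nf_conntrack_max")"
    else
        echo "n/a"
    fi
)

# System-wide file handles: file-nr holds "allocated unused max".
filenr_pct() (
    read -r allocated _unused max < /proc/sys/fs/file-nr
    usage_pct "$allocated" "$max"
)

# Example: echo "conntrack: $(conntrack_pct)%  file handles: $(filenr_pct)%"
```

A conntrack value that sits near 100 during the slow periods lines up with the “table full” messages in dmesg.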

    Fixes

    1. Raise open file limits properly (systemd services need systemd overrides)

    For a systemd service, create an override:

    sudo systemctl edit yourservice

    Add:

    [Service]
    LimitNOFILE=1048576

    Then:

    sudo systemctl daemon-reload
    sudo systemctl restart yourservice

    For user sessions, /etc/security/limits.conf can help, but for daemons, systemd limits matter more.

    2. Increase backlog and somaxconn (useful for high traffic proxies like nginx, haproxy)

    sudo sysctl net.core.somaxconn=65535
    echo "net.core.somaxconn=65535" | sudo tee /etc/sysctl.d/99-somaxconn.conf
    sudo sysctl --system

    3. Increase conntrack max (only if you actually need it and have RAM)

    Example:

    sudo sysctl net.netfilter.nf_conntrack_max=262144
    echo "net.netfilter.nf_conntrack_max=262144" | sudo tee /etc/sysctl.d/99-conntrack.conf
    sudo sysctl --system

    If you’re doing heavy NAT, also consider tuning hashsize, but that’s a deeper rabbit hole.

A simple “do this in order” troubleshooting flow

    If you’re in a rush and the server is slow right now, here’s the order that usually finds the issue fastest:

    1. Check DNS: time getent hosts google.com
    2. Check iowait: top then iostat -xz 1 3
    3. Check swapping: vmstat 1 5
    4. Check steal time: top, then look at %st
    5. Check packet loss: ping -c 50 gateway
    6. Check timers: systemctl list-timers --all
    7. Check limits: ss -s, conntrack count, open files

    You’re basically trying to answer: is the machine waiting on name resolution, waiting on disk, waiting on memory, waiting on CPU it can’t have, waiting on the network, or waiting because the kernel is dropping and queuing stuff.

Let’s wrap it up

    Ubuntu server slowness is rarely one dramatic thing. It’s usually one hidden bottleneck that makes everything feel heavier than it should.

    If I had to bet without seeing your box, I’d bet on one of these:

    • DNS timeouts
    • I/O wait from storage or logs
    • mild but constant swapping
    • VPS steal time
    • network loss or MTU mismatch
    • background timers doing work at the worst possible time
    • file descriptor or conntrack saturation

    Run the checks, change one thing at a time, retest. And keep the “before” output. That’s how you stop guessing and actually fix the real cause.

    If you want, paste the outputs of top (first screen), iostat -xz 1 3, vmstat 1 5, time getent hosts google.com, and ss -s, and I’ll tell you which of the 7 it most likely is.

FAQs (Frequently Asked Questions)

    Why does my Ubuntu server feel slow even when CPU, RAM, and disk usage look normal?

    Ubuntu servers can experience hidden performance issues that don’t show as high CPU or memory usage. Factors like DNS resolution delays, swapping, and disk I/O wait can cause sluggishness without obvious resource spikes. It’s important to check these less visible causes to diagnose and fix the slowness effectively.

    How can I quickly establish a performance baseline before troubleshooting my Ubuntu server?

    Run a set of diagnostic commands to capture your server’s current state, including date, uptime, system info, memory usage, disk space, I/O stats, network sockets, kernel messages, and warning logs. Installing tools like ‘sysstat’ and ‘iotop’ can also help gather detailed metrics for comparison after changes.

    How can DNS problems slow down an Ubuntu server?

    Slow or misconfigured DNS can cause delays in hostname lookups affecting package installs, curl requests, git operations, and even sudo commands. Issues include slow DNS latency, systemd-resolved struggles, incorrect /etc/resolv.conf symlinks, or unreliable DNS servers leading to timeouts.

    How do I test and fix DNS problems causing server slowness on Ubuntu?

    Test DNS latency using commands like ‘getent hosts’ and check systemd-resolved status with ‘resolvectl’ or ‘systemd-resolve’. Fixes include configuring known reliable DNS servers (e.g., 1.1.1.1, 8.8.8.8) in /etc/systemd/resolved.conf and ensuring /etc/resolv.conf correctly points to the systemd stub resolver. Restart the resolver service after changes.

    Can swap usage affect my Ubuntu server’s responsiveness even if RAM isn’t fully utilized?

    Yes. Even moderate swap activity can degrade performance due to slower disk access compared to RAM. High swap-in/out values indicate swapping is impacting responsiveness. Reducing swappiness or clearing swap temporarily can improve performance if swapping is excessive.

    How do I identify and mitigate disk I/O wait causing sluggishness on an Ubuntu server?

    Check the ‘%wa’ field in ‘top’ for I/O wait time; high values indicate the CPU is waiting on disk operations. Use tools like ‘iostat’ or ‘iotop’ to analyze disk activity. Mitigation may involve optimizing disk usage, reducing logging noise, upgrading storage hardware, or balancing workloads to reduce contention.