Debian First Aid Kit

From Archive Debian Forums

All commands are verified on Debian 13.1 (Trixie) with kernels 6.12.73+deb13-amd64 and 6.16.3+deb13-amd64 (x86_64).

Created : 2025-10-27 15:54:21

Last Updated : 2026-03-11 22:18:41 ID : 544000.5

This guide is preserved in its original form. For updated content and the forthcoming Debian book, visit www.debianfirstaid.org .

Table of Contents

  1. System Freezes & Crashes
  2. Boot Problems
  3. Network Issues
  4. Package Management Issues
  5. Disk & Filesystem Issues
  6. Performance Issues
  7. Service & Application Errors
  8. Permission & Access Issues
  9. Hardware Issues
  10. Quick Diagnostic Commands
  11. Useful Aliases & Shortcuts
  12. Tips for Effective Troubleshooting

1. System Freezes & Crashes

Check System Logs

# View logs from previous boot (after freeze/crash)
journalctl -b -1

# List all available boots
journalctl --list-boots

# Show only kernel messages from previous boot
journalctl -b -1 -k

# Show errors and critical messages only
journalctl -b -1 -p err

# Save logs to file for analysis
journalctl -b -1 > ~/crash-log.txt

Common Freeze Causes to Look For

  • Kernel panics: Search for "kernel panic" or "Oops"
  • Out of Memory (OOM): Search for "Out of memory" or "oom-killer"
  • Hardware errors: Look for "MCE" (Machine Check Exception) or "hardware error"
  • Driver issues: Check for module/driver failures
  • Overheating: Check system temperatures
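
As a rough first pass, the searches listed above can be bundled into one helper. This is a minimal sketch and not part of the original guide: the function names count_matches and scan_crash_log are made up for illustration, and the patterns are simple guesses that can also match unrelated lines.

```shell
# Print how many lines of a log (passed as text) match an extended regex.
# $1 = label, $2 = regex, $3 = log text
count_matches() {
    printf '%s: %s\n' "$1" "$(printf '%s\n' "$3" | grep -ciE -- "$2")"
}

# Read log text on stdin and count lines matching common freeze signatures.
scan_crash_log() {
    local log
    log=$(cat)
    count_matches "kernel panic / oops" 'kernel panic|oops' "$log"
    count_matches "out of memory" 'out of memory|oom-killer' "$log"
    count_matches "hardware error" 'mce:|machine check|hardware error' "$log"
}
```

Typical use would be journalctl -b -1 | scan_crash_log. A non-zero count only tells you where to look; read the matching lines before drawing conclusions.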

Check System Resources

# View memory usage
free -h

# Check disk space
df -h

# Monitor system resources in real-time
htop
# or
top
# I prefer btop for its clearer presentation; it needs to be installed first
sudo apt install btop
btop

# Check kernel messages for errors (including disk errors)
sudo dmesg | grep -i error

Not every match is a real problem. Errors like the following appear on every boot because of incomplete or buggy ACPI tables in the BIOS, but they are harmless:

[    0.686554] ACPI Error: No handler for Region [ECRM] (00000000201accc4) [EmbeddedControl] (20250404/evregion-131)
[    0.686577] ACPI Error: Region EmbeddedControl (ID=3) has no handler (20250404/exfldio-261)
[    0.686594] ACPI Error: Aborting method \_SB.GPIO._EVT due to previous error (AE_NOT_EXIST) (20250404/psparse-529)

2. Boot Problems

Check Boot Process

# View systemd boot analysis
systemd-analyze blame

# See what failed during boot
systemctl --failed

# Check specific service status
systemctl status <service-name>    # e.g. NetworkManager.service

Access Recovery Mode

  1. Reboot and hold Shift to access GRUB menu (depending on your grub timing settings)
  2. Select "Advanced options"
  3. Choose recovery mode
  4. Select "root" for root shell access

Common Boot Fixes

# Repair filesystem errors
# Identify the device with lsblk first; the filesystem must not be mounted
lsblk
sudo fsck /dev/sdXN

# Reinstall GRUB bootloader
sudo grub-install /dev/sdX
sudo update-grub

# Check fstab for mount errors
cat /etc/fstab
findmnt --verify

3. Network Issues

Diagnose Network Connection

# Check network interfaces
ip addr show

# Test connectivity
ping -c 4 8.8.8.8
# or, over IPv6
ping -c 6 2a00:1450:4007:809::200e

# Check DNS resolution
nslookup google.com

# View routing table
ip route show

# Check active connections
ss -tuln

Restart Network Service

# For systems with NetworkManager
sudo systemctl restart NetworkManager

# For systems with networking service
sudo systemctl restart networking

# Bring interface down and up (replace eth0 with the name shown by "ip addr show")
sudo ip link set eth0 down
sudo ip link set eth0 up

If you need to show your hosting provider that a problem lies outside your control, you can bring out the big guns with MTR.

MTR (It’s Traceroute on Steroids)

What is MTR?

MTR combines the functionality of ping and traceroute into a single real-time network diagnostic tool. It continuously monitors the path between your system and a destination, providing detailed statistics about latency and packet loss at each hop.

Installation

sudo apt install mtr

Basic Usage

# Basic MTR (interactive mode)
mtr google.com

# Report mode (run 10 cycles and exit)
mtr --report google.com

# Specify number of pings
mtr --report-cycles 50 google.com

# Use TCP instead of ICMP
mtr --tcp google.com

# Use UDP
mtr --udp google.com

# No DNS resolution (faster, shows IPs only)
mtr --no-dns google.com

# Show both hostnames and IPs
mtr --show-ips google.com

Understanding MTR Output

Sample Output

HOST: hostname                    Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 192.168.1.1                0.0%    10    1.2   1.5   1.0   2.3   0.4
  2.|-- 10.0.0.1                   0.0%    10    8.5   9.2   7.8  12.1   1.3
  3.|-- isp-gateway.net            0.0%    10   15.3  16.1  14.2  19.8   1.8
  4.|-- ???                       100.0%   10    0.0   0.0   0.0   0.0   0.0
  5.|-- google.com                 0.0%    10   25.4  26.8  24.1  32.5   2.4

Column Meanings

  • HOST: Hostname or IP address of each hop in the route
  • Loss%: Percentage of packets lost at this hop
  • Snt: Number of packets sent to this hop
  • Last: Latency of the most recent packet (milliseconds)
  • Avg: Average latency across all packets (milliseconds)
  • Best: Lowest latency recorded (milliseconds)
  • Wrst: Highest latency recorded (milliseconds)
  • StDev: Standard deviation - measures latency consistency (lower is better)

Interpreting Results

Healthy Network

  • Loss% = 0% on all hops
  • Stable latency (low StDev values)
  • Gradual latency increase as hop count increases
  • Consistent response times

Problem Indicators

1. High Packet Loss at Specific Hop

5.|-- problem-router.net        25.0%    10   45.3  48.2  42.1  65.8  8.4

Analysis:

  • If loss continues to destination: Real problem at this router
  • If loss only at this hop but NOT beyond: Router may be rate-limiting ICMP (false positive, not a real problem)

Rule of thumb: If packet loss appears at hop N but hops N+1, N+2, etc. show 0% loss, it's usually just ICMP rate limiting.
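
The rule of thumb can be applied mechanically to a saved report. The following is a minimal sketch, not part of the original guide: mtr_loss_check is a hypothetical helper name, it assumes the default report columns shown above (no --show-ips), and it simplifies the rule by comparing each lossy hop with the final hop only.

```shell
# Read an mtr report on stdin and classify packet loss at intermediate hops.
mtr_loss_check() {
    awk '
        $1 ~ /^[0-9]+\.\|--$/ {
            n++
            num[n]  = $1 + 0      # "3.|--" converts to 3
            host[n] = $2
            loss[n] = $3 + 0      # "20.0%" converts to 20
        }
        END {
            for (i = 1; i < n; i++) {
                if (loss[i] > 0 && host[i] != "???") {
                    verdict = (loss[n] > 0) ? "loss persists to destination" : "likely ICMP rate limiting"
                    printf "hop %d (%s): %.1f%% loss, %s\n", num[i], host[i], loss[i], verdict
                }
            }
            if (n > 0) printf "destination (%s): %.1f%% loss\n", host[n], loss[n]
        }
    '
}
```

For example: mtr --report --report-cycles 50 example.com | mtr_loss_check. Hops that never answer (???) are skipped, since they say nothing about forwarded traffic.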

2. High Latency at Specific Hop

3.|-- slow-link.net              0.0%    10  150.3 155.2 148.1 165.8  5.4

Indicates:

  • Network bottleneck or congested link
  • Geographical distance (intercontinental hops)
  • Slow routing equipment

3. No Response (???)

4.|-- ???                       100.0%   10    0.0   0.0   0.0   0.0   0.0

Possible causes:

  • Router configured to not respond to ICMP/traceroute packets
  • Firewall blocking diagnostic packets
  • Not necessarily a problem if later hops respond normally

4. High Jitter (StDev)

6.|-- unstable.net               0.0%    10   35.3  52.8  28.1  95.2  24.7

Indicates:

  • Inconsistent latency (high StDev of 24.7ms)
  • Network congestion or instability
  • Poor for real-time applications (VoIP, gaming, video calls)

5. Sudden Latency Spike

1.|-- 192.168.1.1                0.0%    10    1.2   1.5   1.0   2.3   0.4
2.|-- 10.0.0.1                   0.0%    10    8.5   9.2   7.8  12.1   1.3
3.|-- problematic-hop.net        0.0%    10  180.5 185.2 178.1 195.8  6.4
4.|-- next-hop.net               0.0%    10  182.3 187.8 180.5 198.2  6.8

Problem identified: Hop 3 introduces ~170ms of latency (jump from 9ms to 180ms)
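
Finding such a jump by eye gets tedious on long paths. As an illustration only, the sketch below reports the hop with the largest increase in Avg latency over the previous responding hop. The name mtr_biggest_jump is hypothetical, and the function assumes the default report columns, where Avg is the sixth field.

```shell
# Read an mtr report on stdin and report the largest hop-to-hop rise in Avg.
mtr_biggest_jump() {
    awk '
        $1 ~ /^[0-9]+\.\|--$/ && $2 != "???" {
            avg = $6 + 0
            if (seen && avg - prev > best) {
                best = avg - prev
                hop  = $1 + 0
                host = $2
            }
            prev = avg
            seen = 1
        }
        END {
            if (best > 0) printf "hop %d (%s) adds about %.1f ms\n", hop, host, best
            else print "no latency increase found"
        }
    '
}
```

Run on the four-hop example above, it points at hop 3. A large jump across an intercontinental link is expected, so treat the result as a starting point rather than a verdict.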

Advanced Usage

Report Mode with Different Output Formats

# CSV format for logging and analysis
mtr --report --csv google.com > network-report.csv

# JSON output for parsing
mtr --report --json google.com

# XML format
mtr --report --xml google.com

# Wide report (no abbreviations)
mtr --report-wide google.com

Protocol Selection

# Use ICMP (default, requires no special permissions)
mtr google.com

# Use UDP (alternative to ICMP)
mtr --udp google.com

# Use TCP (useful for firewall testing)
mtr --tcp google.com

# Test specific TCP port
sudo mtr --tcp --port 443 google.com
sudo mtr --tcp --port 22 remote-server.com

Timing and Duration

# Specify interval between pings (default 1 second; values below 1 second require root)
sudo mtr --interval 0.5 google.com

# Extended test with 100 cycles
mtr --report --report-cycles 100 google.com

# Continuous monitoring (Ctrl+C to stop)
mtr google.com

# Quick 10-cycle report
mtr --report --report-cycles 10 google.com

Advanced Options

# Show Autonomous System (AS) numbers
mtr --aslookup google.com

# Set maximum number of hops
mtr --max-ttl 20 google.com

# Set packet size
mtr --psize 1000 google.com

# Show both IP and hostname
mtr --show-ips google.com

# Specify source address (multiple network interfaces)
mtr --address 192.168.1.100 google.com

# IPv4 only
mtr -4 google.com

# IPv6 only
mtr -6 google.com

Interactive Mode Commands

When running MTR in interactive mode (just mtr hostname), use these keys:

Key   Function
h     Display help
d     Cycle through the display modes
n     Toggle between hostnames and IP addresses
r     Reset all statistics
p     Pause the display (Space resumes)
q     Quit MTR
u     Switch between ICMP and UDP probes
t     Switch between ICMP and TCP probes
y     Cycle through the IP information display
o     Set the fields (columns) to display
j     Toggle between latency and jitter statistics

Practical Troubleshooting Scenarios

Scenario 1: Diagnosing Slow Website

# Run extended test to get accurate statistics
mtr --report --report-cycles 100 --no-dns example.com

# Look for:
# - High average latency at specific hops
# - Packet loss at destination
# - High StDev values (jitter)

Scenario 2: Testing if Firewall Blocks SSH

# Test SSH port (22) connectivity
sudo mtr --report --tcp --port 22 --report-cycles 50 remote-server.com

# If last hop shows 100% loss but earlier hops are fine:
# - Port 22 might be filtered
# - Try standard ICMP test for comparison

Scenario 3: ISP Performance Issues

# Test path to reliable external server
mtr --report --report-cycles 100 8.8.8.8

# Compare with another DNS server
mtr --report --report-cycles 100 1.1.1.1

# If issues appear in first 3-4 hops: likely ISP problem
# If issues appear later: problem is beyond your ISP

Scenario 4: VPN Troubleshooting

# Test before connecting to VPN
mtr --report --report-cycles 50 --no-dns google.com > before-vpn.txt

# Test after connecting to VPN
mtr --report --report-cycles 50 --no-dns google.com > after-vpn.txt

# Compare the two files to see VPN impact
diff before-vpn.txt after-vpn.txt

Scenario 5: Gaming/Streaming Performance

# Test for jitter (important for real-time applications)
mtr --report --report-cycles 200 game-server.com

# Look for:
# - Low average latency (< 50ms for gaming)
# - Low StDev (< 5ms preferred)
# - Zero packet loss

Scenario 6: Intermittent Connectivity

# Long-running test to catch intermittent issues
mtr --report --report-cycles 500 --interval 1 target.com > long-test.txt

# Monitor in real-time for several minutes
mtr target.com
# Watch for sudden spikes in Loss% or latency

Continuous Monitoring

Log Network Performance Over Time

# Create timestamped reports every hour
while true; do
    timestamp=$(date +%Y%m%d-%H%M%S)
    mtr --report --report-cycles 50 google.com > "mtr-$timestamp.txt"
    sleep 3600
done

Monitor Multiple Destinations

#!/bin/bash
# Simple monitoring script: one combined report for several destinations
echo "=== MTR Report $(date) ===" > daily-network-report.txt
echo "" >> daily-network-report.txt

echo "Google DNS:" >> daily-network-report.txt
mtr --report --report-cycles 50 --no-dns 8.8.8.8 >> daily-network-report.txt
echo "" >> daily-network-report.txt

echo "Cloudflare DNS:" >> daily-network-report.txt
mtr --report --report-cycles 50 --no-dns 1.1.1.1 >> daily-network-report.txt
echo "" >> daily-network-report.txt

echo "Your Server:" >> daily-network-report.txt
mtr --report --report-cycles 50 your-server.com >> daily-network-report.txt

Useful Aliases for .bashrc

# Quick network path analysis
alias mtrreport='mtr --report --report-cycles 50 --no-dns'

# Check connection to Google DNS
alias netcheck='mtr --report --report-cycles 20 8.8.8.8'

# Extended network test
alias mtrlong='mtr --report --report-cycles 100'

# TCP port 443 test (HTTPS)
alias mtrhttps='sudo mtr --report --tcp --port 443 --report-cycles 30'

# Quick comparison of major DNS providers
alias dnstest='echo "Google:" && mtr --report --report-cycles 20 8.8.8.8 && echo -e "\nCloudflare:" && mtr --report --report-cycles 20 1.1.1.1'

After adding to ~/.bashrc:

source ~/.bashrc

Troubleshooting Tips

1. Permission Issues

If you get permission errors with TCP mode:

# Use sudo for TCP on privileged ports
sudo mtr --tcp --port 443 example.com

# Or set capabilities (one-time setup)
sudo setcap cap_net_raw+ep /usr/bin/mtr-packet

2. False Positives

Common false positive: Packet loss at intermediate hops but NOT at the destination.

Example:

3.|-- router.isp.net            20.0%    50   15.3  16.1  14.2  19.8   1.8
4.|-- next-hop.net               0.0%    50   18.5  19.2  17.8  22.1   1.3
5.|-- destination.com            0.0%    50   25.4  26.8  24.1  32.5   2.4

This is OK! Hop 3 shows 20% loss, but hops 4 and 5 show 0% loss. The router at hop 3 is rate-limiting ICMP responses, but actual traffic flows normally.

3. DNS Resolution Delays

If MTR seems slow to start:

# Skip DNS resolution for faster results
mtr --no-dns target.com

# Resolve names afterward if needed
host 203.0.113.1

4. Comparing Results

# Run multiple tests and compare
mtr --report --report-cycles 50 example.com > test1.txt
sleep 60
mtr --report --report-cycles 50 example.com > test2.txt
diff test1.txt test2.txt

When to Use MTR vs Other Tools

Tool        Best For                                                              Limitations
MTR         Continuous monitoring, identifying problem hops, detailed statistics  Requires installation
ping        Quick connectivity test, simple latency check                         Only tests endpoint
traceroute  One-time path discovery                                               No continuous monitoring
ss/netstat  Local connection status                                               Doesn't test remote paths

Best Practices

  1. Run enough cycles: Use at least 50-100 cycles for accurate statistics
  2. Use --no-dns: Faster and avoids DNS resolution issues during testing
  3. Check multiple times: Network conditions vary; test at different times
  4. Compare protocols: Try ICMP, UDP, and TCP if one shows issues
  5. Document findings: Save reports with timestamps for trend analysis
  6. Test known-good hosts: Use 8.8.8.8 or 1.1.1.1 to verify your network first
  7. Be patient: Let MTR run for at least 30-60 seconds before drawing conclusions

Reading Between the Lines

Good Network Health Example

HOST: hostname                    Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 192.168.1.1                0.0%    50    1.1   1.2   0.9   2.1   0.2
  2.|-- 10.0.0.1                   0.0%    50    8.2   8.5   7.5  10.2   0.5
  3.|-- isp-gateway.net            0.0%    50   15.1  15.5  14.0  18.3   0.8
  4.|-- google.com                 0.0%    50   24.8  25.2  23.5  28.1   1.1

✅ No packet loss, consistent latency, low jitter

Problem Network Example

HOST: hostname                    Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 192.168.1.1                0.0%    50    1.2   1.5   1.0   2.3   0.3
  2.|-- 10.0.0.1                   5.0%    50   45.3  52.8   8.1 245.2  45.7
  3.|-- ???                       100.0%   50    0.0   0.0   0.0   0.0   0.0
  4.|-- destination.com           15.0%    50   95.4 125.8  48.2 385.5  78.2

❌ Packet loss at hop 2 and destination, high jitter, very high worst-case latency

Summary

MTR is your Swiss Army knife for network diagnostics. Key takeaways:

  • Use --report-cycles 50+ for reliable data
  • Watch for packet loss at the destination (intermediate losses may be false positives)
  • High StDev indicates unstable connection
  • High Avg latency shows slow links
  • Use --no-dns for faster results
  • Compare ICMP, UDP, and TCP modes if issues appear
  • Test at different times of day for comprehensive analysis

4. Package Management Issues

Fix Broken Packages

# Update package lists
sudo apt update

# Fix broken dependencies
sudo apt --fix-broken install

# Reconfigure packages
sudo dpkg --configure -a
# (no output means there is nothing to do)

# Clean package cache
sudo apt clean
sudo apt autoclean

# Remove unused packages
sudo apt autoremove

Handle Held or Locked Packages

# If apt is locked, find the process
sudo lsof /var/lib/dpkg/lock-frontend

# Force-remove stale locks (use carefully, and only when no apt or dpkg process is running)
sudo rm /var/lib/dpkg/lock-frontend
sudo rm /var/lib/apt/lists/lock

# Reconfigure dpkg
sudo dpkg --configure -a

5. Disk & Filesystem Issues

Check Disk Health

# Check disk space
df -h

# Check inode usage
df -i

# View disk I/O statistics (iostat is part of the sysstat package)
sudo apt install sysstat
iostat -x 1

sysstat also includes other useful performance monitoring tools:

  • mpstat - CPU statistics
  • sar - system activity reporter
  • pidstat - process statistics
  • cifsiostat - CIFS statistics

# Show stats in MB instead of KB
iostat -xm 2

# Monitor specific device
iostat -x sda 1

# Check for disk errors in dmesg
sudo dmesg | grep -i "error\|fail"

# SMART disk health (if smartmontools installed)
sudo smartctl -a /dev/sda
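
Reading df output by eye works on a desktop, but on a machine with many mounts a small filter helps. This is a sketch under stated assumptions: df_over_threshold is a hypothetical helper name, it expects the POSIX layout produced by df -P, and it does not handle mount points that contain spaces.

```shell
# Read `df -P` output on stdin and print mount points whose usage is at or
# above the given percentage ($1, default 90).
df_over_threshold() {
    awk -v limit="${1:-90}" '
        NR > 1 {
            use = $5 + 0          # "93%" converts to 93
            if (use >= limit) printf "%s %d%%\n", $6, use
        }
    '
}
```

For example, df -P | df_over_threshold 85 lists every filesystem that is at least 85% full. Piping df -Pi through the same function applies the check to inode usage.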

Repair Filesystem

# Unmount the partition first
sudo umount /dev/sdXN

# Run filesystem check
sudo fsck /dev/sdXN

# For ext4 specifically
sudo e2fsck -f /dev/sdXN

6. Performance Issues

Identify Resource Hogs

# CPU usage by process
top -o %CPU

# Memory usage by process
top -o %MEM

# Disk usage by directory
du -sh /* | sort -h

# Find large files
find / -type f -size +100M 2>/dev/null

# Check running processes
ps aux --sort=-%mem | head -20
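
Per-process listings can hide a hog that is split across many processes, such as a browser. The sketch below totals resident memory per command name. The helper name rss_by_command is made up for illustration; it expects lines of "RSS-in-KiB command-name", which is what ps -eo rss=,comm= prints. Shared pages are counted once per process, so the totals overstate real usage.

```shell
# Read "rss_kib command" lines on stdin and print total MiB per command,
# largest first, as "MiB<TAB>command".
rss_by_command() {
    awk '
        {
            rss = $1
            $1 = ""               # drop the RSS field, keep the full name
            sub(/^ +/, "")
            total[$0] += rss
        }
        END {
            for (name in total) printf "%d\t%s\n", total[name] / 1024, name
        }
    ' | sort -nr
}
```

Usage: ps -eo rss=,comm= | rss_by_command | head -10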

System Temperature Monitoring

# Install sensors (if not installed)
sudo apt install lm-sensors
sudo sensors-detect

# View temperatures
sensors

# Real-time temperature monitoring
watch -n 2 sensors
# I keep a similar command as an alias in ~/.bashrc; see section 11, Useful Aliases & Shortcuts

7. Service & Application Errors

Debug Service Problems

# Check service status
sudo systemctl status service-name

# View service logs
sudo journalctl -u service-name

# Restart a service
sudo systemctl restart service-name

# Enable service at boot
sudo systemctl enable service-name

# View recent service failures
journalctl -p err -b

Application Crash Investigation

# Check for core dumps
ls -lh /var/crash/

# View application-specific logs
ls /var/log/

# Check syslog for application errors (needs rsyslog installed; otherwise use journalctl -f)
sudo tail -f /var/log/syslog

8. Permission & Access Issues

Fix Common Permission Problems

# Check file ownership
ls -l /path/to/file

# Change ownership (user:group)
sudo chown user:group /path/to/file    # e.g. michael:michael

# Change permissions
sudo chmod 644 /path/to/file

# Recursively fix permissions
sudo chown -R user:group /path/to/directory

User & Authentication Issues

# Check user information
id username

# View user login history
last -a

# Check failed login attempts
sudo journalctl | grep "authentication failure"

# Reset user password
sudo passwd username
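
A raw grep for "authentication failure" can return hundreds of lines. As a rough sketch, the helper below groups them by remote host. The name auth_failures_by_host is hypothetical, and the function assumes the usual pam_unix message format, which carries an rhost= field. Failures without a remote host are shown as (local).

```shell
# Read journal text on stdin and count authentication failures per remote host.
auth_failures_by_host() {
    grep 'authentication failure' |
        grep -o 'rhost=[^ ]*' |
        sed 's/^rhost=//' |
        sort | uniq -c | sort -k1,1nr |
        awk '{ print $1, ($2 == "" ? "(local)" : $2) }'
}
```

Usage: sudo journalctl -b | auth_failures_by_host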

9. Hardware Issues

Identify Hardware

# List all hardware (lshw may not be installed by default)
sudo apt install lshw
sudo lshw -short

# PCI devices
lspci -v

# USB devices
lsusb -v

# CPU information
lscpu

# Memory information
sudo dmidecode --type memory

Check Hardware Errors

# Kernel ring buffer (hardware messages); press q to quit less
sudo dmesg | less

# Search for specific hardware issues (no output means no matches, which is good)
sudo dmesg | grep -i "error\|fail\|warn"

# Check for USB issues
sudo dmesg | grep -i usb

10. Quick Diagnostic Commands

System Information at a Glance

# Uptime and load average
uptime

# Kernel version
uname -r

# Debian version
cat /etc/debian_version

# System summary
sudo inxi -Fxz

Emergency Toolkit

# Create a diagnostic report
sudo journalctl -b > ~/system-report.txt
sudo dmesg >> ~/system-report.txt
systemctl --failed >> ~/system-report.txt
df -h >> ~/system-report.txt
free -h >> ~/system-report.txt

# Watch logs in real-time
sudo journalctl -f

# Monitor system resources continuously
watch -n 1 'free -h && df -h'
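
A report built by appending raw command output is hard to navigate, because nothing marks where one command ends and the next begins. One possible refinement, offered as a sketch only, is a small wrapper that prints a header before each command. The name report_section is hypothetical.

```shell
# Print a titled section containing the output of a command.
# $1 = section title, remaining arguments = the command to run
report_section() {
    local title="$1"
    shift
    printf '=== %s ===\n' "$title"
    "$@" 2>&1 || printf '(command failed: %s)\n' "$*"
    printf '\n'
}
```

It could be used like this: { report_section "Failed units" systemctl --failed; report_section "Disk" df -h; report_section "Memory" free -h; } > ~/system-report.txt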

11. Useful Aliases & Shortcuts

Add these to your ~/.bashrc for quick access to common troubleshooting commands:

# Monitor system temperatures in real-time
alias temps="watch -n 2 'for i in /sys/class/hwmon/hwmon*/; do echo -n \"\$(cat \${i}name): \"; cat \${i}temp*_input 2>/dev/null | while read temp; do echo \"scale=1; \$temp/1000\" | bc; done | tr \"\n\" \" \"; echo \"°C\"; done'"
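To run the watch command directly in the shell instead, drop the outer double quotes and also remove the backslashes before the inner quotes and dollar signs. The command uses bc; install it with sudo apt install bc if it is missing.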
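
The quoting in the temps alias is hard to maintain. A shell function avoids the nested escapes and does the division in awk. This is a minimal sketch, not the author's method: hwmon_temps is a hypothetical name, and the optional directory argument exists only so the function can be pointed at a different sysfs-style tree.

```shell
# Print every temperature input under a hwmon tree in degrees Celsius.
# $1 = base directory (default /sys/class/hwmon)
hwmon_temps() {
    local base="${1:-/sys/class/hwmon}" dir file name
    for dir in "$base"/hwmon*/; do
        [ -r "${dir}name" ] || continue
        name=$(cat "${dir}name")
        for file in "${dir}"temp*_input; do
            [ -r "$file" ] || continue
            # sysfs reports millidegrees, so divide by 1000
            awk -v n="$name" -v f="${file##*/}" \
                '{ printf "%s %s: %.1f°C\n", n, f, $1 / 1000 }' "$file" 2>/dev/null
        done
    done
}
```

Because watch cannot call a shell function, a simple loop gives the same live view: while sleep 2; do clear; hwmon_temps; done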

# Quick system status
alias sysstat='echo "=== CPU ===" && uptime && echo -e "\n=== Memory ===" && free -h && echo -e "\n=== Disk ===" && df -h && echo -e "\n=== Top Processes ===" && ps aux --sort=-%mem | head -10'
(The layout is messy, but I’m terrible with awk. Feel free to improve it for me.)


# View last boot logs
alias lastboot='journalctl -b -1'

# Check failed services
alias failedservices='systemctl --failed'

# Monitor logs in real-time
alias watchlog='sudo journalctl -f'

# Quick network status (named netstatus so it does not shadow the netstat command from net-tools)
alias netstatus='ip addr show && echo -e "\n=== Routes ===" && ip route show'

After adding these, run:

source ~/.bashrc

12. Tips for Effective Troubleshooting

  1. First check logs: journalctl and dmesg are your best friends
  2. Work systematically: Change one thing at a time
  3. Document changes: Keep notes on what you've tried
  4. Search for error messages: Copy exact error messages into a search engine or an AI assistant
  5. Check recent changes: What did you do just before the problem started? Did you install something, update packages, or change the kernel?
  6. Make backups: Before major changes, backup important data
  7. Use verbose mode: Add -v or -vv flags to commands for more detail
  8. Check forums: Debian forum, Reddit, Stack Exchange, and mailing lists

Remember: If in doubt, search for the specific error message together with "Debian" and the version number, e.g. Debian 13, or the point release if needed, e.g. Debian 13.1.

This work was contributed by distro-nix on Debian User Forums on 2025-10-27 23:38:22

I welcome comments, suggestions or resources : dev@divsmart.com .
