# Check Failed Systemd Units

You are helping the user identify and diagnose failed systemd units (services, mounts, timers, etc.).

## Task
1. **List all failed units:**
```bash
# Show failed system units
systemctl --failed
# Same list without truncated unit names or descriptions
systemctl --failed --full --no-pager
# Show failed units in the current user's service manager
systemctl --user --failed
```
2. **Get detailed status of failed units:**
```bash
# For each failed unit, print its full status.
# --plain drops the leading "●" marker so the unit name is the first field.
for unit in $(systemctl --failed --no-legend --plain | awk '{print $1}'); do
    echo "=== $unit ==="
    systemctl status "$unit" --no-pager -l
    echo ""
done
```
3. **Check recent failures:**
```bash
# Units currently in the failed state
systemctl list-units --state=failed
# Error-level messages from the current boot that mention a failure
journalctl -b -p err --no-pager | grep -i "failed"
```
4. **Analyze a specific failed unit:**
```bash
# Status with full output
systemctl status UNIT_NAME -l --no-pager
# Most recent 50 log lines for the unit
journalctl -u UNIT_NAME -n 50 --no-pager
# Logs from the current boot
journalctl -b -u UNIT_NAME --no-pager
# Logs from the last 24 hours
journalctl -u UNIT_NAME --since "24 hours ago" --no-pager
```
5. **Check unit dependencies:**
```bash
# What this unit depends on
systemctl list-dependencies UNIT_NAME
# What depends on this unit
systemctl list-dependencies --reverse UNIT_NAME
# Check whether any dependency has failed.
# --plain prints bare unit names instead of a tree with "●" markers.
systemctl list-dependencies UNIT_NAME --all --plain | while read -r dep; do
    systemctl is-failed --quiet "$dep" && echo "FAILED: $dep"
done
```
6. **Common failure patterns:**
```bash
# Mount failures
systemctl --failed --type=mount
# Service failures
systemctl --failed --type=service
# Timer failures
systemctl --failed --type=timer
# Network-related failures (matched by name, case-insensitive)
systemctl --failed | grep -iE "network|dhcp|dns"
```
7. **Attempt to diagnose the failure reason:**
```bash
# Overall result, plus exit code or signal of the main process
systemctl show UNIT_NAME -p Result,ExecMainCode,ExecMainStatus
# Unit file location and contents, including drop-ins
systemctl cat UNIT_NAME
# Check whether the unit file was loaded and what state the unit is in
systemctl show UNIT_NAME -p LoadState,ActiveState,SubState,Result
```
8. **Try to restart failed units:**
```bash
# Ask the user before restarting anything.
# Collect failed unit names (--plain drops the leading "●" marker).
failed_units=$(systemctl --failed --no-legend --plain | awk '{print $1}')
# For each unit, ask for confirmation, then restart and verify.
for unit in $failed_units; do
    read -r -p "Restart $unit? [y/N] " answer
    [ "$answer" = "y" ] || continue
    echo "Attempting to restart: $unit"
    sudo systemctl restart "$unit"
    if systemctl is-active --quiet "$unit"; then
        echo "✓ $unit restarted successfully"
    else
        echo "✗ $unit restart failed"
    fi
done
```
9. **Check for masked units:**
```bash
# List masked unit files
systemctl list-unit-files --state=masked
# Check if the failed unit is masked (prints "masked" if so)
systemctl is-enabled UNIT_NAME
```
10. **Generate a failure report:**
```bash
report=/tmp/failed-units-report.txt
cat > "$report" << EOF
Failed Units Report - $(date)
======================================
Failed Units Summary:
$(systemctl --failed --no-pager)
Detailed Status:
EOF
# --plain drops the leading "●" marker so the unit name is the first field.
for unit in $(systemctl --failed --no-legend --plain | awk '{print $1}'); do
    {
        echo ""
        echo "=== $unit ==="
        systemctl status "$unit" --no-pager -l
        echo ""
        echo "Recent Logs:"
        journalctl -u "$unit" -n 20 --no-pager
        echo ""
    } >> "$report" 2>&1
done
cat "$report"
```
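
The `Result`, `ExecMainCode`, and `ExecMainStatus` properties read in step 7 are terse. The following is a minimal sketch of turning them into one readable line. `describe_failure` is a hypothetical helper, not part of systemd, and the mapping of `ExecMainCode` values 1, 2, and 3 to exited, killed, and core-dumped assumes that `systemctl show` prints the numeric `CLD_*` codes, which is worth verifying on the target system.

```shell
# describe_failure (hypothetical helper): read KEY=VALUE lines on stdin, in the
# form printed by `systemctl show UNIT -p ...`, and print a one-line explanation.
describe_failure() {
    local key value result="" code="" status=""
    while IFS='=' read -r key value; do
        case "$key" in
            Result)         result=$value ;;
            ExecMainCode)   code=$value ;;
            ExecMainStatus) status=$value ;;
        esac
    done
    # Assumption: ExecMainCode is a numeric CLD_* value
    # (1 = exited, 2 = killed by signal, 3 = killed and dumped core).
    case "$code" in
        1) echo "result=$result: main process exited with status $status" ;;
        2) echo "result=$result: main process killed by signal $status" ;;
        3) echo "result=$result: main process dumped core on signal $status" ;;
        *) echo "result=$result: no exit information for the main process" ;;
    esac
}
```

It could be used as `systemctl show UNIT_NAME -p Result,ExecMainCode,ExecMainStatus | describe_failure`. Units without a main process, such as mounts, fall through to the last branch.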
## Present Summary to User

Provide:
- Number of failed units
- List of failed unit names and types
- Failure reasons (exit codes, signals)
- Recent log entries for each
- Recommended actions
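
The first two items of the summary, the count and the unit types, can be derived from the unit list alone. A rough sketch, assuming the input is the output of `systemctl --failed --no-legend` (with or without `--plain`); `summarize_failed` is a hypothetical name introduced here for illustration.

```shell
# summarize_failed (hypothetical helper): read `systemctl --failed --no-legend`
# output on stdin and print the total plus a count per unit type.
summarize_failed() {
    awk '
        {
            # Some systemd versions put a bullet marker in the first column;
            # a unit name always contains a dot, the marker does not.
            unit = ($1 ~ /\./) ? $1 : $2
            if (unit == "") next
            n = split(unit, parts, ".")
            count[parts[n]]++          # the unit type is the last suffix
            total++
        }
        END {
            printf "failed units: %d\n", total
            for (t in count) printf "  %s: %d\n", t, count[t]
        }
    ' | {
        # Keep the total on top and sort the per-type lines for stable output.
        IFS= read -r header
        printf '%s\n' "$header"
        LC_ALL=C sort
    }
}
```

A typical call would be `systemctl --failed --no-legend | summarize_failed`. Failure reasons and log excerpts still have to be collected per unit.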
## Common Failed Units & Solutions

**NetworkManager-wait-online.service:**
- Usually safe to ignore, or to disable if nothing needs to wait for the network at boot
- `sudo systemctl disable NetworkManager-wait-online.service`

**ModemManager.service:**
- May fail if no modem hardware is present
- Can be disabled: `sudo systemctl disable ModemManager.service`

**bluetooth.service:**
- Check firmware: `journalctl -u bluetooth | grep -i firmware`
- Restart: `sudo systemctl restart bluetooth`

**systemd-resolved.service:**
- Check config: `/etc/systemd/resolved.conf`
- DNS issues: `resolvectl status`

**Mount units (`*.mount`):**
- Check fstab: `cat /etc/fstab`
- Verify the device exists: `lsblk`
- Check mount point permissions

**User services:**
- Check the user journal: `journalctl --user -u UNIT_NAME`
- May need `loginctl enable-linger USER` so the user's services keep running without an active login session
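
The list above amounts to a lookup from unit name to a first troubleshooting hint. A small sketch of that lookup follows; `suggest_action` is a hypothetical helper and its hints only restate the entries above, so it is a starting point rather than a diagnosis.

```shell
# suggest_action UNIT (hypothetical helper): print a first troubleshooting
# hint for a failed unit, based on the common cases listed above.
suggest_action() {
    case "$1" in
        NetworkManager-wait-online.service)
            echo "often safe to ignore; disable if nothing needs to wait for the network" ;;
        ModemManager.service)
            echo "may fail without modem hardware; consider disabling" ;;
        bluetooth.service)
            echo "check firmware messages: journalctl -u bluetooth | grep -i firmware" ;;
        systemd-resolved.service)
            echo "check /etc/systemd/resolved.conf and resolvectl status" ;;
        *.mount)
            echo "check /etc/fstab, lsblk, and the mount point" ;;
        *)
            # Fallback for units not covered by the list.
            echo "inspect logs: journalctl -u $1 -n 50 --no-pager" ;;
    esac
}
```

For example, `suggest_action mnt-data.mount` prints the fstab hint, and any unit not covered falls back to a journal command.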
## Cleanup Actions

```bash
# Clear the failed state of all units
sudo systemctl reset-failed
# Disable a unit that keeps failing (ask first!)
sudo systemctl disable UNIT_NAME
# Mask a unit to prevent any activation, including manual starts
sudo systemctl mask UNIT_NAME
# Unmask a unit
sudo systemctl unmask UNIT_NAME
# Reload unit files after editing them
sudo systemctl daemon-reload
```
## Notes

- Not all failures are critical; some are expected
- Check whether a service is actually needed before disabling it
- Some failures are caused by hardware that is not present (modems, Bluetooth)
- Mount failures can prevent boot, so be careful with fstab changes
- User units are managed separately from system units (`systemctl --user`)
- Use `systemctl reset-failed` to clear the failed state after fixing a unit