# Check Failed Systemd Units

You are helping the user identify and diagnose failed systemd units (services, mounts, timers, etc.).

## Task

1. **List all failed units:**
   ```bash
   # Show failed units
   systemctl --failed

   # Same list without truncating long unit names or descriptions
   systemctl --failed --full --no-pager

   # Include user units
   systemctl --user --failed
   ```

2. **Get detailed status of failed units:**
   ```bash
   # For each failed unit, get details (--plain drops the leading "●" marker
   # so that the unit name is always the first field)
   for unit in $(systemctl --failed --no-legend --plain | awk '{print $1}'); do
     echo "=== $unit ==="
     systemctl status "$unit" --no-pager -l
     echo ""
   done
   ```

3. **Check recent failures:**
   ```bash
   # Units currently in the failed state
   systemctl list-units --state=failed --no-pager

   # Check the current boot's log for error-level messages mentioning failures
   journalctl -b -p err --no-pager | grep -i "failed"
   ```

4. **Analyze specific failed unit:**
   ```bash
   # Status with full output
   systemctl status UNIT_NAME -l --no-pager

   # Recent logs for the unit
   journalctl -u UNIT_NAME -n 50 --no-pager

   # Logs from current boot
   journalctl -b -u UNIT_NAME --no-pager

   # Logs for the unit from the last 24 hours
   journalctl -u UNIT_NAME --since "24 hours ago" --no-pager
   ```

5. **Check unit dependencies:**
   ```bash
   # What this unit depends on
   systemctl list-dependencies UNIT_NAME

   # What depends on this unit
   systemctl list-dependencies --reverse UNIT_NAME

   # Check if dependencies failed (--plain prints names without tree glyphs)
   systemctl list-dependencies --plain --all UNIT_NAME | while read -r dep; do
     systemctl is-failed --quiet "$dep" && echo "FAILED: $dep"
   done
   ```

6. **Common failure patterns:**
   ```bash
   # Mount failures
   systemctl --failed --type=mount

   # Service failures
   systemctl --failed --type=service

   # Timer failures
   systemctl --failed --type=timer

   # Network-related failures (case-insensitive name match)
   systemctl --failed | grep -Ei "network|dhcp|dns"
   ```

7. **Attempt to diagnose failure reason:**
   ```bash
   # Exit code, exit type, and overall result of the main process
   systemctl show UNIT_NAME -p ExecMainStatus,ExecMainCode,Result

   # Unit file location and settings
   systemctl cat UNIT_NAME

   # Check if unit file exists and is valid
   systemctl show UNIT_NAME -p LoadState,ActiveState,SubState,Result
   ```

8. **Try to restart failed units:**
   ```bash
   # Confirm with the user before restarting anything
   # List failed units (--plain keeps the unit name in the first field)
   failed_units=$(systemctl --failed --no-legend --plain | awk '{print $1}')

   # For each unit, ask to restart
   for unit in $failed_units; do
     echo "Attempting to restart: $unit"
     sudo systemctl restart "$unit"
     systemctl is-active --quiet "$unit" && echo "✓ $unit restarted successfully" || echo "✗ $unit restart failed"
   done
   ```

9. **Check for masked units:**
   ```bash
   # List masked units
   systemctl list-unit-files --state=masked

   # Check if failed unit is masked (prints "masked" if so)
   systemctl is-enabled UNIT_NAME
   ```

10. **Generate failure report:**
    ```bash
    cat > /tmp/failed-units-report.txt << EOF
    Failed Units Report - $(date)
    ======================================

    Failed Units Summary:
    $(systemctl --failed --no-pager)

    Detailed Status:
    EOF

    for unit in $(systemctl --failed --no-legend --plain | awk '{print $1}'); do
      echo "" >> /tmp/failed-units-report.txt
      echo "=== $unit ===" >> /tmp/failed-units-report.txt
      systemctl status "$unit" --no-pager -l >> /tmp/failed-units-report.txt 2>&1
      echo "" >> /tmp/failed-units-report.txt
      echo "Recent Logs:" >> /tmp/failed-units-report.txt
      journalctl -u "$unit" -n 20 --no-pager >> /tmp/failed-units-report.txt 2>&1
      echo "" >> /tmp/failed-units-report.txt
    done

    cat /tmp/failed-units-report.txt
    ```

## Present Summary to User

Provide:
- Number of failed units
- List of failed unit names and types
- Failure reasons (exit codes, signals)
- Recent log entries for each
- Recommended actions
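
The first two items of the summary (count and unit types) can be derived mechanically. The following is a minimal sketch, assuming the input is the output of `systemctl --failed --no-legend --plain`; `summarize_failed` is a hypothetical helper name, not a systemd command.

```bash
# summarize_failed (hypothetical helper): reads unit lines on stdin, as printed by
# `systemctl --failed --no-legend --plain`, and prints a count per unit type.
summarize_failed() {
  local units
  # The unit name is the first whitespace-separated field; skip empty lines.
  units=$(awk 'NF {print $1}')
  if [ -z "$units" ]; then
    echo "Total failed units: 0"
    return 0
  fi
  echo "Total failed units: $(printf '%s\n' "$units" | wc -l | tr -d ' ')"
  # The unit type is the suffix after the last dot (service, mount, timer, ...).
  printf '%s\n' "$units" | awk -F. '{print $NF}' | sort | uniq -c |
    awk '{printf "  %s: %d\n", $2, $1}'
}
```

On a live system this would be called as `systemctl --failed --no-legend --plain | summarize_failed`. Reading from stdin keeps the helper usable on saved output as well.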

## Common Failed Units & Solutions

**NetworkManager-wait-online.service:**
- Usually safe to ignore or disable if not needed
- `sudo systemctl disable NetworkManager-wait-online.service`

**ModemManager.service:**
- May fail if no modem hardware present
- Can disable: `sudo systemctl disable ModemManager.service`

**bluetooth.service:**
- Check firmware: `journalctl -u bluetooth | grep -i firmware`
- Restart: `sudo systemctl restart bluetooth`

**systemd-resolved.service:**
- Check config: `/etc/systemd/resolved.conf`
- DNS issues: `resolvectl status`

**Mount units (*.mount):**
- Check fstab: `cat /etc/fstab`
- Verify device exists: `lsblk`
- Check mount point permissions

**User services:**
- Check user journal: `journalctl --user -u UNIT_NAME`
- If the service must run without an active login session, enable lingering: `loginctl enable-linger USER`
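
The system-unit cases above can be turned into a simple lookup when recommending actions. This is only a sketch: `suggest_action` is a hypothetical helper, and the hints are condensed from the list above rather than taken from systemd itself.

```bash
# suggest_action (hypothetical helper): maps a unit name to a first check,
# using only the hints listed in this section.
suggest_action() {
  case "$1" in
    NetworkManager-wait-online.service)
      echo "usually safe to ignore, or disable if not needed" ;;
    ModemManager.service)
      echo "disable if no modem hardware is present" ;;
    bluetooth.service)
      echo "check firmware messages: journalctl -u bluetooth | grep -i firmware" ;;
    systemd-resolved.service)
      echo "review /etc/systemd/resolved.conf and run: resolvectl status" ;;
    *.mount)
      echo "check /etc/fstab, lsblk, and mount point permissions" ;;
    *)
      # Fallback for units not covered above: look at recent logs.
      echo "inspect logs: journalctl -u $1 -n 50 --no-pager" ;;
  esac
}
```

For example, `suggest_action home.mount` prints the mount-unit hint, while an unlisted unit falls through to the log-inspection default.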

## Cleanup Actions

```bash
# Reset failed state
sudo systemctl reset-failed

# Disable permanently failed units (ask first!)
sudo systemctl disable UNIT_NAME

# Mask unit to prevent activation
sudo systemctl mask UNIT_NAME

# Unmask unit
sudo systemctl unmask UNIT_NAME

# Reload systemd configuration
sudo systemctl daemon-reload
```

## Notes

- Not all failures are critical - some are expected
- Check if service is actually needed before disabling
- Some failures may be due to hardware not present (modems, bluetooth)
- Mount failures can prevent boot - be careful with fstab changes
- User units are separate from system units
- Use `systemctl reset-failed` to clear failed state after fixing
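
Because not every failure is critical, the `Result` property can help with triage. Below is a minimal sketch, assuming the input is the `KEY=VALUE` output of `systemctl show UNIT_NAME -p Result,ExecMainStatus`; `explain_result` is a hypothetical helper, and it covers only a few common `Result` values.

```bash
# explain_result (hypothetical helper): reads KEY=VALUE lines on stdin and
# prints a one-line explanation of why the unit is in the failed state.
explain_result() {
  local result="" status="" key value
  while IFS='=' read -r key value; do
    case "$key" in
      Result)         result=$value ;;
      ExecMainStatus) status=$value ;;
    esac
  done
  case "$result" in
    success)         echo "no failure recorded" ;;
    exit-code)       echo "main process exited with status $status" ;;
    signal)          echo "main process was killed by signal $status" ;;
    core-dump)       echo "main process dumped core (signal $status)" ;;
    timeout)         echo "an operation timed out" ;;
    start-limit-hit) echo "restarted too often; start rate limit hit" ;;
    *)               echo "result: ${result:-unknown}" ;;  # values not handled here
  esac
}
```

A typical call would be `systemctl show UNIT_NAME -p Result,ExecMainStatus | explain_result`. The order of the two properties in the input does not matter.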