meccatronis committed on
Commit d01b180 · verified · 1 Parent(s): bbf822e

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +299 -0
README.md ADDED
@@ -0,0 +1,299 @@
# GPU Monitoring and Fan Control System

A GPU monitoring and fan control system for AMD GPUs, with real-time monitoring, advanced fan control, a web interface, and system integration.

## Features

### 🖥️ Desktop Monitoring
- **Real-time GPU monitoring** with a PyQt5 desktop overlay
- **System tray integration** for minimal-footprint monitoring
- **Configurable display modes** (overlay, tray, full dashboard)
- **Multiple GPU support** with automatic detection

### 🌡️ Advanced Fan Control
- **Multiple temperature curves** (Silent, Balanced, Performance, Custom)
- **Profile-based control** with hotkey switching
- **Safety limits** and automatic fallback modes
- **Manual override** capabilities

### 🌐 Web Interface
- **Remote monitoring** via web browser
- **Real-time charts** and historical data
- **Mobile-responsive** design
- **API endpoints** for integration

### 📊 Data Logging & Analysis
- **Historical data storage** with SQLite
- **Performance analytics** and trend analysis
- **Export capabilities** (CSV, JSON)
- **Alert system** for temperature thresholds

### 🔧 System Integration
- **Systemd service** for automatic startup
- **Configuration management** with JSON profiles
- **Autostart integration** for desktop environments
- **Permission handling** and error recovery

## Installation

### Prerequisites
```bash
# Install required packages
sudo apt update
sudo apt install python3 python3-pip python3-venv
sudo apt install python3-pyqt5 python3-pyqt5.qtopengl
sudo apt install python3-matplotlib python3-flask
```

### Setup
```bash
# Create project directory
mkdir -p ~/gpu_monitoring_system
cd ~/gpu_monitoring_system

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install Python dependencies (requirements.txt must be present in this directory)
pip install -r requirements.txt

# Run setup script
./setup.sh
```

## Usage

### Desktop Monitor
```bash
# Start desktop monitoring overlay
python3 gpu_monitor_desktop.py

# Start with system tray
python3 gpu_monitor_desktop.py --tray

# Start full dashboard
python3 gpu_monitor_desktop.py --dashboard
```

### Fan Control
```bash
# Start fan control with default profile
python3 gpu_fan_controller.py

# Start with specific profile
python3 gpu_fan_controller.py --profile performance

# Start with custom configuration
python3 gpu_fan_controller.py --config custom.json
```

### Web Interface
```bash
# Start web server
python3 web_interface.py

# Access at http://localhost:5000
```

### System Service
```bash
# Install as system service
sudo ./install_service.sh

# Start service
sudo systemctl start gpu-monitoring

# Enable auto-start
sudo systemctl enable gpu-monitoring
```

## Configuration

### Fan Control Profiles
Create custom fan control profiles in `config/fan_profiles.json`:

```json
{
  "silent": {
    "name": "Silent",
    "description": "Quiet operation with lower fan speeds",
    "curve": {
      "min_temp": 40,
      "max_temp": 65,
      "min_pwm": 120,
      "max_pwm": 220
    },
    "safety": {
      "emergency_temp": 85,
      "emergency_pwm": 255
    }
  },
  "performance": {
    "name": "Performance",
    "description": "Maximum cooling for high performance",
    "curve": {
      "min_temp": 35,
      "max_temp": 55,
      "min_pwm": 180,
      "max_pwm": 255
    },
    "safety": {
      "emergency_temp": 80,
      "emergency_pwm": 255
    }
  }
}
```

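As a rough illustration of how a profile like the ones above could be turned into a fan setting, the sketch below linearly interpolates the PWM value between `min_temp` and `max_temp` and jumps to `emergency_pwm` at or above `emergency_temp`. The linear ramp and the function name `pwm_for_temp` are assumptions made for illustration; the logic in `gpu_fan_controller.py` may differ.

```python
def pwm_for_temp(profile: dict, temp_c: float) -> int:
    """Map a temperature in Celsius to a PWM value for one profile.

    Assumes a linear ramp between the curve's end points. This is an
    illustrative guess, not necessarily what gpu_fan_controller.py does.
    """
    curve, safety = profile["curve"], profile["safety"]

    # The safety limit takes priority over the curve.
    if temp_c >= safety["emergency_temp"]:
        return int(safety["emergency_pwm"])

    # Clamp below and above the curve's temperature range.
    if temp_c <= curve["min_temp"]:
        return int(curve["min_pwm"])
    if temp_c >= curve["max_temp"]:
        return int(curve["max_pwm"])

    # Linear interpolation inside the range.
    span = curve["max_temp"] - curve["min_temp"]
    fraction = (temp_c - curve["min_temp"]) / span
    pwm = curve["min_pwm"] + fraction * (curve["max_pwm"] - curve["min_pwm"])
    return round(pwm)


# The "silent" profile from the example configuration.
silent = {
    "curve": {"min_temp": 40, "max_temp": 65, "min_pwm": 120, "max_pwm": 220},
    "safety": {"emergency_temp": 85, "emergency_pwm": 255},
}
print(pwm_for_temp(silent, 52.5))  # halfway up the ramp
```

Under these assumptions the fan holds `min_pwm` below the curve, holds `max_pwm` between `max_temp` and `emergency_temp`, and switches to `emergency_pwm` beyond that.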
### Monitoring Configuration
Configure monitoring settings in `config/monitoring.json`:

```json
{
  "update_interval": 1.0,
  "display_mode": "overlay",
  "show_gpu_load": true,
  "show_temperature": true,
  "show_fan_speed": true,
  "show_power": true,
  "show_vram": true,
  "alerts": {
    "enabled": true,
    "temp_warning": 75,
    "temp_critical": 85,
    "power_warning": 200
  }
}
```

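The `alerts` block can be read as a set of thresholds that each reading is compared against. The sketch below shows one plausible interpretation, assuming temperatures are in Celsius, power is in watts, and a threshold triggers when the reading reaches or exceeds it. The function name `check_alerts` is hypothetical and not taken from the project.

```python
from typing import List


def check_alerts(alerts: dict, temp_c: float, power_w: float) -> List[str]:
    """Return the names of the thresholds that a reading has reached.

    Assumes inclusive thresholds; the project's own alert logic may differ.
    """
    if not alerts.get("enabled", False):
        return []

    triggered = []
    # Report only the most severe temperature level.
    if temp_c >= alerts["temp_critical"]:
        triggered.append("temp_critical")
    elif temp_c >= alerts["temp_warning"]:
        triggered.append("temp_warning")

    if power_w >= alerts["power_warning"]:
        triggered.append("power_warning")
    return triggered


# Values from the example monitoring configuration.
alerts = {
    "enabled": True,
    "temp_warning": 75,
    "temp_critical": 85,
    "power_warning": 200,
}
print(check_alerts(alerts, temp_c=78, power_w=150))
```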
## Monitoring Data

The system collects and stores the following data:

### GPU Metrics
- **Temperature**: Core temperature in Celsius
- **Load**: GPU utilization percentage
- **Fan Speed**: RPM and PWM percentage
- **Power**: Current power draw in watts
- **VRAM**: Used and total memory
- **Clocks**: Core and memory clock speeds

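On Linux, the amdgpu driver exposes several of these metrics as plain-text files in a hwmon directory, using raw units (millidegrees Celsius, a 0-255 PWM duty value, microwatts). The sketch below shows how such files could be read and converted to the units listed above. The function names are hypothetical, the available files vary between GPUs and kernel versions, and this is not necessarily how the project reads its data.

```python
from pathlib import Path
from typing import Optional


def read_int(path: Path) -> Optional[int]:
    """Return the integer stored in a sysfs file, or None if unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def read_gpu_metrics(hwmon_dir: Path) -> dict:
    """Read a few hwmon values and convert them to friendlier units."""
    temp = read_int(hwmon_dir / "temp1_input")      # millidegrees Celsius
    rpm = read_int(hwmon_dir / "fan1_input")        # revolutions per minute
    pwm = read_int(hwmon_dir / "pwm1")              # duty value, 0-255
    power = read_int(hwmon_dir / "power1_average")  # microwatts; some GPUs lack this file
    return {
        "temperature_c": None if temp is None else temp / 1000,
        "fan_rpm": rpm,
        "fan_pwm_percent": None if pwm is None else round(pwm / 255 * 100, 1),
        "power_w": None if power is None else power / 1_000_000,
    }


if __name__ == "__main__":
    # card0 is only an example; the card and hwmon indices differ between systems.
    for hwmon in Path("/sys/class/drm/card0/device/hwmon").glob("hwmon*"):
        print(hwmon.name, read_gpu_metrics(hwmon))
```

Missing or unreadable files are reported as `None` rather than raising, so the same code runs on GPUs that expose only some of the sensors.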
### System Metrics
- **CPU Usage**: Overall system load
- **Memory Usage**: System RAM utilization
- **Disk Usage**: Storage space monitoring
- **Network**: Bandwidth usage

## Web Interface Features

### Dashboard
- Real-time metric display
- Historical charts with configurable time ranges
- System health overview
- Alert status and history

### Charts
- Temperature trends over time
- Fan speed and PWM curves
- Power consumption patterns
- GPU utilization history

### Configuration
- Fan profile management
- Alert threshold configuration
- Display settings
- Data export options

## API Endpoints

### Monitoring Data
- `GET /api/status` - Current GPU status
- `GET /api/history` - Historical data
- `GET /api/metrics` - All available metrics

### Fan Control
- `POST /api/fan/profile` - Set fan profile
- `POST /api/fan/manual` - Manual fan control
- `GET /api/fan/status` - Current fan status

### System
- `GET /api/system` - System information
- `POST /api/alerts` - Configure alerts
- `GET /api/logs` - System logs

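The `GET` endpoints listed above can be queried from any HTTP client. A minimal sketch using only the Python standard library follows. The helper name `get_json` is hypothetical, and the code assumes the endpoints return JSON, which the list above does not state.

```python
import json
from urllib.request import urlopen


def get_json(base_url: str, path: str, timeout: float = 5.0) -> dict:
    """GET an endpoint and decode the body as JSON (assumed response format)."""
    # Avoid a double slash when the base URL ends with "/".
    url = base_url.rstrip("/") + path
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With `web_interface.py` running locally, `get_json("http://localhost:5000", "/api/status")` would fetch the current GPU status. The request bodies expected by the `POST` endpoints are not documented here, so they are left out of the sketch.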
## Troubleshooting

### Permission Issues
If you encounter permission errors with GPU monitoring:

```bash
# Check GPU permissions
ls -la /sys/class/drm/card*/device/hwmon/

# Add user to video group (log out and back in for the change to take effect)
sudo usermod -a -G video "$USER"

# Or run with sudo for fan control
sudo python3 gpu_fan_controller.py
```

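The same check can be scripted. The sketch below reports whether the current user can write the hwmon files that manual fan control typically relies on, assuming the amdgpu names `pwm1` and `pwm1_enable`. The function name `fan_control_access` is hypothetical.

```python
import os
from pathlib import Path
from typing import Dict


def fan_control_access(hwmon_dir: Path) -> Dict[str, bool]:
    """Report whether the current user may write the fan control files.

    A missing file is reported as False, the same as a read-only one.
    """
    return {
        name: os.access(hwmon_dir / name, os.W_OK)
        for name in ("pwm1", "pwm1_enable")
    }
```

If either entry is `False` for the relevant hwmon directory, fan control will likely need elevated privileges, as in the `sudo` example above.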
### Missing Dependencies
```bash
# Install missing PyQt5
pip install PyQt5 PyQt5-sip

# Install missing matplotlib
pip install matplotlib

# Install missing Flask
pip install flask
```

### Service Issues
```bash
# Check service status
sudo systemctl status gpu-monitoring

# View service logs
sudo journalctl -u gpu-monitoring -f

# Restart service
sudo systemctl restart gpu-monitoring
```

## Development

### Adding New GPU Support
1. Update `gpu_detector.py` with new GPU detection logic
2. Add temperature sensor paths for new GPU models
3. Test with `python3 test_gpu_detection.py`

### Custom Fan Curves
1. Create new profile in `config/fan_profiles.json`
2. Test with `python3 gpu_fan_controller.py --profile new_profile`
3. Validate temperature response and stability

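Before testing a new profile on real hardware, it can help to sanity-check its values. The sketch below validates the fields used in the Fan Control Profiles example. The function name `validate_profile` and the specific rules (PWM within 0-255, emergency temperature above the curve) are assumptions chosen for illustration, not checks the project is known to perform.

```python
from typing import List


def validate_profile(profile: dict) -> List[str]:
    """Return a list of problems found in one fan profile (empty if none)."""
    problems = []
    curve = profile.get("curve", {})
    safety = profile.get("safety", {})

    # Every field used by the example profiles must be present.
    for key in ("min_temp", "max_temp", "min_pwm", "max_pwm"):
        if key not in curve:
            problems.append(f"curve.{key} is missing")
    for key in ("emergency_temp", "emergency_pwm"):
        if key not in safety:
            problems.append(f"safety.{key} is missing")
    if problems:
        return problems

    # Range and ordering checks (assumed rules).
    if curve["min_temp"] >= curve["max_temp"]:
        problems.append("curve.min_temp must be below curve.max_temp")
    if curve["min_pwm"] > curve["max_pwm"]:
        problems.append("curve.min_pwm must not exceed curve.max_pwm")
    for key in ("min_pwm", "max_pwm"):
        if not 0 <= curve[key] <= 255:
            problems.append(f"curve.{key} must be within 0-255")
    if not 0 <= safety["emergency_pwm"] <= 255:
        problems.append("safety.emergency_pwm must be within 0-255")
    if safety["emergency_temp"] <= curve["max_temp"]:
        problems.append("safety.emergency_temp should exceed curve.max_temp")
    return problems


new_profile = {
    "curve": {"min_temp": 45, "max_temp": 70, "min_pwm": 100, "max_pwm": 240},
    "safety": {"emergency_temp": 85, "emergency_pwm": 255},
}
print(validate_profile(new_profile) or "profile looks consistent")
```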
### Web Interface Extensions
1. Add new routes in `web_interface.py`
2. Create templates in `templates/` directory
3. Add static assets in `static/` directory

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests for your changes
5. Submit a pull request

## Support

For support and questions:
- Create an issue on GitHub
- Check the troubleshooting section
- Review the configuration examples