| # GPU Monitoring and Fan Control System |
|
|
| A GPU monitoring and fan control system for AMD GPUs on Linux, with a real-time desktop overlay, profile-based fan curves, a web interface, and systemd integration. |
|
|
| ## Features |
|
|
| ### 🖥️ Desktop Monitoring |
| - **Real-time GPU monitoring** with PyQt5 desktop overlay |
| - **System tray integration** for low-overhead monitoring |
| - **Configurable display modes** (overlay, tray, full dashboard) |
| - **Multiple GPU support** with automatic detection |
|
|
| ### 🌡️ Advanced Fan Control |
| - **Multiple fan curves** (Silent, Balanced, Performance, Custom) that map temperature to fan speed |
| - **Profile-based control** with hotkey switching |
| - **Safety limits** and automatic fallback modes |
| - **Manual override** capabilities |
|
|
| ### 🌐 Web Interface |
| - **Remote monitoring** via web browser |
| - **Real-time charts** and historical data |
| - **Mobile-responsive** design |
| - **API endpoints** for integration |
|
|
| ### 📊 Data Logging & Analysis |
| - **Historical data storage** with SQLite |
| - **Performance analytics** and trend analysis |
| - **Export capabilities** (CSV, JSON) |
| - **Alert system** for temperature thresholds |
|
|
| ### 🔧 System Integration |
| - **Systemd service** for automatic startup |
| - **Configuration management** with JSON profiles |
| - **Autostart integration** for desktop environments |
| - **Permission handling** and error recovery |
|
|
| ## Installation |
|
|
| ### Prerequisites |
| ```bash |
| # Install required packages |
| sudo apt update |
| sudo apt install python3 python3-pip python3-venv |
| sudo apt install python3-pyqt5 python3-pyqt5.qtopengl |
| sudo apt install python3-matplotlib python3-flask |
| ``` |
|
|
| ### Setup |
| ```bash |
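| # Create project directory and place the project files (including requirements.txt and setup.sh) in it |
| mkdir -p ~/gpu_monitoring_system |
| cd ~/gpu_monitoring_system |
|
| # Create virtual environment |
| # (add --system-site-packages to reuse the apt-installed PyQt5, matplotlib, and Flask) |
| python3 -m venv venv |
| source venv/bin/activate |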
| |
| # Install Python dependencies |
| pip install -r requirements.txt |
| |
| # Run setup script |
| ./setup.sh |
| ``` |
|
|
| ## Usage |
|
|
| ### Desktop Monitor |
| ```bash |
| # Start desktop monitoring overlay |
| python3 gpu_monitor_desktop.py |
| |
| # Start with system tray |
| python3 gpu_monitor_desktop.py --tray |
| |
| # Start full dashboard |
| python3 gpu_monitor_desktop.py --dashboard |
| ``` |
|
|
| ### Fan Control |
| ```bash |
| # Start fan control with default profile |
| python3 gpu_fan_controller.py |
| |
| # Start with specific profile |
| python3 gpu_fan_controller.py --profile performance |
| |
| # Start with custom configuration |
| python3 gpu_fan_controller.py --config custom.json |
| ``` |
|
|
| ### Web Interface |
| ```bash |
| # Start web server |
| python3 web_interface.py |
| |
| # Access at http://localhost:5000 |
| ``` |
|
|
| ### System Service |
| ```bash |
| # Install as system service |
| sudo ./install_service.sh |
| |
| # Start service |
| sudo systemctl start gpu-monitoring |
| |
| # Enable auto-start |
| sudo systemctl enable gpu-monitoring |
| ``` |
|
|
| ## Configuration |
|
|
| ### Fan Control Profiles |
| Create custom fan control profiles in `config/fan_profiles.json`: |
|
|
| ```json |
| { |
|   "silent": { |
|     "name": "Silent", |
|     "description": "Quiet operation with lower temperatures", |
|     "curve": { |
|       "min_temp": 40, |
|       "max_temp": 65, |
|       "min_pwm": 120, |
|       "max_pwm": 220 |
|     }, |
|     "safety": { |
|       "emergency_temp": 85, |
|       "emergency_pwm": 255 |
|     } |
|   }, |
|   "performance": { |
|     "name": "Performance", |
|     "description": "Maximum cooling for high performance", |
|     "curve": { |
|       "min_temp": 35, |
|       "max_temp": 55, |
|       "min_pwm": 180, |
|       "max_pwm": 255 |
|     }, |
|     "safety": { |
|       "emergency_temp": 80, |
|       "emergency_pwm": 255 |
|     } |
|   } |
| } |
| ``` |
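One plausible reading of a profile is a linear ramp between `(min_temp, min_pwm)` and `(max_temp, max_pwm)`, with the `safety` block overriding the curve at or above `emergency_temp`. The sketch below illustrates that reading only; the function name `pwm_for_temp` is hypothetical, and the interpolation actually used by `gpu_fan_controller.py` may differ.

```python
def pwm_for_temp(temp_c, profile):
    """Map a temperature in Celsius to a PWM value (0-255) for one profile.

    Assumes linear interpolation between the curve endpoints; this is an
    illustration, not necessarily what the controller does.
    """
    curve = profile["curve"]
    safety = profile["safety"]

    # Safety override: at or above the emergency temperature, ignore the curve.
    if temp_c >= safety["emergency_temp"]:
        return safety["emergency_pwm"]

    # Outside the curve's temperature range, hold the nearest endpoint.
    if temp_c <= curve["min_temp"]:
        return curve["min_pwm"]
    if temp_c >= curve["max_temp"]:
        return curve["max_pwm"]

    # Inside the range, interpolate linearly between the two endpoints.
    fraction = (temp_c - curve["min_temp"]) / (curve["max_temp"] - curve["min_temp"])
    return round(curve["min_pwm"] + fraction * (curve["max_pwm"] - curve["min_pwm"]))


# The "silent" profile from the example configuration.
silent = {
    "curve": {"min_temp": 40, "max_temp": 65, "min_pwm": 120, "max_pwm": 220},
    "safety": {"emergency_temp": 85, "emergency_pwm": 255},
}
```

Under this reading, the silent profile holds PWM 120 up to 40 °C, ramps to 220 at 65 °C, stays there until 85 °C, and then jumps to 255.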
|
|
| ### Monitoring Configuration |
| Configure monitoring settings in `config/monitoring.json`: |
|
|
| ```json |
| { |
|   "update_interval": 1.0, |
|   "display_mode": "overlay", |
|   "show_gpu_load": true, |
|   "show_temperature": true, |
|   "show_fan_speed": true, |
|   "show_power": true, |
|   "show_vram": true, |
|   "alerts": { |
|     "enabled": true, |
|     "temp_warning": 75, |
|     "temp_critical": 85, |
|     "power_warning": 200 |
|   } |
| } |
| ``` |
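To show how the `alerts` block might be evaluated, here is a minimal sketch. It assumes temperatures are in Celsius and `power_warning` is in watts, and that a critical temperature takes precedence over any warning; the function name `alert_level` and the returned labels are hypothetical.

```python
import json

# The "alerts" block from the example monitoring configuration.
ALERTS = json.loads("""
{
  "enabled": true,
  "temp_warning": 75,
  "temp_critical": 85,
  "power_warning": 200
}
""")


def alert_level(temp_c, power_w, alerts):
    """Return "ok", "warning", or "critical" for one sample (labels are illustrative)."""
    if not alerts.get("enabled", False):
        return "ok"
    # Critical temperature wins over every other condition.
    if temp_c >= alerts["temp_critical"]:
        return "critical"
    # Either a warm GPU or a high power draw raises a warning.
    if temp_c >= alerts["temp_warning"] or power_w >= alerts["power_warning"]:
        return "warning"
    return "ok"
```

For example, `alert_level(78, 120, ALERTS)` yields a warning because 78 °C exceeds `temp_warning` but not `temp_critical`.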
|
|
| ## Monitoring Data |
|
|
| The system collects and stores the following data: |
|
|
| ### GPU Metrics |
| - **Temperature**: Core temperature in Celsius |
| - **Load**: GPU utilization percentage |
| - **Fan Speed**: RPM and PWM duty cycle as a percentage |
| - **Power**: Current power draw in watts |
| - **VRAM**: Used and total memory |
| - **Clocks**: Core and memory clock speeds |
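On Linux, the amdgpu driver exposes several of these values as plain-text files in a hwmon directory under `/sys/class/drm/card*/device/hwmon/`. The sketch below reads a few of them, assuming the usual file names and units (`temp1_input` in millidegrees Celsius, `fan1_input` in RPM, `pwm1` as 0-255, `power1_average` in microwatts). Whether this project reads the sensors this way is an assumption, and `read_gpu_metrics` is a hypothetical helper name.

```python
from pathlib import Path


def read_int(path):
    """Read an integer from a sysfs-style file, or return None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def read_gpu_metrics(hwmon_dir):
    """Collect basic metrics from one hwmon directory, converting to display units."""
    hwmon = Path(hwmon_dir)
    temp = read_int(hwmon / "temp1_input")      # millidegrees Celsius
    pwm = read_int(hwmon / "pwm1")              # raw duty cycle, 0-255
    power = read_int(hwmon / "power1_average")  # microwatts (not present on every GPU)
    return {
        "temperature_c": None if temp is None else temp / 1000.0,
        "fan_rpm": read_int(hwmon / "fan1_input"),
        "fan_pwm_percent": None if pwm is None else round(pwm / 255 * 100, 1),
        "power_w": None if power is None else power / 1_000_000,
    }
```

Missing or unreadable files come back as `None` rather than raising, so a GPU without a given sensor still produces a usable sample.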
|
|
| ### System Metrics |
| - **CPU Usage**: Overall system load |
| - **Memory Usage**: System RAM utilization |
| - **Disk Usage**: Storage space monitoring |
| - **Network**: Bandwidth usage |
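Since the Features section lists SQLite storage and CSV export, a minimal sketch of that pipeline may help. The table name `gpu_samples`, its columns, and the helper names are all hypothetical; the project's actual schema is not documented here.

```python
import csv
import io
import sqlite3
import time

# Hypothetical column layout for one stored sample.
COLUMNS = ("ts", "temperature_c", "load_percent", "fan_rpm", "power_w")


def open_db(path=":memory:"):
    """Open (or create) the sample database; defaults to an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS gpu_samples ("
        "ts REAL NOT NULL, temperature_c REAL, load_percent REAL, "
        "fan_rpm INTEGER, power_w REAL)"
    )
    return conn


def log_sample(conn, sample, ts=None):
    """Insert one sample dict; metrics missing from the dict are stored as NULL."""
    row = [time.time() if ts is None else ts]
    row += [sample.get(name) for name in COLUMNS[1:]]
    conn.execute("INSERT INTO gpu_samples VALUES (?, ?, ?, ?, ?)", row)
    conn.commit()


def export_csv(conn):
    """Return all samples as CSV text, oldest first, with a header row."""
    cursor = conn.execute(f"SELECT {', '.join(COLUMNS)} FROM gpu_samples ORDER BY ts")
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(COLUMNS)
    writer.writerows(cursor)
    return buffer.getvalue()
```

Passing a file path to `open_db` instead of the default keeps history across restarts.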
|
|
| ## Web Interface Features |
|
|
| ### Dashboard |
| - Real-time metric display |
| - Historical charts with configurable time ranges |
| - System health overview |
| - Alert status and history |
|
|
| ### Charts |
| - Temperature trends over time |
| - Fan speed and PWM curves |
| - Power consumption patterns |
| - GPU utilization history |
|
|
| ### Configuration |
| - Fan profile management |
| - Alert threshold configuration |
| - Display settings |
| - Data export options |
|
|
| ## API Endpoints |
|
|
| ### Monitoring Data |
| - `GET /api/status` - Current GPU status |
| - `GET /api/history` - Historical data |
| - `GET /api/metrics` - All available metrics |
|
|
| ### Fan Control |
| - `POST /api/fan/profile` - Set fan profile |
| - `POST /api/fan/manual` - Manual fan control |
| - `GET /api/fan/status` - Current fan status |
|
|
| ### System |
| - `GET /api/system` - System information |
| - `POST /api/alerts` - Configure alerts |
| - `GET /api/logs` - System logs |
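A small client for these endpoints can be written with the standard library alone. The sketch below assumes the server listens on `http://localhost:5000` (as in the Usage section) and exchanges JSON; the request body shape, such as `{"profile": "silent"}`, is an assumption, since the payload format is not specified here.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:5000"


def build_request(path, payload=None, base_url=BASE_URL):
    """Build a GET request, or a JSON POST request when a payload is given."""
    url = urljoin(base_url, path)
    if payload is None:
        return Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


def call_api(path, payload=None, timeout=5):
    """Send the request and decode the response, assuming it is JSON."""
    with urlopen(build_request(path, payload), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the web server running, `call_api("/api/status")` would fetch the current GPU status, and `call_api("/api/fan/profile", {"profile": "silent"})` would request a profile switch if the server accepts that body.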
|
|
| ## Troubleshooting |
|
|
| ### Permission Issues |
| If you encounter permission errors with GPU monitoring: |
|
|
| ```bash |
| # Check GPU permissions |
| ls -la /sys/class/drm/card*/device/hwmon/ |
| |
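| # Add user to video group (log out and back in for this to take effect) |
| sudo usermod -a -G video "$USER" |
|
| # Or run with sudo for fan control (writing PWM values under /sys usually requires root) |
| sudo python3 gpu_fan_controller.py |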
| ``` |
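The same check can be scripted. This sketch reports whether the current user can read the sensor files and write the fan control files in a given hwmon directory. The file names follow the amdgpu hwmon convention (`pwm1` for the duty cycle, `pwm1_enable` for switching between manual and automatic control); `check_fan_access` is a hypothetical helper, not part of the project.

```python
import os
from pathlib import Path

# Files the monitor reads and the fan controller would need to write.
HWMON_FILES = ("temp1_input", "fan1_input", "pwm1", "pwm1_enable")


def check_fan_access(hwmon_dir):
    """Report existence and read/write access for each hwmon file."""
    hwmon = Path(hwmon_dir)
    report = {}
    for name in HWMON_FILES:
        path = hwmon / name
        report[name] = {
            "exists": path.exists(),
            "readable": os.access(path, os.R_OK),
            "writable": os.access(path, os.W_OK),
        }
    return report
```

If `pwm1` shows as readable but not writable, monitoring should work while fan control needs elevated privileges.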
|
|
| ### Missing Dependencies |
| ```bash |
| # Install missing PyQt5 |
| pip install PyQt5 PyQt5-sip |
| |
| # Install missing matplotlib |
| pip install matplotlib |
| |
| # Install missing Flask |
| pip install flask |
| ``` |
|
|
| ### Service Issues |
| ```bash |
| # Check service status |
| sudo systemctl status gpu-monitoring |
| |
| # View service logs |
| sudo journalctl -u gpu-monitoring -f |
| |
| # Restart service |
| sudo systemctl restart gpu-monitoring |
| ``` |
|
|
| ## Development |
|
|
| ### Adding New GPU Support |
| 1. Update `gpu_detector.py` with new GPU detection logic |
| 2. Add temperature sensor paths for new GPU models |
| 3. Test with `python3 test_gpu_detection.py` |
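As a starting point for detection logic, the sketch below scans `/sys/class/drm` for cards whose PCI vendor ID is `0x1002` (AMD) and returns the first hwmon directory for each. It does not reproduce the contents of `gpu_detector.py`, which are not shown here; `find_amd_gpus` and its return shape are hypothetical.

```python
from pathlib import Path

AMD_VENDOR_ID = "0x1002"  # PCI vendor ID assigned to AMD/ATI


def find_amd_gpus(drm_root="/sys/class/drm"):
    """Return a list of {"card", "hwmon"} dicts for AMD GPUs under drm_root."""
    gpus = []
    for card in sorted(Path(drm_root).glob("card[0-9]*")):
        # Skip connector entries such as card0-DP-1; only cardN is a device.
        if "-" in card.name:
            continue
        try:
            vendor = (card / "device" / "vendor").read_text().strip().lower()
        except OSError:
            continue
        if vendor != AMD_VENDOR_ID:
            continue
        hwmon_dirs = sorted((card / "device" / "hwmon").glob("hwmon*"))
        gpus.append({
            "card": card.name,
            "hwmon": str(hwmon_dirs[0]) if hwmon_dirs else None,
        })
    return gpus
```

Taking `drm_root` as a parameter makes the function testable against a fake directory tree without real hardware.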
|
|
| ### Custom Fan Curves |
| 1. Create new profile in `config/fan_profiles.json` |
| 2. Test with `python3 gpu_fan_controller.py --profile new_profile` |
| 3. Validate temperature response and stability |
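Before testing a new profile on hardware, it can be worth checking it for obvious mistakes. The following sketch validates the fields used in the example `fan_profiles.json`; the specific rules (PWM within 0-255, emergency temperature above the curve's maximum) are reasonable assumptions rather than checks the controller is known to perform, and `validate_profile` is a hypothetical name.

```python
def validate_profile(profile):
    """Return a list of human-readable problems; an empty list means the profile looks sane."""
    errors = []
    curve = profile.get("curve", {})
    safety = profile.get("safety", {})

    # Every field from the example configuration must be present.
    for key in ("min_temp", "max_temp", "min_pwm", "max_pwm"):
        if key not in curve:
            errors.append(f"curve.{key} is missing")
    for key in ("emergency_temp", "emergency_pwm"):
        if key not in safety:
            errors.append(f"safety.{key} is missing")
    if errors:
        return errors

    if curve["min_temp"] >= curve["max_temp"]:
        errors.append("curve.min_temp must be below curve.max_temp")
    if curve["min_pwm"] > curve["max_pwm"]:
        errors.append("curve.min_pwm must not exceed curve.max_pwm")

    # PWM values are assumed to be raw 8-bit duty cycles.
    for name, value in (("curve.min_pwm", curve["min_pwm"]),
                        ("curve.max_pwm", curve["max_pwm"]),
                        ("safety.emergency_pwm", safety["emergency_pwm"])):
        if not 0 <= value <= 255:
            errors.append(f"{name} must be between 0 and 255")

    if safety["emergency_temp"] <= curve["max_temp"]:
        errors.append("safety.emergency_temp must be above curve.max_temp")
    return errors
```

Running this over each entry in the profiles file before starting the controller catches inverted ranges and out-of-range PWM values early.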
|
|
| ### Web Interface Extensions |
| 1. Add new routes in `web_interface.py` |
| 2. Create templates in `templates/` directory |
| 3. Add static assets in `static/` directory |
|
|
| ## License |
|
|
| This project is licensed under the MIT License - see the LICENSE file for details. |
|
|
| ## Contributing |
|
|
| 1. Fork the repository |
| 2. Create a feature branch |
| 3. Make your changes |
| 4. Add tests for your changes |
| 5. Submit a pull request |
|
|
| ## Support |
|
|
| For support and questions: |
| - Create an issue on GitHub |
| - Check the troubleshooting section |
| - Review the configuration examples |