# GPU Monitoring and Fan Control System

A comprehensive GPU monitoring and fan control system for AMD GPUs with real-time monitoring, advanced fan control, web interface, and system integration.

## Features

### 🖥️ Desktop Monitoring

- **Real-time GPU monitoring** with PyQt5 desktop overlay
- **System tray integration** for minimal footprint monitoring
- **Configurable display modes** (overlay, tray, full dashboard)
- **Multiple GPU support** with automatic detection

### 🌡️ Advanced Fan Control

- **Multiple temperature curves** (Silent, Balanced, Performance, Custom)
- **Profile-based control** with hotkey switching
- **Safety limits** and automatic fallback modes
- **Manual override** capabilities

### 🌐 Web Interface

- **Remote monitoring** via web browser
- **Real-time charts** and historical data
- **Mobile-responsive** design
- **API endpoints** for integration

### 📊 Data Logging & Analysis

- **Historical data storage** with SQLite
- **Performance analytics** and trend analysis
- **Export capabilities** (CSV, JSON)
- **Alert system** for temperature thresholds

### 🔧 System Integration

- **Systemd service** for automatic startup
- **Configuration management** with JSON profiles
- **Autostart integration** for desktop environments
- **Permission handling** and error recovery

## Installation

### Prerequisites

```bash
# Install required packages
sudo apt update
sudo apt install python3 python3-pip python3-venv
sudo apt install python3-pyqt5 python3-pyqt5.qtopengl
sudo apt install python3-matplotlib python3-flask
```

### Setup

```bash
# Create project directory
mkdir -p ~/gpu_monitoring_system
cd ~/gpu_monitoring_system

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install Python dependencies
pip install -r requirements.txt

# Run setup script
./setup.sh
```

## Usage

### Desktop Monitor

```bash
# Start desktop monitoring overlay
python3 gpu_monitor_desktop.py

# Start with system tray
python3 gpu_monitor_desktop.py --tray

# Start full dashboard
python3 gpu_monitor_desktop.py --dashboard
```

### Fan Control

```bash
# Start fan control with default profile
python3 gpu_fan_controller.py

# Start with specific profile
python3 gpu_fan_controller.py --profile performance

# Start with custom configuration
python3 gpu_fan_controller.py --config custom.json
```

### Web Interface

```bash
# Start web server
python3 web_interface.py

# Access at http://localhost:5000
```

### System Service

```bash
# Install as system service
sudo ./install_service.sh

# Start service
sudo systemctl start gpu-monitoring

# Enable auto-start
sudo systemctl enable gpu-monitoring
```

## Configuration

### Fan Control Profiles

Create custom fan control profiles in `config/fan_profiles.json`:

```json
{
  "silent": {
    "name": "Silent",
    "description": "Quiet operation with lower temperatures",
    "curve": {
      "min_temp": 40,
      "max_temp": 65,
      "min_pwm": 120,
      "max_pwm": 220
    },
    "safety": {
      "emergency_temp": 85,
      "emergency_pwm": 255
    }
  },
  "performance": {
    "name": "Performance",
    "description": "Maximum cooling for high performance",
    "curve": {
      "min_temp": 35,
      "max_temp": 55,
      "min_pwm": 180,
      "max_pwm": 255
    },
    "safety": {
      "emergency_temp": 80,
      "emergency_pwm": 255
    }
  }
}
```

### Monitoring Configuration

Configure monitoring settings in `config/monitoring.json`:

```json
{
  "update_interval": 1.0,
  "display_mode": "overlay",
  "show_gpu_load": true,
  "show_temperature": true,
  "show_fan_speed": true,
  "show_power": true,
  "show_vram": true,
  "alerts": {
    "enabled": true,
    "temp_warning": 75,
    "temp_critical": 85,
    "power_warning": 200
  }
}
```

## Monitoring Data

The system collects and stores the following data:

### GPU Metrics

- **Temperature**: Core temperature in Celsius
- **Load**: GPU utilization percentage
- **Fan Speed**: RPM and PWM percentage
- **Power**: Current power draw in watts
- **VRAM**: Used and total memory
- **Clocks**: Core and memory clock speeds

### System Metrics

- **CPU Usage**: Overall system load
- **Memory Usage**: System RAM utilization
- **Disk Usage**: Storage space monitoring
- **Network**: Bandwidth usage

## Web Interface Features

### Dashboard

- Real-time metric display
- Historical charts with configurable time ranges
- System health overview
- Alert status and history

### Charts

- Temperature trends over time
- Fan speed and PWM curves
- Power consumption patterns
- GPU utilization history

### Configuration

- Fan profile management
- Alert threshold configuration
- Display settings
- Data export options

## API Endpoints

### Monitoring Data

- `GET /api/status` - Current GPU status
- `GET /api/history` - Historical data
- `GET /api/metrics` - All available metrics

### Fan Control

- `POST /api/fan/profile` - Set fan profile
- `POST /api/fan/manual` - Manual fan control
- `GET /api/fan/status` - Current fan status

### System

- `GET /api/system` - System information
- `POST /api/alerts` - Configure alerts
- `GET /api/logs` - System logs

## Troubleshooting

### Permission Issues

If you encounter permission errors with GPU monitoring:

```bash
# Check GPU permissions
ls -la /sys/class/drm/card*/device/hwmon/

# Add user to video group
sudo usermod -a -G video $USER

# Or run with sudo for fan control
sudo python3 gpu_fan_controller.py
```

### Missing Dependencies

```bash
# Install missing PyQt5
pip install PyQt5 PyQt5-sip

# Install missing matplotlib
pip install matplotlib

# Install missing Flask
pip install flask
```

### Service Issues

```bash
# Check service status
sudo systemctl status gpu-monitoring

# View service logs
sudo journalctl -u gpu-monitoring -f

# Restart service
sudo systemctl restart gpu-monitoring
```

## Development

### Adding New GPU Support

1. Update `gpu_detector.py` with new GPU detection logic
2. Add temperature sensor paths for new GPU models
3. Test with `python3 test_gpu_detection.py`

### Custom Fan Curves

1. Create new profile in `config/fan_profiles.json`
2. Test with `python3 gpu_fan_controller.py --profile new_profile`
3. Validate temperature response and stability

### Web Interface Extensions

1. Add new routes in `web_interface.py`
2. Create templates in `templates/` directory
3. Add static assets in `static/` directory

## License

This project is licensed under the MIT License - see the LICENSE file for details.

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests for your changes
5. Submit a pull request

## Support

For support and questions:

- Create an issue on GitHub
- Check the troubleshooting section
- Review the configuration examples
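
## Example: Reading a Fan Profile Curve

When writing a custom profile, it can help to see how the `curve` and `safety` fields from `config/fan_profiles.json` might translate into a PWM value. The sketch below assumes linear interpolation between (`min_temp`, `min_pwm`) and (`max_temp`, `max_pwm`), with the emergency limit taking priority. This README does not state how `gpu_fan_controller.py` actually computes the curve, so the function name `pwm_for_temperature` and the interpolation rule are illustrative assumptions, not the project's implementation.

```python
# Hypothetical helper: not part of the project, shown for illustration only.
# Values copied from the "silent" profile in config/fan_profiles.json.
SILENT_PROFILE = {
    "curve": {"min_temp": 40, "max_temp": 65, "min_pwm": 120, "max_pwm": 220},
    "safety": {"emergency_temp": 85, "emergency_pwm": 255},
}


def pwm_for_temperature(temp_c, profile):
    """Map a temperature in Celsius to a PWM value (0-255).

    Assumes linear interpolation between the curve endpoints; the real
    controller may use a different rule.
    """
    curve = profile["curve"]
    safety = profile["safety"]

    # The safety limit overrides the curve entirely.
    if temp_c >= safety["emergency_temp"]:
        return safety["emergency_pwm"]

    # Clamp to the curve endpoints outside the configured range.
    if temp_c <= curve["min_temp"]:
        return curve["min_pwm"]
    if temp_c >= curve["max_temp"]:
        return curve["max_pwm"]

    # Interpolate linearly between the two endpoints.
    span = curve["max_temp"] - curve["min_temp"]
    fraction = (temp_c - curve["min_temp"]) / span
    pwm = curve["min_pwm"] + fraction * (curve["max_pwm"] - curve["min_pwm"])
    return round(pwm)


if __name__ == "__main__":
    for temp in (30, 50, 70, 90):
        print(temp, pwm_for_temperature(temp, SILENT_PROFILE))
```

Under these assumptions, the fan never drops below `min_pwm`, and any reading at or above `emergency_temp` jumps straight to `emergency_pwm` regardless of the curve.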
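
## Example: Interpreting Alert Thresholds

The `alerts` block in `config/monitoring.json` defines `temp_warning` and `temp_critical` thresholds. As a rough sketch of how such thresholds could be evaluated, the snippet below classifies a temperature reading into a level. The function name `classify_temperature`, the level names, and the choice of inclusive comparisons are assumptions made for illustration; the project's alert logic may differ.

```python
# Hypothetical helper: not part of the project, shown for illustration only.
# Values copied from the "alerts" block in config/monitoring.json.
ALERTS = {
    "enabled": True,
    "temp_warning": 75,
    "temp_critical": 85,
    "power_warning": 200,
}


def classify_temperature(temp_c, alerts):
    """Return "disabled", "ok", "warning", or "critical" for a reading.

    Thresholds are treated as inclusive, which is an assumption.
    """
    if not alerts.get("enabled", False):
        return "disabled"
    # Check the more severe threshold first.
    if temp_c >= alerts["temp_critical"]:
        return "critical"
    if temp_c >= alerts["temp_warning"]:
        return "warning"
    return "ok"


if __name__ == "__main__":
    for temp in (60, 75, 90):
        print(temp, classify_temperature(temp, ALERTS))
```

Checking the critical threshold before the warning threshold ensures a reading above both is reported at the higher severity.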