---
title: Server Failure Predictor
emoji: 🦀
colorFrom: green
colorTo: green
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: mit
short_description: Predicts the probability of server failure
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

#### Server Health Sentinel AI

Predicting Thermal Runaway Before It Happens

This project is a Proof-of-Concept (PoC) for AIOps (Artificial Intelligence for IT Operations). It demonstrates how machine learning can move beyond simple "threshold-based" monitoring to predictive failure analysis.

### Try the Demo

Adjust the sliders in the Live Telemetry Simulation panel to see how the model reacts to different stress scenarios.

- Scenario A (Idle): Low CPU, Low Temp -> System Normal
- Scenario B (Gaming/Load): High Sustained CPU, High Temp -> CRITICAL FAILURE IMMINENT
- Scenario C (Cool Down): Low Current CPU but High Sustained Load + High Temp -> CRITICAL (Predicting residual heat)

### The Model: Random Forest Classifier

Unlike simple if/else logic, this system uses a Random Forest Classifier (an ensemble of 100 decision trees) to weigh multiple factors simultaneously.

It was trained on a custom dataset of 10,000+ telemetry points collected from a high-performance Linux gaming laptop (HP Victus 15 / Ryzen 5 5600H) under various real-world conditions:

- Idle/Web Browsing (Baseline)
- Compilation/Workloads (CPU Spikes)
- Gaming (Sekiro: Shadows Die Twice) (Sustained CPU+GPU Thermal Stress)

### Feature Engineering

The model doesn't just look at current stats. It relies on engineered trend features to understand context:

- Rolling Averages: A 1-minute sustained load is more dangerous than a 1-second spike.
- Thermal Inertia: Combining current temp with recent load history to predict "heat soak."
- Rate of Change: How fast is the temperature climbing?

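
The three trend features described above can be sketched in a few lines. This is an illustrative, dependency-free sketch and not the project's actual pipeline: the function name `engineer_features`, the feature names, the one-sample-per-second rate, and in particular the `thermal_inertia` formula (current temperature scaled by recent average load) are all assumptions made for the example. The same features could be computed with pandas `rolling(...).mean()` and `diff()`.

```python
from collections import deque
from statistics import fmean


def engineer_features(samples, window=60):
    """Turn raw (cpu_percent, temp_c) samples into trend features.

    Assumes one sample per second, so window=60 is a 1-minute rolling average.
    All feature names here are hypothetical.
    """
    recent_cpu = deque(maxlen=window)  # keeps only the last `window` CPU readings
    prev_temp = None
    rows = []
    for cpu, temp in samples:
        recent_cpu.append(cpu)
        # Rolling average: sustained load over the window, not the instantaneous spike.
        cpu_avg = fmean(recent_cpu)
        # Rate of change: degrees gained since the previous sample.
        temp_delta = 0.0 if prev_temp is None else temp - prev_temp
        # Assumed "thermal inertia" proxy: current temperature weighted by recent load.
        thermal_inertia = temp * cpu_avg / 100.0
        rows.append(
            {
                "cpu_now": cpu,
                "temp_now": temp,
                "cpu_avg": cpu_avg,
                "temp_delta": temp_delta,
                "thermal_inertia": thermal_inertia,
            }
        )
        prev_temp = temp
    return rows


if __name__ == "__main__":
    # Scenario C (Cool Down): CPU has dropped, but recent load and temperature are high.
    demo = [(10.0, 45.0), (90.0, 50.0), (90.0, 58.0), (5.0, 70.0)]
    for row in engineer_features(demo, window=3):
        print(row)
```

In the last demo row the current CPU reading is low while `cpu_avg` stays high, which is the kind of context a model needs to flag residual heat after the load has ended.
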
### Performance

- AUC Score: ~0.99 on the test set
- False Positive Rate: <0.5%
- False Negative Rate: <1.0%

### Tech Stack

- Training: Scikit-Learn, Pandas, Psutil
- Deployment: Gradio, Hugging Face Spaces
- Hardware Target: x86_64 Linux Systems
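
For readers unfamiliar with the metrics listed under Performance, the sketch below shows how a false positive rate, a false negative rate, and an AUC are defined for a binary failure classifier. It is a plain-Python illustration, not the project's evaluation script: the function name `failure_metrics`, the 0.5 decision threshold, and the toy labels are assumptions, and a real evaluation would more likely use scikit-learn's metrics module.

```python
def failure_metrics(y_true, y_score, threshold=0.5):
    """Compute FPR, FNR and AUC for binary labels (1 = failure, 0 = normal).

    `y_score` holds predicted failure probabilities; `threshold` is an assumed cutoff.
    """
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1  # false alarm on a healthy sample
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1  # missed failure

    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0

    # AUC as the probability that a random failure sample scores higher
    # than a random normal sample (ties count as half). O(n^2), fine for a demo.
    pos = [s for label, s in zip(y_true, y_score) if label == 1]
    neg = [s for label, s in zip(y_true, y_score) if label == 0]
    if pos and neg:
        wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
        auc = wins / (len(pos) * len(neg))
    else:
        auc = float("nan")  # undefined when only one class is present

    return {"fpr": fpr, "fnr": fnr, "auc": auc}


if __name__ == "__main__":
    labels = [0, 0, 0, 0, 1, 1]
    scores = [0.1, 0.2, 0.6, 0.3, 0.9, 0.4]
    print(failure_metrics(labels, scores))
```

Note that AUC is independent of the threshold, while the false positive and false negative rates shift as the threshold moves, so the two rates are only meaningful alongside the cutoff used to produce them.
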