---
license: mit
language:
- en
library_name: scikit-learn
tags:
- predictive-maintenance
- random-forest
- binary-classification
- engine-maintenance
datasets:
- nasa-cmapss
metrics:
- accuracy
- f1
- f2
- roc-auc
---

# Engine Predictive Maintenance Model

## Model Overview

This is a **Tuned Random Forest Classifier** trained for predictive engine maintenance. It uses SMOTE oversampling to handle class imbalance and to favor recall in failure detection.

## Model Details

- **Model Type**: Random Forest Classifier with SMOTE Pipeline
- **Framework**: scikit-learn, imbalanced-learn
- **Task**: Binary Classification (Engine Condition: Good/Failing)
- **Input Features**: 14 engineered sensor features (RPM, pressure, temperature, etc.)
- **Output**: Probability of engine failure (0-1)

## Model Performance

### Test Set Metrics

| Metric | Score |
|--------|-------|
| Accuracy | 0.6340 |
| Precision | 0.7456 |
| Recall | 0.6366 |
| F1 Score | 0.6868 |
| F2 Score | 0.6558 |
| ROC-AUC | 0.6893 |
| Brier Score | 0.2195 |

## Key Insights

- **Recall (0.6366)**: Detects ~64% of actual failures
- **Precision (0.7456)**: ~75% of engines flagged as failing are actual failures
- **ROC-AUC (0.6893)**: Moderate discrimination between failure and non-failure cases

## Intended Use

This model is designed for:

- **Predictive Maintenance**: Identify engines at risk of failure before breakdown
- **Condition Monitoring**: Support data-driven maintenance decision-making
- **Fleet Management**: Optimize maintenance scheduling and resource allocation
- **Risk Assessment**: Provide failure probability scores for maintenance prioritization

## Limitations

- Trained on historical engine data with specific sensor configurations
- Performance may vary with new sensor types or operating conditions
- Requires regular retraining with updated failure data
- Does not capture temporal degradation patterns (time-series)
- Assumes consistent sensor calibration and operating conditions

## Training Data

- **Dataset**: Engine Predictive Maintenance Dataset
- **Total Samples**: 19,581 engines
- **Training Samples**: 13,674 (70%)
- **Test Samples**: 3,907 (20%)
- **Features**: 14 engineered features (6 raw + 8 derived)
- **Class Distribution**: Imbalanced (Good: ~63%, Failure: ~37%)

## Training Procedure

1. Data preprocessing and feature engineering
2. Data split (70-20-10)
3. SMOTE oversampling on training data to handle class imbalance
4. Hyperparameter tuning via GridSearchCV with 5-fold cross-validation
5. Model evaluation on held-out test set

## Hyperparameters

- **n_estimators**: 400
- **max_depth**: 12
- **min_samples_leaf**: 4
- **SMOTE k_neighbors**: 5
- **Random state**: 42

## Recommendations

1. **Threshold Tuning**: Adjust the decision threshold based on the cost of false positives vs. false negatives
2. **Continuous Monitoring**: Track model performance in production and retrain quarterly with new data
3. **Feature Importance**: Use SHAP or feature importance analysis to identify critical sensors
4. **Ensemble Approaches**: Consider combining with other models (XGBoost, LightGBM) for more robust predictions
5. **Domain Expertise**: Combine predictions with expert knowledge for final maintenance decisions

## Citation

If you use this model, please cite:

```
@misc{predictive-maintenance-model-2026,
  title={Engine Predictive Maintenance Model},
  author={GreatLearning Capstone Team},
  year={2026},
  howpublished={Hugging Face Hub},
  url={https://huggingface.co/models/nilanjanadevc/engine-predictive-maintenance-model}
}
```

## License

This model is released under the MIT License. See the LICENSE file for details.

## Contact & Support

For questions or issues:

- GitHub: [Check repository](https://github.com/nilanjanadevc/predictive-engine-maintainence-mlops)
- Hugging Face: [@nilanjanadevc](https://huggingface.co/nilanjanadevc)
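
## Example: Threshold Tuning with F2

The threshold-tuning recommendation and the F1/F2 rows of the metrics table can be illustrated with a small, dependency-free sketch. It is not taken from the model's repository: the helper names (`fbeta`, `precision_recall_at`, `best_threshold`) are hypothetical, the candidate threshold grid is an assumption, and the toy labels and probabilities are made up for illustration rather than drawn from the dataset. The only values from this card are the reported precision (0.7456) and recall (0.6366).

```python
from typing import Optional, Sequence, Tuple


def fbeta(precision: float, recall: float, beta: float) -> float:
    """F-beta score from precision and recall; beta > 1 weights recall more."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return 0.0 if denom == 0 else (1 + b2) * precision * recall / denom


def precision_recall_at(
    y_true: Sequence[int], proba: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Precision and recall when predicting failure (1) for proba >= threshold."""
    tp = fp = fn = 0
    for y, p in zip(y_true, proba):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def best_threshold(
    y_true: Sequence[int],
    proba: Sequence[float],
    beta: float = 2.0,
    candidates: Optional[Sequence[float]] = None,
) -> Tuple[float, float]:
    """Return (threshold, score) maximizing F-beta over a candidate grid."""
    if candidates is None:
        # Assumed grid: 0.05, 0.10, ..., 0.95
        candidates = [i / 100 for i in range(5, 100, 5)]
    scored = [
        (t, fbeta(*precision_recall_at(y_true, proba, t), beta))
        for t in candidates
    ]
    # max() keeps the first (lowest) threshold on ties
    return max(scored, key=lambda pair: pair[1])


# Consistency check against the reported test metrics in this card
print(round(fbeta(0.7456, 0.6366, 1.0), 4))  # 0.6868
print(round(fbeta(0.7456, 0.6366, 2.0), 4))  # 0.6558

# Toy example (illustrative values only, not from the dataset)
toy_labels = [1, 1, 1, 0, 0, 1, 0, 0]
toy_proba = [0.90, 0.80, 0.45, 0.40, 0.30, 0.36, 0.20, 0.10]
print(best_threshold(toy_labels, toy_proba, beta=2.0))
```

In practice the probabilities would come from the fitted pipeline's `predict_proba` output on a validation split, not the test set, so that the reported test metrics remain an unbiased estimate. Using `beta=2.0` reflects the assumption that a missed failure costs more than a false alarm; a different cost balance would call for a different `beta`.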