---
license: mit
language:
- en
library_name: scikit-learn
tags:
- predictive-maintenance
- random-forest
- binary-classification
- engine-maintenance
datasets:
- nasa-cmapss
metrics:
- accuracy
- f1
- f2
- roc-auc
---
# Engine Predictive Maintenance Model
## Model Overview
This is a **tuned Random Forest classifier** for predictive engine maintenance. It is trained with SMOTE oversampling to handle class imbalance and to favor recall in failure detection.
## Model Details
- **Model Type**: Random Forest Classifier with SMOTE Pipeline
- **Framework**: scikit-learn, imbalanced-learn
- **Task**: Binary Classification (Engine Condition: Good/Failing)
- **Input Features**: 14 engineered sensor features (RPM, pressure, temperature, etc.)
- **Output**: Probability of engine failure (0-1)
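As a rough illustration of the output described above, the sketch below reads the failure probability from a fitted scikit-learn classifier. The helper name `failure_probability`, the assumption that failure is encoded as label `1`, and the randomly generated stand-in data are all hypothetical and not taken from this repository.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def failure_probability(model, X, failure_label=1):
    """Return P(failure) per row, locating the column via model.classes_.

    failure_label=1 is an assumption about how the target is encoded.
    """
    column = list(model.classes_).index(failure_label)
    return model.predict_proba(X)[:, column]


# Stand-in model fitted on random data, only to make the sketch runnable.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(200, 14))  # 14 features, as in this card
y_demo = (X_demo[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)
demo_model = RandomForestClassifier(n_estimators=20, random_state=42)
demo_model.fit(X_demo, y_demo)

probs = failure_probability(demo_model, X_demo[:5])
print(probs)  # five values between 0 and 1
```

Looking up the column through `classes_` avoids relying on the failure class always being the second column of `predict_proba`.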
## Model Performance
### Test Set Metrics
| Metric | Score |
|--------|-------|
| Accuracy | 0.6340 |
| Precision | 0.7456 |
| Recall | 0.6366 |
| F1 Score | 0.6868 |
| F2 Score | 0.6558 |
| ROC-AUC | 0.6893 |
| Brier Score | 0.2195 |
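
The F1 and F2 scores in the table follow from the reported precision and recall through the standard F-beta formula. A minimal sketch of that arithmetic (the helper name `f_beta` is illustrative):

```python
def f_beta(precision, recall, beta):
    """F-beta score: weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily than precision.
    """
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


precision, recall = 0.7456, 0.6366  # values from the table above

f1 = f_beta(precision, recall, beta=1.0)
f2 = f_beta(precision, recall, beta=2.0)
print(round(f1, 4), round(f2, 4))  # 0.6868 0.6558
```

Because recall is lower than precision here, the recall-weighted F2 comes out below F1.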
## Key Insights
- **Recall (0.6366)**: Detects ~64% of actual failures, so roughly 36% of failures are missed
- **Precision (0.7456)**: ~75% of engines flagged as failing are actual failures
- **ROC-AUC (0.6893)**: Moderate discrimination between failure and non-failure cases (0.5 corresponds to random ranking)
## Intended Use
This model is designed for:
- **Predictive Maintenance**: Identify engines at risk of failure before breakdown
- **Condition Monitoring**: Support data-driven maintenance decision-making
- **Fleet Management**: Optimize maintenance scheduling and resource allocation
- **Risk Assessment**: Provide failure probability scores for maintenance prioritization
## Limitations
- Trained on historical engine data with specific sensor configurations
- Performance may vary with new sensor types or operating conditions
- Model requires regular retraining with updated failure data
- Does not capture temporal degradation patterns (time-series)
- Assumes consistent sensor calibration and operating conditions
## Training Data
- **Dataset**: Engine Predictive Maintenance Dataset
- **Total Samples**: 19,581 engines
- **Training Samples**: 13,674 (70%)
- **Test Samples**: 3,907 (20%)
- **Features**: 14 engineered features (6 raw + 8 derived)
- **Class Distribution**: Imbalanced (Good: ~63%, Failure: ~37%)
## Training Procedure
1. Data preprocessing and feature engineering
2. Three-way data split (70/20/10), with 70% used for training and 20% for testing
3. SMOTE oversampling on training data to handle class imbalance
4. Hyperparameter tuning via GridSearchCV with 5-fold cross-validation
5. Model evaluation on held-out test set
## Hyperparameters
- **n_estimators**: 400
- **max_depth**: 12
- **min_samples_leaf**: 4
- **SMOTE k_neighbors**: 5
- **Random state**: 42
## Recommendations
1. **Threshold Tuning**: Adjust decision threshold based on cost of false positives vs. false negatives
2. **Continuous Monitoring**: Track model performance in production and retrain quarterly with new data
3. **Feature Importance**: Use SHAP or feature importance analysis to identify critical sensors
4. **Ensemble Approaches**: Consider combining with other models (XGBoost, LightGBM) for robust predictions
5. **Domain Expertise**: Combine predictions with expert knowledge for final maintenance decisions
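
For the threshold tuning recommendation, one possible approach is to scan the candidate thresholds from `precision_recall_curve` and keep the one with the best F-beta score. The sketch below assumes F2 as the objective and uses a tiny made-up set of labels and probabilities; the helper name `best_threshold_for_fbeta` is hypothetical.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve


def best_threshold_for_fbeta(y_true, y_prob, beta=2.0):
    """Return (threshold, score) maximizing F-beta over candidate thresholds."""
    precision, recall, thresholds = precision_recall_curve(y_true, y_prob)
    # The final precision/recall pair has no matching threshold; drop it.
    precision, recall = precision[:-1], recall[:-1]
    b2 = beta ** 2
    denom = b2 * precision + recall
    safe_denom = np.where(denom > 0, denom, 1.0)  # avoid division by zero
    scores = (1 + b2) * precision * recall / safe_denom
    best = int(np.argmax(scores))
    return float(thresholds[best]), float(scores[best])


# Made-up labels and predicted failure probabilities (illustration only).
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1])
y_prob = np.array([0.10, 0.35, 0.40, 0.45, 0.55, 0.60, 0.65, 0.90])

threshold, score = best_threshold_for_fbeta(y_true, y_prob)
print(threshold, round(score, 4))
```

In practice the threshold would be chosen on validation data rather than the test set, and `beta` (or an explicit cost function) would reflect the relative cost of missed failures versus unnecessary inspections.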
## Citation
If you use this model, please cite:
```
@misc{predictive-maintenance-model-2026,
title={Engine Predictive Maintenance Model},
author={GreatLearning Capstone Team},
year={2026},
howpublished={Hugging Face Hub},
url={https://huggingface.co/models/nilanjanadevc/engine-predictive-maintenance-model}
}
```
## License
This model is released under the MIT License. See the LICENSE file for details.
## Contact & Support
For questions or issues:
- GitHub: [Check repository](https://github.com/nilanjanadevc/predictive-engine-maintainence-mlops)
- Hugging Face: [@nilanjanadevc](https://huggingface.co/nilanjanadevc)