---
license: mit
language:
  - en
library_name: scikit-learn
tags:
  - predictive-maintenance
  - random-forest
  - binary-classification
  - engine-maintenance
datasets:
  - nasa-cmapss
metrics:
  - accuracy
  - f1
  - f2
  - roc-auc
---

# Engine Predictive Maintenance Model

## Model Overview
This is a **tuned Random Forest classifier** for predictive engine maintenance. It is trained with SMOTE oversampling to counter class imbalance and improve recall on failing engines; recall on the test set is 0.6366.

## Model Details
- **Model Type**: Random Forest Classifier with SMOTE Pipeline
- **Framework**: scikit-learn, imbalanced-learn
- **Task**: Binary Classification (Engine Condition: Good/Failing)
- **Input Features**: 14 features (6 raw sensor readings such as RPM, pressure, and temperature, plus 8 derived features)
- **Output**: Probability of engine failure (0-1)

## Model Performance

### Test Set Metrics

| Metric | Score |
|--------|-------|
| Accuracy | 0.6340 |
| Precision | 0.7456 |
| Recall | 0.6366 |
| F1 Score | 0.6868 |
| F2 Score | 0.6558 |
| ROC-AUC | 0.6893 |
| Brier Score | 0.2195 |
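
The F1 and F2 scores in the table follow from the reported precision and recall through the standard F-beta formula. The short check below is only an illustration in plain Python; the helper name `f_beta` is made up for this example and is not part of the model's code.

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """F-beta score: weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily than precision.
    """
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the test set metrics table.
precision, recall = 0.7456, 0.6366

f1 = f_beta(precision, recall, beta=1.0)
f2 = f_beta(precision, recall, beta=2.0)

print(f"F1 = {f1:.4f}")  # F1 = 0.6868
print(f"F2 = {f2:.4f}")  # F2 = 0.6558
```

F2 is lower than F1 here because it weights recall four times as heavily as precision, and recall is the weaker of the two.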

## Key Insights
- **Moderate Recall (0.6366)**: Detects ~64% of actual failures, so about 36% of failures are missed
- **Precision (0.7456)**: ~75% of predicted failures are actual failures
- **Modest AUC (0.6893)**: Discrimination between failure and non-failure cases is better than chance (0.5) but limited

## Intended Use

This model is designed for:
- **Predictive Maintenance**: Identify engines at risk of failure before breakdown
- **Condition Monitoring**: Support data-driven maintenance decision-making
- **Fleet Management**: Optimize maintenance scheduling and resource allocation
- **Risk Assessment**: Provide failure probability scores for maintenance prioritization

## Limitations

- Trained on historical engine data with specific sensor configurations
- Performance may vary with new sensor types or operating conditions
- Model requires regular retraining with updated failure data
- Does not capture temporal degradation patterns (time-series)
- Assumes consistent sensor calibration and operating conditions

## Training Data

- **Dataset**: Engine Predictive Maintenance Dataset
- **Total Samples**: 19,581 engines
- **Training Samples**: 13,674 (70%)
- **Test Samples**: 3,907 (20%)
- **Features**: 14 engineered features (6 raw + 8 derived)
- **Class Distribution**: Imbalanced (Good: ~63%, Failure: ~37%)

## Training Procedure

1. Data preprocessing and feature engineering
2. Three-way data split (70/20/10), with 70% used for training and 20% for testing
3. SMOTE oversampling on training data to handle class imbalance
4. Hyperparameter tuning via GridSearchCV with 5-fold cross-validation
5. Model evaluation on held-out test set

## Hyperparameters
- **n_estimators**: 400
- **max_depth**: 12
- **min_samples_leaf**: 4
- **SMOTE k_neighbors**: 5
- **Random state**: 42

## Recommendations

1. **Threshold Tuning**: Adjust decision threshold based on cost of false positives vs. false negatives
2. **Continuous Monitoring**: Track model performance in production and retrain quarterly with new data
3. **Feature Importance**: Use SHAP or feature importance analysis to identify critical sensors
4. **Ensemble Approaches**: Consider combining with other models (XGBoost, LightGBM) for robust predictions
5. **Domain Expertise**: Combine predictions with expert knowledge for final maintenance decisions
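
As one way to approach the threshold tuning suggested in item 1, the sketch below sweeps candidate thresholds and picks the one with the lowest total misclassification cost. The labels, probabilities, costs, and function names are toy values invented for this example (with 1 assumed to mean "failing"); in practice the probabilities would come from the model's `predict_proba` on a validation set.

```python
def total_cost(y_true, proba, threshold, cost_fp, cost_fn):
    """Total cost of false positives and false negatives at a threshold."""
    fp = fn = 0
    for label, p in zip(y_true, proba):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and label == 0:
            fp += 1  # unnecessary maintenance
        elif predicted == 0 and label == 1:
            fn += 1  # missed failure
    return fp * cost_fp + fn * cost_fn


def best_threshold(y_true, proba, cost_fp, cost_fn):
    """Lowest-cost threshold among 0.01, 0.02, ..., 0.99 (first one on ties)."""
    candidates = [i / 100 for i in range(1, 100)]
    return min(
        candidates,
        key=lambda t: total_cost(y_true, proba, t, cost_fp, cost_fn),
    )


# Toy data (assumption): 1 = failing engine, 0 = good engine.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
proba = [0.10, 0.25, 0.40, 0.45, 0.55, 0.60, 0.80, 0.90]

# Missed failures ten times as costly as false alarms: threshold drops.
print(best_threshold(y_true, proba, cost_fp=1, cost_fn=10))  # 0.41

# False alarms ten times as costly as missed failures: threshold rises.
print(best_threshold(y_true, proba, cost_fp=10, cost_fn=1))  # 0.56
```

The same data yields different thresholds depending only on the cost ratio, which is why the threshold should be set from operational costs rather than left at a fixed default.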

## Citation

If you use this model, please cite:

```bibtex
@misc{predictive-maintenance-model-2026,
  title={Engine Predictive Maintenance Model},
  author={GreatLearning Capstone Team},
  year={2026},
  howpublished={Hugging Face Hub},
  url={https://huggingface.co/models/nilanjanadevc/engine-predictive-maintenance-model}
}
```

## License

This model is released under the MIT License. See the LICENSE file for details.

## Contact & Support

For questions or issues:
- GitHub: [Check repository](https://github.com/nilanjanadevc/predictive-engine-maintainence-mlops)
- Hugging Face: [@nilanjanadevc](https://huggingface.co/nilanjanadevc)