	---
	license: mit
	language:
	- en
	library_name: scikit-learn
	tags:
	- predictive-maintenance
	- random-forest
	- binary-classification
	- engine-maintenance
	datasets:
	- nasa-cmapss
	metrics:
	- accuracy
	- f1
	- f2
	- roc-auc
	---

	# Engine Predictive Maintenance Model

	## Model Overview
	This is a tuned Random Forest classifier for predictive engine maintenance. SMOTE oversampling is applied to the training data to handle class imbalance, with the aim of improving recall on failure detection.

	## Model Details
	- Model Type: Random Forest Classifier with SMOTE Pipeline
	- Framework: scikit-learn, imbalanced-learn
	- Task: Binary Classification (Engine Condition: Good/Failing)
	- Input Features: 14 engineered sensor features (RPM, pressure, temperature, etc.)
	- Output: Probability of engine failure (0-1)

	## Model Performance

	### Test Set Metrics

	| Metric | Score |
	|--------|-------|
	| Accuracy | 0.6340 |
	| Precision | 0.7456 |
	| Recall | 0.6366 |
	| F1 Score | 0.6868 |
	| F2 Score | 0.6558 |
	| ROC-AUC | 0.6893 |
	| Brier Score | 0.2195 |
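The F1 and F2 rows follow directly from the precision and recall rows. As a quick check, the sketch below recomputes them with the standard F-beta formula; the helper name `f_beta` is introduced here for illustration and is not part of the released model code.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score: weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily; beta = 1 gives the usual F1.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


precision, recall = 0.7456, 0.6366  # values from the metrics table

f1 = f_beta(precision, recall, beta=1.0)
f2 = f_beta(precision, recall, beta=2.0)
print(f"F1 = {f1:.4f}, F2 = {f2:.4f}")  # F1 = 0.6868, F2 = 0.6558
```

Because recall is lower than precision here, F2 (which weights recall more) comes out below F1.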

	## Key Insights
	- Recall (0.6366): Detects ~64% of actual failures, so roughly 36% of failures are missed
	- Precision (0.7456): ~75% of predicted failures are actual failures
	- ROC-AUC (0.6893): Moderate discrimination between failure and non-failure cases (0.5 corresponds to random ranking)
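One way to put the reported Brier score in context is to compare it with a constant predictor that always outputs the failure rate. The sketch below does that arithmetic. It assumes the test set has the same ~37% failure rate listed under Training Data, and the helper name `constant_brier` is hypothetical.

```python
def constant_brier(prevalence: float) -> float:
    """Brier score of a model that always predicts `prevalence` as P(failure)."""
    # Positives contribute (1 - p)^2 each, negatives contribute p^2 each.
    return prevalence * (1 - prevalence) ** 2 + (1 - prevalence) * prevalence ** 2


baseline = constant_brier(0.37)      # assumed failure rate (~37%)
model_brier = 0.2195                 # reported test-set Brier score
skill = 1 - model_brier / baseline   # Brier skill score; 0 = no better than baseline

print(f"baseline = {baseline:.4f}, skill = {skill:.3f}")  # baseline = 0.2331, skill = 0.058
```

Under that assumption the probabilities are only slightly better calibrated than the constant baseline, which is consistent with the moderate ROC-AUC.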

	## Intended Use

	This model is designed for:
	- Predictive Maintenance: Identify engines at risk of failure before breakdown
	- Condition Monitoring: Support data-driven maintenance decision-making
	- Fleet Management: Optimize maintenance scheduling and resource allocation
	- Risk Assessment: Provide failure probability scores for maintenance prioritization

	## Limitations

	- Trained on historical engine data with specific sensor configurations
	- Performance may vary with new sensor types or operating conditions
	- Model requires regular retraining with updated failure data
	- Does not capture temporal degradation patterns (time-series)
	- Assumes consistent sensor calibration and operating conditions

	## Training Data

	- Dataset: Engine Predictive Maintenance Dataset
	- Total Samples: 19,581 engines
	- Training Samples: 13,674 (70%)
	- Test Samples: 3,907 (20%)
	- Remaining Samples: 2,000 (10%)
	- Features: 14 engineered features (6 raw + 8 derived)
	- Class Distribution: Imbalanced (Good: ~63%, Failure: ~37%)

	## Training Procedure

	1. Data preprocessing and feature engineering
	2. Three-way data split (70-20-10), with 70% used for training and 20% for testing
	3. SMOTE oversampling on training data to handle class imbalance
	4. Hyperparameter tuning via GridSearchCV with 5-fold cross-validation
	5. Model evaluation on held-out test set
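A 70-20-10 split like the one in step 2 can be produced with two calls to scikit-learn's `train_test_split`. The following is a minimal sketch on synthetic data, not the project's actual code. Stratifying on the label is an assumption, and the card does not say how the last 10% is used, so the name `X_val` is a placeholder.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n_samples = 1000
X = rng.normal(size=(n_samples, 14))            # synthetic stand-in for the 14 features
y = (rng.random(n_samples) < 0.37).astype(int)  # synthetic labels, ~37% positives

# Stage 1: keep 70% for training, set aside the other 30%.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, train_size=0.70, stratify=y, random_state=42
)

# Stage 2: split the 30% into 20% test and 10% remainder (2/3 vs 1/3 of it).
# How the last 10% is used is an assumption; it is called "val" here.
X_test, X_val, y_test, y_val = train_test_split(
    X_rest, y_rest, train_size=2 / 3, stratify=y_rest, random_state=42
)

print(len(X_train), len(X_test), len(X_val))
```

Stratifying both stages keeps the class ratio roughly the same in every part, which matters when SMOTE is later applied to the training part only.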

	## Hyperparameters
	- n_estimators: 400
	- max_depth: 12
	- min_samples_leaf: 4
	- SMOTE k_neighbors: 5
	- Random state: 42

	## Recommendations

	1. Threshold Tuning: Adjust decision threshold based on cost of false positives vs. false negatives
	2. Continuous Monitoring: Track model performance in production and retrain quarterly with new data
	3. Feature Importance: Use SHAP or feature importance analysis to identify critical sensors
	4. Ensemble Approaches: Consider combining with other models (XGBoost, LightGBM) for robust predictions
	5. Domain Expertise: Combine predictions with expert knowledge for final maintenance decisions
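The threshold tuning in recommendation 1 can be done by scanning candidate thresholds and picking the one with the lowest total misclassification cost. The following is a minimal sketch in plain Python. The function name `pick_threshold`, the cost values, and the toy data are all invented for illustration.

```python
def pick_threshold(y_true, y_proba, cost_fn=5.0, cost_fp=1.0):
    """Return (threshold, total_cost) with the lowest total misclassification cost.

    cost_fn is the cost of a missed failure, cost_fp the cost of a false alarm.
    Ties are resolved in favor of the lowest threshold.
    """
    best_threshold, best_cost = 0.5, float("inf")
    for step in range(1, 100):  # candidate thresholds 0.01 .. 0.99
        threshold = step / 100
        fn = sum(1 for t, p in zip(y_true, y_proba) if t == 1 and p < threshold)
        fp = sum(1 for t, p in zip(y_true, y_proba) if t == 0 and p >= threshold)
        cost = cost_fn * fn + cost_fp * fp
        if cost < best_cost:
            best_threshold, best_cost = threshold, cost
    return best_threshold, best_cost


# Toy labels (1 = failure) and predicted failure probabilities.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_proba = [0.10, 0.35, 0.40, 0.45, 0.55, 0.70, 0.20, 0.90]

threshold, cost = pick_threshold(y_true, y_proba)
print(threshold, cost)  # 0.36 1.0
```

With a missed failure costed at five times a false alarm, the chosen threshold sits below 0.5. Reversing the costs (`cost_fn=1.0, cost_fp=5.0`) moves it up to 0.46 on the same toy data.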

	## Citation

	If you use this model, please cite:

	```
	@misc{predictive-maintenance-model-2026,
	title={Engine Predictive Maintenance Model},
	author={GreatLearning Capstone Team},
	year={2026},
	howpublished={Hugging Face Hub},
	url={https://huggingface.co/models/nilanjanadevc/engine-predictive-maintenance-model}
	}
	```

	## License

	This model is released under the MIT License. See the LICENSE file for details.

	## Contact & Support

	For questions or issues:
	- GitHub: [Check repository](https://github.com/nilanjanadevc/predictive-engine-maintainence-mlops)
	- Hugging Face: [@nilanjanadevc](https://huggingface.co/nilanjanadevc)