CBC Reference Model: Turbofan Remaining Useful Life (C-MAPSS FD001)
Pre-trained reference model for the CBC MLOps 100-Day Track (Capstone 4). Published twin of ML Development Capstone 4. This is the one deep-learning reference model: it ships as a torch state_dict + model.py, NOT a single joblib.
Model details
- Type: single-layer LSTM (hidden 64) over a 30-cycle window of 15 normalized sensors -> scalar RUL (capped at 125).
- Framework: pytorch 2.12.0+cpu · Serialization: manufacturing_lstm.pt (state_dict) + manufacturing_meta.joblib (normalization + arch). Reconstruct with the shipped model.py.
- The LSTM is the rare counterpoint to the classical models: run-to-failure multivariate sensor sequences are the data shape deep learning is for. It beats an XGBoost baseline (RMSE 18.45) overall and decisively near failure.
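To make the shapes concrete, here is a minimal sketch of an architecture matching the description above (one LSTM layer, hidden size 64, 30 cycles of 15 sensors in, one scalar out). The class name `RulLstmSketch` and the linear output head are assumptions for illustration; the real class definition is the one in the shipped model.py and may differ.

```python
import torch
from torch import nn


class RulLstmSketch(nn.Module):
    """Hypothetical stand-in for the class defined in the shipped model.py."""

    def __init__(self, n_sensors: int = 15, hidden: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_sensors, hidden_size=hidden,
                            num_layers=1, batch_first=True)
        # Assumed regression head: one linear layer on the last hidden state.
        self.head = nn.Linear(hidden, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 30 cycles, 15 sensors), already normalized
        out, _ = self.lstm(x)          # out: (batch, 30, hidden)
        last = out[:, -1, :]           # LSTM output at the final cycle
        return self.head(last).squeeze(-1)  # (batch,) scalar RUL per window


model = RulLstmSketch().eval()
with torch.no_grad():
    rul = model(torch.zeros(2, 30, 15))  # two dummy windows
print(rul.shape)
```

A state_dict only loads into a class whose parameter names match, so this stand-in is not expected to accept manufacturing_lstm.pt; use `load_model` from the shipped model.py for the real weights.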
Intended use
Decision-support estimate of operational cycles remaining, to prioritize inspection/maintenance. NOT an automated ground-or-fly authority. Teaching/reference artifact.
Training data
NASA C-MAPSS FD001 (100 train + 100 test run-to-failure engines, single operating condition). 15 non-flat sensors used. Simulated, no PII. NASA Open Data (public domain).
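The card does not spell out how training targets were built beyond the 125-cycle cap. A common convention for C-MAPSS, assumed here, is a piecewise-linear label: cycles remaining until failure, clipped at the cap, with each 30-cycle window paired with the label of its last cycle. The helper names below are hypothetical and use plain Python lists to stay dependency-free.

```python
RUL_CAP = 125  # cap stated in the model card
WINDOW = 30    # window length stated in the model card


def capped_rul_labels(n_cycles: int, cap: int = RUL_CAP) -> list[int]:
    """RUL label per cycle of one run-to-failure engine (last cycle -> 0)."""
    return [min(n_cycles - 1 - t, cap) for t in range(n_cycles)]


def sliding_windows(rows: list[list[float]], window: int = WINDOW) -> list[list[list[float]]]:
    """All consecutive windows of `window` cycles; each row is one cycle's sensors."""
    return [rows[i:i + window] for i in range(len(rows) - window + 1)]


# Illustrative engine: 200 cycles of 15 dummy sensor readings.
rows = [[0.0] * 15 for _ in range(200)]
labels = capped_rul_labels(len(rows))
windows = sliding_windows(rows)
# Assumed pairing: each window is labeled with the RUL at its final cycle.
targets = labels[WINDOW - 1:]
print(len(windows), targets[0], targets[-1])
```

An engine with fewer than 30 recorded cycles yields no windows under this scheme, which is one reason serving needs the full window.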
Metrics (test = last cycle of each test engine vs RUL_FD001, scored once)
| Model | Test RMSE | Near-failure, RUL in [0, 50) |
|---|---|---|
| XGBoost baseline | 18.45 | — |
| LSTM (deployed) | 14.88 | 4.78 |
The LSTM wins overall and is decisively better in the operationally critical near-failure band.
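For readers who want to reproduce this kind of scoring, the sketch below computes an overall RMSE and an RMSE restricted to engines whose true RUL falls in [0, 50). Reading the near-failure column as a band-restricted RMSE is an assumption, and the numbers in the example are made up for illustration; they are not the model's predictions.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error over paired true/predicted RUL values."""
    if not y_true:
        raise ValueError("no samples to score")
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def band_rmse(y_true, y_pred, lo=0, hi=50):
    """RMSE over samples whose TRUE RUL lies in the half-open band [lo, hi)."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if lo <= t < hi]
    return rmse([t for t, _ in pairs], [p for _, p in pairs])


# Made-up values: one entry per test engine (last-cycle prediction vs. true RUL).
true_rul = [10, 40, 80, 120]
pred_rul = [13, 36, 80, 126]
overall = rmse(true_rul, pred_rul)
near = band_rmse(true_rul, pred_rul)
print(round(overall, 3), round(near, 3))
```

Filtering on the true RUL rather than the predicted one keeps the band definition independent of the model being scored.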
How to load and predict
```python
import json
import os
import sys

from huggingface_hub import snapshot_download

d = snapshot_download("careerbytecode/mlops-ref-manufacturing-rul")
# Put model/ and the repo root on the import path so the shipped model.py can be found.
sys.path.insert(0, os.path.join(d, "model"))
sys.path.insert(0, d)
from model import load_model, predict_rul  # needs torch installed

model, meta = load_model(os.path.join(d, "model"))
with open(os.path.join(d, "sample_input.json")) as f:
    sample = json.load(f)
print(predict_rul(model, meta, sample["window"]))  # predicted RUL (cycles)
```
Serving requires torch and the shipped model.py (the class definition) — a joblib load alone will not work.
Limitations
- Trained on FD001 (one operating condition); FD002/FD004 (six conditions) need condition-aware normalization.
- RUL capped at 125: cannot distinguish a very-healthy from a merely-healthy engine, by design.
- Needs the full 30-cycle window + the training normalization stats to serve; a single reading is not enough.
- Simulated data: expect drift on real telemetry.
- Reference/teaching artifact only.
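The window and normalization requirement can be sketched as a pre-serving check. The card says the meta file holds normalization stats but not which kind, so per-sensor z-scoring with `mean` and `std` lists is an assumption here, and `normalize_window` is a hypothetical helper, not part of the shipped model.py.

```python
WINDOW, N_SENSORS = 30, 15  # both stated in the model card


def normalize_window(window, mean, std):
    """Validate the window shape, then z-score each sensor with training-set stats."""
    if len(window) != WINDOW:
        raise ValueError(f"expected {WINDOW} cycles, got {len(window)}")
    if any(len(row) != N_SENSORS for row in window):
        raise ValueError(f"every cycle needs {N_SENSORS} sensor values")
    # Assumed normalization: (x - mean) / std per sensor, stats from training data.
    return [[(x - m) / s for x, m, s in zip(row, mean, std)] for row in window]


# Dummy stats and readings for illustration only.
mean = [1.0] * N_SENSORS
std = [2.0] * N_SENSORS
window = [[3.0] * N_SENSORS for _ in range(WINDOW)]
normed = normalize_window(window, mean, std)

try:
    normalize_window(window[:1], mean, std)  # a single reading is rejected
    single_reading_rejected = False
except ValueError:
    single_reading_rejected = True
print(normed[0][0], single_reading_rejected)
```

Reusing the training statistics at serving time, rather than recomputing them on incoming telemetry, keeps inputs on the scale the model was fitted to.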
© 2015-2026 CareerByteCode. All rights reserved. | CC BY-NC-SA 4.0 (docs), MIT (code) | Authored by Raghavendra R, Platform Owner CareerByteCode, Solution Architect