# Hybrid ML-MCDM Framework for EV Battery End-of-Life Routing in India
Trained machine-learning and decision-theory artefacts for the Hybrid ML-MCDM Framework for EV Battery End-of-Life Routing in India research project. These weights power the live Streamlit demo that walks a single cell through anomaly gating → State-of-Health prediction → Remaining-Useful-Life estimation → regulatory-aware routing → Digital Product Passport emission.
The release bundles eight components that operate as an ensemble on a curated corpus of 1,581 cells across 7 chemistries (NMC · LFP · NCA · LCO · Zn-ion · Na-ion · other), drawn from public lab datasets (BatteryLife, NASA-PCOE, CALCE, Stanford) and supplemented with 130 synthetic Indian-context cells generated via PyBaMM (NMC) and BLAST-Lite (LFP).
## Components
| Component | Artefact | Role |
|---|---|---|
| Tree-based anomaly check | `isolation_forest/isolation_forest.pkl` | Isolation Forest (Liu et al. 2008); flags cells outside the training distribution |
| Neural-network anomaly check | `vae_anomaly/best.pt` | Variational Autoencoder, latent dim 12, β-annealed |
| Anomaly-gate featurizer | `anomaly_shared/feature_scaler.pkl` + `feature_meta.json` | StandardScaler + categorical metadata, 38 features |
| Global SoH regressor | `xgboost_soh/xgboost_soh_audited.json` | XGBoost over 32 audited features (capacity-derived columns excluded) |
| Per-chemistry SoH router | `per_chemistry/*` | Seven chemistry-specific XGBoost specialists + dispatch manifest |
| RUL regressor (classical) | `xgboost_rul/xgboost_rul_audited_uncensored.json` | XGBoost trained on uncensored cells only |
| RUL regressor (deep learning) | `tcn_rul/best.pt` | Temporal Convolutional Network (Bai et al. 2018), 60-cycle sequence input |
| TCN featurizer | `tcn_rul/feature_scaler.pkl` + `feature_meta.json` | Numeric-only scaler matching the TCN's training-time pipeline |
Featurizers (scalers + category metadata + per-feature medians for NaN imputation) are bundled alongside each predictor so inference is fully reproducible without re-running the training pipeline.
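As a sketch of how a bundled featurizer might be applied at inference time: median imputation for missing values, then the saved scaler. The `features` and `medians` keys on the metadata dict are assumed names for illustration, not confirmed fields of `feature_meta.json`.

```python
import numpy as np

def featurize(raw: dict, scaler, meta: dict) -> np.ndarray:
    """Build one (1, n_features) scaled row, imputing missing values
    with the per-feature medians carried in the featurizer metadata."""
    row = []
    for name in meta["features"]:              # assumed metadata key
        value = raw.get(name)
        if value is None or (isinstance(value, float) and np.isnan(value)):
            value = meta["medians"][name]      # assumed metadata key
        row.append(value)
    return scaler.transform(np.asarray(row, dtype=float).reshape(1, -1))
```

Because the medians are stored alongside the scaler, the same imputation happens at inference as at training time, which is what makes the bundled artefacts reproducible on their own.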
## Headline numbers
Validated on held-out test cells (no overlap with training):
- Audited XGBoost SoH — test RMSE 2.43 percentage points · R² 0.996
- Chemistry-router SoH (aggregate) — test RMSE 1.89 pp · lifts grade-accuracy on minority chemistries by up to +5.33 pp (NCA), +3.37 pp (Zn-ion), +4.60 pp ("other") vs the global model alone
- XGBoost RUL — 1.92 % RMSE-of-range on the uncensored test partition
- TCN RUL — 2.23 % RMSE-of-range on the same partition
- Isolation Forest + VAE anomaly gates — calibrated at the 5th-percentile train-error threshold; flag rate on held-out test ≈ 5.7 %
Per-chemistry breakdowns and the corresponding manifest files are included alongside each predictor.
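The RUL figures above are reported as "RMSE-of-range". Reading that as RMSE normalised by the span of the true target on the test partition (an interpretation, since the card does not spell the metric out), a minimal sketch is:

```python
import numpy as np

def rmse_of_range(y_true, y_pred) -> float:
    """RMSE divided by the span of the true target, as a percentage."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return 100.0 * rmse / (y_true.max() - y_true.min())
```

Normalising by range makes the classical and deep-learning RUL models comparable even though cells differ widely in total cycle life.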
## How to load
```python
from huggingface_hub import hf_hub_download
import xgboost as xgb
import joblib

REPO_ID = "cmpunkmannu/hybrid-ml-mcdm-battery-eol"

# Global audited SoH predictor
model_path = hf_hub_download(repo_id=REPO_ID, filename="xgboost_soh/xgboost_soh_audited.json")
scaler_path = hf_hub_download(repo_id=REPO_ID, filename="xgboost_soh/feature_scaler_audited.pkl")

soh_model = xgb.XGBRegressor()
soh_model.load_model(model_path)
scaler = joblib.load(scaler_path)

# Featurize one cell at one cycle → (1, 32) scaled array → predict
# soh_pred = float(soh_model.predict(scaler.transform(x))[0])
```
Per-chemistry specialists follow the same pattern, dispatched by the cell's detected chemistry; see the live demo's `frontend/components/soh_grade.py` for the full inference pipeline.
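The dispatch manifest's format is not documented on this card, so the sketch below assumes a flat chemistry-to-filename mapping; `resolve_artefact` and `GLOBAL_SOH` are hypothetical names. Loading then mirrors the global-model snippet above.

```python
GLOBAL_SOH = "xgboost_soh/xgboost_soh_audited.json"

def resolve_artefact(chemistry: str, manifest: dict) -> str:
    """Map a detected chemistry to its specialist artefact path,
    falling back to the global audited model when none is listed."""
    return manifest.get(chemistry, GLOBAL_SOH)

# Loading then follows the global-model pattern:
#   path = hf_hub_download(repo_id=REPO_ID, filename=resolve_artefact(chem, manifest))
#   model = xgb.XGBRegressor(); model.load_model(path)
```

A fallback to the global model keeps the router total: any chemistry the anomaly gate passes still gets a prediction, even if no specialist exists for it.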
## Intended use
Research and demonstration. The framework targets:
- Academic reproducibility — every reported metric can be reproduced from these artefacts plus the public training corpus indices
- Regulatory exploration — the multi-criteria routing engine (Fuzzy BWM + TOPSIS) supports five weight regimes operationalising the EU Battery Regulation 2023/1542, GBA Battery Pass v1.2, and India BWMR 2022 / 2024 / 2025 amendments
- Demonstration of end-to-end ML → MCDM → Digital Product Passport pipelines for electric-vehicle battery EoL decisions
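For the routing step, the TOPSIS half of the engine is the standard closeness-to-ideal ranking; the sketch below shows textbook crisp TOPSIS only, with the weight vector taken as given (in the framework it would come from Fuzzy BWM under one of the five regulatory regimes, which is not reproduced here).

```python
import numpy as np

def topsis(matrix: np.ndarray, weights: np.ndarray, benefit: np.ndarray) -> np.ndarray:
    """Closeness scores in [0, 1] per alternative; higher = closer to ideal.

    matrix:  (n_alternatives, n_criteria) decision matrix
    weights: criterion weights summing to 1
    benefit: True where larger is better, False for cost criteria
    """
    norm = matrix / np.linalg.norm(matrix, axis=0)           # vector normalisation
    v = norm * weights                                        # weighted matrix
    ideal = np.where(benefit, v.max(axis=0), v.min(axis=0))   # best per criterion
    anti = np.where(benefit, v.min(axis=0), v.max(axis=0))    # worst per criterion
    d_pos = np.linalg.norm(v - ideal, axis=1)
    d_neg = np.linalg.norm(v - anti, axis=1)
    return d_neg / (d_pos + d_neg)
```

Ranking EoL routes (reuse, repurpose, recycle, ...) then amounts to sorting alternatives by their closeness score under the active weight regime.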
These artefacts are not deployment-ready for live BMS data, fleet management systems, or regulator-submission pipelines. The training corpus is curated lab data plus a small (130-cell) synthetic Indian-context augmentation; generalisation to real fleet data remains future work.
## Training corpus
1,581 cells across 7 chemistries · 2.9 million cycle-level rows. Real cells from:
- BatteryLife (multi-source aggregation, 850+ cells)
- NASA-PCOE Random-Walk + Recommissioning batteries
- CALCE (CS, CX, INR series)
- Stanford (Severson / Attia)
- Plus 130 synthetic Indian-context cells: PyBaMM electrochemical (4 climates × NMC) + BLAST-Lite semi-empirical (4 climates × LFP)
Splits are leakage-controlled (cells never overlap across train/val/test) and chemistry-stratified.
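The leakage-control idea (a cell's cycles never straddle partitions) can be illustrated with a group-aware splitter; this is a sketch of the principle, not the project's actual splitting code, and it omits the chemistry stratification.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

def split_by_cell(cell_ids: np.ndarray, test_size: float = 0.2, seed: int = 0):
    """Train/test row indices such that no cell id appears in both partitions."""
    gss = GroupShuffleSplit(n_splits=1, test_size=test_size, random_state=seed)
    X = np.zeros((len(cell_ids), 1))  # features are unused by the splitter
    train_idx, test_idx = next(gss.split(X, groups=cell_ids))
    return train_idx, test_idx
```

Splitting on cell id rather than on cycle rows is what prevents a model from memorising the early cycles of a cell and being evaluated on its later ones.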
## Citation
Manuscript submitted to MDPI Batteries / World Electric Vehicle Journal; the citation will be updated here once the paper is published.
If you use these models in research before publication, please cite:
```bibtex
@misc{kumar2026hybrid,
  author       = {Kumar, Rishabh},
  title        = {Hybrid ML-MCDM Framework for EV Battery End-of-Life Routing in India},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/cmpunkmannu/hybrid-ml-mcdm-battery-eol}}
}
```
## License
Released under the MIT License. The training corpus is itself drawn from public datasets, each released under its own permissive licence (BatteryArchive variants, CC-BY-4.0, MIT, and CC0).
## Contact
Rishabh Kumar — rishabhkumards07@gmail.com · linkedin.com/in/rishabh-kumar-815601230 · github.com/Rishabhmannu