Lithium-ion Battery State of Charge (SoC) Estimator
A machine learning model based on scikit-learn's RandomForestRegressor, intended for real-time State of Charge (SoC) estimation in Battery Management Systems (BMS). The model is trained on the McMaster LG 18650HG2 dataset and uses dynamic features (voltage and current gradients, rolling statistics, and cumulative Ah) to mitigate the classic extrapolation blind spot of tree-based models.
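The extrapolation blind spot can be shown in a few lines. The following is a minimal sketch on synthetic data (not the McMaster dataset, and not this model's training code): a forest fitted on a linear target cannot predict beyond the largest target value it saw, because every tree outputs an average of training targets. The data, the forest size, and all variable names are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic, purely illustrative data: a linear target y = 2x on x in [0, 1].
rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, size=(500, 1))
y_train = 2.0 * x_train.ravel()

forest = RandomForestRegressor(n_estimators=50, random_state=0)
forest.fit(x_train, y_train)

# Query points well outside the training range; the true values are 3.0 and 6.0.
x_outside = np.array([[1.5], [3.0]])
pred_outside = forest.predict(x_outside)

# Each leaf stores a mean of training targets, so no prediction can exceed
# the largest target seen during training (just under 2.0 here).
print("predictions:", pred_outside)
print("largest training target:", y_train.max())
```

Both out-of-range queries fall into the same leaf of every tree, so they receive the same prediction, and that prediction stays below the training maximum.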
Model Summary
- Architecture: RandomForestRegressor (scikit-learn)
- Target Variable: State of Charge (SoC, scaled 0.0 to 1.0)
- Input Dimensions: 18 continuous features (electrical and thermal parameters)
- Data Source: McMaster LG 18650HG2 Li-ion Battery Dataset (LA92 / Dynamic Driving Profiles)
Features & Feature Engineering
The model combines raw telemetry values with gradients and rolling-window statistics to capture electrochemical polarization and relaxation trends:
- Direct Attributes: Voltage (V), Current (A), Temperature (°C)
- State Tracking: Cumulative Ah (a Coulomb-counting proxy)
- Transient Gradients: dV/dt and dI/dt
- Temporal Windows: 5-, 10-, and 20-sample rolling means and standard deviations for both voltage and current.
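One way to derive these 18 features from raw telemetry is sketched below with pandas. This is an illustration under stated assumptions, not the author's preprocessing code: the input column Time (in seconds), the rectangular Ah integration, the sign convention for current, and dropping the warm-up rows are all hypothetical choices. Only the output column names are taken from the feature list in the usage section, and the published scaler expects features computed the way the author computed them.

```python
import pandas as pd

WINDOWS = (5, 10, 20)  # rolling window lengths, in samples


def build_features(raw: pd.DataFrame) -> pd.DataFrame:
    """Build the 18 feature columns from Voltage, Current, Temperature, Time.

    Assumes `raw` is sorted by time and `Time` is in seconds (hypothetical).
    """
    out = pd.DataFrame(index=raw.index)
    out["Voltage"] = raw["Voltage"]
    out["Current"] = raw["Current"]
    out["Temperature"] = raw["Temperature"]

    dt = raw["Time"].diff()  # seconds between samples; NaN for the first row

    # Coulomb counting: integrate current over time, converted from As to Ah.
    out["Cumulative_Ah"] = (raw["Current"] * dt.fillna(0.0)).cumsum() / 3600.0

    # Finite-difference gradients.
    out["dV_dt"] = raw["Voltage"].diff() / dt
    out["dI_dt"] = raw["Current"].diff() / dt

    # Rolling mean and standard deviation for voltage and current.
    for w in WINDOWS:
        for short, col in (("V", "Voltage"), ("I", "Current")):
            roll = raw[col].rolling(window=w)
            out[f"{short}_roll_mean_{w}"] = roll.mean()
            out[f"{short}_roll_std_{w}"] = roll.std()

    # Rows before the longest window has filled contain NaN; drop them.
    return out.dropna()
```

With a 20-sample window, the first 19 rows of any recording are discarded, so a live system needs at least 20 samples of history before the first estimate.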
Final Performance Evaluation
Evaluated on unseen drive cycles held out in chronological order, to mirror deployment conditions:
- R² Score: ≥ 0.92 (at least 92% of the SoC variance explained)
- Mean Absolute Error (MAE): < 3% SoC (0.03 on the 0.0 to 1.0 scale) across the full operating range
- Target Status: All verification limits and accuracy benchmarks successfully met.
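The metrics above can be reproduced with a few lines of NumPy. The sketch below is not the author's evaluation script: the helper names, the 25% hold-out fraction, and the reading of "3% SoC" as 0.03 on the 0.0 to 1.0 scale are assumptions made for illustration. The split keeps the final part of the time series as the test block, so no future sample leaks into training.

```python
import numpy as np


def chronological_split(n_samples, test_fraction=0.25):
    """Return (train_idx, test_idx) with the test block at the end of the series."""
    n_test = int(round(n_samples * test_fraction))
    cut = n_samples - n_test
    return np.arange(cut), np.arange(cut, n_samples)


def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def meets_targets(y_true, y_pred, r2_min=0.92, mae_max=0.03):
    """Check the two thresholds listed above (SoC on the 0.0 to 1.0 scale)."""
    return bool(r2(y_true, y_pred) >= r2_min and mae(y_true, y_pred) < mae_max)
```

A typical use would be to fit on the rows selected by train_idx, predict on test_idx, and pass the held-out SoC values and predictions to meets_targets.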
Implementation & Usage
To run inference on live telemetry samples, download both the model and its matching feature scaler:
import joblib
import pandas as pd
from huggingface_hub import hf_hub_download
# 1. Download components from the Hub
repo_id = "UNUSUALxd/battery-soc-random-forest"
model_path = hf_hub_download(repo_id=repo_id, filename="rf_soc_estimator.joblib")
scaler_path = hf_hub_download(repo_id=repo_id, filename="minmax_scaler.joblib")
# 2. Load into memory
model = joblib.load(model_path)
scaler = joblib.load(scaler_path)
# 3. Format input frame (Must match original feature mapping order exactly)
features_order = [
    'Voltage', 'Current', 'Temperature', 'Cumulative_Ah', 'dV_dt', 'dI_dt',
    'V_roll_mean_5', 'V_roll_std_5', 'I_roll_mean_5', 'I_roll_std_5',
    'V_roll_mean_10', 'V_roll_std_10', 'I_roll_mean_10', 'I_roll_std_10',
    'V_roll_mean_20', 'V_roll_std_20', 'I_roll_mean_20', 'I_roll_std_20',
]
# Create your raw DataFrame matching the column ordering above
# sample_df = pd.DataFrame([your_raw_values], columns=features_order)
# 4. Scale features and run estimator
# scaled_features = scaler.transform(sample_df)
# predicted_soc = model.predict(scaled_features)[0]
# print(f"Estimated SoC: {predicted_soc * 100:.2f}%")
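Because the column order must match the training order exactly, it can help to wrap steps 3 and 4 in a small function that reorders the columns and fails loudly when one is missing. The helper below is a hypothetical convenience, not part of the published repository. It assumes only that the scaler exposes transform and the model exposes predict, as scikit-learn objects loaded with joblib do.

```python
import numpy as np
import pandas as pd

FEATURES_ORDER = [
    'Voltage', 'Current', 'Temperature', 'Cumulative_Ah', 'dV_dt', 'dI_dt',
    'V_roll_mean_5', 'V_roll_std_5', 'I_roll_mean_5', 'I_roll_std_5',
    'V_roll_mean_10', 'V_roll_std_10', 'I_roll_mean_10', 'I_roll_std_10',
    'V_roll_mean_20', 'V_roll_std_20', 'I_roll_mean_20', 'I_roll_std_20',
]


def estimate_soc(model, scaler, frame, features_order=FEATURES_ORDER):
    """Return SoC estimates (0.0 to 1.0) for each row of `frame`.

    Hypothetical helper: selects the expected columns in training order,
    scales them, and runs the estimator.
    """
    missing = [name for name in features_order if name not in frame.columns]
    if missing:
        raise KeyError(f"Missing feature columns: {missing}")

    ordered = frame[list(features_order)]  # enforce the training column order
    scaled = scaler.transform(ordered)
    return np.asarray(model.predict(scaled), dtype=float)
```

Called as estimate_soc(model, scaler, sample_df), it returns one estimate per row, and any extra columns in sample_df are ignored.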