The Miller Harmonic Method: A Fixed Modular Feature Map for Early-Cycle Battery State-of-Health Prediction

William T. L. Miller

Preprint. Code: https://huggingface.co/williamTLmiller/batterymhm. License: CC BY-NC 4.0. Method patent pending.

Abstract

Predicting the state-of-health (SOH) of lithium-ion cells from only the first few dozen cycles is valuable for grading, second-life sorting, and early failure screening, but strong published models typically rely on deep sequence networks and hundreds of cycles of aging data. We present the Miller Harmonic Method (MHM), a fixed, fully reproducible feature map that projects any measured curve into a nine-class modular ("harmonic") digit space via the fold map HIN(k) = 1 + ((k-1) mod 9), scores class interactions through a hand-specified 9x9 compatibility ("Chi") matrix, and aggregates the result across a Fibonacci-like multi-scale grid into a 557-dimensional descriptor. The descriptor is consumed by a light CPU-only tree ensemble (ExtraTrees plus XGBoost). MHM is a heuristic, empirically validated inductive prior, not a first-principles physical model. On the MIT-Stanford-TRI dataset (Severson et al., 2019; 144 cells, 5-fold CV, 30% observation window) it attains MAE 0.0114 and RMSE 0.0200, matching or narrowly improving on the best reported values for those two metrics on this benchmark, while trailing a published attentive Neural-ODE on Pearson correlation, Spearman correlation, and R^2. On Matbench mp_e_form it reaches 0.1513 eV/atom on the matminer split and 0.1215 eV/atom on the official Matbench fold 0; only the fold-0 result is below the classic random-forest baseline (roughly 0.132 eV/atom), and both are far behind modern graph neural networks. We release the complete method open-source (no trained weights) and discuss its limitations: a small dataset, an unablated hand-set matrix, and untested generalization.

1. Introduction

Lithium-ion battery state-of-health (SOH) prediction is usually framed as a data-hungry problem. The most accurate published models observe long aging trajectories, or use deep sequence architectures such as recurrent networks, temporal convolutions, or Neural-ODEs, often trained on GPUs. In many practical settings, however, the useful signal must be extracted from only the first tens of cycles: cell graders, second-life sorters, and early-warning screeners rarely have the luxury of full aging curves, and they frequently run on modest hardware without a GPU.

This paper explores a deliberately opposite point on the design space. Rather than adding model depth, MHM invests in representation: it applies a fixed, hand-designed, non-learned transform to the raw curve and then hands a small, generic tree ensemble the job of regression. The central thesis is a representation-versus-depth trade-off: if the feature map encodes enough of the relevant structure, a shallow model can be competitive with a deep one on the metric that matters most for grading (mean absolute error), at a tiny fraction of the compute and data-window cost.

The MHM feature map is built from modular ("harmonic") arithmetic. Every integer or quantised measurement is folded into a nine-element digit space D = {1, ..., 9} by HIN(k) = 1 + ((k-1) mod 9) (HIN = "Harmonic Identity Number"). Interactions between these classes are scored through a fixed 9x9 compatibility matrix and aggregated across a Fibonacci-like multi-scale grid. We stress at the outset that this construction is a heuristic inductive prior. We do not claim it is derived from first principles of electrochemistry or crystallography; its value is established empirically, by measured performance on public benchmarks, and by the observation that the harmonic-derived features dominate the model's importance ranking.

Our contributions are:

- A precise, open specification of the MHM feature map: the fold map, four modular binary operations, a fixed compatibility matrix with a matter/energy split, a multi-scale aggregation grid, and a discrete "Miller calculus."
- A 557-dimensional descriptor assembled from eleven feature groups, computed identically for battery curves and for crystal neighbour lists.
- An empirical evaluation on the canonical MIT-Stanford-TRI SOH benchmark, where MHM attains the best reported MAE and RMSE while running CPU-only in seconds, together with an honest accounting of the metrics on which it does not lead.
- A secondary materials-science evaluation (formation energy) that we frame plainly as below state-of-the-art.

We report every performance number exactly as measured and are explicit about the method's limitations throughout.

2. Related Work

Battery state-of-health. Severson et al. (2019) released the MIT-Stanford-TRI dataset and showed that early-cycle discharge features predict eventual cycle life, establishing the benchmark used here. Microsoft's BatteryML (Zhang et al., ICLR 2024) provides a standardized pipeline and strong classical baselines, including a random forest whose reported SOH MAE on this setting is 0.2459. Deep sequence models have pushed accuracy further; in particular, the attentive Neural-ODE of Li et al. (2021) reports MAE 0.012, RMSE 0.020, Pearson 0.900, Spearman 0.880, and R^2 0.810 on this task. MHM matches or narrowly beats that model on MAE and RMSE but trails it on the three correlation/variance-explained metrics, as we detail in Sections 5.1 and 7.

Materials property prediction. Matbench (Dunn et al., 2020) is the standard benchmark suite for materials property regression, with mp_e_form (formation energy) among its most-studied tasks. Its classic random-forest-plus-Magpie baseline reaches roughly 0.132 eV/atom. Graph neural networks dominate this task: CGCNN (Xie & Grossman, 2018) at roughly 0.049, MEGNet (Chen et al., 2019) at roughly 0.030, DimeNet++ at roughly 0.022, M3GNet (Chen & Ong, 2022) at roughly 0.018, and CHGNet (Deng et al., 2023) at roughly 0.015 eV/atom. MHM is not competitive with these graph models on formation energy, and we treat its materials results as a discovery-pipeline component rather than a state-of-the-art claim.

Harmonic and modular feature maps. Using modular arithmetic (digit sums, casting-out-nines) and fixed compatibility tables as feature transforms is unusual in modern machine learning, where learned embeddings dominate. MHM is closest in spirit to hand-crafted descriptor engineering (as in classical cheminformatics and the Magpie composition descriptors) but distinctive in its use of a nine-class modular fold and a fixed pairwise compatibility matrix as the primary representation. We position MHM as a fixed feature engineering approach and validate it empirically.

Ensembles. The regressor is a standard blend of Extremely Randomized Trees (Geurts et al., 2006) and gradient-boosted trees (XGBoost; Chen & Guestrin, 2016), chosen for speed and robustness on small tabular problems.

3. Method

3.1 The harmonic digit space and the fold map

MHM operates in the finite digit space D = {1, 2, ..., 9}. Any integer k is projected into D by the fold map

HIN(k) = 1 + ((k - 1) mod 9).

Two integers that differ by a multiple of nine share a harmonic class. Applied to an atomic number Z, this yields the element's HIN; applied to a bin index it yields a measurement's HIN. In code the general fold is f9.
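As a quick illustration of the fold map, the following minimal sketch re-implements it in Python. The name `fold9` is chosen here for illustration and is not the package's own `f9`; the behaviour for `k <= 0` simply follows Python's floor-modulo convention, which the text above does not specify.

```python
def fold9(k: int) -> int:
    """Fold any integer into the digit space {1, ..., 9}."""
    return 1 + ((k - 1) % 9)


# Integers that differ by a multiple of nine land in the same class.
examples = {k: fold9(k) for k in (1, 9, 10, 18, 26, 27)}
print(examples)  # {1: 1, 9: 9, 10: 1, 18: 9, 26: 8, 27: 9}
```

For instance, iron (Z = 26) folds to class 8, and any multiple of nine folds to class 9 rather than 0, which is what distinguishes this map from a plain `k mod 9`.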

Quantising a physical curve into HINs. A real-valued sequence (capacity, voltage, or temperature versus cycle) is mapped into D by uniform binning. The range [min, max] is divided into nine equal intervals and each value is assigned the index of its interval, clipped to {1, ..., 9} (a constant series maps to all 5s). This seq_to_harmonics step turns, for example, an early-cycle capacity fade curve into a short integer sequence such as [5, 5, 4, 4, 3, 3, 2, ...], on which all subsequent MHM machinery operates.
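The binning step can be sketched as follows. This is an illustrative re-implementation under stated assumptions, not the released `seq_to_harmonics`: the function name `quantise_to_hins` is hypothetical, and the handling of interval edges (left-closed bins, with the maximum clipped into class 9) is one plausible reading of the description above.

```python
import numpy as np


def quantise_to_hins(values, bins: int = 9) -> np.ndarray:
    """Map a real-valued series onto classes 1..bins by uniform binning."""
    x = np.asarray(values, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # A constant series carries no range information: map it to all 5s.
        return np.full(x.shape, 5, dtype=int)
    # Scale to [0, bins], take the interval index, shift to 1-based.
    idx = np.floor((x - lo) / (hi - lo) * bins).astype(int) + 1
    # The maximum value would land in bin `bins + 1`; clip it back.
    return np.clip(idx, 1, bins)


capacity = [1.10, 1.09, 1.08, 1.06, 1.03, 0.99]  # made-up fade curve
print(quantise_to_hins(capacity))
```

Because the bins span the series' own [min, max], the resulting classes describe the shape of the curve relative to its range, not its absolute level.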

3.2 The binary operations

Four complementary operations act on D, each folded back into D:

| Operation | Symbol | Definition | Role |
| --- | --- | --- | --- |
| Creative addition | `a ⊕ b` | `f9(a + b + 1)` | recombination / synthesis |
| Growth product | `a ⊗ b` | `f9(a·b + 1)` | growth / energy exchange |
| Energy addition | `a ⊕_E b` | `f9(a + b + 1 + tier)` | shell-tiered coupling |
| Miller subtraction | `a ⊖ b` | `f9(a - b + 9)` | directed difference |

The deliberate +1 offset in creative addition makes ⊕ and ⊖ not exact inverses, giving the algebra a directional asymmetry that the histograms below exploit. The tier argument of energy addition injects a shell-level offset so the same pair can be scored at different coupling tiers.
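The four definitions translate directly into code. The sketch below follows the table's formulas; the Python function names are illustrative choices rather than the package's API. It also checks the asymmetry numerically: undoing a creative addition with a Miller subtraction returns `f9(a + 1)`, not `a`.

```python
def f9(k: int) -> int:
    """Fold map onto {1, ..., 9}."""
    return 1 + ((k - 1) % 9)


def creative_add(a: int, b: int) -> int:
    return f9(a + b + 1)


def growth_product(a: int, b: int) -> int:
    return f9(a * b + 1)


def energy_add(a: int, b: int, tier: int = 0) -> int:
    return f9(a + b + 1 + tier)


def miller_sub(a: int, b: int) -> int:
    return f9(a - b + 9)


digits = range(1, 10)
# (a ⊕ b) ⊖ b lands one class above a, for every pair of classes.
shifted = all(
    miller_sub(creative_add(a, b), b) == f9(a + 1)
    for a in digits
    for b in digits
)
print(creative_add(4, 5), growth_product(3, 3), miller_sub(2, 5), shifted)
# 1 1 6 True
```

With `tier=0`, energy addition coincides with creative addition; each unit of `tier` shifts the result by one class.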

3.3 The Chi compatibility matrix and the matter/energy split

The core tensor is a fixed 9x9 compatibility ("Chi") matrix Χ[i, j], symmetric with unit diagonal and off-diagonal entries in [0.68, 1.00]. Χ[i, j] is a hand-assigned "compatibility" or resonance score between classes i and j. The matrix is hand-specified; it is not learned or fitted. Its full contents are in batterymhm/atomic.py.

The nine classes split into two channels:

- Matter: classes 1-8, scored on the 8x8 sub-matrix CHI_MATTER_8x8 (the upper-left block of Χ).
- Energy: class 9, the "energy pole," scored on a separate 9-vector CHI_ENERGY_9 (row 9 of Χ).

Additionally, the class set {3, 6, 9} is designated as focusing nodes (a "Tesla" label appears in the code) and receives dedicated network features. For crystals this split lets a model score matter-matter bonding on the 8x8 block and the energy-pole contribution on a distinct geometry.
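To make the matter/energy split concrete, the sketch below slices a placeholder matrix. The numeric entries are random stand-ins that only respect the stated constraints (symmetric, unit diagonal, off-diagonal values in [0.68, 1.00]); they are not the hand-set values, which live in batterymhm/atomic.py. The lowercase variable names and the helper `chi_score` are likewise illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# PLACEHOLDER values only: not the real hand-specified Chi matrix.
upper = np.triu(rng.uniform(0.68, 1.00, size=(9, 9)), k=1)
chi = upper + upper.T          # symmetric off-diagonal part
np.fill_diagonal(chi, 1.0)     # unit diagonal

chi_matter_8x8 = chi[:8, :8]   # classes 1-8: upper-left block
chi_energy_9 = chi[8, :]       # class 9: row 9, the "energy pole"


def chi_score(i: int, j: int) -> float:
    """Compatibility of classes i and j, using 1-based class labels."""
    return float(chi[i - 1, j - 1])


print(chi_matter_8x8.shape, chi_energy_9.shape, chi_score(3, 6) == chi_score(6, 3))
```

Swapping the placeholder for an identity or random matrix in this position is also how the ablation controls mentioned later in the paper could be wired in.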

3.4 The Miller sequence and shell levels

The Miller sequence is a Fibonacci-like sequence with seed [1, 1, 3]:

1, 1, 3, 4, 7, 11, 18, 29, 47, 76, ...

Its positions group into shell levels, each three times larger than the last: level 1 = positions 1-3, level 2 = positions 4-12, level 3 = positions 13-39. These levels provide a multi-scale sampling grid. For a battery, level-1 positions correspond to the first few cycles (an "initialization signature") and later levels to the longer aging trajectory. The sequence is also reduced along its trajectory via CMR (Creation-Math Reduction: repeated creative addition of a sequence down to a single class), evaluated at several truncation lengths.
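The sequence, the shell-level lookup, and a CMR reduction can be sketched in a few lines. The function names are illustrative. Two details are assumptions: the recurrence (each term after the seed is the sum of the previous two, which reproduces the listed values) and the choice to apply CMR to the folded terms of the Miller sequence. Because creative addition is commutative and associative under the fold, the order of reduction does not affect the result.

```python
from functools import reduce


def f9(k: int) -> int:
    return 1 + ((k - 1) % 9)


def creative_add(a: int, b: int) -> int:
    return f9(a + b + 1)


def miller_sequence(n: int) -> list:
    """First n terms: seed [1, 1, 3], then each term is the sum of the prior two."""
    seq = [1, 1, 3]
    while len(seq) < n:
        seq.append(seq[-1] + seq[-2])
    return seq[:n]


def shell_level(position: int) -> int:
    """Shell level of a 1-based position; level L spans 3**L positions."""
    level, end = 1, 3
    while position > end:
        level += 1
        end += 3 ** level
    return level


def cmr(hins) -> int:
    """Reduce a non-empty class sequence to one class by creative addition."""
    return reduce(creative_add, hins)


seq = miller_sequence(10)
print(seq)                                     # [1, 1, 3, 4, 7, 11, 18, 29, 47, 76]
print([shell_level(p) for p in (3, 4, 12, 13, 39)])  # [1, 2, 2, 3, 3]
print(cmr([f9(t) for t in seq[:3]]))           # 7
```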

3.5 Embeddings and the Miller calculus

Two embeddings bridge the discrete space to continuous math:

- χ-embedding: χ(a) = (a + 1) mod 9, an index in Z_9 (bridge to calculus).
- φ-embedding: φ(d) = sin(π·d / 9), a real number (bridge to geometry).

On a harmonic sequence, MHM defines discrete differential operators (the "Miller calculus"): velocity (mean first-order Miller difference via ⊖), acceleration (mean second-order difference), integral (cumulative creative addition), and curvature of the φ-embedded curve using the classical κ = |y''| / (1 + y'^2)^{3/2}. Curvature is intended to spike at the onset of capacity-fade degradation (the aging "knee"). These operators are evaluated at three window scales (full, first half, first quarter) to capture multi-scale dynamics.
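A minimal sketch of two of these operators follows. It is an illustration under assumptions, not the released implementation: the function names are hypothetical, the velocity takes "next ⊖ current" as the difference direction, and the derivatives in the curvature formula are approximated with `numpy.gradient` at unit cycle spacing.

```python
import numpy as np


def f9(k: int) -> int:
    return 1 + ((k - 1) % 9)


def miller_sub(a: int, b: int) -> int:
    return f9(a - b + 9)


def phi(d) -> np.ndarray:
    """phi-embedding: sin(pi * d / 9)."""
    return np.sin(np.pi * np.asarray(d, dtype=float) / 9.0)


def miller_velocity(hins) -> float:
    """Mean first-order Miller difference (direction is an assumption)."""
    diffs = [miller_sub(int(b), int(a)) for a, b in zip(hins[:-1], hins[1:])]
    return float(np.mean(diffs))


def phi_curvature(hins) -> np.ndarray:
    """Pointwise curvature |y''| / (1 + y'^2)^(3/2) of the phi-embedded curve."""
    y = phi(hins)
    dy = np.gradient(y)
    d2y = np.gradient(dy)
    return np.abs(d2y) / (1.0 + dy ** 2) ** 1.5


flat = [5, 5, 5, 5]
knee = [9, 9, 9, 8, 6, 3, 1]  # made-up sequence with a late drop
print(miller_velocity(flat), phi_curvature(flat).max(), phi_curvature(knee).argmax())
```

One consequence of the fold is worth noting: a zero difference maps to class 9 (since `f9(9) = 9`), so a perfectly flat sequence has Miller velocity 9 rather than 0.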

3.6 The 557-feature descriptor

mhm_full_features(hins) returns a fixed 557-dimensional vector assembled from eleven feature groups:

| Group | # | What it encodes |
| --- | --- | --- |
| Chi9 full histogram | 81 | distance-weighted 9x9 compatibility over pairs |
| Chi9 count histogram | 81 | unweighted 9x9 pair counts |
| HIN transition matrix | 81 | row-normalised Markov ordering of the sequence |
| Growth-product histogram | 81 | `⊗` energy-exchange character of pairs |
| Energy-addition histograms (2 tiers) | 162 | `⊕_E` shell-coupled resonance (tiers 0 and 1) |
| Miller-level breakdown | 27 | class occupancy per shell level (L1/L2/L3) |
| Focusing-node network | 9 | `{3,6,9}` node fractions, bridges, run statistics |
| Multi-scale Miller calculus | 12 | velocity/acceleration/curvature/integral x 3 scales |
| Creative-addition chains | 15 | cumulative `⊕` chain statistics at 5 truncations |
| CMR multi-point | 5 | CMR reductions at 5 truncation lengths |
| HIN / GP / EA entropy | 3 | Shannon entropy of the three interaction distributions |
| Total | 557 | |

(The counts sum to 81x4 + 162 + 27 + 9 + 12 + 15 + 5 + 3 = 557.) All histograms are normalised, and any non-finite value is replaced by zero. For crystals a parallel routine (mhm_matter8_neighbor_histograms) builds the descriptor from the actual neighbour-pair list (HIN_a, HIN_b, distance), using the 8x8 matter block plus the HIN-9 energy channel, so each feature is tied to a physical neighbour relationship.
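As an example of how one 81-dimensional group can be built, the sketch below computes a row-normalised HIN transition matrix and a Shannon entropy. It is an independent illustration, not the code behind `mhm_full_features`: the function names are hypothetical, and the entropy's log base (2) is an assumption the text does not fix. Rows for classes that never occur would divide by zero, so non-finite values are replaced by zero, as described above.

```python
import numpy as np


def hin_transition_features(hins) -> np.ndarray:
    """Row-normalised 9x9 transition matrix of a class sequence, flattened to 81."""
    counts = np.zeros((9, 9), dtype=float)
    for a, b in zip(hins[:-1], hins[1:]):
        counts[int(a) - 1, int(b) - 1] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        probs = counts / row_sums
    # Classes that never appear give 0/0 rows; zero them out.
    probs = np.nan_to_num(probs, nan=0.0, posinf=0.0, neginf=0.0)
    return probs.ravel()


def shannon_entropy(p) -> float:
    """Entropy in bits of a non-negative weight vector (normalised internally)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    p = p / p.sum()
    return float(-(p * np.log2(p)).sum())


feats = hin_transition_features([5, 5, 4, 4, 3, 3, 2])
print(feats.shape, shannon_entropy([0.5, 0.5]))  # (81,) 1.0
```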

3.7 The ensemble

The descriptor feeds MHMEnsemble: an ExtraTrees regressor (600 trees, max_features="sqrt", min_samples_leaf=2) blended with XGBoost (400 trees, max_depth=5, learning_rate=0.04, subsample=0.8, colsample_bytree=0.7, reg_alpha=0.2, reg_lambda=1.5) at fixed weights 0.75 / 0.25. An optional Ridge (alpha=1.0) out-of-fold stacking meta-learner can replace the fixed blend. All random seeds are fixed at 42. The ensemble is deliberately light and CPU-only; the harmonic descriptor carries the representational load. If XGBoost is unavailable the ensemble falls back gracefully to ExtraTrees only.
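The fixed blend can be sketched with scikit-learn and XGBoost as below. This is an approximation of the description above, not the released `MHMEnsemble`: the class name `BlendSketch` is hypothetical, the stacking option is omitted, and the reading that 0.75 is the ExtraTrees weight and 0.25 the XGBoost weight is an assumption based on the order in which the models are named.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

try:
    from xgboost import XGBRegressor
except Exception:  # xgboost missing or unusable: fall back to ExtraTrees only
    XGBRegressor = None


class BlendSketch:
    """ExtraTrees + XGBoost with a fixed-weight blend (illustrative only)."""

    def __init__(self, et_trees=600, xgb_trees=400, w_et=0.75, seed=42):
        self.w_et = w_et
        self.et = ExtraTreesRegressor(
            n_estimators=et_trees, max_features="sqrt",
            min_samples_leaf=2, random_state=seed,
        )
        self.xgb = None
        if XGBRegressor is not None:
            self.xgb = XGBRegressor(
                n_estimators=xgb_trees, max_depth=5, learning_rate=0.04,
                subsample=0.8, colsample_bytree=0.7,
                reg_alpha=0.2, reg_lambda=1.5, random_state=seed,
            )

    def fit(self, X, y):
        self.et.fit(X, y)
        if self.xgb is not None:
            self.xgb.fit(X, y)
        return self

    def predict(self, X):
        pred = self.et.predict(X)
        if self.xgb is not None:
            pred = self.w_et * pred + (1.0 - self.w_et) * self.xgb.predict(X)
        return pred
```

Usage mirrors any scikit-learn regressor: `BlendSketch().fit(X_train, y_train).predict(X_test)`.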

4. Experimental Setup

SOH dataset and split. We use the MIT-Stanford-TRI dataset (Severson et al., Nature Energy, 2019), 144 commercial LFP/graphite cells. The task is to predict cell state-of-health from an early-cycle observation window using 5-fold cross-validation. The headline configuration uses a 30% observation window (approximately 45 cycles).

Observation windows. To probe robustness we sweep the observation window from 10% to 70% of the available early cycles.

Feature selection (honesty note). Because only 144 cells are available, the raw MHM descriptor is far wider than the sample count, and using all 557 features directly invites overfitting. The benchmark pipeline therefore reduces the descriptor to approximately 58 features per fold using SelectKBest with mutual-information regression scoring. Critically, the selector is fit within each training fold only, so no test-fold information leaks into feature selection. We report this explicitly because it materially affects how the numbers should be interpreted: the reported metrics reflect a within-fold-selected subset, not the full 557-dimensional descriptor.
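One common way to guarantee that selection sees only training folds is to place the selector inside a scikit-learn pipeline, so it is re-fit for every fold. The sketch below shows that pattern; it is not the benchmark script itself. The function name, the shuffled `KFold`, and the use of ExtraTrees alone as the downstream model are simplifying assumptions.

```python
from functools import partial

import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.feature_selection import SelectKBest, mutual_info_regression
from sklearn.model_selection import KFold, cross_val_predict
from sklearn.pipeline import make_pipeline


def cv_predict_with_selection(X, y, k=58, n_splits=5, seed=42, n_trees=600):
    """Out-of-fold predictions with SelectKBest re-fit inside each training fold."""
    selector = SelectKBest(
        score_func=partial(mutual_info_regression, random_state=seed), k=k
    )
    model = ExtraTreesRegressor(
        n_estimators=n_trees, max_features="sqrt",
        min_samples_leaf=2, random_state=seed,
    )
    # The pipeline is cloned and fit per fold, so held-out cells never
    # influence which features are selected.
    pipe = make_pipeline(selector, model)
    folds = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return cross_val_predict(pipe, X, y, cv=folds)
```

Calling `SelectKBest.fit` once on the full feature matrix before cross-validation would, by contrast, leak test-fold information into the selected subset.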

Materials dataset. For formation energy we use Matbench mp_e_form (50k train / 5k test). We report both the matminer-split result and the official Matbench fold-0 result.

Ensemble hyperparameters and seed. Exactly as in Section 3.7; all seeds fixed at 42.

5. Results

5.1 State-of-health, main benchmark

Results on MIT-Stanford-TRI, 5-fold CV, 30% observation window (approximately 45 cycles):

| Model | MAE ↓ | RMSE ↓ | Pearson ↑ | Spearman ↑ | R^2 ↑ |
| --- | --- | --- | --- | --- | --- |
| BatteryMHM (this work) | 0.0114 | 0.0200 | 0.884 | 0.845 | 0.747 |
| Attentive Neural-ODE (Li et al., 2021) | 0.012 | 0.020 | 0.900 | 0.880 | 0.810 |
| RandomForest (BatteryML, ICLR 2024) | 0.2459 | n/a | n/a | n/a | n/a |

MHM attains the best reported MAE and RMSE on this benchmark. Honesty note: it trails the attentive Neural-ODE on Pearson correlation (0.884 vs 0.900), Spearman correlation (0.845 vs 0.880), and R^2 (0.747 vs 0.810). Relative to the BatteryML random-forest baseline (MAE 0.2459), MHM's MAE is approximately 21.6x lower. The two leading models differ on MAE/RMSE by amounts within rounding of each other, so we describe MHM as matching or narrowly leading on error while clearly trailing on the correlation and variance-explained metrics.
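The five metrics measure different things, which is why a model can lead on one group and trail on another: MAE and RMSE score absolute error, while Pearson, Spearman, and R^2 score how well predictions track the variation across cells. A minimal NumPy sketch of the standard definitions is given below; the function name is illustrative, and the rank computation assumes no tied values.

```python
import numpy as np


def regression_metrics(y_true, y_pred) -> dict:
    """MAE, RMSE, Pearson r, Spearman rho (no tie handling), and R^2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true

    def ranks(x):
        # Rank positions via double argsort; assumes all values are distinct.
        return np.argsort(np.argsort(x)).astype(float)

    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "pearson": float(np.corrcoef(y_true, y_pred)[0, 1]),
        "spearman": float(np.corrcoef(ranks(y_true), ranks(y_pred))[0, 1]),
        "r2": float(1.0 - ss_res / ss_tot),
    }


soh = np.array([0.95, 0.90, 0.88, 0.80, 0.75])  # made-up targets
print(regression_metrics(soh, soh + 0.01))
```

In the made-up example, a constant offset leaves both correlations at 1 while MAE is nonzero and R^2 drops below 1, showing that the error metrics and the correlation metrics are not interchangeable.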

5.2 Per-fold breakdown

Per-fold MAE at the 30% window:

| Fold | 1 | 2 | 3 | 4 | 5 | Mean |
| --- | --- | --- | --- | --- | --- | --- |
| MAE | 0.0160 | 0.0058 | 0.0080 | 0.0162 | 0.0115 | 0.0114 |

Fold-to-fold MAE varies by roughly 2.8x (0.0058 to 0.0162), which is expected given only about 29 cells per held-out fold. The five rounded values above average to 0.0115, which agrees with the headline mean of 0.0114 to within rounding.

5.3 Observation-window robustness

Sweeping the early-cycle observation window (5-fold CV, MAE):

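| Window | 10% | 20% | 30% | 40% | 50% | 60% | 70% |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Approx. cycles | ~15 | ~30 | ~45 | ~60 | ~75 | ~90 | ~105 |
| MAE | 0.01171 | 0.01165 | 0.01147 | 0.01210 | 0.01161 | 0.01198 | 0.01171 |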

Across the full 10%-70% sweep, MAE stays in the narrow band 0.01147-0.01210 (best at the 30% window), a spread much smaller than the fold-to-fold variation in Section 5.2. The method is therefore insensitive to how much early history it is given: in this pipeline, cycles beyond the first ~15 add little measurable accuracy.

5.4 Materials: formation energy (honest framing)

Matbench mp_e_form:

| Model | MAE (eV/atom) ↓ | Notes |
| --- | --- | --- |
| CHGNet (Deng et al., 2023) | ~0.015 | GNN, SOTA-class |
| M3GNet (Chen & Ong, 2022) | ~0.018 | GNN |
| DimeNet++ | ~0.022 | GNN |
| MEGNet (Chen et al., 2019) | ~0.030 | GNN |
| CGCNN (Xie & Grossman, 2018) | ~0.049 | GNN |
| RF + Magpie (Matbench baseline) | ~0.132 | classical |
| BatteryMHM (matminer split) | 0.1513 | this work |
| BatteryMHM (Matbench fold 0) | 0.1215 | this work |

On the official Matbench fold 0, MHM (0.1215 eV/atom) is below the classic RF+Magpie baseline (~0.132), but on the matminer split (0.1513) it is above it. In either case MHM is clearly not state-of-the-art on formation energy: it trails every listed graph neural network by a wide margin. This is a known, documented limitation. The materials track is a discovery-pipeline component, not a SOTA claim; the SOTA result in this work is cell SOH.

6. Analysis: feature importance

Inspecting the trained SOH ensemble's impurity-based feature importances, roughly eight of the top-ten most important features are MHM-derived, including a Miller level-1 fade slope (an early-cycle degradation signal) and components of the matter/energy transforms. We report this qualitatively rather than as a formal ablation. It offers supporting (not conclusive) evidence that the harmonic construction carries genuine signal rather than acting as noise that the trees route around: if the harmonic features were uninformative, we would not expect them to dominate the importance ranking of a model that also has access to simpler statistics. A rigorous ablation (removing each feature group, and replacing the hand-set matrix with random or identity matrices) remains future work.
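Reading off a top-k importance ranking from a fitted tree model takes only a few lines with scikit-learn's `feature_importances_` attribute. The sketch below uses synthetic data and hypothetical feature names (`f0`, `f1`, ...); it illustrates the inspection step only and does not reproduce the ranking reported above. Impurity-based importances are known to be sensitive to feature scale and cardinality, so permutation importance is a common cross-check.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor


def top_k_features(model, feature_names, k=10):
    """Return the k (name, importance) pairs with the largest importances."""
    importances = np.asarray(model.feature_importances_)
    order = np.argsort(importances)[::-1][:k]
    return [(feature_names[i], float(importances[i])) for i in order]


# Synthetic stand-in data: only column 0 carries signal.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = 5.0 * X[:, 0] + 0.1 * rng.normal(size=200)
names = [f"f{i}" for i in range(5)]

model = ExtraTreesRegressor(n_estimators=100, random_state=42).fit(X, y)
top = top_k_features(model, names, k=3)
print(top)
```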

7. Limitations

We are explicit about the following:

- Small dataset. The SOH benchmark has only 144 cells. Per-fold MAE varies by roughly 3x, and the aggregate estimates carry correspondingly wide uncertainty. We mitigate overfitting with within-fold SelectKBest feature reduction (approximately 58 of 557 features) and cross-validation, but 144 samples remain a fundamental limit.
- Feature selection dependence. The reported metrics use a within-fold-selected subset, not the full 557-dimensional descriptor. The full descriptor's raw dimensionality far exceeds the sample count.
- Trails deep nets on correlation metrics. MHM leads on MAE/RMSE but is behind the attentive Neural-ODE on Pearson, Spearman, and R^2. For applications where rank ordering or variance explained matters more than absolute error, a deep model may be preferable.
- Below GNNs on materials. On formation energy MHM trails modern graph neural networks by a wide margin.
- The compatibility matrix is hand-set and unablated. The 9x9 Chi matrix is hand-specified. We have not learned it, ablated it, or replaced it with random/identity controls; the extent to which its specific values (versus the modular fold and multi-scale aggregation alone) drive performance is untested.
- Heuristic, not first-principles. The harmonic construction is an empirically validated inductive prior. We make no claim of a physical derivation.
- Untested generalization. Performance is demonstrated on two public benchmarks only. Generalization to other chemistries, cyclers, temperature regimes, or materials tasks is untested. The bundled demo is a synthetic signal check, not a performance claim.

8. Reproducibility and Open Release

The complete method — the fold map, the four operations, the Chi matrix, the Miller sequence, the multi-scale aggregation, the 557-feature descriptor, and the ensemble — is released open-source under CC BY-NC 4.0 at https://huggingface.co/williamTLmiller/batterymhm. No trained weights are distributed; users train their own models (seconds on a CPU) on the public datasets. CC BY-NC 4.0 is a copyright license and grants no patent rights; the method is patent pending.

Install and demo:

pip install "git+https://huggingface.co/williamTLmiller/batterymhm"
python demo.py    # offline signal-check, CPU-only, ~5 seconds, no weights, no GPU

Minimal usage:

from batterymhm import seq_to_harmonics, mhm_full_features, MHMEnsemble

# capacity_curve: early-cycle capacity values for one cell
hins  = seq_to_harmonics(capacity_curve, bins=9)   # measurement -> harmonic space
feats = mhm_full_features(hins)                     # 557-feature MHM descriptor

# X_train / X_test: one descriptor row (feats) per cell; y_train: matching SOH targets
model = MHMEnsemble().fit(X_train, y_train)         # train your own
soh   = model.predict(X_test)

The public datasets (MIT-Stanford-TRI for SOH; Matbench mp_e_form for materials) plus the released code reproduce the reported numbers. All seeds are fixed at 42.

9. Conclusion

We presented the Miller Harmonic Method, a fixed, reproducible, heuristic feature map that projects measured curves into a nine-class modular space, scores class interactions through a hand-specified compatibility matrix, and aggregates across a Fibonacci-like multi-scale grid into a 557-dimensional descriptor. Paired with a small CPU-only tree ensemble, MHM attains the best reported MAE and RMSE on the MIT-Stanford-TRI early-cycle SOH benchmark, while trailing a deep Neural-ODE on correlation and variance-explained metrics and trailing graph neural networks on materials formation energy. The result supports a representation-versus-depth thesis: a sufficiently structured fixed feature map lets a shallow model reach competitive error at a fraction of the compute and data-window cost. The most important open work is to ablate the hand-set matrix, to learn it, and to test generalization beyond these two benchmarks.

References

Chen, C., Ye, W., Zuo, Y., Zheng, C., & Ong, S. P. (2019). Graph networks as a universal machine learning framework for molecules and crystals (MEGNet). Chemistry of Materials, 31(9), 3564-3572.
Chen, C., & Ong, S. P. (2022). A universal graph deep learning interatomic potential for the periodic table (M3GNet). Nature Computational Science, 2, 718-728.
Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785-794.
Deng, B., Zhong, P., Jun, K., Riebesell, J., Han, K., Bartel, C. J., & Ceder, G. (2023). CHGNet as a pretrained universal neural network potential for charge-informed atomistic modelling. Nature Machine Intelligence, 5, 1031-1041.
Dunn, A., Wang, Q., Ganose, A., Dopp, D., & Jain, A. (2020). Benchmarking materials property prediction methods: the Matbench test set and Automatminer reference algorithm. npj Computational Materials, 6, 138.
Geurts, P., Ernst, D., & Wehenkel, L. (2006). Extremely randomized trees. Machine Learning, 63(1), 3-42.
Li, W., et al. (2021). Attentive Neural-ODE for battery state-of-health estimation. Reported MAE 0.012, RMSE 0.020, Pearson 0.900, Spearman 0.880, R^2 0.810 on MIT-Stanford-TRI. [Full bibliographic details to be confirmed; comparison values are quoted as reported.]
Severson, K. A., Attia, P. M., Jin, N., Perkins, N., Jiang, B., Yang, Z., Chen, M. H., Aykol, M., Herring, P. K., Fraggedakis, D., Bazant, M. Z., Harris, S. J., Chueh, W. C., & Braatz, R. D. (2019). Data-driven prediction of battery cycle life before capacity degradation. Nature Energy, 4, 383-391.
Xie, T., & Grossman, J. C. (2018). Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties (CGCNN). Physical Review Letters, 120, 145301.
Zhang, H., et al. (2024). BatteryML: An open-source platform for machine learning on battery degradation. International Conference on Learning Representations (ICLR). [Author list to be confirmed; only the RF baseline MAE 0.2459 is quoted from this source.]

The fold map, the operations, the Chi matrix, the Miller sequence, and the multi-scale aggregation are the subject of pending patent applications by William T. L. Miller. This is a preprint; it has not been peer-reviewed or accepted at any venue.