# Project Versioning & Structure

## Current Active Version: v2.0

All active models, artifacts, and features use the **v2.0** versioning scheme.

### Directory Structure

```
artifacts/
├── v1/                  ← Legacy models (cross-battery split)
│   ├── models/
│   │   ├── classical/
│   │   ├── deep/
│   │   └── ensemble/
│   ├── scalers/
│   ├── figures/
│   ├── results/
│   ├── logs/
│   └── features/
│
├── v2/                  ← Current production (intra-battery split) ✓
│   ├── models/
│   │   ├── classical/   ← 14 classical ML models
│   │   ├── deep/        ← 8 deep learning models
│   │   └── ensemble/    ← Weighted ensemble
│   ├── scalers/         ← Feature scalers for linear models
│   ├── figures/         ← All validation visualizations (PNG, HTML)
│   ├── results/         ← CSV/JSON results and feature matrices
│   ├── logs/            ← Training logs
│   └── features/        ← Feature engineering artifacts
```

### V2 Key Changes from V1

| Aspect | V1 | V2 |
|--------|----|----|
| **Data Split** | Cross-battery (groups of batteries) | Intra-battery chronological (first 80% cycles per battery) |
| **Train/Test Contamination** | ⚠️ YES (same batteries in both) | ✓ NO (different time periods per battery) |
| **Generalization** | Poor (batteries see same time periods) | Better (true temporal split) |
| **Test Realism** | Interpolation (within-cycle prediction) | Extrapolation (future cycles) |
| **Classical Models** | 6 standard models | 14 models (added ExtraTrees, GradientBoosting, KNN ×3) |
| **Deep Models** | 8 models | Retraining in progress |
| **Ensemble** | RF + XGB + LGB (v1 trained) | RF + XGB + LGB (v2 trained when available) |

### Model Statistics

#### Classical Models (V2)

- **Total:** 14 models
- **Target Metric:** Within-±5% SOH accuracy ≥ 95%
- **Current Pass Rate:** See `artifacts/v2/results/v2_validation_report.html`

#### Configuration

**Active version is set in** `src/utils/config.py`:

```python
ACTIVE_VERSION: str = "v2"
```

**API defaults to v2:**

```python
registry = registry_v2  # Default registry (v2.0.0 models)
```

### Migration Checklist ✓

- ✓ Created versioned artifact directories under `artifacts/v2/`
- ✓ Moved all v2 models to `artifacts/v2/models/classical/` etc.
- ✓ Moved all results to `artifacts/v2/results/`
- ✓ Moved all figures to `artifacts/v2/figures/`
- ✓ Moved all scalers to `artifacts/v2/scalers/`
- ✓ Updated notebooks (NB03-09) to use `get_version_paths('v2')`
- ✓ Updated API to default to v2 registry
- ✓ Organized scripts into `scripts/data/`, `scripts/models/`
- ✓ Moved tests to `tests/` folder
- ✓ Cleaned up legacy artifact directories

### File Locations

| Content | Path |
|---------|------|
| Models (classical) | `artifacts/v2/models/classical/*.joblib` |
| Models (deep) | `artifacts/v2/models/deep/*.pth` |
| Models (ensemble) | `artifacts/v2/models/ensemble/*.joblib` |
| Scalers | `artifacts/v2/scalers/*.joblib` |
| Results CSV | `artifacts/v2/results/*.csv` |
| Feature matrix | `artifacts/v2/results/battery_features.csv` |
| Visualizations | `artifacts/v2/figures/*.{png,html}` |
| Logs | `artifacts/v2/logs/*.log` |

### Running Scripts

```bash
# Run v2 model validation test
python tests/test_v2_models.py

# Run quick prediction test
python tests/test_predictions.py

# Retrain classical models (WARNING: takes ~30 min)
python scripts/models/retrain_classical.py

# Generate/patch notebooks (one-time utilities)
python scripts/data/write_nb03_v2.py
python scripts/data/patch_dl_notebooks_v2.py
```

### Next Steps

1. ✓ Verify v2 model accuracy meets thresholds
2. ✓ Update research paper with v2 results
3. ✓ Complete research notes for all notebooks
4. ✓ Test cycle recommendation engine
5. Deploy v2 to production

### Version History

| Version | Date | Status | Notes |
|---------|------|--------|-------|
| v1.0 | 2025-Q1 | ✓ Complete | Classical + Deep models, cross-battery split |
| v2.0 | 2026-02-25 | ✓ Active | Intra-battery split, improved generalization |
| v3.0 | TBD | -- | Physics-informed models, uncertainty quantification |
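
The intra-battery chronological split listed under V2 Key Changes (first 80% of cycles per battery for training, the remainder for testing) can be illustrated with a minimal sketch. This is not the project's actual implementation: the function name `intra_battery_split`, the `(battery_id, cycle_index)` record layout, and the toy battery IDs are all assumptions made for illustration, and the real pipeline may round the cut point or handle short batteries differently.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical row layout: (battery_id, cycle_index). Real rows would carry features too.
Record = Tuple[str, int]


def intra_battery_split(
    records: List[Record], train_fraction: float = 0.8
) -> Tuple[List[Record], List[Record]]:
    """Put the earliest `train_fraction` of each battery's cycles in train, the rest in test."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")

    # Group rows by battery so each battery is split independently.
    by_battery: Dict[str, List[Record]] = defaultdict(list)
    for record in records:
        by_battery[record[0]].append(record)

    train: List[Record] = []
    test: List[Record] = []
    for battery_id in sorted(by_battery):
        # Sort by cycle index so the cut is chronological rather than random.
        cycles = sorted(by_battery[battery_id], key=lambda r: r[1])
        cut = int(len(cycles) * train_fraction)  # assumed rounding: floor
        train.extend(cycles[:cut])
        test.extend(cycles[cut:])
    return train, test


# Two toy batteries with 10 and 5 cycles respectively.
records = [("B01", c) for c in range(1, 11)] + [("B02", c) for c in range(1, 6)]
train, test = intra_battery_split(records)
print(len(train), len(test))
```

With this layout every battery contributes to both sets, but each battery's test cycles always come after its training cycles, which is what makes the evaluation an extrapolation to future cycles.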