# DECODE Re-implementation
This folder stores the outputs written by `scripts/decode_reimplementation.py`.
## Goal
Re-implement the training flow from the DECODE paper for the local I-BLEND data:
1. Resample energy to 10-minute intervals.
2. Convert original power readings in W to interval energy in Wh.
3. Merge occupancy and calendar/schedule features.
4. Optionally merge weather if the local weather range overlaps the energy range.
5. Add historical energy features, including the previous three same-working-day-class values at the same time instant.
6. Normalize with Min-Max scaling.
7. Split chronologically into 70% train, 15% validation, and 15% test.
8. Train Ridge Regression, Decision Tree, Random Forest, and optionally LSTM.
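As a rough illustration of steps 1, 2, 5, and 7, the sketch below uses pandas only. It is not taken from `scripts/decode_reimplementation.py`: the function names, the `energy_wh` column, and the weekday/weekend definition of "working-day class" are assumptions made for the example, and the actual script may derive day classes from the schedule features instead.

```python
import pandas as pd


def to_interval_energy(power_w: pd.Series, freq: str = "10min") -> pd.Series:
    """Average power (W) over each interval, then convert to energy (Wh)."""
    mean_w = power_w.resample(freq).mean()
    hours = pd.Timedelta(freq) / pd.Timedelta(hours=1)  # 10 min -> 1/6 h
    return (mean_w * hours).rename("energy_wh")


def add_same_class_lags(df: pd.DataFrame, n_lags: int = 3) -> pd.DataFrame:
    """Add the previous n values from days of the same class at the same time of day."""
    out = df.copy()
    # Assumption: day class is weekday vs. weekend (Mon-Fri are working days).
    out["is_workday"] = (out.index.dayofweek < 5).astype(int)
    # Minute of day identifies the "same time instant" across days.
    out["slot"] = out.index.hour * 60 + out.index.minute
    grouped = out.groupby(["is_workday", "slot"])["energy_wh"]
    for k in range(1, n_lags + 1):
        # shift(k) inside each (class, slot) group looks k same-class days back.
        out[f"lag_same_class_{k}"] = grouped.shift(k)
    return out.drop(columns="slot")


def chrono_split(df: pd.DataFrame, train: float = 0.70, val: float = 0.15):
    """Split rows in time order into train/validation/test, without shuffling."""
    n = len(df)
    i, j = int(n * train), int(n * (train + val))
    return df.iloc[:i], df.iloc[i:j], df.iloc[j:]
```

The first rows of each day class have no earlier same-class days, so their lag columns are `NaN`; whether to drop or impute those rows is left open here.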
The LSTM backend prefers PyTorch when available and falls back to TensorFlow/Keras.
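One way such a preference order could be implemented is to probe for installed packages without importing them. The helper below is a hypothetical sketch (`pick_lstm_backend` is not a name from the script) that uses only the standard library.

```python
import importlib.util


def pick_lstm_backend(preferred=("torch", "tensorflow")):
    """Return the first installed backend name, or None if none is available."""
    for name in preferred:
        # find_spec returns None when a top-level package is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


backend = pick_lstm_backend()
if backend is None:
    print("No LSTM backend found; only the scikit-learn models can run.")
else:
    print(f"Using LSTM backend: {backend}")
```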
## Install Dependencies
Base ML models:
```bash
python3 -m pip install pandas numpy scikit-learn
```
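LSTM model (optional); install only one of the two backends:
```bash
# Preferred backend
python3 -m pip install torch
# Fallback backend, used when PyTorch is not available
python3 -m pip install tensorflow
```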
## Run Examples
Train the 7 paper-style building targets:
```bash
python3 scripts/decode_reimplementation.py --mode paper_buildings
```
Train the 9 local meter targets:
```bash
python3 scripts/decode_reimplementation.py --mode meters
```
Quick smoke test on one target without LSTM:
```bash
python3 scripts/decode_reimplementation.py --mode paper_buildings --target Academic --max-rows 20000 --skip-lstm
```
Run one meter with LSTM:
```bash
python3 scripts/decode_reimplementation.py --mode meters --target Boys_main --epochs 20 --batch-size 64
```
## Outputs
- `processed/<target>_train_ready.csv`: feature-engineered supervised table.
- `results_paper_buildings.csv`: model metrics for 7 building-level targets.
- `results_meters.csv`: model metrics for 9 meter-level targets.
- `run_config_*.json`: run metadata and paths.
## Notes
The local weather file currently starts in 2018, while the energy data ends in 2017, so the two ranges do not overlap. For this reason, the script adds weather features to the model only when `--include-weather` is passed and an overlap is detected.
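An overlap check of this kind can be sketched by comparing the first and last timestamps of the two series. The function name and the example date ranges below are illustrative assumptions, not the script's actual code or the real data extents.

```python
import pandas as pd


def ranges_overlap(energy_index: pd.DatetimeIndex, weather_index: pd.DatetimeIndex) -> bool:
    """Return True when the two time ranges share at least one instant."""
    if len(energy_index) == 0 or len(weather_index) == 0:
        return False
    latest_start = max(energy_index.min(), weather_index.min())
    earliest_end = min(energy_index.max(), weather_index.max())
    return latest_start <= earliest_end


# Illustrative ranges: energy ending in 2017, weather starting in 2018.
energy_idx = pd.date_range("2017-01-01", "2017-12-31", freq="D")
weather_idx = pd.date_range("2018-01-01", "2018-06-30", freq="D")
print(ranges_overlap(energy_idx, weather_idx))
```

With ranges like these, the check returns `False`, which matches the case described above where weather features are left out.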