# Output Directory Refactoring - Implementation Status
## Completed
### 1. Created `OutputManager` class
- **File**: `scheduler/utils/output_manager.py`
- **Features**:
  - Single run directory with timestamp-based ID
  - Clean hierarchy: `eda/`, `training/`, `simulation/`, `reports/`
  - Property-based access to all output paths
  - Config saved to run root for reproducibility
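The features above can be illustrated with a minimal sketch. This is not the actual contents of `scheduler/utils/output_manager.py`: the constructor arguments, the property names (`eda_dir`, `training_dir`, `simulation_dir`, `reports_dir`, `model_path`) and the `save_config()` method are assumptions chosen to match the directory layout described in this document.

```python
import json
from datetime import datetime
from pathlib import Path


class OutputManager:
    """Hypothetical sketch: one timestamped run directory with fixed subdirectories."""

    def __init__(self, base_dir="outputs", run_id=None):
        # The ID format mirrors the run_YYYYMMDD_HHMMSS names used in this document.
        self.run_id = run_id or datetime.now().strftime("run_%Y%m%d_%H%M%S")
        self.run_dir = Path(base_dir) / "runs" / self.run_id

    def _subdir(self, name):
        # Create the phase directory on first access so callers can write immediately.
        path = self.run_dir / name
        path.mkdir(parents=True, exist_ok=True)
        return path

    @property
    def eda_dir(self):
        return self._subdir("eda")

    @property
    def training_dir(self):
        return self._subdir("training")

    @property
    def simulation_dir(self):
        return self._subdir("simulation")

    @property
    def reports_dir(self):
        return self._subdir("reports")

    @property
    def model_path(self):
        # Single canonical location for the trained agent.
        return self.training_dir / "agent.pkl"

    def save_config(self, config):
        # Store the run configuration at the run root for reproducibility.
        self.run_dir.mkdir(parents=True, exist_ok=True)
        path = self.run_dir / "config.json"
        path.write_text(json.dumps(config, indent=2))
        return path
```

With a class along these lines, pipeline code asks for `output.training_dir` or `output.model_path` instead of assembling path strings itself, which keeps every artifact under one run directory.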
### 2. Integrated into Pipeline
- **File**: `court_scheduler_rl.py`
- **Changes**:
  - `PipelineConfig` no longer has `output_dir` field
  - `InteractivePipeline` uses `OutputManager` instance
  - All `self.output_dir` references replaced with `self.output.{property}`
  - Pipeline compiles successfully
## Completed Tasks
### 1. Remove Duplicate Model Saving (DONE)
- Removed duplicate model save in `court_scheduler_rl.py`
- Implemented `OutputManager.create_model_symlink()` method
- Model saved once to `outputs/runs/{run_id}/training/agent.pkl`
- Symlink created at `models/latest.pkl`
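The save-once-then-link approach can be sketched as follows. The real `OutputManager.create_model_symlink()` is a method whose signature is not shown here, so this standalone function, its parameters, and its replace-existing-link behavior are assumptions for illustration only.

```python
import os
from pathlib import Path


def create_model_symlink(model_path, link_path="models/latest.pkl"):
    """Point link_path at model_path via a relative symlink, replacing any old link."""
    model_path = Path(model_path)
    link_path = Path(link_path)
    link_path.parent.mkdir(parents=True, exist_ok=True)

    # is_symlink() also catches dangling links, which exists() alone would miss.
    if link_path.is_symlink() or link_path.exists():
        link_path.unlink()

    # A relative target keeps the link valid if the project directory is moved.
    target = os.path.relpath(model_path.resolve(), start=link_path.parent.resolve())
    link_path.symlink_to(target)
    return link_path
```

Because only the link changes between runs, the model file itself is written exactly once per run. Note that creating symlinks on Windows may require elevated privileges or developer mode.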
### 2. Update EDA Output Paths (DONE)
- Modified `src/eda_config.py` with:
  - `set_output_paths()` function to configure from OutputManager
  - Private getter functions (`_get_run_dir()`, `_get_params_dir()`, etc.)
  - Fallback to legacy paths when running standalone
- Updated all EDA modules (`eda_load_clean.py`, `eda_exploration.py`, `eda_parameters.py`)
- Pipeline calls `set_output_paths()` before running EDA steps
- EDA outputs now write to `outputs/runs/{run_id}/eda/`
### 3. Fix Import Errors (DONE)
- Fixed syntax errors in EDA imports (removed parentheses from function names)
- All modules compile without errors
### 4. Test End-to-End (DONE)
```bash
uv run python court_scheduler_rl.py quick
```
**Status**: SUCCESS (Exit code: 0)
- All outputs in `outputs/runs/run_20251126_055943/`
- No scattered files
- Models symlinked correctly at `models/latest.pkl`
- Pipeline runs without errors
- Clean directory structure verified with `tree` command
## New Directory Structure
```
outputs/
└── runs/
    └── run_20251126_123456/
        ├── config.json
        ├── eda/
        │   ├── figures/
        │   ├── params/
        │   └── data/
        ├── training/
        │   ├── cases.csv
        │   ├── agent.pkl
        │   └── stats.json
        ├── simulation/
        │   ├── readiness/
        │   └── rl/
        └── reports/
            ├── EXECUTIVE_SUMMARY.md
            ├── COMPARISON_REPORT.md
            └── visualizations/
models/
└── latest.pkl -> ../outputs/runs/run_20251126_123456/training/agent.pkl
```
## Benefits Achieved
1. **Single source of truth**: All run artifacts in one directory
2. **Reproducibility**: Config saved with outputs
3. **No duplication**: Files written once, not copied
4. **Clear hierarchy**: Logical organization by pipeline phase
5. **Easy cleanup**: Delete entire run directory
6. **Version control**: Run IDs embed a timestamp, so runs sort chronologically by name
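As an illustration of the cleanup and sortable-ID benefits, the sketch below lists runs in chronological order and removes all but the newest few. The helper names `list_runs()` and `prune_runs()` and the `keep` parameter are hypothetical and not part of the pipeline.

```python
import shutil
from pathlib import Path


def list_runs(runs_root="outputs/runs"):
    """Return run directories oldest-first.

    Zero-padded run_YYYYMMDD_HHMMSS names sort chronologically as plain strings.
    """
    root = Path(runs_root)
    if not root.is_dir():
        return []
    return sorted(p for p in root.iterdir() if p.is_dir() and p.name.startswith("run_"))


def prune_runs(runs_root="outputs/runs", keep=3):
    """Delete all but the newest `keep` runs and return the removed paths."""
    runs = list_runs(runs_root)
    stale = runs[:-keep] if keep > 0 else runs
    for run_dir in stale:
        # Each run is self-contained, so removing its directory removes everything.
        shutil.rmtree(run_dir)
    return stale
```

If `models/latest.pkl` points into a run that gets pruned, the symlink will dangle, so a cleanup along these lines should keep at least the most recent run.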