RoyAalekh committed
Commit 8d2e8fa · 1 Parent(s): 909f5b5

docs: Update enhancement plan with OpenAI Codex review findings


Added 4 critical issues identified by Codex code review:

Priority 1 (High) - New Issues:
- EDA memory exhaustion on large datasets (50K+ cases)
- Headless rendering failure in CI/CD pipelines
- Missing parameter fallback blocks fresh environments
- RL reward computation inconsistency (fresh agent instances)

Solutions:
- Add EDA sampling and streaming for memory efficiency
- Detect headless mode, use static renderer
- Bundle baseline parameters as fallback
- Extract shared reward logic to standalone module

Updated implementation timeline to include new fixes in Weeks 1-3.
Maintains focus on critical state management and ripeness bugs.
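The "detect headless mode, use static renderer" fix could be sketched as below. This is a minimal illustration under stated assumptions: the helper name `pick_renderer`, the DISPLAY heuristic, and the CLI wiring are hypothetical, not code from this repository.

```python
import os

def pick_renderer(override=None):
    """Return a Plotly renderer name suitable for the current environment.

    `override` stands in for the proposed --renderer CLI flag (hypothetical).
    """
    if override:
        return override
    # Linux CI runners typically have no DISPLAY; fall back to a static renderer.
    if os.name == "posix" and not os.environ.get("DISPLAY"):
        return "png"
    return "browser"

# Usage sketch (assumed wiring):
#   plotly.io.renderers.default = pick_renderer(args.renderer)
```

Choosing "png" in headless mode assumes a static image backend (e.g. kaleido) is installed; "svg" would work the same way.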

Files changed (1)
  1. docs/ENHANCEMENT_PLAN.md +78 -5
docs/ENHANCEMENT_PLAN.md CHANGED
@@ -122,7 +122,65 @@
 - scheduler/simulation/engine.py (configurable frequency)
 - scheduler/data/config.py (add parameter)
 
-## Priority 4: Enhanced Scheduling Constraints (P2 - Medium)
+## Priority 4: EDA and Configuration Robustness (P1 - High)
+
+### 4.0.1 Fix EDA Memory Issues
+**Problem**: EDA converts full Parquet to pandas, risks memory exhaustion
+**Impact**: Pipeline fails on large datasets (>50K cases)
+
+**Solution**:
+- Add sampling parameter: `eda_sample_size: Optional[int]` (default None = full)
+- Stream data instead of loading all at once
+- Downcast numeric columns before conversion
+- Add memory monitoring and warnings
+
+**Files**:
+- src/eda_exploration.py (add sampling)
+- src/eda_config.py (memory limits)
+
+### 4.0.2 Fix Headless Rendering
+**Problem**: Plotly renderer defaults to "browser", fails in CI/CD
+**Impact**: Cannot run EDA in automated pipelines
+
+**Solution**:
+- Detect headless environment (check DISPLAY env var)
+- Default to "png" or "svg" renderer in headless mode
+- Add `--renderer` CLI flag to override
+
+**Files**:
+- src/eda_exploration.py (renderer detection)
+- court_scheduler_rl.py (add CLI flag)
+
+### 4.0.3 Fix Missing Parameters Fallback
+**Problem**: get_latest_params_dir raises when no params exist
+**Impact**: Fresh environments can't run simulations
+
+**Solution**:
+- Bundle baseline parameters in `scheduler/data/defaults/`
+- Fall back to bundled params if no EDA run found
+- Add `--use-defaults` flag to force baseline params
+- Log warning when using defaults vs EDA-derived
+
+**Files**:
+- scheduler/data/config.py (fallback logic)
+- scheduler/data/defaults/ (new directory with baseline params)
+
+### 4.0.4 Fix RL Reward Computation
+**Problem**: Rewards computed with fresh agent instance, divorced from training
+**Impact**: Learning signals inconsistent with policy behavior
+
+**Solution**:
+- Extract reward logic to standalone function: `compute_reward(case, action, outcome)`
+- Share reward function between training environment and agent
+- Remove agent re-instantiation in environment
+- Validate reward consistency in tests
+
+**Files**:
+- rl/rewards.py (new - shared reward logic)
+- rl/simple_agent.py (use shared rewards)
+- rl/training.py (use shared rewards)
+
+## Priority 5: Enhanced Scheduling Constraints (P2 - Medium)
 
 ### 4.1 Judge Blocking & Availability
 **Problem**: No per-judge blocked dates
@@ -189,10 +247,25 @@
 
 ## Implementation Order
 
-1. **Week 1**: Fix state bugs (1.1, 1.2, 1.3) + tests
-2. **Week 2**: Strengthen ripeness (2.1, 2.2) + re-enable inflow (3.1, 3.2)
-3. **Week 3**: Enhanced constraints (4.1, 4.2, 4.3)
-4. **Week 4**: Comprehensive testing + ripeness learning feedback (2.3)
+1. **Week 1**: Fix critical bugs
+   - State management (1.1, 1.2, 1.3)
+   - Configuration robustness (4.0.3 - parameter fallback)
+   - Unit tests for above
+
+2. **Week 2**: Strengthen core systems
+   - Ripeness detection (2.1, 2.2 - UNKNOWN status, multi-signal)
+   - RL reward alignment (4.0.4 - shared reward logic)
+   - Re-enable inflow (3.1, 3.2)
+
+3. **Week 3**: Robustness and constraints
+   - EDA scaling (4.0.1 - memory management)
+   - Headless rendering (4.0.2 - CI/CD compatibility)
+   - Enhanced constraints (5.1, 5.2, 5.3)
+
+4. **Week 4**: Testing and polish
+   - Comprehensive integration tests
+   - Ripeness learning feedback (2.3)
+   - All edge cases documented
 
 ## Success Criteria
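The shared-reward extraction proposed for rl/rewards.py (4.0.4) could take roughly the following shape. Only the signature `compute_reward(case, action, outcome)` comes from the plan; the `Outcome` fields and the weights below are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Outcome:
    """Illustrative outcome record; the real fields live in the scheduler."""
    scheduled: bool
    was_ripe: bool
    days_waited: int

def compute_reward(case, action, outcome):
    """Single reward definition, imported by both the training environment
    and the agent so no fresh agent instance is needed to score a step.

    `case` and `action` are accepted to match the planned signature;
    this sketch scores only the outcome.
    """
    reward = 0.0
    if outcome.scheduled:
        # Reward scheduling ripe cases, penalize scheduling unripe ones.
        reward += 1.0 if outcome.was_ripe else -1.0
    # Gentle pressure against letting cases sit in the backlog.
    reward -= 0.01 * outcome.days_waited
    return reward
```

Because one function defines the reward, the consistency check called for in the plan reduces to asserting that training-time and evaluation-time scores of the same transition are identical.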