docs: Add comprehensive enhancement plan for bug fixes
Based on code analysis, identified critical issues:
Priority 0 (Critical):
- Override state pollution - flags persist across runs
- Ripeness defaults to RIPE - optimistic bias risks scheduling unready cases
- Override auditability - in-place mutations lose rejection tracking
Priority 1 (High):
- Re-enable case inflow for realistic long-term simulations
- Configurable ripeness re-evaluation frequency
- Comprehensive test coverage
Priority 2 (Medium):
- Judge availability blocking
- Per-case gap overrides for urgent cases
- Dynamic courtroom capacity
4-week implementation roadmap with clear success criteria.
Addresses state management bugs, ripeness detection weaknesses,
and simulation realism issues.
- docs/ENHANCEMENT_PLAN.md +233 -0
docs/ENHANCEMENT_PLAN.md
ADDED
@@ -0,0 +1,233 @@
# Court Scheduling System - Bug Fixes & Enhancements

## Priority 1: Fix State Management Bugs (P0 - Critical)

### 1.1 Fix Override State Pollution
**Problem**: Override flags persist across runs, priority overrides don't clear
**Impact**: Cases keep boosted priority in subsequent schedules

**Solution**:
- Add `clear_overrides()` method to Case class
- Call after each scheduling day or at simulation reset
- Store overrides in separate tracking dict instead of mutating case objects
- Alternative: Use immutable override context passed to scheduler

**Files**:
- scheduler/core/case.py (add clear method)
- scheduler/control/overrides.py (refactor to non-mutating approach)
- scheduler/simulation/engine.py (call clear after scheduling)
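
The "immutable override context" alternative could look roughly like the sketch below. It is only an illustration: `OverrideContext`, `effective_priority`, `rank_cases`, and the two-field `Case` are hypothetical stand-ins, not the classes in `scheduler/core/case.py`.

```python
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Mapping


@dataclass
class Case:
    # Simplified stand-in for the real Case class (assumption).
    case_id: str
    base_priority: float


@dataclass(frozen=True)
class OverrideContext:
    # Read-only mapping of case_id -> boosted priority, valid for one run only.
    priority_overrides: Mapping[str, float] = field(
        default_factory=lambda: MappingProxyType({})
    )

    def effective_priority(self, case: Case) -> float:
        # The case object is never mutated; the boost lives only in the context.
        return self.priority_overrides.get(case.case_id, case.base_priority)


def rank_cases(cases: list[Case], ctx: OverrideContext) -> list[Case]:
    # Highest effective priority first.
    return sorted(cases, key=ctx.effective_priority, reverse=True)


cases = [Case("A", 1.0), Case("B", 2.0)]
boosted = OverrideContext(MappingProxyType({"A": 5.0}))

# Day 1 uses the boosted context; day 2 uses a fresh, empty one.
day1 = [c.case_id for c in rank_cases(cases, boosted)]
day2 = [c.case_id for c in rank_cases(cases, OverrideContext())]
```

Because the boost is held in a per-run context, nothing needs to be cleared afterwards: dropping the context is enough to return every case to its base priority.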

### 1.2 Preserve Override Auditability
**Problem**: Invalid overrides removed in-place from input list
**Impact**: Caller loses original override list, can't audit rejections

**Solution**:
- Validate into separate collections: `valid_overrides`, `rejected_overrides`
- Return structured result: `OverrideResult(applied, rejected_with_reasons)`
- Keep original override list immutable
- Log all rejections with clear error messages

**Files**:
- scheduler/control/overrides.py (refactor apply_overrides)
- scheduler/core/algorithm.py (update override handling)
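
One possible shape for the non-mutating validation is sketched here. The `Override` fields, the two validation rules, and the function name `validate_overrides` are assumptions made for the example; only `OverrideResult(applied, rejected_with_reasons)` comes from the plan.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger(__name__)


@dataclass(frozen=True)
class Override:
    # Hypothetical override request; the real fields may differ.
    override_id: str
    case_id: str
    new_priority: float


@dataclass
class OverrideResult:
    applied: list[Override] = field(default_factory=list)
    rejected_with_reasons: list[tuple[Override, str]] = field(default_factory=list)


def validate_overrides(
    overrides: tuple[Override, ...], known_case_ids: set[str]
) -> OverrideResult:
    # The input tuple is only read, so the caller keeps the original list intact.
    result = OverrideResult()
    for ov in overrides:
        if ov.case_id not in known_case_ids:
            reason = f"unknown case {ov.case_id}"
        elif ov.new_priority < 0:
            reason = "priority must be non-negative"
        else:
            result.applied.append(ov)
            continue
        # Every rejection is both logged and returned with its reason.
        logger.warning("Override %s rejected: %s", ov.override_id, reason)
        result.rejected_with_reasons.append((ov, reason))
    return result


requested = (Override("o1", "A", 3.0), Override("o2", "Z", 3.0))
result = validate_overrides(requested, known_case_ids={"A"})
```

Taking a tuple rather than a list is one simple way to make accidental in-place removal impossible at the call boundary.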

### 1.3 Track Override Outcomes Explicitly
**Problem**: Applied overrides in list, rejected as None in unscheduled
**Impact**: Hard to distinguish "not selected" from "override rejected"

**Solution**:
- Create `OverrideAudit` dataclass: (override_id, status, reason, timestamp)
- Return audit log from schedule_day: `result.override_audit`
- Separate tracking: `cases_not_selected`, `overrides_accepted`, `overrides_rejected`

**Files**:
- scheduler/core/algorithm.py (add audit tracking)
- scheduler/control/overrides.py (structured audit log)
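
As a rough sketch of the audit record, assuming a hypothetical `OverrideStatus` enum and a hypothetical `ScheduleDayResult` container (the actual return type of `schedule_day` is not shown in this plan):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class OverrideStatus(Enum):
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass(frozen=True)
class OverrideAudit:
    override_id: str
    status: OverrideStatus
    reason: str
    # Recorded in UTC at creation time.
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class ScheduleDayResult:
    # Cases that simply lost out on capacity; unrelated to overrides.
    cases_not_selected: list[str] = field(default_factory=list)
    override_audit: list[OverrideAudit] = field(default_factory=list)

    @property
    def overrides_accepted(self) -> list[OverrideAudit]:
        return [a for a in self.override_audit if a.status is OverrideStatus.ACCEPTED]

    @property
    def overrides_rejected(self) -> list[OverrideAudit]:
        return [a for a in self.override_audit if a.status is OverrideStatus.REJECTED]


result = ScheduleDayResult(cases_not_selected=["C-7"])
result.override_audit.append(OverrideAudit("ov-1", OverrideStatus.ACCEPTED, "applied"))
result.override_audit.append(
    OverrideAudit("ov-2", OverrideStatus.REJECTED, "judge unavailable")
)
```

Deriving the accepted and rejected views from a single audit list keeps the three categories from drifting out of sync.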

## Priority 2: Strengthen Ripeness Detection (P0 - Critical)

### 2.1 Require Positive Evidence for RIPE
**Problem**: Defaults to RIPE when signals ambiguous
**Impact**: Schedules cases that may not be ready

**Solution**:
- Add `UNKNOWN` status to RipenessStatus enum
- Require explicit RIPE signals: stage progression, document check, age threshold
- Default to UNKNOWN (not RIPE) when data insufficient
- Add confidence score: `ripeness_confidence: float` (0.0-1.0)

**Files**:
- scheduler/core/ripeness.py (add UNKNOWN, confidence scoring)
- scheduler/simulation/engine.py (filter UNKNOWN cases)
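
A minimal sketch of the "positive evidence" rule follows. The three-member enum, the `RipenessSignals` container, and the way confidence is computed (share of signals with data) are all assumptions for illustration; the existing `RipenessStatus` enum may have more members.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RipenessStatus(Enum):
    RIPE = "ripe"
    UNRIPE = "unripe"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class RipenessSignals:
    # Each signal is True/False when known and None when data is missing.
    stage_progressed: Optional[bool]
    documents_complete: Optional[bool]
    past_age_threshold: Optional[bool]


def classify(signals: RipenessSignals) -> tuple[RipenessStatus, float]:
    values = (
        signals.stage_progressed,
        signals.documents_complete,
        signals.past_age_threshold,
    )
    known = [v for v in values if v is not None]
    # Confidence here is simply the share of signals that have data.
    confidence = len(known) / len(values)
    if any(v is False for v in known):
        return RipenessStatus.UNRIPE, confidence
    if len(known) == len(values):
        # RIPE only when every signal is present and positive.
        return RipenessStatus.RIPE, confidence
    # Missing data never defaults to RIPE.
    return RipenessStatus.UNKNOWN, confidence
```

With this rule, a case with a single positive signal and two missing ones is reported as `UNKNOWN` with low confidence rather than `RIPE`.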

### 2.2 Enrich Ripeness Signals
**Problem**: Only uses keyword search and basic stage checks
**Impact**: Misses nuanced bottlenecks

**Solution**:
- Add signals:
  - Filing age relative to case type median
  - Adjournment reason history (recurring "summons pending")
  - Outstanding task list (if available in data)
  - Party/lawyer attendance rate
  - Document submission completeness
- Multi-signal scoring: weighted combination
- Configurable thresholds per signal

**Files**:
- scheduler/core/ripeness.py (add signal extraction)
- scheduler/data/config.py (ripeness thresholds)
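
The weighted combination might be sketched as below. The signal names, the weights, and the 0.7 threshold are placeholder values chosen for the example, not tuned or sourced from the codebase; each signal is assumed to be pre-normalized to the range 0.0-1.0.

```python
# Hypothetical weights; in practice these would come from scheduler/data/config.py.
SIGNAL_WEIGHTS = {
    "filing_age_ratio": 0.2,
    "no_recurring_summons_adjournments": 0.3,
    "attendance_rate": 0.2,
    "document_completeness": 0.3,
}
RIPE_THRESHOLD = 0.7  # placeholder value


def clamp01(x: float) -> float:
    return min(max(x, 0.0), 1.0)


def ripeness_score(signals: dict[str, float], weights: dict[str, float]) -> float:
    total = sum(weights.values())
    if total == 0:
        return 0.0
    # A missing signal contributes 0, so absent data can only lower the score.
    return sum(w * clamp01(signals.get(name, 0.0)) for name, w in weights.items()) / total


full = ripeness_score({name: 1.0 for name in SIGNAL_WEIGHTS}, SIGNAL_WEIGHTS)
docs_only = ripeness_score({"document_completeness": 1.0}, SIGNAL_WEIGHTS)
```

Treating a missing signal as zero is a deliberately conservative choice that matches the "require positive evidence" rule in 2.1; renormalizing over available signals would be the more optimistic alternative.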

### 2.3 Add Learning Feedback Loop
**Problem**: Static heuristics don't improve
**Impact**: Classification errors persist

**Solution** (Future Enhancement):
- Track ripeness prediction vs actual outcomes
  - Cases marked RIPE but adjourned → false positive signal
  - Cases marked UNRIPE but later heard successfully → false negative
- Adjust thresholds based on historical accuracy
- Log classification performance metrics

**Files**:
- scheduler/monitoring/ripeness_metrics.py (new)
- scheduler/core/ripeness.py (adaptive thresholds)
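
The two error signals above can be computed from logged outcomes roughly as follows. `HearingOutcome` and `ripeness_error_rates` are hypothetical names, and the denominators (all cases marked RIPE, all cases marked UNRIPE) are one reasonable reading of the definitions, not something the plan specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HearingOutcome:
    predicted_ripe: bool  # classification at scheduling time
    heard: bool           # False means the hearing was adjourned


def ripeness_error_rates(outcomes: list[HearingOutcome]) -> dict[str, float]:
    ripe = [o for o in outcomes if o.predicted_ripe]
    unripe = [o for o in outcomes if not o.predicted_ripe]
    false_pos = sum(1 for o in ripe if not o.heard)   # marked RIPE but adjourned
    false_neg = sum(1 for o in unripe if o.heard)     # marked UNRIPE but heard
    return {
        "false_positive_rate": false_pos / len(ripe) if ripe else 0.0,
        "false_negative_rate": false_neg / len(unripe) if unripe else 0.0,
    }


history = [
    HearingOutcome(True, True),
    HearingOutcome(True, False),
    HearingOutcome(True, True),
    HearingOutcome(True, True),
    HearingOutcome(False, True),
    HearingOutcome(False, False),
]
rates = ripeness_error_rates(history)
```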

## Priority 3: Re-enable Simulation Inflow (P1 - High)

### 3.1 Parameterize Case Filing
**Problem**: New filings commented out, no caseload growth
**Impact**: Unrealistic long-term simulations

**Solution**:
- Add `enable_inflow: bool` to CourtSimConfig
- Add `filing_rate_multiplier: float` (default 1.0 for historical rate)
- Expose inflow controls in pipeline config
- Surface inflow metrics in simulation results

**Files**:
- scheduler/simulation/engine.py (uncomment + gate filings)
- court_scheduler_rl.py (add config parameters)
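
Gating and scaling the inflow could be sketched like this. Only the two field names come from the plan; the default for `enable_inflow`, the `new_filings_today` helper, and the rounding scheme are assumptions, and the real `CourtSimConfig` has other fields not shown here.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class CourtSimConfig:
    # Only the inflow fields proposed above; the default for enable_inflow is assumed.
    enable_inflow: bool = True
    filing_rate_multiplier: float = 1.0  # 1.0 reproduces the historical rate


def new_filings_today(
    historical_daily_rate: float, cfg: CourtSimConfig, rng: random.Random
) -> int:
    if not cfg.enable_inflow:
        return 0
    expected = historical_daily_rate * cfg.filing_rate_multiplier
    whole = int(expected)
    # The fractional part becomes one extra filing with matching probability,
    # so the long-run average equals the expected rate.
    return whole + (1 if rng.random() < expected - whole else 0)


rng = random.Random(0)
disabled = new_filings_today(10.0, CourtSimConfig(enable_inflow=False), rng)
scaled = new_filings_today(4.0, CourtSimConfig(filing_rate_multiplier=1.5), rng)
```

Passing the random generator in explicitly keeps simulation runs reproducible when a seed is fixed.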

### 3.2 Make Ripeness Re-evaluation Configurable
**Problem**: Fixed 7-day re-evaluation may be too infrequent
**Impact**: Stale classifications drive multiple days

**Solution**:
- Add `ripeness_eval_frequency_days: int` to config (default 7)
- Consider adaptive frequency: more frequent when backlog high
- Log ripeness re-evaluation events

**Files**:
- scheduler/simulation/engine.py (configurable frequency)
- scheduler/data/config.py (add parameter)
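
A small sketch of how the engine might consult the new parameter. The helper names and the "halve the interval when the backlog is high" rule are illustrative assumptions; the plan only says the frequency should be configurable and possibly adaptive.

```python
def should_reevaluate(day_index: int, ripeness_eval_frequency_days: int = 7) -> bool:
    # Re-evaluate on day 0 and then every N simulated days.
    if ripeness_eval_frequency_days <= 0:
        raise ValueError("ripeness_eval_frequency_days must be positive")
    return day_index % ripeness_eval_frequency_days == 0


def effective_frequency(base_days: int, backlog: int, high_backlog: int) -> int:
    # Hypothetical adaptive rule: halve the interval once the backlog is high.
    if backlog >= high_backlog:
        return max(1, base_days // 2)
    return base_days


eval_days = [d for d in range(15) if should_reevaluate(d, 7)]
busy = effective_frequency(base_days=7, backlog=500, high_backlog=400)
quiet = effective_frequency(base_days=7, backlog=100, high_backlog=400)
```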

## Priority 4: Enhanced Scheduling Constraints (P2 - Medium)

### 4.1 Judge Blocking & Availability
**Problem**: No per-judge blocked dates
**Impact**: Schedules hearings when judge unavailable

**Solution**:
- Add `blocked_dates: list[date]` to Judge entity
- Add `availability_override: dict[date, bool]` for one-time changes
- Filter eligible courtrooms by judge availability

**Files**:
- scheduler/core/judge.py (add availability fields)
- scheduler/core/algorithm.py (check availability)
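
The two availability fields could combine as in the sketch below. The `is_available` method, the `judge_id` field, and the precedence rule (a one-time override wins over the blocked list in either direction) are assumptions; the plan does not say which should take priority.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Judge:
    judge_id: str  # simplified stand-in for the real Judge entity
    blocked_dates: list[date] = field(default_factory=list)
    availability_override: dict[date, bool] = field(default_factory=dict)

    def is_available(self, day: date) -> bool:
        # Assumed precedence: an explicit one-time override beats the blocked list.
        if day in self.availability_override:
            return self.availability_override[day]
        return day not in self.blocked_dates


judge = Judge(
    "J1",
    blocked_dates=[date(2025, 1, 6), date(2025, 1, 7)],
    availability_override={date(2025, 1, 7): True, date(2025, 1, 8): False},
)
```

Here 6 January stays blocked, 7 January is re-opened by the override, and 8 January is closed by the override even though it is not on the blocked list.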

### 4.2 Per-Case Gap Overrides
**Problem**: Global MIN_GAP_BETWEEN_HEARINGS, no exceptions
**Impact**: Urgent cases can't be expedited

**Solution**:
- Add `min_gap_override: Optional[int]` to Case
- Apply in eligibility check: `gap = case.min_gap_override if case.min_gap_override is not None else MIN_GAP_BETWEEN_HEARINGS` (a plain `or` would silently ignore an override of 0)
- Track override applications in metrics

**Files**:
- scheduler/core/case.py (add field)
- scheduler/core/algorithm.py (use override in eligibility)
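
A sketch of the eligibility check with a per-case gap. The value 14 for `MIN_GAP_BETWEEN_HEARINGS`, the `last_hearing` field, and the `is_gap_satisfied` helper are placeholders for illustration. The check compares against `None` explicitly so that an override of 0 days is honored rather than treated as "no override".

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

MIN_GAP_BETWEEN_HEARINGS = 14  # days; placeholder value, not the real constant


@dataclass
class Case:
    case_id: str  # simplified stand-in for the real Case class
    last_hearing: Optional[date] = None
    min_gap_override: Optional[int] = None


def is_gap_satisfied(case: Case, today: date) -> bool:
    if case.last_hearing is None:
        return True  # never heard, so no gap applies
    gap = (
        case.min_gap_override
        if case.min_gap_override is not None
        else MIN_GAP_BETWEEN_HEARINGS
    )
    return (today - case.last_hearing).days >= gap


today = date(2025, 3, 10)
routine = Case("R-1", last_hearing=date(2025, 3, 5))
urgent = Case("U-1", last_hearing=date(2025, 3, 5), min_gap_override=3)
same_day = Case("U-2", last_hearing=today, min_gap_override=0)
```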

### 4.3 Courtroom Capacity Changes
**Problem**: Fixed daily capacity, no dynamic adjustments
**Impact**: Can't model half-days, special sessions

**Solution**:
- Add `capacity_overrides: dict[date, int]` to Courtroom
- Apply in allocation: check date-specific capacity first
- Support judge preferences (e.g., "Property cases Mondays")

**Files**:
- scheduler/core/courtroom.py (add override dict)
- scheduler/simulation/allocator.py (check overrides)
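
"Check date-specific capacity first" can be as small as a dictionary lookup with a fallback. In this sketch `courtroom_id`, `daily_capacity`, and `capacity_on` are hypothetical names; only `capacity_overrides` comes from the plan.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Courtroom:
    courtroom_id: int
    daily_capacity: int  # assumed name for the existing fixed capacity
    capacity_overrides: dict[date, int] = field(default_factory=dict)

    def capacity_on(self, day: date) -> int:
        # A date-specific value wins; otherwise fall back to the default.
        return self.capacity_overrides.get(day, self.daily_capacity)


room = Courtroom(
    courtroom_id=1,
    daily_capacity=40,
    capacity_overrides={
        date(2025, 4, 18): 20,  # half-day
        date(2025, 4, 19): 0,   # closed
    },
)
```

Using `dict.get` rather than `or` means a capacity of 0 on a closed day is respected.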

## Priority 5: Testing & Validation (P1 - High)

### 5.1 Unit Tests for Bug Fixes
**Coverage**:
- Override state clearing
- Ripeness UNKNOWN handling
- Inflow rate calculations
- Constraint validation

**Files**:
- tests/test_overrides.py (new)
- tests/test_ripeness.py (expand)
- tests/test_simulation.py (inflow tests)

### 5.2 Integration Tests
**Scenarios**:
- Full pipeline with overrides applied
- Ripeness transitions over time
- Blocked judge dates respected
- Capacity overrides honored

**Files**:
- tests/integration/test_scheduling_pipeline.py (new)

## Implementation Order

1. **Week 1**: Fix state bugs (1.1, 1.2, 1.3) + tests
2. **Week 2**: Strengthen ripeness (2.1, 2.2) + re-enable inflow (3.1, 3.2)
3. **Week 3**: Enhanced constraints (4.1, 4.2, 4.3)
4. **Week 4**: Comprehensive testing + ripeness learning feedback (2.3)

## Success Criteria

**Bug Fixes**:
- Override state doesn't leak between runs
- All override decisions auditable
- Rejected overrides tracked with reasons

**Ripeness**:
- UNKNOWN status used when confidence low
- False positive rate < 15% (marked RIPE but adjourned)
- Multi-signal scoring operational

**Simulation Realism**:
- Inflow configurable and metrics tracked
- Long runs show realistic caseload patterns
- Ripeness re-evaluation frequency tunable

**Constraints**:
- Judge blocked dates respected 100%
- Per-case gap overrides functional
- Capacity changes applied correctly

**Quality**:
- 90%+ test coverage for bug fixes
- Integration tests pass
- All edge cases documented

## Background

This plan addresses critical bugs and architectural improvements identified through code analysis:

1. **State Management**: Override flags persist across runs, causing silent bias
2. **Ripeness Defaults**: System defaults to RIPE when uncertain, risking premature scheduling
3. **Closed Simulation**: No case inflow, making long-term runs unrealistic
4. **Limited Auditability**: In-place mutations make debugging and QA difficult

See commit history for OutputManager refactoring and Windows compatibility fixes already completed.