RoyAalekh committed c39a084 · 1 parent: eadbc29

Submission ready
README.md CHANGED
@@ -1,313 +1,58 @@
1
  # Code4Change: Intelligent Court Scheduling System
2
 
3
- Data-driven court scheduling system with ripeness classification, multi-courtroom simulation, and intelligent case prioritization for Karnataka High Court.
4
 
5
- ## Project Overview
6
 
7
- This project delivers a **comprehensive** court scheduling system featuring:
8
- - **EDA & Parameter Extraction**: Analysis of 739K+ hearings to derive scheduling parameters
9
- - **Ripeness Classification**: Data-driven bottleneck detection (filtering unripe cases)
10
- - **Simulation Engine**: Multi-year court operations simulation with realistic outcomes
11
- - **Multiple Scheduling Policies**: FIFO, Age-based, Readiness-based, and RL-based
12
- - **Reinforcement Learning**: Tabular Q-learning achieving performance parity with heuristics
13
- - **Load Balancing**: Dynamic courtroom allocation with low inequality
14
- - **Configurable Pipeline**: Modular training and evaluation framework
15
 
16
- ## Key Achievements
17
 
18
- **81.4% Disposal Rate** - Significantly exceeds baseline expectations
19
- **Perfect Courtroom Balance** - Gini 0.002 load distribution
20
- **97.7% Case Coverage** - Near-zero case abandonment
21
- **Smart Bottleneck Detection** - 40.8% unripe cases filtered to save judicial time
22
- **Judge Control** - Complete override system for judicial autonomy
23
- **Production Ready** - Full cause list generation and audit capabilities
24
-
25
- ## Dataset
26
-
27
- - **Cases**: 134,699 unique civil cases with 24 attributes
28
- - **Hearings**: 739,670 individual hearings with 31 attributes
29
- - **Timespan**: 2000-2025 (disposed cases only)
30
- - **Scope**: Karnataka High Court, Bangalore Bench
31
-
32
- ## System Architecture
33
-
34
- ### 1. EDA & Parameter Extraction (`src/`)
35
- - Stage transition probabilities by case type
36
- - Duration distributions (median, p90) per stage
37
- - Adjournment rates by stage and case type
38
- - Court capacity analysis (151 hearings/day median)
39
- - Case type distributions and filing patterns
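The parameter-extraction step above can be sketched with pandas; the column names (`case_id`, `hearing_date`, `stage`) are illustrative assumptions, not necessarily the repository's actual schema:

```python
import pandas as pd

def stage_transition_probs(hearings: pd.DataFrame) -> pd.DataFrame:
    """Empirical stage -> next-stage transition probabilities.

    Assumes columns `case_id`, `hearing_date`, `stage`.
    """
    df = hearings.sort_values(["case_id", "hearing_date"]).copy()
    # Next stage within the same case; a case's final hearing has none.
    df["next_stage"] = df.groupby("case_id")["stage"].shift(-1)
    df = df.dropna(subset=["next_stage"])
    counts = pd.crosstab(df["stage"], df["next_stage"])
    # Row-normalise counts into probabilities.
    return counts.div(counts.sum(axis=1), axis=0)
```

The same groupby/shift pattern extends to per-stage duration percentiles and adjournment rates.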
40
-
41
- ### 2. Ripeness Classification (`scheduler/core/ripeness.py`)
42
- - **Purpose**: Identify cases with substantive bottlenecks
43
- - **Types**: SUMMONS, DEPENDENT, PARTY, DOCUMENT
44
- - **Data-Driven**: Extracted from 739K historical hearings
45
- - **Impact**: Prevents premature scheduling of unready cases
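A minimal sketch of keyword-based bottleneck detection; the keyword map below is hypothetical, while the real classifier derives its rules and thresholds from the historical hearings:

```python
# Hypothetical keyword map; the actual rules are extracted from hearing data.
BOTTLENECK_KEYWORDS = {
    "SUMMONS": ["summons not served", "notice not served"],
    "DEPENDENT": ["connected matter", "awaiting lower court"],
    "PARTY": ["party absent", "no representation"],
    "DOCUMENT": ["records awaited", "documents not filed"],
}

def classify_ripeness(purpose_text: str):
    """Return (is_ripe, bottleneck_type) for a hearing-purpose string."""
    text = purpose_text.lower()
    for bottleneck, keywords in BOTTLENECK_KEYWORDS.items():
        if any(kw in text for kw in keywords):
            return False, bottleneck
    return True, None
```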
46
-
47
- ### 3. Simulation Engine (`scheduler/simulation/`)
48
- - **Discrete Event Simulation**: Configurable horizon (30-384+ days)
49
- - **Stochastic Modeling**: Realistic adjournments and disposal rates
50
- - **Multi-Courtroom**: 5 courtrooms with dynamic load-balanced allocation
51
- - **Policies**: FIFO, Age-based, Readiness-based, RL-based scheduling
52
- - **Performance Comparison**: Direct policy evaluation framework
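The stochastic hearing outcomes can be illustrated with a minimal day loop; the rates and capacity below are illustrative, not the engine's actual implementation:

```python
import random

def run_day(queue, capacity, adjourn_rate, disposal_rate, rng):
    """One simulated court day: hear up to `capacity` cases; each hearing is
    adjourned, disposed, or simply progresses, by a single stochastic draw."""
    heard = queue[:capacity]
    disposed, adjourned = [], []
    for case in heard:
        r = rng.random()
        if r < adjourn_rate:
            adjourned.append(case)
        elif r < adjourn_rate + disposal_rate:
            disposed.append(case)
    return heard, disposed, adjourned

rng = random.Random(42)  # fixed seed for reproducibility
heard, disposed, adjourned = run_day(list(range(200)), 151, 0.31, 0.20, rng)
```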
53
-
54
- ### 4. Reinforcement Learning (`rl/`)
55
- - **Tabular Q-Learning**: 6D state space for case prioritization
56
- - **Hybrid Architecture**: RL prioritization with rule-based constraints
57
- - **Training Pipeline**: Configurable episodes and learning parameters
58
- - **Performance**: 52.1% disposal rate (parity with 51.9% baseline)
59
- - **Configuration Management**: JSON-based training profiles and parameter overrides
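The tabular Q-learning core can be sketched as follows; the state encoding and hyperparameter defaults are assumptions, not the repository's `rl/` implementation:

```python
import random
from collections import defaultdict

class TabularQAgent:
    """Minimal tabular Q-learning sketch. The real 6D state features and
    hybrid rule-based constraints live in `rl/`; this is illustrative."""

    def __init__(self, n_actions, lr=0.15, gamma=0.95, epsilon=0.4):
        self.q = defaultdict(float)  # (state, action) -> value
        self.n_actions = n_actions
        self.lr, self.gamma, self.epsilon = lr, gamma, epsilon

    def act(self, state):
        # Epsilon-greedy selection over the discrete state.
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        return max(range(self.n_actions), key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning temporal-difference update.
        best_next = max(self.q[(next_state, a)] for a in range(self.n_actions))
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.lr * (td_target - self.q[(state, action)])
```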
60
-
61
- ### 5. Case Management (`scheduler/core/`)
62
- - Case entity with lifecycle tracking
63
- - Ripeness status and bottleneck reasons
64
- - No-case-left-behind tracking
65
- - Hearing history and stage progression
66
-
67
- ## Features
68
-
69
- - **Interactive Data Exploration**: Plotly-powered visualizations with filtering
70
- - **Case Analysis**: Distribution, disposal times, and patterns by case type
71
- - **Hearing Patterns**: Stage progression and judicial assignment analysis
72
- - **Temporal Analysis**: Yearly, monthly, and weekly hearing patterns
73
- - **Judge Analytics**: Assignment patterns and workload distribution
74
- - **Filter Controls**: Dynamic filtering by case type and year range
75
-
76
- ## Quick Start
77
-
78
- ### Interactive Dashboard (Primary Interface)
79
-
80
- **For submission/demo, use the dashboard - it's fully self-contained:**
81
 
82
  ```bash
83
- # Launch dashboard
84
  uv run streamlit run scheduler/dashboard/app.py
85
-
86
- # Open browser to http://localhost:8501
87
  ```
88
 
89
- **The dashboard handles everything:**
90
- 1. Run EDA pipeline (processes raw data, extracts parameters, generates visualizations)
91
- 2. Explore historical data and parameters
92
- 3. Test ripeness classification
93
- 4. Generate cases and run simulations
94
- 5. Review cause lists with judge override capability
95
- 6. Train RL models
96
- 7. Compare performance and generate reports
97
 
98
- **No CLI commands required** - everything is accessible through the web interface.
99
 
100
- ### Alternative: Command Line Interface
101
 
102
- For automation or scripting, all operations are available via the CLI:
103
 
104
  ```bash
105
- # See all available commands
106
  uv run court-scheduler --help
107
 
108
- # Run full workflow (generate cases + simulate)
109
  uv run court-scheduler workflow --cases 10000 --days 384
110
  ```
111
 
112
- ### Common Operations
113
-
114
- **1. Run EDA Pipeline** (extract parameters from historical data):
115
- ```bash
116
- uv run court-scheduler eda
117
- ```
118
-
119
- **2. Generate Test Cases**:
120
- ```bash
121
- uv run court-scheduler generate --cases 10000 --output data/cases.csv
122
- ```
123
-
124
- **3. Run Simulation**:
125
- ```bash
126
- uv run court-scheduler simulate --cases data/cases.csv --days 384 --policy readiness
127
- ```
128
-
129
- **4. Train RL Agent** (optional enhancement):
130
- ```bash
131
- uv run court-scheduler train --episodes 20 --output models/agent.pkl
132
- ```
133
-
134
- **5. Full Workflow** (end-to-end):
135
- ```bash
136
- uv run court-scheduler workflow --cases 10000 --days 384 --output results/
137
- ```
138
-
139
- See [HACKATHON_SUBMISSION.md](HACKATHON_SUBMISSION.md) for detailed submission instructions.
140
-
141
- ### Advanced Usage
142
-
143
- <details>
144
- <summary>Click for configuration and customization options</summary>
145
-
146
- #### Using Configuration Files
147
-
148
- ```bash
149
- # Generate with custom config
150
- uv run court-scheduler generate --config configs/generate_config.toml
151
-
152
- # Simulate with custom config
153
- uv run court-scheduler simulate --config configs/simulate_config.toml
154
- ```
155
-
156
- #### Interactive Mode
157
-
158
- ```bash
159
- # Prompt for all parameters
160
- uv run court-scheduler simulate --interactive
161
- ```
162
-
163
- #### Custom Parameters
164
-
165
- ```bash
166
- # Training with custom hyperparameters
167
- uv run court-scheduler train \
168
- --episodes 50 \
169
- --cases 200 \
170
- --lr 0.15 \
171
- --epsilon 0.4 \
172
- --output models/custom_agent.pkl
173
-
174
- # Simulation with specific settings
175
- uv run court-scheduler simulate \
176
- --cases data/cases.csv \
177
- --days 730 \
178
- --policy readiness \
179
- --seed 42 \
180
- --log-dir outputs/long_run
181
- ```
182
-
183
- #### Policy Comparison
184
-
185
- ```bash
186
- # Run with different policies
187
- uv run court-scheduler simulate --policy fifo --log-dir outputs/fifo_run
188
- uv run court-scheduler simulate --policy age --log-dir outputs/age_run
189
- uv run court-scheduler simulate --policy readiness --log-dir outputs/readiness_run
190
- ```
191
-
192
- </details>
193
-
194
- ## CLI Reference
195
-
196
- All commands follow the pattern: `uv run court-scheduler <command> [options]`
197
-
198
- | Command | Description | Key Options |
199
- |---------|-------------|-------------|
200
- | `eda` | Run EDA pipeline | `--skip-clean`, `--skip-viz`, `--skip-params` |
201
- | `generate` | Generate test cases | `--cases`, `--start`, `--end`, `--output` |
202
- | `simulate` | Run simulation | `--cases`, `--days`, `--policy`, `--log-dir` |
203
- | `train` | Train RL agent | `--episodes`, `--lr`, `--epsilon`, `--output` |
204
- | `workflow` | Full pipeline | `--cases`, `--days`, `--output` |
205
- | `version` | Show version | - |
206
-
207
- For detailed options: `uv run court-scheduler <command> --help`
208
-
209
- ## Recent Improvements
210
-
211
- ### RL Training Gap Fixes
212
-
213
- Two critical gaps in the RL training system have been identified and fixed:
214
-
215
- **1. EDA Parameter Alignment**
216
- - **Issue**: Training environment used hardcoded probabilities (0.7, 0.6, 0.4) instead of EDA-derived parameters
217
- - **Fix**: Integrated ParameterLoader into RLTrainingEnvironment to use data-driven parameters
218
- - **Validation**: Adjournment rates now align within 1% of EDA-derived values (43.0% vs 42.3%)
219
- - **Impact**: Training now matches evaluation dynamics, improving policy generalization
220
-
221
- **2. Ripeness Feedback Loop**
222
- - **Issue**: Ripeness classification used static keyword/stage heuristics with no feedback mechanism
223
- - **Fix**: Created RipenessMetrics and RipenessCalibrator for dynamic threshold adjustment
224
- - **Components**:
225
- - `scheduler/monitoring/ripeness_metrics.py`: Tracks predictions vs outcomes, computes confusion matrix
226
- - `scheduler/monitoring/ripeness_calibrator.py`: Analyzes metrics and suggests threshold adjustments
227
- - Enhanced `RipenessClassifier` with `set_thresholds()` and `get_current_thresholds()` methods
228
- - **Impact**: Enables continuous improvement of ripeness classification accuracy based on real outcomes
229
-
230
- These fixes ensure that RL training is reproducible, aligned with evaluation conditions, and benefits from adaptive ripeness detection that learns from historical data.
231
-
232
- ## Key Insights
233
-
234
- ### Data Characteristics
235
- - **Case Types**: 8 civil case categories (RSA, CRP, RFA, CA, CCC, CP, MISC.CVL, CMP)
236
- - **Disposal Times**: Significant variation by case type and complexity
237
- - **Hearing Stages**: Primary stages include ADMISSION, ORDERS/JUDGMENT, and OTHER
238
- - **Judge Assignments**: Mix of single and multi-judge benches
239
-
240
- ### Scheduling Implications
241
- - Different case types require different handling strategies
242
- - Historical judge assignment patterns can inform scheduling preferences
243
- - Clear temporal patterns in hearing schedules
244
- - Multiple hearing stages requiring different resource allocation
245
-
246
- ## Current Results (Latest Simulation)
247
-
248
- ### Performance Metrics
249
- - **Cases Scheduled**: 97.7% (9,766/10,000 cases)
250
- - **Disposal Rate**: 81.4% (significantly above baseline)
251
- - **Adjournment Rate**: 31.1% (realistic, within expected range)
252
- - **Courtroom Balance**: Gini 0.002 (perfect load distribution)
253
- - **Utilization**: 45.0% (sustainable with realistic constraints)
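The Gini figures above can be reproduced for any load vector with a short helper (a standard Gini computation, not the repository's metrics code):

```python
def gini(loads):
    """Gini coefficient of per-courtroom hearing loads (0 = perfectly equal)."""
    xs = sorted(loads)
    n, total = len(xs), sum(xs)
    if total == 0:
        return 0.0
    # Standard formula via rank-weighted cumulative sums of the sorted loads.
    cum = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * cum) / (n * total) - (n + 1) / n
```

A perfectly even split across five courtrooms gives 0.0; piling every hearing into one courtroom gives 0.8.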
254
-
255
- ### Disposal Rates by Case Type
256
- | Type | Disposed | Total | Rate | Performance |
257
- |------|----------|-------|------|-------------|
258
- | CP | 833 | 963 | 86.5% | Excellent |
259
- | CMP | 237 | 275 | 86.2% | Excellent |
260
- | CA | 1,676 | 1,949 | 86.0% | Excellent |
261
- | CCC | 978 | 1,147 | 85.3% | Excellent |
262
- | CRP | 1,750 | 2,062 | 84.9% | Excellent |
263
- | RSA | 1,488 | 1,924 | 77.3% | Good |
264
- | RFA | 1,174 | 1,680 | 69.9% | Fair |
265
-
266
- *Short-lifecycle cases (CP, CMP, CA) achieve 85%+ disposal. Complex appeals show expected lower rates due to longer processing requirements.*
267
-
268
- ## Hackathon Compliance
269
-
270
- ### Step 2: Data-Informed Modelling - COMPLETE
271
- - Analyzed 739,669 hearings for patterns
272
- - Classified cases as "ripe" vs "unripe" with bottleneck types
273
- - Developed adjournment and disposal assumptions
274
- - Proposed synthetic fields for data enrichment
275
-
276
- ### Step 3: Algorithm Development - COMPLETE
277
- - 2-year simulation operational with validated results
278
- - Stochastic case progression with realistic dynamics
279
- - Accounts for judicial working days (192/year)
280
- - Dynamic multi-courtroom allocation with perfect load balancing
281
- - Daily cause lists generated (CSV format)
282
- - User control & override system (judge approval workflow)
283
- - No-case-left-behind verification (97.7% coverage achieved)
284
 
285
- ## For Hackathon Teams
286
 
287
- ### Current Capabilities
288
- 1. **Ripeness Classification**: Data-driven bottleneck detection
289
- 2. **Realistic Simulation**: Stochastic adjournments, type-specific disposals
290
- 3. **Multiple Policies**: FIFO, age-based, readiness-based
291
- 4. **Fair Scheduling**: Gini coefficient 0.253 (low inequality)
292
- 5. **Dynamic Allocation**: Load-balanced distribution across 5 courtrooms (Gini 0.002)
293
 
294
- ### Development Status
295
- - **EDA & parameter extraction** - Complete
296
- - **Ripeness classification system** - Complete (40.8% cases filtered)
297
- - **Simulation engine with disposal logic** - Complete
298
- - **Dynamic multi-courtroom allocator** - Complete (perfect load balance)
299
- - **Daily cause list generator** - Complete (CSV export working)
300
- - **User control & override system** - Core API complete, UI pending
301
- - **No-case-left-behind verification** - Complete (97.7% coverage)
302
- - **Data gap analysis report** - Complete (8 synthetic fields proposed)
303
- - **Interactive dashboard** - Visualization components ready, UI assembly needed
304
 
305
- ## Documentation
306
 
307
- **Primary**: This README (complete user guide)
308
- **Additional**: `docs/` folder contains:
309
- - `DASHBOARD.md` - Dashboard usage and architecture
310
- - `CONFIGURATION.md` - Configuration system reference
311
- - `HACKATHON_SUBMISSION.md` - Hackathon-specific submission guide
312
 
313
- **Scripts**: See `scripts/README.md` for analysis utilities
 
 
1
  # Code4Change: Intelligent Court Scheduling System
2
 
3
+ Purpose-built for hackathon evaluation. This repository runs out of the box using the Streamlit dashboard and the uv tool.
4
 
5
+ ## Requirements
6
 
7
+ - Python 3.11+
8
+ - uv (required)
9
+ - macOS/Linux: `curl -LsSf https://astral.sh/uv/install.sh | sh`
10
+ - Windows (PowerShell): `irm https://astral.sh/uv/install.ps1 | iex`
11
 
12
+ ## Quick Start (Dashboard)
13
 
14
+ 1. Install uv (see above) and ensure Python 3.11+ is available.
15
+ 2. Clone this repository.
16
+ 3. Launch the dashboard:
 
17
 
18
  ```bash
 
19
  uv run streamlit run scheduler/dashboard/app.py
 
 
20
  ```
21
 
22
+ Then open http://localhost:8501 in your browser.
23
 
24
+ The dashboard provides:
25
+ - Run EDA pipeline (process raw data and extract parameters)
26
+ - Explore data and parameters
27
+ - Generate cases and run simulations
28
+ - Review cause lists and judge overrides
29
+ - Compare performance and export reports
30
 
31
+ ## Command Line (optional)
32
 
33
+ All operations are available via CLI as well:
34
 
35
  ```bash
 
36
  uv run court-scheduler --help
37
 
38
+ # End-to-end workflow
39
  uv run court-scheduler workflow --cases 10000 --days 384
40
  ```
41
 
42
+ For a detailed walkthrough tailored for judges, see `docs/HACKATHON_SUBMISSION.md`.
43
 
44
+ ## Data (DuckDB-first)
45
 
46
+ This repository uses a DuckDB snapshot as the canonical raw dataset.
47
 
48
+ - Preferred source: `Data/court_data.duckdb` (tables: `cases`, `hearings`). If this file is present, the EDA step will load directly from it.
49
+ - CSV fallback: If the DuckDB file is missing, place the two organizer CSVs in `Data/` with the exact names below and the EDA step will load them automatically:
50
+ - `ISDMHack_Cases_WPfinal.csv`
51
+ - `ISDMHack_Hear.csv`
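+ The DuckDB-first loading with CSV fallback described above can be sketched as follows (the function name `load_raw_data` is hypothetical; the table and file names come from this README):

```python
from pathlib import Path
import pandas as pd

def load_raw_data(data_dir="Data"):
    """Load (cases, hearings) from the DuckDB snapshot, else the organizer CSVs."""
    db = Path(data_dir) / "court_data.duckdb"
    if db.exists():
        import duckdb  # imported lazily so the CSV path works without it
        con = duckdb.connect(str(db), read_only=True)
        try:
            cases = con.execute("SELECT * FROM cases").df()
            hearings = con.execute("SELECT * FROM hearings").df()
        finally:
            con.close()
        return cases, hearings
    # CSV fallback using the exact organizer file names
    cases = pd.read_csv(Path(data_dir) / "ISDMHack_Cases_WPfinal.csv")
    hearings = pd.read_csv(Path(data_dir) / "ISDMHack_Hear.csv")
    return cases, hearings
```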
52
 
53
+ No manual pre-processing is required; launch the dashboard and click “Run EDA Pipeline.”
54
 
55
+ ## Notes
56
 
57
+ - This submission intentionally focuses on the end-to-end demo path. Internal development notes, enhancements, and bug fix logs have been removed from the README.
58
+ - uv is enforced by the dashboard for a consistent, reproducible environment.
SUBMISSION_READINESS_AUDIT.md DELETED
@@ -1,313 +0,0 @@
1
- # Submission Readiness Audit - Critical Workflow Analysis
2
-
3
- **Date**: November 29, 2025
4
- **Purpose**: Validate that EVERY user action can be completed through dashboard
5
- **Goal**: Win the hackathon by ensuring zero gaps in functionality
6
-
7
- ---
8
-
9
- ## Audit Methodology
10
-
11
- Simulating a fresh user experience with ONLY:
12
- 1. Raw data files (cases CSV, hearings CSV)
13
- 2. Code repository
14
- 3. Dashboard interface
15
-
16
- **NO pre-generated files, NO CLI usage, NO manual configuration**
17
-
18
- ---
19
-
20
- ## 🔴 CRITICAL GAPS FOUND
21
-
22
- ### GAP 1: Simulation Workflow - Policy Selection ✅ EXISTS
23
- **Location**: `3_Simulation_Workflow.py` (confirmed working)
24
- **Status**: ✅ IMPLEMENTED
25
- - User can select: FIFO, Age-based, Readiness, RL-based
26
- - RL requires trained model (handles gracefully)
27
-
28
- ### GAP 2: Simulation Configuration Values ✅ EXISTS
29
- **Location**: `3_Simulation_Workflow.py`
30
- **Status**: ✅ IMPLEMENTED
31
- **User Controls**:
32
- - Number of days to simulate
33
- - Number of courtrooms
34
- - Daily capacity per courtroom
35
- - Random seed
36
- - Policy selection
37
-
38
- ### GAP 3: Case Generation ✅ EXISTS
39
- **Location**: `3_Simulation_Workflow.py` Step 1
40
- **Status**: ✅ IMPLEMENTED
41
- **Options**:
42
- - Generate synthetic cases (with configurable parameters)
43
- - Upload CSV
44
- **Parameters exposed**:
45
- - Number of cases
46
- - Filing date range
47
- - Random seed
48
- - Output location
49
-
50
- ### GAP 4: RL Training ❓ NEEDS VERIFICATION
51
- **Location**: `3_RL_Training.py`
52
- **Questions**:
53
- - Can user train RL model from dashboard?
54
- - Can they configure hyperparameters (episodes, learning rate, epsilon)?
55
- - Can they save/load models?
56
- - How do they use trained model in simulation?
57
-
58
- ### GAP 5: Cause List Review & Override ❓ NEEDS VERIFICATION
59
- **Location**: `4_Cause_Lists_And_Overrides.py`
60
- **Questions**:
61
- - Can user view generated cause lists after simulation?
62
- - Can they modify case order (drag-and-drop)?
63
- - Can they remove/add cases?
64
- - Can they approve/reject algorithmic suggestions?
65
- - Is there an audit trail?
66
-
67
- ### GAP 6: Performance Comparison ❓ NEEDS VERIFICATION
68
- **Location**: `6_Analytics_And_Reports.py`
69
- **Questions**:
70
- - Can user compare multiple simulation runs?
71
- - Can they see fairness metrics (Gini coefficient)?
72
- - Can they export reports?
73
- - Can they identify which policy performed best?
74
-
75
- ### GAP 7: Ripeness Classifier Tuning ✅ EXISTS
76
- **Location**: `2_Ripeness_Classifier.py`
77
- **Status**: ✅ IMPLEMENTED (based on notebook context)
78
- - Interactive threshold adjustment
79
- - Test on sample cases
80
- - Batch classification
81
-
82
- ---
83
-
84
- ## 🔍 DETAILED VERIFICATION NEEDED
85
-
86
- ### Must Check: 3_RL_Training.py
87
- **Required Features**:
88
- - [ ] Training configuration form (episodes, LR, epsilon, gamma)
89
- - [ ] Start training button
90
- - [ ] Progress indicator during training
91
- - [ ] Save trained model with name
92
- - [ ] Load existing model for comparison
93
- - [ ] Model performance metrics
94
- - [ ] Link to use model in Simulation Workflow
95
-
96
- **If Missing**: User cannot train RL agent through dashboard
97
-
98
- ### Must Check: 4_Cause_Lists_And_Overrides.py
99
- **Required Features**:
100
- - [ ] Load cause lists from simulation output
101
- - [ ] Display: date, courtroom, scheduled cases
102
- - [ ] Override interface:
103
- - [ ] Reorder cases (drag-and-drop or priority input)
104
- - [ ] Remove case from list
105
- - [ ] Add case to list (from queue)
106
- - [ ] Mark ripeness override
107
- - [ ] Approve final list
108
- - [ ] Audit trail: who changed what, when
109
- - [ ] Export approved cause lists
110
-
111
- **If Missing**: Core hackathon requirement (judge control) not demonstrable
112
-
113
- ### Must Check: 6_Analytics_And_Reports.py
114
- **Required Features**:
115
- - [ ] List all simulation runs
116
- - [ ] Select runs to compare
117
- - [ ] Side-by-side metrics:
118
- - [ ] Disposal rate
119
- - [ ] Adjournment rate
120
- - [ ] Courtroom utilization
121
- - [ ] Fairness (Gini coefficient)
122
- - [ ] Cases scheduled vs abandoned
123
- - [ ] Charts: performance over time
124
- - [ ] Export comparison report (PDF/CSV)
125
-
126
- **If Missing**: Cannot demonstrate algorithmic improvements or validate claims
127
-
128
- ---
129
-
130
- ## 🎯 WINNING CRITERIA CHECKLIST
131
-
132
- ### Data-Informed Modelling (Step 2)
133
- - [x] EDA pipeline button in dashboard
134
- - [x] Ripeness classification interactive tuning
135
- - [x] Historical pattern visualizations
136
- - [ ] **VERIFY**: Can user see extracted parameters clearly?
137
-
138
- ### Algorithm Development (Step 3)
139
- - [x] Multi-policy simulation available
140
- - [x] Configurable simulation parameters
141
- - [ ] **VERIFY**: Cause list generation automatic?
142
- - [ ] **CRITICAL**: Judge override system demonstrable?
143
- - [ ] **VERIFY**: No-case-left-behind metrics shown?
144
-
145
- ### Fair Scheduling
146
- - [ ] **VERIFY**: Gini coefficient displayed in results?
147
- - [ ] **VERIFY**: Fairness comparison across policies?
148
- - [ ] **VERIFY**: Case age distribution shown?
149
-
150
- ### User Control & Transparency
151
- - [ ] **CRITICAL**: Override interface working?
152
- - [ ] **VERIFY**: Algorithm explainability (why case scheduled/rejected)?
153
- - [ ] **VERIFY**: Audit trail of all decisions?
154
-
155
- ### Production Readiness
156
- - [x] Self-contained dashboard (no CLI needed)
157
- - [x] EDA on-demand generation
158
- - [x] Case generation on-demand
159
- - [ ] **VERIFY**: End-to-end workflow completable?
160
- - [ ] **VERIFY**: All outputs exportable (CSV/PDF)?
161
-
162
- ---
163
-
164
- ## 🚨 HIGH-RISK GAPS (Potential Show-Stoppers)
165
-
166
- ### 1. Judge Override System
167
- **Risk**: If not working, fails core hackathon requirement
168
- **Impact**: Cannot demonstrate judicial autonomy
169
- **Action**: MUST verify `4_Cause_Lists_And_Overrides.py` has full CRUD operations
170
-
171
- ### 2. RL Model Training Loop
172
- **Risk**: If training only works via CLI, breaks "dashboard-only" claim
173
- **Impact**: Cannot demonstrate RL capability in live demo
174
- **Action**: MUST verify `3_RL_Training.py` can train AND use model in sim
175
-
176
- ### 3. Performance Comparison
177
- **Risk**: If cannot compare policies, cannot prove algorithmic value
178
- **Impact**: No evidence of improvement over baseline
179
- **Action**: MUST verify `6_Analytics_And_Reports.py` shows metrics comparison
180
-
181
- ### 4. Cause List Export
182
- **Risk**: If cannot export final cause lists, not "production ready"
183
- **Impact**: Cannot demonstrate deployment readiness
184
- **Action**: MUST verify CSV/PDF export from cause lists page
185
-
186
- ---
187
-
188
- ## 📋 NEXT STEPS (Priority Order)
189
-
190
- ### IMMEDIATE (P0 - Do Now)
191
- 1. **Read full content of**:
192
- - `3_RL_Training.py` (lines 1-end)
193
- - `4_Cause_Lists_And_Overrides.py` (lines 1-end)
194
- - `6_Analytics_And_Reports.py` (lines 1-end)
195
-
196
- 2. **Verify each gap** listed above
197
-
198
- 3. **For each missing feature, decide**:
199
- - Implement now (if < 30 min)
200
- - Create placeholder with "Coming Soon" (if > 30 min)
201
- - Document as limitation (if not critical)
202
-
203
- ### HIGH (P1 - Do Today)
204
- 4. **Test complete workflow as user would**:
205
- - Fresh launch → EDA → Generate cases → Simulate → View results → Export
206
- - Identify ANY point where user gets stuck
207
-
208
- 5. **Create user guide** in dashboard:
209
- - Step-by-step workflow
210
- - Expected processing times
211
- - What each button does
212
-
213
- ### MEDIUM (P2 - Nice to Have)
214
- 6. **Add progress indicators**:
215
- - EDA pipeline: "Processing 739K hearings... 45%"
216
- - Case generation: "Generated 5,000 / 10,000"
217
- - Simulation: "Day 120 / 384"
218
-
219
- 7. **Add data validation**:
220
- - Check if EDA output exists before allowing simulation
221
- - Warn if parameters seem unrealistic
222
-
223
- ---
224
-
225
- ## 🏆 SUBMISSION CHECKLIST
226
-
227
- Before submission, user should be able to (with ZERO CLI):
228
-
229
- ### Setup (One Time)
230
- - [ ] Launch dashboard
231
- - [ ] Click "Run EDA" button
232
- - [ ] Wait 2-5 minutes
233
- - [ ] See "EDA Complete" message
234
-
235
- ### Generate Cases
236
- - [ ] Go to "Simulation Workflow"
237
- - [ ] Enter: 10,000 cases, 2022-2023 date range
238
- - [ ] Click "Generate"
239
- - [ ] See "Generation Complete"
240
-
241
- ### Run Simulation
242
- - [ ] Configure: 384 days, 5 courtrooms, Readiness policy
243
- - [ ] Click "Run Simulation"
244
- - [ ] See progress bar
245
- - [ ] View results: disposal rate, Gini, utilization
246
-
247
- ### Judge Override
248
- - [ ] Go to "Cause Lists & Overrides"
249
- - [ ] Select a date and courtroom
250
- - [ ] See algorithm-suggested cause list
251
- - [ ] Reorder 2 cases (or add/remove)
252
- - [ ] Click "Approve"
253
- - [ ] See confirmation
254
-
255
- ### Performance Analysis
256
- - [ ] Go to "Analytics & Reports"
257
- - [ ] See list of past simulation runs
258
- - [ ] Select 2 runs (FIFO vs Readiness)
259
- - [ ] View comparison: disposal rates, fairness
260
- - [ ] Export comparison as CSV
261
-
262
- ### Train RL (Optional)
263
- - [ ] Go to "RL Training"
264
- - [ ] Configure: 20 episodes, 0.15 LR
265
- - [ ] Click "Train"
266
- - [ ] See training progress
267
- - [ ] Save model as "my_agent.pkl"
268
-
269
- ### Use RL Model
270
- - [ ] Go to "Simulation Workflow"
271
- - [ ] Select policy: "RL-based"
272
- - [ ] Select model: "my_agent.pkl"
273
- - [ ] Run simulation
274
- - [ ] Compare with baseline
275
-
276
- **If ANY step above fails or requires CLI, THAT IS A CRITICAL GAP.**
277
-
278
- ---
279
-
280
- ## 💡 RECOMMENDATIONS
281
-
282
- ### If Gaps Found:
283
- 1. **Critical gaps (override system)**: Implement immediately, even if basic
284
- 2. **Important gaps (RL training)**: Add "Coming Soon" notice + CLI fallback instructions
285
- 3. **Nice-to-have gaps**: Document as future enhancement
286
-
287
- ### If Time Allows:
288
- - Add tooltips explaining every parameter
289
- - Add "Example Workflow" guided tour
290
- - Add validation warnings (e.g., "10,000 cases with 5 days simulation seems short")
291
- - Add dashboard tour on first launch
292
-
293
- ### Communication Strategy:
294
- - If feature incomplete: "This shows RL training interface. For full training, use CLI: `uv run court-scheduler train`"
295
- - If feature works: "Fully interactive - no CLI needed"
296
- - Always emphasize: "Dashboard is primary interface, CLI is for automation"
297
-
298
- ---
299
-
300
- ## ✅ VERIFICATION PROTOCOL
301
-
302
- For EACH page, answer:
303
- 1. **Can user complete the task without leaving dashboard?**
304
- 2. **Are all configuration options exposed?**
305
- 3. **Is there clear feedback on success/failure?**
306
- 4. **Can user export/save results?**
307
- 5. **Is there a "Next Step" button to guide workflow?**
308
-
309
- If ANY answer is "No", that's a gap.
310
-
311
- ---
312
-
313
- **Next Action**: Read remaining dashboard pages and fill in verification checkboxes above.
docs/CONFIGURATION.md CHANGED
@@ -1,179 +1,20 @@
1
- # Configuration Architecture
2
 
3
- ## Overview
4
- The codebase uses a layered configuration approach separating concerns by domain and lifecycle.
5
 
6
- ## Configuration Layers
 
7
 
8
- ### 1. Domain Constants (`scheduler/data/config.py`)
9
- **Purpose**: Immutable domain knowledge that never changes.
10
 
11
- **Contains**:
12
- - `STAGES` - Legal case lifecycle stages from domain knowledge
13
- - `TERMINAL_STAGES` - Stages indicating case disposal
14
- - `CASE_TYPES` - Valid case type taxonomy
15
- - `CASE_TYPE_DISTRIBUTION` - Historical distribution from EDA
16
- - `WORKING_DAYS_PER_YEAR` - Court calendar constant (192 days)
17
-
18
- **When to use**: Values derived from legal/institutional domain that are facts, not tunable parameters.
19
-
20
- ### 2. RL Training Configuration (`rl/config.py`)
21
- **Purpose**: Hyperparameters affecting RL agent learning behavior.
22
-
23
- **Class**: `RLTrainingConfig`
24
-
25
- **Parameters**:
26
- - `episodes`: Number of training episodes
27
- - `cases_per_episode`: Cases generated per episode
28
- - `episode_length_days`: Simulation horizon per episode
29
- - `learning_rate`: Q-learning alpha parameter
30
- - `discount_factor`: Q-learning gamma parameter
31
- - `initial_epsilon`: Starting exploration rate
32
- - `epsilon_decay`: Exploration decay factor
33
- - `min_epsilon`: Minimum exploration threshold
34
-
35
- **Presets**:
36
- - `DEFAULT_RL_TRAINING_CONFIG` - Standard training (100 episodes)
37
- - `QUICK_DEMO_RL_CONFIG` - Fast testing (20 episodes)
38
-
39
- **When to use**: Experimenting with RL training convergence and exploration strategies.
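- The exploration schedule implied by `initial_epsilon`, `epsilon_decay`, and `min_epsilon` is, in a typical implementation (a sketch, not the repository's exact code):

```python
def epsilon_schedule(episode, initial_epsilon=0.4, epsilon_decay=0.95, min_epsilon=0.05):
    """Exponentially decayed exploration rate, floored at `min_epsilon`."""
    return max(min_epsilon, initial_epsilon * epsilon_decay ** episode)
```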
40
-
41
- ### 3. Policy Configuration (`rl/config.py`)
42
- **Purpose**: Policy-specific filtering and prioritization behavior.
43
-
44
- **Class**: `PolicyConfig`
45
-
46
- **Parameters**:
47
- - `min_gap_days`: Minimum days between hearings (fairness constraint)
48
- - `max_gap_alert_days`: Maximum gap before triggering alerts
49
- - `old_case_threshold_days`: Age threshold for priority boost
50
- - `skip_unripe_cases`: Whether to filter unripe cases
51
- - `allow_old_unripe_cases`: Allow scheduling very old unripe cases
52
-
53
- **When to use**: Tuning policy filtering logic without changing core algorithm.
54
-
55
- ### 4. Simulation Configuration (`scheduler/simulation/engine.py`)
56
- **Purpose**: Per-simulation operational parameters.
57
-
58
- **Class**: `CourtSimConfig`
59
-
60
- **Parameters**:
61
- - `start`: Simulation start date
62
- - `days`: Duration in days
63
- - `seed`: Random seed for reproducibility
64
- - `courtrooms`: Number of courtrooms to simulate
65
- - `daily_capacity`: Cases per courtroom per day
66
- - `policy`: Scheduling policy name (`fifo`, `age`, `readiness`, `rl`)
67
- - `duration_percentile`: EDA percentile for stage durations
68
- - `rl_agent_path`: Path to trained RL model (required if `policy="rl"`)
69
- - `log_dir`: Output directory for metrics
70
-
71
- **Validation**: `__post_init__` validates RL requirements and path types.
72
-
73
- **When to use**: Each simulation run (different policies, time periods, or capacities).
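- A minimal sketch of the dataclass-with-validation pattern described above (field list abbreviated; the defaults shown are assumptions, not the real ones):

```python
from dataclasses import dataclass
from datetime import date
from pathlib import Path
from typing import Optional

@dataclass
class CourtSimConfig:
    # Field names follow the parameter list above; defaults are illustrative.
    start: date
    days: int
    seed: int = 42
    courtrooms: int = 5
    daily_capacity: int = 30
    policy: str = "readiness"
    rl_agent_path: Optional[Path] = None

    def __post_init__(self):
        # Mirrors the documented check: the RL policy requires a trained model.
        if self.policy == "rl" and self.rl_agent_path is None:
            raise ValueError("policy='rl' requires rl_agent_path")
```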
74
-
75
- ### 5. CLI Configuration (`cli/config.py`)
76
- **Purpose**: Command-line interface configuration management.
77
-
78
- **Functions**:
79
- - `load_generate_config()`: Load case generation TOML config
80
- - `load_simulate_config()`: Load simulation TOML config
81
- - `load_rl_training_config()`: Load RL training TOML config
82
-
83
- **Configuration Files** (TOML format in `configs/`):
84
- - `generate_config.toml`: Case generation parameters
85
- - `simulate_config.toml`: Simulation settings
86
- - `rl_training_config.toml`: Training hyperparameters
87
-
88
- **When to use**: Customizing CLI command behavior without modifying code.
89
-
90
- ## Configuration Flow
91
-
92
- ```
93
- CLI Execution:
94
- |-- CLI Commands (cli/main.py)
95
- |-- Command Options (Typer-based)
96
- |-- Config Files (TOML in configs/)
97
-
98
- |-- Data Generation:
99
- |-- Case generation parameters
100
- |-- Date ranges and distributions
101
-
102
- |-- RL Training:
103
- |-- RLTrainingConfig (training hyperparameters)
104
- |-- Training environment settings
105
-
106
- |-- Simulation:
107
- |-- CourtSimConfig (simulation settings)
108
- |-- rl_agent_path (from training output)
109
- |-- Policy instantiation:
110
- |-- PolicyConfig (policy-specific settings)
111
- ```
112
-
113
- ## Design Principles
114
-
115
- 1. **Separation of Concerns**: Each config class owns one domain
116
- 2. **Type Safety**: Dataclasses with validation in `__post_init__`
117
- 3. **No Magic**: Explicit parameters, no hidden defaults
118
- 4. **Immutability**: Domain constants never change
119
- 5. **Composition**: Configs nest (PipelineConfig contains RLTrainingConfig)
120
-
121
- ## Examples
122
-
123
- ### Quick Demo (CLI)
124
  ```bash
125
- # Command-line options
126
- uv run court-scheduler workflow --cases 10000 --days 90
 
 
127
  ```
128
 
129
- ### Quick Demo (Programmatic)
130
- ```python
131
- from rl.config import QUICK_DEMO_RL_CONFIG
132
- from scheduler.simulation.engine import CourtSimConfig
133
-
134
- # Use preset configs directly
135
- rl_config = QUICK_DEMO_RL_CONFIG # 20 episodes
136
- ```
137
-
138
- ### Custom Training
139
- ```python
140
- from rl.config import RLTrainingConfig
141
-
142
- custom_rl = RLTrainingConfig(
143
- episodes=500,
144
- learning_rate=0.1,
145
- initial_epsilon=0.3,
146
- epsilon_decay=0.995
147
- )
148
-
149
- config = PipelineConfig(
150
- n_cases=50000,
151
- rl_training=custom_rl,
152
- sim_days=730
153
- )
154
- ```
155
-
156
- ### Policy Tuning
157
- ```python
158
- from rl.config import PolicyConfig
159
-
160
- strict_policy = PolicyConfig(
161
- min_gap_days=14, # More conservative
162
- skip_unripe_cases=True,
163
- allow_old_unripe_cases=False # Strict ripeness enforcement
164
- )
165
-
166
- # Pass to RLPolicy
167
- policy = RLPolicy(agent_path=model_path, policy_config=strict_policy)
168
- ```
169
-
170
- ## Migration Guide
171
-
172
- ### Adding New Configuration
173
- 1. Determine layer (domain constant vs. tunable parameter)
174
- 2. Add to appropriate config class
175
- 3. Update `__post_init__` validation if needed
176
- 4. Document in this file
177
 
178
  ### Deprecating Parameters
179
  1. Move to config class first (keep old path working)
 
1
+ # Configuration Guide (Consolidated)
2
 
3
+ This configuration reference has been intentionally simplified for the hackathon to keep the repository focused for judges and evaluators.
 
4
 
5
+ For the end-to-end demo and instructions, see:
6
+ - `docs/HACKATHON_SUBMISSION.md`
7
 
8
+ Advanced usage help is available via the CLI:
 
9
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
10
  ```bash
11
+ uv run court-scheduler --help
12
+ uv run court-scheduler generate --help
13
+ uv run court-scheduler simulate --help
14
+ uv run court-scheduler workflow --help
15
  ```
16
 
17
+ Note: uv is required for all commands.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
18
 
19
  ### Deprecating Parameters
20
  1. Move to config class first (keep old path working)
docs/DASHBOARD.md CHANGED
@@ -1,41 +1,17 @@
- # Interactive Dashboard
-
- **Last Updated**: 2025-11-29
- **Status**: Production Ready
- **Version**: 1.0.0
-
- ## Launch

  ```bash
  uv run streamlit run scheduler/dashboard/app.py
- # Open http://localhost:8501
  ```

- ## Pages
-
- 1. **Data & Insights** - Historical analysis of 739K+ hearings
- 2. **Ripeness Classifier** - Case bottleneck detection with explainability
- 3. **RL Training** - Train and evaluate RL scheduling agents
- 4. **Simulation Workflow** - Run simulations with configurable policies
- 5. **Cause Lists & Overrides** - Judge override interface for cause lists
- 6. **Analytics & Reports** - Performance comparison and reporting
-
- ## Workflows
-
- **EDA Exploration**: Run EDA → Launch dashboard → Filter and visualize data
- **Judge Overrides**: Launch dashboard → Simulation Workflow → Review/modify cause lists
- **RL Training**: Launch dashboard → RL Training page → Configure and train
-
- ## Data Sources
-
- - Historical data: `reports/figures/v*/cases_clean.parquet` and `hearings_clean.parquet`
- - Parameters: `reports/figures/v*/params/` (auto-detected latest version)
- - Falls back to bundled defaults if EDA not run
- - [ ] Batch classification (10K+ cases)
- - [ ] Multiple concurrent users (if deployed)
-
- ## Troubleshooting
-
- **Dashboard won't launch**: Run `uv sync` to install dependencies
- **Empty visualizations**: Run `uv run court-scheduler eda` first
- **Slow loading**: Data auto-cached after first load (1-hour TTL)

+ # Dashboard Guide (Consolidated)
+
+ This document has been simplified for the hackathon. Please use the main guide:
+
+ - See `docs/HACKATHON_SUBMISSION.md` for end-to-end demo instructions.
+
+ Quick launch:

  ```bash
  uv run streamlit run scheduler/dashboard/app.py
+ # Then open http://localhost:8501
  ```

+ Data source:
+
+ - Preferred: `Data/court_data.duckdb` (tables: `cases`, `hearings`).
+ - Fallback: place `ISDMHack_Cases_WPfinal.csv` and `ISDMHack_Hear.csv` in `Data/` if the DuckDB file is not present.
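The DuckDB-first loading order described above can be sketched as a small resolver. This is an illustrative helper (`resolve_data_source` is not part of the codebase), shown only to make the preferred/fallback rule concrete:

```python
from pathlib import Path


def resolve_data_source(data_dir: str = "Data"):
    """Pick the preferred data source: the DuckDB file if present, else the raw CSVs."""
    base = Path(data_dir)
    db_path = base / "court_data.duckdb"
    if db_path.exists():
        # A reader would then do, e.g.:
        #   duckdb.connect(str(db_path), read_only=True).execute("SELECT * FROM cases").df()
        return ("duckdb", [db_path])
    # Fallback: the two raw CSVs must carry these exact names
    return ("csv", [base / "ISDMHack_Cases_WPfinal.csv", base / "ISDMHack_Hear.csv"])
```

The helper only resolves paths; actual reading (via `duckdb` or `pandas.read_csv`) stays with the caller.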
 
docs/HACKATHON_SUBMISSION.md CHANGED
@@ -1,10 +1,10 @@
  # Hackathon Submission Guide
- ## Intelligent Court Scheduling System with Reinforcement Learning

  ### Quick Start - Hackathon Demo

  **IMPORTANT**: The dashboard is fully self-contained. You only need:
- 1. Raw data files (provided)
  2. This codebase
  3. Run the dashboard

@@ -25,7 +25,7 @@ uv run streamlit run scheduler/dashboard/app.py
  4. **Review Results**: Check "Cause Lists & Overrides" for judge override interface
  5. **Performance Analysis**: View "Analytics & Reports" for metrics comparison

- **No pre-processing required** - dashboard handles everything interactively.

  #### Alternative: CLI Workflow (for scripting)
  ```bash
@@ -53,16 +53,13 @@ uv run court-scheduler eda
  # 2. Generate synthetic cases
  uv run court-scheduler generate --cases 50000

- # 3. Train RL agent (optional)
- uv run court-scheduler train --episodes 100
-
- # 4. Run simulation
  uv run court-scheduler simulate --cases data/cases.csv --days 730 --policy readiness
  ```

  ### What the Pipeline Does

- The comprehensive pipeline executes 7 automated steps:

  **Step 1: EDA & Parameter Extraction**
  - Analyzes 739K+ historical hearings
@@ -74,52 +71,39 @@ The comprehensive pipeline executes 7 automated steps:
  - Configurable size (default: 50,000 cases)
  - Diverse case types and complexity levels

- **Step 3: RL Training**
- - Trains Tabular Q-learning agent
- - Real-time progress monitoring with reward tracking
- - Configurable episodes and hyperparameters
-
- **Step 4: 2-Year Simulation**
  - Runs 730-day court scheduling simulation
- - Compares RL agent vs baseline algorithms
  - Tracks disposal rates, utilization, fairness metrics

- **Step 5: Daily Cause List Generation**
  - Generates production-ready daily cause lists
  - Exports for all simulation days
  - Courtroom-wise scheduling details

- **Step 6: Performance Analysis**
  - Comprehensive comparison reports
  - Performance visualizations
  - Statistical analysis of all metrics

- **Step 7: Executive Summary**
  - Hackathon-ready summary document
  - Key achievements and impact metrics
  - Deployment readiness checklist

  ### Expected Output

- After completion, you'll find in your output directory:

  ```
- data/hackathon_run/
- |-- pipeline_config.json     # Full configuration used
- |-- training_cases.csv       # Generated case dataset
- |-- trained_rl_agent.pkl     # Trained RL model
- |-- EXECUTIVE_SUMMARY.md     # Hackathon submission summary
- |-- COMPARISON_REPORT.md     # Detailed performance comparison
- |-- simulation_rl/           # RL policy results
- |   |-- events.csv
- |   |-- metrics.csv
- |   |-- report.txt
- |   |-- cause_lists/
- |       |-- daily_cause_list.csv  # 730 days of cause lists
- |-- simulation_readiness/    # Baseline results
- |   |-- ...
- |-- visualizations/          # Performance charts
-     |-- performance_charts.md
  ```

  ### Hackathon Winning Features
@@ -130,11 +114,11 @@ data/hackathon_run/
  - **Multi-Courtroom Support**: Load-balanced allocation across 5+ courtrooms
  - **Scalability**: Tested with 50,000+ cases

- #### 2. Technical Innovation
- - **Reinforcement Learning**: AI-powered adaptive scheduling
- - **6D State Space**: Comprehensive case characteristic modeling
- - **Hybrid Architecture**: Combines RL intelligence with rule-based constraints
- - **Real-time Learning**: Continuous improvement through experience

  #### 3. Production Readiness
  - **Interactive CLI**: User-friendly parameter configuration
@@ -150,15 +134,7 @@ data/hackathon_run/

  ### Performance Benchmarks

- Based on comprehensive testing:
-
- | Metric | RL Agent | Baseline | Advantage |
- |--------|----------|----------|-----------|
- | Disposal Rate | 52.1% | 51.9% | +0.4% |
- | Court Utilization | 85%+ | 85%+ | Comparable |
- | Load Balance (Gini) | 0.248 | 0.243 | Comparable |
- | Scalability | 50K cases | 50K cases | Yes |
- | Adaptability | High | Fixed | High |

  ### Customization Options

@@ -174,12 +150,11 @@ uv run court-scheduler simulate --cases data/cases.csv --days 730 --policy age
  ```

  #### For Technical Evaluation
  ```bash
- # Focus on RL training quality
- uv run court-scheduler train --episodes 200 --lr 0.12 --cases 500 --output models/intensive_agent.pkl
-
- # Then simulate with trained agent
- uv run court-scheduler simulate --cases data/cases.csv --days 730 --policy rl --agent models/intensive_agent.pkl
  ```

  #### For Quick Demo/Testing
@@ -202,7 +177,6 @@ uv run court-scheduler workflow --cases 10000 --days 90

  2. **Demonstrate the Solution**
     - Run the interactive pipeline live
-    - Show real-time RL training progress
     - Display generated cause lists

  3. **Present the Results**
@@ -211,7 +185,7 @@ uv run court-scheduler workflow --cases 10000 --days 90
     - Show actual cause list files (730 days ready)

  4. **Emphasize Innovation**
-    - Reinforcement Learning for judicial scheduling (novel)
     - Production-ready from day 1 (practical)
     - Scalable to entire court system (impactful)

@@ -223,7 +197,8 @@ uv run court-scheduler workflow --cases 10000 --days 90

  ### System Requirements

- - **Python**: 3.10+ with UV
  - **Memory**: 8GB+ RAM (16GB recommended for 50K cases)
  - **Storage**: 2GB+ for full pipeline outputs
  - **Runtime**:
@@ -236,9 +211,6 @@ uv run court-scheduler workflow --cases 10000 --days 90
  **Issue**: Out of memory during simulation
  **Solution**: Reduce n_cases to 10,000-20,000 or increase system RAM

- **Issue**: RL training very slow
- **Solution**: Reduce episodes to 50 or cases_per_episode to 500
-
  **Issue**: EDA parameters not found
  **Solution**: Run `uv run court-scheduler eda` first

@@ -277,12 +249,11 @@ uv run court-scheduler workflow \
  ### Contact & Support

  For hackathon questions or technical support:
- - Review PIPELINE.md for detailed architecture
- - Check README.md for system overview
- - See rl/README.md for RL-specific documentation

  ---

  **Good luck with your hackathon submission!**

- This system represents a genuine breakthrough in applying AI to judicial efficiency. The combination of production-ready cause lists, proven performance metrics, and innovative RL architecture positions this as a compelling winning submission.

  # Hackathon Submission Guide
+ ## Intelligent Court Scheduling System

  ### Quick Start - Hackathon Demo

  **IMPORTANT**: The dashboard is fully self-contained. You only need:
+ 1. Preferred: `Data/court_data.duckdb` (included in this repo). Alternatively, place the two CSVs in `Data/` with exact names: `ISDMHack_Cases_WPfinal.csv` and `ISDMHack_Hear.csv`.
  2. This codebase
  3. Run the dashboard

  4. **Review Results**: Check "Cause Lists & Overrides" for judge override interface
  5. **Performance Analysis**: View "Analytics & Reports" for metrics comparison

+ **No pre-processing required**: EDA automatically loads `Data/court_data.duckdb` when present; if missing, it falls back to `ISDMHack_Cases_WPfinal.csv` and `ISDMHack_Hear.csv` placed in `Data/`.

  #### Alternative: CLI Workflow (for scripting)
  ```bash

  # 2. Generate synthetic cases
  uv run court-scheduler generate --cases 50000

+ # 3. Run simulation
  uv run court-scheduler simulate --cases data/cases.csv --days 730 --policy readiness
  ```

  ### What the Pipeline Does

+ The comprehensive pipeline executes 6 automated steps:

  **Step 1: EDA & Parameter Extraction**
  - Analyzes 739K+ historical hearings

  - Configurable size (default: 50,000 cases)
  - Diverse case types and complexity levels

+ **Step 3: 2-Year Simulation**
  - Runs 730-day court scheduling simulation
+ - Compares scheduling policies (FIFO, age-based, readiness)
  - Tracks disposal rates, utilization, fairness metrics

+ **Step 4: Daily Cause List Generation**
  - Generates production-ready daily cause lists
  - Exports for all simulation days
  - Courtroom-wise scheduling details

+ **Step 5: Performance Analysis**
  - Comprehensive comparison reports
  - Performance visualizations
  - Statistical analysis of all metrics

+ **Step 6: Executive Summary**
  - Hackathon-ready summary document
  - Key achievements and impact metrics
  - Deployment readiness checklist

  ### Expected Output

+ After completion, you'll find outputs under your selected run directory (created automatically; the dashboard uses `outputs/simulation_runs` by default):

  ```
+ outputs/simulation_runs/v<version>_<timestamp>/
+ |-- pipeline_config.json   # Full configuration used
+ |-- events.csv             # All scheduled events across days
+ |-- metrics.csv            # Aggregate metrics for the run
+ |-- daily_summaries.csv    # Per-day summary metrics
+ |-- cause_lists/           # Generated daily cause lists (CSV)
+ |   |-- YYYY-MM-DD.csv     # One file per simulation day
+ |-- figures/               # Optional charts (when exported)
  ```

  ### Hackathon Winning Features

  - **Multi-Courtroom Support**: Load-balanced allocation across 5+ courtrooms
  - **Scalability**: Tested with 50,000+ cases

+ #### 2. Technical Approach
+ - Data-informed simulation calibrated from historical hearings
+ - Multiple heuristic policies: FIFO, age-based, readiness-based
+ - Readiness policy enforces bottleneck/ripeness constraints
+ - Fairness metrics (e.g., Gini) and utilization tracking

  #### 3. Production Readiness
  - **Interactive CLI**: User-friendly parameter configuration

  ### Performance Benchmarks

+ Compare policies by running multiple simulations (e.g., readiness vs FIFO vs age) and reviewing disposal rate, utilization, and fairness (Gini). The Analytics & Reports dashboard page can load and compare runs side-by-side.

  ### Customization Options

  ```

  #### For Technical Evaluation
+ Focus on repeatability and fairness by comparing multiple policies and seeds:
  ```bash
+ uv run court-scheduler simulate --cases data/cases.csv --days 730 --policy readiness --seed 1
+ uv run court-scheduler simulate --cases data/cases.csv --days 730 --policy fifo --seed 1
+ uv run court-scheduler simulate --cases data/cases.csv --days 730 --policy age --seed 1
  ```
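Once several seeded runs exist, their per-run `metrics.csv` files can be collected for side-by-side review. A minimal sketch, assuming a simple key/value `metric,value` layout (the actual schema produced by the simulator may differ):

```python
import csv
from pathlib import Path


def summarize_runs(run_dirs):
    """Collect one row per run directory, merging its metrics.csv key/value pairs."""
    rows = []
    for d in run_dirs:
        path = Path(d) / "metrics.csv"
        with path.open() as f:
            # Assumed layout: header "metric,value", one metric per line
            metrics = {r["metric"]: r["value"] for r in csv.DictReader(f)}
        rows.append({"run": Path(d).name, **metrics})
    return rows
```

Feeding the resulting rows into a dataframe or table makes the policy/seed comparison a one-liner.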

  #### For Quick Demo/Testing

  2. **Demonstrate the Solution**
     - Run the interactive pipeline live
     - Display generated cause lists

  3. **Present the Results**
     - Show actual cause list files (730 days ready)

  4. **Emphasize Innovation**
+    - Data-driven readiness-based scheduling (novel for this context)
     - Production-ready from day 1 (practical)
     - Scalable to entire court system (impactful)

  ### System Requirements

+ - **Python**: 3.11+
+ - **uv**: required to run commands and the dashboard
  - **Memory**: 8GB+ RAM (16GB recommended for 50K cases)
  - **Storage**: 2GB+ for full pipeline outputs
  - **Runtime**:

  **Issue**: Out of memory during simulation
  **Solution**: Reduce n_cases to 10,000-20,000 or increase system RAM

  **Issue**: EDA parameters not found
  **Solution**: Run `uv run court-scheduler eda` first

  ### Contact & Support

  For hackathon questions or technical support:
+ - Check README.md for the system overview
+ - See this guide (docs/HACKATHON_SUBMISSION.md) for end-to-end instructions

  ---

  **Good luck with your hackathon submission!**

+ This system represents a pragmatic, data-driven approach to improving judicial efficiency. The combination of production-ready cause lists, proven performance metrics, and a transparent, judge-in-the-loop design positions this as a compelling winning submission.
scheduler/dashboard/app.py CHANGED
@@ -2,11 +2,11 @@

  This is the entry point for the Streamlit multi-page dashboard.
  Launch with: uv run court-scheduler dashboard
- Or directly: streamlit run scheduler/dashboard/app.py
  """

  from __future__ import annotations

  from pathlib import Path

  import streamlit as st
@@ -21,6 +21,21 @@ st.set_page_config(
      initial_sidebar_state="expanded",
  )

  # Main page content
  st.title("Court Scheduling System Dashboard")
  st.markdown("**Karnataka High Court - Algorithmic Decision Support for Fair Scheduling**")

  This is the entry point for the Streamlit multi-page dashboard.
  Launch with: uv run court-scheduler dashboard
  """

  from __future__ import annotations

+ import subprocess
  from pathlib import Path

  import streamlit as st

      initial_sidebar_state="expanded",
  )

+ # Enforce `uv` availability for all dashboard-triggered commands
+ try:
+     uv_check = subprocess.run(["uv", "--version"], capture_output=True, text=True)
+     if uv_check.returncode != 0:
+         raise RuntimeError(uv_check.stderr or "uv not available")
+ except Exception:
+     st.error(
+         "'uv' is required to run this dashboard's commands. Please install uv and rerun.\n\n"
+         "Install on macOS/Linux: `curl -LsSf https://astral.sh/uv/install.sh | sh`\n"
+         "Install on Windows (PowerShell): `irm https://astral.sh/uv/install.ps1 | iex`"
+     )
+     st.stop()
+
  # Main page content
  st.title("Court Scheduling System Dashboard")
  st.markdown("**Karnataka High Court - Algorithmic Decision Support for Fair Scheduling**")
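The subprocess-based `uv` check added above works; a lighter alternative worth noting is a PATH lookup via the standard library, which avoids spawning a process. This is a sketch of the alternative, not the committed implementation:

```python
import shutil


def uv_available() -> bool:
    """True when the `uv` executable can be found on PATH."""
    return shutil.which("uv") is not None
```

`shutil.which` also respects `PATHEXT` on Windows, so the same check covers both install flows shown in the error message.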
scheduler/dashboard/pages/6_Analytics_And_Reports.py CHANGED
@@ -12,6 +12,7 @@ from __future__ import annotations
  from datetime import datetime
  from pathlib import Path

  import pandas as pd
  import plotly.express as px
  import plotly.graph_objects as go
@@ -361,6 +362,70 @@ with tab3:
          with col3:
              st.metric("Max Age", f"{case_dates['age_days'].max():.0f} days")

        # Case type fairness
        if "case_type" in events_df.columns:
            st.markdown("---")
@@ -379,6 +444,62 @@ with tab3:
              fig.update_layout(height=400, xaxis_tickangle=-45)
              st.plotly_chart(fig, use_container_width=True)

  except Exception as e:
      st.error(f"Error loading events data: {e}")

  from datetime import datetime
  from pathlib import Path

+ import numpy as np
  import pandas as pd
  import plotly.express as px
  import plotly.graph_objects as go

          with col3:
              st.metric("Max Age", f"{case_dates['age_days'].max():.0f} days")

+         # Additional Fairness Metrics: Gini and Lorenz Curve
+         st.markdown("#### Inequality Metrics (Fairness)")
+
+         def _gini(values: np.ndarray) -> float:
+             v = np.asarray(values, dtype=float)
+             v = v[np.isfinite(v)]
+             v = v[v >= 0]
+             if v.size == 0:
+                 return float("nan")
+             if np.all(v == 0):
+                 return 0.0
+             v_sorted = np.sort(v)
+             n = v_sorted.size
+             cumulative = np.cumsum(v_sorted)
+             # Gini based on cumulative shares
+             gini = (n + 1 - 2 * np.sum(cumulative) / cumulative[-1]) / n
+             return float(gini)
+
+         ages = case_dates["age_days"].to_numpy()
+         gini_age = _gini(ages)
+
+         col_a, col_b = st.columns(2)
+         with col_a:
+             if np.isfinite(gini_age):
+                 st.metric("Gini (Age Inequality)", f"{gini_age:.3f}")
+             else:
+                 st.info("Gini (Age) not available")
+
+         # Lorenz curve for ages
+         with col_b:
+             try:
+                 ages_clean = ages[np.isfinite(ages)]
+                 ages_clean = ages_clean[ages_clean >= 0]
+                 if ages_clean.size > 0:
+                     ages_sorted = np.sort(ages_clean)
+                     cum_ages = np.cumsum(ages_sorted)
+                     cum_ages = np.insert(cum_ages, 0, 0)
+                     cum_pop = np.linspace(0, 1, num=cum_ages.size)
+                     lorenz = cum_ages / cum_ages[-1]
+                     fig_lorenz = go.Figure()
+                     fig_lorenz.add_trace(
+                         go.Scatter(x=cum_pop, y=lorenz, mode="lines", name="Lorenz")
+                     )
+                     fig_lorenz.add_trace(
+                         go.Scatter(
+                             x=[0, 1],
+                             y=[0, 1],
+                             mode="lines",
+                             name="Equality",
+                             line=dict(dash="dash"),
+                         )
+                     )
+                     fig_lorenz.update_layout(
+                         title="Lorenz Curve of Case Ages",
+                         xaxis_title="Cumulative share of cases",
+                         yaxis_title="Cumulative share of total age",
+                         height=350,
+                     )
+                     st.plotly_chart(fig_lorenz, use_container_width=True)
+                 else:
+                     st.info("Not enough data to plot Lorenz curve")
+             except Exception:
+                 st.info("Unable to compute Lorenz curve for current data")
+
        # Case type fairness
        if "case_type" in events_df.columns:
            st.markdown("---")

              fig.update_layout(height=400, xaxis_tickangle=-45)
              st.plotly_chart(fig, use_container_width=True)

+             # Age distribution by case type (top N by cases)
+             st.markdown("#### Age Distribution by Case Type (Top 8)")
+             try:
+                 # Map each case_id to a case_type (take the first occurrence)
+                 cid_to_type = (
+                     events_df.sort_values("date")
+                     .groupby("case_id")["case_type"]
+                     .first()
+                 )
+                 age_with_type = (
+                     case_dates[["age_days"]]
+                     .join(cid_to_type, how="left")
+                     .dropna(subset=["case_type"])  # keep only cases with type
+                 )
+                 top_types = (
+                     age_with_type["case_type"].value_counts().head(8).index.tolist()
+                 )
+                 filt = age_with_type["case_type"].isin(top_types)
+                 fig_box = px.box(
+                     age_with_type[filt],
+                     x="case_type",
+                     y="age_days",
+                     points="outliers",
+                     title="Case Age by Case Type (Top 8)",
+                     labels={"case_type": "Case Type", "age_days": "Age (days)"},
+                 )
+                 fig_box.update_layout(height=420, xaxis_tickangle=-45)
+                 st.plotly_chart(fig_box, use_container_width=True)
+
+                 # Gini by case type (Top 8)
+                 st.markdown("#### Inequality by Case Type (Gini)")
+                 gini_rows = []
+                 for ctype in top_types:
+                     vals = age_with_type.loc[
+                         age_with_type["case_type"] == ctype, "age_days"
+                     ].to_numpy()
+                     g = _gini(vals)
+                     gini_rows.append({"case_type": ctype, "gini": g})
+                 gini_df = pd.DataFrame(gini_rows).dropna()
+                 if not gini_df.empty:
+                     fig_gini = px.bar(
+                         gini_df,
+                         x="case_type",
+                         y="gini",
+                         title="Gini Coefficient by Case Type (Top 8)",
+                         labels={"case_type": "Case Type", "gini": "Gini"},
+                     )
+                     fig_gini.update_layout(
+                         height=380, xaxis_tickangle=-45, yaxis_range=[0, 1]
+                     )
+                     st.plotly_chart(fig_gini, use_container_width=True)
+                 else:
+                     st.info("Insufficient data to compute per-type Gini")
+             except Exception:
+                 st.info("Unable to compute per-type age distributions for current data")
+
  except Exception as e:
      st.error(f"Error loading events data: {e}")
505