hackathon_code4change / Court Scheduling System Implementation Plan.md
RoyAalekh's picture
feat: Complete Court Scheduling System for Code4Change Hackathon
54c8522
|
raw
history blame
14 kB

Court Scheduling System Implementation Plan

Overview

Build an intelligent judicial scheduling system for Karnataka High Court that optimizes daily cause lists across multiple courtrooms over a 2-year simulation period, balancing fairness, efficiency, and urgency.

Architecture Design

System Components

  1. Parameter Loader: Load EDA-extracted parameters (transition probs, durations, capacities)
  2. Case Generator: Synthetic case creation with realistic attributes
  3. Simulation Engine: SimPy-based discrete event simulation
  4. Scheduling Policies: Multiple algorithms (FIFO, Priority, Optimized)
  5. Metrics Tracker: Performance evaluation (fairness, efficiency, urgency)
  6. Visualization: Dashboard for monitoring and analysis

Technology Stack

  • Simulation: SimPy (discrete event simulation)
  • Optimization: OR-Tools (CP-SAT solver)
  • Data Processing: Polars, Pandas
  • Visualization: Plotly, Streamlit
  • Testing: Pytest, Hypothesis

Module Structure

scheduler/
β”œβ”€β”€ core/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ case.py              # Case entity and lifecycle
β”‚   β”œβ”€β”€ courtroom.py         # Courtroom resource
β”‚   β”œβ”€β”€ judge.py             # Judge entity
β”‚   └── hearing.py           # Hearing event
β”œβ”€β”€ data/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ param_loader.py      # Load EDA parameters
β”‚   β”œβ”€β”€ case_generator.py   # Generate synthetic cases
β”‚   └── config.py            # Configuration constants
β”œβ”€β”€ simulation/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ engine.py            # SimPy simulation engine
β”‚   β”œβ”€β”€ scheduler.py         # Base scheduler interface
β”‚   β”œβ”€β”€ policies/
β”‚   β”‚   β”œβ”€β”€ __init__.py
β”‚   β”‚   β”œβ”€β”€ fifo.py         # FIFO scheduling
β”‚   β”‚   β”œβ”€β”€ priority.py     # Priority-based
β”‚   β”‚   └── optimized.py    # OR-Tools optimization
β”‚   └── events.py           # Event handlers
β”œβ”€β”€ optimization/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ model.py            # OR-Tools model
β”‚   β”œβ”€β”€ objectives.py       # Multi-objective functions
β”‚   └── constraints.py      # Constraint definitions
β”œβ”€β”€ metrics/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ fairness.py         # Gini coefficient, age variance
β”‚   β”œβ”€β”€ efficiency.py       # Utilization, throughput
β”‚   └── urgency.py          # Readiness coverage
β”œβ”€β”€ visualization/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ dashboard.py        # Streamlit dashboard
β”‚   └── plots.py            # Plotly visualizations
└── utils/
    β”œβ”€β”€ __init__.py
    β”œβ”€β”€ distributions.py    # Probability distributions
    └── calendar.py         # Working days calculator

Implementation Phases

Phase 1: Foundation (Days 1-2) - COMPLETE

Goal: Set up infrastructure and load parameters Status: 100% complete (1,323 lines implemented) Tasks:

  1. Create module directory structure (8 sub-packages)
  2. Implement parameter loader
    • Read stage_transition_probs.csv
    • Read stage_duration.csv
    • Read court_capacity_global.json
    • Read adjournment_proxies.csv
    • Read cases_features.csv
    • Automatic latest version detection
    • Lazy loading with caching
  3. Create core entities (Case, Courtroom, Judge, Hearing)
    • Case: Lifecycle, readiness score, priority score (218 lines)
    • Courtroom: Capacity tracking, scheduling, utilization (228 lines)
    • Judge: Workload tracking, specialization, adjournment rate (167 lines)
    • Hearing: Outcome tracking, rescheduling support (134 lines)
  4. Implement working days calculator (192 days/year)
    • Weekend/holiday detection
    • Seasonality factors
    • Working days counting (217 lines)
  5. Configuration system with EDA-derived constants (115 lines) Outputs:
  • scheduler/data/param_loader.py (244 lines)
  • scheduler/data/config.py (115 lines)
  • scheduler/core/case.py (218 lines)
  • scheduler/core/courtroom.py (228 lines)
  • scheduler/core/judge.py (167 lines)
  • scheduler/core/hearing.py (134 lines)
  • scheduler/utils/calendar.py (217 lines) Quality: Type hints 100%, Docstrings 100%, Integration complete

Phase 2: Case Generation (Days 3-4)

Goal: Generate synthetic case pool for simulation Tasks:

  1. Implement case generator using historical distributions
    • Case type distribution (CRP: 20.1%, CA: 20%, etc.)
    • Filing rate (monthly inflow from temporal analysis)
    • Initial stage assignment
  2. Generate 2-year case pool (~10,000 cases)
  3. Assign readiness scores and attributes Outputs:
  • scheduler/data/case_generator.py
  • Synthetic case dataset for simulation

Phase 3: Simulation Engine (Days 5-7)

Goal: Build discrete event simulation framework Tasks:

  1. Implement SimPy environment setup
  2. Create courtroom resources (5 courtrooms)
  3. Implement case lifecycle process
    • Stage progression using transition probabilities
    • Duration sampling from distributions
    • Adjournment modeling (stochastic)
  4. Implement daily scheduling loop
  5. Add case inflow/outflow dynamics Outputs:
  • scheduler/simulation/engine.py
  • scheduler/simulation/events.py
  • Working simulation (baseline)

Phase 4: Scheduling Policies (Days 8-10)

Goal: Implement multiple scheduling algorithms Tasks:

  1. Base scheduler interface
  2. FIFO scheduler (baseline)
  3. Priority-based scheduler
    • Use case age as primary factor
    • Use case type as secondary
  4. Readiness-score scheduler
    • Use EDA-computed readiness scores
    • Apply urgency weights
  5. Compare policies on metrics Outputs:
  • scheduler/simulation/scheduler.py (interface)
  • scheduler/simulation/policies/ (implementations)
  • Performance comparison report

Phase 5: Optimization Model (Days 11-14)

Goal: Implement OR-Tools-based optimal scheduler Tasks:

  1. Define decision variables
    • hearing_slots[case, date, court] ∈ {0,1}
  2. Implement constraints
    • Daily capacity per courtroom
    • Case can only be in one court per day
    • Minimum gap between hearings
    • Stage progression requirements
  3. Implement objective functions
    • Fairness: Minimize age variance
    • Efficiency: Maximize utilization
    • Urgency: Prioritize ready cases
  4. Multi-objective optimization (weighted sum)
  5. Solve for 30-day scheduling window (rolling) Outputs:
  • scheduler/optimization/model.py
  • scheduler/optimization/objectives.py
  • scheduler/optimization/constraints.py
  • Optimized scheduling policy

Phase 6: Metrics & Validation (Days 15-16)

Goal: Comprehensive performance evaluation Tasks:

  1. Implement fairness metrics
    • Gini coefficient of disposal times
    • Age variance within case types
    • Max age tracking
  2. Implement efficiency metrics
    • Court utilization rate
    • Average disposal time
    • Throughput (cases/month)
  3. Implement urgency metrics
    • Readiness score coverage
    • High-priority case delay
  4. Compare all policies
  5. Validate against historical data Outputs:
  • scheduler/metrics/ (all modules)
  • Validation report
  • Policy comparison matrix

Phase 7: Dashboard (Days 17-18)

Goal: Interactive visualization and monitoring Tasks:

  1. Streamlit dashboard setup
  2. Real-time queue visualization
  3. Judge workload display
  4. Alert system for long-pending cases
  5. What-if scenario analysis
  6. Export capability (cause lists as PDF/CSV) Outputs:
  • scheduler/visualization/dashboard.py
  • Interactive web interface
  • User documentation

Phase 8: Polish & Documentation (Days 19-20)

Goal: Production-ready system Tasks:

  1. Unit tests (pytest)
  2. Integration tests
  3. Performance benchmarking
  4. Comprehensive documentation
  5. Example notebooks
  6. Deployment guide Outputs:
  • Test suite (90%+ coverage)
  • Documentation (README, API docs)
  • Example usage notebooks
  • Final presentation materials

Key Design Decisions

1. Hybrid Approach

Decision: Use simulation for long-term dynamics, optimization for short-term scheduling Rationale: Simulation captures stochastic nature (adjournments, case progression), optimization finds optimal daily schedules within constraints

2. Rolling Optimization Window

Decision: Optimize 30-day windows, re-optimize weekly Rationale: Balance computational cost with scheduling quality, allow for dynamic adjustments

3. Stage-Based Progression Model

Decision: Model cases as finite state machines with probabilistic transitions Rationale: Matches our EDA findings (strong stage patterns), enables realistic progression

4. Multi-Objective Weighting

Decision: Fairness (40%), Efficiency (30%), Urgency (30%) Rationale: Prioritize fairness slightly, balance with practical concerns

5. Capacity Model

Decision: Use median capacity (151 cases/court/day) with seasonal adjustment Rationale: Conservative estimate from EDA, account for vacation periods

Parameter Utilization from EDA

EDA Output Scheduler Use
stage_transition_probs.csv Case progression probabilities
stage_duration.csv Duration sampling (median, p90)
court_capacity_global.json Daily capacity constraints
adjournment_proxies.csv Hearing outcome probabilities
cases_features.csv Initial readiness scores
case_type_summary.csv Case type distributions
monthly_hearings.csv Seasonal adjustment factors
correlations_spearman.csv Feature importance weights

Assumptions Made Explicit

Court Operations

  1. Working days: 192 days/year (from Karnataka HC calendar)
  2. Courtrooms: 5 courtrooms, each with 1 judge
  3. Daily capacity: 151 hearings/court/day (median from EDA)
  4. Hearing duration: Not modeled explicitly (capacity is count-based)
  5. Case queue assignment: By case type (RSA β†’ Court 1, CRP β†’ Court 2, etc.)

Case Dynamics

  1. Filing rate: ~6,000 cases/year (derived from historical data)
  2. Disposal rate: Matches filing rate (steady-state assumption)
  3. Stage progression: Probabilistic (Markov chain from EDA)
  4. Adjournment rate: 36-48% depending on stage and case type
  5. Case readiness: Computed from hearings, gaps, and stage

Scheduling Constraints

  1. Minimum gap: 7 days between hearings for same case
  2. Maximum gap: 90 days (alert triggered)
  3. Urgent cases: 5% of pool marked urgent (jump queue)
  4. Judge preferences: Not modeled (future enhancement)
  5. Multi-judge benches: Not modeled (all single-judge)

Simplifications

  1. No lawyer availability: Assumed all advocates always available
  2. No case dependencies: Each case independent
  3. No physical constraints: Assume sufficient courtrooms/facilities
  4. Deterministic durations: Within-hearing time not modeled
  5. Perfect information: All case attributes known

Success Criteria

Fairness Metrics

  • Gini coefficient < 0.4 (disposal time inequality)
  • Age variance reduction: 20% vs FIFO baseline
  • No case unlisted > 90 days without alert

Efficiency Metrics

  • Court utilization > 85%
  • Average disposal time: Within 10% of historical median by case type
  • Throughput: Match or exceed filing rate

Urgency Metrics

  • High-readiness cases: 80% scheduled within 14 days
  • Urgent cases: 95% scheduled within 7 days
  • Alert response: 100% of flagged cases reviewed

Risk Mitigation

Technical Risks

  1. Optimization solver timeout: Use heuristics as fallback
  2. Memory constraints: Batch processing for large case pools
  3. Stochastic variability: Run multiple simulation replications

Model Risks

  1. Parameter drift: Allow manual parameter overrides
  2. Edge cases: Implement rule-based fallbacks
  3. Unexpected patterns: Continuous monitoring and adjustment

Future Enhancements

Short-term

  1. Judge preference modeling
  2. Multi-judge bench support
  3. Case dependency tracking
  4. Lawyer availability constraints

Medium-term

  1. Machine learning for duration prediction
  2. Automated parameter updates from live data
  3. Real-time integration with eCourts
  4. Mobile app for judges

Long-term

  1. Multi-court coordination (district + high court)
  2. Predictive analytics for case outcomes
  3. Resource optimization (judges, courtrooms)
  4. National deployment framework

Deliverables Checklist

  • Scheduler module (fully functional)
  • Parameter loader (tested with EDA outputs)
  • Case generator (realistic synthetic data)
  • Simulation engine (2-year simulation capability)
  • Multiple scheduling policies (FIFO, Priority, Optimized)
  • Optimization model (OR-Tools implementation)
  • Metrics framework (fairness, efficiency, urgency)
  • Dashboard (Streamlit web interface)
  • Validation report (comparison vs historical data)
  • Documentation (comprehensive)
  • Test suite (90%+ coverage)
  • Example notebooks (usage demonstrations)
  • Presentation materials (slides, demo video)

Timeline Summary

Phase Days Key Deliverable
Foundation 1-2 Parameter loader, core entities
Case Generation 3-4 Synthetic case dataset
Simulation 5-7 Working SimPy simulation
Policies 8-10 Multiple scheduling algorithms
Optimization 11-14 OR-Tools optimal scheduler
Metrics 15-16 Validation and comparison
Dashboard 17-18 Interactive visualization
Polish 19-20 Tests, docs, deployment
Total: 20 days (aggressive timeline, assumes full-time focus)

Next Immediate Actions

  1. Create scheduler module directory structure
  2. Implement parameter loader (read all EDA CSVs/JSONs)
  3. Define core entities (Case, Courtroom, Judge, Hearing)
  4. Set up development environment with uv
  5. Initialize git repository with proper .gitignore
  6. Create initial unit tests

Plan Version: 1.0 Created: 2025-11-19 Status: Ready to begin implementation