RoyAalekh committed c39a084 · 1 parent: eadbc29

Submission ready
README.md CHANGED
@@ -1,313 +1,58 @@
1
  # Code4Change: Intelligent Court Scheduling System
2
 
3
- Data-driven court scheduling system with ripeness classification, multi-courtroom simulation, and intelligent case prioritization for Karnataka High Court.
4
 
5
- ## Project Overview
6
 
7
- This project delivers a **comprehensive** court scheduling system featuring:
8
- - **EDA & Parameter Extraction**: Analysis of 739K+ hearings to derive scheduling parameters
9
- - **Ripeness Classification**: Data-driven bottleneck detection (filtering unripe cases)
10
- - **Simulation Engine**: Multi-year court operations simulation with realistic outcomes
11
- - **Multiple Scheduling Policies**: FIFO, Age-based, Readiness-based, and RL-based
12
- - **Reinforcement Learning**: Tabular Q-learning achieving performance parity with heuristics
13
- - **Load Balancing**: Dynamic courtroom allocation with low inequality
14
- - **Configurable Pipeline**: Modular training and evaluation framework
15
 
16
- ## Key Achievements
17
 
18
- **81.4% Disposal Rate** - Significantly exceeds baseline expectations
19
- **Perfect Courtroom Balance** - Gini 0.002 load distribution
20
- **97.7% Case Coverage** - Near-zero case abandonment
21
- **Smart Bottleneck Detection** - 40.8% unripe cases filtered to save judicial time
22
- **Judge Control** - Complete override system for judicial autonomy
23
- **Production Ready** - Full cause list generation and audit capabilities
24
-
25
- ## Dataset
26
-
27
- - **Cases**: 134,699 unique civil cases with 24 attributes
28
- - **Hearings**: 739,670 individual hearings with 31 attributes
29
- - **Timespan**: 2000-2025 (disposed cases only)
30
- - **Scope**: Karnataka High Court, Bangalore Bench
31
-
32
- ## System Architecture
33
-
34
- ### 1. EDA & Parameter Extraction (`src/`)
35
- - Stage transition probabilities by case type
36
- - Duration distributions (median, p90) per stage
37
- - Adjournment rates by stage and case type
38
- - Court capacity analysis (151 hearings/day median)
39
- - Case type distributions and filing patterns
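The parameter-extraction step above can be sketched with pandas; the column names (`case_id`, `hearing_date`, `stage`) are illustrative assumptions, not necessarily the repository's actual schema:

```python
import pandas as pd

def stage_transition_probs(hearings: pd.DataFrame) -> pd.DataFrame:
    """Empirical stage -> next-stage transition probabilities.

    Assumes columns `case_id`, `hearing_date`, `stage`.
    """
    df = hearings.sort_values(["case_id", "hearing_date"]).copy()
    # Next stage within the same case; a case's final hearing has none.
    df["next_stage"] = df.groupby("case_id")["stage"].shift(-1)
    df = df.dropna(subset=["next_stage"])
    counts = pd.crosstab(df["stage"], df["next_stage"])
    # Row-normalise counts into probabilities.
    return counts.div(counts.sum(axis=1), axis=0)
```

The same groupby/shift pattern extends to per-stage duration percentiles and adjournment rates.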
40
-
41
- ### 2. Ripeness Classification (`scheduler/core/ripeness.py`)
42
- - **Purpose**: Identify cases with substantive bottlenecks
43
- - **Types**: SUMMONS, DEPENDENT, PARTY, DOCUMENT
44
- - **Data-Driven**: Extracted from 739K historical hearings
45
- - **Impact**: Prevents premature scheduling of unready cases
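A minimal sketch of keyword-based bottleneck detection; the keyword map below is hypothetical, while the real classifier derives its rules and thresholds from the historical hearings:

```python
# Hypothetical keyword map; the actual rules are extracted from hearing data.
BOTTLENECK_KEYWORDS = {
    "SUMMONS": ["summons not served", "notice not served"],
    "DEPENDENT": ["connected matter", "awaiting lower court"],
    "PARTY": ["party absent", "no representation"],
    "DOCUMENT": ["records awaited", "documents not filed"],
}

def classify_ripeness(purpose_text: str):
    """Return (is_ripe, bottleneck_type) for a hearing-purpose string."""
    text = purpose_text.lower()
    for bottleneck, keywords in BOTTLENECK_KEYWORDS.items():
        if any(kw in text for kw in keywords):
            return False, bottleneck
    return True, None
```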
46
-
47
- ### 3. Simulation Engine (`scheduler/simulation/`)
48
- - **Discrete Event Simulation**: Configurable horizon (30-384+ days)
49
- - **Stochastic Modeling**: Realistic adjournments and disposal rates
50
- - **Multi-Courtroom**: 5 courtrooms with dynamic load-balanced allocation
51
- - **Policies**: FIFO, Age-based, Readiness-based, RL-based scheduling
52
- - **Performance Comparison**: Direct policy evaluation framework
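The stochastic hearing outcomes can be illustrated with a minimal day loop; the rates and capacity below are illustrative, not the engine's actual implementation:

```python
import random

def run_day(queue, capacity, adjourn_rate, disposal_rate, rng):
    """One simulated court day: hear up to `capacity` cases; each hearing is
    adjourned, disposed, or simply progresses, by a single stochastic draw."""
    heard = queue[:capacity]
    disposed, adjourned = [], []
    for case in heard:
        r = rng.random()
        if r < adjourn_rate:
            adjourned.append(case)
        elif r < adjourn_rate + disposal_rate:
            disposed.append(case)
    return heard, disposed, adjourned

rng = random.Random(42)  # fixed seed for reproducibility
heard, disposed, adjourned = run_day(list(range(200)), 151, 0.31, 0.20, rng)
```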
53
-
54
- ### 4. Reinforcement Learning (`rl/`)
55
- - **Tabular Q-Learning**: 6D state space for case prioritization
56
- - **Hybrid Architecture**: RL prioritization with rule-based constraints
57
- - **Training Pipeline**: Configurable episodes and learning parameters
58
- - **Performance**: 52.1% disposal rate (parity with 51.9% baseline)
59
- - **Configuration Management**: JSON-based training profiles and parameter overrides
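The tabular Q-learning core can be sketched as follows; the state encoding and hyperparameter defaults are assumptions, not the repository's `rl/` implementation:

```python
import random
from collections import defaultdict

class TabularQAgent:
    """Minimal tabular Q-learning sketch. The real 6D state features and
    hybrid rule-based constraints live in `rl/`; this is illustrative."""

    def __init__(self, n_actions, lr=0.15, gamma=0.95, epsilon=0.4):
        self.q = defaultdict(float)  # (state, action) -> value
        self.n_actions = n_actions
        self.lr, self.gamma, self.epsilon = lr, gamma, epsilon

    def act(self, state):
        # Epsilon-greedy selection over the discrete state.
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        return max(range(self.n_actions), key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning temporal-difference update.
        best_next = max(self.q[(next_state, a)] for a in range(self.n_actions))
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.lr * (td_target - self.q[(state, action)])
```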
60
-
61
- ### 5. Case Management (`scheduler/core/`)
62
- - Case entity with lifecycle tracking
63
- - Ripeness status and bottleneck reasons
64
- - No-case-left-behind tracking
65
- - Hearing history and stage progression
66
-
67
- ## Features
68
-
69
- - **Interactive Data Exploration**: Plotly-powered visualizations with filtering
70
- - **Case Analysis**: Distribution, disposal times, and patterns by case type
71
- - **Hearing Patterns**: Stage progression and judicial assignment analysis
72
- - **Temporal Analysis**: Yearly, monthly, and weekly hearing patterns
73
- - **Judge Analytics**: Assignment patterns and workload distribution
74
- - **Filter Controls**: Dynamic filtering by case type and year range
75
-
76
- ## Quick Start
77
-
78
- ### Interactive Dashboard (Primary Interface)
79
-
80
- **For submission/demo, use the dashboard - it's fully self-contained:**
81
 
82
  ```bash
83
- # Launch dashboard
84
  uv run streamlit run scheduler/dashboard/app.py
85
-
86
- # Open browser to http://localhost:8501
87
  ```
88
 
89
- **The dashboard handles everything:**
90
- 1. Run EDA pipeline (processes raw data, extracts parameters, generates visualizations)
91
- 2. Explore historical data and parameters
92
- 3. Test ripeness classification
93
- 4. Generate cases and run simulations
94
- 5. Review cause lists with judge override capability
95
- 6. Train RL models
96
- 7. Compare performance and generate reports
97
 
98
- **No CLI commands required** - everything is accessible through the web interface.
99
 
100
- ### Alternative: Command Line Interface
101
 
102
- For automation or scripting, all operations are available via the CLI:
103
 
104
  ```bash
105
- # See all available commands
106
  uv run court-scheduler --help
107
 
108
- # Run full workflow (generate cases + simulate)
109
  uv run court-scheduler workflow --cases 10000 --days 384
110
  ```
111
 
112
- ### Common Operations
113
-
114
- **1. Run EDA Pipeline** (extract parameters from historical data):
115
- ```bash
116
- uv run court-scheduler eda
117
- ```
118
-
119
- **2. Generate Test Cases**:
120
- ```bash
121
- uv run court-scheduler generate --cases 10000 --output data/cases.csv
122
- ```
123
-
124
- **3. Run Simulation**:
125
- ```bash
126
- uv run court-scheduler simulate --cases data/cases.csv --days 384 --policy readiness
127
- ```
128
-
129
- **4. Train RL Agent** (optional enhancement):
130
- ```bash
131
- uv run court-scheduler train --episodes 20 --output models/agent.pkl
132
- ```
133
-
134
- **5. Full Workflow** (end-to-end):
135
- ```bash
136
- uv run court-scheduler workflow --cases 10000 --days 384 --output results/
137
- ```
138
-
139
- See [HACKATHON_SUBMISSION.md](HACKATHON_SUBMISSION.md) for detailed submission instructions.
140
-
141
- ### Advanced Usage
142
-
143
- <details>
144
- <summary>Click for configuration and customization options</summary>
145
-
146
- #### Using Configuration Files
147
-
148
- ```bash
149
- # Generate with custom config
150
- uv run court-scheduler generate --config configs/generate_config.toml
151
-
152
- # Simulate with custom config
153
- uv run court-scheduler simulate --config configs/simulate_config.toml
154
- ```
155
-
156
- #### Interactive Mode
157
-
158
- ```bash
159
- # Prompt for all parameters
160
- uv run court-scheduler simulate --interactive
161
- ```
162
-
163
- #### Custom Parameters
164
-
165
- ```bash
166
- # Training with custom hyperparameters
167
- uv run court-scheduler train \
168
- --episodes 50 \
169
- --cases 200 \
170
- --lr 0.15 \
171
- --epsilon 0.4 \
172
- --output models/custom_agent.pkl
173
-
174
- # Simulation with specific settings
175
- uv run court-scheduler simulate \
176
- --cases data/cases.csv \
177
- --days 730 \
178
- --policy readiness \
179
- --seed 42 \
180
- --log-dir outputs/long_run
181
- ```
182
-
183
- #### Policy Comparison
184
-
185
- ```bash
186
- # Run with different policies
187
- uv run court-scheduler simulate --policy fifo --log-dir outputs/fifo_run
188
- uv run court-scheduler simulate --policy age --log-dir outputs/age_run
189
- uv run court-scheduler simulate --policy readiness --log-dir outputs/readiness_run
190
- ```
191
-
192
- </details>
193
-
194
- ## CLI Reference
195
-
196
- All commands follow the pattern: `uv run court-scheduler <command> [options]`
197
-
198
- | Command | Description | Key Options |
199
- |---------|-------------|-------------|
200
- | `eda` | Run EDA pipeline | `--skip-clean`, `--skip-viz`, `--skip-params` |
201
- | `generate` | Generate test cases | `--cases`, `--start`, `--end`, `--output` |
202
- | `simulate` | Run simulation | `--cases`, `--days`, `--policy`, `--log-dir` |
203
- | `train` | Train RL agent | `--episodes`, `--lr`, `--epsilon`, `--output` |
204
- | `workflow` | Full pipeline | `--cases`, `--days`, `--output` |
205
- | `version` | Show version | - |
206
-
207
- For detailed options: `uv run court-scheduler <command> --help`
208
-
209
- ## Recent Improvements
210
-
211
- ### RL Training Gap Fixes
212
-
213
- Two critical gaps in the RL training system have been identified and fixed:
214
-
215
- **1. EDA Parameter Alignment**
216
- - **Issue**: Training environment used hardcoded probabilities (0.7, 0.6, 0.4) instead of EDA-derived parameters
217
- - **Fix**: Integrated ParameterLoader into RLTrainingEnvironment to use data-driven parameters
218
- - **Validation**: Adjournment rates now align within 1% of EDA-derived values (43.0% vs 42.3%)
219
- - **Impact**: Training now matches evaluation dynamics, improving policy generalization
220
-
221
- **2. Ripeness Feedback Loop**
222
- - **Issue**: Ripeness classification used static keyword/stage heuristics with no feedback mechanism
223
- - **Fix**: Created RipenessMetrics and RipenessCalibrator for dynamic threshold adjustment
224
- - **Components**:
225
- - `scheduler/monitoring/ripeness_metrics.py`: Tracks predictions vs outcomes, computes confusion matrix
226
- - `scheduler/monitoring/ripeness_calibrator.py`: Analyzes metrics and suggests threshold adjustments
227
- - Enhanced `RipenessClassifier` with `set_thresholds()` and `get_current_thresholds()` methods
228
- - **Impact**: Enables continuous improvement of ripeness classification accuracy based on real outcomes
229
-
230
- These fixes ensure that RL training is reproducible, aligned with evaluation conditions, and benefits from adaptive ripeness detection that learns from historical data.
231
-
232
- ## Key Insights
233
-
234
- ### Data Characteristics
235
- - **Case Types**: 8 civil case categories (RSA, CRP, RFA, CA, CCC, CP, MISC.CVL, CMP)
236
- - **Disposal Times**: Significant variation by case type and complexity
237
- - **Hearing Stages**: Primary stages include ADMISSION, ORDERS/JUDGMENT, and OTHER
238
- - **Judge Assignments**: Mix of single and multi-judge benches
239
-
240
- ### Scheduling Implications
241
- - Different case types require different handling strategies
242
- - Historical judge assignment patterns can inform scheduling preferences
243
- - Clear temporal patterns in hearing schedules
244
- - Multiple hearing stages requiring different resource allocation
245
-
246
- ## Current Results (Latest Simulation)
247
-
248
- ### Performance Metrics
249
- - **Cases Scheduled**: 97.7% (9,766/10,000 cases)
250
- - **Disposal Rate**: 81.4% (significantly above baseline)
251
- - **Adjournment Rate**: 31.1% (realistic, within expected range)
252
- - **Courtroom Balance**: Gini 0.002 (perfect load distribution)
253
- - **Utilization**: 45.0% (sustainable with realistic constraints)
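The Gini figures above can be reproduced for any load vector with a short helper (a standard Gini computation, not the repository's metrics code):

```python
def gini(loads):
    """Gini coefficient of per-courtroom hearing loads (0 = perfectly equal)."""
    xs = sorted(loads)
    n, total = len(xs), sum(xs)
    if total == 0:
        return 0.0
    # Standard formula via rank-weighted cumulative sums of the sorted loads.
    cum = sum((i + 1) * x for i, x in enumerate(xs))
    return (2 * cum) / (n * total) - (n + 1) / n
```

A perfectly even split across five courtrooms gives 0.0; piling every hearing into one courtroom gives 0.8.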
254
-
255
- ### Disposal Rates by Case Type
256
- | Type | Disposed | Total | Rate | Performance |
257
- |------|----------|-------|------|-------------|
258
- | CP | 833 | 963 | 86.5% | Excellent |
259
- | CMP | 237 | 275 | 86.2% | Excellent |
260
- | CA | 1,676 | 1,949 | 86.0% | Excellent |
261
- | CCC | 978 | 1,147 | 85.3% | Excellent |
262
- | CRP | 1,750 | 2,062 | 84.9% | Excellent |
263
- | RSA | 1,488 | 1,924 | 77.3% | Good |
264
- | RFA | 1,174 | 1,680 | 69.9% | Fair |
265
-
266
- *Short-lifecycle cases (CP, CMP, CA) achieve 85%+ disposal. Complex appeals show expected lower rates due to longer processing requirements.*
267
-
268
- ## Hackathon Compliance
269
-
270
- ### Step 2: Data-Informed Modelling - COMPLETE
271
- - Analyzed 739,669 hearings for patterns
272
- - Classified cases as "ripe" vs "unripe" with bottleneck types
273
- - Developed adjournment and disposal assumptions
274
- - Proposed synthetic fields for data enrichment
275
-
276
- ### Step 3: Algorithm Development - COMPLETE
277
- - 2-year simulation operational with validated results
278
- - Stochastic case progression with realistic dynamics
279
- - Accounts for judicial working days (192/year)
280
- - Dynamic multi-courtroom allocation with perfect load balancing
281
- - Daily cause lists generated (CSV format)
282
- - User control & override system (judge approval workflow)
283
- - No-case-left-behind verification (97.7% coverage achieved)
284
 
285
- ## For Hackathon Teams
286
 
287
- ### Current Capabilities
288
- 1. **Ripeness Classification**: Data-driven bottleneck detection
289
- 2. **Realistic Simulation**: Stochastic adjournments, type-specific disposals
290
- 3. **Multiple Policies**: FIFO, age-based, readiness-based
291
- 4. **Fair Scheduling**: Gini coefficient 0.253 (low inequality)
292
- 5. **Dynamic Allocation**: Load-balanced distribution across 5 courtrooms (Gini 0.002)
293
 
294
- ### Development Status
295
- - **EDA & parameter extraction** - Complete
296
- - **Ripeness classification system** - Complete (40.8% cases filtered)
297
- - **Simulation engine with disposal logic** - Complete
298
- - **Dynamic multi-courtroom allocator** - Complete (perfect load balance)
299
- - **Daily cause list generator** - Complete (CSV export working)
300
- - **User control & override system** - Core API complete, UI pending
301
- - **No-case-left-behind verification** - Complete (97.7% coverage)
302
- - **Data gap analysis report** - Complete (8 synthetic fields proposed)
303
- - **Interactive dashboard** - Visualization components ready, UI assembly needed
304
 
305
- ## Documentation
306
 
307
- **Primary**: This README (complete user guide)
308
- **Additional**: `docs/` folder contains:
309
- - `DASHBOARD.md` - Dashboard usage and architecture
310
- - `CONFIGURATION.md` - Configuration system reference
311
- - `HACKATHON_SUBMISSION.md` - Hackathon-specific submission guide
312
 
313
- **Scripts**: See `scripts/README.md` for analysis utilities
 
 
1
  # Code4Change: Intelligent Court Scheduling System
2
 
3
+ Purpose-built for hackathon evaluation. This repository runs out of the box using the Streamlit dashboard and the uv tool.
4
 
5
+ ## Requirements
6
 
7
+ - Python 3.11+
8
+ - uv (required)
9
+ - macOS/Linux: `curl -LsSf https://astral.sh/uv/install.sh | sh`
10
+ - Windows (PowerShell): `irm https://astral.sh/uv/install.ps1 | iex`
11
 
12
+ ## Quick Start (Dashboard)
13
 
14
+ 1. Install uv (see above) and ensure Python 3.11+ is available.
15
+ 2. Clone this repository.
16
+ 3. Launch the dashboard:
 
17
 
18
  ```bash
 
19
  uv run streamlit run scheduler/dashboard/app.py
 
 
20
  ```
21
 
22
+ Then open http://localhost:8501 in your browser.
23
 
24
+ The dashboard provides:
25
+ - Run EDA pipeline (process raw data and extract parameters)
26
+ - Explore data and parameters
27
+ - Generate cases and run simulations
28
+ - Review cause lists and judge overrides
29
+ - Compare performance and export reports
30
 
31
+ ## Command Line (optional)
32
 
33
+ All operations are available via CLI as well:
34
 
35
  ```bash
 
36
  uv run court-scheduler --help
37
 
38
+ # End-to-end workflow
39
  uv run court-scheduler workflow --cases 10000 --days 384
40
  ```
41
 
42
+ For a detailed walkthrough tailored for judges, see `docs/HACKATHON_SUBMISSION.md`.
43
 
44
+ ## Data (DuckDB-first)
45
 
46
+ This repository uses a DuckDB snapshot as the canonical raw dataset.
47
 
48
+ - Preferred source: `Data/court_data.duckdb` (tables: `cases`, `hearings`). If this file is present, the EDA step will load directly from it.
49
+ - CSV fallback: If the DuckDB file is missing, place the two organizer CSVs in `Data/` with the exact names below and the EDA step will load them automatically:
50
+ - `ISDMHack_Cases_WPfinal.csv`
51
+ - `ISDMHack_Hear.csv`
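+ The DuckDB-first loading with CSV fallback described above can be sketched as follows (the function name `load_raw_data` is hypothetical; the table and file names come from this README):

```python
from pathlib import Path
import pandas as pd

def load_raw_data(data_dir="Data"):
    """Load (cases, hearings) from the DuckDB snapshot, else the organizer CSVs."""
    db = Path(data_dir) / "court_data.duckdb"
    if db.exists():
        import duckdb  # imported lazily so the CSV path works without it
        con = duckdb.connect(str(db), read_only=True)
        try:
            cases = con.execute("SELECT * FROM cases").df()
            hearings = con.execute("SELECT * FROM hearings").df()
        finally:
            con.close()
        return cases, hearings
    # CSV fallback using the exact organizer file names
    cases = pd.read_csv(Path(data_dir) / "ISDMHack_Cases_WPfinal.csv")
    hearings = pd.read_csv(Path(data_dir) / "ISDMHack_Hear.csv")
    return cases, hearings
```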
52
 
53
+ No manual pre-processing is required; launch the dashboard and click “Run EDA Pipeline.”
54
 
55
+ ## Notes
56
 
57
+ - This submission intentionally focuses on the end-to-end demo path. Internal development notes, enhancements, and bug fix logs have been removed from the README.
58
+ - uv is enforced by the dashboard for a consistent, reproducible environment.
SUBMISSION_READINESS_AUDIT.md DELETED
@@ -1,313 +0,0 @@
1
- # Submission Readiness Audit - Critical Workflow Analysis
2
-
3
- **Date**: November 29, 2025
4
- **Purpose**: Validate that EVERY user action can be completed through dashboard
5
- **Goal**: Win the hackathon by ensuring zero gaps in functionality
6
-
7
- ---
8
-
9
- ## Audit Methodology
10
-
11
- Simulating a fresh user experience with ONLY:
12
- 1. Raw data files (cases CSV, hearings CSV)
13
- 2. Code repository
14
- 3. Dashboard interface
15
-
16
- **NO pre-generated files, NO CLI usage, NO manual configuration**
17
-
18
- ---
19
-
20
- ## 🔴 CRITICAL GAPS FOUND
21
-
22
- ### GAP 1: Simulation Workflow - Policy Selection ✅ EXISTS
23
- **Location**: `3_Simulation_Workflow.py` (confirmed working)
24
- **Status**: ✅ IMPLEMENTED
25
- - User can select: FIFO, Age-based, Readiness, RL-based
26
- - RL requires trained model (handles gracefully)
27
-
28
- ### GAP 2: Simulation Configuration Values ✅ EXISTS
29
- **Location**: `3_Simulation_Workflow.py`
30
- **Status**: ✅ IMPLEMENTED
31
- **User Controls**:
32
- - Number of days to simulate
33
- - Number of courtrooms
34
- - Daily capacity per courtroom
35
- - Random seed
36
- - Policy selection
37
-
38
- ### GAP 3: Case Generation ✅ EXISTS
39
- **Location**: `3_Simulation_Workflow.py` Step 1
40
- **Status**: ✅ IMPLEMENTED
41
- **Options**:
42
- - Generate synthetic cases (with configurable parameters)
43
- - Upload CSV
44
- **Parameters exposed**:
45
- - Number of cases
46
- - Filing date range
47
- - Random seed
48
- - Output location
49
-
50
- ### GAP 4: RL Training ❓ NEEDS VERIFICATION
51
- **Location**: `3_RL_Training.py`
52
- **Questions**:
53
- - Can user train RL model from dashboard?
54
- - Can they configure hyperparameters (episodes, learning rate, epsilon)?
55
- - Can they save/load models?
56
- - How do they use trained model in simulation?
57
-
58
- ### GAP 5: Cause List Review & Override ❓ NEEDS VERIFICATION
59
- **Location**: `4_Cause_Lists_And_Overrides.py`
60
- **Questions**:
61
- - Can user view generated cause lists after simulation?
62
- - Can they modify case order (drag-and-drop)?
63
- - Can they remove/add cases?
64
- - Can they approve/reject algorithmic suggestions?
65
- - Is there an audit trail?
66
-
67
- ### GAP 6: Performance Comparison ❓ NEEDS VERIFICATION
68
- **Location**: `6_Analytics_And_Reports.py`
69
- **Questions**:
70
- - Can user compare multiple simulation runs?
71
- - Can they see fairness metrics (Gini coefficient)?
72
- - Can they export reports?
73
- - Can they identify which policy performed best?
74
-
75
- ### GAP 7: Ripeness Classifier Tuning ✅ EXISTS
76
- **Location**: `2_Ripeness_Classifier.py`
77
- **Status**: ✅ IMPLEMENTED (based on notebook context)
78
- - Interactive threshold adjustment
79
- - Test on sample cases
80
- - Batch classification
81
-
82
- ---
83
-
84
- ## 🔍 DETAILED VERIFICATION NEEDED
85
-
86
- ### Must Check: 3_RL_Training.py
87
- **Required Features**:
88
- - [ ] Training configuration form (episodes, LR, epsilon, gamma)
89
- - [ ] Start training button
90
- - [ ] Progress indicator during training
91
- - [ ] Save trained model with name
92
- - [ ] Load existing model for comparison
93
- - [ ] Model performance metrics
94
- - [ ] Link to use model in Simulation Workflow
95
-
96
- **If Missing**: User cannot train RL agent through dashboard
97
-
98
- ### Must Check: 4_Cause_Lists_And_Overrides.py
99
- **Required Features**:
100
- - [ ] Load cause lists from simulation output
101
- - [ ] Display: date, courtroom, scheduled cases
102
- - [ ] Override interface:
103
- - [ ] Reorder cases (drag-and-drop or priority input)
104
- - [ ] Remove case from list
105
- - [ ] Add case to list (from queue)
106
- - [ ] Mark ripeness override
107
- - [ ] Approve final list
108
- - [ ] Audit trail: who changed what, when
109
- - [ ] Export approved cause lists
110
-
111
- **If Missing**: Core hackathon requirement (judge control) not demonstrable
112
-
113
- ### Must Check: 6_Analytics_And_Reports.py
114
- **Required Features**:
115
- - [ ] List all simulation runs
116
- - [ ] Select runs to compare
117
- - [ ] Side-by-side metrics:
118
- - [ ] Disposal rate
119
- - [ ] Adjournment rate
120
- - [ ] Courtroom utilization
121
- - [ ] Fairness (Gini coefficient)
122
- - [ ] Cases scheduled vs abandoned
123
- - [ ] Charts: performance over time
124
- - [ ] Export comparison report (PDF/CSV)
125
-
126
- **If Missing**: Cannot demonstrate algorithmic improvements or validate claims
127
-
128
- ---
129
-
130
- ## 🎯 WINNING CRITERIA CHECKLIST
131
-
132
- ### Data-Informed Modelling (Step 2)
133
- - [x] EDA pipeline button in dashboard
134
- - [x] Ripeness classification interactive tuning
135
- - [x] Historical pattern visualizations
136
- - [ ] **VERIFY**: Can user see extracted parameters clearly?
137
-
138
- ### Algorithm Development (Step 3)
139
- - [x] Multi-policy simulation available
140
- - [x] Configurable simulation parameters
141
- - [ ] **VERIFY**: Cause list generation automatic?
142
- - [ ] **CRITICAL**: Judge override system demonstrable?
143
- - [ ] **VERIFY**: No-case-left-behind metrics shown?
144
-
145
- ### Fair Scheduling
146
- - [ ] **VERIFY**: Gini coefficient displayed in results?
147
- - [ ] **VERIFY**: Fairness comparison across policies?
148
- - [ ] **VERIFY**: Case age distribution shown?
149
-
150
- ### User Control & Transparency
151
- - [ ] **CRITICAL**: Override interface working?
152
- - [ ] **VERIFY**: Algorithm explainability (why case scheduled/rejected)?
153
- - [ ] **VERIFY**: Audit trail of all decisions?
154
-
155
- ### Production Readiness
156
- - [x] Self-contained dashboard (no CLI needed)
157
- - [x] EDA on-demand generation
158
- - [x] Case generation on-demand
159
- - [ ] **VERIFY**: End-to-end workflow completable?
160
- - [ ] **VERIFY**: All outputs exportable (CSV/PDF)?
161
-
162
- ---
163
-
164
- ## 🚨 HIGH-RISK GAPS (Potential Show-Stoppers)
165
-
166
- ### 1. Judge Override System
167
- **Risk**: If not working, fails core hackathon requirement
168
- **Impact**: Cannot demonstrate judicial autonomy
169
- **Action**: MUST verify `4_Cause_Lists_And_Overrides.py` has full CRUD operations
170
-
171
- ### 2. RL Model Training Loop
172
- **Risk**: If training only works via CLI, breaks "dashboard-only" claim
173
- **Impact**: Cannot demonstrate RL capability in live demo
174
- **Action**: MUST verify `3_RL_Training.py` can train AND use model in sim
175
-
176
- ### 3. Performance Comparison
177
- **Risk**: If cannot compare policies, cannot prove algorithmic value
178
- **Impact**: No evidence of improvement over baseline
179
- **Action**: MUST verify `6_Analytics_And_Reports.py` shows metrics comparison
180
-
181
- ### 4. Cause List Export
182
- **Risk**: If cannot export final cause lists, not "production ready"
183
- **Impact**: Cannot demonstrate deployment readiness
184
- **Action**: MUST verify CSV/PDF export from cause lists page
185
-
186
- ---
187
-
188
- ## 📋 NEXT STEPS (Priority Order)
189
-
190
- ### IMMEDIATE (P0 - Do Now)
191
- 1. **Read full content of**:
192
- - `3_RL_Training.py` (lines 1-end)
193
- - `4_Cause_Lists_And_Overrides.py` (lines 1-end)
194
- - `6_Analytics_And_Reports.py` (lines 1-end)
195
-
196
- 2. **Verify each gap** listed above
197
-
198
- 3. **For each missing feature, decide**:
199
- - Implement now (if < 30 min)
200
- - Create placeholder with "Coming Soon" (if > 30 min)
201
- - Document as limitation (if not critical)
202
-
203
- ### HIGH (P1 - Do Today)
204
- 4. **Test complete workflow as user would**:
205
- - Fresh launch → EDA → Generate cases → Simulate → View results → Export
206
- - Identify ANY point where user gets stuck
207
-
208
- 5. **Create user guide** in dashboard:
209
- - Step-by-step workflow
210
- - Expected processing times
211
- - What each button does
212
-
213
- ### MEDIUM (P2 - Nice to Have)
214
- 6. **Add progress indicators**:
215
- - EDA pipeline: "Processing 739K hearings... 45%"
216
- - Case generation: "Generated 5,000 / 10,000"
217
- - Simulation: "Day 120 / 384"
218
-
219
- 7. **Add data validation**:
220
- - Check if EDA output exists before allowing simulation
221
- - Warn if parameters seem unrealistic
222
-
223
- ---
224
-
225
- ## 🏆 SUBMISSION CHECKLIST
226
-
227
- Before submission, user should be able to (with ZERO CLI):
228
-
229
- ### Setup (One Time)
230
- - [ ] Launch dashboard
231
- - [ ] Click "Run EDA" button
232
- - [ ] Wait 2-5 minutes
233
- - [ ] See "EDA Complete" message
234
-
235
- ### Generate Cases
236
- - [ ] Go to "Simulation Workflow"
237
- - [ ] Enter: 10,000 cases, 2022-2023 date range
238
- - [ ] Click "Generate"
239
- - [ ] See "Generation Complete"
240
-
241
- ### Run Simulation
242
- - [ ] Configure: 384 days, 5 courtrooms, Readiness policy
243
- - [ ] Click "Run Simulation"
244
- - [ ] See progress bar
245
- - [ ] View results: disposal rate, Gini, utilization
246
-
247
- ### Judge Override
248
- - [ ] Go to "Cause Lists & Overrides"
249
- - [ ] Select a date and courtroom
250
- - [ ] See algorithm-suggested cause list
251
- - [ ] Reorder 2 cases (or add/remove)
252
- - [ ] Click "Approve"
253
- - [ ] See confirmation
254
-
255
- ### Performance Analysis
256
- - [ ] Go to "Analytics & Reports"
257
- - [ ] See list of past simulation runs
258
- - [ ] Select 2 runs (FIFO vs Readiness)
259
- - [ ] View comparison: disposal rates, fairness
260
- - [ ] Export comparison as CSV
261
-
262
- ### Train RL (Optional)
263
- - [ ] Go to "RL Training"
264
- - [ ] Configure: 20 episodes, 0.15 LR
265
- - [ ] Click "Train"
266
- - [ ] See training progress
267
- - [ ] Save model as "my_agent.pkl"
268
-
269
- ### Use RL Model
270
- - [ ] Go to "Simulation Workflow"
271
- - [ ] Select policy: "RL-based"
272
- - [ ] Select model: "my_agent.pkl"
273
- - [ ] Run simulation
274
- - [ ] Compare with baseline
275
-
276
- **If ANY step above fails or requires CLI, THAT IS A CRITICAL GAP.**
277
-
278
- ---
279
-
280
- ## 💡 RECOMMENDATIONS
281
-
282
- ### If Gaps Found:
283
- 1. **Critical gaps (override system)**: Implement immediately, even if basic
284
- 2. **Important gaps (RL training)**: Add "Coming Soon" notice + CLI fallback instructions
285
- 3. **Nice-to-have gaps**: Document as future enhancement
286
-
287
- ### If Time Allows:
288
- - Add tooltips explaining every parameter
289
- - Add "Example Workflow" guided tour
290
- - Add validation warnings (e.g., "10,000 cases with 5 days simulation seems short")
291
- - Add dashboard tour on first launch
292
-
293
- ### Communication Strategy:
294
- - If feature incomplete: "This shows RL training interface. For full training, use CLI: `uv run court-scheduler train`"
295
- - If feature works: "Fully interactive - no CLI needed"
296
- - Always emphasize: "Dashboard is primary interface, CLI is for automation"
297
-
298
- ---
299
-
300
- ## ✅ VERIFICATION PROTOCOL
301
-
302
- For EACH page, answer:
303
- 1. **Can user complete the task without leaving dashboard?**
304
- 2. **Are all configuration options exposed?**
305
- 3. **Is there clear feedback on success/failure?**
306
- 4. **Can user export/save results?**
307
- 5. **Is there a "Next Step" button to guide workflow?**
308
-
309
- If ANY answer is "No", that's a gap.
310
-
311
- ---
312
-
313
- **Next Action**: Read remaining dashboard pages and fill in verification checkboxes above.
docs/CONFIGURATION.md CHANGED
@@ -1,179 +1,20 @@
1
- # Configuration Architecture
2
 
3
- ## Overview
4
- The codebase uses a layered configuration approach separating concerns by domain and lifecycle.
5
 
6
- ## Configuration Layers
 
7
 
8
- ### 1. Domain Constants (`scheduler/data/config.py`)
9
- **Purpose**: Immutable domain knowledge that never changes.
10
 
11
- **Contains**:
12
- - `STAGES` - Legal case lifecycle stages from domain knowledge
13
- - `TERMINAL_STAGES` - Stages indicating case disposal
14
- - `CASE_TYPES` - Valid case type taxonomy
15
- - `CASE_TYPE_DISTRIBUTION` - Historical distribution from EDA
16
- - `WORKING_DAYS_PER_YEAR` - Court calendar constant (192 days)
17
-
18
- **When to use**: Values derived from legal/institutional domain that are facts, not tunable parameters.
19
-
20
- ### 2. RL Training Configuration (`rl/config.py`)
21
- **Purpose**: Hyperparameters affecting RL agent learning behavior.
22
-
23
- **Class**: `RLTrainingConfig`
24
-
25
- **Parameters**:
26
- - `episodes`: Number of training episodes
27
- - `cases_per_episode`: Cases generated per episode
28
- - `episode_length_days`: Simulation horizon per episode
29
- - `learning_rate`: Q-learning alpha parameter
30
- - `discount_factor`: Q-learning gamma parameter
31
- - `initial_epsilon`: Starting exploration rate
32
- - `epsilon_decay`: Exploration decay factor
33
- - `min_epsilon`: Minimum exploration threshold
34
-
35
- **Presets**:
36
- - `DEFAULT_RL_TRAINING_CONFIG` - Standard training (100 episodes)
37
- - `QUICK_DEMO_RL_CONFIG` - Fast testing (20 episodes)
38
-
39
- **When to use**: Experimenting with RL training convergence and exploration strategies.
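- The exploration schedule implied by `initial_epsilon`, `epsilon_decay`, and `min_epsilon` is, in a typical implementation (a sketch, not the repository's exact code):

```python
def epsilon_schedule(episode, initial_epsilon=0.4, epsilon_decay=0.95, min_epsilon=0.05):
    """Exponentially decayed exploration rate, floored at `min_epsilon`."""
    return max(min_epsilon, initial_epsilon * epsilon_decay ** episode)
```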
40
-
41
- ### 3. Policy Configuration (`rl/config.py`)
42
- **Purpose**: Policy-specific filtering and prioritization behavior.
43
-
44
- **Class**: `PolicyConfig`
45
-
46
- **Parameters**:
47
- - `min_gap_days`: Minimum days between hearings (fairness constraint)
48
- - `max_gap_alert_days`: Maximum gap before triggering alerts
49
- - `old_case_threshold_days`: Age threshold for priority boost
50
- - `skip_unripe_cases`: Whether to filter unripe cases
51
- - `allow_old_unripe_cases`: Allow scheduling very old unripe cases
52
-
53
- **When to use**: Tuning policy filtering logic without changing core algorithm.
54
-
55
- ### 4. Simulation Configuration (`scheduler/simulation/engine.py`)
56
- **Purpose**: Per-simulation operational parameters.
57
-
58
- **Class**: `CourtSimConfig`
59
-
60
- **Parameters**:
61
- - `start`: Simulation start date
62
- - `days`: Duration in days
63
- - `seed`: Random seed for reproducibility
64
- - `courtrooms`: Number of courtrooms to simulate
65
- - `daily_capacity`: Cases per courtroom per day
66
- - `policy`: Scheduling policy name (`fifo`, `age`, `readiness`, `rl`)
67
- - `duration_percentile`: EDA percentile for stage durations
68
- - `rl_agent_path`: Path to trained RL model (required if `policy="rl"`)
69
- - `log_dir`: Output directory for metrics
70
-
71
- **Validation**: `__post_init__` validates RL requirements and path types.
72
-
73
- **When to use**: Each simulation run (different policies, time periods, or capacities).
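- A minimal sketch of the dataclass-with-validation pattern described above (field list abbreviated; the defaults shown are assumptions, not the real ones):

```python
from dataclasses import dataclass
from datetime import date
from pathlib import Path
from typing import Optional

@dataclass
class CourtSimConfig:
    # Field names follow the parameter list above; defaults are illustrative.
    start: date
    days: int
    seed: int = 42
    courtrooms: int = 5
    daily_capacity: int = 30
    policy: str = "readiness"
    rl_agent_path: Optional[Path] = None

    def __post_init__(self):
        # Mirrors the documented check: the RL policy requires a trained model.
        if self.policy == "rl" and self.rl_agent_path is None:
            raise ValueError("policy='rl' requires rl_agent_path")
```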
74
-
75
- ### 5. CLI Configuration (`cli/config.py`)
76
- **Purpose**: Command-line interface configuration management.
77
-
78
- **Functions**:
79
- - `load_generate_config()`: Load case generation TOML config
80
- - `load_simulate_config()`: Load simulation TOML config
81
- - `load_rl_training_config()`: Load RL training TOML config
82
-
83
- **Configuration Files** (TOML format in `configs/`):
84
- - `generate_config.toml`: Case generation parameters
85
- - `simulate_config.toml`: Simulation settings
86
- - `rl_training_config.toml`: Training hyperparameters
87
-
88
- **When to use**: Customizing CLI command behavior without modifying code.
89
-
90
- ## Configuration Flow
91
-
92
- ```
93
- CLI Execution:
94
- |-- CLI Commands (cli/main.py)
95
- |-- Command Options (Typer-based)
96
- |-- Config Files (TOML in configs/)
97
-
98
- |-- Data Generation:
99
- |-- Case generation parameters
100
- |-- Date ranges and distributions
101
-
102
- |-- RL Training:
103
- |-- RLTrainingConfig (training hyperparameters)
104
- |-- Training environment settings
105
-
106
- |-- Simulation:
107
- |-- CourtSimConfig (simulation settings)
108
- |-- rl_agent_path (from training output)
109
- |-- Policy instantiation:
110
- |-- PolicyConfig (policy-specific settings)
111
- ```
112
-
113
- ## Design Principles
114
-
115
- 1. **Separation of Concerns**: Each config class owns one domain
116
- 2. **Type Safety**: Dataclasses with validation in `__post_init__`
117
- 3. **No Magic**: Explicit parameters, no hidden defaults
118
- 4. **Immutability**: Domain constants never change
119
- 5. **Composition**: Configs nest (PipelineConfig contains RLTrainingConfig)
120
-
121
- ## Examples
122
-
123
- ### Quick Demo (CLI)
124
  ```bash
125
- # Command-line options
126
- uv run court-scheduler workflow --cases 10000 --days 90
 
 
127
  ```
128
 
129
- ### Quick Demo (Programmatic)
130
- ```python
131
- from rl.config import QUICK_DEMO_RL_CONFIG
132
- from scheduler.simulation.engine import CourtSimConfig
133
-
134
- # Use preset configs directly
135
- rl_config = QUICK_DEMO_RL_CONFIG # 20 episodes
136
- ```
137
-
138
- ### Custom Training
139
- ```python
140
- from rl.config import RLTrainingConfig
141
-
142
- custom_rl = RLTrainingConfig(
143
- episodes=500,
144
- learning_rate=0.1,
145
- initial_epsilon=0.3,
146
- epsilon_decay=0.995
147
- )
148
-
149
- config = PipelineConfig(
150
- n_cases=50000,
151
- rl_training=custom_rl,
152
- sim_days=730
153
- )
154
- ```
155
-
156
- ### Policy Tuning
157
- ```python
158
- from rl.config import PolicyConfig
159
-
160
- strict_policy = PolicyConfig(
161
- min_gap_days=14, # More conservative
162
- skip_unripe_cases=True,
163
- allow_old_unripe_cases=False # Strict ripeness enforcement
164
- )
165
-
166
- # Pass to RLPolicy
167
- policy = RLPolicy(agent_path=model_path, policy_config=strict_policy)
168
- ```
169
-
170
- ## Migration Guide
171
-
172
- ### Adding New Configuration
173
- 1. Determine layer (domain constant vs. tunable parameter)
174
- 2. Add to appropriate config class
175
- 3. Update `__post_init__` validation if needed
176
- 4. Document in this file
177
 
178
  ### Deprecating Parameters
179
  1. Move to config class first (keep old path working)
 
1
+ # Configuration Guide (Consolidated)
2
 
3
+ This configuration reference has been intentionally simplified for the hackathon to keep the repository focused for judges and evaluators.
 
4
 
5
+ For the end-to-end demo and instructions, see:
6
+ - `docs/HACKATHON_SUBMISSION.md`
7
 
8
+ Advanced usage help is available via the CLI:
 
9
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
10
  ```bash
11
+ uv run court-scheduler --help
12
+ uv run court-scheduler generate --help
13
+ uv run court-scheduler simulate --help
14
+ uv run court-scheduler workflow --help
15
  ```
16
 
17
+ Note: uv is required for all commands.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
18
 
19
  ### Deprecating Parameters
20
  1. Move to config class first (keep old path working)
docs/DASHBOARD.md CHANGED
@@ -1,41 +1,17 @@
- # Interactive Dashboard
-
- **Last Updated**: 2025-11-29
- **Status**: Production Ready
- **Version**: 1.0.0
-
- ## Launch

  ```bash
  uv run streamlit run scheduler/dashboard/app.py
- # Open http://localhost:8501
  ```

- ## Pages
-
- 1. **Data & Insights** - Historical analysis of 739K+ hearings
- 2. **Ripeness Classifier** - Case bottleneck detection with explainability
- 3. **RL Training** - Train and evaluate RL scheduling agents
- 4. **Simulation Workflow** - Run simulations with configurable policies
- 5. **Cause Lists & Overrides** - Judge override interface for cause lists
- 6. **Analytics & Reports** - Performance comparison and reporting
-
- ## Workflows
-
- **EDA Exploration**: Run EDA → Launch dashboard → Filter and visualize data
- **Judge Overrides**: Launch dashboard → Simulation Workflow → Review/modify cause lists
- **RL Training**: Launch dashboard → RL Training page → Configure and train
-
- ## Data Sources
-
- - Historical data: `reports/figures/v*/cases_clean.parquet` and `hearings_clean.parquet`
- - Parameters: `reports/figures/v*/params/` (auto-detected latest version)
- - Falls back to bundled defaults if EDA not run
- - [ ] Batch classification (10K+ cases)
- - [ ] Multiple concurrent users (if deployed)
-
- ## Troubleshooting
-
- **Dashboard won't launch**: Run `uv sync` to install dependencies
- **Empty visualizations**: Run `uv run court-scheduler eda` first
- **Slow loading**: Data auto-cached after first load (1-hour TTL)

+ # Dashboard Guide (Consolidated)
+
+ This document has been simplified for the hackathon. Please use the main guide:
+
+ - See `docs/HACKATHON_SUBMISSION.md` for end-to-end demo instructions.
+
+ Quick launch:

  ```bash
  uv run streamlit run scheduler/dashboard/app.py
+ # Then open http://localhost:8501
  ```

+ Data source:
+
+ - Preferred: `Data/court_data.duckdb` (tables: `cases`, `hearings`).
+ - Fallback: place `ISDMHack_Cases_WPfinal.csv` and `ISDMHack_Hear.csv` in `Data/` if the DuckDB file is not present.
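The DuckDB-first loading order described above can be sketched as a small resolver. This is an illustrative helper (`resolve_data_source` is not part of the codebase), shown only to make the preferred/fallback rule concrete:

```python
from pathlib import Path


def resolve_data_source(data_dir: str = "Data"):
    """Pick the preferred data source: the DuckDB file if present, else the raw CSVs."""
    base = Path(data_dir)
    db_path = base / "court_data.duckdb"
    if db_path.exists():
        # A reader would then do, e.g.:
        #   duckdb.connect(str(db_path), read_only=True).execute("SELECT * FROM cases").df()
        return ("duckdb", [db_path])
    # Fallback: the two raw CSVs must carry these exact names
    return ("csv", [base / "ISDMHack_Cases_WPfinal.csv", base / "ISDMHack_Hear.csv"])
```

The helper only resolves paths; actual reading (via `duckdb` or `pandas.read_csv`) stays with the caller.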
 
docs/HACKATHON_SUBMISSION.md CHANGED
@@ -1,10 +1,10 @@
  # Hackathon Submission Guide
- ## Intelligent Court Scheduling System with Reinforcement Learning

  ### Quick Start - Hackathon Demo

  **IMPORTANT**: The dashboard is fully self-contained. You only need:
- 1. Raw data files (provided)
  2. This codebase
  3. Run the dashboard

@@ -25,7 +25,7 @@ uv run streamlit run scheduler/dashboard/app.py
  4. **Review Results**: Check "Cause Lists & Overrides" for judge override interface
  5. **Performance Analysis**: View "Analytics & Reports" for metrics comparison

- **No pre-processing required** - dashboard handles everything interactively.

  #### Alternative: CLI Workflow (for scripting)
  ```bash
@@ -53,16 +53,13 @@ uv run court-scheduler eda
  # 2. Generate synthetic cases
  uv run court-scheduler generate --cases 50000

- # 3. Train RL agent (optional)
- uv run court-scheduler train --episodes 100
-
- # 4. Run simulation
  uv run court-scheduler simulate --cases data/cases.csv --days 730 --policy readiness
  ```

  ### What the Pipeline Does

- The comprehensive pipeline executes 7 automated steps:

  **Step 1: EDA & Parameter Extraction**
  - Analyzes 739K+ historical hearings
@@ -74,52 +71,39 @@ The comprehensive pipeline executes 7 automated steps:
  - Configurable size (default: 50,000 cases)
  - Diverse case types and complexity levels

- **Step 3: RL Training**
- - Trains Tabular Q-learning agent
- - Real-time progress monitoring with reward tracking
- - Configurable episodes and hyperparameters
-
- **Step 4: 2-Year Simulation**
  - Runs 730-day court scheduling simulation
- - Compares RL agent vs baseline algorithms
  - Tracks disposal rates, utilization, fairness metrics

- **Step 5: Daily Cause List Generation**
  - Generates production-ready daily cause lists
  - Exports for all simulation days
  - Courtroom-wise scheduling details

- **Step 6: Performance Analysis**
  - Comprehensive comparison reports
  - Performance visualizations
  - Statistical analysis of all metrics

- **Step 7: Executive Summary**
  - Hackathon-ready summary document
  - Key achievements and impact metrics
  - Deployment readiness checklist

  ### Expected Output

- After completion, you'll find in your output directory:

  ```
- data/hackathon_run/
- |-- pipeline_config.json     # Full configuration used
- |-- training_cases.csv       # Generated case dataset
- |-- trained_rl_agent.pkl     # Trained RL model
- |-- EXECUTIVE_SUMMARY.md     # Hackathon submission summary
- |-- COMPARISON_REPORT.md     # Detailed performance comparison
- |-- simulation_rl/           # RL policy results
- |   |-- events.csv
- |   |-- metrics.csv
- |   |-- report.txt
- |   |-- cause_lists/
- |       |-- daily_cause_list.csv  # 730 days of cause lists
- |-- simulation_readiness/    # Baseline results
- |   |-- ...
- |-- visualizations/          # Performance charts
-     |-- performance_charts.md
  ```

  ### Hackathon Winning Features
@@ -130,11 +114,11 @@ data/hackathon_run/
  - **Multi-Courtroom Support**: Load-balanced allocation across 5+ courtrooms
  - **Scalability**: Tested with 50,000+ cases

- #### 2. Technical Innovation
- - **Reinforcement Learning**: AI-powered adaptive scheduling
- - **6D State Space**: Comprehensive case characteristic modeling
- - **Hybrid Architecture**: Combines RL intelligence with rule-based constraints
- - **Real-time Learning**: Continuous improvement through experience

  #### 3. Production Readiness
  - **Interactive CLI**: User-friendly parameter configuration
@@ -150,15 +134,7 @@ data/hackathon_run/

  ### Performance Benchmarks

- Based on comprehensive testing:
-
- | Metric | RL Agent | Baseline | Advantage |
- |--------|----------|----------|-----------|
- | Disposal Rate | 52.1% | 51.9% | +0.4% |
- | Court Utilization | 85%+ | 85%+ | Comparable |
- | Load Balance (Gini) | 0.248 | 0.243 | Comparable |
- | Scalability | 50K cases | 50K cases | Yes |
- | Adaptability | High | Fixed | High |

  ### Customization Options

@@ -174,12 +150,11 @@ uv run court-scheduler simulate --cases data/cases.csv --days 730 --policy age
  ```

  #### For Technical Evaluation
  ```bash
- # Focus on RL training quality
- uv run court-scheduler train --episodes 200 --lr 0.12 --cases 500 --output models/intensive_agent.pkl
-
- # Then simulate with trained agent
- uv run court-scheduler simulate --cases data/cases.csv --days 730 --policy rl --agent models/intensive_agent.pkl
  ```

  #### For Quick Demo/Testing
@@ -202,7 +177,6 @@ uv run court-scheduler workflow --cases 10000 --days 90

  2. **Demonstrate the Solution**
     - Run the interactive pipeline live
-    - Show real-time RL training progress
     - Display generated cause lists

  3. **Present the Results**
@@ -211,7 +185,7 @@ uv run court-scheduler workflow --cases 10000 --days 90
     - Show actual cause list files (730 days ready)

  4. **Emphasize Innovation**
-    - Reinforcement Learning for judicial scheduling (novel)
     - Production-ready from day 1 (practical)
     - Scalable to entire court system (impactful)

@@ -223,7 +197,8 @@ uv run court-scheduler workflow --cases 10000 --days 90

  ### System Requirements

- - **Python**: 3.10+ with UV
  - **Memory**: 8GB+ RAM (16GB recommended for 50K cases)
  - **Storage**: 2GB+ for full pipeline outputs
  - **Runtime**:
@@ -236,9 +211,6 @@ uv run court-scheduler workflow --cases 10000 --days 90
  **Issue**: Out of memory during simulation
  **Solution**: Reduce n_cases to 10,000-20,000 or increase system RAM

- **Issue**: RL training very slow
- **Solution**: Reduce episodes to 50 or cases_per_episode to 500
-
  **Issue**: EDA parameters not found
  **Solution**: Run `uv run court-scheduler eda` first

@@ -277,12 +249,11 @@ uv run court-scheduler workflow \
  ### Contact & Support

  For hackathon questions or technical support:
- - Review PIPELINE.md for detailed architecture
- - Check README.md for system overview
- - See rl/README.md for RL-specific documentation

  ---

  **Good luck with your hackathon submission!**

- This system represents a genuine breakthrough in applying AI to judicial efficiency. The combination of production-ready cause lists, proven performance metrics, and innovative RL architecture positions this as a compelling winning submission.

  # Hackathon Submission Guide
+ ## Intelligent Court Scheduling System

  ### Quick Start - Hackathon Demo

  **IMPORTANT**: The dashboard is fully self-contained. You only need:
+ 1. Preferred: `Data/court_data.duckdb` (included in this repo). Alternatively, place the two CSVs in `Data/` with exact names: `ISDMHack_Cases_WPfinal.csv` and `ISDMHack_Hear.csv`.
  2. This codebase
  3. Run the dashboard

  4. **Review Results**: Check "Cause Lists & Overrides" for judge override interface
  5. **Performance Analysis**: View "Analytics & Reports" for metrics comparison

+ **No pre-processing required**: EDA automatically loads `Data/court_data.duckdb` when present; if missing, it falls back to `ISDMHack_Cases_WPfinal.csv` and `ISDMHack_Hear.csv` placed in `Data/`.

  #### Alternative: CLI Workflow (for scripting)
  ```bash

  # 2. Generate synthetic cases
  uv run court-scheduler generate --cases 50000

+ # 3. Run simulation
  uv run court-scheduler simulate --cases data/cases.csv --days 730 --policy readiness
  ```

  ### What the Pipeline Does

+ The comprehensive pipeline executes 6 automated steps:

  **Step 1: EDA & Parameter Extraction**
  - Analyzes 739K+ historical hearings

  - Configurable size (default: 50,000 cases)
  - Diverse case types and complexity levels

+ **Step 3: 2-Year Simulation**
  - Runs 730-day court scheduling simulation
+ - Compares scheduling policies (FIFO, age-based, readiness)
  - Tracks disposal rates, utilization, fairness metrics

+ **Step 4: Daily Cause List Generation**
  - Generates production-ready daily cause lists
  - Exports for all simulation days
  - Courtroom-wise scheduling details

+ **Step 5: Performance Analysis**
  - Comprehensive comparison reports
  - Performance visualizations
  - Statistical analysis of all metrics

+ **Step 6: Executive Summary**
  - Hackathon-ready summary document
  - Key achievements and impact metrics
  - Deployment readiness checklist

  ### Expected Output

+ After completion, you'll find outputs under your selected run directory (created automatically; the dashboard uses `outputs/simulation_runs` by default):

  ```
+ outputs/simulation_runs/v<version>_<timestamp>/
+ |-- pipeline_config.json   # Full configuration used
+ |-- events.csv             # All scheduled events across days
+ |-- metrics.csv            # Aggregate metrics for the run
+ |-- daily_summaries.csv    # Per-day summary metrics
+ |-- cause_lists/           # Generated daily cause lists (CSV)
+ |   |-- YYYY-MM-DD.csv     # One file per simulation day
+ |-- figures/               # Optional charts (when exported)
  ```

  ### Hackathon Winning Features

  - **Multi-Courtroom Support**: Load-balanced allocation across 5+ courtrooms
  - **Scalability**: Tested with 50,000+ cases

+ #### 2. Technical Approach
+ - Data-informed simulation calibrated from historical hearings
+ - Multiple heuristic policies: FIFO, age-based, readiness-based
+ - Readiness policy enforces bottleneck/ripeness constraints
+ - Fairness metrics (e.g., Gini) and utilization tracking

  #### 3. Production Readiness
  - **Interactive CLI**: User-friendly parameter configuration

  ### Performance Benchmarks

+ Compare policies by running multiple simulations (e.g., readiness vs FIFO vs age) and reviewing disposal rate, utilization, and fairness (Gini). The Analytics & Reports dashboard page can load and compare runs side-by-side.

  ### Customization Options

  ```

  #### For Technical Evaluation
+ Focus on repeatability and fairness by comparing multiple policies and seeds:
  ```bash
+ uv run court-scheduler simulate --cases data/cases.csv --days 730 --policy readiness --seed 1
+ uv run court-scheduler simulate --cases data/cases.csv --days 730 --policy fifo --seed 1
+ uv run court-scheduler simulate --cases data/cases.csv --days 730 --policy age --seed 1
  ```
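Once several seeded runs exist, their per-run `metrics.csv` files can be collected for side-by-side review. A minimal sketch, assuming a simple key/value `metric,value` layout (the actual schema produced by the simulator may differ):

```python
import csv
from pathlib import Path


def summarize_runs(run_dirs):
    """Collect one row per run directory, merging its metrics.csv key/value pairs."""
    rows = []
    for d in run_dirs:
        path = Path(d) / "metrics.csv"
        with path.open() as f:
            # Assumed layout: header "metric,value", one metric per line
            metrics = {r["metric"]: r["value"] for r in csv.DictReader(f)}
        rows.append({"run": Path(d).name, **metrics})
    return rows
```

Feeding the resulting rows into a dataframe or table makes the policy/seed comparison a one-liner.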

  #### For Quick Demo/Testing

  2. **Demonstrate the Solution**
     - Run the interactive pipeline live
     - Display generated cause lists

  3. **Present the Results**
     - Show actual cause list files (730 days ready)

  4. **Emphasize Innovation**
+    - Data-driven readiness-based scheduling (novel for this context)
     - Production-ready from day 1 (practical)
     - Scalable to entire court system (impactful)

  ### System Requirements

+ - **Python**: 3.11+
+ - **uv**: required to run commands and the dashboard
  - **Memory**: 8GB+ RAM (16GB recommended for 50K cases)
  - **Storage**: 2GB+ for full pipeline outputs
  - **Runtime**:

  **Issue**: Out of memory during simulation
  **Solution**: Reduce n_cases to 10,000-20,000 or increase system RAM

  **Issue**: EDA parameters not found
  **Solution**: Run `uv run court-scheduler eda` first

  ### Contact & Support

  For hackathon questions or technical support:
+ - Check README.md for the system overview
+ - See this guide (docs/HACKATHON_SUBMISSION.md) for end-to-end instructions

  ---

  **Good luck with your hackathon submission!**

+ This system represents a pragmatic, data-driven approach to improving judicial efficiency. The combination of production-ready cause lists, proven performance metrics, and a transparent, judge-in-the-loop design positions this as a compelling winning submission.
scheduler/dashboard/app.py CHANGED
@@ -2,11 +2,11 @@

  This is the entry point for the Streamlit multi-page dashboard.
  Launch with: uv run court-scheduler dashboard
- Or directly: streamlit run scheduler/dashboard/app.py
  """

  from __future__ import annotations

  from pathlib import Path

  import streamlit as st
@@ -21,6 +21,21 @@ st.set_page_config(
      initial_sidebar_state="expanded",
  )

  # Main page content
  st.title("Court Scheduling System Dashboard")
  st.markdown("**Karnataka High Court - Algorithmic Decision Support for Fair Scheduling**")

  This is the entry point for the Streamlit multi-page dashboard.
  Launch with: uv run court-scheduler dashboard
  """

  from __future__ import annotations

+ import subprocess
  from pathlib import Path

  import streamlit as st

      initial_sidebar_state="expanded",
  )

+ # Enforce `uv` availability for all dashboard-triggered commands
+ try:
+     uv_check = subprocess.run(["uv", "--version"], capture_output=True, text=True)
+     if uv_check.returncode != 0:
+         raise RuntimeError(uv_check.stderr or "uv not available")
+ except Exception:
+     st.error(
+         "'uv' is required to run this dashboard's commands. Please install uv and rerun.\n\n"
+         "Install on macOS/Linux: `curl -LsSf https://astral.sh/uv/install.sh | sh`\n"
+         "Install on Windows (PowerShell): `irm https://astral.sh/uv/install.ps1 | iex`"
+     )
+     st.stop()
+
  # Main page content
  st.title("Court Scheduling System Dashboard")
  st.markdown("**Karnataka High Court - Algorithmic Decision Support for Fair Scheduling**")
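The subprocess-based `uv` check added above works; a lighter alternative worth noting is a PATH lookup via the standard library, which avoids spawning a process. This is a sketch of the alternative, not the committed implementation:

```python
import shutil


def uv_available() -> bool:
    """True when the `uv` executable can be found on PATH."""
    return shutil.which("uv") is not None
```

`shutil.which` also respects `PATHEXT` on Windows, so the same check covers both install flows shown in the error message.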
scheduler/dashboard/pages/6_Analytics_And_Reports.py CHANGED
@@ -12,6 +12,7 @@ from __future__ import annotations
  from datetime import datetime
  from pathlib import Path

  import pandas as pd
  import plotly.express as px
  import plotly.graph_objects as go
@@ -361,6 +362,70 @@ with tab3:
          with col3:
              st.metric("Max Age", f"{case_dates['age_days'].max():.0f} days")

        # Case type fairness
        if "case_type" in events_df.columns:
            st.markdown("---")
@@ -379,6 +444,62 @@ with tab3:
              fig.update_layout(height=400, xaxis_tickangle=-45)
              st.plotly_chart(fig, use_container_width=True)

  except Exception as e:
      st.error(f"Error loading events data: {e}")

  from datetime import datetime
  from pathlib import Path

+ import numpy as np
  import pandas as pd
  import plotly.express as px
  import plotly.graph_objects as go

          with col3:
              st.metric("Max Age", f"{case_dates['age_days'].max():.0f} days")

+         # Additional Fairness Metrics: Gini and Lorenz Curve
+         st.markdown("#### Inequality Metrics (Fairness)")
+
+         def _gini(values: np.ndarray) -> float:
+             v = np.asarray(values, dtype=float)
+             v = v[np.isfinite(v)]
+             v = v[v >= 0]
+             if v.size == 0:
+                 return float("nan")
+             if np.all(v == 0):
+                 return 0.0
+             v_sorted = np.sort(v)
+             n = v_sorted.size
+             cumulative = np.cumsum(v_sorted)
+             # Gini based on cumulative shares
+             gini = (n + 1 - 2 * np.sum(cumulative) / cumulative[-1]) / n
+             return float(gini)
+
+         ages = case_dates["age_days"].to_numpy()
+         gini_age = _gini(ages)
+
+         col_a, col_b = st.columns(2)
+         with col_a:
+             if np.isfinite(gini_age):
+                 st.metric("Gini (Age Inequality)", f"{gini_age:.3f}")
+             else:
+                 st.info("Gini (Age) not available")
+
+         # Lorenz curve for ages
+         with col_b:
+             try:
+                 ages_clean = ages[np.isfinite(ages)]
+                 ages_clean = ages_clean[ages_clean >= 0]
+                 if ages_clean.size > 0:
+                     ages_sorted = np.sort(ages_clean)
+                     cum_ages = np.cumsum(ages_sorted)
+                     cum_ages = np.insert(cum_ages, 0, 0)
+                     cum_pop = np.linspace(0, 1, num=cum_ages.size)
+                     lorenz = cum_ages / cum_ages[-1]
+                     fig_lorenz = go.Figure()
+                     fig_lorenz.add_trace(
+                         go.Scatter(x=cum_pop, y=lorenz, mode="lines", name="Lorenz")
+                     )
+                     fig_lorenz.add_trace(
+                         go.Scatter(
+                             x=[0, 1],
+                             y=[0, 1],
+                             mode="lines",
+                             name="Equality",
+                             line=dict(dash="dash"),
+                         )
+                     )
+                     fig_lorenz.update_layout(
+                         title="Lorenz Curve of Case Ages",
+                         xaxis_title="Cumulative share of cases",
+                         yaxis_title="Cumulative share of total age",
+                         height=350,
+                     )
+                     st.plotly_chart(fig_lorenz, use_container_width=True)
+                 else:
+                     st.info("Not enough data to plot Lorenz curve")
+             except Exception:
+                 st.info("Unable to compute Lorenz curve for current data")
+
        # Case type fairness
        if "case_type" in events_df.columns:
            st.markdown("---")

              fig.update_layout(height=400, xaxis_tickangle=-45)
              st.plotly_chart(fig, use_container_width=True)

+             # Age distribution by case type (top N by cases)
+             st.markdown("#### Age Distribution by Case Type (Top 8)")
+             try:
+                 # Map each case_id to a case_type (take the first occurrence)
+                 cid_to_type = (
+                     events_df.sort_values("date")
+                     .groupby("case_id")["case_type"]
+                     .first()
+                 )
+                 age_with_type = (
+                     case_dates[["age_days"]]
+                     .join(cid_to_type, how="left")
+                     .dropna(subset=["case_type"])  # keep only cases with type
+                 )
+                 top_types = (
+                     age_with_type["case_type"].value_counts().head(8).index.tolist()
+                 )
+                 filt = age_with_type["case_type"].isin(top_types)
+                 fig_box = px.box(
+                     age_with_type[filt],
+                     x="case_type",
+                     y="age_days",
+                     points="outliers",
+                     title="Case Age by Case Type (Top 8)",
+                     labels={"case_type": "Case Type", "age_days": "Age (days)"},
+                 )
+                 fig_box.update_layout(height=420, xaxis_tickangle=-45)
+                 st.plotly_chart(fig_box, use_container_width=True)
+
+                 # Gini by case type (Top 8)
+                 st.markdown("#### Inequality by Case Type (Gini)")
+                 gini_rows = []
+                 for ctype in top_types:
+                     vals = age_with_type.loc[
+                         age_with_type["case_type"] == ctype, "age_days"
+                     ].to_numpy()
+                     g = _gini(vals)
+                     gini_rows.append({"case_type": ctype, "gini": g})
+                 gini_df = pd.DataFrame(gini_rows).dropna()
+                 if not gini_df.empty:
+                     fig_gini = px.bar(
+                         gini_df,
+                         x="case_type",
+                         y="gini",
+                         title="Gini Coefficient by Case Type (Top 8)",
+                         labels={"case_type": "Case Type", "gini": "Gini"},
+                     )
+                     fig_gini.update_layout(
+                         height=380, xaxis_tickangle=-45, yaxis_range=[0, 1]
+                     )
+                     st.plotly_chart(fig_gini, use_container_width=True)
+                 else:
+                     st.info("Insufficient data to compute per-type Gini")
+             except Exception:
+                 st.info("Unable to compute per-type age distributions for current data")
+
  except Exception as e:
      st.error(f"Error loading events data: {e}")
505