# CLAUDE.md This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. ## Standard Task Workflow For tasks of implementing **new features**: 1. Read PRD.md, Plan.md, Progress.md before coding 2. Summarize current project state before implementation 3. Carry out the implementatation; after that, build and test if possible 4. Update Progress.md after changes 5. Commit with a clear, concise message For tasks of **bug fixing**: 1. Summarize the bug, reason and solution before implementation 2. Carry out the implementation to fix the bug; build and test afterwards; 3. Update Progress.md after changes 4. Commit with a clear, concise message For tasks of **reboot** from a new codex session: 1. Read doc/v0/PRD.md, doc/v0/Plan.md, doc/v0/Progress.md for baseline implementation 2. Read doc/v1/PRD.md, doc/v1/Plan.md, doc/v1/Progress.md for edge refinement (v1) 3. Assume this is a continuation of an existing project. 4. Summarize your understanding of the current state and propose the next concrete step without writing code yet. ## Project Overview Ring Sizer is a **local, terminal-executable computer vision program** that measures the outer width (diameter) of a finger at the ring-wearing zone using a single RGB image. It uses a standard credit card (ISO/IEC 7810 ID-1: 85.60mm × 53.98mm) as a physical size reference for scale calibration. **Key characteristics:** - Single image input (JPG/PNG) - **v1: Dual edge detection** - Landmark-based axis + Sobel gradient refinement - MediaPipe-based hand and finger segmentation - MediaPipe-based hand and finger segmentation - Outputs JSON measurement data and optional debug visualization - No cloud processing, runs entirely locally - Python 3.8+ with OpenCV, NumPy, MediaPipe, and SciPy ## Development Commands ### Installation ```bash # Create virtual environment (recommended) python -m venv .venv source .venv/bin/activate # On Windows: .venv\Scripts\activate # Install dependencies pip install -r requirements.txt ``` ### Running the Program ```bash # Basic measurement (defaults to index finger, auto edge detection) python measure_finger.py --input input/test_image.jpg --output output/result.json # Measure specific finger (index, middle, ring, or auto) python measure_finger.py \ --input input/test_image.jpg \ --output output/result.json \ --finger-index ring # With debug visualization python measure_finger.py \ --input input/test_image.jpg \ --output output/result.json \ --finger-index middle \ --debug output/debug_overlay.png # Force Sobel edge refinement (v1) python measure_finger.py \ --input image.jpg \ --output result.json \ --finger-index ring \ --edge-method sobel \ --sobel-threshold 15.0 \ --debug output/debug.png # Compare both methods python measure_finger.py \ --input image.jpg \ --output result.json \ --finger-index middle \ --edge-method compare \ --debug output/debug.png # Force contour method (v0) python measure_finger.py \ --input image.jpg \ --output result.json \ --finger-index index \ --edge-method contour ``` ## Architecture Overview ### Processing Pipeline (9 Phases) The measurement pipeline follows a strict sequential flow: 1. **Image Quality Check** - Blur detection, exposure validation, resolution check 2. **Credit Card Detection & Scale Calibration** - Detects card, verifies aspect ratio (~1.586), computes `px_per_cm` 3. **Hand & Finger Segmentation** - MediaPipe hand detection, finger isolation, mask generation 4. **Finger Contour Extraction** - Extracts outer contour from cleaned mask 5. **Finger Axis Estimation** - PCA-based principal axis calculation, determines palm-end vs tip-end 6. **Ring-Wearing Zone Localization** - Defines zone at 15%-25% of finger length from palm-side 7. **Width Measurement** - Samples 20 cross-sections perpendicular to axis, uses median width 8. **Confidence Scoring** - Multi-factor scoring (card 30%, finger 30%, measurement 40%) 9. **Debug Visualization** - Generates annotated overlay image ### Module Structure The codebase is organized into focused utility modules in `src/`: | Module | Primary Responsibilities | |--------|--------------------------| | `card_detection.py` | Credit card detection, perspective correction, scale calibration (`px_per_cm`) | | `finger_segmentation.py` | MediaPipe integration, hand/finger isolation, mask cleaning, contour extraction | | `geometry.py` | PCA axis estimation, ring zone localization, cross-section width measurement, line-contour intersections | | `image_quality.py` | Blur detection (Laplacian variance), exposure checks, resolution validation | | `confidence.py` | Component confidence scoring (card, finger, measurement), overall confidence computation | | `visualization.py` | Debug overlay generation with contours, zones, measurements, and annotations | ### Key Design Decisions **Ring-Wearing Zone Definition:** - Located at 15%-25% of finger length from palm-side end - Width measured by sampling 20 cross-sections within this zone - Final measurement is the **median width** (robust to outliers) **Axis Estimation:** - Uses PCA (Principal Component Analysis) on finger mask points - Determines palm-end vs tip-end using either: 1. MediaPipe landmarks (preferred, if available) 2. Thickness heuristic (thinner end is likely the tip) **Confidence Scoring:** - 3-component weighted average: Card (30%) + Finger (30%) + Measurement (40%) - Confidence levels: HIGH (>0.85), MEDIUM (0.6-0.85), LOW (<0.6) - Factors: card detection quality, finger mask area, width variance, aspect ratios **Measurement Approach:** - Perpendicular cross-sections to finger axis - Line-contour intersection algorithm finds left/right edges - Uses farthest pair of intersections to handle complex contours - Converts pixels to cm using calibrated scale factor --- ## v1 Architecture (Edge Refinement) ### What's New in v1 v1 improves measurement accuracy by replacing contour-based edge detection with gradient-based Sobel edge refinement. Key improvements: - **Landmark-based axis**: Uses MediaPipe finger landmarks (MCP→PIP→DIP→TIP) for more anatomically consistent axis estimation - **Sobel edge detection**: Bidirectional gradient filtering for pixel-precise edge localization - **Sub-pixel refinement**: Parabola fitting achieves <0.5px precision (~0.003cm at typical resolution) - **Quality-based fallback**: Automatically uses v0 contour method if Sobel quality insufficient - **Enhanced confidence**: Adds edge quality component (gradient strength, consistency, smoothness, symmetry) ### v1 Processing Pipeline (Enhanced Phases) **Phase 5a: Landmark-Based Axis Estimation (v1)** - Uses MediaPipe finger landmarks directly (4 points: MCP, PIP, DIP, TIP) - **Finger selection**: Defaults to index finger, can specify middle or ring finger via `--finger-index` - Orientation detection uses the **specified finger** for axis calculation (wrist → finger tip) - Image automatically rotated to canonical orientation (wrist at bottom, fingers pointing up) - Three axis calculation methods: - `endpoints`: Simple MCP→TIP vector - `linear_fit`: Linear regression on all 4 landmarks (default, most robust) - `median_direction`: Median of segment directions - Falls back to PCA if landmarks unavailable or quality check fails - Validation checks: NaN/inf, minimum spacing, monotonic progression, minimum length **Phase 7b: Sobel Edge Refinement (v1)** ``` 1. Extract ROI around ring zone → 2. Apply bidirectional Sobel filters → 3. Detect edges per cross-section → 4. Sub-pixel refinement → 5. Measure width ``` 1. **ROI Extraction** - Rectangular region around ring zone with padding (50px for gradient context) - Width estimation: `finger_length / 3.0` (conservative) - Optional rotation alignment (not used by default) 2. **Bidirectional Sobel Filtering** - Applies `cv2.Sobel` with configurable kernel size (3, 5, or 7) - Computes gradient_x (horizontal edges), gradient_y (vertical edges) - Calculates gradient magnitude and direction - Auto-detects filter orientation from ROI aspect ratio 3. **Edge Detection Per Cross-Section** - **Mask-constrained mode** (primary): - Finds leftmost/rightmost finger mask pixels (finger boundaries) - Searches ±10px around boundaries for strongest gradient - Combines anatomical accuracy (mask) with sub-pixel precision (gradient) - **Gradient-only mode** (fallback): Pure Sobel without mask constraint 4. **Sub-Pixel Edge Localization** - Parabola fitting: f(x) = ax² + bx + c - Samples gradient at x-1, x, x+1 - Finds parabola peak: x_peak = -b/(2a) - Constrains refinement to ±0.5 pixels - Achieves <0.5px precision (~0.003cm at 185 px/cm) 5. **Width Measurement** - Calculates width for each valid row - Outlier filtering using Median Absolute Deviation (MAD) - Removes measurements >3 MAD from median - Computes median, mean, std dev - Converts pixels to cm using scale factor **Phase 8b: Enhanced Confidence Scoring (v1)** - Adds 4th component: Edge Quality (20% weight) - Gradient strength: Avg magnitude at detected edges - Consistency: % of rows with valid edge pairs - Smoothness: Edge position variance (lower = better) - Symmetry: Left/right edge strength balance - Reweights other components: Card 25%, Finger 25%, Measurement 30% ### v1 Module Structure | Module | v1 Enhancements | |--------|-----------------| | `geometry.py` | Added `estimate_finger_axis_from_landmarks()`, `_validate_landmark_quality()`, landmark-based zone localization | | **`edge_refinement.py`** | **[NEW]** Complete Sobel edge refinement pipeline with sub-pixel precision | | `confidence.py` | Added `compute_edge_quality_confidence()`, dual-mode confidence calculation | | `debug_observer.py` | Added 9 edge refinement drawing functions for visualization | | `measure_finger.py` | CLI flags for edge method selection, method comparison mode | ### v1 CLI Flags | Flag | Values | Default | Description | |------|--------|---------|-------------| | `--finger-index` | auto, index, middle, ring, pinky | **index** | Which finger to measure and use for orientation | | `--edge-method` | auto, contour, sobel, compare | auto | Edge detection method | | `--sobel-threshold` | float | 15.0 | Minimum gradient magnitude | | `--sobel-kernel-size` | 3, 5, 7 | 3 | Sobel kernel size | | `--no-subpixel` | flag | False | Disable sub-pixel refinement | ### v1 Auto Mode Behavior When `--edge-method auto` (default): 1. Always computes contour measurement (v0 baseline) 2. Attempts Sobel edge refinement 3. Evaluates Sobel quality score (threshold: 0.7) 4. Checks consistency (>50% success rate required) 5. Verifies width reasonableness (0.8-3.5 cm) 6. Checks agreement with contour (<50% difference) 7. Uses Sobel if all checks pass, otherwise falls back to contour 8. Reports method used in `edge_method_used` field ### v1 Debug Output When `--debug` flag used, generates: - Main debug overlay (same as v0, shows final result) - `output/edge_refinement_debug/` subdirectory with 12 images: - **Stage A** (3): Landmark axis, ring zone, ROI extraction - **Stage B** (5): Sobel gradients, candidates, selected edges - **Stage C** (4): Sub-pixel refinement, widths, distribution, outliers ### v1 Failure Modes (Additional) - `sobel_edge_refinement_failed` - Sobel method explicitly requested but failed - `quality_score_low_X.XX` - Edge quality below threshold (auto fallback) - `consistency_low_X.XX` - Too few valid edge detections - `width_unreasonable` - Measured width outside realistic range - `disagreement_with_contour` - Sobel and contour differ by >50% --- ## Important Technical Details ### What This Measures The system measures the **external horizontal width** (outer diameter) of the finger at the ring-wearing zone. This is: - ✅ The width of soft tissue + bone at the ring-wearing position - ❌ NOT the inner diameter of a ring - Used as a geometric proxy for downstream ring size mapping (out of scope for v0) ### Coordinate Systems - Images use standard OpenCV format: (row, col) = (y, x) - Most geometry functions work in (x, y) format - Contours are Nx2 arrays in (x, y) format - Careful conversion needed between formats (see `geometry.py:35`) ### MediaPipe Integration - Uses pretrained hand landmark detection model (no custom training) - Provides 21 hand landmarks per hand - Each finger has 4 landmarks: MCP (base), PIP, DIP, TIP - Finger indices: 0=thumb, 1=index, 2=middle, 3=ring, 4=pinky - **Orientation detection**: Uses wrist → specified finger tip to determine hand rotation - **Automatic rotation**: Image rotated to canonical orientation (wrist at bottom, fingers up) based on selected finger ### Input Requirements For optimal results: - Resolution: 1080p or higher recommended - View angle: Near top-down view - **Finger**: One finger extended (index, middle, or ring). Specify with `--finger-index` - Credit card: Must show at least 3 corners, aspect ratio ~1.586 - Finger and card must be on the same plane - Good lighting, minimal blur ### Failure Modes The system can fail at various stages: - `card_not_detected` - Credit card not found or aspect ratio invalid - `hand_not_detected` - No hand detected by MediaPipe - `finger_isolation_failed` - Could not isolate specified finger - `finger_mask_too_small` - Mask area too small after cleaning - `contour_extraction_failed` - Could not extract valid contour - `axis_estimation_failed` - PCA failed or insufficient points - `zone_localization_failed` - Could not define ring zone - `width_measurement_failed` - No valid cross-section intersections ## Output Format ### JSON Output Structure ```json { "finger_outer_diameter_cm": 1.78, "confidence": 0.86, "scale_px_per_cm": 42.3, "quality_flags": { "card_detected": true, "finger_detected": true, "view_angle_ok": true }, "fail_reason": null } ``` ### Debug Visualization Features When `--debug` flag is used, generates an annotated image with: - Credit card contour and corners (green) - Finger contour (magenta, thick lines) - Finger axis and endpoints (cyan/yellow) - Ring-wearing zone band (yellow, semi-transparent) - Cross-section sampling lines (orange) - Measurement intersection points (blue circles) - Final measurement and confidence text (large, readable font) ## Code Patterns and Conventions ### Error Handling - Functions return `None` or raise exceptions on failure - Main pipeline (`measure_finger()`) returns structured output dict with `fail_reason` - Console logging provides detailed progress information ### Type Hints - Extensive use of type hints throughout - Dict return types with `Dict[str, Any]` for structured data - NumPy arrays typed as `np.ndarray` - Literal types for enums (e.g., `FingerIndex`) ### Data Flow - All major functions return dictionaries with consistent keys - Downstream functions accept upstream outputs directly - Debug visualization receives all intermediate results - Clean separation between detection, computation, and visualization ### Validation and Sanity Checks - Finger width should be in realistic range: 1.0-3.0 cm (typical: 1.4-2.4 cm) - Credit card aspect ratio should be close to 1.586 - View angle check: scale confidence should be >0.9 for accurate measurements - Minimum mask area threshold prevents false detections