Spaces:
Sleeping
CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Standard Task Workflow
For tasks of implementing new features:
- Read PRD.md, Plan.md, Progress.md before coding
- Summarize current project state before implementation
- Carry out the implementatation; after that, build and test if possible
- Update Progress.md after changes
- Commit with a clear, concise message
For tasks of bug fixing:
- Summarize the bug, reason and solution before implementation
- Carry out the implementation to fix the bug; build and test afterwards;
- Update Progress.md after changes
- Commit with a clear, concise message
For tasks of reboot from a new codex session:
- Read doc/v0/PRD.md, doc/v0/Plan.md, doc/v0/Progress.md for baseline implementation
- Read doc/v1/PRD.md, doc/v1/Plan.md, doc/v1/Progress.md for edge refinement (v1)
- Assume this is a continuation of an existing project.
- Summarize your understanding of the current state and propose the next concrete step without writing code yet.
Project Overview
Ring Sizer is a local, terminal-executable computer vision program that measures the outer width (diameter) of a finger at the ring-wearing zone using a single RGB image. It uses a standard credit card (ISO/IEC 7810 ID-1: 85.60mm × 53.98mm) as a physical size reference for scale calibration.
Key characteristics:
- Single image input (JPG/PNG)
- v1: Dual edge detection - Landmark-based axis + Sobel gradient refinement
- MediaPipe-based hand and finger segmentation
- MediaPipe-based hand and finger segmentation
- Outputs JSON measurement data and optional debug visualization
- No cloud processing, runs entirely locally
- Python 3.8+ with OpenCV, NumPy, MediaPipe, and SciPy
Development Commands
Installation
# Create virtual environment (recommended)
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
Running the Program
# Basic measurement (defaults to index finger, auto edge detection)
python measure_finger.py --input input/test_image.jpg --output output/result.json
# Measure specific finger (index, middle, ring, or auto)
python measure_finger.py \
--input input/test_image.jpg \
--output output/result.json \
--finger-index ring
# With debug visualization
python measure_finger.py \
--input input/test_image.jpg \
--output output/result.json \
--finger-index middle \
--debug output/debug_overlay.png
# Force Sobel edge refinement (v1)
python measure_finger.py \
--input image.jpg \
--output result.json \
--finger-index ring \
--edge-method sobel \
--sobel-threshold 15.0 \
--debug output/debug.png
# Compare both methods
python measure_finger.py \
--input image.jpg \
--output result.json \
--finger-index middle \
--edge-method compare \
--debug output/debug.png
# Force contour method (v0)
python measure_finger.py \
--input image.jpg \
--output result.json \
--finger-index index \
--edge-method contour
Architecture Overview
Processing Pipeline (9 Phases)
The measurement pipeline follows a strict sequential flow:
- Image Quality Check - Blur detection, exposure validation, resolution check
- Credit Card Detection & Scale Calibration - Detects card, verifies aspect ratio (~1.586), computes
px_per_cm - Hand & Finger Segmentation - MediaPipe hand detection, finger isolation, mask generation
- Finger Contour Extraction - Extracts outer contour from cleaned mask
- Finger Axis Estimation - PCA-based principal axis calculation, determines palm-end vs tip-end
- Ring-Wearing Zone Localization - Defines zone at 15%-25% of finger length from palm-side
- Width Measurement - Samples 20 cross-sections perpendicular to axis, uses median width
- Confidence Scoring - Multi-factor scoring (card 30%, finger 30%, measurement 40%)
- Debug Visualization - Generates annotated overlay image
Module Structure
The codebase is organized into focused utility modules in src/:
| Module | Primary Responsibilities |
|---|---|
card_detection.py |
Credit card detection, perspective correction, scale calibration (px_per_cm) |
finger_segmentation.py |
MediaPipe integration, hand/finger isolation, mask cleaning, contour extraction |
geometry.py |
PCA axis estimation, ring zone localization, cross-section width measurement, line-contour intersections |
image_quality.py |
Blur detection (Laplacian variance), exposure checks, resolution validation |
confidence.py |
Component confidence scoring (card, finger, measurement), overall confidence computation |
visualization.py |
Debug overlay generation with contours, zones, measurements, and annotations |
Key Design Decisions
Ring-Wearing Zone Definition:
- Located at 15%-25% of finger length from palm-side end
- Width measured by sampling 20 cross-sections within this zone
- Final measurement is the median width (robust to outliers)
Axis Estimation:
- Uses PCA (Principal Component Analysis) on finger mask points
- Determines palm-end vs tip-end using either:
- MediaPipe landmarks (preferred, if available)
- Thickness heuristic (thinner end is likely the tip)
Confidence Scoring:
- 3-component weighted average: Card (30%) + Finger (30%) + Measurement (40%)
- Confidence levels: HIGH (>0.85), MEDIUM (0.6-0.85), LOW (<0.6)
- Factors: card detection quality, finger mask area, width variance, aspect ratios
Measurement Approach:
- Perpendicular cross-sections to finger axis
- Line-contour intersection algorithm finds left/right edges
- Uses farthest pair of intersections to handle complex contours
- Converts pixels to cm using calibrated scale factor
v1 Architecture (Edge Refinement)
What's New in v1
v1 improves measurement accuracy by replacing contour-based edge detection with gradient-based Sobel edge refinement. Key improvements:
- Landmark-based axis: Uses MediaPipe finger landmarks (MCP→PIP→DIP→TIP) for more anatomically consistent axis estimation
- Sobel edge detection: Bidirectional gradient filtering for pixel-precise edge localization
- Sub-pixel refinement: Parabola fitting achieves <0.5px precision (~0.003cm at typical resolution)
- Quality-based fallback: Automatically uses v0 contour method if Sobel quality insufficient
- Enhanced confidence: Adds edge quality component (gradient strength, consistency, smoothness, symmetry)
v1 Processing Pipeline (Enhanced Phases)
Phase 5a: Landmark-Based Axis Estimation (v1)
- Uses MediaPipe finger landmarks directly (4 points: MCP, PIP, DIP, TIP)
- Finger selection: Defaults to index finger, can specify middle or ring finger via
--finger-index - Orientation detection uses the specified finger for axis calculation (wrist → finger tip)
- Image automatically rotated to canonical orientation (wrist at bottom, fingers pointing up)
- Three axis calculation methods:
endpoints: Simple MCP→TIP vectorlinear_fit: Linear regression on all 4 landmarks (default, most robust)median_direction: Median of segment directions
- Falls back to PCA if landmarks unavailable or quality check fails
- Validation checks: NaN/inf, minimum spacing, monotonic progression, minimum length
Phase 7b: Sobel Edge Refinement (v1)
1. Extract ROI around ring zone → 2. Apply bidirectional Sobel filters →
3. Detect edges per cross-section → 4. Sub-pixel refinement → 5. Measure width
ROI Extraction
- Rectangular region around ring zone with padding (50px for gradient context)
- Width estimation:
finger_length / 3.0(conservative) - Optional rotation alignment (not used by default)
Bidirectional Sobel Filtering
- Applies
cv2.Sobelwith configurable kernel size (3, 5, or 7) - Computes gradient_x (horizontal edges), gradient_y (vertical edges)
- Calculates gradient magnitude and direction
- Auto-detects filter orientation from ROI aspect ratio
- Applies
Edge Detection Per Cross-Section
- Mask-constrained mode (primary):
- Finds leftmost/rightmost finger mask pixels (finger boundaries)
- Searches ±10px around boundaries for strongest gradient
- Combines anatomical accuracy (mask) with sub-pixel precision (gradient)
- Gradient-only mode (fallback): Pure Sobel without mask constraint
- Mask-constrained mode (primary):
Sub-Pixel Edge Localization
- Parabola fitting: f(x) = ax² + bx + c
- Samples gradient at x-1, x, x+1
- Finds parabola peak: x_peak = -b/(2a)
- Constrains refinement to ±0.5 pixels
- Achieves <0.5px precision (~0.003cm at 185 px/cm)
Width Measurement
- Calculates width for each valid row
- Outlier filtering using Median Absolute Deviation (MAD)
- Removes measurements >3 MAD from median
- Computes median, mean, std dev
- Converts pixels to cm using scale factor
Phase 8b: Enhanced Confidence Scoring (v1)
- Adds 4th component: Edge Quality (20% weight)
- Gradient strength: Avg magnitude at detected edges
- Consistency: % of rows with valid edge pairs
- Smoothness: Edge position variance (lower = better)
- Symmetry: Left/right edge strength balance
- Reweights other components: Card 25%, Finger 25%, Measurement 30%
v1 Module Structure
| Module | v1 Enhancements |
|---|---|
geometry.py |
Added estimate_finger_axis_from_landmarks(), _validate_landmark_quality(), landmark-based zone localization |
edge_refinement.py |
[NEW] Complete Sobel edge refinement pipeline with sub-pixel precision |
confidence.py |
Added compute_edge_quality_confidence(), dual-mode confidence calculation |
debug_observer.py |
Added 9 edge refinement drawing functions for visualization |
measure_finger.py |
CLI flags for edge method selection, method comparison mode |
v1 CLI Flags
| Flag | Values | Default | Description |
|---|---|---|---|
--finger-index |
auto, index, middle, ring, pinky | index | Which finger to measure and use for orientation |
--edge-method |
auto, contour, sobel, compare | auto | Edge detection method |
--sobel-threshold |
float | 15.0 | Minimum gradient magnitude |
--sobel-kernel-size |
3, 5, 7 | 3 | Sobel kernel size |
--no-subpixel |
flag | False | Disable sub-pixel refinement |
v1 Auto Mode Behavior
When --edge-method auto (default):
- Always computes contour measurement (v0 baseline)
- Attempts Sobel edge refinement
- Evaluates Sobel quality score (threshold: 0.7)
- Checks consistency (>50% success rate required)
- Verifies width reasonableness (0.8-3.5 cm)
- Checks agreement with contour (<50% difference)
- Uses Sobel if all checks pass, otherwise falls back to contour
- Reports method used in
edge_method_usedfield
v1 Debug Output
When --debug flag used, generates:
- Main debug overlay (same as v0, shows final result)
output/edge_refinement_debug/subdirectory with 12 images:- Stage A (3): Landmark axis, ring zone, ROI extraction
- Stage B (5): Sobel gradients, candidates, selected edges
- Stage C (4): Sub-pixel refinement, widths, distribution, outliers
v1 Failure Modes (Additional)
sobel_edge_refinement_failed- Sobel method explicitly requested but failedquality_score_low_X.XX- Edge quality below threshold (auto fallback)consistency_low_X.XX- Too few valid edge detectionswidth_unreasonable- Measured width outside realistic rangedisagreement_with_contour- Sobel and contour differ by >50%
Important Technical Details
What This Measures
The system measures the external horizontal width (outer diameter) of the finger at the ring-wearing zone. This is:
- ✅ The width of soft tissue + bone at the ring-wearing position
- ❌ NOT the inner diameter of a ring
- Used as a geometric proxy for downstream ring size mapping (out of scope for v0)
Coordinate Systems
- Images use standard OpenCV format: (row, col) = (y, x)
- Most geometry functions work in (x, y) format
- Contours are Nx2 arrays in (x, y) format
- Careful conversion needed between formats (see
geometry.py:35)
MediaPipe Integration
- Uses pretrained hand landmark detection model (no custom training)
- Provides 21 hand landmarks per hand
- Each finger has 4 landmarks: MCP (base), PIP, DIP, TIP
- Finger indices: 0=thumb, 1=index, 2=middle, 3=ring, 4=pinky
- Orientation detection: Uses wrist → specified finger tip to determine hand rotation
- Automatic rotation: Image rotated to canonical orientation (wrist at bottom, fingers up) based on selected finger
Input Requirements
For optimal results:
- Resolution: 1080p or higher recommended
- View angle: Near top-down view
- Finger: One finger extended (index, middle, or ring). Specify with
--finger-index - Credit card: Must show at least 3 corners, aspect ratio ~1.586
- Finger and card must be on the same plane
- Good lighting, minimal blur
Failure Modes
The system can fail at various stages:
card_not_detected- Credit card not found or aspect ratio invalidhand_not_detected- No hand detected by MediaPipefinger_isolation_failed- Could not isolate specified fingerfinger_mask_too_small- Mask area too small after cleaningcontour_extraction_failed- Could not extract valid contouraxis_estimation_failed- PCA failed or insufficient pointszone_localization_failed- Could not define ring zonewidth_measurement_failed- No valid cross-section intersections
Output Format
JSON Output Structure
{
"finger_outer_diameter_cm": 1.78,
"confidence": 0.86,
"scale_px_per_cm": 42.3,
"quality_flags": {
"card_detected": true,
"finger_detected": true,
"view_angle_ok": true
},
"fail_reason": null
}
Debug Visualization Features
When --debug flag is used, generates an annotated image with:
- Credit card contour and corners (green)
- Finger contour (magenta, thick lines)
- Finger axis and endpoints (cyan/yellow)
- Ring-wearing zone band (yellow, semi-transparent)
- Cross-section sampling lines (orange)
- Measurement intersection points (blue circles)
- Final measurement and confidence text (large, readable font)
Code Patterns and Conventions
Error Handling
- Functions return
Noneor raise exceptions on failure - Main pipeline (
measure_finger()) returns structured output dict withfail_reason - Console logging provides detailed progress information
Type Hints
- Extensive use of type hints throughout
- Dict return types with
Dict[str, Any]for structured data - NumPy arrays typed as
np.ndarray - Literal types for enums (e.g.,
FingerIndex)
Data Flow
- All major functions return dictionaries with consistent keys
- Downstream functions accept upstream outputs directly
- Debug visualization receives all intermediate results
- Clean separation between detection, computation, and visualization
Validation and Sanity Checks
- Finger width should be in realistic range: 1.0-3.0 cm (typical: 1.4-2.4 cm)
- Credit card aspect ratio should be close to 1.586
- View angle check: scale confidence should be >0.9 for accurate measurements
- Minimum mask area threshold prevents false detections