Spaces:

feng-x
/

ring-sizer

Sleeping

Read doc/v0/PRD.md, doc/v0/Plan.md, doc/v0/Progress.md for baseline implementation
Read doc/v1/PRD.md, doc/v1/Plan.md, doc/v1/Progress.md for edge refinement (v1)
Assume this is a continuation of an existing project.
Summarize your understanding of the current state and propose the next concrete step without writing code yet.

Project Overview

Ring Sizer is a local, terminal-executable computer vision program that measures the outer width (diameter) of a finger at the ring-wearing zone using a single RGB image. It uses a standard credit card (ISO/IEC 7810 ID-1: 85.60mm × 53.98mm) as a physical size reference for scale calibration.

Key characteristics:

Single image input (JPG/PNG)
v1: Dual edge detection - Landmark-based axis + Sobel gradient refinement
MediaPipe-based hand and finger segmentation
MediaPipe-based hand and finger segmentation
Outputs JSON measurement data and optional debug visualization
No cloud processing, runs entirely locally
Python 3.8+ with OpenCV, NumPy, MediaPipe, and SciPy

Development Commands

Installation

# Create virtual environment (recommended)
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Running the Program

# Basic measurement (defaults to index finger, auto edge detection)
python measure_finger.py --input input/test_image.jpg --output output/result.json

# Measure specific finger (index, middle, ring, or auto)
python measure_finger.py \
  --input input/test_image.jpg \
  --output output/result.json \
  --finger-index ring

# With debug visualization
python measure_finger.py \
  --input input/test_image.jpg \
  --output output/result.json \
  --finger-index middle \
  --debug output/debug_overlay.png

# Force Sobel edge refinement (v1)
python measure_finger.py \
  --input image.jpg \
  --output result.json \
  --finger-index ring \
  --edge-method sobel \
  --sobel-threshold 15.0 \
  --debug output/debug.png

# Compare both methods
python measure_finger.py \
  --input image.jpg \
  --output result.json \
  --finger-index middle \
  --edge-method compare \
  --debug output/debug.png

# Force contour method (v0)
python measure_finger.py \
  --input image.jpg \
  --output result.json \
  --finger-index index \
  --edge-method contour

Architecture Overview

Processing Pipeline (9 Phases)

The measurement pipeline follows a strict sequential flow:

Image Quality Check - Blur detection, exposure validation, resolution check
Credit Card Detection & Scale Calibration - Detects card, verifies aspect ratio (~1.586), computes px_per_cm
Hand & Finger Segmentation - MediaPipe hand detection, finger isolation, mask generation
Finger Contour Extraction - Extracts outer contour from cleaned mask
Finger Axis Estimation - PCA-based principal axis calculation, determines palm-end vs tip-end
Ring-Wearing Zone Localization - Defines zone at 15%-25% of finger length from palm-side
Width Measurement - Samples 20 cross-sections perpendicular to axis, uses median width
Confidence Scoring - Multi-factor scoring (card 30%, finger 30%, measurement 40%)
Debug Visualization - Generates annotated overlay image

Module Structure

The codebase is organized into focused utility modules in src/:

Module	Primary Responsibilities
`card_detection.py`	Credit card detection, perspective correction, scale calibration (`px_per_cm`)
`finger_segmentation.py`	MediaPipe integration, hand/finger isolation, mask cleaning, contour extraction
`geometry.py`	PCA axis estimation, ring zone localization, cross-section width measurement, line-contour intersections
`image_quality.py`	Blur detection (Laplacian variance), exposure checks, resolution validation
`confidence.py`	Component confidence scoring (card, finger, measurement), overall confidence computation
`visualization.py`	Debug overlay generation with contours, zones, measurements, and annotations

Key Design Decisions

Ring-Wearing Zone Definition:

Located at 15%-25% of finger length from palm-side end
Width measured by sampling 20 cross-sections within this zone
Final measurement is the median width (robust to outliers)

Axis Estimation:

Uses PCA (Principal Component Analysis) on finger mask points
Determines palm-end vs tip-end using either:
1. MediaPipe landmarks (preferred, if available)
2. Thickness heuristic (thinner end is likely the tip)

Confidence Scoring:

3-component weighted average: Card (30%) + Finger (30%) + Measurement (40%)
Confidence levels: HIGH (>0.85), MEDIUM (0.6-0.85), LOW (<0.6)
Factors: card detection quality, finger mask area, width variance, aspect ratios

Measurement Approach:

Perpendicular cross-sections to finger axis
Line-contour intersection algorithm finds left/right edges
Uses farthest pair of intersections to handle complex contours
Converts pixels to cm using calibrated scale factor

v1 Architecture (Edge Refinement)

What's New in v1

v1 improves measurement accuracy by replacing contour-based edge detection with gradient-based Sobel edge refinement. Key improvements:

Landmark-based axis: Uses MediaPipe finger landmarks (MCP→PIP→DIP→TIP) for more anatomically consistent axis estimation
Sobel edge detection: Bidirectional gradient filtering for pixel-precise edge localization
Sub-pixel refinement: Parabola fitting achieves <0.5px precision (~0.003cm at typical resolution)
Quality-based fallback: Automatically uses v0 contour method if Sobel quality insufficient
Enhanced confidence: Adds edge quality component (gradient strength, consistency, smoothness, symmetry)

v1 Processing Pipeline (Enhanced Phases)

Phase 5a: Landmark-Based Axis Estimation (v1)

Uses MediaPipe finger landmarks directly (4 points: MCP, PIP, DIP, TIP)
Finger selection: Defaults to index finger, can specify middle or ring finger via --finger-index
Orientation detection uses the specified finger for axis calculation (wrist → finger tip)
Image automatically rotated to canonical orientation (wrist at bottom, fingers pointing up)
Three axis calculation methods:
- endpoints: Simple MCP→TIP vector
- linear_fit: Linear regression on all 4 landmarks (default, most robust)
- median_direction: Median of segment directions
Falls back to PCA if landmarks unavailable or quality check fails
Validation checks: NaN/inf, minimum spacing, monotonic progression, minimum length

Phase 7b: Sobel Edge Refinement (v1)

1. Extract ROI around ring zone → 2. Apply bidirectional Sobel filters →
3. Detect edges per cross-section → 4. Sub-pixel refinement → 5. Measure width

ROI Extraction
- Rectangular region around ring zone with padding (50px for gradient context)
- Width estimation: finger_length / 3.0 (conservative)
- Optional rotation alignment (not used by default)
Bidirectional Sobel Filtering
- Applies cv2.Sobel with configurable kernel size (3, 5, or 7)
- Computes gradient_x (horizontal edges), gradient_y (vertical edges)
- Calculates gradient magnitude and direction
- Auto-detects filter orientation from ROI aspect ratio
Edge Detection Per Cross-Section
- Mask-constrained mode (primary):
  - Finds leftmost/rightmost finger mask pixels (finger boundaries)
  - Searches ±10px around boundaries for strongest gradient
  - Combines anatomical accuracy (mask) with sub-pixel precision (gradient)
- Gradient-only mode (fallback): Pure Sobel without mask constraint
Sub-Pixel Edge Localization
- Parabola fitting: f(x) = ax² + bx + c
- Samples gradient at x-1, x, x+1
- Finds parabola peak: x_peak = -b/(2a)
- Constrains refinement to ±0.5 pixels
- Achieves <0.5px precision (~0.003cm at 185 px/cm)
Width Measurement
- Calculates width for each valid row
- Outlier filtering using Median Absolute Deviation (MAD)
- Removes measurements >3 MAD from median
- Computes median, mean, std dev
- Converts pixels to cm using scale factor

Phase 8b: Enhanced Confidence Scoring (v1)

Adds 4th component: Edge Quality (20% weight)
- Gradient strength: Avg magnitude at detected edges
- Consistency: % of rows with valid edge pairs
- Smoothness: Edge position variance (lower = better)
- Symmetry: Left/right edge strength balance
Reweights other components: Card 25%, Finger 25%, Measurement 30%

v1 Module Structure

Module	v1 Enhancements
`geometry.py`	Added `estimate_finger_axis_from_landmarks()`, `_validate_landmark_quality()`, landmark-based zone localization
`edge_refinement.py`	[NEW] Complete Sobel edge refinement pipeline with sub-pixel precision
`confidence.py`	Added `compute_edge_quality_confidence()`, dual-mode confidence calculation
`debug_observer.py`	Added 9 edge refinement drawing functions for visualization
`measure_finger.py`	CLI flags for edge method selection, method comparison mode

v1 CLI Flags

Flag	Values	Default	Description
`--finger-index`	auto, index, middle, ring, pinky	index	Which finger to measure and use for orientation
`--edge-method`	auto, contour, sobel, compare	auto	Edge detection method
`--sobel-threshold`	float	15.0	Minimum gradient magnitude
`--sobel-kernel-size`	3, 5, 7	3	Sobel kernel size
`--no-subpixel`	flag	False	Disable sub-pixel refinement

v1 Auto Mode Behavior

When --edge-method auto (default):

Always computes contour measurement (v0 baseline)
Attempts Sobel edge refinement
Evaluates Sobel quality score (threshold: 0.7)
Checks consistency (>50% success rate required)
Verifies width reasonableness (0.8-3.5 cm)
Checks agreement with contour (<50% difference)
Uses Sobel if all checks pass, otherwise falls back to contour
Reports method used in edge_method_used field

v1 Debug Output

When --debug flag used, generates:

Main debug overlay (same as v0, shows final result)
output/edge_refinement_debug/ subdirectory with 12 images:
- Stage A (3): Landmark axis, ring zone, ROI extraction
- Stage B (5): Sobel gradients, candidates, selected edges
- Stage C (4): Sub-pixel refinement, widths, distribution, outliers

v1 Failure Modes (Additional)

sobel_edge_refinement_failed - Sobel method explicitly requested but failed
quality_score_low_X.XX - Edge quality below threshold (auto fallback)
consistency_low_X.XX - Too few valid edge detections
width_unreasonable - Measured width outside realistic range
disagreement_with_contour - Sobel and contour differ by >50%

Important Technical Details

What This Measures

The system measures the external horizontal width (outer diameter) of the finger at the ring-wearing zone. This is:

✅ The width of soft tissue + bone at the ring-wearing position
❌ NOT the inner diameter of a ring
Used as a geometric proxy for downstream ring size mapping (out of scope for v0)

Coordinate Systems

Images use standard OpenCV format: (row, col) = (y, x)
Most geometry functions work in (x, y) format
Contours are Nx2 arrays in (x, y) format
Careful conversion needed between formats (see geometry.py:35)

MediaPipe Integration

Uses pretrained hand landmark detection model (no custom training)
Provides 21 hand landmarks per hand
Each finger has 4 landmarks: MCP (base), PIP, DIP, TIP
Finger indices: 0=thumb, 1=index, 2=middle, 3=ring, 4=pinky
Orientation detection: Uses wrist → specified finger tip to determine hand rotation
Automatic rotation: Image rotated to canonical orientation (wrist at bottom, fingers up) based on selected finger

Input Requirements

For optimal results:

Resolution: 1080p or higher recommended
View angle: Near top-down view
Finger: One finger extended (index, middle, or ring). Specify with --finger-index
Credit card: Must show at least 3 corners, aspect ratio ~1.586
Finger and card must be on the same plane
Good lighting, minimal blur

Failure Modes

The system can fail at various stages:

card_not_detected - Credit card not found or aspect ratio invalid
hand_not_detected - No hand detected by MediaPipe
finger_isolation_failed - Could not isolate specified finger
finger_mask_too_small - Mask area too small after cleaning
contour_extraction_failed - Could not extract valid contour
axis_estimation_failed - PCA failed or insufficient points
zone_localization_failed - Could not define ring zone
width_measurement_failed - No valid cross-section intersections

Output Format

JSON Output Structure

{
  "finger_outer_diameter_cm": 1.78,
  "confidence": 0.86,
  "scale_px_per_cm": 42.3,
  "quality_flags": {
    "card_detected": true,
    "finger_detected": true,
    "view_angle_ok": true
  },
  "fail_reason": null
}

Debug Visualization Features

When --debug flag is used, generates an annotated image with:

Credit card contour and corners (green)
Finger contour (magenta, thick lines)
Finger axis and endpoints (cyan/yellow)
Ring-wearing zone band (yellow, semi-transparent)
Cross-section sampling lines (orange)
Measurement intersection points (blue circles)
Final measurement and confidence text (large, readable font)

Code Patterns and Conventions

Error Handling

Functions return None or raise exceptions on failure
Main pipeline (measure_finger()) returns structured output dict with fail_reason
Console logging provides detailed progress information

Type Hints

Extensive use of type hints throughout
Dict return types with Dict[str, Any] for structured data
NumPy arrays typed as np.ndarray
Literal types for enums (e.g., FingerIndex)

Data Flow

All major functions return dictionaries with consistent keys
Downstream functions accept upstream outputs directly
Debug visualization receives all intermediate results
Clean separation between detection, computation, and visualization

Validation and Sanity Checks

Finger width should be in realistic range: 1.0-3.0 cm (typical: 1.4-2.4 cm)
Credit card aspect ratio should be close to 1.586
View angle check: scale confidence should be >0.9 for accurate measurements
Minimum mask area threshold prevents false detections