Spaces:

feng-x
/

ring-sizer

Sleeping

App Files Files Community

ring-sizer / AGENTS.md

feng-x

Upload folder using huggingface_hub

347d1a8 verified 6 days ago

preview code

raw

history blame contribute delete

15.4 kB

	# CLAUDE.md

	This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

	## Standard Task Workflow

	For tasks of implementing new features:
	1. Read PRD.md, Plan.md, Progress.md before coding
	2. Summarize current project state before implementation
	3. Carry out the implementatation; after that, build and test if possible
	4. Update Progress.md after changes
	5. Commit with a clear, concise message

	For tasks of bug fixing:
	1. Summarize the bug, reason and solution before implementation
	2. Carry out the implementation to fix the bug; build and test afterwards;
	3. Update Progress.md after changes
	4. Commit with a clear, concise message

	For tasks of reboot from a new codex session:
	1. Read doc/v0/PRD.md, doc/v0/Plan.md, doc/v0/Progress.md for baseline implementation
	2. Read doc/v1/PRD.md, doc/v1/Plan.md, doc/v1/Progress.md for edge refinement (v1)
	3. Assume this is a continuation of an existing project.
	4. Summarize your understanding of the current state and propose the next concrete step without writing code yet.

	## Project Overview

	Ring Sizer is a local, terminal-executable computer vision program that measures the outer width (diameter) of a finger at the ring-wearing zone using a single RGB image. It uses a standard credit card (ISO/IEC 7810 ID-1: 85.60mm × 53.98mm) as a physical size reference for scale calibration.

	Key characteristics:
	- Single image input (JPG/PNG)
	- v1: Dual edge detection - Landmark-based axis + Sobel gradient refinement
	- MediaPipe-based hand and finger segmentation
	- MediaPipe-based hand and finger segmentation
	- Outputs JSON measurement data and optional debug visualization
	- No cloud processing, runs entirely locally
	- Python 3.8+ with OpenCV, NumPy, MediaPipe, and SciPy

	## Development Commands

	### Installation
	```bash
	# Create virtual environment (recommended)
	python -m venv .venv
	source .venv/bin/activate # On Windows: .venv\Scripts\activate

	# Install dependencies
	pip install -r requirements.txt
	```

	### Running the Program
	```bash
	# Basic measurement (defaults to index finger, auto edge detection)
	python measure_finger.py --input input/test_image.jpg --output output/result.json

	# Measure specific finger (index, middle, ring, or auto)
	python measure_finger.py \
	--input input/test_image.jpg \
	--output output/result.json \
	--finger-index ring

	# With debug visualization
	python measure_finger.py \
	--input input/test_image.jpg \
	--output output/result.json \
	--finger-index middle \
	--debug output/debug_overlay.png

	# Force Sobel edge refinement (v1)
	python measure_finger.py \
	--input image.jpg \
	--output result.json \
	--finger-index ring \
	--edge-method sobel \
	--sobel-threshold 15.0 \
	--debug output/debug.png

	# Compare both methods
	python measure_finger.py \
	--input image.jpg \
	--output result.json \
	--finger-index middle \
	--edge-method compare \
	--debug output/debug.png

	# Force contour method (v0)
	python measure_finger.py \
	--input image.jpg \
	--output result.json \
	--finger-index index \
	--edge-method contour
	```

	## Architecture Overview

	### Processing Pipeline (9 Phases)

	The measurement pipeline follows a strict sequential flow:

	1. Image Quality Check - Blur detection, exposure validation, resolution check
	2. Credit Card Detection & Scale Calibration - Detects card, verifies aspect ratio (~1.586), computes `px_per_cm`
	3. Hand & Finger Segmentation - MediaPipe hand detection, finger isolation, mask generation
	4. Finger Contour Extraction - Extracts outer contour from cleaned mask
	5. Finger Axis Estimation - PCA-based principal axis calculation, determines palm-end vs tip-end
	6. Ring-Wearing Zone Localization - Defines zone at 15%-25% of finger length from palm-side
	7. Width Measurement - Samples 20 cross-sections perpendicular to axis, uses median width
	8. Confidence Scoring - Multi-factor scoring (card 30%, finger 30%, measurement 40%)
	9. Debug Visualization - Generates annotated overlay image

	### Module Structure

	The codebase is organized into focused utility modules in `src/`:

	\| Module \| Primary Responsibilities \|
	\|--------\|--------------------------\|
	\| `card_detection.py` \| Credit card detection, perspective correction, scale calibration (`px_per_cm`) \|
	\| `finger_segmentation.py` \| MediaPipe integration, hand/finger isolation, mask cleaning, contour extraction \|
	\| `geometry.py` \| PCA axis estimation, ring zone localization, cross-section width measurement, line-contour intersections \|
	\| `image_quality.py` \| Blur detection (Laplacian variance), exposure checks, resolution validation \|
	\| `confidence.py` \| Component confidence scoring (card, finger, measurement), overall confidence computation \|
	\| `visualization.py` \| Debug overlay generation with contours, zones, measurements, and annotations \|

	### Key Design Decisions

	Ring-Wearing Zone Definition:
	- Located at 15%-25% of finger length from palm-side end
	- Width measured by sampling 20 cross-sections within this zone
	- Final measurement is the median width (robust to outliers)

	Axis Estimation:
	- Uses PCA (Principal Component Analysis) on finger mask points
	- Determines palm-end vs tip-end using either:
	1. MediaPipe landmarks (preferred, if available)
	2. Thickness heuristic (thinner end is likely the tip)

	Confidence Scoring:
	- 3-component weighted average: Card (30%) + Finger (30%) + Measurement (40%)
	- Confidence levels: HIGH (>0.85), MEDIUM (0.6-0.85), LOW (<0.6)
	- Factors: card detection quality, finger mask area, width variance, aspect ratios

	Measurement Approach:
	- Perpendicular cross-sections to finger axis
	- Line-contour intersection algorithm finds left/right edges
	- Uses farthest pair of intersections to handle complex contours
	- Converts pixels to cm using calibrated scale factor

	---

	## v1 Architecture (Edge Refinement)

	### What's New in v1

	v1 improves measurement accuracy by replacing contour-based edge detection with gradient-based Sobel edge refinement. Key improvements:

	- Landmark-based axis: Uses MediaPipe finger landmarks (MCP→PIP→DIP→TIP) for more anatomically consistent axis estimation
	- Sobel edge detection: Bidirectional gradient filtering for pixel-precise edge localization
	- Sub-pixel refinement: Parabola fitting achieves <0.5px precision (~0.003cm at typical resolution)
	- Quality-based fallback: Automatically uses v0 contour method if Sobel quality insufficient
	- Enhanced confidence: Adds edge quality component (gradient strength, consistency, smoothness, symmetry)

	### v1 Processing Pipeline (Enhanced Phases)

	Phase 5a: Landmark-Based Axis Estimation (v1)
	- Uses MediaPipe finger landmarks directly (4 points: MCP, PIP, DIP, TIP)
	- Finger selection: Defaults to index finger, can specify middle or ring finger via `--finger-index`
	- Orientation detection uses the specified finger for axis calculation (wrist → finger tip)
	- Image automatically rotated to canonical orientation (wrist at bottom, fingers pointing up)
	- Three axis calculation methods:
	- `endpoints`: Simple MCP→TIP vector
	- `linear_fit`: Linear regression on all 4 landmarks (default, most robust)
	- `median_direction`: Median of segment directions
	- Falls back to PCA if landmarks unavailable or quality check fails
	- Validation checks: NaN/inf, minimum spacing, monotonic progression, minimum length

	Phase 7b: Sobel Edge Refinement (v1)
	```
	1. Extract ROI around ring zone → 2. Apply bidirectional Sobel filters →
	3. Detect edges per cross-section → 4. Sub-pixel refinement → 5. Measure width
	```

	1. ROI Extraction
	- Rectangular region around ring zone with padding (50px for gradient context)
	- Width estimation: `finger_length / 3.0` (conservative)
	- Optional rotation alignment (not used by default)

	2. Bidirectional Sobel Filtering
	- Applies `cv2.Sobel` with configurable kernel size (3, 5, or 7)
	- Computes gradient_x (horizontal edges), gradient_y (vertical edges)
	- Calculates gradient magnitude and direction
	- Auto-detects filter orientation from ROI aspect ratio

	3. Edge Detection Per Cross-Section
	- Mask-constrained mode (primary):
	- Finds leftmost/rightmost finger mask pixels (finger boundaries)
	- Searches ±10px around boundaries for strongest gradient
	- Combines anatomical accuracy (mask) with sub-pixel precision (gradient)
	- Gradient-only mode (fallback): Pure Sobel without mask constraint

	4. Sub-Pixel Edge Localization
	- Parabola fitting: f(x) = ax² + bx + c
	- Samples gradient at x-1, x, x+1
	- Finds parabola peak: x_peak = -b/(2a)
	- Constrains refinement to ±0.5 pixels
	- Achieves <0.5px precision (~0.003cm at 185 px/cm)

	5. Width Measurement
	- Calculates width for each valid row
	- Outlier filtering using Median Absolute Deviation (MAD)
	- Removes measurements >3 MAD from median
	- Computes median, mean, std dev
	- Converts pixels to cm using scale factor

	Phase 8b: Enhanced Confidence Scoring (v1)
	- Adds 4th component: Edge Quality (20% weight)
	- Gradient strength: Avg magnitude at detected edges
	- Consistency: % of rows with valid edge pairs
	- Smoothness: Edge position variance (lower = better)
	- Symmetry: Left/right edge strength balance
	- Reweights other components: Card 25%, Finger 25%, Measurement 30%

	### v1 Module Structure

	\| Module \| v1 Enhancements \|
	\|--------\|-----------------\|
	\| `geometry.py` \| Added `estimate_finger_axis_from_landmarks()`, `_validate_landmark_quality()`, landmark-based zone localization \|
	\| `edge_refinement.py` \| [NEW] Complete Sobel edge refinement pipeline with sub-pixel precision \|
	\| `confidence.py` \| Added `compute_edge_quality_confidence()`, dual-mode confidence calculation \|
	\| `debug_observer.py` \| Added 9 edge refinement drawing functions for visualization \|
	\| `measure_finger.py` \| CLI flags for edge method selection, method comparison mode \|

	### v1 CLI Flags

	\| Flag \| Values \| Default \| Description \|
	\|------\|--------\|---------\|-------------\|
	\| `--finger-index` \| auto, index, middle, ring, pinky \| index \| Which finger to measure and use for orientation \|
	\| `--edge-method` \| auto, contour, sobel, compare \| auto \| Edge detection method \|
	\| `--sobel-threshold` \| float \| 15.0 \| Minimum gradient magnitude \|
	\| `--sobel-kernel-size` \| 3, 5, 7 \| 3 \| Sobel kernel size \|
	\| `--no-subpixel` \| flag \| False \| Disable sub-pixel refinement \|

	### v1 Auto Mode Behavior

	When `--edge-method auto` (default):
	1. Always computes contour measurement (v0 baseline)
	2. Attempts Sobel edge refinement
	3. Evaluates Sobel quality score (threshold: 0.7)
	4. Checks consistency (>50% success rate required)
	5. Verifies width reasonableness (0.8-3.5 cm)
	6. Checks agreement with contour (<50% difference)
	7. Uses Sobel if all checks pass, otherwise falls back to contour
	8. Reports method used in `edge_method_used` field

	### v1 Debug Output

	When `--debug` flag used, generates:
	- Main debug overlay (same as v0, shows final result)
	- `output/edge_refinement_debug/` subdirectory with 12 images:
	- Stage A (3): Landmark axis, ring zone, ROI extraction
	- Stage B (5): Sobel gradients, candidates, selected edges
	- Stage C (4): Sub-pixel refinement, widths, distribution, outliers

	### v1 Failure Modes (Additional)

	- `sobel_edge_refinement_failed` - Sobel method explicitly requested but failed
	- `quality_score_low_X.XX` - Edge quality below threshold (auto fallback)
	- `consistency_low_X.XX` - Too few valid edge detections
	- `width_unreasonable` - Measured width outside realistic range
	- `disagreement_with_contour` - Sobel and contour differ by >50%

	---

	## Important Technical Details

	### What This Measures
	The system measures the external horizontal width (outer diameter) of the finger at the ring-wearing zone. This is:
	- ✅ The width of soft tissue + bone at the ring-wearing position
	- ❌ NOT the inner diameter of a ring
	- Used as a geometric proxy for downstream ring size mapping (out of scope for v0)

	### Coordinate Systems
	- Images use standard OpenCV format: (row, col) = (y, x)
	- Most geometry functions work in (x, y) format
	- Contours are Nx2 arrays in (x, y) format
	- Careful conversion needed between formats (see `geometry.py:35`)

	### MediaPipe Integration
	- Uses pretrained hand landmark detection model (no custom training)
	- Provides 21 hand landmarks per hand
	- Each finger has 4 landmarks: MCP (base), PIP, DIP, TIP
	- Finger indices: 0=thumb, 1=index, 2=middle, 3=ring, 4=pinky
	- Orientation detection: Uses wrist → specified finger tip to determine hand rotation
	- Automatic rotation: Image rotated to canonical orientation (wrist at bottom, fingers up) based on selected finger

	### Input Requirements
	For optimal results:
	- Resolution: 1080p or higher recommended
	- View angle: Near top-down view
	- Finger: One finger extended (index, middle, or ring). Specify with `--finger-index`
	- Credit card: Must show at least 3 corners, aspect ratio ~1.586
	- Finger and card must be on the same plane
	- Good lighting, minimal blur

	### Failure Modes
	The system can fail at various stages:
	- `card_not_detected` - Credit card not found or aspect ratio invalid
	- `hand_not_detected` - No hand detected by MediaPipe
	- `finger_isolation_failed` - Could not isolate specified finger
	- `finger_mask_too_small` - Mask area too small after cleaning
	- `contour_extraction_failed` - Could not extract valid contour
	- `axis_estimation_failed` - PCA failed or insufficient points
	- `zone_localization_failed` - Could not define ring zone
	- `width_measurement_failed` - No valid cross-section intersections

	## Output Format

	### JSON Output Structure
	```json
	{
	"finger_outer_diameter_cm": 1.78,
	"confidence": 0.86,
	"scale_px_per_cm": 42.3,
	"quality_flags": {
	"card_detected": true,
	"finger_detected": true,
	"view_angle_ok": true
	},
	"fail_reason": null
	}
	```

	### Debug Visualization Features
	When `--debug` flag is used, generates an annotated image with:
	- Credit card contour and corners (green)
	- Finger contour (magenta, thick lines)
	- Finger axis and endpoints (cyan/yellow)
	- Ring-wearing zone band (yellow, semi-transparent)
	- Cross-section sampling lines (orange)
	- Measurement intersection points (blue circles)
	- Final measurement and confidence text (large, readable font)

	## Code Patterns and Conventions

	### Error Handling
	- Functions return `None` or raise exceptions on failure
	- Main pipeline (`measure_finger()`) returns structured output dict with `fail_reason`
	- Console logging provides detailed progress information

	### Type Hints
	- Extensive use of type hints throughout
	- Dict return types with `Dict[str, Any]` for structured data
	- NumPy arrays typed as `np.ndarray`
	- Literal types for enums (e.g., `FingerIndex`)

	### Data Flow
	- All major functions return dictionaries with consistent keys
	- Downstream functions accept upstream outputs directly
	- Debug visualization receives all intermediate results
	- Clean separation between detection, computation, and visualization

	### Validation and Sanity Checks
	- Finger width should be in realistic range: 1.0-3.0 cm (typical: 1.4-2.4 cm)
	- Credit card aspect ratio should be close to 1.586
	- View angle check: scale confidence should be >0.9 for accurate measurements
	- Minimum mask area threshold prevents false detections