Spaces:
Running
Running
File size: 15,427 Bytes
347d1a8 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 |
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Standard Task Workflow
For tasks of implementing **new features**:
1. Read PRD.md, Plan.md, Progress.md before coding
2. Summarize current project state before implementation
3. Carry out the implementatation; after that, build and test if possible
4. Update Progress.md after changes
5. Commit with a clear, concise message
For tasks of **bug fixing**:
1. Summarize the bug, reason and solution before implementation
2. Carry out the implementation to fix the bug; build and test afterwards;
3. Update Progress.md after changes
4. Commit with a clear, concise message
For tasks of **reboot** from a new codex session:
1. Read doc/v0/PRD.md, doc/v0/Plan.md, doc/v0/Progress.md for baseline implementation
2. Read doc/v1/PRD.md, doc/v1/Plan.md, doc/v1/Progress.md for edge refinement (v1)
3. Assume this is a continuation of an existing project.
4. Summarize your understanding of the current state and propose the next concrete step without writing code yet.
## Project Overview
Ring Sizer is a **local, terminal-executable computer vision program** that measures the outer width (diameter) of a finger at the ring-wearing zone using a single RGB image. It uses a standard credit card (ISO/IEC 7810 ID-1: 85.60mm × 53.98mm) as a physical size reference for scale calibration.
**Key characteristics:**
- Single image input (JPG/PNG)
- **v1: Dual edge detection** - Landmark-based axis + Sobel gradient refinement
- MediaPipe-based hand and finger segmentation
- MediaPipe-based hand and finger segmentation
- Outputs JSON measurement data and optional debug visualization
- No cloud processing, runs entirely locally
- Python 3.8+ with OpenCV, NumPy, MediaPipe, and SciPy
## Development Commands
### Installation
```bash
# Create virtual environment (recommended)
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
```
### Running the Program
```bash
# Basic measurement (defaults to index finger, auto edge detection)
python measure_finger.py --input input/test_image.jpg --output output/result.json
# Measure specific finger (index, middle, ring, or auto)
python measure_finger.py \
--input input/test_image.jpg \
--output output/result.json \
--finger-index ring
# With debug visualization
python measure_finger.py \
--input input/test_image.jpg \
--output output/result.json \
--finger-index middle \
--debug output/debug_overlay.png
# Force Sobel edge refinement (v1)
python measure_finger.py \
--input image.jpg \
--output result.json \
--finger-index ring \
--edge-method sobel \
--sobel-threshold 15.0 \
--debug output/debug.png
# Compare both methods
python measure_finger.py \
--input image.jpg \
--output result.json \
--finger-index middle \
--edge-method compare \
--debug output/debug.png
# Force contour method (v0)
python measure_finger.py \
--input image.jpg \
--output result.json \
--finger-index index \
--edge-method contour
```
## Architecture Overview
### Processing Pipeline (9 Phases)
The measurement pipeline follows a strict sequential flow:
1. **Image Quality Check** - Blur detection, exposure validation, resolution check
2. **Credit Card Detection & Scale Calibration** - Detects card, verifies aspect ratio (~1.586), computes `px_per_cm`
3. **Hand & Finger Segmentation** - MediaPipe hand detection, finger isolation, mask generation
4. **Finger Contour Extraction** - Extracts outer contour from cleaned mask
5. **Finger Axis Estimation** - PCA-based principal axis calculation, determines palm-end vs tip-end
6. **Ring-Wearing Zone Localization** - Defines zone at 15%-25% of finger length from palm-side
7. **Width Measurement** - Samples 20 cross-sections perpendicular to axis, uses median width
8. **Confidence Scoring** - Multi-factor scoring (card 30%, finger 30%, measurement 40%)
9. **Debug Visualization** - Generates annotated overlay image
### Module Structure
The codebase is organized into focused utility modules in `src/`:
| Module | Primary Responsibilities |
|--------|--------------------------|
| `card_detection.py` | Credit card detection, perspective correction, scale calibration (`px_per_cm`) |
| `finger_segmentation.py` | MediaPipe integration, hand/finger isolation, mask cleaning, contour extraction |
| `geometry.py` | PCA axis estimation, ring zone localization, cross-section width measurement, line-contour intersections |
| `image_quality.py` | Blur detection (Laplacian variance), exposure checks, resolution validation |
| `confidence.py` | Component confidence scoring (card, finger, measurement), overall confidence computation |
| `visualization.py` | Debug overlay generation with contours, zones, measurements, and annotations |
### Key Design Decisions
**Ring-Wearing Zone Definition:**
- Located at 15%-25% of finger length from palm-side end
- Width measured by sampling 20 cross-sections within this zone
- Final measurement is the **median width** (robust to outliers)
**Axis Estimation:**
- Uses PCA (Principal Component Analysis) on finger mask points
- Determines palm-end vs tip-end using either:
1. MediaPipe landmarks (preferred, if available)
2. Thickness heuristic (thinner end is likely the tip)
**Confidence Scoring:**
- 3-component weighted average: Card (30%) + Finger (30%) + Measurement (40%)
- Confidence levels: HIGH (>0.85), MEDIUM (0.6-0.85), LOW (<0.6)
- Factors: card detection quality, finger mask area, width variance, aspect ratios
**Measurement Approach:**
- Perpendicular cross-sections to finger axis
- Line-contour intersection algorithm finds left/right edges
- Uses farthest pair of intersections to handle complex contours
- Converts pixels to cm using calibrated scale factor
---
## v1 Architecture (Edge Refinement)
### What's New in v1
v1 improves measurement accuracy by replacing contour-based edge detection with gradient-based Sobel edge refinement. Key improvements:
- **Landmark-based axis**: Uses MediaPipe finger landmarks (MCP→PIP→DIP→TIP) for more anatomically consistent axis estimation
- **Sobel edge detection**: Bidirectional gradient filtering for pixel-precise edge localization
- **Sub-pixel refinement**: Parabola fitting achieves <0.5px precision (~0.003cm at typical resolution)
- **Quality-based fallback**: Automatically uses v0 contour method if Sobel quality insufficient
- **Enhanced confidence**: Adds edge quality component (gradient strength, consistency, smoothness, symmetry)
### v1 Processing Pipeline (Enhanced Phases)
**Phase 5a: Landmark-Based Axis Estimation (v1)**
- Uses MediaPipe finger landmarks directly (4 points: MCP, PIP, DIP, TIP)
- **Finger selection**: Defaults to index finger, can specify middle or ring finger via `--finger-index`
- Orientation detection uses the **specified finger** for axis calculation (wrist → finger tip)
- Image automatically rotated to canonical orientation (wrist at bottom, fingers pointing up)
- Three axis calculation methods:
- `endpoints`: Simple MCP→TIP vector
- `linear_fit`: Linear regression on all 4 landmarks (default, most robust)
- `median_direction`: Median of segment directions
- Falls back to PCA if landmarks unavailable or quality check fails
- Validation checks: NaN/inf, minimum spacing, monotonic progression, minimum length
**Phase 7b: Sobel Edge Refinement (v1)**
```
1. Extract ROI around ring zone → 2. Apply bidirectional Sobel filters →
3. Detect edges per cross-section → 4. Sub-pixel refinement → 5. Measure width
```
1. **ROI Extraction**
- Rectangular region around ring zone with padding (50px for gradient context)
- Width estimation: `finger_length / 3.0` (conservative)
- Optional rotation alignment (not used by default)
2. **Bidirectional Sobel Filtering**
- Applies `cv2.Sobel` with configurable kernel size (3, 5, or 7)
- Computes gradient_x (horizontal edges), gradient_y (vertical edges)
- Calculates gradient magnitude and direction
- Auto-detects filter orientation from ROI aspect ratio
3. **Edge Detection Per Cross-Section**
- **Mask-constrained mode** (primary):
- Finds leftmost/rightmost finger mask pixels (finger boundaries)
- Searches ±10px around boundaries for strongest gradient
- Combines anatomical accuracy (mask) with sub-pixel precision (gradient)
- **Gradient-only mode** (fallback): Pure Sobel without mask constraint
4. **Sub-Pixel Edge Localization**
- Parabola fitting: f(x) = ax² + bx + c
- Samples gradient at x-1, x, x+1
- Finds parabola peak: x_peak = -b/(2a)
- Constrains refinement to ±0.5 pixels
- Achieves <0.5px precision (~0.003cm at 185 px/cm)
5. **Width Measurement**
- Calculates width for each valid row
- Outlier filtering using Median Absolute Deviation (MAD)
- Removes measurements >3 MAD from median
- Computes median, mean, std dev
- Converts pixels to cm using scale factor
**Phase 8b: Enhanced Confidence Scoring (v1)**
- Adds 4th component: Edge Quality (20% weight)
- Gradient strength: Avg magnitude at detected edges
- Consistency: % of rows with valid edge pairs
- Smoothness: Edge position variance (lower = better)
- Symmetry: Left/right edge strength balance
- Reweights other components: Card 25%, Finger 25%, Measurement 30%
### v1 Module Structure
| Module | v1 Enhancements |
|--------|-----------------|
| `geometry.py` | Added `estimate_finger_axis_from_landmarks()`, `_validate_landmark_quality()`, landmark-based zone localization |
| **`edge_refinement.py`** | **[NEW]** Complete Sobel edge refinement pipeline with sub-pixel precision |
| `confidence.py` | Added `compute_edge_quality_confidence()`, dual-mode confidence calculation |
| `debug_observer.py` | Added 9 edge refinement drawing functions for visualization |
| `measure_finger.py` | CLI flags for edge method selection, method comparison mode |
### v1 CLI Flags
| Flag | Values | Default | Description |
|------|--------|---------|-------------|
| `--finger-index` | auto, index, middle, ring, pinky | **index** | Which finger to measure and use for orientation |
| `--edge-method` | auto, contour, sobel, compare | auto | Edge detection method |
| `--sobel-threshold` | float | 15.0 | Minimum gradient magnitude |
| `--sobel-kernel-size` | 3, 5, 7 | 3 | Sobel kernel size |
| `--no-subpixel` | flag | False | Disable sub-pixel refinement |
### v1 Auto Mode Behavior
When `--edge-method auto` (default):
1. Always computes contour measurement (v0 baseline)
2. Attempts Sobel edge refinement
3. Evaluates Sobel quality score (threshold: 0.7)
4. Checks consistency (>50% success rate required)
5. Verifies width reasonableness (0.8-3.5 cm)
6. Checks agreement with contour (<50% difference)
7. Uses Sobel if all checks pass, otherwise falls back to contour
8. Reports method used in `edge_method_used` field
### v1 Debug Output
When `--debug` flag used, generates:
- Main debug overlay (same as v0, shows final result)
- `output/edge_refinement_debug/` subdirectory with 12 images:
- **Stage A** (3): Landmark axis, ring zone, ROI extraction
- **Stage B** (5): Sobel gradients, candidates, selected edges
- **Stage C** (4): Sub-pixel refinement, widths, distribution, outliers
### v1 Failure Modes (Additional)
- `sobel_edge_refinement_failed` - Sobel method explicitly requested but failed
- `quality_score_low_X.XX` - Edge quality below threshold (auto fallback)
- `consistency_low_X.XX` - Too few valid edge detections
- `width_unreasonable` - Measured width outside realistic range
- `disagreement_with_contour` - Sobel and contour differ by >50%
---
## Important Technical Details
### What This Measures
The system measures the **external horizontal width** (outer diameter) of the finger at the ring-wearing zone. This is:
- ✅ The width of soft tissue + bone at the ring-wearing position
- ❌ NOT the inner diameter of a ring
- Used as a geometric proxy for downstream ring size mapping (out of scope for v0)
### Coordinate Systems
- Images use standard OpenCV format: (row, col) = (y, x)
- Most geometry functions work in (x, y) format
- Contours are Nx2 arrays in (x, y) format
- Careful conversion needed between formats (see `geometry.py:35`)
### MediaPipe Integration
- Uses pretrained hand landmark detection model (no custom training)
- Provides 21 hand landmarks per hand
- Each finger has 4 landmarks: MCP (base), PIP, DIP, TIP
- Finger indices: 0=thumb, 1=index, 2=middle, 3=ring, 4=pinky
- **Orientation detection**: Uses wrist → specified finger tip to determine hand rotation
- **Automatic rotation**: Image rotated to canonical orientation (wrist at bottom, fingers up) based on selected finger
### Input Requirements
For optimal results:
- Resolution: 1080p or higher recommended
- View angle: Near top-down view
- **Finger**: One finger extended (index, middle, or ring). Specify with `--finger-index`
- Credit card: Must show at least 3 corners, aspect ratio ~1.586
- Finger and card must be on the same plane
- Good lighting, minimal blur
### Failure Modes
The system can fail at various stages:
- `card_not_detected` - Credit card not found or aspect ratio invalid
- `hand_not_detected` - No hand detected by MediaPipe
- `finger_isolation_failed` - Could not isolate specified finger
- `finger_mask_too_small` - Mask area too small after cleaning
- `contour_extraction_failed` - Could not extract valid contour
- `axis_estimation_failed` - PCA failed or insufficient points
- `zone_localization_failed` - Could not define ring zone
- `width_measurement_failed` - No valid cross-section intersections
## Output Format
### JSON Output Structure
```json
{
"finger_outer_diameter_cm": 1.78,
"confidence": 0.86,
"scale_px_per_cm": 42.3,
"quality_flags": {
"card_detected": true,
"finger_detected": true,
"view_angle_ok": true
},
"fail_reason": null
}
```
### Debug Visualization Features
When `--debug` flag is used, generates an annotated image with:
- Credit card contour and corners (green)
- Finger contour (magenta, thick lines)
- Finger axis and endpoints (cyan/yellow)
- Ring-wearing zone band (yellow, semi-transparent)
- Cross-section sampling lines (orange)
- Measurement intersection points (blue circles)
- Final measurement and confidence text (large, readable font)
## Code Patterns and Conventions
### Error Handling
- Functions return `None` or raise exceptions on failure
- Main pipeline (`measure_finger()`) returns structured output dict with `fail_reason`
- Console logging provides detailed progress information
### Type Hints
- Extensive use of type hints throughout
- Dict return types with `Dict[str, Any]` for structured data
- NumPy arrays typed as `np.ndarray`
- Literal types for enums (e.g., `FingerIndex`)
### Data Flow
- All major functions return dictionaries with consistent keys
- Downstream functions accept upstream outputs directly
- Debug visualization receives all intermediate results
- Clean separation between detection, computation, and visualization
### Validation and Sanity Checks
- Finger width should be in realistic range: 1.0-3.0 cm (typical: 1.4-2.4 cm)
- Credit card aspect ratio should be close to 1.586
- View angle check: scale confidence should be >0.9 for accurate measurements
- Minimum mask area threshold prevents false detections
|