# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Standard Task Workflow

For tasks of implementing **new features**:
1. Read PRD.md, Plan.md, Progress.md before coding
2. Summarize current project state before implementation
3. Implement the feature, then build and test if possible
4. Update Progress.md after changes
5. Commit with a clear, concise message

For tasks of **bug fixing**:
1. Summarize the bug, its root cause, and the proposed fix before implementing
2. Implement the fix, then build and test
3. Update Progress.md after changes
4. Commit with a clear, concise message

For tasks of **reboot** from a new codex session:
1. Read doc/v0/PRD.md, doc/v0/Plan.md, doc/v0/Progress.md for baseline implementation
2. Read doc/v1/PRD.md, doc/v1/Plan.md, doc/v1/Progress.md for edge refinement (v1)
3. Assume this is a continuation of an existing project.
4. Summarize your understanding of the current state and propose the next concrete step without writing code yet.

## Project Overview

Ring Sizer is a **local, terminal-executable computer vision program** that measures the outer width (diameter) of a finger at the ring-wearing zone using a single RGB image. It uses a standard credit card (ISO/IEC 7810 ID-1: 85.60mm × 53.98mm) as a physical size reference for scale calibration.

**Key characteristics:**
- Single image input (JPG/PNG)
- **v1: Dual edge detection** - Landmark-based axis + Sobel gradient refinement
- MediaPipe-based hand and finger segmentation
- Outputs JSON measurement data and optional debug visualization
- No cloud processing, runs entirely locally
- Python 3.8+ with OpenCV, NumPy, MediaPipe, and SciPy

## Development Commands

### Installation
```bash
# Create virtual environment (recommended)
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

### Running the Program
```bash
# Basic measurement (defaults to index finger, auto edge detection)
python measure_finger.py --input input/test_image.jpg --output output/result.json

# Measure specific finger (index, middle, ring, or auto)
python measure_finger.py \
  --input input/test_image.jpg \
  --output output/result.json \
  --finger-index ring

# With debug visualization
python measure_finger.py \
  --input input/test_image.jpg \
  --output output/result.json \
  --finger-index middle \
  --debug output/debug_overlay.png

# Force Sobel edge refinement (v1)
python measure_finger.py \
  --input image.jpg \
  --output result.json \
  --finger-index ring \
  --edge-method sobel \
  --sobel-threshold 15.0 \
  --debug output/debug.png

# Compare both methods
python measure_finger.py \
  --input image.jpg \
  --output result.json \
  --finger-index middle \
  --edge-method compare \
  --debug output/debug.png

# Force contour method (v0)
python measure_finger.py \
  --input image.jpg \
  --output result.json \
  --finger-index index \
  --edge-method contour
```

## Architecture Overview

### Processing Pipeline (9 Phases)

The measurement pipeline follows a strict sequential flow:

1. **Image Quality Check** - Blur detection, exposure validation, resolution check
2. **Credit Card Detection & Scale Calibration** - Detects card, verifies aspect ratio (~1.586), computes `px_per_cm`
3. **Hand & Finger Segmentation** - MediaPipe hand detection, finger isolation, mask generation
4. **Finger Contour Extraction** - Extracts outer contour from cleaned mask
5. **Finger Axis Estimation** - PCA-based principal axis calculation, determines palm-end vs tip-end
6. **Ring-Wearing Zone Localization** - Defines zone at 15%-25% of finger length from palm-side
7. **Width Measurement** - Samples 20 cross-sections perpendicular to axis, uses median width
8. **Confidence Scoring** - Multi-factor scoring (card 30%, finger 30%, measurement 40%)
9. **Debug Visualization** - Generates annotated overlay image
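
As an illustration of phase 2, the sketch below derives `px_per_cm` from the side lengths of a detected card and then converts a pixel width to centimetres. It is a minimal sketch, not the code in `card_detection.py`: the function name `calibrate_scale`, the `aspect_tolerance` parameter, and the choice to average the two side estimates are all assumptions made for the example.

```python
# ISO/IEC 7810 ID-1 card dimensions, in centimetres.
CARD_WIDTH_CM = 8.560
CARD_HEIGHT_CM = 5.398
CARD_ASPECT = CARD_WIDTH_CM / CARD_HEIGHT_CM  # ~1.586


def calibrate_scale(long_side_px, short_side_px, aspect_tolerance=0.10):
    """Return px_per_cm from a detected card's side lengths, or None.

    aspect_tolerance is a hypothetical relative tolerance, not a project value.
    """
    if long_side_px <= 0 or short_side_px <= 0:
        return None
    aspect = long_side_px / short_side_px
    if abs(aspect - CARD_ASPECT) / CARD_ASPECT > aspect_tolerance:
        return None  # the detected shape is not card-like
    # Average the two independent estimates of the scale.
    scale_w = long_side_px / CARD_WIDTH_CM
    scale_h = short_side_px / CARD_HEIGHT_CM
    return (scale_w + scale_h) / 2.0


px_per_cm = calibrate_scale(856.0, 539.8)
width_cm = 178.0 / px_per_cm  # convert a 178 px finger width to cm
print(round(px_per_cm, 3), round(width_cm, 3))
```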

### Module Structure

The codebase is organized into focused utility modules in `src/`:

| Module | Primary Responsibilities |
|--------|--------------------------|
| `card_detection.py` | Credit card detection, perspective correction, scale calibration (`px_per_cm`) |
| `finger_segmentation.py` | MediaPipe integration, hand/finger isolation, mask cleaning, contour extraction |
| `geometry.py` | PCA axis estimation, ring zone localization, cross-section width measurement, line-contour intersections |
| `image_quality.py` | Blur detection (Laplacian variance), exposure checks, resolution validation |
| `confidence.py` | Component confidence scoring (card, finger, measurement), overall confidence computation |
| `visualization.py` | Debug overlay generation with contours, zones, measurements, and annotations |
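
The Laplacian-variance blur check listed for `image_quality.py` can be sketched with NumPy alone. This is an assumption-laden illustration: the project may well use OpenCV's Laplacian instead, and the `threshold` value and function names here are hypothetical placeholders.

```python
import numpy as np


def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over the image interior."""
    g = np.asarray(gray, dtype=np.float64)
    lap = (
        g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
        - 4.0 * g[1:-1, 1:-1]
    )
    return float(lap.var())


def is_blurry(gray, threshold=100.0):
    # threshold is a hypothetical placeholder, not the project's value
    return laplacian_variance(gray) < threshold


flat = np.full((32, 32), 128.0)  # no detail at all
checker = np.indices((32, 32)).sum(axis=0) % 2 * 255.0  # maximal detail
print(is_blurry(flat), is_blurry(checker))
```

A sharp image has strong second derivatives at edges, so the variance of the Laplacian is high; a blurred or featureless image yields a low variance.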

### Key Design Decisions

**Ring-Wearing Zone Definition:**
- Located at 15%-25% of finger length from palm-side end
- Width measured by sampling 20 cross-sections within this zone
- Final measurement is the **median width** (robust to outliers)
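
The zone definition above can be sketched as follows. Evenly spaced sampling that includes both zone boundaries is an assumption, as are the function names; only the 15%-25% range, the 20 samples, and the median come from this document.

```python
from statistics import median


def ring_zone_fractions(num_samples=20, zone_start=0.15, zone_end=0.25):
    """Sample positions as fractions of finger length from the palm end."""
    step = (zone_end - zone_start) / (num_samples - 1)
    return [zone_start + i * step for i in range(num_samples)]


def zone_width(widths_px):
    """Median of the per-cross-section widths."""
    return median(widths_px)


fractions = ring_zone_fractions()
# 19 consistent widths plus one bad cross-section (e.g. a contour glitch).
widths = [150.0] * 19 + [400.0]
print(zone_width(widths))  # the single outlier does not move the median
```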

**Axis Estimation:**
- Uses PCA (Principal Component Analysis) on finger mask points
- Determines palm-end vs tip-end using either:
  1. MediaPipe landmarks (preferred, if available)
  2. Thickness heuristic (thinner end is likely the tip)
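
A minimal sketch of PCA on mask points, assuming a binary mask indexed as (row, col) and an output axis in (x, y). The function name `pca_axis` is hypothetical, and the sign of the returned direction is arbitrary until the palm end has been identified by one of the two methods above.

```python
import numpy as np


def pca_axis(mask):
    """Return (center, unit direction) of the mask's principal axis in (x, y)."""
    ys, xs = np.nonzero(mask)
    points = np.column_stack([xs, ys]).astype(np.float64)
    center = points.mean(axis=0)
    cov = np.cov((points - center).T)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # The eigenvector with the largest eigenvalue is the direction of
    # greatest spread, i.e. along the length of the finger.
    direction = eigvecs[:, np.argmax(eigvals)]
    return center, direction / np.linalg.norm(direction)


mask = np.zeros((100, 40), dtype=np.uint8)
mask[10:90, 15:25] = 1  # tall, narrow blob standing in for a finger
center, direction = pca_axis(mask)
print(center, direction)
```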

**Confidence Scoring:**
- 3-component weighted average: Card (30%) + Finger (30%) + Measurement (40%)
- Confidence levels: HIGH (>0.85), MEDIUM (0.6-0.85), LOW (<0.6)
- Factors: card detection quality, finger mask area, width variance, aspect ratios
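
The weighting and level thresholds above amount to the arithmetic below. The dictionary keys and function names are illustrative, and treating exactly 0.6 and 0.85 as MEDIUM is an assumption about boundary handling.

```python
V0_WEIGHTS = {"card": 0.30, "finger": 0.30, "measurement": 0.40}


def overall_confidence(components, weights=V0_WEIGHTS):
    """Weighted average of per-component confidences in [0, 1]."""
    return sum(weights[name] * components[name] for name in weights)


def confidence_level(score):
    if score > 0.85:
        return "HIGH"
    if score >= 0.6:
        return "MEDIUM"
    return "LOW"


score = overall_confidence({"card": 0.9, "finger": 0.8, "measurement": 0.7})
# 0.30 * 0.9 + 0.30 * 0.8 + 0.40 * 0.7 = 0.79
print(round(score, 2), confidence_level(score))
```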

**Measurement Approach:**
- Perpendicular cross-sections to finger axis
- Line-contour intersection algorithm finds left/right edges
- Uses farthest pair of intersections to handle complex contours
- Converts pixels to cm using calibrated scale factor
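
To show why the farthest pair matters, the sketch below intersects a line with a closed polygon and measures the span between the outermost hits. It is an illustration under assumptions: the contour is a plain list of (x, y) vertices, and the function names are hypothetical rather than those in `geometry.py`.

```python
import math


def line_contour_intersections(contour, point, direction):
    """Intersect an infinite line with a closed polygon of (x, y) vertices."""
    px, py = point
    dx, dy = direction
    hits = []
    n = len(contour)
    for i in range(n):
        ax, ay = contour[i]
        bx, by = contour[(i + 1) % n]
        ex, ey = bx - ax, by - ay
        denom = dx * ey - dy * ex  # cross(direction, edge)
        if abs(denom) < 1e-12:
            continue  # edge is parallel to the line
        # Solve point + t * direction == a + s * edge for s.
        s = ((ax - px) * dy - (ay - py) * dx) / denom
        if 0.0 <= s <= 1.0:
            hits.append((ax + s * ex, ay + s * ey))
    return hits


def farthest_pair_width(contour, point, direction):
    """Distance between the two outermost intersections, or None."""
    hits = line_contour_intersections(contour, point, direction)
    if len(hits) < 2:
        return None
    norm = math.hypot(direction[0], direction[1])
    ts = [((x - point[0]) * direction[0] + (y - point[1]) * direction[1]) / norm
          for x, y in hits]
    return max(ts) - min(ts)


# A 10 px wide shape with a notch cut into its top edge.
notched = [(0, 0), (10, 0), (10, 10), (6, 10), (6, 4), (4, 4), (4, 10), (0, 10)]
hits = line_contour_intersections(notched, (5, 7), (1, 0))
width_px = farthest_pair_width(notched, (5, 7), (1, 0))
print(len(hits), width_px)
```

The horizontal line crosses the notch, so it meets the contour four times; taking the outermost pair still reports the full outer width instead of the width of one lobe.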

---

## v1 Architecture (Edge Refinement)

### What's New in v1

v1 improves measurement accuracy by replacing contour-based edge detection with gradient-based Sobel edge refinement. Key improvements:

- **Landmark-based axis**: Uses MediaPipe finger landmarks (MCP→PIP→DIP→TIP) for more anatomically consistent axis estimation
- **Sobel edge detection**: Bidirectional gradient filtering for pixel-precise edge localization
- **Sub-pixel refinement**: Parabola fitting achieves <0.5px precision (~0.003cm at typical resolution)
- **Quality-based fallback**: Automatically uses the v0 contour method if Sobel quality is insufficient
- **Enhanced confidence**: Adds edge quality component (gradient strength, consistency, smoothness, symmetry)

### v1 Processing Pipeline (Enhanced Phases)

**Phase 5a: Landmark-Based Axis Estimation (v1)**
- Uses MediaPipe finger landmarks directly (4 points: MCP, PIP, DIP, TIP)
- **Finger selection**: Defaults to index finger, can specify middle or ring finger via `--finger-index`
- Orientation detection uses the **specified finger** for axis calculation (wrist → finger tip)
- Image automatically rotated to canonical orientation (wrist at bottom, fingers pointing up)
- Three axis calculation methods:
  - `endpoints`: Simple MCP→TIP vector
  - `linear_fit`: Linear regression on all 4 landmarks (default, most robust)
  - `median_direction`: Median of segment directions
- Falls back to PCA if landmarks unavailable or quality check fails
- Validation checks: NaN/inf, minimum spacing, monotonic progression, minimum length
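
One way to realize a line fit through the four landmarks is a total-least-squares fit via SVD, which behaves the same for vertical and horizontal fingers. Whether `linear_fit` in `geometry.py` works this way is not stated here, so treat the approach and the name `axis_from_landmarks` as assumptions; only the NaN/inf check from the validation list is reproduced.

```python
import numpy as np


def axis_from_landmarks(landmarks_xy):
    """Fit an axis through MCP, PIP, DIP, TIP given as (x, y) rows.

    Returns (center, unit direction pointing from MCP toward TIP), or None.
    """
    pts = np.asarray(landmarks_xy, dtype=np.float64)
    if pts.shape != (4, 2) or not np.all(np.isfinite(pts)):
        return None
    center = pts.mean(axis=0)
    # The first right-singular vector is the best-fit line direction.
    _, _, vt = np.linalg.svd(pts - center)
    direction = vt[0]
    if np.dot(direction, pts[-1] - pts[0]) < 0:
        direction = -direction  # orient from palm side to tip
    return center, direction


# Finger pointing up in image coordinates (y decreases toward the tip).
landmarks = [(100.0, 400.0), (102.0, 300.0), (99.0, 230.0), (101.0, 170.0)]
center, direction = axis_from_landmarks(landmarks)
print(center, direction)
```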

**Phase 7b: Sobel Edge Refinement (v1)**
```
1. Extract ROI around ring zone → 2. Apply bidirectional Sobel filters →
3. Detect edges per cross-section → 4. Sub-pixel refinement → 5. Measure width
```

1. **ROI Extraction**
   - Rectangular region around ring zone with padding (50px for gradient context)
   - Width estimation: `finger_length / 3.0` (conservative)
   - Optional rotation alignment (not used by default)

2. **Bidirectional Sobel Filtering**
   - Applies `cv2.Sobel` with configurable kernel size (3, 5, or 7)
   - Computes `gradient_x` (intensity change along x, which responds to vertical edges) and `gradient_y` (intensity change along y, which responds to horizontal edges)
   - Calculates gradient magnitude and direction
   - Auto-detects filter orientation from ROI aspect ratio

3. **Edge Detection Per Cross-Section**
   - **Mask-constrained mode** (primary):
     - Finds leftmost/rightmost finger mask pixels (finger boundaries)
     - Searches ±10px around boundaries for strongest gradient
     - Combines anatomical accuracy (mask) with sub-pixel precision (gradient)
   - **Gradient-only mode** (fallback): Pure Sobel without mask constraint

4. **Sub-Pixel Edge Localization**
   - Parabola fitting: f(x) = ax² + bx + c
   - Samples gradient at x-1, x, x+1
   - Finds parabola peak: x_peak = -b/(2a)
   - Constrains refinement to ±0.5 pixels
   - Achieves <0.5px precision (~0.003cm at 185 px/cm)

5. **Width Measurement**
   - Calculates width for each valid row
   - Outlier filtering using Median Absolute Deviation (MAD)
   - Removes measurements >3 MAD from median
   - Computes median, mean, std dev
   - Converts pixels to cm using scale factor
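
Steps 4 and 5 can be sketched in a few lines. With gradient samples at x-1, x, x+1, the fitted parabola has a = (g_left + g_right)/2 - g_center and b = (g_right - g_left)/2, so the peak offset is -b/(2a). The function names are hypothetical, and returning no offset when the samples do not form a peak is an assumption.

```python
from statistics import median


def subpixel_offset(g_left, g_center, g_right):
    """Offset of the parabola peak from the centre sample, clamped to +/-0.5."""
    a = 0.5 * (g_left + g_right) - g_center
    b = 0.5 * (g_right - g_left)
    if a >= 0:
        return 0.0  # samples do not form a maximum; skip refinement
    offset = -b / (2.0 * a)
    return max(-0.5, min(0.5, offset))


def filter_outliers_mad(values, max_deviations=3.0):
    """Drop values more than max_deviations MADs away from the median."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return [v for v in values if abs(v - med) <= max_deviations * mad]


offset = subpixel_offset(10.0, 20.0, 16.0)  # peak leans toward the right
widths_px = [150.0, 151.0, 149.5, 150.5, 149.0, 190.0]
kept = filter_outliers_mad(widths_px)
print(round(offset, 4), kept, median(kept))
```

In the example the median is 150.25 and the MAD is 0.75, so anything beyond 2.25 px from the median is removed; only the 190 px row is dropped.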

**Phase 8b: Enhanced Confidence Scoring (v1)**
- Adds 4th component: Edge Quality (20% weight)
  - Gradient strength: Avg magnitude at detected edges
  - Consistency: % of rows with valid edge pairs
  - Smoothness: Edge position variance (lower = better)
  - Symmetry: Left/right edge strength balance
- Reweights other components: Card 25%, Finger 25%, Measurement 30%

### v1 Module Structure

| Module | v1 Enhancements |
|--------|-----------------|
| `geometry.py` | Added `estimate_finger_axis_from_landmarks()`, `_validate_landmark_quality()`, landmark-based zone localization |
| **`edge_refinement.py`** | **[NEW]** Complete Sobel edge refinement pipeline with sub-pixel precision |
| `confidence.py` | Added `compute_edge_quality_confidence()`, dual-mode confidence calculation |
| `debug_observer.py` | Added 9 edge refinement drawing functions for visualization |
| `measure_finger.py` | CLI flags for edge method selection, method comparison mode |

### v1 CLI Flags

| Flag | Values | Default | Description |
|------|--------|---------|-------------|
| `--finger-index` | auto, index, middle, ring, pinky | **index** | Which finger to measure and use for orientation |
| `--edge-method` | auto, contour, sobel, compare | auto | Edge detection method |
| `--sobel-threshold` | float | 15.0 | Minimum gradient magnitude |
| `--sobel-kernel-size` | 3, 5, 7 | 3 | Sobel kernel size |
| `--no-subpixel` | flag | False | Disable sub-pixel refinement |

### v1 Auto Mode Behavior

When `--edge-method auto` (default):
1. Always computes contour measurement (v0 baseline)
2. Attempts Sobel edge refinement
3. Evaluates Sobel quality score (threshold: 0.7)
4. Checks consistency (>50% success rate required)
5. Verifies width reasonableness (0.8-3.5 cm)
6. Checks agreement with contour (<50% difference)
7. Uses Sobel if all checks pass, otherwise falls back to contour
8. Reports method used in `edge_method_used` field
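
The decision sequence above can be sketched as a single function. The dictionary keys (`quality_score`, `success_rate`, `width_cm`) and the function name are hypothetical, and computing the disagreement relative to the contour width is an assumption; the thresholds and reason strings come from this document.

```python
def choose_edge_method(sobel, contour_width_cm):
    """Return (method, fallback_reason) following the auto-mode checks."""
    if sobel["quality_score"] < 0.7:
        return "contour", f"quality_score_low_{sobel['quality_score']:.2f}"
    if sobel["success_rate"] <= 0.5:
        return "contour", f"consistency_low_{sobel['success_rate']:.2f}"
    if not 0.8 <= sobel["width_cm"] <= 3.5:
        return "contour", "width_unreasonable"
    # Relative difference against the v0 baseline (assumed definition).
    diff = abs(sobel["width_cm"] - contour_width_cm) / contour_width_cm
    if diff >= 0.5:
        return "contour", "disagreement_with_contour"
    return "sobel", None


good = {"quality_score": 0.82, "success_rate": 0.9, "width_cm": 1.80}
weak = {"quality_score": 0.55, "success_rate": 0.9, "width_cm": 1.80}
print(choose_edge_method(good, 1.75))
print(choose_edge_method(weak, 1.75))
```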

### v1 Debug Output

When the `--debug` flag is used, generates:
- Main debug overlay (same as v0, shows final result)
- `output/edge_refinement_debug/` subdirectory with 12 images:
  - **Stage A** (3): Landmark axis, ring zone, ROI extraction
  - **Stage B** (5): Sobel gradients, candidates, selected edges
  - **Stage C** (4): Sub-pixel refinement, widths, distribution, outliers

### v1 Failure Modes (Additional)

- `sobel_edge_refinement_failed` - Sobel method explicitly requested but failed
- `quality_score_low_X.XX` - Edge quality below threshold (auto fallback)
- `consistency_low_X.XX` - Too few valid edge detections
- `width_unreasonable` - Measured width outside realistic range
- `disagreement_with_contour` - Sobel and contour differ by >50%

---

## Important Technical Details

### What This Measures
The system measures the **external horizontal width** (outer diameter) of the finger at the ring-wearing zone. This is:
- ✅ The width of soft tissue + bone at the ring-wearing position
- ❌ NOT the inner diameter of a ring
- Used as a geometric proxy for downstream ring size mapping (out of scope for v0)

### Coordinate Systems
- Images use standard OpenCV format: (row, col) = (y, x)
- Most geometry functions work in (x, y) format
- Contours are Nx2 arrays in (x, y) format
- Careful conversion needed between formats (see `geometry.py:35`)
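
A short NumPy illustration of the two conventions; the variable names are only for the example and do not mirror `geometry.py`.

```python
import numpy as np

# Image arrays are shaped (rows, cols), i.e. (height, width).
image = np.zeros((480, 640), dtype=np.uint8)

# A geometry point is (x, y); index the image as [y, x].
x, y = 600, 100
image[y, x] = 255

# np.nonzero returns (rows, cols); swap them to get Nx2 (x, y) points.
rows, cols = np.nonzero(image)
points_xy = np.column_stack([cols, rows])
print(points_xy)
```

Indexing `image[x, y]` with this point would raise an `IndexError`, since 600 exceeds the 480 rows; with smaller values the same mistake silently reads the wrong pixel.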

### MediaPipe Integration
- Uses pretrained hand landmark detection model (no custom training)
- Provides 21 hand landmarks per hand
- Each finger has 4 landmarks: MCP (base), PIP, DIP, TIP
- Finger indices: 0=thumb, 1=index, 2=middle, 3=ring, 4=pinky
- **Orientation detection**: Uses wrist → specified finger tip to determine hand rotation
- **Automatic rotation**: Image rotated to canonical orientation (wrist at bottom, fingers up) based on selected finger
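
For reference, the finger indices above map onto MediaPipe's 21-landmark numbering (0 is the wrist, then four landmarks per finger) as sketched below. The helper name `finger_landmark_indices` is hypothetical.

```python
FINGER_NAMES = ["thumb", "index", "middle", "ring", "pinky"]
WRIST = 0


def finger_landmark_indices(finger):
    """Return the four MediaPipe hand-landmark indices for a finger.

    For index through pinky the order is MCP, PIP, DIP, TIP.
    The thumb's four landmarks are CMC, MCP, IP, TIP instead.
    """
    finger_index = FINGER_NAMES.index(finger)  # 0=thumb ... 4=pinky
    base = 1 + 4 * finger_index
    return list(range(base, base + 4))


print(finger_landmark_indices("index"))
print(finger_landmark_indices("ring"))
```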

### Input Requirements
For optimal results:
- Resolution: 1080p or higher recommended
- View angle: Near top-down view
- **Finger**: One finger extended (index, middle, or ring). Specify with `--finger-index`
- Credit card: Must show at least 3 corners, aspect ratio ~1.586
- Finger and card must be on the same plane
- Good lighting, minimal blur

### Failure Modes
The system can fail at various stages:
- `card_not_detected` - Credit card not found or aspect ratio invalid
- `hand_not_detected` - No hand detected by MediaPipe
- `finger_isolation_failed` - Could not isolate specified finger
- `finger_mask_too_small` - Mask area too small after cleaning
- `contour_extraction_failed` - Could not extract valid contour
- `axis_estimation_failed` - PCA failed or insufficient points
- `zone_localization_failed` - Could not define ring zone
- `width_measurement_failed` - No valid cross-section intersections

## Output Format

### JSON Output Structure
```json
{
  "finger_outer_diameter_cm": 1.78,
  "confidence": 0.86,
  "scale_px_per_cm": 42.3,
  "quality_flags": {
    "card_detected": true,
    "finger_detected": true,
    "view_angle_ok": true
  },
  "fail_reason": null
}
```
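
A downstream consumer might read this output as sketched below. The `summarize` helper is hypothetical, and the assumption that a failed run still carries a `fail_reason` string is inferred from the field name rather than confirmed by the structure above.

```python
import json

raw = """
{
  "finger_outer_diameter_cm": 1.78,
  "confidence": 0.86,
  "scale_px_per_cm": 42.3,
  "quality_flags": {
    "card_detected": true,
    "finger_detected": true,
    "view_angle_ok": true
  },
  "fail_reason": null
}
"""


def summarize(result):
    """One-line summary of a measurement result dict."""
    if result["fail_reason"] is not None:
        return f"failed: {result['fail_reason']}"
    width_mm = result["finger_outer_diameter_cm"] * 10.0
    return f"{width_mm:.1f} mm (confidence {result['confidence']:.2f})"


result = json.loads(raw)
print(summarize(result))
```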

### Debug Visualization Features
When `--debug` flag is used, generates an annotated image with:
- Credit card contour and corners (green)
- Finger contour (magenta, thick lines)
- Finger axis and endpoints (cyan/yellow)
- Ring-wearing zone band (yellow, semi-transparent)
- Cross-section sampling lines (orange)
- Measurement intersection points (blue circles)
- Final measurement and confidence text (large, readable font)

## Code Patterns and Conventions

### Error Handling
- Functions return `None` or raise exceptions on failure
- Main pipeline (`measure_finger()`) returns structured output dict with `fail_reason`
- Console logging provides detailed progress information

### Type Hints
- Extensive use of type hints throughout
- Dict return types with `Dict[str, Any]` for structured data
- NumPy arrays typed as `np.ndarray`
- Literal types for enums (e.g., `FingerIndex`)

### Data Flow
- All major functions return dictionaries with consistent keys
- Downstream functions accept upstream outputs directly
- Debug visualization receives all intermediate results
- Clean separation between detection, computation, and visualization

### Validation and Sanity Checks
- Finger width should be in realistic range: 1.0-3.0 cm (typical: 1.4-2.4 cm)
- Credit card aspect ratio should be close to 1.586
- View angle check: scale confidence should be >0.9 for accurate measurements
- Minimum mask area threshold prevents false detections