feng-x committed on
Commit 347d1a8 · verified · 1 Parent(s): e19f48f

Upload folder using huggingface_hub

.dockerignore ADDED
@@ -0,0 +1,9 @@
+ .venv/
+ .git/
+ __pycache__/
+ *.pyc
+ output/
+ web_demo/uploads/
+ web_demo/results/
+ doc/
+ .claude/
.gitattributes CHANGED
@@ -1,35 +1,3 @@
- *.7z filter=lfs diff=lfs merge=lfs -text
- *.arrow filter=lfs diff=lfs merge=lfs -text
- *.bin filter=lfs diff=lfs merge=lfs -text
- *.bz2 filter=lfs diff=lfs merge=lfs -text
- *.ckpt filter=lfs diff=lfs merge=lfs -text
- *.ftz filter=lfs diff=lfs merge=lfs -text
- *.gz filter=lfs diff=lfs merge=lfs -text
- *.h5 filter=lfs diff=lfs merge=lfs -text
- *.joblib filter=lfs diff=lfs merge=lfs -text
- *.lfs.* filter=lfs diff=lfs merge=lfs -text
- *.mlmodel filter=lfs diff=lfs merge=lfs -text
- *.model filter=lfs diff=lfs merge=lfs -text
- *.msgpack filter=lfs diff=lfs merge=lfs -text
- *.npy filter=lfs diff=lfs merge=lfs -text
- *.npz filter=lfs diff=lfs merge=lfs -text
- *.onnx filter=lfs diff=lfs merge=lfs -text
- *.ot filter=lfs diff=lfs merge=lfs -text
- *.parquet filter=lfs diff=lfs merge=lfs -text
- *.pb filter=lfs diff=lfs merge=lfs -text
- *.pickle filter=lfs diff=lfs merge=lfs -text
- *.pkl filter=lfs diff=lfs merge=lfs -text
- *.pt filter=lfs diff=lfs merge=lfs -text
- *.pth filter=lfs diff=lfs merge=lfs -text
- *.rar filter=lfs diff=lfs merge=lfs -text
- *.safetensors filter=lfs diff=lfs merge=lfs -text
- saved_model/**/* filter=lfs diff=lfs merge=lfs -text
- *.tar.* filter=lfs diff=lfs merge=lfs -text
- *.tar filter=lfs diff=lfs merge=lfs -text
- *.tflite filter=lfs diff=lfs merge=lfs -text
- *.tgz filter=lfs diff=lfs merge=lfs -text
- *.wasm filter=lfs diff=lfs merge=lfs -text
- *.xz filter=lfs diff=lfs merge=lfs -text
- *.zip filter=lfs diff=lfs merge=lfs -text
- *.zst filter=lfs diff=lfs merge=lfs -text
- *tfevents* filter=lfs diff=lfs merge=lfs -text
+ *.png filter=lfs diff=lfs merge=lfs -text
+ *.jpg filter=lfs diff=lfs merge=lfs -text
+ web_demo/static/examples/default_sample.jpg filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,38 @@
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+ *.so
+ .Python
+ .venv/
+ ENV/
+
+ # IDE
+ .idea/
+ .vscode/
+ *.swp
+ *.swo
+
+ # Project outputs (keep structure, ignore contents)
+ output/
+ output/*.json
+ output/*.png
+ output/*.jpg
+ output/intermediate/
+ output/card_detection_debug/
+ output/finger_segmentation_debug/
+ output/edge_refinement_debug/
+ web_demo/uploads/
+ web_demo/results/
+
+ # Downloaded models (auto-downloaded on first run)
+ .model/*.task
+
+ # Test artifacts
+ input/
+ input/test*.jpg
+ input/*.heic
+
+ # OS
+ .DS_Store
+ Thumbs.db
.model/.gitkeep ADDED
File without changes
AGENTS.md ADDED
@@ -0,0 +1,364 @@
+ # CLAUDE.md
+
+ This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+ ## Standard Task Workflow
+
+ For tasks implementing **new features**:
+ 1. Read PRD.md, Plan.md, Progress.md before coding
+ 2. Summarize the current project state before implementation
+ 3. Carry out the implementation; afterwards, build and test if possible
+ 4. Update Progress.md after changes
+ 5. Commit with a clear, concise message
+
+ For **bug-fixing** tasks:
+ 1. Summarize the bug, its cause, and the solution before implementation
+ 2. Carry out the implementation to fix the bug; build and test afterwards
+ 3. Update Progress.md after changes
+ 4. Commit with a clear, concise message
+
+ For **reboot** tasks starting a new codex session:
+ 1. Read doc/v0/PRD.md, doc/v0/Plan.md, doc/v0/Progress.md for the baseline implementation
+ 2. Read doc/v1/PRD.md, doc/v1/Plan.md, doc/v1/Progress.md for edge refinement (v1)
+ 3. Assume this is a continuation of an existing project.
+ 4. Summarize your understanding of the current state and propose the next concrete step without writing code yet.
+
+ ## Project Overview
+
+ Ring Sizer is a **local, terminal-executable computer vision program** that measures the outer width (diameter) of a finger at the ring-wearing zone using a single RGB image. It uses a standard credit card (ISO/IEC 7810 ID-1: 85.60mm × 53.98mm) as a physical size reference for scale calibration.
+
+ **Key characteristics:**
+ - Single image input (JPG/PNG)
+ - **v1: Dual edge detection** - Landmark-based axis + Sobel gradient refinement
+ - MediaPipe-based hand and finger segmentation
+ - Outputs JSON measurement data and optional debug visualization
+ - No cloud processing, runs entirely locally
+ - Python 3.8+ with OpenCV, NumPy, MediaPipe, and SciPy
+
+ ## Development Commands
+
+ ### Installation
+ ```bash
+ # Create virtual environment (recommended)
+ python -m venv .venv
+ source .venv/bin/activate  # On Windows: .venv\Scripts\activate
+
+ # Install dependencies
+ pip install -r requirements.txt
+ ```
+
+ ### Running the Program
+ ```bash
+ # Basic measurement (defaults to index finger, auto edge detection)
+ python measure_finger.py --input input/test_image.jpg --output output/result.json
+
+ # Measure specific finger (index, middle, ring, or auto)
+ python measure_finger.py \
+     --input input/test_image.jpg \
+     --output output/result.json \
+     --finger-index ring
+
+ # With debug visualization
+ python measure_finger.py \
+     --input input/test_image.jpg \
+     --output output/result.json \
+     --finger-index middle \
+     --debug output/debug_overlay.png
+
+ # Force Sobel edge refinement (v1)
+ python measure_finger.py \
+     --input image.jpg \
+     --output result.json \
+     --finger-index ring \
+     --edge-method sobel \
+     --sobel-threshold 15.0 \
+     --debug output/debug.png
+
+ # Compare both methods
+ python measure_finger.py \
+     --input image.jpg \
+     --output result.json \
+     --finger-index middle \
+     --edge-method compare \
+     --debug output/debug.png
+
+ # Force contour method (v0)
+ python measure_finger.py \
+     --input image.jpg \
+     --output result.json \
+     --finger-index index \
+     --edge-method contour
+ ```
+
+ ## Architecture Overview
+
+ ### Processing Pipeline (9 Phases)
+
+ The measurement pipeline follows a strict sequential flow:
+
+ 1. **Image Quality Check** - Blur detection, exposure validation, resolution check
+ 2. **Credit Card Detection & Scale Calibration** - Detects the card, verifies its aspect ratio (~1.586), computes `px_per_cm`
+ 3. **Hand & Finger Segmentation** - MediaPipe hand detection, finger isolation, mask generation
+ 4. **Finger Contour Extraction** - Extracts the outer contour from the cleaned mask
+ 5. **Finger Axis Estimation** - PCA-based principal axis calculation, determines palm-end vs tip-end
+ 6. **Ring-Wearing Zone Localization** - Defines the zone at 15%-25% of finger length from the palm side
+ 7. **Width Measurement** - Samples 20 cross-sections perpendicular to the axis, uses the median width
+ 8. **Confidence Scoring** - Multi-factor scoring (card 30%, finger 30%, measurement 40%)
+ 9. **Debug Visualization** - Generates an annotated overlay image
+
+ ### Module Structure
+
+ The codebase is organized into focused utility modules in `src/`:
+
+ | Module | Primary Responsibilities |
+ |--------|--------------------------|
+ | `card_detection.py` | Credit card detection, perspective correction, scale calibration (`px_per_cm`) |
+ | `finger_segmentation.py` | MediaPipe integration, hand/finger isolation, mask cleaning, contour extraction |
+ | `geometry.py` | PCA axis estimation, ring zone localization, cross-section width measurement, line-contour intersections |
+ | `image_quality.py` | Blur detection (Laplacian variance), exposure checks, resolution validation |
+ | `confidence.py` | Component confidence scoring (card, finger, measurement), overall confidence computation |
+ | `visualization.py` | Debug overlay generation with contours, zones, measurements, and annotations |
+
+ ### Key Design Decisions
+
+ **Ring-Wearing Zone Definition:**
+ - Located at 15%-25% of finger length from the palm-side end
+ - Width measured by sampling 20 cross-sections within this zone
+ - Final measurement is the **median width** (robust to outliers)
+
+ **Axis Estimation:**
+ - Uses PCA (Principal Component Analysis) on finger mask points
+ - Determines palm-end vs tip-end using either:
+   1. MediaPipe landmarks (preferred, if available)
+   2. Thickness heuristic (the thinner end is likely the tip)
+
+ **Confidence Scoring:**
+ - 3-component weighted average: Card (30%) + Finger (30%) + Measurement (40%)
+ - Confidence levels: HIGH (>0.85), MEDIUM (0.6-0.85), LOW (<0.6)
+ - Factors: card detection quality, finger mask area, width variance, aspect ratios
+
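The 3-component weighting above reduces to a few lines; a minimal sketch (illustrative names, not the repo's `confidence.py` API):

```python
def overall_confidence(card, finger, measurement):
    """Weighted average of component scores, mapped to a confidence level."""
    score = 0.30 * card + 0.30 * finger + 0.40 * measurement
    if score > 0.85:
        level = "HIGH"
    elif score >= 0.6:
        level = "MEDIUM"
    else:
        level = "LOW"
    return score, level
```

For example, component scores of 0.9/0.85/0.85 combine to 0.865, just over the HIGH threshold.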
+ **Measurement Approach:**
+ - Cross-sections perpendicular to the finger axis
+ - Line-contour intersection algorithm finds the left/right edges
+ - Uses the farthest pair of intersections to handle complex contours
+ - Converts pixels to cm using the calibrated scale factor
+
+ ---
+
+ ## v1 Architecture (Edge Refinement)
+
+ ### What's New in v1
+
+ v1 improves measurement accuracy by replacing contour-based edge detection with gradient-based Sobel edge refinement. Key improvements:
+
+ - **Landmark-based axis**: Uses MediaPipe finger landmarks (MCP→PIP→DIP→TIP) for more anatomically consistent axis estimation
+ - **Sobel edge detection**: Bidirectional gradient filtering for pixel-precise edge localization
+ - **Sub-pixel refinement**: Parabola fitting achieves <0.5px precision (~0.003cm at typical resolution)
+ - **Quality-based fallback**: Automatically uses the v0 contour method if Sobel quality is insufficient
+ - **Enhanced confidence**: Adds an edge quality component (gradient strength, consistency, smoothness, symmetry)
+
+ ### v1 Processing Pipeline (Enhanced Phases)
+
+ **Phase 5a: Landmark-Based Axis Estimation (v1)**
+ - Uses MediaPipe finger landmarks directly (4 points: MCP, PIP, DIP, TIP)
+ - **Finger selection**: Defaults to the index finger; specify middle or ring via `--finger-index`
+ - Orientation detection uses the **specified finger** for axis calculation (wrist → finger tip)
+ - Image automatically rotated to canonical orientation (wrist at bottom, fingers pointing up)
+ - Three axis calculation methods:
+   - `endpoints`: Simple MCP→TIP vector
+   - `linear_fit`: Linear regression on all 4 landmarks (default, most robust)
+   - `median_direction`: Median of segment directions
+ - Falls back to PCA if landmarks are unavailable or the quality check fails
+ - Validation checks: NaN/inf, minimum spacing, monotonic progression, minimum length
+
+ **Phase 7b: Sobel Edge Refinement (v1)**
+ ```
+ 1. Extract ROI around ring zone → 2. Apply bidirectional Sobel filters →
+ 3. Detect edges per cross-section → 4. Sub-pixel refinement → 5. Measure width
+ ```
+
+ 1. **ROI Extraction**
+    - Rectangular region around the ring zone with padding (50px for gradient context)
+    - Width estimation: `finger_length / 3.0` (conservative)
+    - Optional rotation alignment (not used by default)
+
+ 2. **Bidirectional Sobel Filtering**
+    - Applies `cv2.Sobel` with a configurable kernel size (3, 5, or 7)
+    - Computes gradient_x (responds to vertical edges) and gradient_y (responds to horizontal edges)
+    - Calculates gradient magnitude and direction
+    - Auto-detects filter orientation from the ROI aspect ratio
+
+ 3. **Edge Detection Per Cross-Section**
+    - **Mask-constrained mode** (primary):
+      - Finds the leftmost/rightmost finger mask pixels (finger boundaries)
+      - Searches ±10px around the boundaries for the strongest gradient
+      - Combines anatomical accuracy (mask) with sub-pixel precision (gradient)
+    - **Gradient-only mode** (fallback): Pure Sobel without the mask constraint
+
+ 4. **Sub-Pixel Edge Localization**
+    - Parabola fitting: f(x) = ax² + bx + c
+    - Samples the gradient at x-1, x, x+1
+    - Finds the parabola peak: x_peak = -b/(2a)
+    - Constrains refinement to ±0.5 pixels
+    - Achieves <0.5px precision (~0.003cm at 185 px/cm)
+
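The parabola refinement in step 4 has a closed form; a minimal sketch (illustrative names) fitting f(x) = ax² + bx + c through the three samples:

```python
def subpixel_offset(g_left, g_center, g_right):
    """Sub-pixel offset of the gradient peak from samples at x-1, x, x+1.

    The fitted parabola peaks at x_peak = -b/(2a); relative to the center
    sample that is (g_left - g_right) / (2 * (g_left - 2*g_center + g_right)).
    """
    denom = g_left - 2.0 * g_center + g_right  # equals 2a; zero for flat/linear profiles
    if denom == 0.0:
        return 0.0
    offset = 0.5 * (g_left - g_right) / denom
    return max(-0.5, min(0.5, offset))  # constrain refinement to ±0.5 px
```

A symmetric peak yields offset 0; asymmetry shifts the estimate toward the stronger neighbor, never by more than half a pixel.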
+ 5. **Width Measurement**
+    - Calculates the width for each valid row
+    - Outlier filtering using the Median Absolute Deviation (MAD)
+    - Removes measurements >3 MAD from the median
+    - Computes median, mean, and std dev
+    - Converts pixels to cm using the scale factor
+
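The MAD-based outlier filter in step 5 can be sketched as follows (illustrative; the repo's implementation may differ in detail):

```python
import numpy as np

def filter_widths_mad(widths_px, k=3.0):
    """Keep width samples within k MADs of the median (robust to outlier rows)."""
    widths_px = np.asarray(widths_px, dtype=float)
    median = np.median(widths_px)
    mad = np.median(np.abs(widths_px - median))
    if mad == 0:  # degenerate case: samples (nearly) identical, nothing to filter
        return widths_px
    return widths_px[np.abs(widths_px - median) <= k * mad]
```

For example, widths of [80, 81, 79, 80, 120] px drop the 120 px outlier and report a median of 80 px.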
+ **Phase 8b: Enhanced Confidence Scoring (v1)**
+ - Adds a 4th component: Edge Quality (20% weight)
+   - Gradient strength: Average magnitude at detected edges
+   - Consistency: % of rows with valid edge pairs
+   - Smoothness: Edge position variance (lower = better)
+   - Symmetry: Left/right edge strength balance
+ - Reweights the other components: Card 25%, Finger 25%, Measurement 30%
+
+ ### v1 Module Structure
+
+ | Module | v1 Enhancements |
+ |--------|-----------------|
+ | `geometry.py` | Added `estimate_finger_axis_from_landmarks()`, `_validate_landmark_quality()`, landmark-based zone localization |
+ | **`edge_refinement.py`** | **[NEW]** Complete Sobel edge refinement pipeline with sub-pixel precision |
+ | `confidence.py` | Added `compute_edge_quality_confidence()`, dual-mode confidence calculation |
+ | `debug_observer.py` | Added 9 edge refinement drawing functions for visualization |
+ | `measure_finger.py` | CLI flags for edge method selection, method comparison mode |
+
+ ### v1 CLI Flags
+
+ | Flag | Values | Default | Description |
+ |------|--------|---------|-------------|
+ | `--finger-index` | auto, index, middle, ring, pinky | **index** | Which finger to measure and use for orientation |
+ | `--edge-method` | auto, contour, sobel, compare | auto | Edge detection method |
+ | `--sobel-threshold` | float | 15.0 | Minimum gradient magnitude |
+ | `--sobel-kernel-size` | 3, 5, 7 | 3 | Sobel kernel size |
+ | `--no-subpixel` | flag | False | Disable sub-pixel refinement |
+
+ ### v1 Auto Mode Behavior
+
+ When `--edge-method auto` (default):
+ 1. Always computes the contour measurement (v0 baseline)
+ 2. Attempts Sobel edge refinement
+ 3. Evaluates the Sobel quality score (threshold: 0.7)
+ 4. Checks consistency (>50% success rate required)
+ 5. Verifies width reasonableness (0.8-3.5 cm)
+ 6. Checks agreement with the contour result (<50% difference)
+ 7. Uses Sobel if all checks pass, otherwise falls back to contour
+ 8. Reports the method used in the `edge_method_used` field
+
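The decision ladder above amounts to a conjunction of checks; a sketch (field names such as `quality_score` are assumptions, not the repo's actual result keys):

```python
def choose_edge_method(sobel_result, contour_width_cm):
    """Use Sobel only when every auto-mode check passes; otherwise fall back."""
    if sobel_result is None:  # Sobel refinement failed outright
        return "contour"
    checks = (
        sobel_result["quality_score"] >= 0.7,            # quality threshold
        sobel_result["success_rate"] > 0.5,              # consistency
        0.8 <= sobel_result["width_cm"] <= 3.5,          # reasonable width
        abs(sobel_result["width_cm"] - contour_width_cm)
        / contour_width_cm < 0.5,                        # agreement with contour
    )
    return "sobel" if all(checks) else "contour"
```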
+ ### v1 Debug Output
+
+ When the `--debug` flag is used, generates:
+ - Main debug overlay (same as v0, shows the final result)
+ - `output/edge_refinement_debug/` subdirectory with 12 images:
+   - **Stage A** (3): Landmark axis, ring zone, ROI extraction
+   - **Stage B** (5): Sobel gradients, candidates, selected edges
+   - **Stage C** (4): Sub-pixel refinement, widths, distribution, outliers
+
+ ### v1 Failure Modes (Additional)
+
+ - `sobel_edge_refinement_failed` - Sobel method explicitly requested but failed
+ - `quality_score_low_X.XX` - Edge quality below threshold (auto fallback)
+ - `consistency_low_X.XX` - Too few valid edge detections
+ - `width_unreasonable` - Measured width outside the realistic range
+ - `disagreement_with_contour` - Sobel and contour differ by >50%
+
+ ---
+
+ ## Important Technical Details
+
+ ### What This Measures
+ The system measures the **external horizontal width** (outer diameter) of the finger at the ring-wearing zone. This is:
+ - ✅ The width of soft tissue + bone at the ring-wearing position
+ - ❌ NOT the inner diameter of a ring
+ - Used as a geometric proxy for downstream ring size mapping (out of scope for v0)
+
+ ### Coordinate Systems
+ - Images use standard OpenCV/NumPy indexing: (row, col) = (y, x)
+ - Most geometry functions work in (x, y) format
+ - Contours are Nx2 arrays in (x, y) format
+ - Careful conversion is needed between formats (see `geometry.py:35`)
+
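A two-line illustration of the (x, y) vs (row, col) pitfall noted above (the array size and points are arbitrary):

```python
import numpy as np

contour_xy = np.array([[10, 4], [12, 5], [11, 7]])  # contour points in (x, y)
mask = np.zeros((20, 20), dtype=np.uint8)           # indexed as (row, col) = (y, x)
mask[contour_xy[:, 1], contour_xy[:, 0]] = 255      # swap the columns when indexing
```

Indexing with the columns un-swapped would silently rasterize the transposed contour.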
+ ### MediaPipe Integration
+ - Uses a pretrained hand landmark detection model (no custom training)
+ - Provides 21 hand landmarks per hand
+ - Each finger has 4 landmarks: MCP (base), PIP, DIP, TIP
+ - Finger indices: 0=thumb, 1=index, 2=middle, 3=ring, 4=pinky
+ - **Orientation detection**: Uses wrist → specified finger tip to determine hand rotation
+ - **Automatic rotation**: Image rotated to canonical orientation (wrist at bottom, fingers up) based on the selected finger
+
+ ### Input Requirements
+ For optimal results:
+ - Resolution: 1080p or higher recommended
+ - View angle: Near top-down view
+ - **Finger**: One finger extended (index, middle, or ring); specify with `--finger-index`
+ - Credit card: Must show at least 3 corners, aspect ratio ~1.586
+ - Finger and card must be on the same plane
+ - Good lighting, minimal blur
+
+ ### Failure Modes
+ The system can fail at various stages:
+ - `card_not_detected` - Credit card not found or aspect ratio invalid
+ - `hand_not_detected` - No hand detected by MediaPipe
+ - `finger_isolation_failed` - Could not isolate the specified finger
+ - `finger_mask_too_small` - Mask area too small after cleaning
+ - `contour_extraction_failed` - Could not extract a valid contour
+ - `axis_estimation_failed` - PCA failed or insufficient points
+ - `zone_localization_failed` - Could not define the ring zone
+ - `width_measurement_failed` - No valid cross-section intersections
+
+ ## Output Format
+
+ ### JSON Output Structure
+ ```json
+ {
+   "finger_outer_diameter_cm": 1.78,
+   "confidence": 0.86,
+   "scale_px_per_cm": 42.3,
+   "quality_flags": {
+     "card_detected": true,
+     "finger_detected": true,
+     "view_angle_ok": true
+   },
+   "fail_reason": null
+ }
+ ```
+
+ ### Debug Visualization Features
+ When the `--debug` flag is used, generates an annotated image with:
+ - Credit card contour and corners (green)
+ - Finger contour (magenta, thick lines)
+ - Finger axis and endpoints (cyan/yellow)
+ - Ring-wearing zone band (yellow, semi-transparent)
+ - Cross-section sampling lines (orange)
+ - Measurement intersection points (blue circles)
+ - Final measurement and confidence text (large, readable font)
+
+ ## Code Patterns and Conventions
+
+ ### Error Handling
+ - Functions return `None` or raise exceptions on failure
+ - The main pipeline (`measure_finger()`) returns a structured output dict with `fail_reason`
+ - Console logging provides detailed progress information
+
+ ### Type Hints
+ - Extensive use of type hints throughout
+ - Dict return types with `Dict[str, Any]` for structured data
+ - NumPy arrays typed as `np.ndarray`
+ - Literal types for enums (e.g., `FingerIndex`)
+
+ ### Data Flow
+ - All major functions return dictionaries with consistent keys
+ - Downstream functions accept upstream outputs directly
+ - Debug visualization receives all intermediate results
+ - Clean separation between detection, computation, and visualization
+
+ ### Validation and Sanity Checks
+ - Finger width should be in a realistic range: 1.0-3.0 cm (typical: 1.4-2.4 cm)
+ - Credit card aspect ratio should be close to 1.586
+ - View angle check: scale confidence should be >0.9 for accurate measurements
+ - Minimum mask area threshold prevents false detections
CLAUDE.md ADDED
@@ -0,0 +1,364 @@
+ # CLAUDE.md
+
+ This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+ ## Standard Task Workflow
+
+ For tasks implementing **new features**:
+ 1. Read PRD.md, Plan.md, Progress.md before coding
+ 2. Summarize the current project state before implementation
+ 3. Carry out the implementation; afterwards, build and test if possible
+ 4. Update Progress.md after changes
+ 5. Commit with a clear, concise message
+
+ For **bug-fixing** tasks:
+ 1. Summarize the bug, its cause, and the solution before implementation
+ 2. Carry out the implementation to fix the bug; build and test afterwards
+ 3. Update Progress.md after changes
+ 4. Commit with a clear, concise message
+
+ For **reboot** tasks starting a new codex session:
+ 1. Read doc/v0/PRD.md, doc/v0/Plan.md, doc/v0/Progress.md for the baseline implementation
+ 2. Read doc/v1/PRD.md, doc/v1/Plan.md, doc/v1/Progress.md for edge refinement (v1)
+ 3. Assume this is a continuation of an existing project.
+ 4. Summarize your understanding of the current state and propose the next concrete step without writing code yet.
+
+ ## Project Overview
+
+ Ring Sizer is a **local, terminal-executable computer vision program** that measures the outer width (diameter) of a finger at the ring-wearing zone using a single RGB image. It uses a standard credit card (ISO/IEC 7810 ID-1: 85.60mm × 53.98mm) as a physical size reference for scale calibration.
+
+ **Key characteristics:**
+ - Single image input (JPG/PNG)
+ - **v1: Dual edge detection** - Landmark-based axis + Sobel gradient refinement
+ - MediaPipe-based hand and finger segmentation
+ - Outputs JSON measurement data and optional debug visualization
+ - No cloud processing, runs entirely locally
+ - Python 3.8+ with OpenCV, NumPy, MediaPipe, and SciPy
+
+ ## Development Commands
+
+ ### Installation
+ ```bash
+ # Create virtual environment (recommended)
+ python -m venv .venv
+ source .venv/bin/activate  # On Windows: .venv\Scripts\activate
+
+ # Install dependencies
+ pip install -r requirements.txt
+ ```
+
+ ### Running the Program
+ ```bash
+ # Basic measurement (defaults to index finger, auto edge detection)
+ python measure_finger.py --input input/test_image.jpg --output output/result.json
+
+ # Measure specific finger (index, middle, ring, or auto)
+ python measure_finger.py \
+     --input input/test_image.jpg \
+     --output output/result.json \
+     --finger-index ring
+
+ # With debug visualization
+ python measure_finger.py \
+     --input input/test_image.jpg \
+     --output output/result.json \
+     --finger-index middle \
+     --debug output/debug_overlay.png
+
+ # Force Sobel edge refinement (v1)
+ python measure_finger.py \
+     --input image.jpg \
+     --output result.json \
+     --finger-index ring \
+     --edge-method sobel \
+     --sobel-threshold 15.0 \
+     --debug output/debug.png
+
+ # Compare both methods
+ python measure_finger.py \
+     --input image.jpg \
+     --output result.json \
+     --finger-index middle \
+     --edge-method compare \
+     --debug output/debug.png
+
+ # Force contour method (v0)
+ python measure_finger.py \
+     --input image.jpg \
+     --output result.json \
+     --finger-index index \
+     --edge-method contour
+ ```
93
+
94
+ ## Architecture Overview
95
+
96
+ ### Processing Pipeline (9 Phases)
97
+
98
+ The measurement pipeline follows a strict sequential flow:
99
+
100
+ 1. **Image Quality Check** - Blur detection, exposure validation, resolution check
101
+ 2. **Credit Card Detection & Scale Calibration** - Detects card, verifies aspect ratio (~1.586), computes `px_per_cm`
102
+ 3. **Hand & Finger Segmentation** - MediaPipe hand detection, finger isolation, mask generation
103
+ 4. **Finger Contour Extraction** - Extracts outer contour from cleaned mask
104
+ 5. **Finger Axis Estimation** - PCA-based principal axis calculation, determines palm-end vs tip-end
105
+ 6. **Ring-Wearing Zone Localization** - Defines zone at 15%-25% of finger length from palm-side
106
+ 7. **Width Measurement** - Samples 20 cross-sections perpendicular to axis, uses median width
107
+ 8. **Confidence Scoring** - Multi-factor scoring (card 30%, finger 30%, measurement 40%)
108
+ 9. **Debug Visualization** - Generates annotated overlay image
109
+
110
+ ### Module Structure
111
+
112
+ The codebase is organized into focused utility modules in `src/`:
113
+
114
+ | Module | Primary Responsibilities |
115
+ |--------|--------------------------|
116
+ | `card_detection.py` | Credit card detection, perspective correction, scale calibration (`px_per_cm`) |
117
+ | `finger_segmentation.py` | MediaPipe integration, hand/finger isolation, mask cleaning, contour extraction |
118
+ | `geometry.py` | PCA axis estimation, ring zone localization, cross-section width measurement, line-contour intersections |
119
+ | `image_quality.py` | Blur detection (Laplacian variance), exposure checks, resolution validation |
120
+ | `confidence.py` | Component confidence scoring (card, finger, measurement), overall confidence computation |
121
+ | `visualization.py` | Debug overlay generation with contours, zones, measurements, and annotations |
122
+
123
+ ### Key Design Decisions
124
+
125
+ **Ring-Wearing Zone Definition:**
126
+ - Located at 15%-25% of finger length from palm-side end
127
+ - Width measured by sampling 20 cross-sections within this zone
128
+ - Final measurement is the **median width** (robust to outliers)
129
+
130
+ **Axis Estimation:**
131
+ - Uses PCA (Principal Component Analysis) on finger mask points
132
+ - Determines palm-end vs tip-end using either:
133
+ 1. MediaPipe landmarks (preferred, if available)
134
+ 2. Thickness heuristic (thinner end is likely the tip)
135
+
136
+ **Confidence Scoring:**
137
+ - 3-component weighted average: Card (30%) + Finger (30%) + Measurement (40%)
138
+ - Confidence levels: HIGH (>0.85), MEDIUM (0.6-0.85), LOW (<0.6)
139
+ - Factors: card detection quality, finger mask area, width variance, aspect ratios
140
+
141
+ **Measurement Approach:**
142
+ - Perpendicular cross-sections to finger axis
143
+ - Line-contour intersection algorithm finds left/right edges
144
+ - Uses farthest pair of intersections to handle complex contours
145
+ - Converts pixels to cm using calibrated scale factor
146
+
147
+ ---
148
+
149
+ ## v1 Architecture (Edge Refinement)
150
+
151
+ ### What's New in v1
152
+
153
+ v1 improves measurement accuracy by replacing contour-based edge detection with gradient-based Sobel edge refinement. Key improvements:
154
+
155
+ - **Landmark-based axis**: Uses MediaPipe finger landmarks (MCP→PIP→DIP→TIP) for more anatomically consistent axis estimation
156
+ - **Sobel edge detection**: Bidirectional gradient filtering for pixel-precise edge localization
157
+ - **Sub-pixel refinement**: Parabola fitting achieves <0.5px precision (~0.003cm at typical resolution)
158
+ - **Quality-based fallback**: Automatically uses v0 contour method if Sobel quality insufficient
159
+ - **Enhanced confidence**: Adds edge quality component (gradient strength, consistency, smoothness, symmetry)
160
+
161
+ ### v1 Processing Pipeline (Enhanced Phases)
162
+
163
+ **Phase 5a: Landmark-Based Axis Estimation (v1)**
164
+ - Uses MediaPipe finger landmarks directly (4 points: MCP, PIP, DIP, TIP)
165
+ - **Finger selection**: Defaults to index finger, can specify middle or ring finger via `--finger-index`
166
+ - Orientation detection uses the **specified finger** for axis calculation (wrist → finger tip)
167
+ - Image automatically rotated to canonical orientation (wrist at bottom, fingers pointing up)
168
+ - Three axis calculation methods:
169
+ - `endpoints`: Simple MCP→TIP vector
170
+ - `linear_fit`: Linear regression on all 4 landmarks (default, most robust)
171
+ - `median_direction`: Median of segment directions
172
+ - Falls back to PCA if landmarks unavailable or quality check fails
173
+ - Validation checks: NaN/inf, minimum spacing, monotonic progression, minimum length
174
+
175
+ **Phase 7b: Sobel Edge Refinement (v1)**
176
+ ```
177
+ 1. Extract ROI around ring zone → 2. Apply bidirectional Sobel filters →
178
+ 3. Detect edges per cross-section → 4. Sub-pixel refinement → 5. Measure width
179
+ ```
180
+
181
+ 1. **ROI Extraction**
182
+ - Rectangular region around ring zone with padding (50px for gradient context)
183
+ - Width estimation: `finger_length / 3.0` (conservative)
184
+ - Optional rotation alignment (not used by default)
185
+
186
+ 2. **Bidirectional Sobel Filtering**
187
+ - Applies `cv2.Sobel` with configurable kernel size (3, 5, or 7)
188
+ - Computes gradient_x (horizontal edges), gradient_y (vertical edges)
189
+ - Calculates gradient magnitude and direction
190
+ - Auto-detects filter orientation from ROI aspect ratio
191
+
192
+ 3. **Edge Detection Per Cross-Section**
193
+ - **Mask-constrained mode** (primary):
194
+ - Finds leftmost/rightmost finger mask pixels (finger boundaries)
195
+ - Searches ±10px around boundaries for strongest gradient
196
+ - Combines anatomical accuracy (mask) with sub-pixel precision (gradient)
197
+ - **Gradient-only mode** (fallback): Pure Sobel without mask constraint
198
+
199
+ 4. **Sub-Pixel Edge Localization**
200
+ - Parabola fitting: f(x) = ax² + bx + c
201
+ - Samples gradient at x-1, x, x+1
202
+ - Finds parabola peak: x_peak = -b/(2a)
203
+ - Constrains refinement to ±0.5 pixels
204
+ - Achieves <0.5px precision (~0.003cm at 185 px/cm)
205
+
206
+ 5. **Width Measurement**
207
+ - Calculates width for each valid row
208
+ - Outlier filtering using Median Absolute Deviation (MAD)
209
+ - Removes measurements >3 MAD from median
210
+ - Computes median, mean, std dev
211
+ - Converts pixels to cm using scale factor
212
+
+ **Phase 8b: Enhanced Confidence Scoring (v1)**
+ - Adds 4th component: Edge Quality (20% weight)
+   - Gradient strength: Avg magnitude at detected edges
+   - Consistency: % of rows with valid edge pairs
+   - Smoothness: Edge position variance (lower = better)
+   - Symmetry: Left/right edge strength balance
+ - Reweights other components: Card 25%, Finger 25%, Measurement 30%
+
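The reweighted combination described above amounts to a weighted average of the four components. A minimal sketch, assuming the v1 weights (Card 25%, Finger 25%, Measurement 30%, Edge Quality 20%); the function name is illustrative, not the actual `confidence.py` API:

```python
def overall_confidence(card: float, finger: float,
                       measurement: float, edge_quality: float) -> float:
    """Weighted combination of the four confidence components,
    clamped to the [0, 1] range."""
    weights = {"card": 0.25, "finger": 0.25, "measurement": 0.30, "edge": 0.20}
    score = (weights["card"] * card
             + weights["finger"] * finger
             + weights["measurement"] * measurement
             + weights["edge"] * edge_quality)
    return max(0.0, min(1.0, score))
```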
+ ### v1 Module Structure
+
+ | Module | v1 Enhancements |
+ |--------|-----------------|
+ | `geometry.py` | Added `estimate_finger_axis_from_landmarks()`, `_validate_landmark_quality()`, landmark-based zone localization |
+ | **`edge_refinement.py`** | **[NEW]** Complete Sobel edge refinement pipeline with sub-pixel precision |
+ | `confidence.py` | Added `compute_edge_quality_confidence()`, dual-mode confidence calculation |
+ | `debug_observer.py` | Added 9 edge refinement drawing functions for visualization |
+ | `measure_finger.py` | CLI flags for edge method selection, method comparison mode |
+
+ ### v1 CLI Flags
+
+ | Flag | Values | Default | Description |
+ |------|--------|---------|-------------|
+ | `--finger-index` | auto, index, middle, ring, pinky | **index** | Which finger to measure and use for orientation |
+ | `--edge-method` | auto, contour, sobel, compare | auto | Edge detection method |
+ | `--sobel-threshold` | float | 15.0 | Minimum gradient magnitude |
+ | `--sobel-kernel-size` | 3, 5, 7 | 3 | Sobel kernel size |
+ | `--no-subpixel` | flag | False | Disable sub-pixel refinement |
+
+ ### v1 Auto Mode Behavior
+
+ When `--edge-method auto` (default):
+ 1. Always computes contour measurement (v0 baseline)
+ 2. Attempts Sobel edge refinement
+ 3. Evaluates Sobel quality score (threshold: 0.7)
+ 4. Checks consistency (>50% success rate required)
+ 5. Verifies width reasonableness (0.8-3.5 cm)
+ 6. Checks agreement with contour (<50% difference)
+ 7. Uses Sobel if all checks pass, otherwise falls back to contour
+ 8. Reports method used in `edge_method_used` field
+
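The gating steps above can be sketched as a single decision function. This is an illustration of the documented thresholds and fail-reason strings, assuming a `sobel` dict with the keys shown; the real gating lives in `should_use_sobel_measurement()` and may differ:

```python
def choose_edge_method(sobel, contour_width_cm: float):
    """Return (method, reason) following the auto-mode checks."""
    if sobel is None:
        return "contour", "sobel_edge_refinement_failed"
    if sobel["quality_score"] < 0.7:            # step 3
        return "contour", f"quality_score_low_{sobel['quality_score']:.2f}"
    if sobel["success_rate"] <= 0.5:            # step 4
        return "contour", f"consistency_low_{sobel['success_rate']:.2f}"
    if not (0.8 <= sobel["width_cm"] <= 3.5):   # step 5
        return "contour", "width_unreasonable"
    if abs(sobel["width_cm"] - contour_width_cm) / contour_width_cm >= 0.5:
        return "contour", "disagreement_with_contour"  # step 6
    return "sobel", "all_checks_passed"
```

The fallback reasons mirror the strings listed under "v1 Failure Modes (Additional)" below.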
+ ### v1 Debug Output
+
+ When `--debug` flag used, generates:
+ - Main debug overlay (same as v0, shows final result)
+ - `output/edge_refinement_debug/` subdirectory with 12 images:
+   - **Stage A** (3): Landmark axis, ring zone, ROI extraction
+   - **Stage B** (5): Sobel gradients, candidates, selected edges
+   - **Stage C** (4): Sub-pixel refinement, widths, distribution, outliers
+
+ ### v1 Failure Modes (Additional)
+
+ - `sobel_edge_refinement_failed` - Sobel method explicitly requested but failed
+ - `quality_score_low_X.XX` - Edge quality below threshold (auto fallback)
+ - `consistency_low_X.XX` - Too few valid edge detections
+ - `width_unreasonable` - Measured width outside realistic range
+ - `disagreement_with_contour` - Sobel and contour differ by >50%
+
+ ---
+
+ ## Important Technical Details
+
+ ### What This Measures
+ The system measures the **external horizontal width** (outer diameter) of the finger at the ring-wearing zone. This is:
+ - ✅ The width of soft tissue + bone at the ring-wearing position
+ - ❌ NOT the inner diameter of a ring
+ - Used as a geometric proxy for downstream ring size mapping (out of scope for v0)
+
+ ### Coordinate Systems
+ - Images use standard OpenCV format: (row, col) = (y, x)
+ - Most geometry functions work in (x, y) format
+ - Contours are Nx2 arrays in (x, y) format
+ - Careful conversion needed between formats (see `geometry.py:35`)
+
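The conversion pitfall above is just a component swap: an (x, y) point indexes an image array as `[y, x]`. A minimal illustration (the helper name is hypothetical, not project code):

```python
import numpy as np

def mask_value_at(mask: np.ndarray, point_xy) -> int:
    """Look up a mask pixel at an (x, y) geometry point.

    NumPy/OpenCV arrays index as image[row, col] = image[y, x],
    so the point's components must be swapped before indexing.
    """
    x, y = point_xy
    return int(mask[int(round(y)), int(round(x))])  # (x, y) -> [row, col]
```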
+ ### MediaPipe Integration
+ - Uses pretrained hand landmark detection model (no custom training)
+ - Provides 21 hand landmarks per hand
+ - Each finger has 4 landmarks: MCP (base), PIP, DIP, TIP
+ - Finger indices: 0=thumb, 1=index, 2=middle, 3=ring, 4=pinky
+ - **Orientation detection**: Uses wrist → specified finger tip to determine hand rotation
+ - **Automatic rotation**: Image rotated to canonical orientation (wrist at bottom, fingers up) based on selected finger
+
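In MediaPipe's 21-landmark layout, the wrist is landmark 0 and each finger occupies four consecutive indices in MCP→PIP→DIP→TIP order (thumb 1-4, index 5-8, and so on). The index arithmetic can be sketched as follows; the helper is illustrative, not project code:

```python
def landmark_indices(finger: str):
    """Return the (MCP, PIP, DIP, TIP) landmark indices for a finger
    in MediaPipe's 21-landmark hand model (wrist = 0)."""
    order = ["thumb", "index", "middle", "ring", "pinky"]
    base = 1 + 4 * order.index(finger)  # first landmark of that finger
    return (base, base + 1, base + 2, base + 3)
```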
+ ### Input Requirements
+ For optimal results:
+ - Resolution: 1080p or higher recommended
+ - View angle: Near top-down view
+ - **Finger**: One finger extended (index, middle, or ring). Specify with `--finger-index`
+ - Credit card: Must show at least 3 corners, aspect ratio ~1.586
+ - Finger and card must be on the same plane
+ - Good lighting, minimal blur
+
+ ### Failure Modes
+ The system can fail at various stages:
+ - `card_not_detected` - Credit card not found or aspect ratio invalid
+ - `hand_not_detected` - No hand detected by MediaPipe
+ - `finger_isolation_failed` - Could not isolate specified finger
+ - `finger_mask_too_small` - Mask area too small after cleaning
+ - `contour_extraction_failed` - Could not extract valid contour
+ - `axis_estimation_failed` - PCA failed or insufficient points
+ - `zone_localization_failed` - Could not define ring zone
+ - `width_measurement_failed` - No valid cross-section intersections
+
+ ## Output Format
+
+ ### JSON Output Structure
+ ```json
+ {
+   "finger_outer_diameter_cm": 1.78,
+   "confidence": 0.86,
+   "scale_px_per_cm": 42.3,
+   "quality_flags": {
+     "card_detected": true,
+     "finger_detected": true,
+     "view_angle_ok": true
+   },
+   "fail_reason": null
+ }
+ ```
+
+ ### Debug Visualization Features
+ When `--debug` flag is used, generates an annotated image with:
+ - Credit card contour and corners (green)
+ - Finger contour (magenta, thick lines)
+ - Finger axis and endpoints (cyan/yellow)
+ - Ring-wearing zone band (yellow, semi-transparent)
+ - Cross-section sampling lines (orange)
+ - Measurement intersection points (blue circles)
+ - Final measurement and confidence text (large, readable font)
+
+ ## Code Patterns and Conventions
+
+ ### Error Handling
+ - Functions return `None` or raise exceptions on failure
+ - Main pipeline (`measure_finger()`) returns structured output dict with `fail_reason`
+ - Console logging provides detailed progress information
+
+ ### Type Hints
+ - Extensive use of type hints throughout
+ - Dict return types with `Dict[str, Any]` for structured data
+ - NumPy arrays typed as `np.ndarray`
+ - Literal types for enums (e.g., `FingerIndex`)
+
+ ### Data Flow
+ - All major functions return dictionaries with consistent keys
+ - Downstream functions accept upstream outputs directly
+ - Debug visualization receives all intermediate results
+ - Clean separation between detection, computation, and visualization
+
+ ### Validation and Sanity Checks
+ - Finger width should be in realistic range: 1.0-3.0 cm (typical: 1.4-2.4 cm)
+ - Credit card aspect ratio should be close to 1.586
+ - View angle check: scale confidence should be >0.9 for accurate measurements
+ - Minimum mask area threshold prevents false detections
+
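These sanity checks can be sketched as a small gate. The thresholds match the bullets above; the 0.1 aspect-ratio tolerance and the function itself are illustrative assumptions, not actual project code (1.586 is the ISO/IEC 7810 ID-1 card ratio, 85.60 mm / 53.98 mm):

```python
def sanity_check(width_cm: float, card_aspect: float, scale_conf: float):
    """Return a list of sanity-check violations (empty list = all OK)."""
    issues = []
    if not (1.0 <= width_cm <= 3.0):      # realistic finger width range
        issues.append("width_out_of_range")
    if abs(card_aspect - 1.586) > 0.1:    # credit card aspect ratio
        issues.append("card_aspect_suspect")
    if scale_conf <= 0.9:                 # view-angle / perspective check
        issues.append("view_angle_not_ok")
    return issues
```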
Dockerfile ADDED
@@ -0,0 +1,22 @@
+ FROM python:3.11-slim
+
+ # System deps required by OpenCV and MediaPipe
+ RUN apt-get update && \
+     apt-get install -y --no-install-recommends libgl1 libglib2.0-0 && \
+     rm -rf /var/lib/apt/lists/*
+
+ WORKDIR /app
+
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ COPY . .
+
+ # Ensure upload/result dirs exist
+ RUN mkdir -p web_demo/uploads web_demo/results
+
+ ENV PORT=7860
+
+ EXPOSE ${PORT}
+
+ CMD gunicorn --bind 0.0.0.0:${PORT} --timeout 120 --workers 2 web_demo.app:app
README.md CHANGED
@@ -1,10 +1,135 @@
  ---
  title: Ring Sizer
- emoji: 🌍
  colorFrom: blue
- colorTo: red
  sdk: docker
- pinned: false
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
  title: Ring Sizer
+ emoji: "\U0001F48D"
  colorFrom: blue
+ colorTo: purple
  sdk: docker
+ app_port: 7860
  ---
 
+ # Ring Sizer
+
+ Local computer-vision CLI tool that measures **finger outer diameter** from a single image using a **credit card** as scale reference.
+
+ ## What it does
+ - Detects a credit card and computes `px/cm` scale.
+ - Detects hand/finger with MediaPipe.
+ - Measures finger width in the ring-wearing zone.
+ - Supports four edge modes:
+   - `contour` (v0 baseline)
+   - `sobel` (v1 refinement)
+   - `auto` (default, Sobel with quality fallback)
+   - `compare` (returns both method stats)
+ - Writes JSON output and always writes a result PNG next to it.
+
+ ## Install
+ ```bash
+ python -m venv .venv
+ source .venv/bin/activate
+ pip install -r requirements.txt
+ ```
+
+ ## Run
+ ```bash
+ python measure_finger.py --input input/test_image.jpg --output output/result.json
+ ```
+
+ ### Common options
+ ```bash
+ # Enable intermediate debug folders (card/finger/edge stages)
+ python measure_finger.py --input image.jpg --output output/result.json --debug
+
+ # Finger selection
+ python measure_finger.py --input image.jpg --output output/result.json --finger-index ring
+
+ # Force method
+ python measure_finger.py --input image.jpg --output output/result.json --edge-method contour
+ python measure_finger.py --input image.jpg --output output/result.json --edge-method sobel
+
+ # Compare contour vs sobel
+ python measure_finger.py --input image.jpg --output output/result.json --edge-method compare
+
+ # Sobel tuning
+ python measure_finger.py --input image.jpg --output output/result.json \
+   --edge-method sobel --sobel-threshold 15 --sobel-kernel-size 3 --no-subpixel
+ ```
+
+ ## CLI flags (current)
+ - `--input` (required)
+ - `--output` (required)
+ - `--debug` (boolean; saves intermediate debug folders)
+ - `--save-intermediate`
+ - `--finger-index {auto,index,middle,ring,pinky}` (default `index`)
+ - `--confidence-threshold` (default `0.7`)
+ - `--edge-method {auto,contour,sobel,compare}` (default `auto`)
+ - `--sobel-threshold` (default `15.0`)
+ - `--sobel-kernel-size {3,5,7}` (default `3`)
+ - `--no-subpixel`
+ - `--skip-card-detection` (testing only)
+
+ ## Output JSON
+ ```json
+ {
+   "finger_outer_diameter_cm": 1.78,
+   "confidence": 0.91,
+   "scale_px_per_cm": 203.46,
+   "quality_flags": {
+     "card_detected": true,
+     "finger_detected": true,
+     "view_angle_ok": true
+   },
+   "fail_reason": null,
+   "edge_method_used": "contour_fallback",
+   "method_comparison": {
+     "contour": {
+       "width_cm": 1.82,
+       "width_px": 371.2,
+       "std_dev_px": 3.8,
+       "coefficient_variation": 0.01,
+       "num_samples": 20,
+       "method": "contour"
+     },
+     "sobel": {
+       "width_cm": 1.78,
+       "width_px": 362.0,
+       "std_dev_px": 3.1,
+       "coefficient_variation": 0.008,
+       "num_samples": 140,
+       "subpixel_used": true,
+       "success_rate": 0.42,
+       "edge_quality_score": 0.81,
+       "method": "sobel"
+     },
+     "difference": {
+       "absolute_cm": -0.04,
+       "absolute_px": -9.2,
+       "relative_pct": -2.2,
+       "precision_improvement": 0.7
+     },
+     "recommendation": {
+       "use_sobel": true,
+       "reason": "quality_acceptable",
+       "preferred_method": "sobel"
+     },
+     "quality_comparison": {
+       "contour_cv": 0.01,
+       "sobel_cv": 0.008,
+       "sobel_quality_score": 0.81,
+       "sobel_gradient_strength": 0.82,
+       "sobel_consistency": 0.42,
+       "sobel_smoothness": 0.91,
+       "sobel_symmetry": 0.95
+     }
+   }
+ }
+ ```
+
+ Notes:
+ - `edge_method_used` and `method_comparison` are optional (present when relevant).
+ - Result image path is auto-derived: `output/result.json` -> `output/result.png`.
+
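The auto-derivation of the result image path amounts to swapping the output file's suffix; a minimal sketch of that behavior (illustrative helper, not the project's actual function):

```python
from pathlib import Path

def result_png_path(output_json: str) -> str:
    """Derive the result PNG path from the --output JSON path."""
    return str(Path(output_json).with_suffix(".png"))
```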
+ ## Documentation map
+ - Requirement docs: `doc/v{i}/PRD.md`, `doc/v{i}/Plan.md`, `doc/v{i}/Progress.md`
+ - Algorithms index: `doc/algorithms/README.md`
+ - Scripts: `script/README.md`
+ - Web demo: `web_demo/README.md`
fly.toml ADDED
@@ -0,0 +1,16 @@
+ app = 'ring-size-cv'
+ primary_region = 'sjc'
+
+ [build]
+
+ [http_service]
+   internal_port = 8080
+   force_https = true
+   auto_stop_machines = 'stop'
+   auto_start_machines = true
+   min_machines_running = 0
+
+ [[vm]]
+   memory = '1gb'
+   cpu_kind = 'shared'
+   cpus = 1
measure_finger.py ADDED
@@ -0,0 +1,763 @@
+ #!/usr/bin/env python3
+ """
+ Finger Outer Diameter Measurement Tool
+
+ Measures the outer width (diameter) of a finger at the ring-wearing zone
+ using a single RGB image with a credit card as a physical size reference.
+
+ Usage:
+     python measure_finger.py --input image.jpg --output result.json [--debug]
+ """
+
+ import argparse
+ import json
+ import sys
+ from pathlib import Path
+ from typing import Optional, Dict, Any, Literal
+
+ import cv2
+ import numpy as np
+
+ from src.image_quality import assess_image_quality
+ from src.card_detection import detect_credit_card, compute_scale_factor
+ from src.finger_segmentation import segment_hand, isolate_finger, clean_mask, get_finger_contour
+ from src.geometry import estimate_finger_axis, localize_ring_zone, localize_ring_zone_from_landmarks, compute_cross_section_width
+ from src.edge_refinement import refine_edges_sobel, should_use_sobel_measurement, compare_edge_methods
+ from src.confidence import (
+     compute_card_confidence,
+     compute_finger_confidence,
+     compute_measurement_confidence,
+     compute_edge_quality_confidence,
+     compute_overall_confidence,
+ )
+ from src.debug_observer import draw_comprehensive_edge_overlay
+
+ # Type alias for finger selection
+ FingerIndex = Literal["auto", "index", "middle", "ring", "pinky"]
+
+
+ def parse_args() -> argparse.Namespace:
+     """Parse command line arguments."""
+     parser = argparse.ArgumentParser(
+         description="Measure finger outer diameter from an image with credit card reference.",
+         formatter_class=argparse.RawDescriptionHelpFormatter,
+         epilog="""
+ Examples:
+   python measure_finger.py --input photo.jpg --output result.json
+   python measure_finger.py --input photo.jpg --output result.json --debug
+   python measure_finger.py --input photo.jpg --output result.json --finger-index ring
+   python measure_finger.py --input photo.jpg --output result.json --finger-index middle
+ """,
+     )
+
+     # Required arguments
+     parser.add_argument(
+         "--input",
+         type=str,
+         required=True,
+         help="Path to input image (JPG/PNG)",
+     )
+     parser.add_argument(
+         "--output",
+         type=str,
+         required=True,
+         help="Path to output JSON file",
+     )
+
+     # Optional arguments
+     parser.add_argument(
+         "--debug",
+         action="store_true",
+         default=False,
+         help="Save intermediate debug images (card_detection_debug/, edge_refinement_debug/, etc.)",
+     )
+     parser.add_argument(
+         "--save-intermediate",
+         action="store_true",
+         help="Save intermediate processing artifacts",
+     )
+     parser.add_argument(
+         "--finger-index",
+         type=str,
+         choices=["auto", "index", "middle", "ring", "pinky"],
+         default="index",
+         help="Which finger to measure (default: index). 'auto' detects the most extended finger.",
+     )
+     parser.add_argument(
+         "--confidence-threshold",
+         type=float,
+         default=0.7,
+         help="Minimum confidence threshold (default: 0.7)",
+     )
+
+     # v1 edge refinement options
+     parser.add_argument(
+         "--edge-method",
+         type=str,
+         default="auto",
+         choices=["auto", "contour", "sobel", "compare"],
+         help="Edge detection method: auto (quality-based), contour (v0), sobel (v1), compare (both) (default: auto)",
+     )
+     parser.add_argument(
+         "--sobel-threshold",
+         type=float,
+         default=15.0,
+         help="Minimum gradient magnitude for valid edge (default: 15.0)",
+     )
+     parser.add_argument(
+         "--sobel-kernel-size",
+         type=int,
+         default=3,
+         choices=[3, 5, 7],
+         help="Sobel kernel size (default: 3)",
+     )
+     parser.add_argument(
+         "--no-subpixel",
+         action="store_true",
+         help="Disable sub-pixel edge refinement",
+     )
+
+     # Testing/debugging options
+     parser.add_argument(
+         "--skip-card-detection",
+         action="store_true",
+         help="[TESTING ONLY] Skip card detection and use dummy scale (allows testing finger segmentation without card)",
+     )
+
+     return parser.parse_args()
+
+
+ def validate_input(input_path: str) -> Optional[str]:
+     """
+     Validate input file exists and is a supported image format.
+
+     Args:
+         input_path: Path to input image
+
+     Returns:
+         Error message if validation fails, None if valid
+     """
+     path = Path(input_path)
+
+     if not path.exists():
+         return f"Input file not found: {input_path}"
+
+     if not path.is_file():
+         return f"Input path is not a file: {input_path}"
+
+     suffix = path.suffix.lower()
+     if suffix not in [".jpg", ".jpeg", ".png"]:
+         return f"Unsupported image format: {suffix}. Use JPG or PNG."
+
+     return None
+
+
+ def load_image(input_path: str) -> Optional[np.ndarray]:
+     """
+     Load image from file.
+
+     Args:
+         input_path: Path to input image
+
+     Returns:
+         BGR image as numpy array, or None if load fails
+     """
+     image = cv2.imread(input_path)
+     return image
+
+
+ def create_output(
+     finger_diameter_cm: Optional[float] = None,
+     confidence: float = 0.0,
+     scale_px_per_cm: Optional[float] = None,
+     card_detected: bool = False,
+     finger_detected: bool = False,
+     view_angle_ok: bool = True,
+     fail_reason: Optional[str] = None,
+     edge_method_used: Optional[str] = None,
+     method_comparison: Optional[Dict[str, Any]] = None,
+ ) -> Dict[str, Any]:
+     """
+     Create output dictionary in the specified format.
+
+     Args:
+         finger_diameter_cm: Measured finger diameter in cm
+         confidence: Confidence score [0, 1]
+         scale_px_per_cm: Computed scale factor
+         card_detected: Whether credit card was detected
+         finger_detected: Whether finger was detected
+         view_angle_ok: Whether view angle is acceptable
+         fail_reason: Reason for failure if applicable
+         edge_method_used: Edge detection method used (v1)
+         method_comparison: Comparison data when using compare mode (v1)
+
+     Returns:
+         Output dictionary matching PRD specification
+     """
+     output = {
+         "finger_outer_diameter_cm": float(finger_diameter_cm) if finger_diameter_cm is not None else None,
+         "confidence": float(round(confidence, 3)),
+         "scale_px_per_cm": round(float(scale_px_per_cm), 2) if scale_px_per_cm is not None else None,
+         "quality_flags": {
+             "card_detected": bool(card_detected),
+             "finger_detected": bool(finger_detected),
+             "view_angle_ok": bool(view_angle_ok),
+         },
+         "fail_reason": fail_reason,
+     }
+
+     # Add v1 fields if applicable
+     if edge_method_used is not None:
+         output["edge_method_used"] = edge_method_used
+
+     if method_comparison is not None:
+         output["method_comparison"] = method_comparison
+
+     return output
+
+
+ def save_output(output: Dict[str, Any], output_path: str) -> None:
+     """Save output dictionary to JSON file."""
+     # Ensure output directory exists
+     Path(output_path).parent.mkdir(parents=True, exist_ok=True)
+
+     with open(output_path, "w") as f:
+         json.dump(output, f, indent=2)
+
+
+ def measure_finger(
+     image: np.ndarray,
+     finger_index: FingerIndex = "index",
+     confidence_threshold: float = 0.7,
+     save_intermediate: bool = False,
+     result_png_path: Optional[str] = None,
+     save_debug: bool = False,
+     edge_method: str = "auto",
+     sobel_threshold: float = 15.0,
+     sobel_kernel_size: int = 3,
+     use_subpixel: bool = True,
+     skip_card_detection: bool = False,
+ ) -> Dict[str, Any]:
+     """
+     Main measurement pipeline.
+
+     Args:
+         image: Input BGR image
+         finger_index: Which finger to measure
+         confidence_threshold: Minimum confidence threshold
+         save_intermediate: Whether to save intermediate artifacts
+         result_png_path: Path to save result visualization PNG (always generated)
+         save_debug: Whether to save intermediate debug images
+         edge_method: Edge detection method (auto, contour, sobel, compare)
+         sobel_threshold: Minimum gradient magnitude for valid edge
+         sobel_kernel_size: Sobel kernel size (3, 5, or 7)
+         use_subpixel: Enable sub-pixel edge refinement
+
+     Returns:
+         Output dictionary with measurement results
+     """
+     # Phase 2: Image quality check
+     quality = assess_image_quality(image)
+     print(f"Image quality: blur={quality['blur_score']:.1f}, "
+           f"brightness={quality['brightness']:.1f}, "
+           f"contrast={quality['contrast']:.1f}")
+
+     if not quality["passed"]:
+         for issue in quality["issues"]:
+             print(f"  Warning: {issue}")
+         return create_output(fail_reason=quality["fail_reason"])
+
+     # Phase 3: Hand & finger segmentation (MOVED BEFORE CARD DETECTION)
+     # This allows us to rotate the image to canonical orientation first
+     # Create finger segmentation debug subdirectory if debug enabled
+     finger_debug_dir = None
+     if save_debug and result_png_path is not None:
+         finger_debug_dir = str(Path(result_png_path).parent / "finger_segmentation_debug")
+
+     hand_data = segment_hand(image, finger=finger_index, debug_dir=finger_debug_dir)
+
+     if hand_data is None:
+         print("No hand detected in image")
+         return create_output(
+             card_detected=False,  # Card not yet detected
+             finger_detected=False,
+             fail_reason="hand_not_detected",
+         )
+
+     print(f"Hand detected: {hand_data['handedness']}, confidence={hand_data['confidence']:.2f}")
+     if "orientation_rotation" in hand_data:
+         print(f"Hand orientation normalized: {hand_data['orientation_rotation']}° rotation applied")
+
+     # Use canonical image for all downstream processing
+     # This ensures finger edges are vertical for optimal Sobel detection
+     if "canonical_image" in hand_data:
+         image_canonical = hand_data["canonical_image"]
+         print(f"Using canonical orientation image: {image_canonical.shape[1]}x{image_canonical.shape[0]}")
+     else:
+         image_canonical = image  # Fallback if not available
+         print("Warning: Canonical image not available, using original")
+
+     # Phase 4: Credit card detection & scale calibration (NOW ON CANONICAL IMAGE)
+     # Create card detection debug subdirectory if debug enabled
+     card_debug_dir = None
+     if save_debug and result_png_path is not None:
+         card_debug_dir = str(Path(result_png_path).parent / "card_detection_debug")
+
+     # Allow skipping card detection for testing finger segmentation
+     if skip_card_detection:
+         print("⚠️ TESTING MODE: Skipping card detection (using dummy scale factor)")
+         card_result = None
+         px_per_cm = 100.0  # Dummy scale: 100 pixels/cm (measurements will be inaccurate)
+         scale_confidence = 0.5  # Low confidence to indicate dummy value
+         view_angle_ok = True
+         card_detected = False
+     else:
+         card_result = detect_credit_card(image_canonical, debug_dir=card_debug_dir)
+
+         if card_result is None:
+             print("Credit card not detected in image")
+             return create_output(
+                 card_detected=False,
+                 fail_reason="card_not_detected",
+             )
+
+         # Compute scale factor
+         px_per_cm, scale_confidence = compute_scale_factor(card_result["corners"])
+         print(f"Card detected: {card_result['width_px']:.0f}x{card_result['height_px']:.0f}px, "
+               f"aspect={card_result['aspect_ratio']:.3f}, confidence={card_result['confidence']:.2f}")
+         print(f"Scale: {px_per_cm:.2f} px/cm (confidence={scale_confidence:.2f})")
+
+         # Check for excessive perspective distortion (view angle)
+         view_angle_ok = scale_confidence > 0.9
+         card_detected = True
+
+     # Phase 5: Finger isolation (hand already segmented in Phase 3)
+     h_can, w_can = image_canonical.shape[:2]
+     finger_data = isolate_finger(hand_data, finger=finger_index, image_shape=(h_can, w_can))
+
+     if finger_data is None:
+         print(f"Could not isolate finger: {finger_index}")
+         return create_output(
+             card_detected=card_detected,
+             finger_detected=False,
+             scale_px_per_cm=px_per_cm,
+             view_angle_ok=view_angle_ok,
+             fail_reason="finger_isolation_failed",
+         )
+
+     print(f"Finger isolated: {finger_data['finger_name']}")
+
+     # Clean the finger mask
+     cleaned_mask = clean_mask(finger_data["mask"])
+
+     if cleaned_mask is None:
+         print("Finger mask too small or invalid")
+         return create_output(
+             card_detected=card_detected,
+             finger_detected=False,
+             scale_px_per_cm=px_per_cm,
+             view_angle_ok=view_angle_ok,
+             fail_reason="finger_mask_too_small",
+         )
+
+     # Extract finger contour
+     contour = get_finger_contour(cleaned_mask)
+
+     if contour is None:
+         print("Could not extract finger contour")
+         return create_output(
+             card_detected=card_detected,
+             finger_detected=False,
+             scale_px_per_cm=px_per_cm,
+             view_angle_ok=view_angle_ok,
+             fail_reason="contour_extraction_failed",
+         )
+
+     print(f"Finger contour extracted: {len(contour)} points")
+
+     # Phase 5: Estimate finger axis using PCA
+     try:
+         axis_data = estimate_finger_axis(
+             mask=cleaned_mask,
+             landmarks=finger_data.get("landmarks"),
+         )
+         print(f"Finger axis estimated: length={axis_data['length']:.1f}px, "
+               f"center=({axis_data['center'][0]:.0f}, {axis_data['center'][1]:.0f})")
+     except Exception as e:
+         print(f"Failed to estimate finger axis: {e}")
+         return create_output(
+             card_detected=card_detected,
+             finger_detected=True,
+             scale_px_per_cm=px_per_cm,
+             view_angle_ok=view_angle_ok,
+             fail_reason="axis_estimation_failed",
+         )
+
+     # Phase 5b: Precise finger alignment rotation
+     # Rotate image to make finger axis perfectly vertical for accurate width measurement
+     from src.geometry import (
+         calculate_angle_from_vertical,
+         rotate_image_precise,
+         rotate_axis_data,
+         rotate_contour,
+         transform_points_rotation
+     )
+
+     angle_from_vertical = calculate_angle_from_vertical(axis_data["direction"])
+     # rotation_threshold = 2.0  # Only rotate if > 2° off vertical
+     rotation_threshold = 0.0  # always rotate to upright
409
+
410
+ rotation_matrix = None # Track rotation for card corner transform in debug
411
+
412
+ if abs(angle_from_vertical) >= rotation_threshold:
413
+ print(f"Finger axis is {angle_from_vertical:.1f}° from vertical, applying precise rotation...")
414
+
415
+ # Rotate image
416
+ h_can, w_can = image_canonical.shape[:2]
417
+ rotation_center = (w_can / 2.0, h_can / 2.0)
418
+ image_canonical, rotation_matrix = rotate_image_precise(
419
+ image_canonical, angle_from_vertical, rotation_center
420
+ )
421
+
422
+ # Update axis data
423
+ axis_data = rotate_axis_data(axis_data, rotation_matrix)
424
+
425
+ # Update contour
426
+ contour = rotate_contour(contour, rotation_matrix)
427
+
428
+ # Update landmarks if available
429
+ if finger_data.get("landmarks") is not None:
430
+ landmarks_rotated = transform_points_rotation(
431
+ finger_data["landmarks"], rotation_matrix
432
+ )
433
+ finger_data["landmarks"] = landmarks_rotated
434
+
435
+ # Update cleaned mask
436
+ cleaned_mask = cv2.warpAffine(
437
+ cleaned_mask, rotation_matrix, (w_can, h_can),
438
+ flags=cv2.INTER_NEAREST,
439
+ borderMode=cv2.BORDER_CONSTANT,
440
+ borderValue=0
441
+ )
442
+
443
+ print(f"Rotation applied: {angle_from_vertical:.1f}° CW, finger now vertical")
444
+ else:
445
+ print(f"Finger axis is {angle_from_vertical:.1f}° from vertical (within {rotation_threshold}° threshold, no rotation needed)")
446
+
447
+ # Phase 6: Localize ring-wearing zone
448
+ try:
449
+ # Use anatomical mode if landmarks available, otherwise use percentage-based
450
+ landmarks = finger_data.get("landmarks")
451
+ if landmarks is not None and len(landmarks) == 4:
452
+ zone_data = localize_ring_zone_from_landmarks(
453
+ landmarks=landmarks,
454
+ axis_data=axis_data,
455
+ zone_type="anatomical"
456
+ )
457
+ zone_length_cm = zone_data["length"] / px_per_cm
458
+ print(f"Ring zone localized (anatomical): PIP to PIP-(DIP-PIP), "
459
+ f"length={zone_data['length']:.1f}px ({zone_length_cm:.2f}cm)")
460
+ else:
461
+ zone_data = localize_ring_zone(axis_data)
462
+ zone_length_cm = zone_data["length"] / px_per_cm
463
+ print(f"Ring zone localized (percentage): {zone_data['start_pct']*100:.0f}%-{zone_data['end_pct']*100:.0f}% "
464
+ f"from palm, length={zone_data['length']:.1f}px ({zone_length_cm:.2f}cm)")
465
+ except Exception as e:
466
+ print(f"Failed to localize ring zone: {e}")
467
+ return create_output(
468
+ card_detected=card_detected,
469
+ finger_detected=True,
470
+ scale_px_per_cm=px_per_cm,
471
+ view_angle_ok=view_angle_ok,
472
+ fail_reason="zone_localization_failed",
473
+ )
474
+
475
+ # Phase 7: Measure finger width at ring zone
476
+
477
+ # Phase 7a: Contour-based measurement (v0 method)
478
+ try:
479
+ contour_measurement = compute_cross_section_width(
480
+ contour=contour,
481
+ axis_data=axis_data,
482
+ zone_data=zone_data,
483
+ num_samples=20,
484
+ )
485
+
486
+ contour_width_cm = contour_measurement["median_width_px"] / px_per_cm
487
+ print(f"Contour width: {contour_width_cm:.4f}cm "
488
+ f"({contour_measurement['num_samples']} samples, "
489
+ f"std={contour_measurement['std_width_px']:.2f}px)")
490
+
491
+ except Exception as e:
492
+ print(f"Failed to measure finger width (contour): {e}")
493
+ return create_output(
494
+ card_detected=card_detected,
495
+ finger_detected=True,
496
+ scale_px_per_cm=px_per_cm,
497
+ view_angle_ok=view_angle_ok,
498
+ fail_reason="width_measurement_failed",
499
+ edge_method_used="contour",
500
+ )
501
+
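Phase 7a reduces the ~20 sampled cross-section widths to a robust summary: the median in pixels, its spread, and the cm conversion via `px_per_cm`. A small sketch of that reduction (`summarize_widths` is an illustrative helper, not the project's `compute_cross_section_width`):

```python
import statistics

def summarize_widths(widths_px, px_per_cm):
    # Median is robust against a few bad cross-section samples;
    # std flags noisy contours for the confidence scoring phase.
    median_px = statistics.median(widths_px)
    std_px = statistics.pstdev(widths_px)
    return {
        "median_width_px": median_px,
        "std_width_px": std_px,
        "median_width_cm": median_px / px_per_cm,
        "num_samples": len(widths_px),
    }
```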
502
+ # Phase 7b: Sobel-based measurement (v1 method)
503
+ sobel_measurement = None
504
+ sobel_failed = False
505
+
506
+ if edge_method in ["sobel", "auto", "compare"]:
507
+ try:
508
+ print(f"Running Sobel edge refinement (threshold={sobel_threshold}, kernel={sobel_kernel_size})...")
509
+
510
+ # Create debug directory for edge refinement if debug enabled
511
+ edge_debug_dir = None
512
+ if save_debug and result_png_path is not None:
513
+ edge_debug_dir = str(Path(result_png_path).parent / "edge_refinement_debug")
514
+
515
+ sobel_measurement = refine_edges_sobel(
516
+ image=image_canonical, # Use canonical orientation
517
+ axis_data=axis_data,
518
+ zone_data=zone_data,
519
+ scale_px_per_cm=px_per_cm,
520
+ finger_landmarks=finger_data.get("landmarks"),
521
+ sobel_threshold=sobel_threshold,
522
+ kernel_size=sobel_kernel_size,
523
+ use_subpixel=use_subpixel,
524
+ debug_dir=edge_debug_dir,
525
+ )
526
+
527
+ sobel_width_cm = sobel_measurement["median_width_cm"]
528
+ print(f"Sobel width: {sobel_width_cm:.4f}cm "
529
+ f"({sobel_measurement['num_samples']} samples, "
530
+ f"std={sobel_measurement['std_width_px']:.2f}px, "
531
+ f"quality={sobel_measurement['edge_quality']['overall_score']:.3f})")
532
+
533
+ except Exception as e:
534
+ print(f"Sobel edge refinement failed: {e}")
535
+ sobel_failed = True
536
+ if edge_method == "sobel":
537
+ # User explicitly requested Sobel, fail if it doesn't work
538
+ return create_output(
539
+ card_detected=card_detected,
540
+ finger_detected=True,
541
+ scale_px_per_cm=px_per_cm,
542
+ view_angle_ok=view_angle_ok,
543
+ fail_reason="sobel_edge_refinement_failed",
544
+ edge_method_used="sobel",
545
+ )
546
+
547
+ # Select measurement method based on edge_method flag
548
+ method_comparison_data = None
549
+
550
+ if edge_method == "contour":
551
+ # Use contour method only
552
+ final_measurement = contour_measurement
553
+ median_width_cm = contour_width_cm
554
+ edge_method_used = "contour"
555
+
556
+ elif edge_method == "sobel":
557
+ # Use Sobel method only (already handled failure case above)
558
+ final_measurement = sobel_measurement
559
+ median_width_cm = sobel_measurement["median_width_cm"]
560
+ edge_method_used = "sobel"
561
+
562
+ elif edge_method == "auto":
563
+ # Automatic selection based on quality
564
+ if sobel_measurement and not sobel_failed:
565
+ should_use_sobel, reason = should_use_sobel_measurement(sobel_measurement, contour_measurement)
566
+
567
+ if should_use_sobel:
568
+ final_measurement = sobel_measurement
569
+ median_width_cm = sobel_measurement["median_width_cm"]
570
+ edge_method_used = "sobel"
571
+ print(f"Auto-selected: Sobel (reason: {reason})")
572
+ else:
573
+ final_measurement = contour_measurement
574
+ median_width_cm = contour_width_cm
575
+ edge_method_used = "contour_fallback"
576
+ print(f"Auto-selected: Contour fallback (reason: {reason})")
577
+ else:
578
+ # Sobel failed, use contour
579
+ final_measurement = contour_measurement
580
+ median_width_cm = contour_width_cm
581
+ edge_method_used = "contour_fallback"
582
+ print(f"Auto-selected: Contour (Sobel not available)")
583
+
584
+ elif edge_method == "compare":
585
+ # Comparison mode: prefer Sobel if available, include comparison data
586
+ if sobel_measurement and not sobel_failed:
587
+ method_comparison_data = compare_edge_methods(
588
+ contour_measurement, sobel_measurement, px_per_cm
589
+ )
590
+
591
+ # Prefer Sobel in compare mode for output
592
+ final_measurement = sobel_measurement
593
+ median_width_cm = sobel_measurement["median_width_cm"]
594
+ edge_method_used = "compare"
595
+
596
+ print(f"Method comparison:")
597
+ print(f" Contour: {method_comparison_data['contour']['width_cm']:.4f}cm")
598
+ print(f" Sobel: {method_comparison_data['sobel']['width_cm']:.4f}cm")
599
+ print(f" Diff: {method_comparison_data['difference']['relative_pct']:+.2f}%")
600
+ print(f" Recommendation: {method_comparison_data['recommendation']['preferred_method']}")
601
+ else:
602
+ # Sobel failed, can't compare
603
+ final_measurement = contour_measurement
604
+ median_width_cm = contour_width_cm
605
+ edge_method_used = "contour"
606
+ print(f"Compare mode: Sobel failed, using contour only")
607
+
608
+ # Sanity check: typical adult finger width is ~1.4-2.4 cm; warn outside a looser 1.0-3.0 cm band
609
+ if median_width_cm < 1.0 or median_width_cm > 3.0:
610
+ print(f"Warning: Measured width {median_width_cm:.2f}cm is outside realistic range (1.0-3.0 cm)")
611
+
612
+ # Phase 8: Comprehensive confidence scoring
613
+ # Calculate component confidences
614
+ if card_result is not None:
615
+ card_conf = compute_card_confidence(card_result, scale_confidence)
616
+ else:
617
+ # Dummy card confidence when card detection skipped (testing mode)
618
+ card_conf = scale_confidence # Use dummy scale confidence (0.5)
619
+
620
+ # Calculate mask area for finger confidence
621
+ mask_area = np.sum(cleaned_mask > 0)
622
+ image_area = image.shape[0] * image.shape[1]
623
+ finger_conf = compute_finger_confidence(hand_data, finger_data, mask_area, image_area)
624
+
625
+ # Calculate measurement confidence
626
+ measurement_conf = compute_measurement_confidence(final_measurement, median_width_cm)
627
+
628
+ # Calculate edge quality confidence (v1)
629
+ edge_quality_conf = None
630
+ if edge_method_used in ["sobel", "compare"]:
631
+ edge_quality_conf = compute_edge_quality_confidence(
632
+ final_measurement.get("edge_quality")
633
+ )
634
+
635
+ # Compute overall confidence (v0 or v1 based on edge method)
636
+ confidence_breakdown = compute_overall_confidence(
637
+ card_conf,
638
+ finger_conf,
639
+ measurement_conf,
640
+ edge_method="sobel" if edge_method_used in ["sobel", "compare"] else "contour",
641
+ edge_quality_confidence=edge_quality_conf,
642
+ )
643
+
644
+ # Print confidence breakdown
645
+ conf_parts = [
646
+ f"card={confidence_breakdown['card']:.2f}",
647
+ f"finger={confidence_breakdown['finger']:.2f}",
648
+ f"measurement={confidence_breakdown['measurement']:.2f}",
649
+ ]
650
+ if confidence_breakdown.get('edge_quality') is not None:
651
+ conf_parts.append(f"edge={confidence_breakdown['edge_quality']:.2f}")
652
+
653
+ print(f"Confidence: {confidence_breakdown['overall']:.3f} ({confidence_breakdown['level']}) "
654
+ f"[{', '.join(conf_parts)}]")
655
+ if confidence_breakdown["overall"] < confidence_threshold:
656
+ print(f"Warning: Confidence {confidence_breakdown['overall']:.3f} is below threshold {confidence_threshold:.3f}")
657
+
658
+ # Phase 9: Result visualization (always generated)
659
+ if result_png_path is not None:
660
+ print(f"Generating result visualization...")
661
+
662
+ # Use comprehensive edge overlay (based on Sobel data) + card bounding box
663
+ if edge_method_used in ["sobel", "compare"] and sobel_measurement and not sobel_failed:
664
+ edge_data = sobel_measurement["edge_data"]
665
+ roi_bounds = sobel_measurement["roi_data"]["roi_bounds"]
666
+ width_data = sobel_measurement["width_data"]
667
+ width_data["median_width_cm"] = sobel_measurement["median_width_cm"]
668
+
669
+ debug_image = draw_comprehensive_edge_overlay(
670
+ full_image=image_canonical,
671
+ edge_data=edge_data,
672
+ roi_bounds=roi_bounds,
673
+ axis_data=axis_data,
674
+ zone_data=zone_data,
675
+ width_data=width_data,
676
+ scale_px_per_cm=px_per_cm,
677
+ )
678
+ else:
679
+ # Fallback: start from the plain canonical image when Sobel data is unavailable
680
+ debug_image = image_canonical.copy()
681
+
682
+ # Draw card bounding box (transform corners if image was rotated)
683
+ if card_result is not None and "corners" in card_result:
684
+ corners = card_result["corners"]
685
+ if corners is not None:
686
+ pts = np.array(corners, dtype=np.float32)
687
+ if rotation_matrix is not None:
688
+ pts = transform_points_rotation(pts, rotation_matrix)
689
+ pts = pts.astype(np.int32).reshape((-1, 1, 2))
690
+ cv2.polylines(debug_image, [pts], isClosed=True,
691
+ color=(0, 255, 0), thickness=3, lineType=cv2.LINE_AA)
692
+
693
+ # Save result image
694
+ Path(result_png_path).parent.mkdir(parents=True, exist_ok=True)
695
+ cv2.imwrite(result_png_path, debug_image)
696
+ print(f"Result visualization saved to: {result_png_path}")
697
+
698
+
699
+ return create_output(
700
+ finger_diameter_cm=median_width_cm,
701
+ confidence=confidence_breakdown['overall'],
702
+ card_detected=card_detected,
703
+ finger_detected=True,
704
+ scale_px_per_cm=px_per_cm,
705
+ view_angle_ok=view_angle_ok,
706
+ fail_reason=None,
707
+ edge_method_used=edge_method_used,
708
+ method_comparison=method_comparison_data,
709
+ )
710
+
711
+
712
+ def main() -> int:
713
+ """Main entry point."""
714
+ args = parse_args()
715
+
716
+ # Validate input
717
+ error = validate_input(args.input)
718
+ if error:
719
+ print(f"Error: {error}", file=sys.stderr)
720
+ return 1
721
+
722
+ # Load image
723
+ image = load_image(args.input)
724
+ if image is None:
725
+ print(f"Error: Failed to load image: {args.input}", file=sys.stderr)
726
+ return 1
727
+
728
+ print(f"Loaded image: {args.input} ({image.shape[1]}x{image.shape[0]})")
729
+
730
+ # Derive result PNG path from output JSON path
731
+ result_png_path = str(Path(args.output).with_suffix(".png"))
732
+
733
+ # Run measurement pipeline
734
+ result = measure_finger(
735
+ image=image,
736
+ finger_index=args.finger_index,
737
+ confidence_threshold=args.confidence_threshold,
738
+ save_intermediate=args.save_intermediate,
739
+ result_png_path=result_png_path,
740
+ save_debug=args.debug,
741
+ edge_method=args.edge_method,
742
+ sobel_threshold=args.sobel_threshold,
743
+ sobel_kernel_size=args.sobel_kernel_size,
744
+ use_subpixel=not args.no_subpixel,
745
+ skip_card_detection=args.skip_card_detection,
746
+ )
747
+
748
+ # Save output
749
+ save_output(result, args.output)
750
+ print(f"Results saved to: {args.output}")
751
+
752
+ # Report result
753
+ if result["fail_reason"]:
754
+ print(f"Measurement failed: {result['fail_reason']}")
755
+ return 1
756
+ else:
757
+ print(f"Finger diameter: {result['finger_outer_diameter_cm']} cm")
758
+ print(f"Confidence: {result['confidence']}")
759
+ return 0
760
+
761
+
762
+ if __name__ == "__main__":
763
+ sys.exit(main())
requirements.txt ADDED
@@ -0,0 +1,7 @@
1
+ opencv-python>=4.8.0
2
+ numpy>=1.24.0
3
+ mediapipe>=0.10.0
4
+ scipy>=1.11.0
5
+ scikit-learn>=1.3.0
6
+ flask>=3.0.0
7
+ gunicorn>=21.2.0
script/README.md ADDED
@@ -0,0 +1,25 @@
1
+ # Scripts
2
+
3
+ Utilities for local development/testing.
4
+
5
+ ## `script/test.sh`
6
+ Quick runner for `measure_finger.py`.
7
+
8
+ ### Usage
9
+ ```bash
10
+ ./script/test.sh
11
+ ./script/test.sh input/my_image.jpg
12
+ ./script/test.sh --no-debug
13
+ ./script/test.sh --skip-card-detection
14
+ ./script/test.sh --help
15
+ ```
16
+
17
+ ### Behavior
18
+ - Uses first image in `input/` when no image is passed.
19
+ - Creates `.venv` and installs deps if missing.
20
+ - Writes JSON to `output/test_result.json`.
21
+ - Result PNG is auto-generated as `output/test_result.png` by the main tool.
22
+ - `--debug` in this script toggles intermediate debug folders (default: enabled).
23
+
24
+ ## `script/build.sh`
25
+ Reserved for packaging/build automation (currently empty).
script/build.sh ADDED
File without changes
script/test.sh ADDED
@@ -0,0 +1,171 @@
1
+ #!/bin/bash
2
+ # Quick test script for ring-sizer
3
+ # Usage:
4
+ # ./script/test.sh - Run basic test with debug output
5
+ # ./script/test.sh [image] - Test with specific image
6
+ # ./script/test.sh --no-debug - Run without debug visualization
7
+
8
+ set -e # Exit on error
9
+
10
+ # Colors for output
11
+ GREEN='\033[0;32m'
12
+ BLUE='\033[0;34m'
13
+ YELLOW='\033[1;33m'
14
+ NC='\033[0m' # No Color
15
+
16
+ # Get script directory and project root
17
+ SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
18
+ PROJECT_ROOT="$( cd "$SCRIPT_DIR/.." && pwd )"
19
+
20
+ # Change to project root
21
+ cd "$PROJECT_ROOT"
22
+
23
+ # Python executable
24
+ PYTHON=".venv/bin/python"
25
+
26
+ # Check if virtual environment exists
27
+ if [ ! -f "$PYTHON" ]; then
28
+ echo -e "${YELLOW}Virtual environment not found. Creating...${NC}"
29
+ python3 -m venv .venv
30
+ echo -e "${GREEN}Installing dependencies...${NC}"
31
+ .venv/bin/pip install -r requirements.txt
32
+ fi
33
+
34
+ # Default values
35
+ INPUT_IMAGE=""
36
+ OUTPUT_JSON="output/test_result.json"
37
+ ENABLE_DEBUG=true
38
+ SKIP_CARD=false
39
+ FINGER_INDEX="index"
40
+
41
+ # Parse arguments
42
+ while [ $# -gt 0 ]; do
43
+ case "$1" in
44
+ --no-debug)
45
+ ENABLE_DEBUG=false
46
+ shift
47
+ ;;
48
+ --skip-card-detection|--skip-card)
49
+ SKIP_CARD=true
50
+ shift
51
+ ;;
52
+ --finger-index|--finger|-f)
53
+ if [ -z "$2" ]; then
54
+ echo -e "${YELLOW}Error: --finger-index requires a value: auto|index|middle|ring|pinky${NC}"
55
+ exit 1
56
+ fi
57
+ case "$2" in
58
+ auto|index|middle|ring|pinky)
59
+ FINGER_INDEX="$2"
60
+ ;;
61
+ *)
62
+ echo -e "${YELLOW}Error: Invalid finger index '$2'. Use: auto|index|middle|ring|pinky${NC}"
63
+ exit 1
64
+ ;;
65
+ esac
66
+ shift 2
67
+ ;;
68
+ --help|-h)
69
+ echo "Usage: ./script/test.sh [OPTIONS] [IMAGE]"
70
+ echo ""
71
+ echo "Options:"
72
+ echo " --no-debug Run without debug visualization"
73
+ echo " --skip-card-detection Skip card detection (testing mode for finger segmentation)"
74
+ echo " --finger-index, -f Finger to measure: auto|index|middle|ring|pinky (default: index)"
75
+ echo " --help, -h Show this help message"
76
+ echo ""
77
+ echo "Examples:"
78
+ echo " ./script/test.sh # Use first available test image"
79
+ echo " ./script/test.sh input/my_image.jpg # Test with specific image"
80
+ echo " ./script/test.sh --no-debug # Skip debug output"
81
+ echo " ./script/test.sh --skip-card-detection # Test finger segmentation without card"
82
+ echo " ./script/test.sh -f ring # Measure ring finger"
83
+ exit 0
84
+ ;;
85
+ *)
86
+ INPUT_IMAGE="$1"
87
+ shift
88
+ ;;
89
+ esac
90
+ done
91
+
92
+ # Find first available test image if not specified
93
+ if [ -z "$INPUT_IMAGE" ]; then
94
+ echo -e "${BLUE}Looking for test images in input/...${NC}"
95
+
96
+ # Try to find any image file
97
+ for ext in jpg jpeg png heic; do
98
+ INPUT_IMAGE=$(find input/ -maxdepth 1 -type f -iname "*.$ext" | head -1)
99
+ if [ -n "$INPUT_IMAGE" ]; then
100
+ break
101
+ fi
102
+ done
103
+
104
+ if [ -z "$INPUT_IMAGE" ]; then
105
+ echo -e "${YELLOW}No test images found in input/ directory${NC}"
106
+ echo "Please add a test image to input/ or specify one as an argument:"
107
+ echo " ./script/test.sh path/to/image.jpg"
108
+ exit 1
109
+ fi
110
+ fi
111
+
112
+ # Check if input file exists
113
+ if [ ! -f "$INPUT_IMAGE" ]; then
114
+ echo -e "${YELLOW}Error: Input file not found: $INPUT_IMAGE${NC}"
115
+ exit 1
116
+ fi
117
+
118
+ # Create output directory if it doesn't exist
119
+ mkdir -p output
120
+ rm -rf output/*_debug/*
121
+
122
+ # Build command
123
+ #CMD="$PYTHON measure_finger.py --input $INPUT_IMAGE --output $OUTPUT_JSON --edge-method sobel --edge-detection-method canny_contour"
124
+ CMD="$PYTHON measure_finger.py --input $INPUT_IMAGE --output $OUTPUT_JSON --finger-index $FINGER_INDEX"
125
+
126
+ if [ "$ENABLE_DEBUG" = true ]; then
127
+ CMD="$CMD --debug"
128
+ fi
129
+
130
+ if [ "$SKIP_CARD" = true ]; then
131
+ CMD="$CMD --skip-card-detection"
132
+ fi
133
+
134
+ # Print test info
135
+ echo -e "${GREEN}========================================${NC}"
136
+ echo -e "${GREEN}Ring Sizer Quick Test${NC}"
137
+ echo -e "${GREEN}========================================${NC}"
138
+ echo -e "${BLUE}Input:${NC} $INPUT_IMAGE"
139
+ echo -e "${BLUE}Output:${NC} $OUTPUT_JSON"
140
+ echo -e "${BLUE}Finger:${NC} $FINGER_INDEX"
141
+ RESULT_PNG="${OUTPUT_JSON%.json}.png"
142
+ if [ "$ENABLE_DEBUG" = true ]; then
143
+ echo -e "${BLUE}Debug:${NC} enabled"
144
+ fi
145
+ if [ "$SKIP_CARD" = true ]; then
146
+ echo -e "${YELLOW}Mode:${NC} TESTING (card detection skipped)"
147
+ fi
148
+ echo -e "${GREEN}========================================${NC}"
149
+ echo ""
150
+
151
+ # Run the measurement
152
+ $CMD
153
+
154
+ # Print results
155
+ echo ""
156
+ echo -e "${GREEN}========================================${NC}"
157
+ echo -e "${GREEN}Test Complete!${NC}"
158
+ echo -e "${GREEN}========================================${NC}"
159
+
160
+ if [ -f "$OUTPUT_JSON" ]; then
161
+ echo -e "${BLUE}Results:${NC}"
162
+ cat "$OUTPUT_JSON" | python3 -m json.tool
163
+ echo ""
164
+ fi
165
+
166
+ if [ -f "$RESULT_PNG" ]; then
167
+ echo -e "${BLUE}Result image saved to:${NC} $RESULT_PNG"
168
+ echo ""
169
+ fi
170
+
171
+ echo -e "${GREEN}========================================${NC}"
src/__init__.py ADDED
@@ -0,0 +1,35 @@
1
+ """
2
+ Utility modules for finger measurement.
3
+ """
4
+
5
+ from .card_detection import detect_credit_card, compute_scale_factor
6
+ from .finger_segmentation import segment_hand, isolate_finger, clean_mask, get_finger_contour
7
+ from .geometry import estimate_finger_axis, localize_ring_zone, compute_cross_section_width
8
+ from .image_quality import assess_image_quality, detect_blur, check_exposure
9
+ from .confidence import (
10
+ compute_card_confidence,
11
+ compute_finger_confidence,
12
+ compute_measurement_confidence,
13
+ compute_overall_confidence,
14
+ )
15
+ from .visualization import create_debug_visualization
16
+
17
+ __all__ = [
18
+ "detect_credit_card",
19
+ "compute_scale_factor",
20
+ "segment_hand",
21
+ "isolate_finger",
22
+ "clean_mask",
23
+ "get_finger_contour",
24
+ "estimate_finger_axis",
25
+ "localize_ring_zone",
26
+ "compute_cross_section_width",
27
+ "assess_image_quality",
28
+ "detect_blur",
29
+ "check_exposure",
30
+ "compute_card_confidence",
31
+ "compute_finger_confidence",
32
+ "compute_measurement_confidence",
33
+ "compute_overall_confidence",
34
+ "create_debug_visualization",
35
+ ]
src/card_detection.py ADDED
@@ -0,0 +1,612 @@
1
+ """
2
+ Credit card detection and scale calibration utilities.
3
+
4
+ This module handles:
5
+ - Detecting credit card contour in an image
6
+ - Verifying aspect ratio matches standard credit card
7
+ - Perspective rectification
8
+ - Computing pixels-per-cm scale factor
9
+ """
10
+
11
+ import cv2
12
+ import numpy as np
13
+ from typing import Optional, Tuple, Dict, Any, List
14
+ from pathlib import Path
15
+
16
+ # Import debug observer and drawing functions
17
+ from .debug_observer import DebugObserver, draw_contours_overlay, draw_candidates_with_scores
18
+
19
+ # Import shared visualization constants
20
+ from .viz_constants import (
21
+ FONT_FACE,
22
+ Color,
23
+ StrategyColor,
24
+ FontScale,
25
+ FontThickness,
26
+ Size,
27
+ Layout,
28
+ )
29
+
30
+ # Standard credit card dimensions (ISO/IEC 7810 ID-1)
31
+ CARD_WIDTH_MM = 85.60
32
+ CARD_HEIGHT_MM = 53.98
33
+ CARD_WIDTH_CM = CARD_WIDTH_MM / 10
34
+ CARD_HEIGHT_CM = CARD_HEIGHT_MM / 10
35
+ CARD_ASPECT_RATIO = CARD_WIDTH_MM / CARD_HEIGHT_MM # ~1.586
36
+
37
+ # Detection parameters
38
+ MIN_CARD_AREA_RATIO = 0.01 # Card must be at least 1% of image area
39
+ MAX_CARD_AREA_RATIO = 0.5 # Card must be at most 50% of image area
40
+
41
+
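The ISO/IEC 7810 constants above are what turn a detected card into a pixels-per-cm scale. `compute_scale_factor`'s signature isn't shown in this hunk, so the sketch below is only an assumption of how the card's two edges could be averaged into one estimate (`px_per_cm_from_card` is a hypothetical name):

```python
CARD_WIDTH_MM = 85.60   # ISO/IEC 7810 ID-1 long edge
CARD_HEIGHT_MM = 53.98  # ISO/IEC 7810 ID-1 short edge

def px_per_cm_from_card(card_width_px, card_height_px):
    # Average the two independent scale estimates from the card's
    # long and short edges (8.560 cm x 5.398 cm).
    long_px = max(card_width_px, card_height_px)
    short_px = min(card_width_px, card_height_px)
    scale_long = long_px / (CARD_WIDTH_MM / 10)
    scale_short = short_px / (CARD_HEIGHT_MM / 10)
    return (scale_long + scale_short) / 2
```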
42
+ def order_corners(corners: np.ndarray) -> np.ndarray:
43
+ """
44
+ Order corners as: top-left, top-right, bottom-right, bottom-left.
45
+
46
+ Args:
47
+ corners: 4x2 array of corner points
48
+
49
+ Returns:
50
+ Ordered 4x2 array of corners
51
+ """
52
+ corners = corners.reshape(4, 2).astype(np.float32)
53
+
54
+ # Sort by sum (x+y): smallest = top-left, largest = bottom-right
55
+ s = corners.sum(axis=1)
56
+ tl_idx = np.argmin(s)
57
+ br_idx = np.argmax(s)
58
+
59
+ # Sort by diff (y-x): smallest = top-right, largest = bottom-left
60
+ d = np.diff(corners, axis=1).flatten()
61
+ tr_idx = np.argmin(d)
62
+ bl_idx = np.argmax(d)
63
+
64
+ return np.array([
65
+ corners[tl_idx],
66
+ corners[tr_idx],
67
+ corners[br_idx],
68
+ corners[bl_idx],
69
+ ], dtype=np.float32)
70
+
71
+
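`order_corners` relies on a classic trick: for a roughly axis-aligned quad, `x + y` is smallest at the top-left and largest at the bottom-right, while `y - x` (what `np.diff(corners, axis=1)` computes) separates top-right from bottom-left. The same logic in dependency-free Python, for illustration:

```python
def order_corners_xy(points):
    # points: iterable of (x, y) tuples for the four corners.
    tl = min(points, key=lambda p: p[0] + p[1])  # smallest x+y
    br = max(points, key=lambda p: p[0] + p[1])  # largest x+y
    tr = min(points, key=lambda p: p[1] - p[0])  # smallest y-x
    bl = max(points, key=lambda p: p[1] - p[0])  # largest y-x
    return [tl, tr, br, bl]
```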
72
+ def get_quad_dimensions(corners: np.ndarray) -> Tuple[float, float]:
73
+ """
74
+ Get width and height of a quadrilateral from ordered corners.
75
+
76
+ Args:
77
+ corners: Ordered 4x2 array (TL, TR, BR, BL)
78
+
79
+ Returns:
80
+ Tuple of (width, height) in pixels
81
+ """
82
+ # Width: average of top and bottom edges
83
+ top_width = np.linalg.norm(corners[1] - corners[0])
84
+ bottom_width = np.linalg.norm(corners[2] - corners[3])
85
+ width = (top_width + bottom_width) / 2
86
+
87
+ # Height: average of left and right edges
88
+ left_height = np.linalg.norm(corners[3] - corners[0])
89
+ right_height = np.linalg.norm(corners[2] - corners[1])
90
+ height = (left_height + right_height) / 2
91
+
92
+ return width, height
93
+
94
+
95
+ def score_card_candidate(
96
+ contour: np.ndarray,
97
+ corners: np.ndarray,
98
+ image_area: float,
99
+ aspect_ratio_tolerance: float = 0.15,
100
+ ) -> Tuple[float, Dict[str, Any]]:
101
+ """
102
+ Score a quadrilateral candidate for being a credit card.
103
+
104
+ Since candidates come from minAreaRect, corners are always a perfect
105
+ rectangle. Scoring focuses on aspect ratio match and area coverage.
106
+
107
+ Args:
108
+ contour: Original contour (minAreaRect box points)
109
+ corners: 4 corner points
110
+ image_area: Total image area for relative sizing
111
+ aspect_ratio_tolerance: Allowed deviation from standard ratio
112
+
113
+ Returns:
114
+ Tuple of (score, details_dict)
115
+ """
116
+ ordered = order_corners(corners)
117
+ width, height = get_quad_dimensions(ordered)
118
+ area = cv2.contourArea(corners)
119
+
120
+ details = {
121
+ "corners": ordered,
122
+ "width": width,
123
+ "height": height,
124
+ "area": area,
125
+ }
126
+
127
+ # Check area ratio
128
+ area_ratio = area / image_area
129
+ if area_ratio < MIN_CARD_AREA_RATIO or area_ratio > MAX_CARD_AREA_RATIO:
130
+ details["reject_reason"] = f"area_ratio={area_ratio:.3f}"
131
+ return 0.0, details
132
+
133
+ # Safeguard against zero dimensions
134
+ if width <= 0 or height <= 0:
135
+ details["reject_reason"] = "invalid_dimensions"
136
+ return 0.0, details
137
+
138
+ # Calculate aspect ratio (always use larger/smaller for consistency)
139
+ if width > height:
140
+ aspect_ratio = width / height
141
+ else:
142
+ aspect_ratio = height / width
143
+ details["aspect_ratio"] = aspect_ratio
144
+
145
+ # Check aspect ratio against credit card standard
146
+ ratio_diff = abs(aspect_ratio - CARD_ASPECT_RATIO) / CARD_ASPECT_RATIO
147
+ if ratio_diff > aspect_ratio_tolerance:
148
+ details["reject_reason"] = f"aspect_ratio={aspect_ratio:.3f}, expected~{CARD_ASPECT_RATIO:.3f}"
149
+ return 0.0, details
150
+
151
+ # Compute score (higher is better)
152
+ # minAreaRect always produces perfect rectangles, so no angle check needed.
153
+ # Score based on area size and aspect ratio match.
154
+ area_score = min(area_ratio / 0.1, 1.0) # Normalize to max at 10% of image
155
+ ratio_score = 1.0 - ratio_diff / aspect_ratio_tolerance
156
+
157
+ score = 0.5 * area_score + 0.5 * ratio_score
158
+ details["score_components"] = {
159
+ "area": area_score,
160
+ "ratio": ratio_score,
161
+ }
162
+
163
+ return score, details
164
+
165
+
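The scoring above is an equal-weight blend of area coverage (saturating at 10% of the image) and aspect-ratio closeness, with hard rejects outside the area and ratio gates. A condensed pure-Python restatement of that accept/score path (`card_score` is an illustrative name):

```python
CARD_ASPECT_RATIO = 85.60 / 53.98  # ~1.586, ISO/IEC 7810 ID-1
MIN_AREA_RATIO, MAX_AREA_RATIO = 0.01, 0.5

def card_score(area_ratio, aspect_ratio, tol=0.15):
    # Hard rejects mirror score_card_candidate's gates.
    if area_ratio < MIN_AREA_RATIO or area_ratio > MAX_AREA_RATIO:
        return 0.0
    ratio_diff = abs(aspect_ratio - CARD_ASPECT_RATIO) / CARD_ASPECT_RATIO
    if ratio_diff > tol:
        return 0.0
    area_score = min(area_ratio / 0.1, 1.0)   # saturates at 10% of image
    ratio_score = 1.0 - ratio_diff / tol      # 1.0 at a perfect ratio match
    return 0.5 * area_score + 0.5 * ratio_score
```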
166
+ def find_card_contours(
167
+ image: np.ndarray,
168
+ image_area: float,
169
+ aspect_ratio_tolerance: float = 0.15,
170
+ min_score: float = 0.3,
171
+ debug_dir: Optional[str] = None,
172
+ ) -> List[np.ndarray]:
173
+ """
174
+ Find potential card contours using a waterfall of detection strategies.
175
+
176
+ Strategies are tried in order: Canny → Adaptive → Otsu → Color.
177
+ If a strategy produces a candidate scoring above min_score, subsequent
178
+ strategies are skipped.
179
+
180
+ Args:
181
+ image: Input BGR image
182
+ image_area: Total image area in pixels
183
+ aspect_ratio_tolerance: Allowed deviation from standard aspect ratio
184
+ min_score: Minimum score to accept a strategy's candidates
185
+ debug_dir: Optional directory to save debug images
186
+
187
+ Returns:
188
+ List of 4-point contour approximations from the first successful strategy
189
+ """
190
+ # Create debug observer if debug mode enabled
191
+ observer = DebugObserver(debug_dir) if debug_dir else None
192
+
193
+ h, w = image.shape[:2]
194
+ min_area = h * w * 0.01 # At least 1% of image
195
+ max_area = h * w * 0.5 # At most 50% of image
196
+
197
+ # Save original image
198
+ if observer:
199
+ observer.save_stage("01_original", image)
200
+
201
+ # Convert to grayscale
202
+ gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
203
+ if observer:
204
+ observer.save_stage("02_grayscale", gray)
205
+
206
+ # Apply bilateral filter to reduce noise while keeping edges
207
+ filtered = cv2.bilateralFilter(gray, 11, 75, 75)
208
+ if observer:
209
+ observer.save_stage("03_bilateral_filtered", filtered)
210
+
211
+ def extract_quads(contours, epsilon_factor=0.02, min_rectangularity=0.7,
212
+ aspect_tolerance=0.15):
213
+ """Extract quadrilaterals from contours using minAreaRect.
214
+
215
+ Shape constraints:
216
+ - Rectangularity (contour_area / rect_area): rejects irregular shapes
217
+ - Aspect ratio: rejects rectangles that don't match card proportions
218
+ """
219
+ quads = []
220
+ for contour in contours:
221
+ contour_area = cv2.contourArea(contour)
222
+ if contour_area < min_area or contour_area > max_area:
223
+ continue
224
+
225
+ peri = cv2.arcLength(contour, True)
226
+ approx = cv2.approxPolyDP(contour, epsilon_factor * peri, True)
227
+
228
+ if len(approx) < 4:
229
+ continue
230
+
231
+ rect = cv2.minAreaRect(contour)
232
+ box = cv2.boxPoints(rect).astype(np.float32)
233
+
234
+ rect_area = cv2.contourArea(box)
235
+ if rect_area <= 0:
236
+ continue
237
+ rectangularity = contour_area / rect_area
238
+ if rectangularity < min_rectangularity:
239
+ continue
240
+
241
+ (_, _), (bw, bh), _ = rect
242
+ if bw <= 0 or bh <= 0:
243
+ continue
244
+ aspect = max(bw, bh) / min(bw, bh)
245
+ if abs(aspect - CARD_ASPECT_RATIO) / CARD_ASPECT_RATIO > aspect_tolerance:
246
+ continue
247
+
248
+ quads.append(box.reshape(4, 1, 2))
249
+
250
+ return quads
251
+
252
+ def dedup_quads(quads, center_threshold=50):
253
+ """Remove near-duplicate boxes, keeping the largest when centers overlap.
254
+
255
+ Two boxes are considered duplicates if their centers are within
256
+ center_threshold pixels of each other.
257
+ """
258
+ if len(quads) <= 1:
259
+ return quads
260
+
261
+ # Sort by area descending so largest comes first
262
+ quads_with_area = [(q, cv2.contourArea(q)) for q in quads]
263
+ quads_with_area.sort(key=lambda x: x[1], reverse=True)
264
+
265
+ kept = []
266
+ for quad, area in quads_with_area:
267
+ center = quad.reshape(4, 2).mean(axis=0)
268
+ is_dup = False
269
+ for kept_quad in kept:
270
+ kept_center = kept_quad.reshape(4, 2).mean(axis=0)
271
+ dist = np.linalg.norm(center - kept_center)
272
+ if dist < center_threshold:
273
+ is_dup = True
274
+ break
275
+ if not is_dup:
276
+ kept.append(quad)
277
+
278
+ return kept
279
+
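`dedup_quads` keeps the largest box of any cluster of near-coincident centers. The same idea without OpenCV, operating on precomputed `(center, area)` pairs (`dedup_by_center` is hypothetical):

```python
import math

def dedup_by_center(boxes, center_threshold=50.0):
    # boxes: list of ((cx, cy), area). Sorting by area descending
    # guarantees the largest box of each cluster is seen (and kept) first.
    kept = []
    for center, area in sorted(boxes, key=lambda b: b[1], reverse=True):
        if all(math.dist(center, c) >= center_threshold for c, _ in kept):
            kept.append((center, area))
    return kept
```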
280
+ def score_best(quads):
281
+ """Return the best score among quads."""
282
+ best = 0.0
283
+ for q in quads:
284
+ corners = q.reshape(4, 2)
285
+ score, _ = score_card_candidate(
286
+ q, corners, image_area, aspect_ratio_tolerance
287
+ )
288
+ best = max(best, score)
289
+ return best
290
+
291
+ # --- Waterfall: try strategies in order, stop on first success ---
292
+
293
+ # Strategy 1: Canny edge detection with various thresholds
294
+ canny_candidates = []
295
+ canny_configs = [(20, 60), (30, 100), (50, 150), (75, 200), (100, 250)]
296
+ saved_canny_indices = [0, 2, 4]
297
+
298
+ for idx, (canny_low, canny_high) in enumerate(canny_configs):
299
+ edges = cv2.Canny(filtered, canny_low, canny_high)
300
+
301
+ if idx in saved_canny_indices and observer:
302
+ observer.save_stage(f"04_canny_{canny_low}_{canny_high}", edges)
303
+
304
+ kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
305
+        edges_morphed = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)
+
+        if idx == 2 and observer:
+            observer.save_stage("07_canny_morphology", edges_morphed)
+
+        contours, _ = cv2.findContours(edges_morphed, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
+        canny_candidates.extend(extract_quads(contours))
+
+    canny_candidates = dedup_quads(canny_candidates)
+
+    if observer and canny_candidates:
+        observer.draw_and_save("08_canny_contours", image,
+                               draw_contours_overlay, canny_candidates, "Canny Edge Detection", StrategyColor.CANNY)
+
+    if canny_candidates and score_best(canny_candidates) >= min_score:
+        return canny_candidates
+
+    # Strategy 2: Adaptive thresholding (for varying lighting)
+    adaptive_candidates = []
+    adaptive_configs = [(11, 2), (21, 5), (31, 10), (51, 10)]
+    saved_adaptive = [0, 2]
+
+    for idx, (block_size, C) in enumerate(adaptive_configs):
+        thresh = cv2.adaptiveThreshold(
+            filtered, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
+            cv2.THRESH_BINARY, block_size, C
+        )
+
+        if idx in saved_adaptive and observer:
+            if idx == 0:
+                observer.save_stage("09_adaptive_11_2", thresh)
+            elif idx == 2:
+                observer.save_stage("10_adaptive_31_10", thresh)
+
+        for img in [thresh, 255 - thresh]:
+            contours, _ = cv2.findContours(img, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
+            adaptive_candidates.extend(extract_quads(contours))
+
+    adaptive_candidates = dedup_quads(adaptive_candidates)
+
+    if observer and adaptive_candidates:
+        observer.draw_and_save("11_adaptive_contours", image,
+                               draw_contours_overlay, adaptive_candidates, "Adaptive Thresholding", StrategyColor.ADAPTIVE)
+
+    if adaptive_candidates and score_best(adaptive_candidates) >= min_score:
+        return adaptive_candidates
+
+    # Strategy 3: Otsu's thresholding
+    otsu_candidates = []
+    _, otsu = cv2.threshold(filtered, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
+    if observer:
+        observer.save_stage("12_otsu_binary", otsu)
+
+    otsu_inverted = 255 - otsu
+    if observer:
+        observer.save_stage("13_otsu_inverted", otsu_inverted)
+
+    for img in [otsu, otsu_inverted]:
+        kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
+        img_morphed = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)
+        contours, _ = cv2.findContours(img_morphed, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
+        otsu_candidates.extend(extract_quads(contours))
+
+    otsu_candidates = dedup_quads(otsu_candidates)
+
+    if observer and otsu_candidates:
+        observer.draw_and_save("14_otsu_contours", image,
+                               draw_contours_overlay, otsu_candidates, "Otsu Thresholding", StrategyColor.OTSU)
+
+    if otsu_candidates and score_best(otsu_candidates) >= min_score:
+        return otsu_candidates
+
+    # Strategy 4: Color-based segmentation (gray card on light background)
+    color_candidates = []
+    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
+    sat = hsv[:, :, 1]
+    if observer:
+        observer.save_stage("15_hsv_saturation", sat)
+
+    _, low_sat_mask = cv2.threshold(sat, 30, 255, cv2.THRESH_BINARY_INV)
+    if observer:
+        observer.save_stage("16_low_sat_mask", low_sat_mask)
+
+    val = hsv[:, :, 2]
+    gray_mask = cv2.bitwise_and(low_sat_mask, cv2.inRange(val, 80, 200))
+
+    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (7, 7))
+    gray_mask = cv2.morphologyEx(gray_mask, cv2.MORPH_CLOSE, kernel)
+    gray_mask = cv2.morphologyEx(gray_mask, cv2.MORPH_OPEN, kernel)
+    if observer:
+        observer.save_stage("17_gray_mask", gray_mask)
+
+    contours, _ = cv2.findContours(gray_mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
+    color_candidates = dedup_quads(extract_quads(contours, epsilon_factor=0.03))
+
+    if observer and color_candidates:
+        observer.draw_and_save("18_color_contours", image,
+                               draw_contours_overlay, color_candidates, "Color-Based Detection", StrategyColor.COLOR_BASED)
+
+    if color_candidates and score_best(color_candidates) >= min_score:
+        return color_candidates
+
+    # No strategy succeeded: return all collected candidates as a last resort
+    all_candidates = canny_candidates + adaptive_candidates + otsu_candidates + color_candidates
+    if observer and all_candidates:
+        observer.draw_and_save("19_all_candidates", image,
+                               draw_contours_overlay, all_candidates, "All Candidates (fallback)", StrategyColor.ALL_CANDIDATES)
+    return all_candidates
+
+
+ def detect_credit_card(
+     image: np.ndarray,
+     aspect_ratio_tolerance: float = 0.15,
+     debug_dir: Optional[str] = None,
+ ) -> Optional[Dict[str, Any]]:
+     """
+     Detect a credit card in the image.
+
+     Args:
+         image: Input BGR image
+         aspect_ratio_tolerance: Allowed deviation from the standard aspect ratio
+         debug_dir: Optional directory to save debug images
+
+     Returns:
+         Dictionary containing:
+         - corners: 4x2 array of corner points (ordered)
+         - contour: Full contour points
+         - confidence: Detection confidence score
+         - width_px, height_px: Detected dimensions
+         - aspect_ratio: Detected aspect ratio
+         Or None if no card is detected
+     """
+     # Create a debug observer if debug mode is enabled
+     observer = DebugObserver(debug_dir) if debug_dir else None
+
+     if observer:
+         print(f"Saving card detection debug images to: {debug_dir}")
+
+     h, w = image.shape[:2]
+     image_area = h * w
+
+     # Find candidate contours (waterfall: stops after the first successful strategy)
+     candidates = find_card_contours(
+         image, image_area=image_area,
+         aspect_ratio_tolerance=aspect_ratio_tolerance,
+         debug_dir=debug_dir,
+     )
+
+     if not candidates:
+         if observer:
+             print("  No candidates found")
+         return None
+
+     # Score each candidate
+     best_score = 0.0
+     best_result = None
+     all_scored = []
+
+     for contour in candidates:
+         corners = contour.reshape(4, 2)
+         score, details = score_card_candidate(
+             contour, corners, image_area, aspect_ratio_tolerance
+         )
+
+         all_scored.append((corners, score, details))
+
+         if score > best_score:
+             best_score = score
+             best_result = details
+
+     # Sort by score (descending) and take the top 5
+     all_scored.sort(key=lambda x: x[1], reverse=True)
+     top_candidates = all_scored[:5]
+
+     # Save the scored-candidates visualization
+     if observer and top_candidates:
+         observer.draw_and_save("20_scored_candidates", image,
+                                draw_candidates_with_scores, top_candidates, "Top 5 Candidates")
+
+     if best_result is None or best_score < 0.3:
+         if observer:
+             print(f"  Best score {best_score:.2f} below threshold 0.3")
+         return None
+
+     # Save the final detection
+     if observer:
+         final_overlay = image.copy()
+         corners = best_result["corners"].astype(np.int32)
+         cv2.polylines(final_overlay, [corners], True, Color.GREEN, Size.CONTOUR_THICK)
+
+         # Draw corners
+         for pt in corners:
+             cv2.circle(final_overlay, tuple(pt), Size.CORNER_RADIUS + 2, Color.RED, -1)
+
+         # Add details text
+         text_y = Layout.TITLE_Y
+         details_text = [
+             "Final Detection",
+             f"Score: {best_score:.3f}",
+             f"Aspect Ratio: {best_result['aspect_ratio']:.3f}",
+             f"Dimensions: {best_result['width']:.0f}x{best_result['height']:.0f}px",
+         ]
+
+         for text in details_text:
+             cv2.putText(
+                 final_overlay, text, (Layout.TEXT_OFFSET_X, text_y),
+                 FONT_FACE, FontScale.SUBTITLE, Color.WHITE,
+                 FontThickness.SUBTITLE_OUTLINE, cv2.LINE_AA
+             )
+             cv2.putText(
+                 final_overlay, text, (Layout.TEXT_OFFSET_X, text_y),
+                 FONT_FACE, FontScale.SUBTITLE, Color.GREEN,
+                 FontThickness.SUBTITLE, cv2.LINE_AA
+             )
+             text_y += Layout.LINE_SPACING
+
+         observer.save_stage("21_final_detection", final_overlay)
+         print("  Saved 21 debug images")
+
+     return {
+         "corners": best_result["corners"],
+         "contour": best_result["corners"],
+         "confidence": best_score,
+         "width_px": best_result["width"],
+         "height_px": best_result["height"],
+         "aspect_ratio": best_result["aspect_ratio"],
+     }
+
+
+ def rectify_card(
+     image: np.ndarray,
+     corners: np.ndarray,
+     output_width: int = 856,
+ ) -> Tuple[np.ndarray, np.ndarray]:
+     """
+     Apply a perspective transform to rectify the card region.
+
+     Args:
+         image: Input BGR image
+         corners: Ordered 4x2 array of corner points (TL, TR, BR, BL)
+         output_width: Width of the output image (height computed from the aspect ratio)
+
+     Returns:
+         Tuple of (rectified_image, transform_matrix)
+     """
+     corners = corners.astype(np.float32)
+
+     # Determine whether the card is in portrait or landscape orientation
+     width, height = get_quad_dimensions(corners)
+
+     if width > height:
+         # Landscape orientation
+         out_w = output_width
+         out_h = int(output_width / CARD_ASPECT_RATIO)
+     else:
+         # Portrait orientation (rotated 90°)
+         out_h = output_width
+         out_w = int(output_width / CARD_ASPECT_RATIO)
+
+     # Destination points
+     dst = np.array([
+         [0, 0],
+         [out_w - 1, 0],
+         [out_w - 1, out_h - 1],
+         [0, out_h - 1],
+     ], dtype=np.float32)
+
+     # Compute the perspective transform
+     M = cv2.getPerspectiveTransform(corners, dst)
+
+     # Apply the transform
+     rectified = cv2.warpPerspective(image, M, (out_w, out_h))
+
+     return rectified, M
+
+
+ def compute_scale_factor(
+     corners: np.ndarray,
+ ) -> Tuple[float, float]:
+     """
+     Compute the pixels-per-cm scale factor from detected card corners.
+
+     Args:
+         corners: Ordered 4x2 array of corner points
+
+     Returns:
+         Tuple of (px_per_cm, confidence)
+     """
+     width_px, height_px = get_quad_dimensions(corners)
+
+     # Determine orientation and compute the scale
+     if width_px > height_px:
+         # Landscape: width corresponds to the card width (8.56 cm)
+         px_per_cm_w = width_px / CARD_WIDTH_CM
+         px_per_cm_h = height_px / CARD_HEIGHT_CM
+     else:
+         # Portrait: width corresponds to the card height (5.398 cm)
+         px_per_cm_w = width_px / CARD_HEIGHT_CM
+         px_per_cm_h = height_px / CARD_WIDTH_CM
+
+     # Average the two estimates
+     px_per_cm = (px_per_cm_w + px_per_cm_h) / 2
+
+     # Confidence based on consistency between the width and height estimates
+     consistency = 1.0 - abs(px_per_cm_w - px_per_cm_h) / max(px_per_cm_w, px_per_cm_h)
+     confidence = max(0.0, min(1.0, consistency))
+
+     return px_per_cm, confidence
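
The orientation-aware averaging in `compute_scale_factor` can be sketched as a small dependency-free snippet. The helper name `scale_from_card` is illustrative (not part of the module), and the ISO/IEC 7810 ID-1 card dimensions (8.56 cm × 5.398 cm) are hard-coded here only to make the example self-contained:

```python
# Illustrative re-implementation of the scale-factor math (assumed names).
CARD_WIDTH_CM = 8.56    # ISO/IEC 7810 ID-1 long side
CARD_HEIGHT_CM = 5.398  # ISO/IEC 7810 ID-1 short side

def scale_from_card(width_px: float, height_px: float):
    """Return (px_per_cm, confidence) from measured card side lengths."""
    if width_px > height_px:  # landscape: long side maps to 8.56 cm
        per_cm_w = width_px / CARD_WIDTH_CM
        per_cm_h = height_px / CARD_HEIGHT_CM
    else:                     # portrait: short side maps to 5.398 cm
        per_cm_w = width_px / CARD_HEIGHT_CM
        per_cm_h = height_px / CARD_WIDTH_CM
    px_per_cm = (per_cm_w + per_cm_h) / 2
    # Confidence drops as the two independent estimates diverge
    consistency = 1.0 - abs(per_cm_w - per_cm_h) / max(per_cm_w, per_cm_h)
    return px_per_cm, max(0.0, min(1.0, consistency))

# A perfectly proportioned landscape card: both estimates agree at 100 px/cm
scale, conf = scale_from_card(856.0, 539.8)
```

Averaging the two per-axis estimates makes the scale robust to a small error on one side, while the consistency term flags detections whose quadrilateral does not match the card's true proportions.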
src/confidence.py ADDED
@@ -0,0 +1,311 @@
+ """
+ Confidence scoring utilities.
+
+ This module handles:
+ - Card detection confidence
+ - Finger detection confidence
+ - Measurement stability confidence
+ - Edge quality confidence (v1)
+ - Aggregate confidence calculation
+
+ All thresholds and weights are imported from confidence_constants.py.
+ """
+
+ import logging
+ import numpy as np
+ from typing import Dict, Any, Optional, Literal
+
+ from .confidence_constants import (
+     # Card confidence constants
+     CARD_IDEAL_ASPECT_RATIO,
+     CARD_MAX_ASPECT_DEVIATION,
+     CARD_WEIGHT_DETECTION,
+     CARD_WEIGHT_ASPECT,
+     CARD_WEIGHT_SCALE,
+     # Finger confidence constants
+     FINGER_IDEAL_MIN_AREA_FRACTION,
+     FINGER_IDEAL_MAX_AREA_FRACTION,
+     FINGER_WEIGHT_HAND_DETECTION,
+     FINGER_WEIGHT_MASK_VALIDITY,
+     # Measurement confidence constants
+     MEASUREMENT_CV_POOR,
+     MEASUREMENT_CONSISTENCY_THRESHOLD,
+     MEASUREMENT_OUTLIER_STD_MULTIPLIER,
+     MEASUREMENT_WIDTH_TYPICAL_MIN,
+     MEASUREMENT_WIDTH_TYPICAL_MAX,
+     MEASUREMENT_WIDTH_ABSOLUTE_MIN,
+     MEASUREMENT_WIDTH_ABSOLUTE_MAX,
+     MEASUREMENT_WEIGHT_VARIANCE,
+     MEASUREMENT_WEIGHT_CONSISTENCY,
+     MEASUREMENT_WEIGHT_OUTLIERS,
+     MEASUREMENT_WEIGHT_RANGE,
+     MEASUREMENT_RANGE_SCORE_IDEAL,
+     MEASUREMENT_RANGE_SCORE_BORDERLINE,
+     MEASUREMENT_RANGE_SCORE_OUTSIDE,
+     # Overall confidence constants
+     V0_WEIGHT_CARD,
+     V0_WEIGHT_FINGER,
+     V0_WEIGHT_MEASUREMENT,
+     V1_WEIGHT_CARD,
+     V1_WEIGHT_FINGER,
+     V1_WEIGHT_EDGE_QUALITY,
+     V1_WEIGHT_MEASUREMENT,
+     CONFIDENCE_LEVEL_HIGH_THRESHOLD,
+     CONFIDENCE_LEVEL_MEDIUM_THRESHOLD,
+ )
+
+ logger = logging.getLogger(__name__)
+
+ EdgeMethod = Literal["contour", "sobel", "sobel_fallback"]
+
+
+ def compute_card_confidence(
+     card_result: Dict[str, Any],
+     scale_confidence: float,
+ ) -> float:
+     """
+     Compute the confidence score from card detection.
+
+     Uses constants:
+     - CARD_IDEAL_ASPECT_RATIO: ISO/IEC 7810 ID-1 aspect ratio
+     - CARD_MAX_ASPECT_DEVIATION: Maximum acceptable deviation (0.15)
+     - CARD_WEIGHT_*: Component weights (detection: 50%, aspect: 25%, scale: 25%)
+
+     Args:
+         card_result: Output from detect_credit_card()
+         scale_confidence: Scale calibration confidence
+
+     Returns:
+         Card confidence score [0, 1]
+     """
+     # Base confidence from card detection
+     detection_conf = card_result.get("confidence", 0.0)
+
+     # Aspect ratio deviation penalty
+     aspect_ratio = card_result.get("aspect_ratio", 0.0)
+     aspect_deviation = abs(aspect_ratio - CARD_IDEAL_ASPECT_RATIO) / CARD_IDEAL_ASPECT_RATIO
+
+     # Penalize deviation beyond the threshold
+     aspect_score = max(0, 1.0 - (aspect_deviation / CARD_MAX_ASPECT_DEVIATION))
+
+     # Combine components with weights
+     card_conf = (
+         CARD_WEIGHT_DETECTION * detection_conf +
+         CARD_WEIGHT_ASPECT * aspect_score +
+         CARD_WEIGHT_SCALE * scale_confidence
+     )
+
+     return float(np.clip(card_conf, 0, 1))
+
+
+ def compute_finger_confidence(
+     hand_data: Dict[str, Any],
+     finger_data: Dict[str, Any],
+     mask_area: int,
+     image_area: int,
+ ) -> float:
+     """
+     Compute the confidence score from finger detection.
+
+     Uses constants:
+     - FINGER_IDEAL_MIN_AREA_FRACTION: Minimum ideal mask area (0.5% of image)
+     - FINGER_IDEAL_MAX_AREA_FRACTION: Maximum ideal mask area (5% of image)
+     - FINGER_WEIGHT_*: Component weights (hand: 70%, mask: 30%)
+
+     Args:
+         hand_data: Output from segment_hand()
+         finger_data: Output from isolate_finger()
+         mask_area: Area of the cleaned finger mask in pixels
+         image_area: Total image area in pixels
+
+     Returns:
+         Finger confidence score [0, 1]
+     """
+     # Hand landmark detection confidence from MediaPipe
+     hand_conf = hand_data.get("confidence", 0.0)
+
+     # Mask area validity (should be a reasonable fraction of the image)
+     mask_fraction = mask_area / image_area
+     # Ideal range: FINGER_IDEAL_MIN_AREA_FRACTION to FINGER_IDEAL_MAX_AREA_FRACTION
+     if mask_fraction < FINGER_IDEAL_MIN_AREA_FRACTION:
+         area_score = mask_fraction / FINGER_IDEAL_MIN_AREA_FRACTION
+     elif mask_fraction > FINGER_IDEAL_MAX_AREA_FRACTION:
+         area_score = max(0, 1.0 - (mask_fraction - FINGER_IDEAL_MAX_AREA_FRACTION) / FINGER_IDEAL_MAX_AREA_FRACTION)
+     else:
+         area_score = 1.0
+
+     # Combine components with weights
+     finger_conf = FINGER_WEIGHT_HAND_DETECTION * hand_conf + FINGER_WEIGHT_MASK_VALIDITY * area_score
+
+     return float(np.clip(finger_conf, 0, 1))
+
+
+ def compute_measurement_confidence(
+     width_data: Dict[str, Any],
+     median_width_cm: float,
+ ) -> float:
+     """
+     Compute the confidence score from measurement stability.
+
+     Uses constants:
+     - MEASUREMENT_CV_POOR: Coefficient of variation threshold (0.15)
+     - MEASUREMENT_CONSISTENCY_THRESHOLD: Median-mean difference threshold (0.1)
+     - MEASUREMENT_OUTLIER_STD_MULTIPLIER: Outlier detection threshold (2.0)
+     - MEASUREMENT_WIDTH_*: Realistic width ranges (1.0-3.0 cm)
+     - MEASUREMENT_WEIGHT_*: Component weights (variance: 40%, consistency: 20%, outliers: 20%, range: 20%)
+     - MEASUREMENT_RANGE_SCORE_*: Range score values
+
+     Args:
+         width_data: Output from compute_cross_section_width()
+         median_width_cm: Median width in centimeters
+
+     Returns:
+         Measurement confidence score [0, 1]
+     """
+     widths_px = np.array(width_data.get("widths_px", []))
+
+     if len(widths_px) == 0:
+         return 0.0
+
+     median_px = width_data.get("median_width_px", 0.0)
+     mean_px = width_data.get("mean_width_px", 0.0)
+     std_px = width_data.get("std_width_px", 0.0)
+
+     # 1. Variance score (lower variance = higher confidence)
+     coefficient_of_variation = std_px / (median_px + 1e-8)
+     # CV below MEASUREMENT_CV_POOR is acceptable
+     variance_score = max(0, 1.0 - coefficient_of_variation / MEASUREMENT_CV_POOR)
+
+     # 2. Median-mean consistency
+     median_mean_diff = abs(median_px - mean_px) / (median_px + 1e-8)
+     consistency_score = max(0, 1.0 - median_mean_diff / MEASUREMENT_CONSISTENCY_THRESHOLD)
+
+     # 3. Outlier ratio (measurements far from the median)
+     outlier_threshold = MEASUREMENT_OUTLIER_STD_MULTIPLIER * std_px
+     outliers = np.sum(np.abs(widths_px - median_px) > outlier_threshold)
+     outlier_ratio = outliers / len(widths_px)
+     outlier_score = max(0, 1.0 - outlier_ratio)
+
+     # 4. Realistic range check
+     if MEASUREMENT_WIDTH_TYPICAL_MIN <= median_width_cm <= MEASUREMENT_WIDTH_TYPICAL_MAX:
+         range_score = MEASUREMENT_RANGE_SCORE_IDEAL
+     elif MEASUREMENT_WIDTH_ABSOLUTE_MIN <= median_width_cm <= MEASUREMENT_WIDTH_ABSOLUTE_MAX:
+         # Borderline acceptable
+         range_score = MEASUREMENT_RANGE_SCORE_BORDERLINE
+     else:
+         # Outside the realistic range
+         range_score = MEASUREMENT_RANGE_SCORE_OUTSIDE
+
+     # Combine components with weights
+     measurement_conf = (
+         MEASUREMENT_WEIGHT_VARIANCE * variance_score +
+         MEASUREMENT_WEIGHT_CONSISTENCY * consistency_score +
+         MEASUREMENT_WEIGHT_OUTLIERS * outlier_score +
+         MEASUREMENT_WEIGHT_RANGE * range_score
+     )
+
+     return float(np.clip(measurement_conf, 0, 1))
+
+
+ def compute_edge_quality_confidence(
+     edge_quality_data: Optional[Dict[str, Any]] = None
+ ) -> float:
+     """
+     Compute the confidence score from edge quality (v1 Sobel method).
+
+     Args:
+         edge_quality_data: Output from compute_edge_quality_score(),
+             or None if using the contour method (v0)
+
+     Returns:
+         Edge quality confidence score [0, 1].
+         Returns 1.0 for the contour method (not applicable).
+     """
+     if edge_quality_data is None:
+         # Contour method: edge quality not applicable
+         return 1.0
+
+     # Use the overall edge quality score directly;
+     # it is already a weighted combination of 4 metrics
+     edge_conf = edge_quality_data.get("overall_score", 0.0)
+
+     return float(np.clip(edge_conf, 0, 1))
+
+
+ def compute_overall_confidence(
+     card_confidence: float,
+     finger_confidence: float,
+     measurement_confidence: float,
+     edge_method: EdgeMethod = "contour",
+     edge_quality_confidence: Optional[float] = None,
+ ) -> Dict[str, Any]:
+     """
+     Compute the overall confidence by combining component scores.
+
+     Supports both v0 (contour) and v1 (Sobel) confidence calculation:
+     - v0 (contour): 3 components with V0_WEIGHT_* constants
+     - v1 (sobel): 4 components with V1_WEIGHT_* constants
+
+     Uses constants:
+     - V0_WEIGHT_*: v0 component weights (card: 30%, finger: 30%, measurement: 40%)
+     - V1_WEIGHT_*: v1 component weights (card: 25%, finger: 25%, edge: 20%, measurement: 30%)
+     - CONFIDENCE_LEVEL_*_THRESHOLD: Level thresholds (high: > 0.85, medium: >= 0.6)
+
+     Args:
+         card_confidence: Card detection confidence
+         finger_confidence: Finger detection confidence
+         measurement_confidence: Measurement stability confidence
+         edge_method: Edge detection method used
+         edge_quality_confidence: Edge quality confidence (v1 only)
+
+     Returns:
+         Dictionary containing:
+         - overall: Overall confidence [0, 1]
+         - card: Card component score
+         - finger: Finger component score
+         - measurement: Measurement component score
+         - edge_quality: Edge quality score (v1 only, None for v0)
+         - level: "high", "medium", or "low"
+         - method: Edge method used
+     """
+     result = {
+         "card": float(card_confidence),
+         "finger": float(finger_confidence),
+         "measurement": float(measurement_confidence),
+         "method": edge_method,
+     }
+
+     # Calculate the overall confidence based on the method
+     if edge_method == "sobel" and edge_quality_confidence is not None:
+         # v1 scoring: 4 components with V1_WEIGHT_* constants
+         overall = (
+             V1_WEIGHT_CARD * card_confidence +
+             V1_WEIGHT_FINGER * finger_confidence +
+             V1_WEIGHT_EDGE_QUALITY * edge_quality_confidence +
+             V1_WEIGHT_MEASUREMENT * measurement_confidence
+         )
+         result["edge_quality"] = float(edge_quality_confidence)
+     else:
+         # v0 scoring: 3 components with V0_WEIGHT_* constants (contour method or sobel fallback)
+         overall = (
+             V0_WEIGHT_CARD * card_confidence +
+             V0_WEIGHT_FINGER * finger_confidence +
+             V0_WEIGHT_MEASUREMENT * measurement_confidence
+         )
+         result["edge_quality"] = None
+
+     overall = float(np.clip(overall, 0, 1))
+
+     # Classify the confidence level using threshold constants
+     if overall > CONFIDENCE_LEVEL_HIGH_THRESHOLD:
+         level = "high"
+     elif overall >= CONFIDENCE_LEVEL_MEDIUM_THRESHOLD:
+         level = "medium"
+     else:
+         level = "low"
+
+     result["overall"] = overall
+     result["level"] = level
+
+     return result
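
The v0 branch of `compute_overall_confidence` reduces to a weighted sum plus a threshold classification. As a minimal standalone sketch (the helper name `overall_v0` is illustrative; the weights and thresholds are the V0/level constants from confidence_constants.py, inlined here):

```python
# Illustrative sketch of v0 overall-confidence scoring (assumed helper name).
V0_WEIGHT_CARD, V0_WEIGHT_FINGER, V0_WEIGHT_MEASUREMENT = 0.30, 0.30, 0.40
HIGH_THRESHOLD, MEDIUM_THRESHOLD = 0.85, 0.6

def overall_v0(card: float, finger: float, measurement: float):
    """Return (overall_score, level) for the contour (v0) method."""
    overall = (V0_WEIGHT_CARD * card
               + V0_WEIGHT_FINGER * finger
               + V0_WEIGHT_MEASUREMENT * measurement)
    overall = min(max(overall, 0.0), 1.0)  # clamp to [0, 1]
    level = ("high" if overall > HIGH_THRESHOLD
             else "medium" if overall >= MEDIUM_THRESHOLD
             else "low")
    return overall, level

# 0.3*0.9 + 0.3*0.8 + 0.4*0.95 = 0.89 -> "high"
score, level = overall_v0(0.9, 0.8, 0.95)
```

Because each component is already clipped to [0, 1] and the weights sum to 1, the combined score stays in [0, 1] before the final clamp; the clamp only guards against malformed inputs.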
src/confidence_constants.py ADDED
@@ -0,0 +1,87 @@
+ """
+ Constants for the confidence scoring module.
+
+ This module contains thresholds and weights used in confidence calculation
+ for card detection, finger detection, and measurement stability.
+ """
+
+ # =============================================================================
+ # Card Confidence Constants
+ # =============================================================================
+
+ # Ideal credit card aspect ratio (ISO/IEC 7810 ID-1)
+ CARD_IDEAL_ASPECT_RATIO = 85.60 / 53.98  # ≈ 1.586
+
+ # Maximum acceptable aspect ratio deviation (fraction)
+ CARD_MAX_ASPECT_DEVIATION = 0.15  # 15%
+
+ # Card confidence component weights
+ CARD_WEIGHT_DETECTION = 0.5   # Detection quality: 50%
+ CARD_WEIGHT_ASPECT = 0.25     # Aspect ratio: 25%
+ CARD_WEIGHT_SCALE = 0.25      # Scale calibration: 25%
+
+
+ # =============================================================================
+ # Finger Confidence Constants
+ # =============================================================================
+
+ # Ideal mask area as a fraction of total image area
+ FINGER_IDEAL_MIN_AREA_FRACTION = 0.005  # 0.5% of image
+ FINGER_IDEAL_MAX_AREA_FRACTION = 0.05   # 5% of image
+
+ # Finger confidence component weights
+ FINGER_WEIGHT_HAND_DETECTION = 0.7  # Hand detection: 70%
+ FINGER_WEIGHT_MASK_VALIDITY = 0.3   # Mask validity: 30%
+
+
+ # =============================================================================
+ # Measurement Confidence Constants
+ # =============================================================================
+
+ # Coefficient of variation thresholds
+ # CV = std_dev / mean
+ MEASUREMENT_CV_EXCELLENT = 0.05  # CV < 0.05 is excellent
+ MEASUREMENT_CV_POOR = 0.15       # CV < 0.15 is acceptable
+
+ # Median-mean consistency threshold (fractional difference)
+ MEASUREMENT_CONSISTENCY_THRESHOLD = 0.1  # 10% difference acceptable
+
+ # Outlier detection threshold (multiples of std dev)
+ MEASUREMENT_OUTLIER_STD_MULTIPLIER = 2.0
+
+ # Realistic finger width range (cm)
+ MEASUREMENT_WIDTH_TYPICAL_MIN = 1.4   # Typical minimum
+ MEASUREMENT_WIDTH_TYPICAL_MAX = 2.4   # Typical maximum
+ MEASUREMENT_WIDTH_ABSOLUTE_MIN = 1.0  # Absolute minimum (borderline)
+ MEASUREMENT_WIDTH_ABSOLUTE_MAX = 3.0  # Absolute maximum (borderline)
+
+ # Measurement confidence component weights
+ MEASUREMENT_WEIGHT_VARIANCE = 0.4     # Variance: 40%
+ MEASUREMENT_WEIGHT_CONSISTENCY = 0.2  # Consistency: 20%
+ MEASUREMENT_WEIGHT_OUTLIERS = 0.2     # Outliers: 20%
+ MEASUREMENT_WEIGHT_RANGE = 0.2        # Range: 20%
+
+ # Range score values
+ MEASUREMENT_RANGE_SCORE_IDEAL = 1.0       # Within the typical range
+ MEASUREMENT_RANGE_SCORE_BORDERLINE = 0.7  # Within the absolute range
+ MEASUREMENT_RANGE_SCORE_OUTSIDE = 0.3     # Outside the realistic range
+
+
+ # =============================================================================
+ # Overall Confidence Constants
+ # =============================================================================
+
+ # v0 (contour method) component weights
+ V0_WEIGHT_CARD = 0.30         # Card: 30%
+ V0_WEIGHT_FINGER = 0.30       # Finger: 30%
+ V0_WEIGHT_MEASUREMENT = 0.40  # Measurement: 40%
+
+ # v1 (Sobel method) component weights
+ V1_WEIGHT_CARD = 0.25          # Card: 25%
+ V1_WEIGHT_FINGER = 0.25        # Finger: 25%
+ V1_WEIGHT_EDGE_QUALITY = 0.20  # Edge quality: 20%
+ V1_WEIGHT_MEASUREMENT = 0.30   # Measurement: 30%
+
+ # Confidence level thresholds
+ CONFIDENCE_LEVEL_HIGH_THRESHOLD = 0.85   # > 0.85 = high
+ CONFIDENCE_LEVEL_MEDIUM_THRESHOLD = 0.6  # >= 0.6 = medium, < 0.6 = low
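
Each weight group above is intended to sum to 1.0 so that component scores in [0, 1] combine into an overall score that also stays in [0, 1]. A quick standalone sanity check (the tuple names are illustrative; the values are copied from the constants above):

```python
# Sanity check: every weight group should sum to 1.0 (within float tolerance).
CARD_WEIGHTS = (0.5, 0.25, 0.25)          # detection, aspect, scale
FINGER_WEIGHTS = (0.7, 0.3)               # hand detection, mask validity
MEASUREMENT_WEIGHTS = (0.4, 0.2, 0.2, 0.2)  # variance, consistency, outliers, range
V0_WEIGHTS = (0.30, 0.30, 0.40)           # card, finger, measurement
V1_WEIGHTS = (0.25, 0.25, 0.20, 0.30)     # card, finger, edge quality, measurement

ALL_GROUPS = (CARD_WEIGHTS, FINGER_WEIGHTS, MEASUREMENT_WEIGHTS,
              V0_WEIGHTS, V1_WEIGHTS)
for group in ALL_GROUPS:
    assert abs(sum(group) - 1.0) < 1e-9, group
```

A check like this could live in the test suite to catch a weight being retuned without its siblings being rebalanced.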
src/debug_observer.py ADDED
@@ -0,0 +1,1283 @@
1
+ """
2
+ Debug visualization observer for the ring measurement pipeline.
3
+
4
+ This module provides a non-intrusive way to capture and visualize intermediate
5
+ processing stages without polluting core algorithm implementations.
6
+
7
+ It also contains all drawing utility functions used for debug visualizations.
8
+ """
9
+
10
+ import cv2
11
+ import numpy as np
12
+ from typing import Optional, Dict, Any, Callable, List, Tuple
13
+ from pathlib import Path
14
+
15
+ # Import visualization constants
16
+ from src.viz_constants import (
17
+ FONT_FACE, FontScale, FontThickness, Color, Size, Layout
18
+ )
19
+
20
+
21
+ class DebugObserver:
22
+ """
23
+ Observer for capturing and saving intermediate processing stages.
24
+
25
+ This class provides methods to save images and visualizations during
26
+ algorithm execution without requiring core functions to handle I/O directly.
27
+ """
28
+
29
+ def __init__(self, debug_dir: str):
30
+ """
31
+ Initialize debug observer.
32
+
33
+ Args:
34
+ debug_dir: Directory where debug images will be saved
35
+ """
36
+ self.debug_dir = Path(debug_dir)
37
+ self.debug_dir.mkdir(parents=True, exist_ok=True)
38
+ self._stage_counter = {}
39
+
40
+ def save_stage(self, name: str, image: np.ndarray) -> None:
41
+ """
42
+ Save an intermediate processing stage image.
43
+
44
+ Args:
45
+ name: Stage name (used as filename prefix)
46
+ image: Image to save
47
+ """
48
+ if image is None or image.size == 0:
49
+ return
50
+
51
+ # Add counter for stages with multiple saves
52
+ if name in self._stage_counter:
53
+ self._stage_counter[name] += 1
54
+ filename = f"{name}_{self._stage_counter[name]}.png"
55
+ else:
56
+ self._stage_counter[name] = 0
57
+ filename = f"{name}.png"
58
+
59
+ self._save_with_compression(image, filename)
60
+
61
+ def draw_and_save(self, name: str, image: np.ndarray,
62
+ draw_func: Callable, *args, **kwargs) -> None:
63
+ """
64
+ Apply a drawing function to an image and save the result.
65
+
66
+ Args:
67
+ name: Stage name for the output file
68
+ image: Base image to draw on
69
+ draw_func: Function that takes (image, *args, **kwargs) and returns annotated image
70
+ *args, **kwargs: Arguments to pass to draw_func
71
+ """
72
+ if image is None or image.size == 0:
73
+ return
74
+
75
+ annotated = draw_func(image, *args, **kwargs)
76
+ self.save_stage(name, annotated)
77
+
78
+ def _save_with_compression(self, image: np.ndarray, filename: str) -> None:
79
+ """
80
+ Save image with compression and optional downsampling.
81
+
82
+ Args:
83
+ image: Image to save
84
+ filename: Output filename
85
+ """
86
+ output_path = self.debug_dir / filename
87
+
88
+ # Downsample if too large (max 1920px dimension)
89
+ h, w = image.shape[:2]
90
+ max_dim = 1920
91
+ if max(h, w) > max_dim:
92
+ scale = max_dim / max(h, w)
93
+ new_w = int(w * scale)
94
+ new_h = int(h * scale)
95
+ image = cv2.resize(image, (new_w, new_h), interpolation=cv2.INTER_AREA)
96
+
97
+ # PNG compression
98
+ cv2.imwrite(str(output_path), image, [cv2.IMWRITE_PNG_COMPRESSION, 6])
99
+
100
+
101
+ # Backward compatibility helper
102
+ def save_debug_image(image: np.ndarray, filename: str, debug_dir: Optional[str]) -> None:
103
+ """
104
+ Legacy function for saving debug images.
105
+
106
+ This function is kept for backward compatibility during migration.
107
+ New code should use DebugObserver directly.
108
+
109
+ Args:
110
+ image: Image to save
111
+ filename: Output filename
112
+ debug_dir: Directory to save to (if None, skip saving)
113
+ """
114
+ if debug_dir is None:
115
+ return
116
+
117
+ observer = DebugObserver(debug_dir)
118
+ observer._save_with_compression(image, filename)
119
+
120
+
121
+ # =============================================================================
122
+ # Drawing Functions for Debug Visualization
123
+ # =============================================================================
124
+
125
+ # Hand landmark and finger constants (from finger_segmentation.py)
126
+ FINGER_LANDMARKS = {
127
+ "index": [5, 6, 7, 8],
128
+ "middle": [9, 10, 11, 12],
129
+ "ring": [13, 14, 15, 16],
130
+ "pinky": [17, 18, 19, 20],
131
+ }
132
+
133
+ THUMB_LANDMARKS = [1, 2, 3, 4]
134
+
135
+ HAND_CONNECTIONS = [
136
+ # Palm
137
+ (0, 1), (0, 5), (0, 17), (5, 9), (9, 13), (13, 17),
138
+ # Thumb
139
+ (1, 2), (2, 3), (3, 4),
140
+ # Index
141
+ (5, 6), (6, 7), (7, 8),
142
+ # Middle
143
+ (9, 10), (10, 11), (11, 12),
144
+ # Ring
145
+ (13, 14), (14, 15), (15, 16),
146
+ # Pinky
147
+ (17, 18), (18, 19), (19, 20),
148
+ ]
149
+
150
+ FINGER_COLORS = {
151
+ "thumb": Color.RED,
152
+ "index": Color.CYAN,
153
+ "middle": Color.YELLOW,
154
+ "ring": Color.MAGENTA,
155
+ "pinky": Color.ORANGE,
156
+ }
157
+
158
+
+ # --- Finger Segmentation Drawing Functions ---
+
+ def draw_landmarks_overlay(image: np.ndarray, landmarks: np.ndarray, label: bool = True) -> np.ndarray:
+     """
+     Draw hand landmarks as numbered circles.
+
+     Args:
+         image: Input image
+         landmarks: 21x2 array of landmark positions
+         label: Whether to draw landmark numbers
+
+     Returns:
+         Image with landmarks drawn
+     """
+     overlay = image.copy()
+
+     for i, (x, y) in enumerate(landmarks):
+         # Draw circle
+         cv2.circle(overlay, (int(x), int(y)), Size.ENDPOINT_RADIUS, Color.GREEN, -1)
+         cv2.circle(overlay, (int(x), int(y)), Size.ENDPOINT_RADIUS, Color.BLACK, 2)
+
+         # Draw number
+         if label:
+             text = str(i)
+             text_size = cv2.getTextSize(text, FONT_FACE, FontScale.SMALL, FontThickness.BODY)[0]
+             text_x = int(x - text_size[0] / 2)
+             text_y = int(y + text_size[1] / 2)
+
+             # Black outline
+             cv2.putText(overlay, text, (text_x, text_y), FONT_FACE, FontScale.SMALL,
+                         Color.BLACK, FontThickness.BODY + 2, cv2.LINE_AA)
+             # White text
+             cv2.putText(overlay, text, (text_x, text_y), FONT_FACE, FontScale.SMALL,
+                         Color.WHITE, FontThickness.BODY, cv2.LINE_AA)
+
+     return overlay
+
+
+ def draw_hand_skeleton(image: np.ndarray, landmarks: np.ndarray) -> np.ndarray:
+     """
+     Draw hand skeleton with connections between landmarks.
+
+     Args:
+         image: Input image
+         landmarks: 21x2 array of landmark positions
+
+     Returns:
+         Image with skeleton drawn
+     """
+     overlay = image.copy()
+
+     # Draw connections
+     for idx1, idx2 in HAND_CONNECTIONS:
+         pt1 = (int(landmarks[idx1, 0]), int(landmarks[idx1, 1]))
+         pt2 = (int(landmarks[idx2, 0]), int(landmarks[idx2, 1]))
+         cv2.line(overlay, pt1, pt2, Color.CYAN, Size.LINE_THICK, cv2.LINE_AA)
+
+     # Draw landmarks on top
+     for x, y in landmarks:
+         cv2.circle(overlay, (int(x), int(y)), Size.CORNER_RADIUS, Color.GREEN, -1)
+         cv2.circle(overlay, (int(x), int(y)), Size.CORNER_RADIUS, Color.BLACK, 2)
+
+     return overlay
+
+
+ def draw_detection_info(image: np.ndarray, confidence: float, handedness: str, rotation: int) -> np.ndarray:
+     """
+     Draw detection metadata on image.
+
+     Args:
+         image: Input image
+         confidence: Detection confidence (0-1)
+         handedness: "Left" or "Right"
+         rotation: Rotation code (0, 1, 2, 3)
+
+     Returns:
+         Image with text overlay
+     """
+     overlay = image.copy()
+
+     # ASCII labels only: cv2.putText cannot render non-ASCII glyphs such as the degree sign
+     rotation_names = {0: "None", 1: "90 deg CW", 2: "180 deg", 3: "90 deg CCW"}
+     rotation_name = rotation_names.get(rotation, "Unknown")
+
+     lines = [
+         f"Confidence: {confidence:.3f}",
+         f"Hand: {handedness}",
+         f"Rotation: {rotation_name}",
+     ]
+
+     y = Layout.TITLE_Y
+     for line in lines:
+         # Black outline
+         cv2.putText(overlay, line, (Layout.TEXT_OFFSET_X, y), FONT_FACE, FontScale.BODY,
+                     Color.BLACK, FontThickness.LABEL_OUTLINE, cv2.LINE_AA)
+         # White text
+         cv2.putText(overlay, line, (Layout.TEXT_OFFSET_X, y), FONT_FACE, FontScale.BODY,
+                     Color.WHITE, FontThickness.LABEL, cv2.LINE_AA)
+         y += Layout.LINE_SPACING
+
+     return overlay
+
+
+ def draw_finger_regions(image: np.ndarray, landmarks: np.ndarray) -> np.ndarray:
+     """
+     Draw individual finger regions in different colors.
+
+     Args:
+         image: Input image
+         landmarks: 21x2 array of landmark positions
+
+     Returns:
+         Image with colored finger regions
+     """
+     h, w = image.shape[:2]
+     overlay = image.copy()
+     mask_overlay = np.zeros((h, w, 3), dtype=np.uint8)
+
+     # Draw thumb
+     thumb_pts = landmarks[THUMB_LANDMARKS].astype(np.int32)
+     cv2.fillConvexPoly(mask_overlay, thumb_pts, FINGER_COLORS["thumb"])
+
+     # Draw each finger
+     for finger_name, indices in FINGER_LANDMARKS.items():
+         finger_pts = landmarks[indices].astype(np.int32)
+         cv2.fillConvexPoly(mask_overlay, finger_pts, FINGER_COLORS[finger_name])
+
+     # Blend with original
+     overlay = cv2.addWeighted(overlay, 0.6, mask_overlay, 0.4, 0)
+
+     return overlay
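The 60/40 blend done here via `cv2.addWeighted` is, for same-shaped uint8 images, a saturated weighted sum. A numpy-only sketch of the same operation (the arrays and the `blend` helper are illustrative):

```python
import numpy as np

def blend(a: np.ndarray, b: np.ndarray, alpha: float = 0.6) -> np.ndarray:
    """Weighted blend of two uint8 images, rounded and clipped back to uint8
    (mirrors cv2.addWeighted(a, alpha, b, 1 - alpha, 0))."""
    out = a.astype(np.float32) * alpha + b.astype(np.float32) * (1.0 - alpha)
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

a = np.full((2, 2, 3), 200, dtype=np.uint8)
b = np.full((2, 2, 3), 100, dtype=np.uint8)
blended = blend(a, b)  # 0.6*200 + 0.4*100 = 160 everywhere
```
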
+
+
+ def draw_extension_scores(image: np.ndarray, scores: Dict[str, float], selected: str) -> np.ndarray:
+     """
+     Draw finger extension scores.
+
+     Args:
+         image: Input image
+         scores: Dict mapping finger name to extension score
+         selected: Name of selected finger
+
+     Returns:
+         Image with scores drawn
+     """
+     overlay = image.copy()
+
+     # Sort by score
+     sorted_fingers = sorted(scores.items(), key=lambda x: x[1], reverse=True)
+
+     y = Layout.TITLE_Y
+     for finger_name, score in sorted_fingers:
+         is_selected = (finger_name == selected)
+         color = Color.GREEN if is_selected else Color.WHITE
+         # ASCII marker only: cv2.putText cannot render a Unicode check mark
+         text = f"{finger_name.capitalize()}: {score:.1f}" + (" <selected>" if is_selected else "")
+
+         # Black outline
+         cv2.putText(overlay, text, (Layout.TEXT_OFFSET_X, y), FONT_FACE, FontScale.BODY,
+                     Color.BLACK, FontThickness.LABEL_OUTLINE, cv2.LINE_AA)
+         # Colored text
+         cv2.putText(overlay, text, (Layout.TEXT_OFFSET_X, y), FONT_FACE, FontScale.BODY,
+                     color, FontThickness.LABEL, cv2.LINE_AA)
+         y += Layout.LINE_SPACING
+
+     return overlay
+
+
+ def draw_component_stats(image: np.ndarray, labels: np.ndarray, stats: np.ndarray,
+                          selected_idx: int) -> np.ndarray:
+     """
+     Draw connected component statistics.
+
+     Args:
+         image: Input image
+         labels: Connected component labels
+         stats: Component statistics from cv2.connectedComponentsWithStats
+         selected_idx: Index of selected component
+
+     Returns:
+         Image with colored components and stats
+     """
+     overlay = image.copy()
+
+     # Create colored component visualization
+     num_labels = stats.shape[0]
+     colors = np.random.randint(0, 255, size=(num_labels, 3), dtype=np.uint8)
+     colors[0] = [0, 0, 0]            # Background is black
+     colors[selected_idx] = Color.GREEN  # Selected is green
+
+     colored = colors[labels]
+     overlay = cv2.addWeighted(overlay, 0.5, colored, 0.5, 0)
+
+     # Draw text stats
+     y = Layout.TITLE_Y
+     lines = [
+         f"Components: {num_labels - 1}",  # Exclude background
+         f"Selected area: {stats[selected_idx, cv2.CC_STAT_AREA]} px",
+     ]
+
+     for line in lines:
+         cv2.putText(overlay, line, (Layout.TEXT_OFFSET_X, y), FONT_FACE, FontScale.BODY,
+                     Color.BLACK, FontThickness.LABEL_OUTLINE, cv2.LINE_AA)
+         cv2.putText(overlay, line, (Layout.TEXT_OFFSET_X, y), FONT_FACE, FontScale.BODY,
+                     Color.WHITE, FontThickness.LABEL, cv2.LINE_AA)
+         y += Layout.LINE_SPACING
+
+     return overlay
+
+
+ # --- Card Detection Drawing Functions ---
+
+ def draw_contours_overlay(
+     image: np.ndarray,
+     contours: List[np.ndarray],
+     title: str,
+     color: Optional[Tuple[int, int, int]] = None,
+ ) -> np.ndarray:
+     """
+     Draw contours on an image overlay.
+
+     Args:
+         image: Original image
+         contours: List of contours to draw
+         title: Title for the visualization
+         color: BGR color for contours (default: Color.GREEN)
+
+     Returns:
+         Annotated image
+     """
+     if color is None:
+         color = Color.GREEN
+
+     overlay = image.copy()
+
+     # Draw all quadrilateral contours
+     for contour in contours:
+         if len(contour) == 4:
+             pts = contour.reshape(4, 2).astype(np.int32)
+             cv2.polylines(overlay, [pts], True, color, Size.CONTOUR_NORMAL)
+
+     # Add title with outline for visibility
+     cv2.putText(
+         overlay, title, (Layout.TEXT_OFFSET_X, Layout.TITLE_Y),
+         FONT_FACE, FontScale.TITLE, Color.WHITE,
+         FontThickness.TITLE_OUTLINE, cv2.LINE_AA
+     )
+     cv2.putText(
+         overlay, title, (Layout.TEXT_OFFSET_X, Layout.TITLE_Y),
+         FONT_FACE, FontScale.TITLE, color,
+         FontThickness.TITLE, cv2.LINE_AA
+     )
+
+     # Add count with outline
+     count_text = f"Candidates: {len(contours)}"
+     cv2.putText(
+         overlay, count_text, (Layout.TEXT_OFFSET_X, Layout.SUBTITLE_Y),
+         FONT_FACE, FontScale.SUBTITLE, Color.WHITE,
+         FontThickness.SUBTITLE_OUTLINE, cv2.LINE_AA
+     )
+     cv2.putText(
+         overlay, count_text, (Layout.TEXT_OFFSET_X, Layout.SUBTITLE_Y),
+         FONT_FACE, FontScale.SUBTITLE, color,
+         FontThickness.SUBTITLE, cv2.LINE_AA
+     )
+
+     return overlay
+
+
+ def draw_candidates_with_scores(
+     image: np.ndarray,
+     candidates: List[Tuple[np.ndarray, float, Dict[str, Any]]],
+     title: str,
+ ) -> np.ndarray:
+     """
+     Draw candidate contours with scores and details.
+
+     Args:
+         image: Original image
+         candidates: List of (corners, score, details) tuples
+         title: Title for the visualization
+
+     Returns:
+         Annotated image
+     """
+     overlay = image.copy()
+
+     # Color palette for candidate ranking (best first)
+     colors = [
+         Color.GREEN,
+         Color.YELLOW,
+         Color.ORANGE,
+         Color.MAGENTA,
+         Color.PINK,
+     ]
+
+     for idx, (corners, score, details) in enumerate(candidates):
+         color = colors[idx % len(colors)]
+
+         # Draw quadrilateral
+         pts = corners.reshape(4, 2).astype(np.int32)
+         cv2.polylines(overlay, [pts], True, color, Size.CONTOUR_NORMAL)
+
+         # Draw corner circles
+         for pt in pts:
+             cv2.circle(overlay, tuple(pt), Size.CORNER_RADIUS, color, -1)
+
+         # Prepare annotation text
+         if score > 0:
+             aspect_ratio = details.get("aspect_ratio", 0)
+             area_ratio = details.get("area", 0) / (image.shape[0] * image.shape[1])
+             text = f"#{idx+1} Score:{score:.2f} AR:{aspect_ratio:.2f} Area:{area_ratio:.2%}"
+         else:
+             reject_reason = details.get("reject_reason", "unknown")
+             text = f"#{idx+1} REJECT: {reject_reason}"
+
+         # Position text near first corner
+         text_pos = (int(pts[0][0]) + 10, int(pts[0][1]) - 10)
+
+         # Draw text with outline for visibility
+         cv2.putText(
+             overlay, text, text_pos,
+             FONT_FACE, FontScale.LABEL, Color.BLACK,
+             FontThickness.LABEL_OUTLINE, cv2.LINE_AA
+         )
+         cv2.putText(
+             overlay, text, text_pos,
+             FONT_FACE, FontScale.LABEL, color,
+             FontThickness.LABEL, cv2.LINE_AA
+         )
+
+     # Add title with outline
+     cv2.putText(
+         overlay, title, (Layout.TEXT_OFFSET_X, Layout.TITLE_Y),
+         FONT_FACE, FontScale.TITLE, Color.WHITE,
+         FontThickness.TITLE_OUTLINE, cv2.LINE_AA
+     )
+     cv2.putText(
+         overlay, title, (Layout.TEXT_OFFSET_X, Layout.TITLE_Y),
+         FONT_FACE, FontScale.TITLE, Color.CYAN,
+         FontThickness.TITLE, cv2.LINE_AA
+     )
+
+     return overlay
+
+
+ # --- Edge Refinement Drawing Functions (v1 Phase 5) ---
+
+ def draw_landmark_axis(
+     image: np.ndarray,
+     axis_data: Dict[str, Any],
+     finger_landmarks: Optional[np.ndarray]
+ ) -> np.ndarray:
+     """
+     Draw finger landmarks with axis overlay.
+
+     Shows:
+     - 4 finger landmarks (MCP, PIP, DIP, TIP)
+     - Calculated finger axis
+     - Axis endpoints
+     - Landmark-based vs PCA method indicator
+     """
+     vis = image.copy()
+
+     # Draw finger landmarks if available
+     if finger_landmarks is not None and len(finger_landmarks) == 4:
+         landmark_names = ["MCP", "PIP", "DIP", "TIP"]
+         for landmark, name in zip(finger_landmarks, landmark_names):
+             pt = tuple(landmark.astype(int))
+             # Draw landmark
+             cv2.circle(vis, pt, Size.ENDPOINT_RADIUS, Color.YELLOW, -1)
+             cv2.circle(vis, pt, Size.ENDPOINT_RADIUS, Color.BLACK, 2)
+             # Draw label
+             cv2.putText(
+                 vis, name, (pt[0] + 20, pt[1] - 20),
+                 FONT_FACE, FontScale.LABEL,
+                 Color.BLACK, FontThickness.LABEL_OUTLINE
+             )
+             cv2.putText(
+                 vis, name, (pt[0] + 20, pt[1] - 20),
+                 FONT_FACE, FontScale.LABEL,
+                 Color.YELLOW, FontThickness.LABEL
+             )
+
+     # Draw axis line
+     # Use actual anatomical endpoints (MCP to TIP) if available
+     if "palm_end" in axis_data and "tip_end" in axis_data:
+         start = axis_data["palm_end"]  # MCP (palm-side)
+         end = axis_data["tip_end"]     # TIP (fingertip)
+     else:
+         # Fallback to geometric center method (for PCA or old data)
+         center = axis_data["center"]
+         direction = axis_data["direction"]
+         length = axis_data["length"]
+         start = center - direction * (length / 2.0)
+         end = center + direction * (length / 2.0)
+
+     # Draw axis
+     cv2.line(
+         vis,
+         tuple(start.astype(int)),
+         tuple(end.astype(int)),
+         Color.CYAN, Size.LINE_THICK
+     )
+
+     # Draw endpoints
+     cv2.circle(vis, tuple(start.astype(int)), Size.ENDPOINT_RADIUS, Color.CYAN, -1)
+     cv2.circle(vis, tuple(end.astype(int)), Size.ENDPOINT_RADIUS, Color.MAGENTA, -1)
+
+     # Add method indicator
+     method = axis_data.get("method", "unknown")
+     text = f"Axis Method: {method}"
+     cv2.putText(
+         vis, text, (50, 100),
+         FONT_FACE, FontScale.TITLE,
+         Color.BLACK, FontThickness.TITLE_OUTLINE
+     )
+     cv2.putText(
+         vis, text, (50, 100),
+         FONT_FACE, FontScale.TITLE,
+         Color.CYAN, FontThickness.TITLE
+     )
+
+     return vis
+
+
+ def draw_ring_zone_roi(
+     image: np.ndarray,
+     zone_data: Dict[str, Any],
+     roi_bounds: Tuple[int, int, int, int]
+ ) -> np.ndarray:
+     """
+     Draw ring zone and ROI bounds.
+
+     Shows:
+     - Ring-wearing zone band
+     - ROI bounding box
+     - Zone start/end points
+     """
+     vis = image.copy()
+
+     # Draw ring zone
+     start_point = zone_data["start_point"]
+     end_point = zone_data["end_point"]
+
+     cv2.circle(vis, tuple(start_point.astype(int)), Size.ENDPOINT_RADIUS, Color.GREEN, -1)
+     cv2.circle(vis, tuple(end_point.astype(int)), Size.ENDPOINT_RADIUS, Color.RED, -1)
+     cv2.line(
+         vis,
+         tuple(start_point.astype(int)),
+         tuple(end_point.astype(int)),
+         Color.YELLOW, Size.LINE_THICK * 2
+     )
+
+     # Draw ROI bounding box
+     x_min, y_min, x_max, y_max = roi_bounds
+     cv2.rectangle(vis, (x_min, y_min), (x_max, y_max), Color.GREEN, Size.LINE_THICK)
+
+     # Add labels
+     text = "Ring Zone + ROI Bounds"
+     cv2.putText(
+         vis, text, (50, 100),
+         FONT_FACE, FontScale.TITLE,
+         Color.BLACK, FontThickness.TITLE_OUTLINE
+     )
+     cv2.putText(
+         vis, text, (50, 100),
+         FONT_FACE, FontScale.TITLE,
+         Color.GREEN, FontThickness.TITLE
+     )
+
+     return vis
+
+
+ def draw_roi_extraction(
+     roi_image: np.ndarray,
+     roi_mask: Optional[np.ndarray]
+ ) -> np.ndarray:
+     """
+     Draw extracted ROI with optional mask overlay.
+     """
+     # Convert grayscale to BGR for visualization
+     if len(roi_image.shape) == 2:
+         vis = cv2.cvtColor(roi_image, cv2.COLOR_GRAY2BGR)
+     else:
+         vis = roi_image.copy()
+
+     # Overlay mask if available
+     if roi_mask is not None:
+         mask_colored = np.zeros_like(vis)
+         mask_colored[:, :, 1] = roi_mask  # Green channel
+         vis = cv2.addWeighted(vis, 0.7, mask_colored, 0.3, 0)
+
+     return vis
+
+
+ def draw_gradient_visualization(
+     gradient: np.ndarray,
+     colormap: int = cv2.COLORMAP_JET
+ ) -> np.ndarray:
+     """
+     Visualize gradient with color mapping.
+     """
+     grad_vis = np.clip(gradient, 0, 255).astype(np.uint8)
+     return cv2.applyColorMap(grad_vis, colormap)
+
+
+ def draw_edge_candidates(
+     roi_image: np.ndarray,
+     gradient_magnitude: np.ndarray,
+     threshold: float
+ ) -> np.ndarray:
+     """
+     Draw all pixels above gradient threshold (raw threshold, before spatial filtering).
+
+     This shows ALL pixels where gradient > threshold, including background noise.
+     Use draw_filtered_edge_candidates() to see only spatially-filtered candidates.
+     """
+     # Convert ROI to BGR
+     if len(roi_image.shape) == 2:
+         vis = cv2.cvtColor(roi_image, cv2.COLOR_GRAY2BGR)
+     else:
+         vis = roi_image.copy()
+
+     # Find edge candidates
+     candidates = gradient_magnitude > threshold
+
+     # Overlay candidates in cyan
+     vis[candidates] = Color.CYAN
+
+     # Add annotation explaining this is the raw threshold
+     count = np.sum(candidates)
+     text1 = f"All pixels > {threshold:.1f}"
+     text2 = "(Before spatial filtering)"
+     text3 = f"Count: {count:,}"
+
+     cv2.putText(vis, text1, (20, 40), FONT_FACE, 1.5, Color.WHITE, 4)
+     cv2.putText(vis, text1, (20, 40), FONT_FACE, 1.5, Color.BLACK, 2)
+
+     cv2.putText(vis, text2, (20, 80), FONT_FACE, 1.2, Color.WHITE, 4)
+     cv2.putText(vis, text2, (20, 80), FONT_FACE, 1.2, Color.YELLOW, 2)
+
+     cv2.putText(vis, text3, (20, 120), FONT_FACE, 1.2, Color.WHITE, 4)
+     cv2.putText(vis, text3, (20, 120), FONT_FACE, 1.2, Color.CYAN, 2)
+
+     return vis
+
+
+ def draw_filtered_edge_candidates(
+     roi_image: np.ndarray,
+     gradient_magnitude: np.ndarray,
+     threshold: float,
+     roi_mask: Optional[np.ndarray],
+     axis_center: np.ndarray,
+     axis_direction: np.ndarray
+ ) -> np.ndarray:
+     """
+     Draw only the spatially-filtered edge candidates that the algorithm actually considers.
+
+     Shows pixels that pass BOTH the gradient threshold AND spatial filtering:
+     - Mask-constrained mode: within finger mask boundaries
+     - Axis-expansion mode: along the search path from the axis outward
+
+     This matches what detect_edges_per_row() actually evaluates.
+
+     Args:
+         roi_image: ROI image
+         gradient_magnitude: Gradient magnitude array
+         threshold: Gradient threshold
+         roi_mask: Optional finger mask in ROI coordinates
+         axis_center: Axis center point in ROI coordinates
+         axis_direction: Axis direction vector in ROI coordinates
+
+     Returns:
+         Visualization showing only filtered candidates
+     """
+     # Convert ROI to BGR
+     if len(roi_image.shape) == 2:
+         vis = cv2.cvtColor(roi_image, cv2.COLOR_GRAY2BGR)
+     else:
+         vis = roi_image.copy()
+
+     h, w = gradient_magnitude.shape
+
+     def get_axis_x_at_row(y: int) -> int:
+         """Calculate the axis x-coordinate at a given y using the axis center and direction."""
+         if abs(axis_direction[1]) < 1e-6:
+             # Axis is horizontal, use center x
+             return int(axis_center[0])
+
+         # Offset from the axis center along the axis direction
+         dy = y - axis_center[1]
+         dx = dy * (axis_direction[0] / axis_direction[1])
+         x = axis_center[0] + dx
+
+         return int(np.clip(x, 0, w - 1))
+
+     # MASK-CONSTRAINED MODE (if mask available)
+     if roi_mask is not None:
+         mode = "Mask-Constrained"
+         candidate_count = 0
+
+         for y in range(h):
+             row_gradient = gradient_magnitude[y, :]
+             row_mask = roi_mask[y, :]
+
+             if not np.any(row_mask):
+                 continue
+
+             # Find mask boundaries
+             mask_indices = np.where(row_mask)[0]
+             if len(mask_indices) < 2:
+                 continue
+
+             left_mask_boundary = mask_indices[0]
+             right_mask_boundary = mask_indices[-1]
+
+             # Get axis position
+             axis_x = get_axis_x_at_row(y)
+
+             # Search LEFT from axis to left mask boundary - keep the STRONGEST gradient
+             left_edge_x = None
+             left_strength = 0
+             search_start = max(left_mask_boundary, min(axis_x, w - 1))
+             for x in range(search_start, left_mask_boundary - 1, -1):
+                 if x < 0 or x >= w:
+                     continue
+                 if row_gradient[x] > threshold and row_gradient[x] > left_strength:
+                     left_edge_x = x
+                     left_strength = row_gradient[x]
+
+             # If no edge found, retry with a relaxed threshold
+             if left_edge_x is None:
+                 relaxed_threshold = threshold * 0.5
+                 for x in range(search_start, left_mask_boundary - 1, -1):
+                     if x < 0 or x >= w:
+                         continue
+                     if row_gradient[x] > relaxed_threshold and row_gradient[x] > left_strength:
+                         left_edge_x = x
+                         left_strength = row_gradient[x]
+
+             # Search RIGHT from axis to right mask boundary - keep the STRONGEST gradient
+             right_edge_x = None
+             right_strength = 0
+             search_start = min(right_mask_boundary, max(axis_x, 0))
+             for x in range(search_start, right_mask_boundary + 1):
+                 if x < 0 or x >= w:
+                     continue
+                 if row_gradient[x] > threshold and row_gradient[x] > right_strength:
+                     right_edge_x = x
+                     right_strength = row_gradient[x]
+
+             # If no edge found, retry with a relaxed threshold
+             if right_edge_x is None:
+                 relaxed_threshold = threshold * 0.5
+                 for x in range(search_start, right_mask_boundary + 1):
+                     if x < 0 or x >= w:
+                         continue
+                     if row_gradient[x] > relaxed_threshold and row_gradient[x] > right_strength:
+                         right_edge_x = x
+                         right_strength = row_gradient[x]
+
+             # Draw the SELECTED edges only (not all candidates)
+             if left_edge_x is not None:
+                 cv2.circle(vis, (left_edge_x, y), 2, Color.CYAN, -1)
+                 candidate_count += 1
+
+             if right_edge_x is not None:
+                 cv2.circle(vis, (right_edge_x, y), 2, Color.MAGENTA, -1)
+                 candidate_count += 1
+
+             # Draw axis position
+             cv2.circle(vis, (axis_x, y), 1, Color.YELLOW, -1)
+
+     # AXIS-EXPANSION MODE (no mask)
+     else:
+         mode = "Axis-Expansion"
+         candidate_count = 0
+
+         for y in range(h):
+             row_gradient = gradient_magnitude[y, :]
+             axis_x = get_axis_x_at_row(y)
+
+             if axis_x < 0 or axis_x >= w:
+                 continue
+
+             # Draw axis position
+             cv2.circle(vis, (axis_x, y), 2, Color.YELLOW, -1)
+
+             # Search LEFT from axis until first edge
+             for x in range(axis_x, -1, -1):
+                 if row_gradient[x] > threshold:
+                     cv2.circle(vis, (x, y), 2, Color.CYAN, -1)
+                     candidate_count += 1
+                     break  # Stop at first edge
+
+             # Search RIGHT from axis until first edge
+             for x in range(axis_x, w):
+                 if row_gradient[x] > threshold:
+                     cv2.circle(vis, (x, y), 2, Color.MAGENTA, -1)
+                     candidate_count += 1
+                     break  # Stop at first edge
+
+     # Add annotation
+     text1 = "Spatially-filtered candidates"
+     text2 = f"Mode: {mode}"
+     text3 = f"Count: {candidate_count:,}"
+
+     cv2.putText(vis, text1, (20, 40), FONT_FACE, 1.5, Color.WHITE, 4)
+     cv2.putText(vis, text1, (20, 40), FONT_FACE, 1.5, Color.GREEN, 2)
+
+     cv2.putText(vis, text2, (20, 80), FONT_FACE, 1.2, Color.WHITE, 4)
+     cv2.putText(vis, text2, (20, 80), FONT_FACE, 1.2, Color.YELLOW, 2)
+
+     cv2.putText(vis, text3, (20, 120), FONT_FACE, 1.2, Color.WHITE, 4)
+     cv2.putText(vis, text3, (20, 120), FONT_FACE, 1.2, Color.CYAN, 2)
+
+     # Add legend
+     legend_y = h - 80
+     cv2.putText(vis, "Yellow: Axis", (20, legend_y), FONT_FACE, 1.0, Color.YELLOW, 2)
+     cv2.putText(vis, "Cyan: Left edges", (20, legend_y + 30), FONT_FACE, 1.0, Color.CYAN, 2)
+     cv2.putText(vis, "Magenta: Right edges", (20, legend_y + 60), FONT_FACE, 1.0, Color.MAGENTA, 2)
+
+     return vis
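The per-row axis lookup in `get_axis_x_at_row` is just the line equation x = cx + (y - cy) * (dx / dy), clipped to the ROI width. A standalone, vectorized sketch of the same computation (`axis_x_per_row` and its arguments are illustrative, not module API):

```python
import numpy as np

def axis_x_per_row(h: int, w: int, center: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Axis x-coordinate for every row y in [0, h), clipped to [0, w - 1]."""
    ys = np.arange(h, dtype=np.float64)
    if abs(direction[1]) < 1e-6:
        xs = np.full(h, center[0])  # horizontal axis: constant x
    else:
        xs = center[0] + (ys - center[1]) * (direction[0] / direction[1])
    return np.clip(xs, 0, w - 1).astype(int)

# A vertical axis through x=10 yields x=10 on every row.
xs_vertical = axis_x_per_row(5, 20, np.array([10.0, 2.0]), np.array([0.0, 1.0]))
# A 45-degree axis through the origin advances one column per row.
xs_diag = axis_x_per_row(4, 20, np.array([0.0, 0.0]), np.array([1.0, 1.0]))
```
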
+
+
+ def draw_selected_edges(
+     roi_image: np.ndarray,
+     edge_data: Dict[str, Any]
+ ) -> np.ndarray:
+     """
+     Draw final selected left/right edges with enhanced visualization.
+     Shows edge points, connecting lines, and statistics.
+     """
+     # Convert ROI to BGR
+     if len(roi_image.shape) == 2:
+         vis = cv2.cvtColor(roi_image, cv2.COLOR_GRAY2BGR)
+     else:
+         vis = roi_image.copy()
+
+     h, w = vis.shape[:2]
+
+     left_edges = edge_data["left_edges"]
+     right_edges = edge_data["right_edges"]
+     valid_rows = edge_data["valid_rows"]
+
+     # Calculate statistics for valid edges
+     valid_left = left_edges[valid_rows]
+     valid_right = right_edges[valid_rows]
+     valid_widths = valid_right - valid_left
+
+     if len(valid_widths) > 0:
+         median_width = np.median(valid_widths)
+
+         # Draw connecting lines for every Nth valid row (to avoid clutter)
+         line_spacing = max(1, int(np.sum(valid_rows)) // 20)  # Show ~20 lines
+
+         count = 0  # Count valid rows
+         for row_idx, valid in enumerate(valid_rows):
+             if not valid:
+                 continue
+
+             left_x = int(left_edges[row_idx])
+             right_x = int(right_edges[row_idx])
+             width = right_x - left_x
+
+             # Draw connecting line (every Nth valid row)
+             if count % line_spacing == 0:
+                 # Color based on width deviation
+                 deviation = abs(width - median_width) / median_width if median_width > 0 else 0
+                 if deviation < 0.05:
+                     line_color = Color.GREEN
+                 elif deviation < 0.15:
+                     line_color = Color.YELLOW
+                 else:
+                     line_color = Color.ORANGE
+
+                 cv2.line(vis, (left_x, row_idx), (right_x, row_idx), line_color, 1)
+
+             count += 1  # Increment valid row counter
+
+         # Draw edge points on top
+         for row_idx, valid in enumerate(valid_rows):
+             if valid:
+                 # Draw left edge (cyan)
+                 left_x = int(left_edges[row_idx])
+                 cv2.circle(vis, (left_x, row_idx), 2, Color.CYAN, -1)
+
+                 # Draw right edge (magenta)
+                 right_x = int(right_edges[row_idx])
+                 cv2.circle(vis, (right_x, row_idx), 2, Color.MAGENTA, -1)
+
+         # Add text annotations, scaling font size and spacing with ROI height
+         font_scale = max(0.3, h / 600.0)
+         line_height = int(15 + h / 40.0)
+         thickness = 1
+
+         valid_pct = np.sum(valid_rows) / len(valid_rows) * 100
+         text_lines = [
+             f"Valid edges: {np.sum(valid_rows)}/{len(valid_rows)} ({valid_pct:.1f}%)",
+             f"Left range: {np.min(valid_left):.1f}-{np.max(valid_left):.1f}px",
+             f"Right range: {np.min(valid_right):.1f}-{np.max(valid_right):.1f}px",
+             f"Width: {np.min(valid_widths):.1f}-{np.max(valid_widths):.1f}px",
+             f"Median: {median_width:.1f}px"
+         ]
+
+         for i, text in enumerate(text_lines):
+             y = line_height + i * line_height
+             # Background for readability
+             (text_w, text_h), _ = cv2.getTextSize(text, FONT_FACE, font_scale, thickness)
+             cv2.rectangle(vis, (5, y - text_h - 2), (5 + text_w + 5, y + 2), (0, 0, 0), -1)
+             cv2.putText(vis, text, (8, y), FONT_FACE, font_scale, Color.WHITE, thickness)
+
+     return vis
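The deviation-based color coding above reduces to a simple rule on |w - median| / median. A numpy-only sketch of the same classification, with thresholds copied from the function (the `bucket_widths` helper and bucket names are illustrative):

```python
import numpy as np

def bucket_widths(widths: np.ndarray) -> list:
    """Classify each width by relative deviation from the median:
    'ok' (< 5%), 'warn' (< 15%), 'bad' (otherwise)."""
    median = float(np.median(widths))
    out = []
    for w in widths:
        dev = abs(w - median) / median if median > 0 else 0.0
        out.append("ok" if dev < 0.05 else "warn" if dev < 0.15 else "bad")
    return out

# Median of [100, 101, 110, 140] is 105.5, so 100 deviates ~5.2% and 140 ~32.7%.
buckets = bucket_widths(np.array([100.0, 101.0, 110.0, 140.0]))
```
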
+
+
+ def draw_width_measurements(
+     roi_image: np.ndarray,
+     edge_data: Dict[str, Any],
+     width_data: Dict[str, Any]
+ ) -> np.ndarray:
+     """
+     Draw width measurements with connecting lines.
+     """
+     # Convert ROI to BGR
+     if len(roi_image.shape) == 2:
+         vis = cv2.cvtColor(roi_image, cv2.COLOR_GRAY2BGR)
+     else:
+         vis = roi_image.copy()
+
+     left_edges = edge_data["left_edges"]
+     right_edges = edge_data["right_edges"]
+     valid_rows = edge_data["valid_rows"]
+
+     median_width_px = width_data["median_width_px"]
+
+     # Draw width lines
+     for row_idx, valid in enumerate(valid_rows):
+         if valid:
+             left_x = int(left_edges[row_idx])
+             right_x = int(right_edges[row_idx])
+             width_px = right_x - left_x
+
+             # Color based on deviation from median (guard against a zero median)
+             deviation = abs(width_px - median_width_px) / median_width_px if median_width_px > 0 else 0
+             if deviation < 0.05:
+                 color = Color.GREEN   # Close to median
+             elif deviation < 0.10:
+                 color = Color.YELLOW  # Moderate deviation
+             else:
+                 color = Color.RED     # Large deviation
+
+             # Draw line
+             cv2.line(vis, (left_x, row_idx), (right_x, row_idx), color, 1)
+
+     # Add median width annotation, scaling the font with ROI height
+     h = vis.shape[0]
+     font_scale = max(0.4, h / 500.0)
+     thickness = max(1, int(h / 150.0))
+
+     median_cm = width_data["median_width_cm"]
+     text = f"Median: {median_cm:.2f} cm ({median_width_px:.1f} px)"
+     cv2.putText(
+         vis, text, (10, int(h * 0.15)),
+         FONT_FACE, font_scale,
+         Color.BLACK, thickness + 2
+     )
+     cv2.putText(
+         vis, text, (10, int(h * 0.15)),
+         FONT_FACE, font_scale,
+         Color.GREEN, thickness
+     )
+
+     return vis
+
+
+ def draw_outlier_detection(
+     roi_image: np.ndarray,
+     edge_data: Dict[str, Any],
+     width_data: Dict[str, Any]
+ ) -> np.ndarray:
+     """
+     Highlight outlier measurements.
+     """
+     # Convert ROI to BGR
+     if len(roi_image.shape) == 2:
+         vis = cv2.cvtColor(roi_image, cv2.COLOR_GRAY2BGR)
+     else:
+         vis = roi_image.copy()
+
+     left_edges = edge_data["left_edges"]
+     right_edges = edge_data["right_edges"]
+     valid_rows = edge_data["valid_rows"]
+
+     median_width_px = width_data["median_width_px"]
+     outliers_removed = width_data.get("outliers_removed", 0)
+
+     # Calculate MAD threshold
+     all_widths = []
+     for row_idx, valid in enumerate(valid_rows):
+         if valid:
+             width_px = right_edges[row_idx] - left_edges[row_idx]
+             all_widths.append(width_px)
+
+     if len(all_widths) > 0:
+         all_widths = np.array(all_widths)
+         mad = np.median(np.abs(all_widths - median_width_px))
+         outlier_threshold = 3.0 * mad
+
+         # Draw width lines color-coded by outlier status
+         for row_idx, valid in enumerate(valid_rows):
+             if valid:
+                 left_x = int(left_edges[row_idx])
+                 right_x = int(right_edges[row_idx])
+                 width_px = right_x - left_x
+
+                 is_outlier = abs(width_px - median_width_px) > outlier_threshold
+                 color = Color.RED if is_outlier else Color.GREEN
+
+                 cv2.line(vis, (left_x, row_idx), (right_x, row_idx), color, 2)
+
+     # Add annotation with adaptive font scaling
+     h = vis.shape[0]
+     font_scale = max(0.4, h / 500.0)
+     thickness = max(1, int(h / 150.0))
+
+     text = f"Outliers: {outliers_removed}"
+     y_pos = int(h * 0.10)  # Position at 10% of image height
+
+     # Get text size for background
+     (text_w, text_h), baseline = cv2.getTextSize(text, FONT_FACE, font_scale, thickness)
+
+     # Draw background for readability
+     cv2.rectangle(vis, (5, y_pos - text_h - 5), (15 + text_w, y_pos + baseline),
+                   (0, 0, 0), -1)
+
+     # Draw text with outline
+     cv2.putText(vis, text, (10, y_pos), FONT_FACE, font_scale,
+                 Color.BLACK, thickness + 2, cv2.LINE_AA)
+     cv2.putText(vis, text, (10, y_pos), FONT_FACE, font_scale,
+                 Color.RED, thickness, cv2.LINE_AA)
+
+     return vis
+
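The 3x MAD rule used above can be sketched standalone. Note it measures the MAD around an externally supplied median, exactly as the function does with `median_width_px` (`mad_outliers` and the sample data are illustrative):

```python
import numpy as np

def mad_outliers(widths: np.ndarray, median: float, k: float = 3.0) -> np.ndarray:
    """Boolean mask of widths farther than k * MAD from the given median."""
    mad = np.median(np.abs(widths - median))
    return np.abs(widths - median) > k * mad

# Median is 10.5, MAD is 0.5, so the 3*MAD threshold is 1.5: only 30.0 is flagged.
widths = np.array([10.0, 11.0, 10.5, 9.5, 30.0])
mask = mad_outliers(widths, float(np.median(widths)))
```
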
+
+ def draw_comprehensive_edge_overlay(
+     full_image: np.ndarray,
+     edge_data: Dict[str, Any],
+     roi_bounds: Tuple[int, int, int, int],
+     axis_data: Dict[str, Any],
+     zone_data: Dict[str, Any],
+     width_data: Dict[str, Any],
+     scale_px_per_cm: float
+ ) -> np.ndarray:
+     """
+     Comprehensive visualization showing detected edges overlaid on the full image
+     with axis, zone, and measurement annotations.
+     """
+     vis = full_image.copy()
+     h, w = vis.shape[:2]
+
+     x_min, y_min, x_max, y_max = roi_bounds
+     left_edges = edge_data["left_edges"]
+     right_edges = edge_data["right_edges"]
+     valid_rows = edge_data["valid_rows"]
+
+     # 1. Draw axis line
+     # Handle both PCA (tip_point, palm_point) and landmark-based axis (center, direction)
+     if "center" in axis_data:
+         axis_center = axis_data["center"]
+     elif "tip_point" in axis_data and "palm_point" in axis_data:
+         axis_center = (axis_data["tip_point"] + axis_data["palm_point"]) / 2
+     else:
+         # Fallback: use the image midpoint
+         axis_center = np.array([w // 2, h // 2])
+
+     axis_direction = axis_data["direction"]
+     axis_length = axis_data["length"]
+
+     axis_start = axis_center - axis_direction * (axis_length / 2)
+     axis_end = axis_center + axis_direction * (axis_length / 2)
+     cv2.line(vis, tuple(axis_start.astype(int)), tuple(axis_end.astype(int)),
+              Color.YELLOW, 2, cv2.LINE_AA)
+
+     # 2. Draw ring zone bounds as two lines perpendicular to the axis at zone start/end
+     zone_start = zone_data["start_point"]
+     zone_end = zone_data["end_point"]
+     perp_direction = np.array([-axis_direction[1], axis_direction[0]])
+     # Use the ROI half-width so the zone lines span the ROI
+     roi_half_width = (x_max - x_min) / 2.0
+
+     for zone_pt in [zone_start, zone_end]:
+         p1 = (zone_pt + perp_direction * roi_half_width).astype(int)
+         p2 = (zone_pt - perp_direction * roi_half_width).astype(int)
+         cv2.line(vis, tuple(p1), tuple(p2), Color.ORANGE, 2, cv2.LINE_AA)
+
+     # 3. Draw ROI boundary
+     cv2.rectangle(vis, (x_min, y_min), (x_max, y_max), Color.CYAN, 2)
+
+     # 4. Draw detected edges
+     line_spacing = max(1, int(np.sum(valid_rows)) // 25)  # Show ~25 lines
+     count = 0
+
+     for row_idx, valid in enumerate(valid_rows):
+         if not valid:
+             continue
+
+         # Map ROI coordinates to full image
+         global_y = y_min + row_idx
+         left_x_global = x_min + int(left_edges[row_idx])
+         right_x_global = x_min + int(right_edges[row_idx])
+
+         # Draw edge points
+         cv2.circle(vis, (left_x_global, global_y), 3, Color.BLUE, -1)
+         cv2.circle(vis, (right_x_global, global_y), 3, Color.MAGENTA, -1)
+
+         # Draw connecting lines for every Nth row
+         if count % line_spacing == 0:
+             cv2.line(vis, (left_x_global, global_y), (right_x_global, global_y),
+                      Color.GREEN, 2, cv2.LINE_AA)
+         count += 1
1188
+
1189
+ # 5. Add text annotations in top-left corner with adaptive sizing
1190
+ median_cm = width_data["median_width_cm"]
1191
+ median_px = width_data["median_width_px"]
1192
+ std_px = width_data["std_width_px"]
1193
+ num_samples = width_data["num_samples"]
1194
+ valid_pct = np.sum(valid_rows) / len(valid_rows) * 100
1195
+
1196
+ # Adaptive font scaling based on image height (more conservative for full-sized images)
1197
+ font_scale = max(0.3, h / 1500.0) # Scale for full-sized images
1198
+ line_height = int(35 + h / 70.0) # Scale line spacing (increased for better readability)
1199
+ thickness = max(1, int(h / 500.0))
1200
+
1201
+ annotations = [
1202
+ f"Sobel Edge Detection Results:",
1203
+ f" Median Width: {median_cm:.3f} cm ({median_px:.1f} px)",
1204
+ f" Std Dev: {std_px:.2f} px",
1205
+ f" Valid Edges: {np.sum(valid_rows)}/{len(valid_rows)} ({valid_pct:.1f}%)",
1206
+ f" Measurements: {num_samples}",
1207
+ f" Scale: {scale_px_per_cm:.2f} px/cm",
1208
+ "",
1209
+ "Legend:",
1210
+ " Yellow line = Finger axis",
1211
+ " Orange lines = Ring zone",
1212
+ " Cyan box = ROI",
1213
+ " Blue dots = Left edges",
1214
+ " Magenta dots = Right edges",
1215
+ " Green lines = Width measurements"
1216
+ ]
1217
+
1218
+ # Draw text with background for readability
1219
+ y_offset = line_height
1220
+ for line in annotations:
1221
+ if line: # Skip empty lines for background
1222
+ (text_w, text_h), baseline = cv2.getTextSize(line, FONT_FACE, font_scale, thickness)
1223
+ # Black background
1224
+ cv2.rectangle(vis, (15, y_offset - text_h - 5), (25 + text_w, y_offset + baseline),
1225
+ (0, 0, 0), -1)
1226
+ # Draw text
1227
+ if line.startswith(" "):
1228
+ color = Color.WHITE
1229
+ elif line.endswith(":"):
1230
+ color = Color.YELLOW
1231
+ else:
1232
+ color = Color.CYAN
1233
+ cv2.putText(vis, line, (20, y_offset), FONT_FACE, font_scale,
1234
+ color, thickness, cv2.LINE_AA)
1235
+ y_offset += line_height
1236
+
1237
+ return vis
1238
+
1239
+
1240
+ def draw_contour_vs_sobel(
1241
+ image: np.ndarray,
1242
+ finger_contour: np.ndarray,
1243
+ edge_data: Dict[str, Any],
1244
+ roi_bounds: Tuple[int, int, int, int]
1245
+ ) -> np.ndarray:
1246
+ """
1247
+ Side-by-side comparison of contour vs Sobel edges.
1248
+ """
1249
+ vis = image.copy()
1250
+
1251
+ # Draw contour (v0 method)
1252
+ cv2.drawContours(vis, [finger_contour], -1, Color.GREEN, Size.CONTOUR_THICK)
1253
+
1254
+ # Draw Sobel edges (v1 method)
1255
+ x_min, y_min, x_max, y_max = roi_bounds
1256
+ left_edges = edge_data["left_edges"]
1257
+ right_edges = edge_data["right_edges"]
1258
+ valid_rows = edge_data["valid_rows"]
1259
+
1260
+ for row_idx, valid in enumerate(valid_rows):
1261
+ if valid:
1262
+ # Map ROI coordinates back to original image
1263
+ global_y = y_min + row_idx
1264
+ left_x_global = x_min + int(left_edges[row_idx])
1265
+ right_x_global = x_min + int(right_edges[row_idx])
1266
+
1267
+ # Draw edge points
1268
+ cv2.circle(vis, (left_x_global, global_y), 2, Color.CYAN, -1)
1269
+ cv2.circle(vis, (right_x_global, global_y), 2, Color.MAGENTA, -1)
1270
+
1271
+ # Add legend
1272
+ cv2.putText(
1273
+ vis, "Green: Contour | Cyan/Magenta: Sobel Edges", (50, 100),
1274
+ FONT_FACE, FontScale.TITLE,
1275
+ Color.BLACK, FontThickness.TITLE_OUTLINE
1276
+ )
1277
+ cv2.putText(
1278
+ vis, "Green: Contour | Cyan/Magenta: Sobel Edges", (50, 100),
1279
+ FONT_FACE, FontScale.TITLE,
1280
+ Color.WHITE, FontThickness.TITLE
1281
+ )
1282
+
1283
+ return vis
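For reference alongside the "Outliers" annotation above: the edge-refinement module imports a `MAD_OUTLIER_THRESHOLD` constant, which suggests per-row widths are screened with a median-absolute-deviation test before the median width is reported. Below is a minimal numpy sketch of such a filter; the function name and the default threshold of 3.0 are illustrative, not the project's actual API.

```python
import numpy as np

def filter_width_outliers(widths: np.ndarray, mad_threshold: float = 3.0) -> np.ndarray:
    """Return a boolean mask of widths within `mad_threshold` robust z-scores.

    Uses the median absolute deviation (MAD), scaled by 1.4826 so it is
    consistent with the standard deviation for normally distributed data.
    """
    median = np.median(widths)
    mad = np.median(np.abs(widths - median))
    if mad < 1e-9:
        return np.ones(len(widths), dtype=bool)  # All widths (nearly) identical
    robust_z = np.abs(widths - median) / (1.4826 * mad)
    return robust_z <= mad_threshold

# Per-row widths in pixels; one row caught a background edge
widths = np.array([101.0, 99.5, 100.2, 100.8, 140.0, 99.9])
keep = filter_width_outliers(widths)
print(keep.tolist())  # [True, True, True, True, False, True]
```

Unlike a plain standard-deviation cut, the MAD is itself robust, so a single gross outlier cannot inflate the spread estimate and mask itself.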
src/edge_refinement.py ADDED
@@ -0,0 +1,1335 @@
"""
Edge refinement using Sobel gradient filtering.

This module implements v1's core innovation: replacing contour-based width
measurement with gradient-based edge detection for improved accuracy.

Functions:
- extract_ring_zone_roi: Extract ROI around ring zone
- apply_sobel_filters: Bidirectional Sobel filtering
- detect_edges_per_row: Find left/right edges in each cross-section
- refine_edge_subpixel: Sub-pixel edge localization (Phase 3)
- measure_width_from_edges: Compute width from edge positions
- compute_edge_quality_score: Assess edge detection quality (Phase 3)
- should_use_sobel_measurement: Auto fallback logic (Phase 3)
- refine_edges_sobel: Main entry point for edge refinement
"""

import cv2
import numpy as np
import logging
from typing import Dict, Any, Optional, Tuple, List

from src.edge_refinement_constants import (
    # Sobel Filter
    DEFAULT_KERNEL_SIZE,
    VALID_KERNEL_SIZES,
    # Edge Detection
    DEFAULT_GRADIENT_THRESHOLD,
    MIN_FINGER_WIDTH_CM,
    MAX_FINGER_WIDTH_CM,
    WIDTH_TOLERANCE_FACTOR,
    # Sub-Pixel Refinement
    MAX_SUBPIXEL_OFFSET,
    MIN_PARABOLA_DENOMINATOR,
    # Outlier Filtering
    MAD_OUTLIER_THRESHOLD,
    # Edge Quality Scoring
    GRADIENT_STRENGTH_NORMALIZER,
    SMOOTHNESS_VARIANCE_NORMALIZER,
    QUALITY_WEIGHT_GRADIENT,
    QUALITY_WEIGHT_CONSISTENCY,
    QUALITY_WEIGHT_SMOOTHNESS,
    QUALITY_WEIGHT_SYMMETRY,
    # Auto Fallback Decision
    MIN_QUALITY_SCORE_THRESHOLD,
    MIN_CONSISTENCY_THRESHOLD,
    MIN_REALISTIC_WIDTH_CM,
    MAX_REALISTIC_WIDTH_CM,
    MAX_CONTOUR_DIFFERENCE_PCT,
)

# Configure logging
logger = logging.getLogger(__name__)

# =============================================================================
# Helper Functions (extracted from nested scope)
# =============================================================================

def _get_axis_x_at_row(row_y: float, axis_center: Optional[np.ndarray],
                       axis_direction: Optional[np.ndarray], width: int) -> float:
    """
    Get the axis x-coordinate at a given row y-coordinate.

    Args:
        row_y: Row y-coordinate
        axis_center: Axis center point (x, y)
        axis_direction: Axis direction vector (dx, dy)
        width: Image width (for fallback)

    Returns:
        X-coordinate of the axis at the given row
    """
    if axis_center is None or axis_direction is None:
        return width / 2  # Fallback to the ROI center

    if abs(axis_direction[1]) < 1e-6:
        # Nearly horizontal axis
        return axis_center[0]
    else:
        # Parametric line: P = axis_center + t * axis_direction
        t = (row_y - axis_center[1]) / axis_direction[1]
        return axis_center[0] + t * axis_direction[0]

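As a sanity check of the parametric intersection above, here is a standalone sketch of the same computation (names are illustrative). Note that any scaling of the direction vector cancels between `t` and the x-update, so only the slope dx/dy matters:

```python
import numpy as np

def axis_x_at_row(row_y, axis_center, axis_direction, width):
    # Intersect the axis line P = center + t * direction with the pixel row y = row_y
    if axis_center is None or axis_direction is None:
        return width / 2
    if abs(axis_direction[1]) < 1e-6:
        return axis_center[0]  # Nearly horizontal axis
    t = (row_y - axis_center[1]) / axis_direction[1]
    return axis_center[0] + t * axis_direction[0]

center = np.array([50.0, 100.0])
direction = np.array([0.2, 1.0]) / np.linalg.norm([0.2, 1.0])  # Mostly vertical, slight tilt
print(axis_x_at_row(100.0, center, direction, 200))  # At the center row -> 50.0
print(axis_x_at_row(110.0, center, direction, 200))  # 10 rows down -> 52.0 (slope 0.2)
```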
def _find_edges_from_axis(
    row_gradient: np.ndarray,
    row_y: float,
    axis_x: float,
    threshold: float,
    min_width_px: Optional[float],
    max_width_px: Optional[float],
    row_mask: Optional[np.ndarray] = None,
    row_gradient_left_to_right: Optional[np.ndarray] = None,
    row_gradient_right_to_left: Optional[np.ndarray] = None,
) -> Optional[Tuple[float, float, float, float]]:
    """
    Find left and right edges by expanding from the axis position.

    Strategy:
    - MASK-CONSTRAINED MODE (when row_mask is provided):
        1. Find the leftmost/rightmost mask pixels (finger boundaries)
        2. Search from the axis out to each mask boundary for the strongest gradient
        3. Combines anatomical accuracy (mask) with gradient precision

    - AXIS-EXPANSION MODE (when row_mask is None):
        1. Start at the axis x-coordinate (INSIDE the finger)
        2. Search LEFT/RIGHT from the axis for the closest salient edge
        3. Validate that the width is within a realistic range

    Args:
        row_gradient: Gradient magnitude for this row
        row_y: Row y-coordinate
        axis_x: Axis x-coordinate at this row
        threshold: Gradient threshold for a valid edge
        min_width_px: Minimum valid width in pixels (None to skip)
        max_width_px: Maximum valid width in pixels (None to skip)
        row_mask: Optional mask row (True = finger pixel) for constrained search
        row_gradient_left_to_right: Optional directional gradient map for the right edge search
        row_gradient_right_to_left: Optional directional gradient map for the left edge search

    Returns:
        Tuple of (left_x, right_x, left_strength, right_strength), or None if invalid
    """
    if axis_x < 0 or axis_x >= len(row_gradient):
        return None

    # Direction-aware gradient maps (preferred when available):
    # - the left boundary should come from a right-to-left transition
    # - the right boundary should come from a left-to-right transition
    left_search_gradient = row_gradient_right_to_left if row_gradient_right_to_left is not None else row_gradient
    right_search_gradient = row_gradient_left_to_right if row_gradient_left_to_right is not None else row_gradient

    # MASK-CONSTRAINED MODE (preferred when available)
    if row_mask is not None and np.any(row_mask):
        # Strategy: search FROM the axis OUTWARD, constrained by the mask.
        # This avoids picking background edges while using gradient precision.

        mask_indices = np.where(row_mask)[0]
        if len(mask_indices) < 2:
            return None  # Mask too small

        left_mask_boundary = mask_indices[0]
        right_mask_boundary = mask_indices[-1]

        # Search LEFT from the axis, stopping at the mask boundary
        left_edge_x = None
        left_strength = 0

        search_start = max(left_mask_boundary, int(axis_x))
        for x in range(search_start, left_mask_boundary - 1, -1):
            if x < 0 or x >= len(row_gradient):
                continue
            if left_search_gradient[x] > threshold:
                # Found a strong edge - keep it if stronger than the previous one
                if left_search_gradient[x] > left_strength:
                    left_edge_x = x
                    left_strength = left_search_gradient[x]

        # If no edge was found at the full threshold, retry with a relaxed threshold
        if left_edge_x is None:
            relaxed_threshold = threshold * 0.5
            for x in range(search_start, left_mask_boundary - 1, -1):
                if x < 0 or x >= len(row_gradient):
                    continue
                if left_search_gradient[x] > relaxed_threshold:
                    if left_search_gradient[x] > left_strength:
                        left_edge_x = x
                        left_strength = left_search_gradient[x]

        # Search RIGHT from the axis, stopping at the mask boundary
        right_edge_x = None
        right_strength = 0

        search_start = min(right_mask_boundary, int(axis_x))
        for x in range(search_start, right_mask_boundary + 1):
            if x < 0 or x >= len(row_gradient):
                continue
            if right_search_gradient[x] > threshold:
                # Found a strong edge - keep it if stronger than the previous one
                if right_search_gradient[x] > right_strength:
                    right_edge_x = x
                    right_strength = right_search_gradient[x]

        # If no edge was found at the full threshold, retry with a relaxed threshold
        if right_edge_x is None:
            relaxed_threshold = threshold * 0.5
            for x in range(search_start, right_mask_boundary + 1):
                if x < 0 or x >= len(row_gradient):
                    continue
                if right_search_gradient[x] > relaxed_threshold:
                    if right_search_gradient[x] > right_strength:
                        right_edge_x = x
                        right_strength = right_search_gradient[x]

        if left_edge_x is None or right_edge_x is None:
            return None  # No valid edges found

    else:
        # AXIS-EXPANSION MODE (fallback when no mask)
        # Search LEFT from the axis (going leftward)
        left_edge_x = None
        left_strength = 0
        for x in range(int(axis_x), -1, -1):
            if left_search_gradient[x] > threshold:
                # Found a salient edge - this is our left boundary
                left_edge_x = x
                left_strength = left_search_gradient[x]
                break

        # Search RIGHT from the axis (going rightward)
        right_edge_x = None
        right_strength = 0
        for x in range(int(axis_x), len(row_gradient)):
            if right_search_gradient[x] > threshold:
                # Found a salient edge - this is our right boundary
                right_edge_x = x
                right_strength = right_search_gradient[x]
                break

        if left_edge_x is None or right_edge_x is None:
            return None

    # Validate that the width is within a realistic finger range
    width = right_edge_x - left_edge_x
    if min_width_px is not None and max_width_px is not None:
        if width < min_width_px or width > max_width_px:
            return None  # Width out of realistic range

    return (left_edge_x, right_edge_x, left_strength, right_strength)

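The axis-expansion branch above can be exercised on a synthetic cross-section. This standalone sketch (illustrative names, not the module's API) mirrors the first-salient-edge scan and shows why a stronger background edge further from the axis is correctly ignored: the scan stops at the nearest gradient above threshold on each side.

```python
import numpy as np

def find_edges_axis_expansion(row_gradient, axis_x, threshold):
    """Scan outward from the axis; take the first gradient above threshold
    on each side (the nearest salient edge), as in axis-expansion mode."""
    left = right = None
    for x in range(int(axis_x), -1, -1):      # Leftward scan
        if row_gradient[x] > threshold:
            left = x
            break
    for x in range(int(axis_x), len(row_gradient)):  # Rightward scan
        if row_gradient[x] > threshold:
            right = x
            break
    if left is None or right is None:
        return None
    return left, right

# Synthetic cross-section: finger boundaries at x=12 and x=37
row = np.zeros(60)
row[12] = 80.0
row[37] = 95.0
row[55] = 120.0  # Stronger background edge further out, never reached
print(find_edges_axis_expansion(row, axis_x=25, threshold=30.0))  # -> (12, 37)
```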
# =============================================================================
# Main Functions
# =============================================================================

def extract_ring_zone_roi(
    image: np.ndarray,
    axis_data: Dict[str, Any],
    zone_data: Dict[str, Any],
    rotate_align: bool = False
) -> Dict[str, Any]:
    """
    Extract the ROI around the ring zone.

    The ROI is sized from the zone length (|DIP - PIP|): 1.2x wide, 0.5x tall,
    centered on the ring zone center. This scales naturally with camera
    distance since it is derived from anatomical landmarks.

    Args:
        image: Input BGR image
        axis_data: Output from estimate_finger_axis()
        zone_data: Output from localize_ring_zone()
        rotate_align: If True, rotate the ROI so the finger axis is vertical

    Returns:
        Dictionary containing:
        - roi_image: Extracted ROI (grayscale)
        - roi_mask: Full ROI mask (all 255)
        - roi_bounds: (x_min, y_min, x_max, y_max) in the original image
        - transform_matrix: 3x3 matrix mapping ROI coords -> original coords
        - inverse_transform: 3x3 matrix mapping original -> ROI coords
        - rotation_angle: Rotation angle applied (degrees)
        - roi_width: ROI width in pixels
        - roi_height: ROI height in pixels
    """
    h, w = image.shape[:2]

    # ROI centered on the ring zone center, sized from the |DIP - PIP| distance:
    # height = 0.5x zone length (along the finger axis)
    # width = 1.2x zone length (perpendicular, wide enough to capture both finger edges)
    zone_length = zone_data["length"]
    center = zone_data["center_point"]
    direction = axis_data["direction"]
    half_height = zone_length * 0.25  # 0.5x / 2
    half_width = zone_length * 0.6    # 1.2x / 2

    x_min = int(np.clip(center[0] - half_width, 0, w - 1))
    x_max = int(np.clip(center[0] + half_width, 0, w - 1))
    y_min = int(np.clip(center[1] - half_height, 0, h - 1))
    y_max = int(np.clip(center[1] + half_height, 0, h - 1))

    roi_width = x_max - x_min
    roi_height = y_max - y_min

    if roi_width < 10 or roi_height < 10:
        raise ValueError(f"ROI too small: {roi_width}x{roi_height}")

    # Extract ROI
    roi_bgr = image[y_min:y_max, x_min:x_max].copy()

    # Convert to grayscale for edge detection
    roi_gray = cv2.cvtColor(roi_bgr, cv2.COLOR_BGR2GRAY)

    # Full ROI mask - the ROI rectangle itself is the search constraint
    roi_mask = np.ones((roi_height, roi_width), dtype=np.uint8) * 255

    # Create the transform matrix (ROI coords -> original coords).
    # A simple translation suffices in the non-rotated case.
    transform = np.eye(3, dtype=np.float32)
    transform[0, 2] = x_min  # Translation in x
    transform[1, 2] = y_min  # Translation in y

    inverse_transform = np.linalg.inv(transform)

    rotation_angle = 0.0

    # Optional rotation alignment
    if rotate_align:
        # Calculate the rotation angle that makes the finger vertical.
        # The current direction is (dx, dy); rotate it to point upward (0, -1).
        rotation_angle = np.degrees(np.arctan2(-direction[0], direction[1]))

        # Get rotation matrix
        roi_center = (roi_width / 2.0, roi_height / 2.0)
        rotation_matrix = cv2.getRotationMatrix2D(roi_center, rotation_angle, 1.0)

        # Rotate ROI
        roi_gray = cv2.warpAffine(
            roi_gray, rotation_matrix, (roi_width, roi_height),
            flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_REPLICATE
        )

        # Update the transform matrices.
        # The rotation matrix is 2x3; convert to 3x3 for composition.
        rotation_matrix_3x3 = np.eye(3, dtype=np.float32)
        rotation_matrix_3x3[:2, :] = rotation_matrix

        # Compose: translate, then rotate
        transform = np.dot(rotation_matrix_3x3, transform)
        inverse_transform = np.linalg.inv(transform)

    # Convert the axis center point and direction to ROI coordinates
    axis_center = axis_data.get("center", center)
    roi_offset = np.array([x_min, y_min], dtype=np.float32)
    axis_center_in_roi = axis_center - roi_offset

    # The direction vector is unchanged (translation does not affect it)
    axis_direction_in_roi = direction.copy()

    zone_start = zone_data["start_point"]
    zone_end = zone_data["end_point"]

    return {
        "roi_image": roi_gray,
        "roi_mask": roi_mask,
        "roi_bgr": roi_bgr,  # Keep BGR for debug visualization
        "roi_bounds": (x_min, y_min, x_max, y_max),
        "transform_matrix": transform,
        "inverse_transform": inverse_transform,
        "rotation_angle": rotation_angle,
        "roi_width": roi_width,
        "roi_height": roi_height,
        "zone_start_in_roi": zone_start - roi_offset,
        "zone_end_in_roi": zone_end - roi_offset,
        "axis_center_in_roi": axis_center_in_roi,
        "axis_direction_in_roi": axis_direction_in_roi,
    }

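The homogeneous 3x3 translation built above can be checked in isolation. This sketch (values are made up) shows why the extra row is used: it makes the ROI-to-original mapping composable with a later rotation by plain matrix multiplication, and trivially invertible with `np.linalg.inv`:

```python
import numpy as np

# ROI -> original is a pure translation by the ROI's top-left corner
x_min, y_min = 120, 340
transform = np.eye(3, dtype=np.float64)
transform[0, 2] = x_min
transform[1, 2] = y_min

inverse_transform = np.linalg.inv(transform)

roi_point = np.array([10.0, 5.0, 1.0])   # Homogeneous ROI coordinates
original_point = transform @ roi_point
print(original_point[:2])                 # [130. 345.]

back = inverse_transform @ original_point
print(back[:2])                           # [10. 5.]
```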
def apply_sobel_filters(
    roi_image: np.ndarray,
    kernel_size: int = DEFAULT_KERNEL_SIZE,
    axis_direction: str = "auto"
) -> Dict[str, Any]:
    """
    Apply bidirectional Sobel filters to detect edges.

    For a vertical finger (axis_direction="vertical"):
    - Use horizontal Sobel kernels (detect left/right edges)

    For a horizontal finger (axis_direction="horizontal"):
    - Use vertical Sobel kernels (detect top/bottom edges)

    Auto mode assumes an upright finger after rotation normalization and uses
    the horizontal filter orientation.

    Args:
        roi_image: Grayscale ROI image
        kernel_size: Sobel kernel size (3, 5, or 7)
        axis_direction: Finger axis direction ("auto", "vertical", "horizontal")

    Returns:
        Dictionary containing:
        - gradient_x: Horizontal gradient (Sobel X)
        - gradient_y: Vertical gradient (Sobel Y)
        - gradient_left_to_right: Positive X-gradient map (right-half gated in horizontal mode)
        - gradient_right_to_left: Negative X-gradient map (left-half gated in horizontal mode)
        - gradient_magnitude: Combined gradient magnitude
        - gradient_direction: Edge orientation (radians)
        - kernel_size: Kernel size used
        - filter_orientation: "horizontal" or "vertical"
    """
    if kernel_size not in VALID_KERNEL_SIZES:
        raise ValueError(f"Invalid kernel_size: {kernel_size}. Use {VALID_KERNEL_SIZES}")

    h, w = roi_image.shape

    # Determine filter orientation
    if axis_direction == "auto":
        # After rotation normalization the finger is always upright:
        # the finger runs vertically -> detect left/right edges -> horizontal filter.
        #
        # NOTE: the ROI aspect ratio is NOT reliable after rotation normalization.
        # The ROI may be wider than tall even when the finger is vertical, so
        # always use the horizontal filter orientation for upright hands.
        filter_orientation = "horizontal"  # Detect left/right edges of a vertical finger
    elif axis_direction == "vertical":
        filter_orientation = "horizontal"
    elif axis_direction == "horizontal":
        filter_orientation = "vertical"
    else:
        raise ValueError(f"Invalid axis_direction: {axis_direction}")

    # Apply Sobel filters:
    # Sobel X detects vertical edges (left/right boundaries),
    # Sobel Y detects horizontal edges (top/bottom boundaries).
    grad_x = cv2.Sobel(roi_image, cv2.CV_64F, 1, 0, ksize=kernel_size)
    grad_y = cv2.Sobel(roi_image, cv2.CV_64F, 0, 1, ksize=kernel_size)

    # Directional Sobel responses along X:
    # - left_to_right: rising intensity while moving left -> right
    # - right_to_left: falling intensity while moving left -> right
    gradient_left_to_right = np.maximum(grad_x, 0.0)
    gradient_right_to_left = np.maximum(-grad_x, 0.0)

    # Spatial gating to reduce interference from nearby non-target fingers:
    # - left_to_right only on the ROI right half
    # - right_to_left only on the ROI left half
    roi_split_x = w // 2
    if filter_orientation == "horizontal":
        gradient_left_to_right[:, :roi_split_x] = 0.0
        gradient_right_to_left[:, roi_split_x:] = 0.0
        gradient_magnitude = np.sqrt(gradient_left_to_right**2 + gradient_right_to_left**2)
    else:
        # Vertical mode fallback keeps the original behavior.
        gradient_magnitude = np.sqrt(grad_x**2 + grad_y**2)

    # Calculate gradient direction (angle)
    gradient_direction = np.arctan2(grad_y, grad_x)

    # Normalize gradients to 0-255 for visualization
    grad_x_normalized = np.clip(np.abs(grad_x), 0, 255).astype(np.uint8)
    grad_y_normalized = np.clip(np.abs(grad_y), 0, 255).astype(np.uint8)
    grad_mag_normalized = np.clip(gradient_magnitude, 0, 255).astype(np.uint8)
    grad_l2r_normalized = np.clip(gradient_left_to_right, 0, 255).astype(np.uint8)
    grad_r2l_normalized = np.clip(gradient_right_to_left, 0, 255).astype(np.uint8)

    return {
        "gradient_x": grad_x,
        "gradient_y": grad_y,
        "gradient_left_to_right": gradient_left_to_right,
        "gradient_right_to_left": gradient_right_to_left,
        "gradient_magnitude": gradient_magnitude,
        "gradient_direction": gradient_direction,
        "gradient_x_normalized": grad_x_normalized,
        "gradient_y_normalized": grad_y_normalized,
        "gradient_left_to_right_normalized": grad_l2r_normalized,
        "gradient_right_to_left_normalized": grad_r2l_normalized,
        "gradient_mag_normalized": grad_mag_normalized,
        "kernel_size": kernel_size,
        "filter_orientation": filter_orientation,
        "roi_split_x": roi_split_x,
    }

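The directional split that `apply_sobel_filters` performs can be illustrated without OpenCV. This sketch substitutes a central difference for `cv2.Sobel` (an assumption made only for self-containment; the sign behavior along x is the same) and shows how `np.maximum` separates rising transitions from falling ones:

```python
import numpy as np

# A bright band on a dark background: the finger's left boundary is a rising
# (dark -> bright) transition, the right boundary a falling one.
image = np.array([
    [10, 10, 200, 200, 200, 10, 10],
    [10, 10, 200, 200, 200, 10, 10],
], dtype=np.float64)

grad_x = np.zeros_like(image)
grad_x[:, 1:-1] = image[:, 2:] - image[:, :-2]   # Central difference along x

gradient_left_to_right = np.maximum(grad_x, 0.0)   # Rising transitions only
gradient_right_to_left = np.maximum(-grad_x, 0.0)  # Falling transitions only

print(int(np.argmax(gradient_left_to_right[0])))   # 1 (rising edge)
print(int(np.argmax(gradient_right_to_left[0])))   # 4 (falling edge)
```

Searching the left boundary only in the falling map and the right boundary only in the rising map (as the module does, with the ROI additionally split in half) keeps each scan from locking onto the opposite side's transition.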
472
+ gradient_data: Dict[str, Any],
473
+ roi_data: Dict[str, Any],
474
+ threshold: float = DEFAULT_GRADIENT_THRESHOLD,
475
+ expected_width_px: Optional[float] = None,
476
+ scale_px_per_cm: Optional[float] = None
477
+ ) -> Dict[str, Any]:
478
+ """
479
+ Detect left and right finger edges for each row (cross-section).
480
+
481
+ Uses mask-constrained mode when roi_mask is available:
482
+ 1. Find leftmost/rightmost mask pixels (anatomical finger boundaries)
483
+ 2. Search for gradient peaks within ±10px of mask boundaries
484
+ 3. Combines anatomical accuracy with sub-pixel gradient precision
485
+
486
+ Falls back to axis-expansion mode when no mask:
487
+ 1. Start at finger axis (guaranteed inside finger)
488
+ 2. Expand left/right to find nearest salient edges
489
+ 3. Validate width is within realistic range
490
+
491
+ Args:
492
+ gradient_data: Output from apply_sobel_filters()
493
+ roi_data: Output from extract_ring_zone_roi()
494
+ threshold: Minimum gradient magnitude for valid edge
495
+ expected_width_px: Expected finger width from contour (optional)
496
+ scale_px_per_cm: Scale factor for width validation (optional)
497
+
498
+ Returns:
499
+ Dictionary containing:
500
+ - left_edges: Array of left edge x-coordinates (one per row)
501
+ - right_edges: Array of right edge x-coordinates (one per row)
502
+ - edge_strengths_left: Gradient magnitude at left edges
503
+ - edge_strengths_right: Gradient magnitude at right edges
504
+ - valid_rows: Boolean mask of rows with successful detection
505
+ - num_valid_rows: Count of successful detections
506
+ - mode_used: "mask_constrained" or "axis_expansion"
507
+ """
508
+ gradient_magnitude = gradient_data["gradient_magnitude"]
509
+ gradient_left_to_right = gradient_data.get("gradient_left_to_right")
510
+ gradient_right_to_left = gradient_data.get("gradient_right_to_left")
511
+ filter_orientation = gradient_data["filter_orientation"]
512
+
513
+ h, w = gradient_magnitude.shape
514
+
515
+ # Calculate realistic finger width range in pixels
516
+ min_width_px = None
517
+ max_width_px = None
518
+ if scale_px_per_cm is not None:
519
+ min_width_px = MIN_FINGER_WIDTH_CM * scale_px_per_cm
520
+ max_width_px = MAX_FINGER_WIDTH_CM * scale_px_per_cm
521
+ logger.debug(f"Width constraint: {min_width_px:.1f}-{max_width_px:.1f}px ({MIN_FINGER_WIDTH_CM}-{MAX_FINGER_WIDTH_CM}cm)")
522
+ elif expected_width_px is not None:
523
+ # Use expected width with tolerance
524
+ min_width_px = expected_width_px * (1 - WIDTH_TOLERANCE_FACTOR)
525
+ max_width_px = expected_width_px * (1 + WIDTH_TOLERANCE_FACTOR)
526
+ logger.debug(f"Width constraint: {min_width_px:.1f}-{max_width_px:.1f}px (±{WIDTH_TOLERANCE_FACTOR*100}% of expected)")
527
+ else:
528
+ logger.debug("No width constraint (scale and expected width both None)")
529
+
530
+ # Get axis information - this is our strong anchor point (INSIDE the finger)
531
+ axis_center = roi_data.get("axis_center_in_roi")
532
+ axis_direction = roi_data.get("axis_direction_in_roi")
533
+ zone_start = roi_data.get("zone_start_in_roi")
534
+ zone_end = roi_data.get("zone_end_in_roi")
535
+
536
+ # Get finger mask for constrained edge detection (if available)
537
+ roi_mask = roi_data.get("roi_mask")
538
+ mode_used = "mask_constrained" if roi_mask is not None else "axis_expansion"
539
+
540
+ if roi_mask is not None:
541
+ logger.debug(f"Using MASK-CONSTRAINED edge detection (mask shape: {roi_mask.shape})")
542
+ else:
543
+ logger.debug("Using AXIS-EXPANSION edge detection (no mask available)")
544
+
545
+ # For horizontal filter orientation (detecting left/right edges)
546
+ # Process each row to find left and right edges
547
+ if filter_orientation == "horizontal":
548
+ num_rows = h
549
+ left_edges = np.full(num_rows, -1.0, dtype=np.float32)
550
+ right_edges = np.full(num_rows, -1.0, dtype=np.float32)
551
+ edge_strengths_left = np.zeros(num_rows, dtype=np.float32)
552
+ edge_strengths_right = np.zeros(num_rows, dtype=np.float32)
553
+ valid_rows = np.zeros(num_rows, dtype=bool)
554
+
555
+ for row in range(num_rows):
556
+ # Get axis position (our anchor point INSIDE the finger)
557
+ axis_x = _get_axis_x_at_row(row, axis_center, axis_direction, w)
558
+
559
+ # Get gradient for this row
560
+ row_gradient = gradient_magnitude[row, :]
561
+ row_gradient_l2r = gradient_left_to_right[row, :] if gradient_left_to_right is not None else None
562
+ row_gradient_r2l = gradient_right_to_left[row, :] if gradient_right_to_left is not None else None
563
+
564
+ # Get mask for this row (if available)
565
+ row_mask = roi_mask[row, :] if roi_mask is not None else None
566
+
567
+ # Find edges using mask-constrained or axis-expansion method
568
+ result = _find_edges_from_axis(row_gradient, row, axis_x, threshold,
569
+ min_width_px, max_width_px, row_mask,
570
+ row_gradient_left_to_right=row_gradient_l2r,
571
+ row_gradient_right_to_left=row_gradient_r2l)
572
+
573
+ if result is None:
574
+ continue # No valid edges found
575
+
576
+ left_edge_x, right_edge_x, left_strength, right_strength = result
577
+
578
+ # Mark as valid
579
+ left_edges[row] = float(left_edge_x)
580
+ right_edges[row] = float(right_edge_x)
581
+ edge_strengths_left[row] = left_strength
582
+ edge_strengths_right[row] = right_strength
583
+ valid_rows[row] = True
584
+
585
+ else:
586
+ # Vertical filter orientation (detecting top/bottom edges).
+ # Process each column; note that in this branch the left_*/right_* arrays
+ # hold top/bottom edges and valid_rows indexes columns, so the output
+ # schema stays identical to the horizontal branch.
588
+ num_cols = w
589
+ left_edges = np.full(num_cols, -1.0, dtype=np.float32)
590
+ right_edges = np.full(num_cols, -1.0, dtype=np.float32)
591
+ edge_strengths_left = np.zeros(num_cols, dtype=np.float32)
592
+ edge_strengths_right = np.zeros(num_cols, dtype=np.float32)
593
+ valid_rows = np.zeros(num_cols, dtype=bool)
594
+
595
+ roi_center_y = h / 2.0
596
+
597
+ for col in range(num_cols):
598
+ col_gradient = gradient_magnitude[:, col]
599
+
600
+ strong_edges = np.where(col_gradient > threshold)[0]
601
+
602
+ if len(strong_edges) < 2:
603
+ continue
604
+
605
+ top_candidates = strong_edges[strong_edges < roi_center_y]
606
+ bottom_candidates = strong_edges[strong_edges >= roi_center_y]
607
+
608
+ if len(top_candidates) == 0 or len(bottom_candidates) == 0:
609
+ continue
610
+
611
+ # Select edges closest to center (finger boundaries)
612
+ top_edge_y = top_candidates[-1] # Bottommost of top candidates
613
+ bottom_edge_y = bottom_candidates[0] # Topmost of bottom candidates
614
+
615
+ top_strength = col_gradient[top_edge_y]
616
+ bottom_strength = col_gradient[bottom_edge_y]
617
+
618
+ height = bottom_edge_y - top_edge_y
619
+
620
+ if expected_width_px is not None:
+ # Reject column heights outside ±50% of the expected finger width
+ if height < expected_width_px * 0.5 or height > expected_width_px * 1.5:
+ continue
623
+
624
+ left_edges[col] = float(top_edge_y)
625
+ right_edges[col] = float(bottom_edge_y)
626
+ edge_strengths_left[col] = top_strength
627
+ edge_strengths_right[col] = bottom_strength
628
+ valid_rows[col] = True
629
+
630
+ num_valid = np.sum(valid_rows)
631
+
632
+ return {
633
+ "left_edges": left_edges,
634
+ "right_edges": right_edges,
635
+ "edge_strengths_left": edge_strengths_left,
636
+ "edge_strengths_right": edge_strengths_right,
637
+ "valid_rows": valid_rows,
638
+ "num_valid_rows": int(num_valid),
639
+ "filter_orientation": filter_orientation,
640
+ "mode_used": mode_used, # "mask_constrained" or "axis_expansion"
641
+ }
642
+
643
+
644
+ def refine_edge_subpixel(
645
+ gradient_magnitude: np.ndarray,
646
+ edge_positions: np.ndarray,
647
+ valid_mask: np.ndarray,
648
+ method: str = "parabola"
649
+ ) -> np.ndarray:
650
+ """
651
+ Refine edge positions to sub-pixel precision.
652
+
653
+ Uses parabola fitting on gradient magnitude to find peak position
654
+ with <0.5 pixel accuracy.
655
+
656
+ Args:
657
+ gradient_magnitude: 2D gradient magnitude array
658
+ edge_positions: Integer edge positions (one per row/col)
659
+ valid_mask: Boolean mask indicating which positions are valid
660
+ method: Refinement method ("parabola" or "gaussian")
661
+
662
+ Returns:
+ Refined edge positions (float, sub-pixel precision)
+
+ Note:
+ Refinement assumes row-wise (horizontal) edges: each entry in
+ edge_positions indexes a column of gradient_magnitude.
+ """
665
+ refined_positions = edge_positions.copy()
666
+
667
+ if method == "parabola":
668
+ # Parabola fitting: fit f(x) = ax^2 + bx + c to 3 points
669
+ # Peak at x = -b/(2a)
670
+
671
+ for i in range(len(edge_positions)):
672
+ if not valid_mask[i]:
673
+ continue
674
+
675
+ edge_pos = int(edge_positions[i])
676
+
677
+ # Get gradient magnitude at edge and neighbors
678
+ # Skip positions at image boundaries where a 3-point neighborhood is unavailable
679
+ if edge_pos <= 0 or edge_pos >= gradient_magnitude.shape[1] - 1:
680
+ continue # Can't refine at image boundaries
681
+
682
+ # For horizontal orientation (row-wise edge detection)
683
+ if len(gradient_magnitude.shape) == 2 and i < gradient_magnitude.shape[0]:
684
+ # Sample gradient at x-1, x, x+1
685
+ x_minus = edge_pos - 1
686
+ x_center = edge_pos
687
+ x_plus = edge_pos + 1
688
+
689
+ g_minus = gradient_magnitude[i, x_minus]
690
+ g_center = gradient_magnitude[i, x_center]
691
+ g_plus = gradient_magnitude[i, x_plus]
692
+
693
+ # Fit parabola: f(x) = ax^2 + bx + c
694
+ # Using x = -1, 0, 1 for simplicity
695
+ # f(-1) = a - b + c = g_minus
696
+ # f(0) = c = g_center
697
+ # f(1) = a + b + c = g_plus
698
+
699
+ c = g_center
700
+ a = (g_plus + g_minus - 2 * c) / 2.0
701
+ b = (g_plus - g_minus) / 2.0
702
+
703
+ # Peak at x_peak = -b/(2a)
704
+ if abs(a) > MIN_PARABOLA_DENOMINATOR: # Avoid division by zero
705
+ x_peak = -b / (2.0 * a)
706
+
707
+ # Constrain to reasonable range
708
+ if abs(x_peak) <= MAX_SUBPIXEL_OFFSET:
709
+ refined_positions[i] = edge_pos + x_peak
710
+
711
+ elif method == "gaussian":
712
+ # Gaussian fitting over a 5-pixel window is not implemented yet.
+ # Fall back to parabola fitting so callers still get refined positions.
+ logger.warning("Gaussian sub-pixel refinement not implemented; falling back to parabola")
+ return refine_edge_subpixel(gradient_magnitude, edge_positions, valid_mask, method="parabola")
716
+
717
+ else:
718
+ raise ValueError(f"Unknown refinement method: {method}")
719
+
720
+ return refined_positions
721
+
722
+
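The parabola fit above has a closed-form solution; the following standalone sketch (the helper name `parabola_peak_offset` is ours, not part of the module) shows the same arithmetic on a single 3-sample neighborhood:

```python
def parabola_peak_offset(g_minus: float, g_center: float, g_plus: float,
                         max_offset: float = 0.5, eps: float = 1e-6) -> float:
    """Sub-pixel offset of a gradient peak from samples at x = -1, 0, +1."""
    a = (g_plus + g_minus - 2.0 * g_center) / 2.0
    b = (g_plus - g_minus) / 2.0
    if abs(a) < eps:          # flat neighborhood: no refinement possible
        return 0.0
    x_peak = -b / (2.0 * a)   # vertex of f(x) = a*x^2 + b*x + c
    return x_peak if abs(x_peak) <= max_offset else 0.0

print(parabola_peak_offset(10.0, 20.0, 10.0))  # 0.0 (symmetric peak, already centered)
```

An asymmetric neighborhood such as `(10, 20, 15)` shifts the peak toward the stronger side by `1/6` of a pixel, which stays inside the `±0.5` clamp.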
723
+ def measure_width_from_edges(
724
+ edge_data: Dict[str, Any],
725
+ roi_data: Dict[str, Any],
726
+ scale_px_per_cm: float,
727
+ gradient_data: Optional[Dict[str, Any]] = None,
728
+ use_subpixel: bool = True
729
+ ) -> Dict[str, Any]:
730
+ """
731
+ Compute finger width from detected edges.
732
+
733
+ Steps:
734
+ 1. Apply sub-pixel refinement if gradient data available
735
+ 2. Calculate width for each valid row: width_px = right_edge - left_edge
736
+ 3. Filter outliers (>3 MAD from median)
737
+ 4. Compute statistics (median, mean, std)
738
+ 5. Convert width from pixels to cm
739
+
740
+ Args:
741
+ edge_data: Output from detect_edges_per_row()
742
+ roi_data: Output from extract_ring_zone_roi()
743
+ scale_px_per_cm: Pixels per cm from card detection
744
+ gradient_data: Optional gradient data for sub-pixel refinement
745
+ use_subpixel: Enable sub-pixel refinement (default True)
746
+
747
+ Returns:
748
+ Dictionary containing:
749
+ - widths_px: Array of width measurements (pixels)
750
+ - median_width_px: Median width in pixels
751
+ - median_width_cm: Median width in cm (final measurement)
752
+ - mean_width_px: Mean width in pixels
753
+ - std_width_px: Standard deviation of widths
754
+ - num_samples: Number of valid width measurements
755
+ - outliers_removed: Number of outliers filtered
756
+ - subpixel_refinement_used: Whether sub-pixel refinement was applied
757
+ """
758
+ left_edges = edge_data["left_edges"].copy()
759
+ right_edges = edge_data["right_edges"].copy()
760
+ valid_rows = edge_data["valid_rows"]
761
+
762
+ # Apply sub-pixel refinement if available
763
+ subpixel_used = False
764
+ if use_subpixel and gradient_data is not None:
765
+ try:
766
+ gradient_magnitude = gradient_data["gradient_magnitude"]
767
+
768
+ # Refine left edges
769
+ left_edges = refine_edge_subpixel(
770
+ gradient_magnitude, left_edges, valid_rows, method="parabola"
771
+ )
772
+
773
+ # Refine right edges
774
+ right_edges = refine_edge_subpixel(
775
+ gradient_magnitude, right_edges, valid_rows, method="parabola"
776
+ )
777
+
778
+ subpixel_used = True
779
+ except Exception as e:
780
+ logger.warning(f"Sub-pixel refinement failed: {e}, using integer positions")
781
+ # Fall back to integer positions
782
+ left_edges = edge_data["left_edges"]
783
+ right_edges = edge_data["right_edges"]
784
+
785
+ # Calculate widths for valid rows
786
+ widths_px = []
787
+ for i in range(len(valid_rows)):
788
+ if valid_rows[i]:
789
+ width = right_edges[i] - left_edges[i]
790
+ if width > 0:
791
+ widths_px.append(width)
792
+
793
+ if len(widths_px) == 0:
794
+ raise ValueError("No valid width measurements found")
795
+
796
+ widths_px = np.array(widths_px)
797
+
798
+ # Filter outliers using median absolute deviation (MAD)
799
+ median = np.median(widths_px)
800
+ mad = np.median(np.abs(widths_px - median))
801
+
802
+ # Outliers are >3 MAD from median (more robust than std dev)
803
+ if mad > 0:
804
+ is_outlier = np.abs(widths_px - median) > (MAD_OUTLIER_THRESHOLD * mad)
805
+ widths_filtered = widths_px[~is_outlier]
806
+ outliers_removed = np.sum(is_outlier)
807
+ else:
808
+ widths_filtered = widths_px
809
+ outliers_removed = 0
810
+
811
+ if len(widths_filtered) == 0:
812
+ # All measurements were outliers, use original
813
+ widths_filtered = widths_px
814
+ outliers_removed = 0
815
+
816
+ # Calculate statistics
817
+ median_width_px = float(np.median(widths_filtered))
818
+ mean_width_px = float(np.mean(widths_filtered))
819
+ std_width_px = float(np.std(widths_filtered))
820
+
821
+ # Convert to cm
822
+ median_width_cm = median_width_px / scale_px_per_cm
823
+
824
+ # Log measurements
825
+ logger.debug(f"Raw median width: {median_width_px:.2f}px, scale: {scale_px_per_cm:.2f} px/cm → {median_width_cm:.4f}cm")
826
+ logger.debug(f"Width range: {np.min(widths_filtered):.1f}-{np.max(widths_filtered):.1f}px, std: {std_width_px:.1f}px")
827
+
828
+ return {
829
+ "widths_px": widths_filtered.tolist(),
830
+ "median_width_px": median_width_px,
831
+ "median_width_cm": median_width_cm,
832
+ "mean_width_px": mean_width_px,
833
+ "std_width_px": std_width_px,
834
+ "num_samples": len(widths_filtered),
835
+ "outliers_removed": int(outliers_removed),
836
+ "subpixel_refinement_used": subpixel_used,
837
+ }
838
+
839
+
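The MAD-based outlier step can be exercised in isolation. A minimal sketch (the `mad_filter` helper is ours, mirroring the logic above with the 3-MAD threshold):

```python
import numpy as np

def mad_filter(values, threshold: float = 3.0) -> np.ndarray:
    """Keep values within `threshold` median-absolute-deviations of the median."""
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    if mad == 0:
        return values  # (near-)constant data: nothing to filter
    return values[np.abs(values - median) <= threshold * mad]

print(mad_filter([100, 101, 99, 100, 102, 160]))  # the spurious 160 px width is dropped
```

MAD is preferred over the standard deviation here because a single gross outlier inflates the standard deviation but barely moves the median of absolute deviations.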
840
+ def compute_edge_quality_score(
841
+ gradient_data: Dict[str, Any],
842
+ edge_data: Dict[str, Any],
843
+ width_data: Dict[str, Any]
844
+ ) -> Dict[str, Any]:
845
+ """
846
+ Assess quality of edge detection for confidence scoring.
847
+
848
+ Computes 4 quality metrics:
849
+ 1. Gradient strength: Average gradient magnitude at detected edges
850
+ 2. Edge consistency: Percentage of rows with valid edge pairs
851
+ 3. Edge smoothness: Variance of edge positions along finger
852
+ 4. Bilateral symmetry: Correlation between left/right edge quality
853
+
854
+ Args:
855
+ gradient_data: Output from apply_sobel_filters()
856
+ edge_data: Output from detect_edges_per_row()
857
+ width_data: Output from measure_width_from_edges()
858
+
859
+ Returns:
860
+ Dictionary containing:
861
+ - overall_score: Weighted average (0-1)
862
+ - gradient_strength_score: Gradient strength metric (0-1)
863
+ - consistency_score: Edge detection success rate (0-1)
864
+ - smoothness_score: Edge position smoothness (0-1)
865
+ - symmetry_score: Left/right balance (0-1)
866
+ - metrics: Dict with raw metric values
867
+ """
868
+ gradient_magnitude = gradient_data["gradient_magnitude"]
869
+ left_edges = edge_data["left_edges"]
870
+ right_edges = edge_data["right_edges"]
871
+ valid_rows = edge_data["valid_rows"]
872
+ edge_strengths_left = edge_data["edge_strengths_left"]
873
+ edge_strengths_right = edge_data["edge_strengths_right"]
874
+
875
+ # Metric 1: Gradient Strength
876
+ # Average gradient magnitude at detected edges, normalized
877
+ valid_left_strengths = edge_strengths_left[valid_rows]
878
+ valid_right_strengths = edge_strengths_right[valid_rows]
879
+
880
+ if len(valid_left_strengths) > 0:
881
+ avg_gradient_strength = (np.mean(valid_left_strengths) + np.mean(valid_right_strengths)) / 2.0
882
+ # Normalize: typical strong edge is 20-50, weak is <10
883
+ gradient_strength_score = min(avg_gradient_strength / GRADIENT_STRENGTH_NORMALIZER, 1.0)
884
+ else:
885
+ avg_gradient_strength = 0.0
886
+ gradient_strength_score = 0.0
887
+
888
+ # Metric 2: Edge Consistency
889
+ # Percentage of rows with valid edge pairs
890
+ total_rows = len(valid_rows)
891
+ num_valid = np.sum(valid_rows)
892
+ consistency_score = num_valid / total_rows if total_rows > 0 else 0.0
893
+
894
+ # Metric 3: Edge Smoothness
895
+ # Measure variance of edge positions (smoother = better)
896
+ # Lower variance = higher score
897
+ if num_valid > 1:
898
+ # Calculate variance of left and right edges separately
899
+ valid_left = left_edges[valid_rows]
900
+ valid_right = right_edges[valid_rows]
901
+
902
+ left_variance = np.var(valid_left)
903
+ right_variance = np.var(valid_right)
904
+ avg_variance = (left_variance + right_variance) / 2.0
905
+
906
+ # Normalize: typical finger has variance <100, noisy edges >500
907
+ smoothness_score = np.exp(-avg_variance / SMOOTHNESS_VARIANCE_NORMALIZER)
908
+ else:
909
+ avg_variance = 0.0
910
+ smoothness_score = 0.0
911
+
912
+ # Metric 4: Bilateral Symmetry
913
+ # Correlation between left and right edge quality (strength balance)
914
+ if len(valid_left_strengths) > 1:
915
+ # Calculate ratio of average strengths
916
+ avg_left = np.mean(valid_left_strengths)
917
+ avg_right = np.mean(valid_right_strengths)
918
+
919
+ if avg_left > 0 and avg_right > 0:
920
+ # Symmetric ratio close to 1.0 is good
921
+ ratio = min(avg_left, avg_right) / max(avg_left, avg_right)
922
+ symmetry_score = ratio # Already 0-1
923
+ else:
924
+ symmetry_score = 0.0
925
+ else:
926
+ symmetry_score = 0.0
927
+
928
+ # Weighted overall score
929
+ overall_score = (
930
+ QUALITY_WEIGHT_GRADIENT * gradient_strength_score +
931
+ QUALITY_WEIGHT_CONSISTENCY * consistency_score +
932
+ QUALITY_WEIGHT_SMOOTHNESS * smoothness_score +
933
+ QUALITY_WEIGHT_SYMMETRY * symmetry_score
934
+ )
935
+
936
+ return {
937
+ "overall_score": float(overall_score),
938
+ "gradient_strength_score": float(gradient_strength_score),
939
+ "consistency_score": float(consistency_score),
940
+ "smoothness_score": float(smoothness_score),
941
+ "symmetry_score": float(symmetry_score),
942
+ "metrics": {
943
+ "avg_gradient_strength": float(avg_gradient_strength),
944
+ "edge_consistency_pct": float(consistency_score * 100),
945
+ "avg_variance": float(avg_variance) if num_valid > 1 else 0.0,
946
+ "left_right_strength_ratio": float(symmetry_score),
947
+ }
948
+ }
949
+
950
+
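The weighted combination can be checked standalone. A small sketch in which `WEIGHTS` restates the `QUALITY_WEIGHT_*` constants (gradient 0.4, consistency 0.3, smoothness 0.2, symmetry 0.1) and `overall_score` is our own helper, not the module's API:

```python
# Illustrative mirror of the quality-score weighting.
WEIGHTS = {"gradient": 0.4, "consistency": 0.3, "smoothness": 0.2, "symmetry": 0.1}

def overall_score(scores: dict) -> float:
    """Weighted average of the four per-metric scores (each in [0, 1])."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9  # weights must sum to 1
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)

print(overall_score({"gradient": 1.0, "consistency": 0.8,
                     "smoothness": 0.5, "symmetry": 1.0}))  # ~0.84
```

Because the weights sum to 1, the overall score stays in [0, 1] whenever each component does.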
951
+ def should_use_sobel_measurement(
952
+ sobel_result: Dict[str, Any],
953
+ contour_result: Optional[Dict[str, Any]] = None,
954
+ min_quality_score: float = MIN_QUALITY_SCORE_THRESHOLD,
955
+ min_consistency: float = MIN_CONSISTENCY_THRESHOLD,
956
+ max_difference_pct: float = MAX_CONTOUR_DIFFERENCE_PCT
957
+ ) -> Tuple[bool, str]:
958
+ """
959
+ Decide whether to use Sobel measurement or fall back to contour.
960
+
961
+ Decision criteria:
962
+ 1. Edge quality score >= min_quality_score (default MIN_QUALITY_SCORE_THRESHOLD, 0.65)
+ 2. Edge consistency >= min_consistency (default MIN_CONSISTENCY_THRESHOLD, 0.30)
964
+ 3. If contour available: Sobel and contour agree within max_difference_pct
965
+
966
+ Args:
967
+ sobel_result: Output from refine_edges_sobel()
968
+ contour_result: Optional output from compute_cross_section_width()
969
+ min_quality_score: Minimum acceptable quality score
970
+ min_consistency: Minimum edge detection success rate
971
+ max_difference_pct: Maximum allowed difference from contour (%)
972
+
973
+ Returns:
974
+ Tuple of (should_use_sobel, reason)
975
+ """
976
+ # Check if edge quality data available
977
+ if "edge_quality" not in sobel_result:
978
+ return False, "edge_quality_data_missing"
979
+
980
+ edge_quality = sobel_result["edge_quality"]
981
+
982
+ # Check 1: Overall quality score
983
+ if edge_quality["overall_score"] < min_quality_score:
984
+ return False, f"quality_score_low_{edge_quality['overall_score']:.2f}"
985
+
986
+ # Check 2: Consistency (success rate)
987
+ if edge_quality["consistency_score"] < min_consistency:
988
+ return False, f"consistency_low_{edge_quality['consistency_score']:.2f}"
989
+
990
+ # Check 3: Measurement reasonableness
991
+ sobel_width = sobel_result.get("median_width_cm")
992
+ if sobel_width is None or sobel_width <= 0:
993
+ return False, "invalid_measurement"
994
+
995
+ # Typical finger width range
996
+ if sobel_width < MIN_REALISTIC_WIDTH_CM or sobel_width > MAX_REALISTIC_WIDTH_CM:
997
+ return False, f"unrealistic_width_{sobel_width:.2f}cm"
998
+
999
+ # Check 4: Agreement with contour (if available)
1000
+ if contour_result is not None:
1001
+ contour_width = contour_result.get("median_width_px")
1002
+ sobel_width_px = sobel_result.get("median_width_px")
1003
+
1004
+ if contour_width and sobel_width_px:
1005
+ diff_pct = abs(sobel_width_px - contour_width) / contour_width * 100
1006
+
1007
+ if diff_pct > max_difference_pct:
1008
+ return False, f"disagrees_with_contour_{diff_pct:.1f}pct"
1009
+
1010
+ # All checks passed
1011
+ return True, "quality_acceptable"
1012
+
1013
+
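Check 4 reduces to a relative-difference gate. A hedged sketch of that comparison (the helper name and the 12% default are illustrative only; the module's actual limit is `MAX_CONTOUR_DIFFERENCE_PCT`, defined in the constants file):

```python
def widths_agree(sobel_px: float, contour_px: float, max_diff_pct: float = 12.0) -> bool:
    """Accept the Sobel width only if it is within max_diff_pct of the contour width."""
    if not contour_px or not sobel_px:
        return True  # no baseline to compare against; let the other checks decide
    diff_pct = abs(sobel_px - contour_px) / contour_px * 100.0
    return diff_pct <= max_diff_pct

print(widths_agree(105.0, 100.0))  # True  (5% apart)
print(widths_agree(130.0, 100.0))  # False (30% apart)
```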
1014
+ def refine_edges_sobel(
1015
+ image: np.ndarray,
1016
+ axis_data: Dict[str, Any],
1017
+ zone_data: Dict[str, Any],
1018
+ scale_px_per_cm: float,
1019
+ finger_landmarks: Optional[np.ndarray] = None,
1020
+ sobel_threshold: float = DEFAULT_GRADIENT_THRESHOLD,
1021
+ kernel_size: int = DEFAULT_KERNEL_SIZE,
1022
+ rotate_align: bool = False,
1023
+ use_subpixel: bool = True,
1024
+ expected_width_px: Optional[float] = None,
1025
+ debug_dir: Optional[str] = None,
1026
+ ) -> Dict[str, Any]:
1027
+ """
1028
+ Main entry point for Sobel-based edge refinement.
1029
+
1030
+ Replaces contour-based width measurement with gradient-based edge detection.
1031
+
1032
+ Pipeline:
1033
+ 1. Extract ROI around ring zone
1034
+ 2. Apply bidirectional Sobel filters
1035
+ 3. Detect left/right edges per row
1036
+ 4. Measure width from edges
1037
+ 5. Convert to cm and return measurement
1038
+
1039
+ Args:
1040
+ image: Input BGR image
1041
+ axis_data: Output from estimate_finger_axis()
1042
+ zone_data: Output from localize_ring_zone()
1043
+ scale_px_per_cm: Pixels per cm from card detection
1044
+ finger_landmarks: Optional 4x2 array of finger landmarks for debug
1045
+ sobel_threshold: Minimum gradient magnitude for valid edge
1046
+ kernel_size: Sobel kernel size (3, 5, or 7)
1047
+ rotate_align: Rotate ROI for vertical finger alignment
1048
+ use_subpixel: Enable sub-pixel edge localization
1049
+ expected_width_px: Expected width for validation (optional)
1050
+ debug_dir: Directory to save debug visualizations (None to skip)
1051
+
1052
+ Returns:
1053
+ Dictionary containing:
1054
+ - median_width_cm: Final measurement in cm
1055
+ - median_width_px: Measurement in pixels
1056
+ - std_width_px: Standard deviation
1057
+ - num_samples: Number of valid measurements
1058
+ - edge_detection_success_rate: % of rows with valid edges
1059
+ - roi_data: ROI extraction data
1060
+ - gradient_data: Sobel filter data
1061
+ - edge_data: Edge detection data
1062
+ - method: "sobel"
1063
+ """
1064
+ # Initialize debug observer if debug_dir provided
1065
+ if debug_dir:
1066
+ from src.debug_observer import DebugObserver, draw_landmark_axis, draw_ring_zone_roi
1067
+ from src.debug_observer import draw_roi_extraction, draw_gradient_visualization
1068
+ from src.debug_observer import draw_edge_candidates, draw_filtered_edge_candidates
1069
+ from src.debug_observer import draw_selected_edges
1070
+ from src.debug_observer import draw_width_measurements, draw_outlier_detection
1071
+ from src.debug_observer import draw_comprehensive_edge_overlay
1072
+ observer = DebugObserver(debug_dir)
1073
+
1074
+ # Stage A: Axis & Zone Visualization
1075
+ if debug_dir:
1076
+ # A.1: Landmark axis
1077
+ observer.draw_and_save("01_landmark_axis", image, draw_landmark_axis, axis_data, finger_landmarks)
1078
+
1079
+ # A.2: Ring zone + ROI bounds (need to extract bounds first)
1080
+ # We'll save this after ROI extraction
1081
+
1082
+ # Step 1: Extract ROI
1083
+ roi_data = extract_ring_zone_roi(
1084
+ image, axis_data, zone_data,
1085
+ rotate_align=rotate_align
1086
+ )
1087
+
1088
+ logger.debug(f"ROI size: {roi_data['roi_width']}x{roi_data['roi_height']}px")
1089
+ logger.debug(f"ROI bounds: {roi_data['roi_bounds']}")
1090
+
1091
+ if debug_dir:
1092
+ # A.2: Ring zone + ROI bounds
1093
+ roi_bounds = roi_data["roi_bounds"]
1094
+ observer.draw_and_save("02_ring_zone_roi", image, draw_ring_zone_roi, zone_data, roi_bounds)
1095
+
1096
+ # A.3: ROI extraction
1097
+ observer.draw_and_save("03_roi_extraction", roi_data["roi_image"], draw_roi_extraction, roi_data.get("roi_mask"))
1098
+
1099
+ # Step 2: Apply Sobel filters
1100
+ gradient_data = apply_sobel_filters(
1101
+ roi_data["roi_image"],
1102
+ kernel_size=kernel_size,
1103
+ axis_direction="auto"
1104
+ )
1105
+
1106
+ if debug_dir:
1107
+ # Stage B: Sobel Filtering
1108
+ # B.1: Left-to-right gradient
1109
+ grad_left = draw_gradient_visualization(gradient_data["gradient_left_to_right"], cv2.COLORMAP_JET)
1110
+ observer.save_stage("04_sobel_left_to_right", grad_left)
1111
+
1112
+ # B.2: Right-to-left gradient
1113
+ grad_right = draw_gradient_visualization(gradient_data["gradient_right_to_left"], cv2.COLORMAP_JET)
1114
+ observer.save_stage("05_sobel_right_to_left", grad_right)
1115
+
1116
+ # B.3: Gradient magnitude
1117
+ grad_mag = draw_gradient_visualization(gradient_data["gradient_magnitude"], cv2.COLORMAP_HOT)
1118
+ observer.save_stage("06_gradient_magnitude", grad_mag)
1119
+
1120
+ # Step 3: Detect edges per row
1121
+ edge_data = detect_edges_per_row(
1122
+ gradient_data, roi_data,
1123
+ threshold=sobel_threshold,
1124
+ expected_width_px=expected_width_px,
1125
+ scale_px_per_cm=scale_px_per_cm
1126
+ )
1127
+
1128
+ logger.debug(f"Valid rows: {edge_data['num_valid_rows']}/{len(edge_data['valid_rows'])} ({edge_data['num_valid_rows']/len(edge_data['valid_rows'])*100:.1f}%)")
1129
+ if edge_data['num_valid_rows'] > 0:
1130
+ valid_left = edge_data['left_edges'][edge_data['valid_rows']]
1131
+ valid_right = edge_data['right_edges'][edge_data['valid_rows']]
1132
+ logger.debug(f"Left edges range: {np.min(valid_left):.1f}-{np.max(valid_left):.1f}px")
1133
+ logger.debug(f"Right edges range: {np.min(valid_right):.1f}-{np.max(valid_right):.1f}px")
1134
+ widths = valid_right - valid_left
1135
+ logger.debug(f"Raw widths range: {np.min(widths):.1f}-{np.max(widths):.1f}px, median: {np.median(widths):.1f}px")
1136
+
1137
+ if debug_dir:
1138
+ # B.4a: All edge candidates (raw threshold, shows noise)
1139
+ observer.draw_and_save("07a_all_candidates", roi_data["roi_image"],
1140
+ draw_edge_candidates, gradient_data["gradient_magnitude"], sobel_threshold)
1141
+
1142
+ # B.4b: Filtered edge candidates (spatially-filtered, what algorithm uses)
1143
+ observer.draw_and_save("07b_filtered_candidates", roi_data["roi_image"],
1144
+ draw_filtered_edge_candidates,
1145
+ gradient_data["gradient_magnitude"],
1146
+ sobel_threshold,
1147
+ roi_data.get("roi_mask"),
1148
+ roi_data["axis_center_in_roi"],
1149
+ roi_data["axis_direction_in_roi"])
1150
+
1151
+ # B.5: Selected edges (final detected edges)
1152
+ observer.draw_and_save("09_selected_edges", roi_data["roi_image"], draw_selected_edges, edge_data)
1153
+
1154
+ # Step 4: Measure width from edges (with sub-pixel refinement)
1155
+ width_data = measure_width_from_edges(
1156
+ edge_data, roi_data, scale_px_per_cm,
1157
+ gradient_data=gradient_data,
1158
+ use_subpixel=use_subpixel
1159
+ )
1160
+
1161
+ if debug_dir:
1162
+ # Stage C: Measurement
1163
+ # C.1: Sub-pixel refinement (use selected edges for now)
1164
+ observer.draw_and_save("10_subpixel_refinement", roi_data["roi_image"], draw_selected_edges, edge_data)
1165
+
1166
+ # C.2: Width measurements
1167
+ observer.draw_and_save("11_width_measurements", roi_data["roi_image"],
1168
+ draw_width_measurements, edge_data, width_data)
1169
+
1170
+ # C.3: Width distribution (histogram - requires matplotlib)
1171
+ try:
1172
+ _save_width_distribution(width_data, debug_dir)
1173
+ except Exception:
+ pass  # Debug plotting is best-effort; never fail the pipeline
1175
+
1176
+ # C.4: Outlier detection
1177
+ observer.draw_and_save("13_outlier_detection", roi_data["roi_image"],
1178
+ draw_outlier_detection, edge_data, width_data)
1179
+
1180
+ # C.5: Comprehensive edge overlay on full image
1181
+ observer.draw_and_save("14_comprehensive_overlay", image,
1182
+ draw_comprehensive_edge_overlay,
1183
+ edge_data, roi_data["roi_bounds"], axis_data, zone_data,
1184
+ width_data, scale_px_per_cm)
1185
+
1186
+ # Step 5: Compute edge quality score
1187
+ edge_quality = compute_edge_quality_score(
1188
+ gradient_data, edge_data, width_data
1189
+ )
1190
+
1191
+ # Calculate success rate
1192
+ total_rows = len(edge_data["valid_rows"])
1193
+ success_rate = edge_data["num_valid_rows"] / total_rows if total_rows > 0 else 0.0
1194
+
1195
+ # Combine results
1196
+ return {
1197
+ "median_width_cm": width_data["median_width_cm"],
1198
+ "median_width_px": width_data["median_width_px"],
1199
+ "mean_width_px": width_data["mean_width_px"],
1200
+ "std_width_px": width_data["std_width_px"],
1201
+ "num_samples": width_data["num_samples"],
1202
+ "outliers_removed": width_data["outliers_removed"],
1203
+ "subpixel_refinement_used": width_data["subpixel_refinement_used"],
1204
+ "edge_detection_success_rate": success_rate,
1205
+ "edge_quality": edge_quality,
1206
+ "roi_data": roi_data,
1207
+ "gradient_data": gradient_data,
1208
+ "edge_data": edge_data,
1209
+ "width_data": width_data,
1210
+ "method": "sobel",
1211
+ }
1212
+
1213
+
1214
+ def _save_width_distribution(width_data: Dict[str, Any], debug_dir: str) -> None:
1215
+ """Helper to save width distribution histogram."""
1216
+ import os
+
+ try:
+ import matplotlib
+ matplotlib.use('Agg')  # non-interactive backend; must be set before importing pyplot
+ import matplotlib.pyplot as plt
+ except ImportError:
+ return
1223
+
1224
+ widths_px = width_data.get("widths_px", [])
1225
+ if len(widths_px) == 0:
1226
+ return
1227
+
1228
+ median_width_px = width_data["median_width_px"]
1229
+ mean_width_px = width_data["mean_width_px"]
1230
+
1231
+ # Create histogram
1232
+ fig, ax = plt.subplots(figsize=(10, 6))
1233
+ ax.hist(widths_px, bins=30, color='skyblue', edgecolor='black', alpha=0.7)
1234
+ ax.axvline(median_width_px, color='red', linestyle='--', linewidth=2, label=f'Median: {median_width_px:.1f} px')
1235
+ ax.axvline(mean_width_px, color='orange', linestyle='--', linewidth=2, label=f'Mean: {mean_width_px:.1f} px')
1236
+
1237
+ ax.set_xlabel('Width (pixels)', fontsize=12)
1238
+ ax.set_ylabel('Frequency', fontsize=12)
1239
+ ax.set_title('Distribution of Cross-Section Widths', fontsize=14, fontweight='bold')
1240
+ ax.legend(fontsize=10)
1241
+ ax.grid(True, alpha=0.3)
1242
+
1243
+ # Save
1244
+ output_path = os.path.join(debug_dir, "12_width_distribution.png")
1245
+ plt.savefig(output_path, dpi=150, bbox_inches='tight')
1246
+ plt.close()
1247
+
1248
+
1249
+ def compare_edge_methods(
1250
+ contour_result: Dict[str, Any],
1251
+ sobel_result: Dict[str, Any],
1252
+ scale_px_per_cm: float
1253
+ ) -> Dict[str, Any]:
1254
+ """
1255
+ Compare contour-based and Sobel-based edge detection methods.
1256
+
1257
+ Provides detailed analysis of differences, quality metrics, and
1258
+ recommendation on which method to use.
1259
+
1260
+ Args:
1261
+ contour_result: Output from compute_cross_section_width()
1262
+ sobel_result: Output from refine_edges_sobel()
1263
+ scale_px_per_cm: Scale factor for unit conversion
1264
+
1265
+ Returns:
1266
+ Dictionary containing:
1267
+ - contour: Summary of contour method results
1268
+ - sobel: Summary of Sobel method results
1269
+ - difference: Comparison metrics
1270
+ - recommendation: Which method to use and why
1271
+ - quality_comparison: Quality metrics comparison
1272
+ """
1273
+ # Extract measurements
1274
+ contour_width_cm = contour_result["median_width_px"] / scale_px_per_cm
1275
+ sobel_width_cm = sobel_result["median_width_cm"]
1276
+
1277
+ contour_width_px = contour_result["median_width_px"]
1278
+ sobel_width_px = sobel_result["median_width_px"]
1279
+
1280
+ # Calculate differences
1281
+ diff_cm = sobel_width_cm - contour_width_cm
1282
+ diff_px = sobel_width_px - contour_width_px
1283
+ diff_pct = (diff_cm / contour_width_cm) * 100 if contour_width_cm > 0 else 0.0
1284
+
1285
+ # Quality comparison
1286
+ contour_cv = (contour_result["std_width_px"] / contour_result["median_width_px"]) if contour_result["median_width_px"] > 0 else 0.0
1287
+ sobel_cv = (sobel_result["std_width_px"] / sobel_result["median_width_px"]) if sobel_result["median_width_px"] > 0 else 0.0
1288
+
1289
+ # Determine recommendation
1290
+ should_use_sobel, reason = should_use_sobel_measurement(sobel_result, contour_result)
1291
+
1292
+ # Build summary
1293
+ result = {
1294
+ "contour": {
1295
+ "width_cm": float(contour_width_cm),
1296
+ "width_px": float(contour_width_px),
1297
+ "std_dev_px": float(contour_result["std_width_px"]),
1298
+ "coefficient_variation": float(contour_cv),
1299
+ "num_samples": int(contour_result["num_samples"]),
1300
+ "method": "contour",
1301
+ },
1302
+ "sobel": {
1303
+ "width_cm": float(sobel_width_cm),
1304
+ "width_px": float(sobel_width_px),
1305
+ "std_dev_px": float(sobel_result["std_width_px"]),
1306
+ "coefficient_variation": float(sobel_cv),
1307
+ "num_samples": int(sobel_result["num_samples"]),
1308
+ "subpixel_used": bool(sobel_result["subpixel_refinement_used"]),
1309
+ "success_rate": float(sobel_result["edge_detection_success_rate"]),
1310
+ "edge_quality_score": float(sobel_result["edge_quality"]["overall_score"]),
1311
+ "method": "sobel",
1312
+ },
1313
+ "difference": {
1314
+ "absolute_cm": float(diff_cm),
1315
+ "absolute_px": float(diff_px),
1316
+ "relative_pct": float(diff_pct),
1317
+ "precision_improvement": float(contour_result["std_width_px"] - sobel_result["std_width_px"]),
1318
+ },
1319
+ "recommendation": {
1320
+ "use_sobel": bool(should_use_sobel),
1321
+ "reason": str(reason),
1322
+ "preferred_method": "sobel" if should_use_sobel else "contour",
1323
+ },
1324
+ "quality_comparison": {
1325
+ "contour_cv": float(contour_cv),
1326
+ "sobel_cv": float(sobel_cv),
1327
+ "sobel_quality_score": float(sobel_result["edge_quality"]["overall_score"]),
1328
+ "sobel_gradient_strength": float(sobel_result["edge_quality"]["gradient_strength_score"]),
1329
+ "sobel_consistency": float(sobel_result["edge_quality"]["consistency_score"]),
1330
+ "sobel_smoothness": float(sobel_result["edge_quality"]["smoothness_score"]),
1331
+ "sobel_symmetry": float(sobel_result["edge_quality"]["symmetry_score"]),
1332
+ },
1333
+ }
1334
+
1335
+ return result
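The coefficient-of-variation comparison above can be reproduced standalone. A sketch with made-up sample widths (names and data are illustrative only; `statistics.pstdev` matches the population standard deviation used by `np.std`):

```python
import statistics

def coefficient_of_variation(widths) -> float:
    """std / median, the per-method noise metric used in the comparison."""
    med = statistics.median(widths)
    return statistics.pstdev(widths) / med if med > 0 else 0.0

contour_widths = [101.0, 99.0, 104.0, 96.0, 100.0]  # noisier measurements
sobel_widths = [100.2, 99.8, 100.4, 99.6, 100.0]    # tighter spread
print(coefficient_of_variation(contour_widths) > coefficient_of_variation(sobel_widths))  # True
```

A lower CV at the same median indicates less row-to-row noise, which is the basis of the `precision_improvement` field.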
src/edge_refinement_constants.py ADDED
@@ -0,0 +1,98 @@
+ """
+ Constants for Sobel edge refinement algorithm.
+
+ This module contains all configurable parameters and thresholds used
+ in the edge refinement pipeline to make them easy to tune and maintain.
+ """
+
+ # =============================================================================
+ # ROI Extraction Constants
+ # =============================================================================
+
+ # ROI padding around zone for gradient context
+ ROI_PADDING_PX = 50
+
+ # Finger width estimation factor (conservative to ensure full capture)
+ # Typical finger aspect ratio is 3:1 to 5:1 (length:width)
+ FINGER_WIDTH_RATIO = 3.0  # length / width
+
+
+ # =============================================================================
+ # Sobel Filter Constants
+ # =============================================================================
+
+ # Default Sobel kernel size
+ DEFAULT_KERNEL_SIZE = 3
+
+ # Valid kernel sizes
+ VALID_KERNEL_SIZES = [3, 5, 7]
+
+
+ # =============================================================================
+ # Edge Detection Constants
+ # =============================================================================
+
+ # Default gradient threshold for valid edge
+ DEFAULT_GRADIENT_THRESHOLD = 15.0
+
+ # Realistic finger width range for validation
+ # Based on typical adult finger widths across ring sizes
+ MIN_FINGER_WIDTH_CM = 1.6  # Size 6 (16mm)
+ MAX_FINGER_WIDTH_CM = 2.5  # Size 13 (23mm)
+
+ # Tolerance for expected width comparison (when contour available)
+ WIDTH_TOLERANCE_FACTOR = 0.25  # ±25%
+
+
+ # =============================================================================
+ # Sub-Pixel Refinement Constants
+ # =============================================================================
+
+ # Maximum sub-pixel refinement offset from integer position
+ MAX_SUBPIXEL_OFFSET = 0.5  # ±0.5 pixels
+
+ # Minimum denominator value to avoid division by zero in parabola fitting
+ MIN_PARABOLA_DENOMINATOR = 1e-6
+
+
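The two sub-pixel constants above are typically consumed by a three-point parabola fit over gradient magnitudes: the vertex of the parabola through the left, center, and right samples gives the fractional edge position. A minimal sketch of that refinement (the standalone `subpixel_offset` function here is illustrative, not the repo's actual code):

```python
def subpixel_offset(y_left: float, y_center: float, y_right: float,
                    max_offset: float = 0.5, min_denom: float = 1e-6) -> float:
    """Vertex of the parabola through three gradient samples, clamped to ±max_offset."""
    denom = y_left - 2.0 * y_center + y_right
    if abs(denom) < min_denom:
        return 0.0  # Flat neighborhood: keep the integer position
    offset = 0.5 * (y_left - y_right) / denom
    # Clamp so refinement never moves more than half a pixel
    return max(-max_offset, min(max_offset, offset))

print(subpixel_offset(10.0, 30.0, 20.0))  # small positive shift toward the right sample
```

The clamp mirrors `MAX_SUBPIXEL_OFFSET` and the `min_denom` guard mirrors `MIN_PARABOLA_DENOMINATOR`.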
+ # =============================================================================
+ # Outlier Filtering Constants
+ # =============================================================================
+
+ # MAD (Median Absolute Deviation) threshold multiplier
+ MAD_OUTLIER_THRESHOLD = 3.0  # Outliers are >3 MAD from median
+
+
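MAD-based filtering rejects samples whose absolute deviation from the median exceeds the threshold times the median absolute deviation. A short sketch of how `MAD_OUTLIER_THRESHOLD` would be applied to per-scanline width samples (the `mad_filter` helper is hypothetical):

```python
import numpy as np

def mad_filter(values: np.ndarray, threshold: float = 3.0) -> np.ndarray:
    """Keep values within `threshold` MADs of the median."""
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    if mad == 0:
        return values  # Degenerate case: no spread to measure
    keep = np.abs(values - median) <= threshold * mad
    return values[keep]

widths = np.array([1.8, 1.9, 1.85, 1.82, 9.0])  # 9.0 cm is an obvious outlier
print(mad_filter(widths))
```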
+ # =============================================================================
+ # Edge Quality Scoring Constants
+ # =============================================================================
+
+ # Gradient strength normalization (typical strong edge magnitude)
+ GRADIENT_STRENGTH_NORMALIZER = 30.0
+
+ # Smoothness scoring (variance to exponential mapping)
+ SMOOTHNESS_VARIANCE_NORMALIZER = 200.0
+
+ # Quality score component weights
+ QUALITY_WEIGHT_GRADIENT = 0.4  # Gradient strength: 40%
+ QUALITY_WEIGHT_CONSISTENCY = 0.3  # Edge consistency: 30%
+ QUALITY_WEIGHT_SMOOTHNESS = 0.2  # Edge smoothness: 20%
+ QUALITY_WEIGHT_SYMMETRY = 0.1  # Bilateral symmetry: 10%
+
+
+ # =============================================================================
+ # Auto Fallback Decision Constants
+ # =============================================================================
+
+ # Minimum quality score to use Sobel (otherwise fall back to contour)
+ MIN_QUALITY_SCORE_THRESHOLD = 0.65  # Lowered from 0.7 for mask-constrained mode
+
+ # Minimum edge detection success rate
+ MIN_CONSISTENCY_THRESHOLD = 0.30  # 30% (lowered from 50% for mask-constrained mode)
+
+ # Realistic measurement range for validation
+ MIN_REALISTIC_WIDTH_CM = 0.8
+ MAX_REALISTIC_WIDTH_CM = 3.5
+
+ # Maximum allowed difference from contour measurement (percentage)
+ MAX_CONTOUR_DIFFERENCE_PCT = 50.0
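The four quality weights sum to 1.0, so the overall quality score is a convex combination of per-component scores in [0, 1] and can be compared directly against `MIN_QUALITY_SCORE_THRESHOLD`. A minimal sketch of that combination (the `quality_score` helper is illustrative; the repo's actual scoring function is not shown here):

```python
# Component weights (mirroring the constants above)
W_GRADIENT, W_CONSISTENCY, W_SMOOTHNESS, W_SYMMETRY = 0.4, 0.3, 0.2, 0.1
assert abs(W_GRADIENT + W_CONSISTENCY + W_SMOOTHNESS + W_SYMMETRY - 1.0) < 1e-9

def quality_score(gradient: float, consistency: float,
                  smoothness: float, symmetry: float) -> float:
    """Weighted sum of per-component scores, each assumed to lie in [0, 1]."""
    return (W_GRADIENT * gradient + W_CONSISTENCY * consistency
            + W_SMOOTHNESS * smoothness + W_SYMMETRY * symmetry)

score = quality_score(0.9, 0.5, 0.8, 1.0)
print(round(score, 2))  # 0.77 -> above the 0.65 threshold, so Sobel would be kept
```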
src/finger_segmentation.py ADDED
@@ -0,0 +1,949 @@
+ """
+ Hand and finger segmentation utilities.
+
+ This module handles:
+ - Hand detection using MediaPipe
+ - Hand mask generation
+ - Individual finger isolation
+ - Mask cleanup and validation
+ """
+
+ import cv2
+ import numpy as np
+ from typing import Optional, Dict, Any, Literal, List, Tuple
+ import mediapipe as mp
+ from mediapipe.tasks import python
+ from mediapipe.tasks.python import vision
+ import urllib.request
+ import os
+ from pathlib import Path
+
+ # Import debug observer and drawing functions
+ from src.debug_observer import (
+     DebugObserver,
+     draw_landmarks_overlay,
+     draw_hand_skeleton,
+     draw_detection_info,
+ )
+
+ FingerIndex = Literal["auto", "index", "middle", "ring", "pinky"]
+
+ # MediaPipe hand landmark indices for each finger
+ # Each finger has 4 landmarks: MCP (knuckle), PIP, DIP, TIP
+ FINGER_LANDMARKS = {
+     "index": [5, 6, 7, 8],
+     "middle": [9, 10, 11, 12],
+     "ring": [13, 14, 15, 16],
+     "pinky": [17, 18, 19, 20],
+ }
+
+ # Thumb landmarks (special case - not typically used for ring measurement)
+ THUMB_LANDMARKS = [1, 2, 3, 4]
+
+ # Wrist landmark
+ WRIST_LANDMARK = 0
+
+ # Palm landmarks (for creating hand mask)
+ PALM_LANDMARKS = [0, 1, 5, 9, 13, 17]
+
+ # Model path
+ MODEL_PATH = os.path.join(os.path.dirname(__file__), "..", ".model", "hand_landmarker.task")
+ MODEL_URL = "https://storage.googleapis.com/mediapipe-models/hand_landmarker/hand_landmarker/float16/1/hand_landmarker.task"
+
+ # Initialize MediaPipe Hands (lazy loading)
+ _hands_detector = None
+
+
+ def _download_model():
+     """Download the hand landmarker model if not present."""
+     if not os.path.exists(MODEL_PATH):
+         os.makedirs(os.path.dirname(MODEL_PATH), exist_ok=True)
+         print("Downloading hand landmarker model...")
+         urllib.request.urlretrieve(MODEL_URL, MODEL_PATH)
+         print(f"Model downloaded to {MODEL_PATH}")
+
+
+ def _get_hands_detector(force_new: bool = False):
+     """Get or initialize the MediaPipe Hands detector."""
+     global _hands_detector
+     if _hands_detector is None or force_new:
+         _download_model()
+         base_options = python.BaseOptions(model_asset_path=MODEL_PATH)
+         options = vision.HandLandmarkerOptions(
+             base_options=base_options,
+             num_hands=2,
+             min_hand_detection_confidence=0.3,  # Lower threshold for better detection
+             min_tracking_confidence=0.3,
+         )
+         _hands_detector = vision.HandLandmarker.create_from_options(options)
+     return _hands_detector
+
+
+ def _try_detect_hand(detector, image: np.ndarray) -> Optional[Tuple[Any, int]]:
+     """
+     Try to detect a hand in the image; returns (results, rotation_code) or None.
+     rotation_code: 0=none, 1=90cw, 2=180, 3=90ccw
+     """
+     # Try different rotations to handle various image orientations
+     rotations = [
+         (image, 0),
+         (cv2.rotate(image, cv2.ROTATE_90_CLOCKWISE), 1),
+         (cv2.rotate(image, cv2.ROTATE_90_COUNTERCLOCKWISE), 3),
+         (cv2.rotate(image, cv2.ROTATE_180), 2),
+     ]
+
+     best_result = None
+     best_confidence = 0
+     best_rotation = 0
+
+     for rotated, rot_code in rotations:
+         # Convert to RGB and ensure contiguous memory layout
+         rgb = cv2.cvtColor(rotated, cv2.COLOR_BGR2RGB)
+         rgb = np.ascontiguousarray(rgb)
+         mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb)
+         results = detector.detect(mp_image)
+
+         if results.hand_landmarks:
+             # Get best confidence among detected hands
+             for i, handedness in enumerate(results.handedness):
+                 conf = handedness[0].score
+                 if conf > best_confidence:
+                     best_confidence = conf
+                     best_result = results
+                     best_rotation = rot_code
+
+     if best_result is None:
+         return None
+
+     return best_result, best_rotation
+
+
+ def _transform_landmarks_for_rotation(
+     landmarks: np.ndarray,
+     rotation_code: int,
+     original_h: int,
+     original_w: int,
+ ) -> np.ndarray:
+     """
+     Transform landmarks from rotated coordinates back to original image coordinates.
+     """
+     if rotation_code == 0:
+         # No rotation
+         return landmarks
+     elif rotation_code == 1:
+         # Was rotated 90 CW, so transform back (90 CCW)
+         # In rotated: (x, y) with size (h, w) -> original: (y, w-1-x) with size (w, h)
+         new_landmarks = np.zeros_like(landmarks)
+         new_landmarks[:, 0] = landmarks[:, 1] * original_w  # y -> x
+         new_landmarks[:, 1] = (1 - landmarks[:, 0]) * original_h  # (1-x) -> y
+         return new_landmarks
+     elif rotation_code == 2:
+         # Was rotated 180
+         new_landmarks = np.zeros_like(landmarks)
+         new_landmarks[:, 0] = (1 - landmarks[:, 0]) * original_w
+         new_landmarks[:, 1] = (1 - landmarks[:, 1]) * original_h
+         return new_landmarks
+     elif rotation_code == 3:
+         # Was rotated 90 CCW, so transform back (90 CW)
+         new_landmarks = np.zeros_like(landmarks)
+         new_landmarks[:, 0] = (1 - landmarks[:, 1]) * original_w
+         new_landmarks[:, 1] = landmarks[:, 0] * original_h
+         return new_landmarks
+
+     return landmarks
+
+
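The coordinate mapping above can be sanity-checked with a known point. In the 180° case, a normalized landmark (x, y) in the rotated view should land at pixel ((1-x)·w, (1-y)·h) in the original image. A minimal numpy sketch of just that branch (the standalone `transform_180` helper is illustrative):

```python
import numpy as np

def transform_180(landmarks_norm: np.ndarray, h: int, w: int) -> np.ndarray:
    """Map normalized landmarks seen in a 180°-rotated view back to original pixels."""
    out = np.zeros_like(landmarks_norm, dtype=float)
    out[:, 0] = (1 - landmarks_norm[:, 0]) * w
    out[:, 1] = (1 - landmarks_norm[:, 1]) * h
    return out

# A landmark at normalized (0.25, 0.75) in the rotated view of a 100x200 image
mapped = transform_180(np.array([[0.25, 0.75]]), h=100, w=200)
print(mapped)
```

Applying the rotation to the mapped pixel again recovers the original normalized position, which is a useful property to assert in tests.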
+ def detect_hand_orientation(
+     landmarks_normalized: np.ndarray,
+     finger: FingerIndex = "index"
+ ) -> float:
+     """
+     Detect hand orientation angle from vertical (canonical orientation).
+
+     Canonical orientation: wrist at bottom, fingers pointing upward.
+
+     Args:
+         landmarks_normalized: MediaPipe hand landmarks (21x2) in normalized [0-1] coordinates
+         finger: Which finger to use for orientation detection (default: "index")
+
+     Returns:
+         Angle in degrees to rotate image clockwise to achieve canonical orientation.
+         Returns one of: 0, 90, 180, 270
+     """
+     # Get wrist (landmark 0) and specified finger tip
+     wrist = landmarks_normalized[WRIST_LANDMARK]
+
+     # Use specified finger, fallback to middle if invalid
+     if finger in FINGER_LANDMARKS:
+         finger_tip = landmarks_normalized[FINGER_LANDMARKS[finger][3]]
+     else:
+         # Fallback to middle finger for "auto" or invalid values
+         finger_tip = landmarks_normalized[FINGER_LANDMARKS["middle"][3]]
+
+     # Compute vector from wrist to fingertip
+     direction = finger_tip - wrist
+
+     # Compute angle from vertical upward direction
+     # In image coordinates: y increases downward, x increases rightward
+     # Vertical upward = (0, -1) in (x, y)
+     # angle = atan2(cross, dot) where cross = dx*(-1) - dy*0, dot = dx*0 + dy*(-1)
+     angle_rad = np.arctan2(direction[0], -direction[1])
+     angle_deg = angle_rad * 180.0 / np.pi
+
+     # angle_deg is now in range [-180, 180]:
+     #   0° = fingers pointing up (canonical)
+     #   90° = fingers pointing right
+     #   180° = fingers pointing down
+     #   -90° = fingers pointing left
+
+     # Convert to [0, 360] range
+     if angle_deg < 0:
+         angle_deg += 360
+
+     # Snap to nearest 90° increment
+     # We want to return how much to rotate CW to get to canonical (0°)
+     rotation_needed = _snap_to_orthogonal(angle_deg)
+
+     return rotation_needed
+
+
+ def _snap_to_orthogonal(angle_deg: float) -> int:
+     """
+     Snap angle to nearest orthogonal rotation (0, 90, 180, 270).
+
+     Args:
+         angle_deg: Angle in degrees [0, 360]
+
+     Returns:
+         Rotation needed in degrees (0, 90, 180, 270) to rotate CW to canonical orientation
+     """
+     # If angle is 0±45°, no rotation needed
+     # If angle is 90±45°, need to rotate 270° CW (or 90° CCW) to get to 0°
+     # If angle is 180±45°, need to rotate 180°
+     # If angle is 270±45°, need to rotate 90° CW
+
+     # Determine which quadrant (with 45° tolerance)
+     if angle_deg < 45 or angle_deg >= 315:
+         return 0  # Already upright
+     elif 45 <= angle_deg < 135:
+         return 270  # Pointing right, rotate 270° CW (= 90° CCW)
+     elif 135 <= angle_deg < 225:
+         return 180  # Upside down, rotate 180°
+     else:  # 225 <= angle_deg < 315
+         return 90  # Pointing left, rotate 90° CW
+
+
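The quadrant logic above is easy to spot-check in isolation. A pure-Python mirror of the same decision table, with the boundary cases exercised (this standalone copy is illustrative, not the repo's import path):

```python
def snap_to_orthogonal(angle_deg: float) -> int:
    """Mirror of the 45°-tolerance quadrant logic described above."""
    if angle_deg < 45 or angle_deg >= 315:
        return 0          # roughly upright already
    elif angle_deg < 135:
        return 270        # fingers right -> rotate 90° CCW (= 270° CW)
    elif angle_deg < 225:
        return 180        # upside down
    return 90             # fingers left -> rotate 90° CW

assert snap_to_orthogonal(10) == 0
assert snap_to_orthogonal(350) == 0
assert snap_to_orthogonal(90) == 270
assert snap_to_orthogonal(180) == 180
assert snap_to_orthogonal(260) == 90
```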
+ def normalize_hand_orientation(
+     image: np.ndarray,
+     landmarks_normalized: np.ndarray,
+     finger: FingerIndex = "index",
+ ) -> Tuple[np.ndarray, int]:
+     """
+     Rotate image to canonical hand orientation (wrist at bottom, fingers up).
+
+     Args:
+         image: Input BGR image
+         landmarks_normalized: MediaPipe landmarks in normalized [0-1] coordinates
+         finger: Which finger to use for orientation detection (default: "index")
+
+     Returns:
+         Tuple of (rotated_image, rotation_angle_degrees)
+         rotation_angle_degrees is one of: 0, 90, 180, 270
+     """
+     # Detect hand orientation based on specified finger
+     rotation_needed = detect_hand_orientation(landmarks_normalized, finger)
+
+     # Rotate image if needed
+     if rotation_needed == 0:
+         return image, 0
+     elif rotation_needed == 90:
+         return cv2.rotate(image, cv2.ROTATE_90_CLOCKWISE), 90
+     elif rotation_needed == 180:
+         return cv2.rotate(image, cv2.ROTATE_180), 180
+     elif rotation_needed == 270:
+         return cv2.rotate(image, cv2.ROTATE_90_COUNTERCLOCKWISE), 270
+     else:
+         # Shouldn't happen, but return original as fallback
+         print(f"Warning: Unexpected rotation angle {rotation_needed}, skipping rotation")
+         return image, 0
+
+
+ def segment_hand(
+     image: np.ndarray,
+     finger: FingerIndex = "index",
+     max_dimension: int = 1280,
+     debug_dir: Optional[str] = None,
+ ) -> Optional[Dict[str, Any]]:
+     """
+     Detect and segment hand from image using MediaPipe.
+
+     Args:
+         image: Input BGR image
+         finger: Which finger to use for orientation detection (default: "index")
+         max_dimension: Maximum dimension for processing (large images are resized)
+         debug_dir: Optional directory to save debug images
+
+     Returns:
+         Dictionary containing:
+         - landmarks: 21x2 array of landmark positions (pixel coordinates)
+         - landmarks_normalized: 21x2 array of normalized coordinates [0-1]
+         - mask: Binary hand mask
+         - confidence: Detection confidence
+         - handedness: "Left" or "Right"
+         Or None if no hand detected
+     """
+     # Create debug observer if debug mode enabled
+     observer = DebugObserver(debug_dir) if debug_dir else None
+
+     h, w = image.shape[:2]
+
+     # Debug: Save original image
+     if observer:
+         observer.save_stage("01_original", image)
+
+     # Resize if image is too large (MediaPipe works better with smaller images)
+     scale = 1.0
+     if max(h, w) > max_dimension:
+         scale = max_dimension / max(h, w)
+         new_w = int(w * scale)
+         new_h = int(h * scale)
+         resized = cv2.resize(image, (new_w, new_h), interpolation=cv2.INTER_AREA)
+     else:
+         resized = image
+         new_h, new_w = h, w
+
+     # Debug: Save resized image (if resized)
+     if scale != 1.0 and observer:
+         observer.save_stage("02_resized_for_detection", resized)
+
+     # Process with MediaPipe (try multiple rotations)
+     detector = _get_hands_detector()
+     detection_result = _try_detect_hand(detector, resized)
+
+     if detection_result is None:
+         return None
+
+     results, rotation_code = detection_result
+
+     # Select the best hand (highest confidence)
+     best_hand_idx = 0
+     best_conf = 0
+     for i, handedness in enumerate(results.handedness):
+         if handedness[0].score > best_conf:
+             best_conf = handedness[0].score
+             best_hand_idx = i
+
+     hand_landmarks = results.hand_landmarks[best_hand_idx]
+     handedness = results.handedness[best_hand_idx]
+
+     # Extract landmark coordinates (normalized 0-1 in rotated image)
+     landmarks_normalized_rotated = np.array([
+         [lm.x, lm.y] for lm in hand_landmarks
+     ])
+
+     # Normalize hand orientation to canonical (wrist at bottom, fingers up)
+     # This is done in the detected-rotation space first
+     if rotation_code == 1:
+         # Was rotated 90 CW
+         rotated_image = cv2.rotate(resized, cv2.ROTATE_90_CLOCKWISE)
+     elif rotation_code == 2:
+         # Was rotated 180
+         rotated_image = cv2.rotate(resized, cv2.ROTATE_180)
+     elif rotation_code == 3:
+         # Was rotated 90 CCW
+         rotated_image = cv2.rotate(resized, cv2.ROTATE_90_COUNTERCLOCKWISE)
+     else:
+         rotated_image = resized
+
+     # Now normalize orientation based on hand direction
+     canonical_image, orientation_rotation = normalize_hand_orientation(
+         rotated_image, landmarks_normalized_rotated, finger
+     )
+
+     # Update landmarks for orientation normalization
+     if orientation_rotation != 0:
+         rot_h, rot_w = rotated_image.shape[:2]
+         landmarks_px_rotated = landmarks_normalized_rotated.copy()
+         landmarks_px_rotated[:, 0] *= rot_w
+         landmarks_px_rotated[:, 1] *= rot_h
+
+         # Apply rotation transform to landmarks
+         if orientation_rotation == 90:
+             # Rotate 90 CW: (x, y) -> (h-1-y, x)
+             new_landmarks = np.zeros_like(landmarks_px_rotated)
+             new_landmarks[:, 0] = rot_h - 1 - landmarks_px_rotated[:, 1]
+             new_landmarks[:, 1] = landmarks_px_rotated[:, 0]
+             landmarks_px_canonical = new_landmarks
+         elif orientation_rotation == 180:
+             # Rotate 180: (x, y) -> (w-1-x, h-1-y)
+             new_landmarks = np.zeros_like(landmarks_px_rotated)
+             new_landmarks[:, 0] = rot_w - 1 - landmarks_px_rotated[:, 0]
+             new_landmarks[:, 1] = rot_h - 1 - landmarks_px_rotated[:, 1]
+             landmarks_px_canonical = new_landmarks
+         elif orientation_rotation == 270:
+             # Rotate 90 CCW: (x, y) -> (y, w-1-x)
+             new_landmarks = np.zeros_like(landmarks_px_rotated)
+             new_landmarks[:, 0] = landmarks_px_rotated[:, 1]
+             new_landmarks[:, 1] = rot_w - 1 - landmarks_px_rotated[:, 0]
+             landmarks_px_canonical = new_landmarks
+         else:
+             landmarks_px_canonical = landmarks_px_rotated
+
+         # Update normalized landmarks for canonical image
+         can_h, can_w = canonical_image.shape[:2]
+         landmarks_normalized_canonical = landmarks_px_canonical.copy()
+         landmarks_normalized_canonical[:, 0] /= can_w
+         landmarks_normalized_canonical[:, 1] /= can_h
+     else:
+         landmarks_normalized_canonical = landmarks_normalized_rotated
+
+     # Scale landmarks back to original resolution if needed
+     if scale != 1.0:
+         canonical_full = cv2.resize(canonical_image, (int(canonical_image.shape[1] / scale),
+                                                       int(canonical_image.shape[0] / scale)),
+                                     interpolation=cv2.INTER_CUBIC)
+     else:
+         canonical_full = canonical_image
+
+     # Final landmarks in canonical full resolution
+     can_full_h, can_full_w = canonical_full.shape[:2]
+     landmarks_canonical = landmarks_normalized_canonical.copy()
+     landmarks_canonical[:, 0] *= can_full_w
+     landmarks_canonical[:, 1] *= can_full_h
+
+     # Debug: Draw landmarks overlay in canonical orientation
+     if observer:
+         observer.draw_and_save("03_landmarks_overlay_canonical", canonical_full,
+                                draw_landmarks_overlay, landmarks_canonical, label=True)
+         observer.draw_and_save("04_hand_skeleton_canonical", canonical_full,
+                                draw_hand_skeleton, landmarks_canonical)
+         observer.draw_and_save("05_detection_info_canonical", canonical_full,
+                                draw_detection_info, handedness[0].score,
+                                handedness[0].category_name,
+                                f"det={rotation_code}, orient={orientation_rotation}")
+
+     # Generate hand mask at canonical resolution
+     mask = _create_hand_mask(landmarks_canonical, (can_full_h, can_full_w))
+
+     return {
+         "landmarks": landmarks_canonical,
+         "landmarks_normalized": landmarks_normalized_canonical,
+         "mask": mask,
+         "confidence": handedness[0].score,
+         "handedness": handedness[0].category_name,
+         "rotation_applied": rotation_code,
+         "orientation_rotation": orientation_rotation,
+         "canonical_image": canonical_full,  # Return the canonical image for downstream processing
+     }
+
+
+ def _create_hand_mask(landmarks: np.ndarray, shape: Tuple[int, int]) -> np.ndarray:
+     """
+     Create a binary mask of the hand region from landmarks.
+
+     Args:
+         landmarks: 21x2 array of landmark pixel coordinates
+         shape: (height, width) of output mask
+
+     Returns:
+         Binary mask (uint8, 0 or 255)
+     """
+     h, w = shape
+     mask = np.zeros((h, w), dtype=np.uint8)
+
+     # Create convex hull of all landmarks
+     hull_points = cv2.convexHull(landmarks.astype(np.int32))
+     cv2.fillConvexPoly(mask, hull_points, 255)
+
+     # Also fill individual finger regions for better coverage
+     for finger_name, indices in FINGER_LANDMARKS.items():
+         finger_pts = landmarks[indices].astype(np.int32)
+         cv2.fillConvexPoly(mask, finger_pts, 255)
+
+     # Fill thumb
+     thumb_pts = landmarks[THUMB_LANDMARKS].astype(np.int32)
+     cv2.fillConvexPoly(mask, thumb_pts, 255)
+
+     # Apply morphological operations to smooth the mask
+     kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
+     mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
+     mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
+
+     return mask
+
+
+ def _calculate_finger_extension(landmarks: np.ndarray, finger_indices: List[int]) -> float:
+     """
+     Calculate how extended a finger is based on landmark positions.
+
+     Returns a score where higher = more extended.
+     """
+     if len(finger_indices) < 4:
+         return 0.0
+
+     # Get finger landmarks
+     mcp = landmarks[finger_indices[0]]  # Knuckle
+     pip = landmarks[finger_indices[1]]  # First joint
+     dip = landmarks[finger_indices[2]]  # Second joint
+     tip = landmarks[finger_indices[3]]  # Fingertip
+
+     # Calculate vectors
+     mcp_to_tip = tip - mcp
+     mcp_to_pip = pip - mcp
+
+     # Extension score based on:
+     # 1. Distance from knuckle to tip (longer = more extended)
+     finger_length = np.linalg.norm(mcp_to_tip)
+
+     # 2. Straightness (how aligned are the joints)
+     pip_to_dip = dip - pip
+     dip_to_tip = tip - dip
+
+     # Dot products to check alignment (1 = straight, -1 = bent back)
+     if np.linalg.norm(mcp_to_pip) > 0 and np.linalg.norm(pip_to_dip) > 0:
+         align1 = np.dot(mcp_to_pip, pip_to_dip) / (np.linalg.norm(mcp_to_pip) * np.linalg.norm(pip_to_dip))
+     else:
+         align1 = 0
+
+     if np.linalg.norm(pip_to_dip) > 0 and np.linalg.norm(dip_to_tip) > 0:
+         align2 = np.dot(pip_to_dip, dip_to_tip) / (np.linalg.norm(pip_to_dip) * np.linalg.norm(dip_to_tip))
+     else:
+         align2 = 0
+
+     straightness = (align1 + align2) / 2
+
+     # Combined score
+     return finger_length * (0.5 + 0.5 * max(0, straightness))
+
+
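The extension score combines length with a straightness bonus, so a straight finger of the same knuckle-to-tip reach always outscores a curled one. A self-contained numpy sketch of the same length-times-straightness formula (the `extension_score` helper and the sample coordinates are illustrative):

```python
import numpy as np

def extension_score(mcp, pip, dip, tip):
    """Knuckle-to-tip length scaled by joint alignment, as in the formula above."""
    length = np.linalg.norm(tip - mcp)

    def cos_angle(u, v):
        nu, nv = np.linalg.norm(u), np.linalg.norm(v)
        return float(np.dot(u, v) / (nu * nv)) if nu > 0 and nv > 0 else 0.0

    a1 = cos_angle(pip - mcp, dip - pip)
    a2 = cos_angle(dip - pip, tip - dip)
    straightness = (a1 + a2) / 2
    return length * (0.5 + 0.5 * max(0.0, straightness))

# Straight finger: joints collinear; curled finger: tip folds back toward the palm
straight = [np.array([0.0, 0.0]), np.array([0.0, 30.0]),
            np.array([0.0, 60.0]), np.array([0.0, 90.0])]
curled = [np.array([0.0, 0.0]), np.array([0.0, 30.0]),
          np.array([20.0, 40.0]), np.array([30.0, 20.0])]
assert extension_score(*straight) > extension_score(*curled)
```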
+ def _create_finger_roi_mask(
+     finger_landmarks: np.ndarray,
+     all_landmarks: np.ndarray,
+     shape: Tuple[int, int],
+     expansion_factor: float = 1.8,
+ ) -> np.ndarray:
+     """
+     Create a Region of Interest (ROI) mask around finger landmarks.
+
+     This creates a generous bounding region that should contain the entire finger
+     without cutting off edges, but excludes other fingers.
+
+     Args:
+         finger_landmarks: 4x2 array of finger landmark positions (MCP, PIP, DIP, TIP)
+         all_landmarks: 21x2 array of all hand landmarks
+         shape: (height, width) of output mask
+         expansion_factor: How much to expand perpendicular to finger axis
+
+     Returns:
+         Binary ROI mask
+     """
+     h, w = shape
+     roi_mask = np.zeros((h, w), dtype=np.uint8)
+
+     # Calculate finger axis direction
+     mcp = finger_landmarks[0]
+     tip = finger_landmarks[3]
+     finger_axis = tip - mcp
+     finger_length = np.linalg.norm(finger_axis)
+
+     if finger_length < 1:
+         return roi_mask
+
+     finger_direction = finger_axis / finger_length
+
+     # Perpendicular direction
+     perp = np.array([-finger_direction[1], finger_direction[0]])
+
+     # Estimate finger width from landmark spacing
+     # Use median distance between consecutive landmarks as width proxy
+     segment_lengths = []
+     for i in range(len(finger_landmarks) - 1):
+         seg_len = np.linalg.norm(finger_landmarks[i + 1] - finger_landmarks[i])
+         segment_lengths.append(seg_len)
+     avg_segment = np.median(segment_lengths) if segment_lengths else finger_length / 3
+
+     # Finger width is roughly 1/3 to 1/2 of segment length
+     base_width = avg_segment * 0.6 * expansion_factor
+
+     # Extend ROI slightly beyond landmarks (towards palm and beyond tip)
+     wrist = all_landmarks[WRIST_LANDMARK]
+     palm_direction = mcp - wrist
+     palm_direction = palm_direction / (np.linalg.norm(palm_direction) + 1e-8)
+
+     # Extend 20% beyond MCP toward palm
+     extended_base = mcp - palm_direction * finger_length * 0.2
+     # Extend 10% beyond tip
+     extended_tip = tip + finger_direction * finger_length * 0.1
+
+     # Create polygon along finger with wider margins
+     polygon_points = []
+     num_samples = 8  # More points for smoother ROI
+
+     for i in range(num_samples):
+         t = i / (num_samples - 1)
+         # Interpolate from extended base to extended tip
+         pt = extended_base + (extended_tip - extended_base) * t
+
+         # Width varies: wider at base, narrower at tip
+         width_scale = 1.0 - 0.2 * t
+         half_width = base_width * width_scale / 2
+
+         # Add left and right points
+         left = pt + perp * half_width
+         right = pt - perp * half_width
+         polygon_points.append((left, right))
+
+     # Build polygon
+     polygon = []
+     for left, right in polygon_points:
+         polygon.append(left)
+     for left, right in reversed(polygon_points):
+         polygon.append(right)
+
+     polygon = np.array(polygon, dtype=np.int32)
+     cv2.fillPoly(roi_mask, [polygon], 255)
+
+     return roi_mask
+
+
+ def _isolate_finger_from_hand_mask(
+     hand_mask: np.ndarray,
+     finger_landmarks: np.ndarray,
+     all_landmarks: np.ndarray,
+     min_area: int = 500,
+ ) -> Optional[np.ndarray]:
+     """
+     Isolate finger using pixel-level intersection of hand mask with finger ROI.
+
+     This is the preferred method as it preserves actual finger edges from MediaPipe
+     rather than creating a synthetic polygon.
+
+     Args:
+         hand_mask: Full hand mask from MediaPipe (pixel-accurate)
+         finger_landmarks: 4x2 array of finger landmarks
+         all_landmarks: 21x2 array of all hand landmarks
+         min_area: Minimum valid finger area
+
+     Returns:
+         Binary finger mask, or None if isolation fails
+     """
+     h, w = hand_mask.shape
+
+     # Create ROI mask around finger
+     roi_mask = _create_finger_roi_mask(finger_landmarks, all_landmarks, (h, w))
+
+     # Intersect hand mask with finger ROI
+     # This preserves real pixel-level edges from MediaPipe
+     finger_mask = cv2.bitwise_and(hand_mask, roi_mask)
+
+     # Find connected components to remove fragments from other fingers
+     num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(
+         finger_mask, connectivity=8
+     )
+
+     if num_labels <= 1:
+         return None
+
+     # Select component closest to finger landmarks centroid
+     landmarks_centroid = np.mean(finger_landmarks, axis=0)
+
+     best_component = None
+     best_distance = float('inf')
+
+     for i in range(1, num_labels):  # Skip background (0)
+         area = stats[i, cv2.CC_STAT_AREA]
+         if area < min_area:
+             continue
+
+         component_centroid = centroids[i]
+         dist = np.linalg.norm(component_centroid - landmarks_centroid)
+
+         if dist < best_distance:
+             best_distance = dist
+             best_component = i
+
+     if best_component is None:
+         return None
+
+     # Create final mask with only the selected component
+     final_mask = np.zeros_like(finger_mask)
+     final_mask[labels == best_component] = 255
+
+     return final_mask
+
+
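The component-selection step above reduces to: among components that pass the area gate, pick the one whose centroid is nearest the finger-landmark centroid. A numpy-only sketch of that selection, decoupled from OpenCV (the `pick_nearest_component` helper and the sample numbers are illustrative):

```python
import numpy as np

def pick_nearest_component(centroids: np.ndarray, areas: np.ndarray,
                           target: np.ndarray, min_area: int = 500) -> int:
    """Index of the large-enough component closest to `target`, or -1 if none qualify."""
    best, best_dist = -1, float("inf")
    for i, (c, a) in enumerate(zip(centroids, areas)):
        if a < min_area:
            continue  # Area gate: skip tiny fragments
        d = np.linalg.norm(c - target)
        if d < best_dist:
            best_dist, best = d, i
    return best

centroids = np.array([[50.0, 200.0], [52.0, 90.0], [300.0, 95.0]])
areas = np.array([120, 4000, 3500])   # first blob is a tiny fragment
target = np.array([60.0, 100.0])      # finger-landmark centroid
print(pick_nearest_component(centroids, areas, target))  # 1
```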
675
+ def isolate_finger(
676
+ hand_data: Dict[str, Any],
677
+ finger: FingerIndex = "auto",
678
+ image_shape: Optional[Tuple[int, int]] = None,
679
+ ) -> Optional[Dict[str, Any]]:
680
+ """
681
+ Isolate a specific finger from hand segmentation data.
682
+
683
+ Args:
684
+ hand_data: Output from segment_hand()
685
+ finger: Which finger to isolate, or "auto" to select most extended
686
+ image_shape: (height, width) for mask generation
687
+
688
+ Returns:
689
+ Dictionary containing:
690
+ - mask: Binary finger mask
691
+ - landmarks: Finger landmark positions (4x2 array)
692
+ - base_point: Palm-side base of finger (MCP joint)
693
+ - tip_point: Fingertip position
694
+ - finger_name: Name of the isolated finger
695
+ Or None if finger cannot be isolated
696
+ """
697
+ landmarks = hand_data["landmarks"]
698
+
699
+ if image_shape is None:
700
+ if "mask" in hand_data:
701
+ image_shape = hand_data["mask"].shape[:2]
702
+ else:
703
+ return None
704
+
705
+ # Determine which finger to use
706
+ if finger == "auto":
707
+ best_finger = None
708
+ best_score = -1
709
+
710
+ for finger_name, indices in FINGER_LANDMARKS.items():
711
+ score = _calculate_finger_extension(landmarks, indices)
712
+ if score > best_score:
713
+ best_score = score
714
+ best_finger = finger_name
715
+
716
+ if best_finger is None:
717
+ return None
718
+ finger = best_finger
719
+
720
+ if finger not in FINGER_LANDMARKS:
721
+ return None
722
+
723
+ indices = FINGER_LANDMARKS[finger]
724
+ finger_landmarks = landmarks[indices]
725
+
726
+ # Create finger mask using pixel-level approach (preferred)
727
+ mask = None
728
+ method_used = "unknown"
729
+
730
+ if "mask" in hand_data and hand_data["mask"] is not None:
731
+ mask = _isolate_finger_from_hand_mask(
732
+ hand_data["mask"],
733
+ finger_landmarks,
734
+ landmarks,
735
+ min_area=500,
736
+ )
737
+ if mask is not None:
738
+ method_used = "pixel-level"
739
+ print(f" Finger isolated using pixel-level segmentation")
740
+ else:
741
+ print(f" Pixel-level segmentation failed, falling back to polygon")
742
+
743
+ # Fallback to polygon-based approach
744
+ if mask is None:
745
+ mask = _create_finger_mask(landmarks, indices, image_shape)
746
+ if mask is not None:
747
+ method_used = "polygon"
748
+ print(f" Finger isolated using polygon-based segmentation (fallback)")
749
+ else:
750
+ print(f" Both segmentation methods failed")
751
+ return None
752
+
753
+ return {
754
+ "mask": mask,
755
+ "landmarks": finger_landmarks,
756
+ "base_point": finger_landmarks[0], # MCP joint
757
+ "tip_point": finger_landmarks[3], # Fingertip
758
+ "finger_name": finger,
759
+ "method": method_used,
760
+ }
761
+
762
+
763
+ def _create_finger_mask(
+     all_landmarks: np.ndarray,
+     finger_indices: List[int],
+     shape: Tuple[int, int],
+     width_factor: float = 2.5,
+ ) -> Optional[np.ndarray]:
+     """
+     Create a binary mask for a single finger using polygon approximation.
+
+     This is the fallback method when pixel-level segmentation fails.
+
+     Args:
+         all_landmarks: All 21 hand landmarks
+         finger_indices: Indices of the 4 finger landmarks
+         shape: (height, width) of output mask
+         width_factor: Multiplier for estimated finger width
+
+     Returns:
+         Binary mask of finger region
+     """
+     h, w = shape
+     mask = np.zeros((h, w), dtype=np.uint8)
+
+     finger_landmarks = all_landmarks[finger_indices]
+
+     # Estimate finger width based on joint spacing
+     mcp_idx = finger_indices[0]
+
+     adjacent_distances = []
+     for other_finger, other_indices in FINGER_LANDMARKS.items():
+         other_mcp = other_indices[0]
+         if other_mcp != mcp_idx:
+             dist = np.linalg.norm(all_landmarks[mcp_idx] - all_landmarks[other_mcp])
+             adjacent_distances.append(dist)
+
+     if adjacent_distances:
+         estimated_width = min(adjacent_distances) * 0.4 * width_factor
+     else:
+         finger_length = np.linalg.norm(finger_landmarks[3] - finger_landmarks[0])
+         estimated_width = finger_length / 6 * width_factor
+
+     # Create polygon along finger with estimated width
+     polygon_points = []
+
+     for i in range(len(finger_landmarks)):
+         pt = finger_landmarks[i]
+
+         if i < len(finger_landmarks) - 1:
+             direction = finger_landmarks[i + 1] - pt
+         else:
+             direction = pt - finger_landmarks[i - 1]
+
+         perp = np.array([-direction[1], direction[0]])
+         perp_norm = np.linalg.norm(perp)
+         if perp_norm > 0:
+             perp = perp / perp_norm
+
+         width_scale = 1.0 - 0.3 * (i / (len(finger_landmarks) - 1))
+         half_width = estimated_width * width_scale / 2
+
+         left = pt + perp * half_width
+         right = pt - perp * half_width
+         polygon_points.append((left, right))
+
+     # Build polygon: go up left side, then down right side
+     polygon = []
+     for left, right in polygon_points:
+         polygon.append(left)
+     for left, right in reversed(polygon_points):
+         polygon.append(right)
+
+     polygon = np.array(polygon, dtype=np.int32)
+     cv2.fillPoly(mask, [polygon], 255)
+
+     # Extend mask slightly towards palm
+     mcp = finger_landmarks[0]
+     wrist = all_landmarks[WRIST_LANDMARK]
+     palm_direction = mcp - wrist
+     palm_direction = palm_direction / (np.linalg.norm(palm_direction) + 1e-8)
+
+     finger_length = np.linalg.norm(finger_landmarks[3] - finger_landmarks[0])
+     extension = palm_direction * finger_length * 0.15
+     extended_base = mcp - extension
+
+     perp = np.array([-palm_direction[1], palm_direction[0]])
+     half_width = estimated_width / 2
+     ext_polygon = np.array([
+         mcp + perp * half_width,
+         mcp - perp * half_width,
+         extended_base - perp * half_width * 0.8,
+         extended_base + perp * half_width * 0.8,
+     ], dtype=np.int32)
+
+     cv2.fillPoly(mask, [ext_polygon], 255)
+
+     return mask
+
+
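The offset-polygon construction in `_create_finger_mask` (push each polyline point out along the perpendicular of its local direction, then walk the left side forward and the right side back) can be sketched standalone with NumPy only; the `offset_polygon` name and fixed half-width here are illustrative, not part of the codebase:

```python
import numpy as np

def offset_polygon(points: np.ndarray, half_width: float) -> np.ndarray:
    """Build a closed polygon around a polyline by offsetting each point
    perpendicular to its local direction (forward difference for interior
    points, backward difference for the last one)."""
    pairs = []
    n = len(points)
    for i in range(n):
        pt = points[i]
        direction = points[i + 1] - pt if i < n - 1 else pt - points[i - 1]
        perp = np.array([-direction[1], direction[0]], dtype=float)
        norm = np.linalg.norm(perp)
        if norm > 0:
            perp /= norm
        pairs.append((pt + perp * half_width, pt - perp * half_width))
    # Left side forward, right side backward -> closed polygon
    polygon = [l for l, _ in pairs] + [r for _, r in reversed(pairs)]
    return np.array(polygon)

# Vertical polyline with half-width 2: offsets land on x = -2 and x = +2
pts = np.array([[0.0, 0.0], [0.0, 10.0], [0.0, 20.0]])
poly = offset_polygon(pts, 2.0)
```

The resulting array could be fed to `cv2.fillPoly` after casting to `np.int32`, exactly as the function above does.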
+ def clean_mask(
+     mask: np.ndarray,
+     min_area: int = 1000,
+ ) -> Optional[np.ndarray]:
+     """
+     Clean a binary mask by extracting the largest component and applying morphology.
+
+     Args:
+         mask: Input binary mask
+         min_area: Minimum valid area in pixels
+
+     Returns:
+         Cleaned binary mask, or None if no valid component found
+     """
+     if mask is None or mask.size == 0:
+         return None
+
+     # Find connected components
+     num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(mask, connectivity=8)
+
+     if num_labels <= 1:
+         return None
+
+     # Find largest component (excluding background at index 0)
+     largest_idx = 1
+     largest_area = 0
+
+     for i in range(1, num_labels):
+         area = stats[i, cv2.CC_STAT_AREA]
+         if area > largest_area:
+             largest_area = area
+             largest_idx = i
+
+     if largest_area < min_area:
+         return None
+
+     # Create mask with only the largest component
+     cleaned = np.zeros_like(mask)
+     cleaned[labels == largest_idx] = 255
+
+     # Apply morphological smoothing
+     kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
+
+     cleaned = cv2.morphologyEx(cleaned, cv2.MORPH_CLOSE, kernel)
+     cleaned = cv2.morphologyEx(cleaned, cv2.MORPH_OPEN, kernel)
+
+     # Smooth edges with Gaussian blur and re-threshold
+     cleaned = cv2.GaussianBlur(cleaned, (5, 5), 0)
+     _, cleaned = cv2.threshold(cleaned, 127, 255, cv2.THRESH_BINARY)
+
+     return cleaned
+
+
+ def get_finger_contour(
+     mask: np.ndarray,
+     smooth: bool = True,
+ ) -> Optional[np.ndarray]:
+     """
+     Extract outer contour from finger mask.
+
+     Args:
+         mask: Binary finger mask
+         smooth: Whether to apply contour smoothing
+
+     Returns:
+         Contour points as Nx2 array, or None if no contour found
+     """
+     if mask is None:
+         return None
+
+     # Find contours
+     contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
+
+     if not contours:
+         return None
+
+     # Get the largest contour
+     largest_contour = max(contours, key=cv2.contourArea)
+
+     # Reshape to Nx2
+     contour = largest_contour.reshape(-1, 2)
+
+     if smooth and len(contour) > 10:
+         # Apply contour smoothing using approximation
+         epsilon = 0.005 * cv2.arcLength(largest_contour, True)
+         smoothed = cv2.approxPolyDP(largest_contour, epsilon, True)
+         contour = smoothed.reshape(-1, 2)
+
+     return contour.astype(np.float32)
src/geometry.py ADDED
@@ -0,0 +1,791 @@
+ """
+ Geometric computation utilities.
+
+ This module handles:
+ - Finger axis estimation (PCA and landmark-based)
+ - Ring-wearing zone localization
+ - Cross-section width measurement
+ - Coordinate transformations
+ """
+
+ import logging
+ import cv2
+ import numpy as np
+ from typing import Tuple, List, Optional, Dict, Any, Literal
+
+ from .geometry_constants import (
+     MIN_LANDMARK_SPACING_PX,
+     MIN_FINGER_LENGTH_PX,
+     EPSILON,
+     MIN_MASK_POINTS_FOR_PCA,
+     ENDPOINT_SAMPLE_DISTANCE_FACTOR,
+     DEFAULT_ZONE_START_PCT,
+     DEFAULT_ZONE_END_PCT,
+     ANATOMICAL_ZONE_WIDTH_FACTOR,
+     MIN_DETERMINANT_FOR_INTERSECTION,
+ )
+
+ logger = logging.getLogger(__name__)
+
+ # Type for axis estimation method
+ AxisMethod = Literal["auto", "landmarks", "pca"]
+
+
+ def _validate_landmark_quality(landmarks: np.ndarray) -> Tuple[bool, str]:
+     """
+     Validate quality of finger landmarks for axis estimation.
+
+     Args:
+         landmarks: 4x2 array of finger landmarks [MCP, PIP, DIP, TIP]
+
+     Returns:
+         Tuple of (is_valid, reason)
+     """
+     if landmarks is None or len(landmarks) != 4:
+         return False, "landmarks_missing_or_incomplete"
+
+     # Check for NaN or infinite values
+     if not np.all(np.isfinite(landmarks)):
+         return False, "landmarks_contain_invalid_values"
+
+     # Check reasonable spacing (landmarks not collapsed)
+     # Calculate distances between consecutive landmarks
+     distances = []
+     for i in range(len(landmarks) - 1):
+         dist = np.linalg.norm(landmarks[i + 1] - landmarks[i])
+         distances.append(dist)
+
+     # Check if any distance is too small (collapsed landmarks)
+     min_distance = min(distances)
+     if min_distance < MIN_LANDMARK_SPACING_PX:
+         return False, "landmarks_too_close"
+
+     # Check for monotonically increasing progression (no crossovers)
+     # Calculate overall direction from MCP to TIP
+     overall_direction = landmarks[3] - landmarks[0]
+     overall_length = np.linalg.norm(overall_direction)
+
+     if overall_length < MIN_FINGER_LENGTH_PX:
+         return False, "finger_too_short"
+
+     overall_direction = overall_direction / overall_length
+
+     # Project each landmark onto overall direction
+     # They should be monotonically increasing from MCP to TIP
+     projections = []
+     for i in range(len(landmarks)):
+         proj = np.dot(landmarks[i] - landmarks[0], overall_direction)
+         projections.append(proj)
+
+     # Check monotonic increase
+     for i in range(len(projections) - 1):
+         if projections[i + 1] <= projections[i]:
+             return False, "landmarks_not_monotonic"
+
+     return True, "valid"
+
+
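The crossover test at the heart of `_validate_landmark_quality` — project each landmark onto the overall MCP→TIP direction and require strictly increasing projections — can be shown in isolation (the `projections_monotonic` helper is illustrative, not a repo function):

```python
import numpy as np

def projections_monotonic(landmarks: np.ndarray) -> bool:
    """Project MCP..TIP onto the overall MCP->TIP direction and require
    strictly increasing values; a decrease means the landmark chain
    folded back on itself (a detection crossover)."""
    d = landmarks[3] - landmarks[0]
    d = d / np.linalg.norm(d)
    proj = (landmarks - landmarks[0]) @ d
    return bool(np.all(np.diff(proj) > 0))

straight = np.array([[0, 0], [0, 10], [0, 20], [0, 30]], dtype=float)
folded = np.array([[0, 0], [0, 20], [0, 10], [0, 30]], dtype=float)  # PIP/DIP swapped
```

A folded chain like `folded` is exactly the case the function rejects with `"landmarks_not_monotonic"`.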
+ def estimate_finger_axis_from_landmarks(
+     landmarks: np.ndarray,
+     method: str = "linear_fit"
+ ) -> Dict[str, Any]:
+     """
+     Calculate finger axis directly from anatomical landmarks.
+
+     OPTIMIZED: Focuses on DIP-PIP segment (ring-wearing zone) for better accuracy.
+
+     Args:
+         landmarks: 4x2 array of finger landmarks [MCP, PIP, DIP, TIP]
+         method: Calculation method
+             - "endpoints": MCP to TIP vector (legacy, less accurate)
+             - "linear_fit": DIP to PIP vector (DEFAULT, optimized for ring measurements)
+             - "median_direction": Median of 3 segment directions (robust to outliers)
+
+     Returns:
+         Dictionary containing:
+         - center: Axis center point at midpoint of PIP-DIP (x, y)
+         - direction: Unit direction vector (dx, dy) from PIP to DIP
+         - length: Full finger length in pixels (TIP to MCP, for reference)
+         - palm_end: Visualization endpoint (extended from PIP toward palm)
+         - tip_end: Visualization endpoint (extended from DIP toward tip)
+         - method: Method used ("landmarks")
+     """
+     # Validate landmarks
+     is_valid, reason = _validate_landmark_quality(landmarks)
+     if not is_valid:
+         raise ValueError(f"Invalid landmarks for axis estimation: {reason}")
+
+     # Extract landmark positions
+     mcp = landmarks[0]  # Metacarpophalangeal joint (knuckle, palm-side)
+     pip = landmarks[1]  # Proximal interphalangeal joint
+     dip = landmarks[2]  # Distal interphalangeal joint
+     tip = landmarks[3]  # Fingertip
+
+     # Calculate direction based on method
+     # OPTIMIZED: Focus on DIP-PIP segment (ring-wearing zone)
+     if method == "endpoints":
+         # Simple: vector from MCP to TIP (legacy, less accurate for ring zone)
+         direction = tip - mcp
+         direction_length = np.linalg.norm(direction)
+         direction = direction / direction_length
+
+     elif method == "linear_fit":
+         # OPTIMIZED: Use only DIP and PIP (most relevant for ring measurements)
+         # These two joints define the proximal phalanx where rings are worn
+         direction = dip - pip  # Vector from PIP to DIP
+         direction_length = np.linalg.norm(direction)
+         direction = direction / direction_length
+
+         # Ensure direction points from palm to tip (PIP to DIP)
+         # Direction should already be correct, but verify
+         if np.dot(direction, tip - mcp) < 0:
+             direction = -direction
+
+     elif method == "median_direction":
+         # Robust to outliers: median of segment directions
+         # Calculate direction vectors for each segment
+         seg1_dir = (pip - mcp) / np.linalg.norm(pip - mcp)
+         seg2_dir = (dip - pip) / np.linalg.norm(dip - pip)
+         seg3_dir = (tip - dip) / np.linalg.norm(tip - dip)
+
+         # Take median of each component
+         directions = np.array([seg1_dir, seg2_dir, seg3_dir])
+         median_dir = np.median(directions, axis=0)
+         direction = median_dir / np.linalg.norm(median_dir)
+
+     else:
+         raise ValueError(f"Unknown method: {method}. Use 'endpoints', 'linear_fit', or 'median_direction'")
+
+     # OPTIMIZED: Center at midpoint of DIP and PIP (ring zone focus)
+     center = (pip + dip) / 2.0
+
+     # Calculate finger length (still use full finger for reference)
+     length = np.linalg.norm(tip - mcp)
+
+     # OPTIMIZED: Visual endpoints are DIP and PIP (ring zone segment)
+     # Extended slightly for visualization clarity
+     segment_length = np.linalg.norm(dip - pip)
+     extension_factor = 0.5  # Extend 50% beyond each endpoint for visualization
+     palm_end = pip - direction * (segment_length * extension_factor)
+     tip_end = dip + direction * (segment_length * extension_factor)
+
+     return {
+         "center": center.astype(np.float32),
+         "direction": direction.astype(np.float32),
+         "length": float(length),
+         "palm_end": palm_end.astype(np.float32),
+         "tip_end": tip_end.astype(np.float32),
+         "method": "landmarks",
+     }
+
+
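Why `linear_fit` beats `endpoints` on a bent finger can be seen numerically. With hypothetical joint positions where the distal half tilts, the MCP→TIP vector averages over the bend while PIP→DIP tracks only the ring-wearing segment:

```python
import numpy as np

# Hypothetical bent finger: MCP, PIP, DIP, TIP in (x, y) pixels,
# with the bend occurring at the PIP joint.
mcp, pip_, dip, tip = (np.array(p, dtype=float) for p in
                       [(0, 0), (0, 30), (10, 60), (20, 90)])

# "endpoints" method: whole-finger vector (averages over the bend)
endpoints_dir = (tip - mcp) / np.linalg.norm(tip - mcp)

# "linear_fit" method: PIP -> DIP only (the segment rings sit on)
linear_fit_dir = (dip - pip_) / np.linalg.norm(dip - pip_)

# Axis center used by the function: midpoint of PIP and DIP
center = (pip_ + dip) / 2.0
```

Both directions are unit vectors, but they disagree on this geometry, which is the gap the `linear_fit` default closes for ring measurements.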
+ def _estimate_axis_pca(
+     mask: np.ndarray,
+     landmarks: Optional[np.ndarray] = None,
+ ) -> Dict[str, Any]:
+     """
+     Estimate finger axis using PCA on mask points.
+
+     This is the original v0 implementation, now refactored as a helper function.
+
+     Args:
+         mask: Binary finger mask
+         landmarks: Optional finger landmarks for orientation (4x2 array)
+
+     Returns:
+         Dictionary containing axis data with method="pca"
+         Keys: center, direction, length, palm_end, tip_end, method
+     """
+     # Get all non-zero points in the mask
+     points = np.column_stack(np.where(mask > 0))  # Returns (row, col) i.e., (y, x)
+     points = points[:, [1, 0]]  # Convert to (x, y) format
+
+     if len(points) < MIN_MASK_POINTS_FOR_PCA:
+         raise ValueError("Not enough points in mask for axis estimation")
+
+     # Calculate center (centroid)
+     center = np.mean(points, axis=0)
+
+     # Center the points
+     centered = points - center
+
+     # Compute covariance matrix
+     cov = np.cov(centered.T)
+
+     # Compute eigenvalues and eigenvectors
+     eigenvalues, eigenvectors = np.linalg.eigh(cov)
+
+     # Principal axis is the eigenvector with largest eigenvalue
+     principal_idx = np.argmax(eigenvalues)
+     direction = eigenvectors[:, principal_idx]
+
+     # Ensure direction is a unit vector
+     direction = direction / np.linalg.norm(direction)
+
+     # Project all points onto the principal axis to find endpoints
+     projections = np.dot(centered, direction)
+     min_proj = np.min(projections)
+     max_proj = np.max(projections)
+
+     # Calculate finger length
+     length = max_proj - min_proj
+
+     # Calculate endpoints along the axis
+     endpoint1 = center + direction * min_proj
+     endpoint2 = center + direction * max_proj
+
+     # Determine which endpoint is palm vs tip
+     # If landmarks are provided, use them for orientation
+     if landmarks is not None and len(landmarks) == 4:
+         # landmarks[0] is MCP (palm side), landmarks[3] is tip
+         base_point = landmarks[0]
+         tip_point = landmarks[3]
+
+         # Determine which endpoint is closer to the base
+         dist1_to_base = np.linalg.norm(endpoint1 - base_point)
+         dist2_to_base = np.linalg.norm(endpoint2 - base_point)
+
+         if dist1_to_base < dist2_to_base:
+             palm_end = endpoint1
+             tip_end = endpoint2
+         else:
+             palm_end = endpoint2
+             tip_end = endpoint1
+             direction = -direction  # Flip direction to point from palm to tip
+     else:
+         # Without landmarks, use heuristic: tip is usually thinner
+         # Sample points near each endpoint
+         sample_distance = length * ENDPOINT_SAMPLE_DISTANCE_FACTOR
+
+         # Points near endpoint1
+         near_ep1 = points[np.abs(projections - min_proj) < sample_distance]
+         # Points near endpoint2
+         near_ep2 = points[np.abs(projections - max_proj) < sample_distance]
+
+         # Calculate average distance from axis for each end (proxy for thickness)
+         if len(near_ep1) > 0 and len(near_ep2) > 0:
+             # Project distances perpendicular to axis
+             perp_direction = np.array([-direction[1], direction[0]])
+             dist1 = np.mean(np.abs(np.dot(near_ep1 - center, perp_direction)))
+             dist2 = np.mean(np.abs(np.dot(near_ep2 - center, perp_direction)))
+
+             # Thinner end is likely the tip
+             if dist1 < dist2:
+                 palm_end = endpoint2
+                 tip_end = endpoint1
+                 direction = -direction
+             else:
+                 palm_end = endpoint1
+                 tip_end = endpoint2
+         else:
+             # Fallback: assume endpoint2 is tip (positive direction)
+             palm_end = endpoint1
+             tip_end = endpoint2
+
+     return {
+         "center": center.astype(np.float32),
+         "direction": direction.astype(np.float32),
+         "length": float(length),
+         "palm_end": palm_end.astype(np.float32),
+         "tip_end": tip_end.astype(np.float32),
+         "method": "pca",
+     }
+
+
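The PCA step in `_estimate_axis_pca` is just an eigendecomposition of the 2×2 covariance of the mask pixels: the eigenvector with the largest eigenvalue is the long axis. A standalone sketch on a synthetic elongated point cloud (standing in for finger-mask pixels):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic "mask pixels": long axis along x (spread 0..100),
# short axis along y (spread 0..10).
points = np.column_stack([
    rng.uniform(0, 100, 2000),
    rng.uniform(0, 10, 2000),
])

center = points.mean(axis=0)
centered = points - center
cov = np.cov(centered.T)                      # 2x2 covariance matrix
eigenvalues, eigenvectors = np.linalg.eigh(cov)
direction = eigenvectors[:, np.argmax(eigenvalues)]  # principal axis
direction = direction / np.linalg.norm(direction)
```

Note PCA gives an axis, not an orientation: `direction` is only defined up to sign, which is exactly why the function above needs landmarks (or the thinner-tip heuristic) to decide which end is the palm.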
+ def estimate_finger_axis(
+     mask: np.ndarray,
+     landmarks: Optional[np.ndarray] = None,
+     method: AxisMethod = "auto",
+     landmark_method: str = "linear_fit",
+ ) -> Dict[str, Any]:
+     """
+     Estimate the principal axis of a finger using landmarks (preferred) or PCA (fallback).
+
+     v1 Enhancement: Now supports landmark-based axis estimation for improved accuracy
+     on bent fingers. Auto mode (default) uses landmarks when available and valid,
+     falling back to PCA if needed.
+
+     Args:
+         mask: Binary finger mask
+         landmarks: Optional finger landmarks (4x2 array: [MCP, PIP, DIP, TIP])
+         method: Axis estimation method
+             - "auto": Use landmarks if available and valid, else PCA (recommended)
+             - "landmarks": Force landmark-based (fails if landmarks invalid)
+             - "pca": Force PCA-based (v0 behavior)
+         landmark_method: Method for landmark-based estimation
+             ("endpoints", "linear_fit", "median_direction")
+
+     Returns:
+         Dictionary containing:
+         - center: Axis center point (x, y)
+         - direction: Unit direction vector (dx, dy) pointing from palm to tip
+         - length: Estimated finger length in pixels
+         - palm_end: Palm-side endpoint
+         - tip_end: Fingertip endpoint
+         - method: Method actually used ("landmarks" or "pca")
+     """
+     if method == "pca":
+         # Force PCA method
+         return _estimate_axis_pca(mask, landmarks)
+
+     elif method == "landmarks":
+         # Force landmark method (fail if landmarks invalid)
+         if landmarks is None or len(landmarks) != 4:
+             raise ValueError("Landmark method requested but landmarks not available")
+         return estimate_finger_axis_from_landmarks(landmarks, method=landmark_method)
+
+     elif method == "auto":
+         # Auto mode: try landmarks first, fall back to PCA
+         try:
+             # Check if landmarks are available and valid
+             if landmarks is not None and len(landmarks) == 4:
+                 is_valid, reason = _validate_landmark_quality(landmarks)
+                 if is_valid:
+                     # Use landmark-based method
+                     logger.debug(f"Using landmark-based axis estimation ({landmark_method})")
+                     return estimate_finger_axis_from_landmarks(landmarks, method=landmark_method)
+                 else:
+                     logger.debug(f"Landmarks available but quality check failed: {reason}")
+                     logger.debug("Falling back to PCA axis estimation")
+             else:
+                 logger.debug("Landmarks not available, using PCA axis estimation")
+
+         except Exception as e:
+             logger.debug(f"Landmark-based axis estimation failed: {e}")
+             logger.debug("Falling back to PCA axis estimation")
+
+         # Fall back to PCA
+         return _estimate_axis_pca(mask, landmarks)
+
+     else:
+         raise ValueError(f"Unknown method: {method}. Use 'auto', 'landmarks', or 'pca'")
+
+
+ def localize_ring_zone(
+     axis_data: Dict[str, Any],
+     zone_start_pct: float = DEFAULT_ZONE_START_PCT,
+     zone_end_pct: float = DEFAULT_ZONE_END_PCT,
+ ) -> Dict[str, Any]:
+     """
+     Localize the ring-wearing zone along the finger axis.
+
+     Args:
+         axis_data: Output from estimate_finger_axis() containing center,
+             direction, length, palm_end, tip_end
+         zone_start_pct: Zone start as percentage from palm (default 15%)
+         zone_end_pct: Zone end as percentage from palm (default 25%)
+
+     Returns:
+         Dictionary containing:
+         - start_point: Zone start position (x, y)
+         - end_point: Zone end position (x, y)
+         - center_point: Zone center position (x, y)
+         - length: Zone length in pixels
+         - start_pct: Start percentage used
+         - end_pct: End percentage used
+         - localization_method: "percentage"
+     """
+     # Extract axis information
+     palm_end = axis_data["palm_end"]
+     tip_end = axis_data["tip_end"]
+     direction = axis_data["direction"]
+     finger_length = axis_data["length"]
+
+     # Calculate zone positions along the axis
+     # Start at zone_start_pct from palm end
+     start_distance = finger_length * zone_start_pct
+     start_point = palm_end + direction * start_distance
+
+     # End at zone_end_pct from palm end
+     end_distance = finger_length * zone_end_pct
+     end_point = palm_end + direction * end_distance
+
+     # Calculate zone center
+     center_point = (start_point + end_point) / 2.0
+
+     # Zone length
+     zone_length = end_distance - start_distance
+
+     return {
+         "start_point": start_point.astype(np.float32),
+         "end_point": end_point.astype(np.float32),
+         "center_point": center_point.astype(np.float32),
+         "length": float(zone_length),
+         "start_pct": zone_start_pct,
+         "end_pct": zone_end_pct,
+         "localization_method": "percentage",
+     }
+
+
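The percentage-based zone is plain linear interpolation along the palm→tip direction. A worked example with a hypothetical vertical axis (palm at (0, 100), tip at (0, 0), so direction is (0, -1) and length 100 px) using the 15%/25% defaults:

```python
import numpy as np

# Hypothetical axis: 100 px long, pointing straight up in image coords.
palm_end = np.array([0.0, 100.0])
direction = np.array([0.0, -1.0])   # unit vector, palm -> tip
finger_length = 100.0

zone_start_pct, zone_end_pct = 0.15, 0.25

# Walk along the axis from the palm end, as localize_ring_zone does
start_point = palm_end + direction * (finger_length * zone_start_pct)
end_point = palm_end + direction * (finger_length * zone_end_pct)
center_point = (start_point + end_point) / 2.0
zone_length = finger_length * (zone_end_pct - zone_start_pct)
```

With these inputs the zone runs from y=85 down to y=75, centered at y=80, and spans 10 px.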
+ def localize_ring_zone_from_landmarks(
+     landmarks: np.ndarray,
+     axis_data: Dict[str, Any],
+     zone_type: str = "percentage",
+     zone_start_pct: float = DEFAULT_ZONE_START_PCT,
+     zone_end_pct: float = DEFAULT_ZONE_END_PCT,
+ ) -> Dict[str, Any]:
+     """
+     Localize ring-wearing zone using anatomical landmarks.
+
+     v1 Enhancement: Provides anatomical-based ring zone localization
+     as an alternative to the percentage-based approach.
+
+     Args:
+         landmarks: 4x2 array of finger landmarks [MCP, PIP, DIP, TIP]
+         axis_data: Output from estimate_finger_axis() containing center,
+             direction, length, palm_end, tip_end
+         zone_type: Zone localization method
+             - "percentage": 15-25% from palm (v0 compatible, default)
+             - "anatomical": Centered on PIP joint with proportional width
+         zone_start_pct: Zone start percentage (percentage mode only)
+         zone_end_pct: Zone end percentage (percentage mode only)
+
+     Returns:
+         Dictionary containing:
+         - start_point: Zone start position (x, y)
+         - end_point: Zone end position (x, y)
+         - center_point: Zone center position (x, y)
+         - length: Zone length in pixels
+         - localization_method: "percentage" or "anatomical"
+     """
+     if zone_type == "percentage":
+         # Use percentage-based method (v0 compatible)
+         result = localize_ring_zone(axis_data, zone_start_pct, zone_end_pct)
+         return result
+
+     elif zone_type == "anatomical":
+         # Anatomical mode: Target the proximal phalanx (ring-wearing segment)
+         # Upper bound: PIP joint (toward fingertip)
+         # Lower bound: PIP - (DIP - PIP) = one segment length below PIP (toward palm)
+         # This spans the proximal phalanx where rings are typically worn
+         pip = landmarks[1]
+         dip = landmarks[2]
+
+         # Calculate segment length (DIP to PIP distance)
+         segment_vector = dip - pip  # Vector from PIP to DIP
+
+         # Ring zone spans from PIP down toward palm by one segment length
+         # end_point is toward fingertip (PIP)
+         # start_point is toward palm (PIP - segment_vector = one segment below PIP)
+         end_point = pip.copy()  # Upper bound at PIP
+         start_point = pip - segment_vector  # Lower bound one segment below PIP
+
+         # Calculate zone center and length
+         center_point = (start_point + end_point) / 2.0
+         zone_length = np.linalg.norm(end_point - start_point)
+
+         return {
+             "start_point": start_point.astype(np.float32),
+             "end_point": end_point.astype(np.float32),
+             "center_point": center_point.astype(np.float32),
+             "length": float(zone_length),
+             "localization_method": "anatomical",
+         }
+
+     else:
+         raise ValueError(f"Unknown zone_type: {zone_type}. Use 'percentage' or 'anatomical'")
+
+
+ def compute_cross_section_width(
+     contour: np.ndarray,
+     axis_data: Dict[str, Any],
+     zone_data: Dict[str, Any],
+     num_samples: int = 20,
+ ) -> Dict[str, Any]:
+     """
+     Measure finger width by sampling cross-sections perpendicular to the axis.
+
+     Args:
+         contour: Finger contour points (Nx2 array in x,y format)
+         axis_data: Output from estimate_finger_axis() containing center,
+             direction, length, palm_end, tip_end
+         zone_data: Output from localize_ring_zone() containing start_point,
+             end_point, center_point
+         num_samples: Number of cross-section samples (default 20)
+
+     Returns:
+         Dictionary containing:
+         - widths_px: List of width measurements in pixels
+         - sample_points: List of (left, right) intersection point tuples
+         - median_width_px: Median width in pixels
+         - std_width_px: Standard deviation of widths
+         - mean_width_px: Mean width in pixels
+         - num_samples: Actual number of successful measurements
+     """
+     direction = axis_data["direction"]
+     start_point = zone_data["start_point"]
+     end_point = zone_data["end_point"]
+
+     # Perpendicular direction (rotate 90 degrees)
+     perp_direction = np.array([-direction[1], direction[0]], dtype=np.float32)
+
+     widths = []
+     sample_points_list = []
+
+     # Generate sample points along the zone
+     for i in range(num_samples):
+         # Interpolate between start and end
+         t = i / (num_samples - 1) if num_samples > 1 else 0.5
+         sample_center = start_point + t * (end_point - start_point)
+
+         # Find intersections with contour along perpendicular line
+         intersections = line_contour_intersections(
+             contour, sample_center, perp_direction
+         )
+
+         if len(intersections) >= 2:
+             # Convert to numpy array for distance calculations
+             pts = np.array(intersections)
+
+             # Find the two points that are farthest apart
+             # This handles cases where the line intersects multiple times
+             max_dist = 0
+             best_pair = None
+
+             for j in range(len(pts)):
+                 for k in range(j + 1, len(pts)):
+                     dist = np.linalg.norm(pts[j] - pts[k])
+                     if dist > max_dist:
+                         max_dist = dist
+                         best_pair = (pts[j], pts[k])
+
+             if best_pair is not None:
+                 widths.append(max_dist)
+                 sample_points_list.append(best_pair)
+
+     if len(widths) == 0:
+         raise ValueError("No valid width measurements found in ring zone")
+
+     widths = np.array(widths)
+
+     # Calculate statistics
+     median_width = float(np.median(widths))
+     mean_width = float(np.mean(widths))
+     std_width = float(np.std(widths))
+
+     return {
+         "widths_px": widths.tolist(),
+         "sample_points": sample_points_list,
+         "median_width_px": median_width,
+         "mean_width_px": mean_width,
+         "std_width_px": std_width,
+         "num_samples": len(widths),
+     }
+
+
+ def line_contour_intersections(
+     contour: np.ndarray,
+     point: Tuple[float, float],
+     direction: Tuple[float, float],
+ ) -> List[Tuple[float, float]]:
+     """
+     Find intersection points between a line and a contour.
+
+     Uses parametric line-segment intersection to find where an infinite line
+     intersects with the contour edges.
+
+     Args:
+         contour: Contour points (Nx2 array in x,y format)
+         point: A point on the line (x, y)
+         direction: Line direction vector (dx, dy), will be normalized
+
+     Returns:
+         List of intersection points as (x, y) tuples
+     """
+     intersections = []
+
+     # Normalize direction
+     direction = np.array(direction, dtype=np.float32)
+     direction = direction / (np.linalg.norm(direction) + EPSILON)
+
+     point = np.array(point, dtype=np.float32)
+
+     # Check each edge of the contour
+     n = len(contour)
+     for i in range(n):
+         p1 = contour[i]
+         p2 = contour[(i + 1) % n]
+
+         # Find intersection between line and edge segment
+         # Line: P = point + t * direction
+         # Segment: Q = p1 + s * (p2 - p1), where s ∈ [0, 1]
+
+         edge_vec = p2 - p1
+
+         # Solve: point + t * direction = p1 + s * edge_vec
+         # Rearranged: t * direction - s * edge_vec = p1 - point
+
+         # Create matrix [direction, -edge_vec] * [t, s]^T = p1 - point
+         A = np.column_stack([direction, -edge_vec])
+         b = p1 - point
+
+         # Check if matrix is singular (parallel lines)
+         det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
+         if abs(det) < MIN_DETERMINANT_FOR_INTERSECTION:
+             continue
+
+         # Solve for t and s
+         try:
+             params = np.linalg.solve(A, b)
+             t, s = params[0], params[1]
+
+             # Check if intersection is on the edge segment (s ∈ [0, 1])
+             if 0 <= s <= 1:
+                 intersection = point + t * direction
+                 intersections.append(tuple(intersection))
+         except np.linalg.LinAlgError:
+             continue
+
+     return intersections
+
+
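The parametric intersection above is easiest to verify against a contour whose answer is known. This standalone sketch re-implements the same solve (with a fixed `1e-9` determinant cutoff in place of `MIN_DETERMINANT_FOR_INTERSECTION`) and casts a horizontal line through a 10×10 square:

```python
import numpy as np

def line_contour_intersections(contour, point, direction):
    """Solve point + t*d = p1 + s*(p2 - p1) for each contour edge;
    keep hits whose segment parameter s lies in [0, 1]."""
    point = np.asarray(point, dtype=float)
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    hits = []
    n = len(contour)
    for i in range(n):
        p1, p2 = contour[i], contour[(i + 1) % n]
        edge = p2 - p1
        A = np.column_stack([d, -edge])
        det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
        if abs(det) < 1e-9:   # parallel edge, no unique solution
            continue
        t, s = np.linalg.solve(A, p1 - point)
        if 0 <= s <= 1:
            hits.append(point + t * d)
    return hits

square = np.array([[0, 0], [10, 0], [10, 10], [0, 10]], dtype=float)
hits = line_contour_intersections(square, (5, 5), (1, 0))
# Farthest-apart pair of hits = cross-section width, as in
# compute_cross_section_width
width = max(np.linalg.norm(a - b) for a in hits for b in hits)
```

The horizontal line hits the left edge at (0, 5) and the right edge at (10, 5); the two parallel edges are skipped by the determinant check, and the resulting width is 10.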
+ # ============================================================================
+ # Precise Image Rotation for Finger Alignment
+ # ============================================================================
+
+ def calculate_angle_from_vertical(direction: np.ndarray) -> float:
+     """
+     Calculate the rotation needed to align a direction vector to vertical (upward).
+
+     In image coordinates, vertical upward is (0, -1) in (x, y) format.
+
+     Args:
+         direction: Unit direction vector (dx, dy) in (x, y) format
+
+     Returns:
+         Rotation angle in degrees to apply to align direction to vertical.
+         Positive = need to rotate counter-clockwise (CCW) in image coordinates.
+         Range: [-180, 180]
+     """
+     # Vertical upward in image coordinates: (0, -1)
+     vertical = np.array([0.0, -1.0])
+
+     # Calculate angle using atan2(cross_product, dot_product)
+     # cross = dx * (-1) - dy * 0 = -dx
+     # dot = dx * 0 + dy * (-1) = -dy
+     cross = direction[0] * vertical[1] - direction[1] * vertical[0]
+     dot = np.dot(direction, vertical)
+
+     angle_rad = np.arctan2(cross, dot)
+     angle_deg = np.degrees(angle_rad)
+
+     # Negate the angle: if finger is tilted +10° CW from vertical,
+     # we need to rotate -10° (CCW) to straighten it
+     return -angle_deg
+
+
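Two quick sanity cases make the `atan2(cross, dot)` convention above concrete; `angle_from_vertical` here mirrors the function's arithmetic exactly so the expected values can be checked by hand:

```python
import numpy as np

def angle_from_vertical(direction):
    """Rotation (degrees) that aligns `direction` with image-up (0, -1),
    using the same cross/dot atan2 formulation as above."""
    vertical = np.array([0.0, -1.0])
    cross = direction[0] * vertical[1] - direction[1] * vertical[0]
    dot = float(np.dot(direction, vertical))
    return -float(np.degrees(np.arctan2(cross, dot)))

up = np.array([0.0, -1.0])      # already vertical: no rotation needed
right = np.array([1.0, 0.0])    # pointing right: cross=-1, dot=0 -> 90
angle_up = angle_from_vertical(up)
angle_right = angle_from_vertical(right)
```

For `up`, cross = 0 and dot = 1, so the angle is 0; for `right`, atan2(-1, 0) = -90° which the final negation turns into +90°.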
+ def rotate_image_precise(
+     image: np.ndarray,
+     angle_degrees: float,
+     center: Optional[Tuple[float, float]] = None
+ ) -> Tuple[np.ndarray, np.ndarray]:
+     """
+     Rotate image by a precise angle around a center point.
+
+     Args:
+         image: Input image (grayscale or BGR)
+         angle_degrees: Rotation angle in degrees (positive = clockwise)
+         center: Rotation center (x, y). If None, uses image center.
+
+     Returns:
+         Tuple of:
+         - rotated_image: Rotated image (same size as input)
+         - rotation_matrix: 2x3 affine transformation matrix
+     """
+     h, w = image.shape[:2]
+
+     if center is None:
+         center = (w / 2.0, h / 2.0)
+
+     # Get rotation matrix (OpenCV uses clockwise positive)
+     rotation_matrix = cv2.getRotationMatrix2D(center, angle_degrees, scale=1.0)
+
+     # Apply rotation
+     rotated = cv2.warpAffine(
+         image, rotation_matrix, (w, h),
+         flags=cv2.INTER_LINEAR,
+         borderMode=cv2.BORDER_CONSTANT,
+         borderValue=0
+     )
+
+     return rotated, rotation_matrix
+
+
+ def transform_points_rotation(
715
+ points: np.ndarray,
716
+ rotation_matrix: np.ndarray
717
+ ) -> np.ndarray:
718
+ """
719
+ Transform points using a rotation matrix from cv2.getRotationMatrix2D.
720
+
721
+ Args:
722
+ points: Nx2 array of points in (x, y) format
723
+ rotation_matrix: 2x3 affine transformation matrix from cv2.getRotationMatrix2D
724
+
725
+ Returns:
726
+ Nx2 array of transformed points in (x, y) format
727
+ """
728
+ # Add homogeneous coordinate (1) to each point: (x, y) -> (x, y, 1)
729
+ n_points = points.shape[0]
730
+ homogeneous = np.hstack([points, np.ones((n_points, 1))])
731
+
732
+ # Apply transformation: [2x3] @ [3xN]^T -> [2xN]^T
733
+ transformed = (rotation_matrix @ homogeneous.T).T
734
+
735
+ return transformed.astype(np.float32)
736
+
737
+
738
+ def rotate_axis_data(
739
+ axis_data: Dict[str, Any],
740
+ rotation_matrix: np.ndarray
741
+ ) -> Dict[str, Any]:
742
+ """
743
+ Update axis data after image rotation.
744
+
745
+ Args:
746
+ axis_data: Axis data dictionary with center, direction, palm_end, tip_end
747
+ rotation_matrix: 2x3 rotation matrix
748
+
749
+ Returns:
750
+ Updated axis data with transformed coordinates
751
+ """
752
+ rotated = axis_data.copy()
753
+
754
+ # Transform center point
755
+ center = axis_data["center"].reshape(1, 2)
756
+ rotated["center"] = transform_points_rotation(center, rotation_matrix)[0]
757
+
758
+ # Transform direction vector (rotation only, no translation)
759
+ # For direction vectors, we only apply the rotation part (2x2)
760
+ rotation_only = rotation_matrix[:2, :2]
761
+ direction = axis_data["direction"].reshape(2, 1)
762
+ rotated_direction = (rotation_only @ direction).flatten()
763
+ rotated["direction"] = rotated_direction / np.linalg.norm(rotated_direction)
764
+
765
+ # Transform endpoints if they exist
766
+ if "palm_end" in axis_data:
767
+ palm_end = axis_data["palm_end"].reshape(1, 2)
768
+ rotated["palm_end"] = transform_points_rotation(palm_end, rotation_matrix)[0]
769
+
770
+ if "tip_end" in axis_data:
771
+ tip_end = axis_data["tip_end"].reshape(1, 2)
772
+ rotated["tip_end"] = transform_points_rotation(tip_end, rotation_matrix)[0]
773
+
774
+ return rotated
775
+
776
+
777
+ def rotate_contour(
778
+ contour: np.ndarray,
779
+ rotation_matrix: np.ndarray
780
+ ) -> np.ndarray:
781
+ """
782
+ Rotate a contour using rotation matrix.
783
+
784
+ Args:
785
+ contour: Nx2 array of contour points in (x, y) format
786
+ rotation_matrix: 2x3 rotation matrix
787
+
788
+ Returns:
789
+ Rotated contour in same format
790
+ """
791
+ return transform_points_rotation(contour, rotation_matrix)
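The homogeneous-coordinate trick in `transform_points_rotation` can be exercised without OpenCV by hand-building a 2x3 matrix (here a pure 90° rotation about the origin, with a zero translation column — not a matrix produced by cv2):

```python
import numpy as np

def transform_points_rotation(points: np.ndarray,
                              rotation_matrix: np.ndarray) -> np.ndarray:
    # (x, y) -> (x, y, 1) so a single matrix product applies both the
    # rotation and the translation column of the 2x3 affine matrix.
    homogeneous = np.hstack([points, np.ones((points.shape[0], 1))])
    return (rotation_matrix @ homogeneous.T).T.astype(np.float32)

# 90° rotation about the origin; third column is the translation (0, 0).
matrix = np.array([[0.0, -1.0, 0.0],
                   [1.0,  0.0, 0.0]])
points = np.array([[1.0, 0.0], [0.0, 2.0]])
out = transform_points_rotation(points, matrix)  # [[0, 1], [-2, 0]]
```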
src/geometry_constants.py ADDED
@@ -0,0 +1,54 @@
+ """
+ Constants for the geometric computation module.
+
+ This module contains thresholds and parameters used in finger axis
+ estimation and ring zone localization.
+ """
+
+ # =============================================================================
+ # Landmark Quality Validation Constants
+ # =============================================================================
+
+ # Minimum distance between consecutive landmarks (pixels).
+ # Anything less suggests collapsed/invalid landmarks.
+ MIN_LANDMARK_SPACING_PX = 5.0
+
+ # Minimum total finger length from MCP to TIP (pixels).
+ # An entire finger shorter than this suggests an invalid detection.
+ MIN_FINGER_LENGTH_PX = 20.0
+
+
+ # =============================================================================
+ # Finger Axis Estimation Constants
+ # =============================================================================
+
+ # Epsilon for avoiding division by zero in normalization
+ EPSILON = 1e-8
+
+ # Minimum number of mask points required for PCA
+ MIN_MASK_POINTS_FOR_PCA = 10
+
+ # Sample distance factor for the endpoint thickness heuristic,
+ # used when determining palm vs tip end without landmarks
+ ENDPOINT_SAMPLE_DISTANCE_FACTOR = 0.1  # 10% of finger length
+
+
+ # =============================================================================
+ # Ring Zone Localization Constants
+ # =============================================================================
+
+ # Default ring zone position as a fraction of finger length from the palm
+ DEFAULT_ZONE_START_PCT = 0.15  # 15% from palm end
+ DEFAULT_ZONE_END_PCT = 0.25  # 25% from palm end
+
+ # Anatomical zone width factor (for anatomical localization mode):
+ # zone width = MCP-PIP distance * this factor
+ ANATOMICAL_ZONE_WIDTH_FACTOR = 0.5  # 50% of MCP-PIP segment (25% each side)
+
+
+ # =============================================================================
+ # Line-Contour Intersection Constants
+ # =============================================================================
+
+ # Determinant values below this are treated as parallel lines
+ MIN_DETERMINANT_FOR_INTERSECTION = 1e-8
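To make the zone percentages concrete, here is an illustrative helper (`zone_bounds_px` is not part of the module; it just shows how the defaults map to pixel offsets along the finger axis):

```python
# Default ring zone, as fractions of finger length measured from the palm end.
DEFAULT_ZONE_START_PCT = 0.15
DEFAULT_ZONE_END_PCT = 0.25

def zone_bounds_px(finger_length_px: float) -> tuple:
    """Distance from the palm end to the zone's start/end, in pixels."""
    return (finger_length_px * DEFAULT_ZONE_START_PCT,
            finger_length_px * DEFAULT_ZONE_END_PCT)

bounds = zone_bounds_px(400.0)  # (60.0, 100.0) for a 400 px finger
```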
src/image_quality.py ADDED
@@ -0,0 +1,181 @@
+ """
+ Image quality assessment utilities.
+
+ This module handles:
+ - Blur detection using Laplacian variance
+ - Exposure/contrast analysis
+ - Overall quality scoring
+ """
+
+ import cv2
+ import numpy as np
+ from typing import Dict, Any, Tuple
+
+
+ # Quality thresholds
+ BLUR_THRESHOLD = 20.0  # Laplacian variance below this is considered blurry
+ MIN_BRIGHTNESS = 40  # Mean brightness below this is underexposed
+ MAX_BRIGHTNESS = 220  # Mean brightness above this is overexposed
+ MIN_CONTRAST = 30  # Std dev below this indicates low contrast
+
+
+ def detect_blur(image: np.ndarray) -> Tuple[float, bool]:
+     """
+     Detect image blur using the Laplacian variance method.
+
+     The Laplacian operator highlights regions of rapid intensity change,
+     so a well-focused image will have high variance in the Laplacian response.
+
+     Args:
+         image: Input BGR image
+
+     Returns:
+         Tuple of (blur_score, is_sharp)
+             - blur_score: Laplacian variance (higher = sharper)
+             - is_sharp: True if image passes the sharpness threshold
+     """
+     # Convert to grayscale if needed
+     if len(image.shape) == 3:
+         gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
+     else:
+         gray = image
+
+     # Compute Laplacian
+     laplacian = cv2.Laplacian(gray, cv2.CV_64F)
+
+     # Variance of the Laplacian indicates focus quality
+     blur_score = laplacian.var()
+
+     is_sharp = blur_score >= BLUR_THRESHOLD
+
+     return blur_score, is_sharp
+
+
+ def check_exposure(image: np.ndarray) -> Dict[str, Any]:
+     """
+     Check image exposure and contrast using global brightness statistics.
+
+     Args:
+         image: Input BGR image
+
+     Returns:
+         Dictionary containing:
+             - brightness: Mean brightness (0-255)
+             - contrast: Standard deviation of brightness
+             - is_underexposed: True if image is too dark
+             - is_overexposed: True if image is too bright
+             - has_good_contrast: True if contrast is sufficient
+     """
+     # Convert to grayscale if needed
+     if len(image.shape) == 3:
+         gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
+     else:
+         gray = image
+
+     # Calculate statistics
+     brightness = float(np.mean(gray))
+     contrast = float(np.std(gray))
+
+     # Check exposure conditions
+     is_underexposed = brightness < MIN_BRIGHTNESS
+     is_overexposed = brightness > MAX_BRIGHTNESS
+     has_good_contrast = contrast >= MIN_CONTRAST
+
+     return {
+         "brightness": brightness,
+         "contrast": contrast,
+         "is_underexposed": is_underexposed,
+         "is_overexposed": is_overexposed,
+         "has_good_contrast": has_good_contrast,
+     }
+
+
+ def check_resolution(image: np.ndarray, min_dimension: int = 720) -> Dict[str, Any]:
+     """
+     Check if image resolution is sufficient.
+
+     Args:
+         image: Input BGR image
+         min_dimension: Minimum acceptable dimension (default 720 for 720p)
+
+     Returns:
+         Dictionary containing:
+             - width: Image width in pixels
+             - height: Image height in pixels
+             - is_sufficient: True if resolution meets the minimum
+     """
+     height, width = image.shape[:2]
+     min_dim = min(width, height)
+
+     return {
+         "width": width,
+         "height": height,
+         "is_sufficient": min_dim >= min_dimension,
+     }
+
+
+ def assess_image_quality(image: np.ndarray) -> Dict[str, Any]:
+     """
+     Comprehensive image quality assessment.
+
+     Combines blur detection, exposure check, and resolution check
+     to determine if the image is suitable for processing.
+
+     Args:
+         image: Input BGR image
+
+     Returns:
+         Dictionary containing:
+             - passed: True if image passes all quality checks
+             - blur_score: Laplacian variance score
+             - brightness: Mean brightness
+             - contrast: Standard deviation
+             - resolution: (width, height)
+             - issues: List of quality issues found
+             - fail_reason: Primary failure reason if failed, else None
+     """
+     issues = []
+     fail_reason = None
+
+     # Check blur
+     blur_score, is_sharp = detect_blur(image)
+     if not is_sharp:
+         issues.append(f"Image is blurry (score: {blur_score:.1f}, threshold: {BLUR_THRESHOLD})")
+         if fail_reason is None:
+             fail_reason = "image_too_blurry"
+
+     # Check exposure
+     exposure = check_exposure(image)
+     if exposure["is_underexposed"]:
+         issues.append(f"Image is underexposed (brightness: {exposure['brightness']:.1f})")
+         if fail_reason is None:
+             fail_reason = "image_underexposed"
+     if exposure["is_overexposed"]:
+         issues.append(f"Image is overexposed (brightness: {exposure['brightness']:.1f})")
+         if fail_reason is None:
+             fail_reason = "image_overexposed"
+     if not exposure["has_good_contrast"]:
+         issues.append(f"Image has low contrast (std: {exposure['contrast']:.1f})")
+         if fail_reason is None:
+             fail_reason = "image_low_contrast"
+
+     # Check resolution
+     resolution = check_resolution(image)
+     if not resolution["is_sufficient"]:
+         issues.append(
+             f"Resolution too low ({resolution['width']}x{resolution['height']})"
+         )
+         if fail_reason is None:
+             fail_reason = "image_resolution_too_low"
+
+     passed = len(issues) == 0
+
+     return {
+         "passed": passed,
+         "blur_score": round(blur_score, 2),
+         "brightness": round(exposure["brightness"], 2),
+         "contrast": round(exposure["contrast"], 2),
+         "resolution": (resolution["width"], resolution["height"]),
+         "issues": issues,
+         "fail_reason": fail_reason,
+     }
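The exposure thresholds are plain mean/std cutoffs, which a numpy-only sketch makes explicit (`check_exposure_gray` is illustrative — it mirrors `check_exposure` but assumes an already-grayscale array so no cv2 conversion is needed):

```python
import numpy as np

MIN_BRIGHTNESS, MAX_BRIGHTNESS, MIN_CONTRAST = 40, 220, 30

def check_exposure_gray(gray: np.ndarray) -> dict:
    # Same statistics as check_exposure: global mean for exposure,
    # global standard deviation for contrast.
    brightness = float(np.mean(gray))
    contrast = float(np.std(gray))
    return {
        "brightness": brightness,
        "contrast": contrast,
        "is_underexposed": brightness < MIN_BRIGHTNESS,
        "is_overexposed": brightness > MAX_BRIGHTNESS,
        "has_good_contrast": contrast >= MIN_CONTRAST,
    }

# A uniformly dark frame: underexposed, and zero contrast.
dark = np.full((120, 160), 10, dtype=np.uint8)
report = check_exposure_gray(dark)
```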
src/visualization.py ADDED
@@ -0,0 +1,366 @@
+ """
+ Debug visualization utilities.
+
+ This module handles:
+ - Credit card overlay
+ - Finger contour and axis visualization
+ - Ring zone highlighting
+ - Cross-section measurement display
+ - Result annotation
+ """
+
+ import cv2
+ import numpy as np
+ from typing import Dict, Any, Optional, List, Tuple
+
+ # Import shared visualization constants
+ from .viz_constants import (
+     FONT_FACE,
+     Color,
+     FontScale,
+     FontThickness,
+     Size,
+     Layout,
+     get_scaled_font_size,
+ )
+
+ # Font scaling parameters (specific to the final visualization)
+ FONT_BASE_SCALE = FontScale.BODY  # Base font scale at reference height
+ FONT_REFERENCE_HEIGHT = 1200  # Reference image height for font scaling
+ FONT_MIN_SCALE = FontScale.BODY  # Minimum font scale regardless of image size
+
+
+ def get_scaled_font_params(image_height: int) -> Dict[str, float]:
+     """
+     Calculate font parameters scaled to image dimensions.
+
+     Args:
+         image_height: Height of the image in pixels
+
+     Returns:
+         Dictionary containing scaled font parameters
+     """
+     font_scale = max(FONT_MIN_SCALE, image_height / FONT_REFERENCE_HEIGHT)
+     scale_factor = font_scale / FONT_BASE_SCALE
+
+     return {
+         "font_scale": font_scale,
+         "text_thickness": int(FontThickness.BODY * scale_factor),
+         "line_thickness": int(Size.LINE_THICK * scale_factor),
+         "contour_thickness": int(Size.CONTOUR_THICK * scale_factor),
+         "corner_radius": int(Size.CORNER_RADIUS * scale_factor),
+         "endpoint_radius": int(Size.ENDPOINT_RADIUS * scale_factor),
+         "intersection_radius": int(Size.INTERSECTION_RADIUS * scale_factor),
+         "text_offset": int(Layout.TEXT_OFFSET_Y * scale_factor),
+         "label_offset": int(Layout.LABEL_OFFSET * scale_factor),
+         "line_height": int(Layout.RESULT_TEXT_LINE_HEIGHT * scale_factor),
+         "y_start": int(Layout.RESULT_TEXT_Y_START * scale_factor),
+         "x_offset": int(Layout.RESULT_TEXT_X_OFFSET * scale_factor),
+     }
+
+
+ def create_debug_visualization(
+     image: np.ndarray,
+     card_result: Optional[Dict[str, Any]] = None,
+     contour: Optional[np.ndarray] = None,
+     axis_data: Optional[Dict[str, Any]] = None,
+     zone_data: Optional[Dict[str, Any]] = None,
+     width_data: Optional[Dict[str, Any]] = None,
+     measurement_cm: Optional[float] = None,
+     confidence: Optional[float] = None,
+     scale_px_per_cm: Optional[float] = None,
+ ) -> np.ndarray:
+     """
+     Create a debug visualization overlay on the original image.
+
+     Args:
+         image: Original BGR image
+         card_result: Credit card detection result
+         contour: Finger contour points
+         axis_data: Finger axis data
+         zone_data: Ring zone data
+         width_data: Width measurement data
+         measurement_cm: Final measurement in cm
+         confidence: Overall confidence score
+         scale_px_per_cm: Scale factor
+
+     Returns:
+         Annotated BGR image
+     """
+     # Create a copy for drawing
+     vis = image.copy()
+
+     # Draw credit card overlay
+     if card_result is not None:
+         vis = draw_card_overlay(vis, card_result, scale_px_per_cm)
+
+     # Draw finger contour and axis
+     if contour is not None:
+         vis = draw_finger_contour(vis, contour)
+
+     if axis_data is not None:
+         vis = draw_finger_axis(vis, axis_data)
+
+     # Draw ring zone
+     if zone_data is not None and axis_data is not None:
+         vis = draw_ring_zone(vis, zone_data, axis_data)
+
+     # Draw cross-section measurements
+     if width_data is not None and zone_data is not None:
+         vis = draw_cross_sections(vis, width_data)
+
+     # Add measurement annotation with JSON information
+     if measurement_cm is not None and confidence is not None:
+         vis = add_measurement_text(
+             vis,
+             measurement_cm,
+             confidence,
+             scale_px_per_cm=scale_px_per_cm,
+             card_detected=card_result is not None,
+             finger_detected=contour is not None,
+             view_angle_ok=True,  # hardcoded here; this function does not expose the flag
+         )
+
+     return vis
+
+
+ def draw_card_overlay(
+     image: np.ndarray,
+     card_result: Dict[str, Any],
+     scale_px_per_cm: Optional[float] = None,
+ ) -> np.ndarray:
+     """Draw credit card detection overlay."""
+     corners = card_result["corners"].astype(np.int32)
+     params = get_scaled_font_params(image.shape[0])
+
+     # Draw quadrilateral
+     cv2.polylines(image, [corners], isClosed=True, color=Color.CARD,
+                   thickness=params["contour_thickness"])
+
+     # Draw corner points with labels
+     corner_labels = ["TL", "TR", "BR", "BL"]
+     for corner, label in zip(corners, corner_labels):
+         cv2.circle(image, tuple(corner), params["corner_radius"], Color.CARD, -1)
+         cv2.putText(
+             image,
+             label,
+             tuple(corner + np.array([params["label_offset"], -params["label_offset"]])),
+             FONT_FACE,
+             params["font_scale"],
+             Color.CARD,
+             params["text_thickness"],
+         )
+
+     # Add scale annotation
+     if scale_px_per_cm is not None:
+         center = np.mean(corners, axis=0).astype(np.int32)
+         text = f"Card: {scale_px_per_cm:.1f} px/cm"
+         cv2.putText(
+             image,
+             text,
+             tuple(center),
+             FONT_FACE,
+             params["font_scale"] * 1.2,
+             Color.CARD,
+             params["text_thickness"],
+         )
+
+     return image
+
+
+ def draw_finger_contour(
+     image: np.ndarray,
+     contour: np.ndarray,
+ ) -> np.ndarray:
+     """Draw finger contour."""
+     params = get_scaled_font_params(image.shape[0])
+     contour_int = contour.astype(np.int32).reshape((-1, 1, 2))
+     cv2.polylines(image, [contour_int], isClosed=True, color=Color.FINGER,
+                   thickness=params["contour_thickness"])
+     return image
+
+
+ def draw_finger_axis(
+     image: np.ndarray,
+     axis_data: Dict[str, Any],
+ ) -> np.ndarray:
+     """Draw finger axis line."""
+     palm_end = axis_data["palm_end"].astype(np.int32)
+     tip_end = axis_data["tip_end"].astype(np.int32)
+     params = get_scaled_font_params(image.shape[0])
+
+     # Draw axis line
+     cv2.line(image, tuple(palm_end), tuple(tip_end), Color.AXIS_LINE,
+              params["line_thickness"])
+
+     # Mark endpoints
+     cv2.circle(image, tuple(palm_end), params["endpoint_radius"], Color.AXIS_PALM, -1)
+     cv2.circle(image, tuple(tip_end), params["endpoint_radius"], Color.AXIS_TIP, -1)
+
+     # Add labels
+     cv2.putText(
+         image,
+         "Palm",
+         tuple(palm_end + np.array([params["text_offset"], params["text_offset"]])),
+         FONT_FACE,
+         params["font_scale"],
+         Color.AXIS_PALM,
+         params["text_thickness"],
+     )
+     cv2.putText(
+         image,
+         "Tip",
+         tuple(tip_end + np.array([params["text_offset"], params["text_offset"]])),
+         FONT_FACE,
+         params["font_scale"],
+         Color.AXIS_TIP,
+         params["text_thickness"],
+     )
+
+     return image
+
+
+ def draw_ring_zone(
+     image: np.ndarray,
+     zone_data: Dict[str, Any],
+     axis_data: Dict[str, Any],
+ ) -> np.ndarray:
+     """Draw ring-wearing zone band."""
+     direction = axis_data["direction"]
+     perp = np.array([-direction[1], direction[0]], dtype=np.float32)
+
+     start_point = zone_data["start_point"]
+     end_point = zone_data["end_point"]
+
+     # Create zone band (perpendicular lines at start and end).
+     # Make the band wide enough to be visible.
+     band_width = 200  # pixels
+
+     start_left = start_point + perp * band_width
+     start_right = start_point - perp * band_width
+     end_left = end_point + perp * band_width
+     end_right = end_point - perp * band_width
+
+     # Draw zone band as a semi-transparent overlay
+     overlay = image.copy()
+     zone_poly = np.array([start_left, start_right, end_right, end_left], dtype=np.int32)
+     cv2.fillPoly(overlay, [zone_poly], Color.RING_ZONE)
+     cv2.addWeighted(overlay, 0.2, image, 0.8, 0, image)
+
+     # Draw zone boundaries
+     params = get_scaled_font_params(image.shape[0])
+     cv2.line(
+         image,
+         tuple(start_left.astype(np.int32)),
+         tuple(start_right.astype(np.int32)),
+         Color.RING_ZONE,
+         params["line_thickness"],
+     )
+     cv2.line(
+         image,
+         tuple(end_left.astype(np.int32)),
+         tuple(end_right.astype(np.int32)),
+         Color.RING_ZONE,
+         params["line_thickness"],
+     )
+
+     # Add zone label
+     label_offset = int(40 * params["font_scale"] / FONT_BASE_SCALE)
+     label_pos = zone_data["center_point"].astype(np.int32) + np.array([band_width + label_offset, 0], dtype=np.int32)
+     cv2.putText(
+         image,
+         "Ring Zone",
+         tuple(label_pos),
+         FONT_FACE,
+         params["font_scale"] * 1.2,
+         Color.RING_ZONE,
+         params["text_thickness"],
+     )
+
+     return image
+
+
+ def draw_cross_sections(
+     image: np.ndarray,
+     width_data: Dict[str, Any],
+ ) -> np.ndarray:
+     """Draw cross-section sample lines and intersection points."""
+     params = get_scaled_font_params(image.shape[0])
+     sample_points = width_data.get("sample_points", [])
+
+     for left, right in sample_points:
+         left_int = tuple(np.array(left, dtype=np.int32))
+         right_int = tuple(np.array(right, dtype=np.int32))
+
+         # Draw cross-section line
+         cv2.line(image, left_int, right_int, Color.CROSS_SECTION,
+                  max(2, params["line_thickness"] // 2))
+
+         # Draw intersection points
+         cv2.circle(image, left_int, params["intersection_radius"], Color.POINT, -1)
+         cv2.circle(image, right_int, params["intersection_radius"], Color.POINT, -1)
+
+     return image
+
+
+ def add_measurement_text(
+     image: np.ndarray,
+     measurement_cm: float,
+     confidence: float,
+     scale_px_per_cm: Optional[float] = None,
+     card_detected: bool = True,
+     finger_detected: bool = True,
+     view_angle_ok: bool = True,
+ ) -> np.ndarray:
+     """Add measurement result text overlay with JSON information."""
+     h, w = image.shape[:2]
+
+     # Create a large semi-transparent background for the text block
+     overlay = image.copy()
+     cv2.rectangle(overlay, (10, 10), (1100, 550), (0, 0, 0), -1)
+     cv2.addWeighted(overlay, 0.7, image, 0.3, 0, image)
+
+     # Confidence level indicator
+     if confidence > 0.85:
+         level = "HIGH"
+         level_color = Color.TEXT_SUCCESS
+     elif confidence >= 0.6:
+         level = "MEDIUM"
+         level_color = (0, 255, 255)  # Yellow
+     else:
+         level = "LOW"
+         level_color = Color.TEXT_ERROR
+
+     # Build text lines with JSON information
+     text_lines = [
+         ("=== MEASUREMENT RESULT ===", Color.TEXT_PRIMARY, False),
+         (f"Finger Diameter: {measurement_cm:.2f} cm", Color.TEXT_PRIMARY, False),
+         (f"Confidence: {confidence:.3f} ({level})", level_color, True),
+         ("", Color.TEXT_PRIMARY, False),  # Empty line
+         ("=== QUALITY FLAGS ===", Color.TEXT_PRIMARY, False),
+         (f"Card Detected: {'YES' if card_detected else 'NO'}", Color.TEXT_SUCCESS if card_detected else Color.TEXT_ERROR, False),
+         (f"Finger Detected: {'YES' if finger_detected else 'NO'}", Color.TEXT_SUCCESS if finger_detected else Color.TEXT_ERROR, False),
+         (f"View Angle OK: {'YES' if view_angle_ok else 'NO'}", Color.TEXT_SUCCESS if view_angle_ok else Color.TEXT_ERROR, False),
+     ]
+
+     # Add scale information if available
+     if scale_px_per_cm is not None:
+         text_lines.insert(3, (f"Scale: {scale_px_per_cm:.2f} px/cm", Color.TEXT_PRIMARY, False))
+
+     # Get scaled font parameters
+     params = get_scaled_font_params(image.shape[0])
+
+     for i, (text, color, is_bold) in enumerate(text_lines):
+         if text:  # Skip empty lines when drawing
+             thickness = params["text_thickness"] + 1 if is_bold else params["text_thickness"]
+             cv2.putText(
+                 image,
+                 text,
+                 (params["x_offset"], params["y_start"] + i * params["line_height"]),
+                 FONT_FACE,
+                 params["font_scale"],
+                 color,
+                 thickness,
+             )
+
+     return image
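Both `draw_ring_zone` and `add_measurement_text` rely on `cv2.addWeighted` for the semi-transparent overlays. The blend it performs is a simple per-pixel weighted sum, which a numpy-only sketch can reproduce (this mimics the call's arithmetic rather than using OpenCV itself):

```python
import numpy as np

# cv2.addWeighted(overlay, 0.2, image, 0.8, 0) computes, per channel:
#   blended = overlay * 0.2 + image * 0.8  (saturated back to uint8)
image = np.full((2, 2, 3), 100, dtype=np.uint8)
overlay = np.full((2, 2, 3), 200, dtype=np.uint8)

blended = overlay.astype(np.float64) * 0.2 + image.astype(np.float64) * 0.8
blended = np.clip(np.rint(blended), 0, 255).astype(np.uint8)
# 200 * 0.2 + 100 * 0.8 = 120 for every pixel
```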
src/viz_constants.py ADDED
@@ -0,0 +1,306 @@
+ """
+ Shared visualization constants for debug output across all algorithms.
+
+ This module provides centralized configuration for fonts, colors, sizes, and
+ layout used in debug visualizations throughout the Ring Sizer system.
+
+ Used by:
+ - card_detection.py - Multi-strategy card detection debug output
+ - finger_segmentation.py - Hand/finger detection debug output
+ - geometry.py - Axis, zone, measurement debug output
+ - visualization.py - Final composite debug overlay
+ - confidence.py - Confidence visualization
+
+ Example usage:
+     from viz_constants import Color, FontScale, FontThickness, FONT_FACE
+
+     cv2.putText(img, "Title", (20, 100), FONT_FACE,
+                 FontScale.TITLE, Color.WHITE,
+                 FontThickness.TITLE_OUTLINE, cv2.LINE_AA)
+ """
+
+ import cv2
+ from typing import Tuple
+
+ # ============================================================================
+ # FONT SETTINGS
+ # ============================================================================
+
+ # Font face used across all visualizations
+ FONT_FACE = cv2.FONT_HERSHEY_SIMPLEX
+
+
+ class FontScale:
+     """
+     Font scale constants for text hierarchy levels.
+
+     Larger values = bigger text. These are base scales that may be
+     adjusted based on image size in some visualizations.
+     """
+     TITLE = 3.5  # Main titles (e.g., "Card Detection", "Final Result")
+     SUBTITLE = 2.5  # Section headers (e.g., "Score: 0.85")
+     LABEL = 1.8  # Inline labels (e.g., "#1 Score:0.83")
+     BODY = 1.5  # Body text (normal annotations)
+     SMALL = 1.0  # Small text (fine details)
+
+
+ class FontThickness:
+     """
+     Font thickness (stroke width) for text rendering.
+
+     Larger values = thicker/bolder text.
+     Use OUTLINE variants for the background layer to create an outlined text effect.
+     """
+     # Main text thickness
+     TITLE = 7
+     SUBTITLE = 5
+     LABEL = 4
+     BODY = 2
+
+     # Outline/shadow thickness (draw first for the outline effect)
+     TITLE_OUTLINE = 10
+     SUBTITLE_OUTLINE = 8
+     LABEL_OUTLINE = 6
+     BODY_OUTLINE = 4
+
+
+ # ============================================================================
+ # COLORS (BGR format for OpenCV)
+ # ============================================================================
+
+ class Color:
+     """
+     Standard colors used across all visualizations.
+
+     All colors are in BGR format (Blue, Green, Red) as required by OpenCV.
+     Example: (255, 255, 255) = White in BGR
+
+     Usage:
+         cv2.circle(img, center, radius, Color.GREEN, -1)
+     """
+     # ========================================================================
+     # Basic Colors
+     # ========================================================================
+     WHITE = (255, 255, 255)
+     BLACK = (0, 0, 0)
+     RED = (0, 0, 255)
+     GREEN = (0, 255, 0)
+     BLUE = (255, 0, 0)
+
+     # ========================================================================
+     # Extended Palette
+     # ========================================================================
+     CYAN = (255, 255, 0)
+     YELLOW = (0, 255, 255)
+     MAGENTA = (255, 0, 255)
+     ORANGE = (0, 128, 255)
+     PINK = (128, 128, 255)
+
+     # ========================================================================
+     # Semantic Colors (what they represent in the system)
+     # ========================================================================
+
+     # Object colors
+     CARD = GREEN  # Credit card outline
+     FINGER = MAGENTA  # Finger contour
+
+     # Axis/geometry colors
+     AXIS_PALM = CYAN  # Palm-side axis endpoint
+     AXIS_TIP = ORANGE  # Fingertip axis endpoint
+     AXIS_LINE = YELLOW  # Finger principal axis line
+
+     # Measurement colors
+     RING_ZONE = CYAN  # Ring-wearing zone overlay
+     CROSS_SECTION = ORANGE  # Cross-section lines
+     POINT = BLUE  # Intersection/measurement points
+
+     # Text colors
+     TEXT_PRIMARY = WHITE  # Primary text (titles, main info)
+     TEXT_SUCCESS = GREEN  # Success messages
+     TEXT_ERROR = RED  # Error messages
+     TEXT_WARNING = YELLOW  # Warning messages
+
+
+ class StrategyColor:
+     """
+     Colors for different card detection strategies.
+
+     Used to visually distinguish candidates from different detection methods
+     in debug visualizations.
+     """
+     CANNY = Color.CYAN  # Canny edge detection (cyan)
+     ADAPTIVE = Color.ORANGE  # Adaptive thresholding (orange)
+     OTSU = Color.MAGENTA  # Otsu's thresholding (magenta)
+     COLOR_BASED = Color.GREEN  # Color-based detection (green)
+     ALL_CANDIDATES = Color.PINK  # Combined candidates (pink/purple)
+
+
+ # ============================================================================
+ # DRAWING SIZES
+ # ============================================================================
+
+ class Size:
+     """
+     Size constants for drawing geometric elements (circles, lines, etc.).
+
+     All sizes are in pixels.
+     """
+     # Circle radii
+     CORNER_RADIUS = 8  # Card corners, small points
+     ENDPOINT_RADIUS = 15  # Axis endpoints (palm/tip)
+     INTERSECTION_RADIUS = 8  # Cross-section intersection points
+     POINT_RADIUS = 5  # Generic points
+
+     # Line thicknesses
+     CONTOUR_THICK = 5  # Thick contours (finger, card)
+     CONTOUR_NORMAL = 3  # Normal contours (candidates)
+     LINE_THICK = 4  # Thick lines (axis)
+     LINE_NORMAL = 2  # Normal lines (cross-sections)
+     LINE_THIN = 1  # Thin lines (grid, reference)
+
+
+ # ============================================================================
+ # LAYOUT CONSTANTS
+ # ============================================================================
+
+ class Layout:
+     """
+     Layout positioning constants for text and elements.
+
+     All positions are in pixels from the top-left corner.
+     """
+     # Title positioning (top-left text block)
+     TITLE_Y = 100  # Y position for main title
+     SUBTITLE_Y = 200  # Y position for subtitle/secondary text
+     LINE_SPACING = 100  # Vertical spacing between text lines
+
+     # Text offsets
+     TEXT_OFFSET_X = 20  # Horizontal margin from left edge
+     TEXT_OFFSET_Y = 25  # Vertical offset for inline text
+     LABEL_OFFSET = 20  # Offset for labels near objects
+
+     # Result text area (final visualization)
+     RESULT_TEXT_Y_START = 60  # Starting Y for result text block
+     RESULT_TEXT_LINE_HEIGHT = 55  # Height between result text lines
+     RESULT_TEXT_X_OFFSET = 40  # X offset for result text
+
+
+ # ============================================================================
+ # HELPER FUNCTIONS
+ # ============================================================================
+
+ def get_scaled_font_size(base_scale: float, image_height: int,
+                          reference_height: int = 1200,
+                          min_scale: float = 1.5) -> float:
+     """
+     Scale font size based on image dimensions for consistent appearance.
+
+     Args:
+         base_scale: Base font scale (e.g., FontScale.TITLE)
+         image_height: Height of the image in pixels
+         reference_height: Reference height for scaling (default: 1200px)
+         min_scale: Minimum scale to prevent text from being too small
+
+     Returns:
+         Scaled font size adjusted for image dimensions
+
+     Example:
+         # For a 2400px tall image, double the font size
+         scale = get_scaled_font_size(FontScale.TITLE, 2400)
+         # scale = 3.5 * 2 = 7.0
+     """
+     scale_factor = image_height / reference_height
+     scaled = base_scale * scale_factor
+     return max(scaled, min_scale)
+
216
+
217
+ def create_outlined_text(image, text, position, font_scale,
218
+ color, outline_color=None,
219
+ thickness=None, outline_thickness=None):
220
+ """
221
+ Draw text with outline for better visibility.
222
+
223
+ Args:
224
+ image: Image to draw on
225
+ text: Text string to draw
226
+ position: (x, y) position tuple
227
+ font_scale: Font scale (from FontScale)
228
+ color: Main text color (from Color)
229
+ outline_color: Outline color (default: Color.WHITE)
230
+ thickness: Main text thickness (auto-selected if None)
231
+ outline_thickness: Outline thickness (auto-selected if None)
232
+
233
+ Example:
234
+ create_outlined_text(img, "Title", (20, 100),
235
+ FontScale.TITLE, Color.GREEN)
236
+ """
237
+ if outline_color is None:
238
+ outline_color = Color.WHITE
239
+
240
+ # Auto-select thickness based on font scale
241
+ if thickness is None:
242
+ if font_scale >= FontScale.TITLE:
243
+ thickness = FontThickness.TITLE
244
+ elif font_scale >= FontScale.SUBTITLE:
245
+ thickness = FontThickness.SUBTITLE
246
+ elif font_scale >= FontScale.LABEL:
247
+ thickness = FontThickness.LABEL
248
+ else:
249
+ thickness = FontThickness.BODY
250
+
251
+ if outline_thickness is None:
252
+ outline_thickness = thickness + 3
253
+
254
+ # Draw outline first (background layer)
255
+ cv2.putText(image, text, position, FONT_FACE,
256
+ font_scale, outline_color, outline_thickness, cv2.LINE_AA)
257
+
258
+ # Draw main text on top
259
+ cv2.putText(image, text, position, FONT_FACE,
260
+ font_scale, color, thickness, cv2.LINE_AA)
261
+
262
+
263
+ # ============================================================================
264
+ # VALIDATION (Optional: for type checking and debugging)
265
+ # ============================================================================
266
+
267
+ def validate_color(color: Tuple[int, int, int]) -> bool:
268
+ """
269
+ Validate that a color tuple is in correct BGR format.
270
+
271
+ Args:
272
+ color: Tuple of (B, G, R) values
273
+
274
+ Returns:
275
+ True if valid, False otherwise
276
+ """
277
+ if not isinstance(color, tuple) or len(color) != 3:
278
+ return False
279
+ return all(0 <= val <= 255 for val in color)
280
+
281
+
282
+ # ============================================================================
283
+ # EXPORTS
284
+ # ============================================================================
285
+
286
+ __all__ = [
287
+ # Font settings
288
+ 'FONT_FACE',
289
+ 'FontScale',
290
+ 'FontThickness',
291
+
292
+ # Colors
293
+ 'Color',
294
+ 'StrategyColor',
295
+
296
+ # Sizes
297
+ 'Size',
298
+
299
+ # Layout
300
+ 'Layout',
301
+
302
+ # Helper functions
303
+ 'get_scaled_font_size',
304
+ 'create_outlined_text',
305
+ 'validate_color',
306
+ ]
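The scaling rule in `get_scaled_font_size` is linear in image height with a floor. A standalone sanity check of that rule (assuming `FontScale.TITLE` is 3.5, as the docstring example implies):

```python
# Re-derivation of get_scaled_font_size for verification; the real
# function lives in the constants module above.
def get_scaled_font_size(base_scale, image_height,
                         reference_height=1200, min_scale=1.5):
    # Linear scaling relative to the reference height, clamped at min_scale
    return max(base_scale * image_height / reference_height, min_scale)

TITLE = 3.5  # assumed value of FontScale.TITLE

assert get_scaled_font_size(TITLE, 2400) == 7.0  # twice as tall -> twice the font
assert get_scaled_font_size(TITLE, 1200) == 3.5  # reference height -> unchanged
assert get_scaled_font_size(TITLE, 300) == 1.5   # small image -> clamped to floor
```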
web_demo/README.md ADDED
@@ -0,0 +1,28 @@
+ # Web Demo
+
+ Local Flask demo for ring-size-cv. Upload an image, run the measurement, and get back JSON plus a debug overlay.
+
+ ## Setup
+
+ ```bash
+ cd ring-size-cv  # repository root
+ python -m venv .venv
+ source .venv/bin/activate
+ pip install -r requirements.txt
+ ```
+
+ ## Run
+
+ ```bash
+ python web_demo/app.py
+ ```
+
+ Open `http://localhost:8000`.
+
+ ## Notes
+ - Uploads are stored in `web_demo/uploads/`
+ - Results are stored in `web_demo/results/`
+ - A debug overlay is auto-generated per request
+ - The default guided sample image is at `web_demo/static/examples/default_sample.jpg`
+ - `Start Measurement` uses the default sample image when no upload is selected
+ - The web demo enforces Sobel edge refinement only (`edge_method=sobel`)
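Beyond the browser UI, the `/api/measure` endpoint can be exercised directly. The endpoint URL and the `finger_index`/`image` field names come from `web_demo/app.py`; the hand-rolled multipart encoder below is a stdlib-only illustration, not part of the repo:

```python
import urllib.request
import uuid

def encode_multipart(fields, files):
    """Minimal multipart/form-data encoder (illustrative sketch)."""
    boundary = uuid.uuid4().hex
    body = b""
    for name, value in fields.items():
        body += (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        ).encode()
    for name, (filename, data) in files.items():
        body += (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f"Content-Type: application/octet-stream\r\n\r\n"
        ).encode() + data + b"\r\n"
    body += f"--{boundary}--\r\n".encode()
    return f"multipart/form-data; boundary={boundary}", body

# Build (but do not send) a request against the local demo server.
image_bytes = b"\xff\xd8\xff"  # placeholder; read your real JPEG bytes here
content_type, payload = encode_multipart(
    {"finger_index": "index"}, {"image": ("hand.jpg", image_bytes)}
)
req = urllib.request.Request(
    "http://localhost:8000/api/measure",
    data=payload,
    headers={"Content-Type": content_type},
)
# urllib.request.urlopen(req) would POST it; requires the server running.
```

The response JSON mirrors what the browser UI renders: a `success` flag, the raw `result`, and URLs for the overlay image and result JSON.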
web_demo/app.py ADDED
@@ -0,0 +1,142 @@
+ #!/usr/bin/env python3
+ """Simple web demo for ring-size-cv.
+
+ Upload an image, run measurement, and return JSON + debug overlay.
+ """
+
+ from __future__ import annotations
+
+ import json
+ import sys
+ import uuid
+ from pathlib import Path
+ from typing import Any, Dict
+
+ import cv2
+ from flask import Flask, jsonify, render_template, request, send_from_directory
+ from werkzeug.utils import secure_filename
+
+ ROOT_DIR = Path(__file__).resolve().parents[1]
+ sys.path.insert(0, str(ROOT_DIR))
+
+ from measure_finger import measure_finger
+
+ APP_ROOT = Path(__file__).resolve().parent
+ UPLOAD_DIR = APP_ROOT / "uploads"
+ RESULTS_DIR = APP_ROOT / "results"
+ DEFAULT_SAMPLE_PATH = APP_ROOT / "static" / "examples" / "default_sample.jpg"
+ DEFAULT_SAMPLE_URL = "/static/examples/default_sample.jpg"
+ ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}
+ DEMO_EDGE_METHOD = "sobel"
+
+ app = Flask(__name__)
+
+
+ def _allowed_file(filename: str) -> bool:
+     return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS
+
+
+ def _save_json(path: Path, data: Dict[str, Any]) -> None:
+     path.parent.mkdir(parents=True, exist_ok=True)
+     with path.open("w", encoding="utf-8") as f:
+         json.dump(data, f, indent=2, ensure_ascii=False)
+
+
+ @app.route("/")
+ def index():
+     return render_template("index.html", default_sample_url=DEFAULT_SAMPLE_URL)
+
+
+ @app.route("/results/<path:filename>")
+ def serve_result(filename: str):
+     return send_from_directory(RESULTS_DIR, filename)
+
+
+ @app.route("/uploads/<path:filename>")
+ def serve_upload(filename: str):
+     return send_from_directory(UPLOAD_DIR, filename)
+
+
+ @app.route("/api/measure", methods=["POST"])
+ def api_measure():
+     if "image" not in request.files:
+         return jsonify({"success": False, "error": "Missing image file"}), 400
+
+     file = request.files["image"]
+     if file.filename == "":
+         return jsonify({"success": False, "error": "Empty filename"}), 400
+
+     if not _allowed_file(file.filename):
+         return jsonify({"success": False, "error": "Unsupported file type"}), 400
+
+     finger_index = request.form.get("finger_index", "index")
+     run_id = uuid.uuid4().hex[:12]
+     safe_name = secure_filename(file.filename)
+     upload_name = f"{run_id}__{safe_name}"
+     upload_path = UPLOAD_DIR / upload_name
+     upload_path.parent.mkdir(parents=True, exist_ok=True)
+     file.save(upload_path)
+
+     image = cv2.imread(str(upload_path))
+     if image is None:
+         return jsonify({"success": False, "error": "Failed to load image"}), 400
+
+     return _run_measurement(
+         image=image,
+         finger_index=finger_index,
+         input_image_url=f"/uploads/{upload_name}",
+     )
+
+
+ @app.route("/api/measure-default", methods=["POST"])
+ def api_measure_default():
+     finger_index = request.form.get("finger_index", "index")
+     if not DEFAULT_SAMPLE_PATH.exists():
+         return jsonify({"success": False, "error": "Default sample image not found"}), 500
+
+     image = cv2.imread(str(DEFAULT_SAMPLE_PATH))
+     if image is None:
+         return jsonify({"success": False, "error": "Failed to load default sample image"}), 500
+
+     return _run_measurement(
+         image=image,
+         finger_index=finger_index,
+         input_image_url=DEFAULT_SAMPLE_URL,
+     )
+
+
+ def _run_measurement(
+     image,
+     finger_index: str,
+     input_image_url: str,
+ ):
+     run_id = uuid.uuid4().hex[:12]
+
+     result_png_name = f"{run_id}__result.png"
+     result_png_path = RESULTS_DIR / result_png_name
+
+     result = measure_finger(
+         image=image,
+         finger_index=finger_index,
+         edge_method=DEMO_EDGE_METHOD,
+         result_png_path=str(result_png_path),
+         save_debug=False,
+     )
+
+     result_json_name = f"{run_id}__result.json"
+     result_json_path = RESULTS_DIR / result_json_name
+     _save_json(result_json_path, result)
+
+     payload = {
+         "success": result.get("fail_reason") is None,
+         "result": result,
+         "result_image_url": f"/results/{result_png_name}",
+         "input_image_url": input_image_url,
+         "result_json_url": f"/results/{result_json_name}",
+     }
+
+     return jsonify(payload)
+
+
+ if __name__ == "__main__":
+     app.run(host="0.0.0.0", port=8000, debug=True)
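Note that `_run_measurement` derives the top-level `success` flag purely from the result's `fail_reason`. A minimal sketch of that payload contract, using a hypothetical stub result dict in place of `measure_finger` output:

```python
def build_payload(result, run_id, input_image_url):
    """Mirror of _run_measurement's response payload (stub result, no Flask)."""
    return {
        # success is true exactly when the pipeline reported no fail_reason
        "success": result.get("fail_reason") is None,
        "result": result,
        "result_image_url": f"/results/{run_id}__result.png",
        "input_image_url": input_image_url,
        "result_json_url": f"/results/{run_id}__result.json",
    }

ok = build_payload({"fail_reason": None, "diameter_mm": 17.1},
                   "abc123", "/uploads/x.jpg")
bad = build_payload({"fail_reason": "card_not_detected"},
                    "abc123", "/uploads/x.jpg")

assert ok["success"] is True
assert bad["success"] is False
assert ok["result_image_url"] == "/results/abc123__result.png"
```

The same convention lets the frontend treat any non-null `fail_reason` as a user-facing error while still rendering the partial result and overlay.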
web_demo/static/app.js ADDED
@@ -0,0 +1,145 @@
+ const form = document.getElementById("measureForm");
+ const imageInput = document.getElementById("imageInput");
+ const statusText = document.getElementById("statusText");
+ const inputPreview = document.getElementById("inputPreview");
+ const debugPreview = document.getElementById("debugPreview");
+ const inputFrame = document.getElementById("inputFrame");
+ const debugFrame = document.getElementById("debugFrame");
+ const jsonOutput = document.getElementById("jsonOutput");
+ const jsonLink = document.getElementById("jsonLink");
+ const defaultSampleUrl = window.DEFAULT_SAMPLE_URL || "";
+ const failReasonMessageMap = {
+   card_not_detected:
+     "Credit card not detected. Place a full card flat beside your hand.",
+   hand_not_detected:
+     "Hand not detected. Include your full palm in frame and keep fingers fully visible.",
+   finger_isolation_failed:
+     "Could not isolate the selected finger. Keep one target finger extended and separated.",
+   finger_mask_too_small:
+     "Finger region is too small. Move closer and use a higher-resolution photo.",
+   contour_extraction_failed:
+     "Finger contour extraction failed. Improve lighting and reduce background clutter.",
+   axis_estimation_failed:
+     "Finger axis estimation failed. Keep the finger straight and fully visible.",
+   zone_localization_failed:
+     "Ring zone localization failed. Keep more of the finger base visible.",
+   width_measurement_failed:
+     "Width measurement failed. Retake with phone parallel to the table and steady focus.",
+   sobel_edge_refinement_failed:
+     "Edge refinement failed. Turn on flash or use stronger, even lighting.",
+   width_unreasonable:
+     "Measured width is out of range. Retake with the phone parallel to the table.",
+   disagreement_with_contour:
+     "Edge methods disagree too much. Retake with cleaner edges and more even lighting.",
+ };
+
+ const formatFailReasonStatus = (failReason) => {
+   if (!failReason) {
+     return "Measurement failed.";
+   }
+
+   if (failReason.startsWith("quality_score_low_")) {
+     return `Low edge quality detected. Turn on flash and retake. (${failReason})`;
+   }
+
+   if (failReason.startsWith("consistency_low_")) {
+     return `Edge detection was inconsistent. Keep phone parallel to table and retry. (${failReason})`;
+   }
+
+   const friendlyMessage = failReasonMessageMap[failReason];
+   if (friendlyMessage) {
+     return `${friendlyMessage} (${failReason})`;
+   }
+
+   return `Measurement failed: ${failReason}`;
+ };
+
+ const setStatus = (text) => {
+   statusText.textContent = text;
+ };
+
+ const showImage = (imgEl, frameEl, url) => {
+   if (!url) return;
+   imgEl.src = url;
+   frameEl.classList.add("show");
+   frameEl.querySelector(".placeholder").style.display = "none";
+ };
+
+ const buildMeasureSettings = () => {
+   const fingerSelect = form.querySelector('select[name="finger_index"]');
+   return {
+     finger_index: fingerSelect ? fingerSelect.value : "index",
+     edge_method: "sobel",
+   };
+ };
+
+ const runMeasurement = async (endpoint, formData, inputUrlFallback = "") => {
+   setStatus("Measuring… Please wait.");
+   jsonOutput.textContent = "{\n  \"status\": \"processing\"\n}";
+
+   try {
+     const response = await fetch(endpoint, {
+       method: "POST",
+       body: formData,
+     });
+
+     if (!response.ok) {
+       const error = await response.json();
+       setStatus(error.error || "Measurement failed");
+       return;
+     }
+
+     const data = await response.json();
+     jsonOutput.textContent = JSON.stringify(data.result, null, 2);
+     jsonLink.href = data.result_json_url || "#";
+
+     showImage(inputPreview, inputFrame, data.input_image_url || inputUrlFallback);
+     showImage(debugPreview, debugFrame, data.result_image_url);
+
+     if (data.success) {
+       setStatus("Measurement complete. Results updated.");
+     } else {
+       const failReason = data?.result?.fail_reason;
+       setStatus(formatFailReasonStatus(failReason));
+     }
+   } catch (error) {
+     setStatus("Network error. Please retry.");
+   }
+ };
+
+ imageInput.addEventListener("change", () => {
+   const file = imageInput.files[0];
+   if (!file) {
+     setStatus("Sample image loaded. Upload your own photo or click Start Measurement.");
+     if (defaultSampleUrl) {
+       showImage(inputPreview, inputFrame, defaultSampleUrl);
+     }
+     return;
+   }
+   const url = URL.createObjectURL(file);
+   showImage(inputPreview, inputFrame, url);
+   setStatus("Image ready. Click to start measurement.");
+ });
+
+ form.addEventListener("submit", async (event) => {
+   event.preventDefault();
+
+   const settings = buildMeasureSettings();
+   const formData = new FormData();
+   formData.append("finger_index", settings.finger_index);
+   formData.append("edge_method", settings.edge_method);
+
+   const file = imageInput.files[0];
+   if (file) {
+     formData.append("image", file);
+     await runMeasurement("/api/measure", formData);
+     return;
+   }
+
+   await runMeasurement("/api/measure-default", formData, defaultSampleUrl);
+ });
+
+ if (defaultSampleUrl) {
+   showImage(inputPreview, inputFrame, defaultSampleUrl);
+   setStatus("Sample image loaded. Upload your own photo or click Start Measurement.");
+ }
web_demo/static/examples/default_sample.jpg ADDED

Git LFS Details

  • SHA256: 1262e998f9e465492be2cb595ad04a0450c7bea5e37a33eeb28ff7a056c50261
  • Pointer size: 132 Bytes
  • Size of remote file: 1.62 MB
web_demo/static/styles.css ADDED
@@ -0,0 +1,288 @@
+ :root {
+   --bg-1: #f5f1e7;
+   --bg-2: #eedad5;
+   --bg-3: #e7efe8;
+   --ink: #2b1f1f;
+   --ink-soft: #4b3d3d;
+   --accent: #bf3a2b;
+   --accent-dark: #8f2b22;
+   --sand: #f9f4ec;
+   --shadow: rgba(34, 26, 26, 0.12);
+   --border: rgba(45, 33, 33, 0.18);
+ }
+
+ * {
+   box-sizing: border-box;
+ }
+
+ body {
+   margin: 0;
+   min-height: 100vh;
+   color: var(--ink);
+   background: radial-gradient(circle at 10% 20%, var(--bg-3), transparent 55%),
+     radial-gradient(circle at 80% 10%, var(--bg-2), transparent 50%),
+     linear-gradient(140deg, var(--bg-1), #fff8f2 60%, #f0e2d8 100%);
+   font-family: "Iowan Old Style", "Palatino", "Book Antiqua", "Times New Roman", serif;
+ }
+
+ .background-orbit {
+   position: fixed;
+   inset: -30% 10% auto auto;
+   width: 60vw;
+   height: 60vw;
+   background: conic-gradient(from 120deg, rgba(191, 58, 43, 0.2), transparent, rgba(91, 44, 120, 0.18));
+   border-radius: 50%;
+   filter: blur(10px);
+   opacity: 0.6;
+   z-index: 0;
+   animation: slow-spin 40s linear infinite;
+ }
+
+ .background-glow {
+   position: fixed;
+   inset: auto auto -15% -10%;
+   width: 55vw;
+   height: 55vw;
+   background: radial-gradient(circle, rgba(191, 58, 43, 0.18), transparent 70%);
+   border-radius: 50%;
+   filter: blur(20px);
+   z-index: 0;
+ }
+
+ @keyframes slow-spin {
+   from { transform: rotate(0deg); }
+   to { transform: rotate(360deg); }
+ }
+
+ .hero {
+   position: relative;
+   z-index: 1;
+   display: grid;
+   grid-template-columns: minmax(280px, 1.2fr) minmax(280px, 0.9fr);
+   gap: 32px;
+   padding: 72px 8vw 48px;
+   align-items: center;
+ }
+
+ .hero-copy h1 {
+   font-family: "Futura", "Gill Sans", "Optima", "Trebuchet MS", sans-serif;
+   font-size: clamp(2.2rem, 4vw, 3.4rem);
+   margin: 0 0 12px;
+   letter-spacing: 0.02em;
+ }
+
+ .hero-kicker {
+   text-transform: uppercase;
+   letter-spacing: 0.18em;
+   font-size: 0.75rem;
+   font-weight: 600;
+   color: var(--accent-dark);
+   margin: 0 0 12px;
+ }
+
+ .hero-sub {
+   font-size: 1.05rem;
+   line-height: 1.7;
+   color: var(--ink-soft);
+   max-width: 36ch;
+ }
+
+ .hero-card {
+   background: rgba(255, 255, 255, 0.75);
+   border: 1px solid var(--border);
+   border-radius: 24px;
+   padding: 28px;
+   box-shadow: 0 24px 60px var(--shadow);
+   backdrop-filter: blur(8px);
+   animation: rise-in 0.8s ease;
+ }
+
+ @keyframes rise-in {
+   from { transform: translateY(16px); opacity: 0; }
+   to { transform: translateY(0); opacity: 1; }
+ }
+
+ .file-drop {
+   display: flex;
+   flex-direction: column;
+   gap: 8px;
+   padding: 24px;
+   border: 1.5px dashed var(--accent);
+   border-radius: 18px;
+   background: var(--sand);
+   cursor: pointer;
+   transition: transform 0.2s ease, box-shadow 0.2s ease;
+ }
+
+ .file-drop:hover {
+   transform: translateY(-2px);
+   box-shadow: 0 10px 20px rgba(191, 58, 43, 0.15);
+ }
+
+ .file-drop input {
+   display: none;
+ }
+
+ .file-title {
+   font-size: 1.1rem;
+   font-weight: 600;
+ }
+
+ .file-hint {
+   font-size: 0.9rem;
+   color: var(--ink-soft);
+ }
+
+ .controls {
+   display: grid;
+   grid-template-columns: repeat(auto-fit, minmax(160px, 1fr));
+   gap: 16px;
+   margin: 20px 0;
+ }
+
+ .controls label {
+   display: flex;
+   flex-direction: column;
+   gap: 6px;
+   font-size: 0.9rem;
+   color: var(--ink-soft);
+ }
+
+ select {
+   border: 1px solid var(--border);
+   border-radius: 12px;
+   padding: 10px 12px;
+   font-size: 0.95rem;
+   background: white;
+   color: var(--ink);
+ }
+
+ .primary {
+   width: 100%;
+   border: none;
+   border-radius: 14px;
+   padding: 12px 16px;
+   font-size: 1rem;
+   font-weight: 600;
+   color: white;
+   background: linear-gradient(120deg, var(--accent), #e25f4f);
+   cursor: pointer;
+   transition: transform 0.2s ease, box-shadow 0.2s ease;
+ }
+
+ .primary:hover {
+   transform: translateY(-1px);
+   box-shadow: 0 12px 24px rgba(191, 58, 43, 0.25);
+ }
+
+ .status {
+   margin-top: 12px;
+   font-size: 0.9rem;
+   color: var(--ink-soft);
+ }
+
+ .content {
+   position: relative;
+   z-index: 1;
+   padding: 0 8vw 80px;
+   display: flex;
+   flex-direction: column;
+   gap: 28px;
+ }
+
+ .preview,
+ .result {
+   display: grid;
+   grid-template-columns: repeat(auto-fit, minmax(280px, 1fr));
+   gap: 24px;
+ }
+
+ .panel {
+   background: rgba(255, 255, 255, 0.78);
+   border-radius: 20px;
+   border: 1px solid var(--border);
+   padding: 20px;
+   box-shadow: 0 18px 40px rgba(43, 31, 31, 0.08);
+   backdrop-filter: blur(6px);
+ }
+
+ .panel h2 {
+   margin: 0 0 12px;
+   font-family: "Futura", "Gill Sans", "Optima", "Trebuchet MS", sans-serif;
+ }
+
+ .panel-head {
+   display: flex;
+   justify-content: space-between;
+   align-items: center;
+   gap: 12px;
+ }
+
+ .ghost {
+   text-decoration: none;
+   font-size: 0.85rem;
+   color: var(--accent-dark);
+   border: 1px solid var(--border);
+   padding: 6px 10px;
+   border-radius: 999px;
+   background: white;
+ }
+
+ .image-frame {
+   position: relative;
+   border-radius: 16px;
+   overflow: hidden;
+   background: #f6efea;
+   min-height: 260px;
+   display: grid;
+   place-items: center;
+ }
+
+ .image-frame img {
+   width: 100%;
+   height: auto;
+   display: none;
+ }
+
+ .image-frame.show img {
+   display: block;
+ }
+
+ .placeholder {
+   color: var(--ink-soft);
+   font-size: 0.95rem;
+ }
+
+ pre {
+   background: #1f1717;
+   color: #f7ece8;
+   padding: 16px;
+   border-radius: 16px;
+   min-height: 240px;
+   overflow: auto;
+   font-size: 0.85rem;
+   line-height: 1.6;
+ }
+
+ .tips ul {
+   margin: 0;
+   padding-left: 0;
+   list-style: none;
+   color: var(--ink-soft);
+   line-height: 1.7;
+ }
+
+ .tips li + li {
+   margin-top: 4px;
+ }
+
+ @media (max-width: 960px) {
+   .hero {
+     grid-template-columns: 1fr;
+     padding: 56px 6vw 36px;
+   }
+
+   .hero-copy h1 {
+     font-size: 2.4rem;
+   }
+ }
web_demo/templates/index.html ADDED
@@ -0,0 +1,96 @@
+ <!doctype html>
+ <html lang="en">
+ <head>
+   <meta charset="utf-8" />
+   <meta name="viewport" content="width=device-width, initial-scale=1" />
+   <title>Ring Size CV Demo</title>
+   <link rel="stylesheet" href="/static/styles.css" />
+ </head>
+ <body>
+   <div class="background-orbit"></div>
+   <div class="background-glow"></div>
+
+   <header class="hero">
+     <div class="hero-copy">
+       <p class="hero-kicker">Ring Size CV · Web Demo</p>
+       <h1>Upload a photo to quickly measure ring size</h1>
+       <p class="hero-sub">
+         Runs locally with no cloud upload. Results include JSON output and a visual overlay.
+       </p>
+     </div>
+     <div class="hero-card">
+       <form id="measureForm">
+         <label class="file-drop" for="imageInput">
+           <input id="imageInput" name="image" type="file" accept="image/*" />
+           <span class="file-title">Click or drag to upload a photo</span>
+           <span class="file-hint">JPG / PNG supported · 1080p or higher recommended</span>
+         </label>
+
+         <div class="controls">
+           <label>
+             <span>Finger Selection</span>
+             <select name="finger_index">
+               <option value="index" selected>Index (Default)</option>
+               <option value="middle">Middle</option>
+               <option value="ring">Ring</option>
+               <option value="pinky">Pinky</option>
+               <option value="auto">Auto</option>
+             </select>
+           </label>
+           <label>
+             <span>Edge Method</span>
+             <select name="edge_method" disabled aria-disabled="true">
+               <option value="sobel" selected>Sobel (Locked)</option>
+             </select>
+           </label>
+         </div>
+
+         <button class="primary" type="submit">Start Measurement</button>
+         <p class="status" id="statusText">Waiting for image…</p>
+       </form>
+     </div>
+   </header>
+
+   <main class="content">
+     <section class="preview">
+       <div class="panel">
+         <h2>Input Photo</h2>
+         <div class="image-frame show" id="inputFrame">
+           <img id="inputPreview" src="{{ default_sample_url }}" alt="Default sample photo example" />
+           <p class="placeholder" style="display:none;">No image yet</p>
+         </div>
+       </div>
+
+       <div class="panel">
+         <h2>Result Overlay</h2>
+         <div class="image-frame" id="debugFrame">
+           <img id="debugPreview" alt="" />
+           <p class="placeholder">Waiting for result</p>
+         </div>
+       </div>
+     </section>
+
+     <section class="result">
+       <div class="panel">
+         <div class="panel-head">
+           <h2>JSON Output</h2>
+           <a id="jsonLink" class="ghost" href="#" target="_blank" rel="noreferrer">Open raw JSON</a>
+         </div>
+         <pre id="jsonOutput">{}</pre>
+       </div>
+
+       <div class="panel tips">
+         <h2>Photo Tips</h2>
+         <ul>
+           <li>✓ Turn on flash</li>
+           <li>✓ Keep phone parallel to table</li>
+           <li>✓ Include full palm in frame</li>
+         </ul>
+       </div>
+     </section>
+   </main>
+
+   <script>window.DEFAULT_SAMPLE_URL = "{{ default_sample_url }}";</script>
+   <script src="/static/app.js"></script>
+ </body>
+ </html>