| # Inkjet CDM — Final Thesis Results |
|
|
| > All results use K=100 MC trials (Algorithm 1, difference scoring), except the two K=50 runs in Section 5. |
| > Dataset: 1,327 total samples → 20% test split = **266 samples** (GOOD=174, BAD=92). |
| > Evaluation seed: 42 (same train/test split for all λ values). |
| > Last verified: 2026-02-26 |
|
|
| --- |
|
|
| ## 1. Overall Comparison — All λ Values |
|
|
| | λ | AUROC ↑ | Accuracy ↑ | FPR@95TPR ↓ | Δ AUROC vs baseline | |
| |---|:---:|:---:|:---:|:---:| |
| | 0.0 (baseline, no sep loss) | 0.8325 | 0.7895 | 0.8161 | — | |
| **0.01 (best AUROC)** | **0.8603** | 0.8158 | 0.6264 | **+0.0278** | |
| | 0.02 | 0.8541 | 0.8008 | 0.6609 | +0.0216 | |
| | **0.05 (best FPR)** | 0.8553 | **0.8233** | **0.5287** | +0.0228 | |
|
|
| **Recommended thesis citation:** report λ=0.01 as the primary result (best AUROC, 0.8603) and mention λ=0.05 as the best operating point (lowest FPR@95TPR, 0.5287, and highest accuracy, 0.8233). |
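
For reference, the two headline metrics in the table can be computed from per-sample OOD scores roughly as follows. This is a minimal sketch, not the thesis evaluation code: it assumes higher scores mean "more likely BAD", treats BAD as the positive class, counts ties as half a win, and flags a sample when its score is at or above the threshold. The function names `auroc` and `fpr_at_tpr` and the toy scores are hypothetical.

```python
import math

def auroc(good_scores, bad_scores):
    """Probability that a random BAD sample outscores a random GOOD one (ties count 0.5)."""
    wins = 0.0
    for b in bad_scores:
        for g in good_scores:
            if b > g:
                wins += 1.0
            elif b == g:
                wins += 0.5
    return wins / (len(bad_scores) * len(good_scores))

def fpr_at_tpr(good_scores, bad_scores, target_tpr=0.95):
    """FPR at the highest threshold that still flags at least target_tpr of BAD samples."""
    ranked = sorted(bad_scores, reverse=True)
    needed = math.ceil(target_tpr * len(ranked))  # BAD samples that must be flagged
    threshold = ranked[needed - 1]                # flag when score >= threshold
    false_pos = sum(1 for g in good_scores if g >= threshold)
    return false_pos / len(good_scores)

# Toy scores for illustration only (not thesis data).
good = [0.1, 0.2, 0.3, 0.4, 0.8]
bad = [0.35, 0.6, 0.7, 0.9]
print(auroc(good, bad))       # 16 of 20 GOOD/BAD pairs are ordered correctly
print(fpr_at_tpr(good, bad))  # 2 of 5 GOOD samples sit at or above the threshold
```

With only a handful of BAD samples, hitting 95% TPR forces the threshold down to the lowest BAD score, which is one reason FPR@95TPR can stay high even when AUROC looks good.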
|
|
| --- |
|
|
| ## 2. Per-Feature AUROC (K=100) |
|
|
| Feature | λ=0 | λ=0.01 | λ=0.02 | λ=0.05 | Best Δ vs λ=0 | |
| |---------|:---:|:---:|:---:|:---:|:---:| |
| | angle | 0.5556 | 0.5679 | 0.6173 | 0.5556 | +0.0617 | |
| dist1 | 0.9000 | 0.8571 | 0.9429 | 0.9143 | +0.0429 | |
| dist6 | 0.8278 | 0.8111 | 0.8389 | 0.8278 | +0.0111 | |
| | dots | 0.9126 | **0.9266** | 0.9126 | 0.8881 | +0.0140 | |
| | edge1 | 0.7760 | 0.8177 | **0.8594** | 0.8542 | **+0.0834** | |
| edge2 | 0.7302 | 0.7242 | 0.6786 | **0.7857** | +0.0555 | |
| | edge3 | 0.7188 | **0.8750** | 0.7708 | 0.8229 | **+0.1562** | |
| | edge4 | 0.6667 | **0.7361** | 0.6806 | 0.6597 | **+0.0694** | |
| | **Overall** | **0.8325** | **0.8603** | **0.8541** | **0.8553** | **+0.0278** | |
|
|
| > Note: `angle` has only 3 BAD samples in the test set (27 GOOD, 3 BAD), so its AUROC rests on just 81 GOOD/BAD pairs and is statistically unreliable. `edge3` shows the largest absolute gain (+0.1562 AUROC, i.e. +15.62 points, at λ=0.01). |
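
To make the reliability caveat for `angle` concrete, the sketch below recovers how many GOOD/BAD pairs are ordered correctly behind each reported AUROC. It assumes no tied scores (so AUROC is an integer count divided by the number of pairs); the helper name `pair_wins` is hypothetical.

```python
def pair_wins(reported_auroc, n_good, n_bad):
    """Recover the number of correctly ordered GOOD/BAD pairs behind a reported AUROC."""
    pairs = n_good * n_bad
    return round(reported_auroc * pairs), pairs

n_good, n_bad = 27, 3  # `angle` test-set counts from the note above
for lam, reported in [(0.0, 0.5556), (0.01, 0.5679), (0.02, 0.6173), (0.05, 0.5556)]:
    wins, pairs = pair_wins(reported, n_good, n_bad)
    print(f"lambda={lam}: {wins}/{pairs} pairs ordered correctly, AUROC={wins / pairs:.4f}")
```

Under that assumption, a single pair moves the `angle` AUROC by about 0.012, and the gap between the λ=0 and λ=0.02 values corresponds to only 5 of 81 pair orderings.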
|
|
| --- |
|
|
| ## 3. Per-Feature FPR@95TPR (K=100) — lower is better |
|
|
| | Feature | λ=0 | λ=0.01 | Δ | |
| |---------|:---:|:---:|:---:| |
| | angle | 0.9630 | 0.9630 | 0.000 | |
| | dist1 | 0.2381 | 0.4286 | +0.190 (regression) | |
| | dist6 | 0.9667 | 0.9333 | **−0.033** | |
| | dots | 0.9615 | 0.8077 | **−0.154** | |
| | edge1 | 0.9167 | 0.9167 | 0.000 | |
| | edge2 | 0.8889 | 0.7778 | **−0.111** | |
| | edge3 | 0.5625 | 0.4375 | **−0.125** | |
| | edge4 | 0.5833 | 0.4583 | **−0.125** | |
| | **Overall** | **0.8161** | **0.6264** | **−0.190** | |
|
|
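| > Overall FPR@95TPR drops by **19 percentage points** at λ=0.01 (0.8161 → 0.6264): at 95% defect-detection sensitivity, the share of good products falsely rejected falls from about 82% to about 63%, a relative reduction of roughly 23%. |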
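
The difference between the absolute and the relative reduction can be checked directly from the overall row of the table. The snippet below is only a worked calculation on the reported values; the variable names are illustrative, and the sample counts are obtained by rounding FPR × 174 GOOD test samples.

```python
fpr_baseline = 0.8161  # overall FPR@95TPR at λ=0 (table above)
fpr_sep = 0.6264       # overall FPR@95TPR at λ=0.01
n_good = 174           # GOOD samples in the test set

abs_drop = fpr_baseline - fpr_sep   # change in the rate itself
rel_drop = abs_drop / fpr_baseline  # change relative to the baseline rate
rejected_before = round(fpr_baseline * n_good)
rejected_after = round(fpr_sep * n_good)

print(f"absolute drop: {abs_drop * 100:.1f} percentage points")
print(f"relative drop: {rel_drop * 100:.1f}%")
print(f"falsely rejected GOOD samples: {rejected_before} -> {rejected_after}")
```

In sample terms this is 142 versus 109 falsely rejected GOOD samples out of 174, i.e. about 19 percentage points in absolute terms and about 23% relative to the baseline.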
|
|
| --- |
|
|
| ## 4. Per-Template Breakdown (λ=0.01) |
|
|
| | Template | AUROC | Acc | FPR@95TPR | N | |
| |----------|:---:|:---:|:---:|:---:| |
| | A | 0.7981 | 0.7048 | 0.6863 | 105 | |
| | B | **0.8750** | 0.8214 | 0.4375 | 28 | |
| | C | 0.8325 | 0.9023 | 0.9533 | 133 | |
|
|
| --- |
|
|
| ## 5. Stochastic Variance (MC Scoring) |
|
|
| The OOD score is stochastic because each image is scored with K randomly sampled timesteps. Three evaluations of the λ=0.02 checkpoint gave: |
|
|
| | Eval run | K | AUROC | |
| |----------|---|-------| |
| | Run 1 | 50 | 0.8581 | |
| | Run 2 | 50 | 0.8474 | |
| | Run 3 | 100 | 0.8541 | |
|
|
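| **Run-to-run spread ≈ ±0.005** at K=50 (half the range of Runs 1 and 2), **±0.003** at K=100 (only one K=100 run is listed above, so this figure cannot be re-derived from the table). All final results use K=100. |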
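
The K=50 figure can be reproduced from the table as half the range of the two K=50 runs. The sketch below also applies a 1/√K scaling to guess the spread at K=100; that scaling is an assumption (it holds for the mean of independent trials, and carrying it over to AUROC is only a rough heuristic), not something measured here.

```python
import math

k50_aurocs = [0.8581, 0.8474]  # Runs 1 and 2 from the table above

# Half the range of the two K=50 runs.
half_range_k50 = (max(k50_aurocs) - min(k50_aurocs)) / 2

# Heuristic only: assume the spread shrinks like 1/sqrt(K) when K doubles.
scaled_k100 = half_range_k50 * math.sqrt(50 / 100)

print(f"K=50 half-range:       {half_range_k50:.5f}")
print(f"K=100 heuristic guess: {scaled_k100:.5f}")
```

The heuristic lands near 0.004, the same order of magnitude as the ±0.003 quoted for K=100; additional K=100 runs would be needed to measure it directly.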
|
|
| --- |
|
|
| ## 6. Training Configuration |
|
|
| ``` |
| model: CDM UNet with multi-head conditioning (template + feature + quality + bbox heads) |
| dataset: 1,327 inkjet print samples |
| test set: 266 samples (174 GOOD, 92 BAD) |
| train/test: 80/20 stratified split (seed=42) |
| oversampling: BAD samples × 3.0 (minority oversampling during training only) |
| image_size: crop-based (YOLO bbox region) |
| epochs: 100 |
| schedule: cosine (Nichol & Dhariwal 2021) |
| optimizer: AdamW, lr=1e-4 |
| sep_loss: L = L_MSE + λ · L_sep |
| scoring: Algorithm 1 (difference method), K=100 MC trials |
| batch_size: 64 (sep loss runs) / 128 (baseline λ=0) |
| GPU: CUDA device 3 (32 GB VRAM) |
| ``` |
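
The split and oversampling settings above can be illustrated with a small pure-Python sketch. It is not the thesis pipeline: the helper names `stratified_split` and `oversample` are hypothetical, the labels are toy data, and "× 3.0" is interpreted here as each BAD training sample appearing three times per epoch, which is an assumption.

```python
import random

def stratified_split(labels, test_frac=0.2, seed=42):
    """Return (train_idx, test_idx), drawing test_frac of each class for the test set."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        n_test = round(test_frac * len(idxs))
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)

def oversample(train_idx, labels, minority="BAD", factor=3):
    """Repeat each minority-class training index so it appears `factor` times in total."""
    out = []
    for idx in train_idx:
        out.extend([idx] * (factor if labels[idx] == minority else 1))
    return out

labels = ["GOOD"] * 20 + ["BAD"] * 10  # toy labels, not the thesis data
train_idx, test_idx = stratified_split(labels)
train_epoch = oversample(train_idx, labels)
print(len(train_idx), len(test_idx), len(train_epoch))
```

Oversampling is applied to the training indices only, so the test set keeps its natural GOOD/BAD ratio and the reported metrics are not inflated by duplicated BAD samples.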
|
|