Inkjet CDM — Final Thesis Results

All results use K=100 MC trials (Algorithm 1, difference scoring). Dataset: 1,327 total samples → 20% test split = 266 samples (GOOD=174, BAD=92). Evaluation seed: 42 (same train/test split for all λ values). Last verified: 2026-02-26

1. Overall Comparison — All λ Values

λ	AUROC ↑	Accuracy ↑	FPR@95TPR ↓	Δ AUROC vs baseline
0.0 (baseline, no sep loss)	0.8325	0.7895	0.8161	—
0.01 (best AUROC)	0.8603	0.8158	0.6264	+0.0278
0.02	0.8541	0.8008	0.6609	+0.0216
0.05 (best FPR)	0.8553	0.8233	0.5287	+0.0228

Recommended thesis citation: report λ=0.01 as the primary result (best AUROC, 0.8603) and mention λ=0.05 as the best operating point (lowest FPR@95TPR, 0.5287, and highest accuracy, 0.8233).
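For reference, the two headline metrics can be sketched in plain Python. This is a minimal illustration, not the evaluation code behind the table: it assumes BAD is the positive class, that a higher OOD score means "more likely BAD", and that the 95% TPR threshold is the highest score that still flags at least 95% of BAD samples. The toy scores and labels are hypothetical.

```python
import math


def auroc(scores, labels):
    """Fraction of (BAD, GOOD) pairs where the BAD sample scores higher; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def fpr_at_tpr(scores, labels, target_tpr=0.95):
    """FPR at the highest threshold that still flags at least target_tpr of BAD samples."""
    pos = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Number of BAD samples that must be flagged (small epsilon guards float rounding).
    needed = math.ceil(target_tpr * len(pos) - 1e-9)
    threshold = pos[needed - 1]  # everything with score >= threshold is flagged
    return sum(1 for n in neg if n >= threshold) / len(neg)


# Hypothetical toy data: 4 BAD (label 1) and 4 GOOD (label 0) samples.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_auroc = auroc(toy_scores, toy_labels)     # 15 of 16 pairs ordered correctly
toy_fpr = fpr_at_tpr(toy_scores, toy_labels)  # 1 of 4 GOOD samples above threshold
```

With few BAD samples, reaching 95% TPR means flagging every BAD sample, so a single low-scoring defect can push the threshold down and the FPR up sharply.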

2. Per-Feature AUROC (K=100)

Feature	λ=0	λ=0.01	λ=0.02	λ=0.05	Best Δ
angle	0.5556	0.5679	0.6173	0.5556	+0.0617
dist1	0.9000	0.8571	0.9429	0.9143	+0.0429
dist6	0.8278	0.8111	0.8389	0.8278	+0.0111
dots	0.9126	0.9266	0.9126	0.8881	+0.0140
edge1	0.7760	0.8177	0.8594	0.8542	+0.0834
edge2	0.7302	0.7242	0.6786	0.7857	+0.0555
edge3	0.7188	0.8750	0.7708	0.8229	+0.1562
edge4	0.6667	0.7361	0.6806	0.6597	+0.0694
Overall	0.8325	0.8603	0.8541	0.8553	+0.0278

Note: angle has only 3 BAD samples in the test set (27 GOOD, 3 BAD), so its AUROC is statistically unreliable. edge3 shows the largest absolute gain (+0.1562 AUROC, i.e. +15.62 pp, at λ=0.01).
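The unreliability of the angle AUROC can be made concrete: with 27 GOOD and 3 BAD samples, the AUROC is determined by only 81 GOOD/BAD pairs, so it can only move in steps of 1/81. The short sketch below recovers the number of correctly ordered pairs implied by each reported value, assuming no tied scores (an assumption, since the raw scores are not listed here).

```python
from fractions import Fraction

n_good, n_bad = 27, 3            # angle test-set counts from the note above
n_pairs = n_good * n_bad         # 81 GOOD/BAD pairs determine the AUROC
step = Fraction(1, n_pairs)      # smallest possible AUROC change (one pair flips)

# Reported angle AUROC per lambda, copied from the per-feature table.
reported = {"0": 0.5556, "0.01": 0.5679, "0.02": 0.6173}

# Correctly ordered pairs implied by each value (assumes no ties).
implied_pairs = {lam: round(v * n_pairs) for lam, v in reported.items()}

# Pairs that changed order between the baseline and the best lambda.
pairs_gained = implied_pairs["0.02"] - implied_pairs["0"]
```

Under that assumption, the +0.0617 gain for angle corresponds to 5 of 81 pairs changing order, which is why the figure should not be read as strong evidence.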

3. Per-Feature FPR@95TPR (K=100) — lower is better

Feature	λ=0	λ=0.01	Δ
angle	0.9630	0.9630	0.000
dist1	0.2381	0.4286	+0.190 (regression)
dist6	0.9667	0.9333	−0.033
dots	0.9615	0.8077	−0.154
edge1	0.9167	0.9167	0.000
edge2	0.8889	0.7778	−0.111
edge3	0.5625	0.4375	−0.125
edge4	0.5833	0.4583	−0.125
Overall	0.8161	0.6264	−0.190

Overall FPR@95TPR drops by 19.0 pp at λ=0.01 (0.8161 → 0.6264): at 95% defect-detection sensitivity, the share of good products falsely rejected falls from 81.6% to 62.6%, a relative reduction of about 23%.
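The overall FPR change can also be expressed as sample counts, which helps keep the absolute change (percentage points) separate from the relative change (percent). The sketch below uses only the overall FPR values from the table and the 174 GOOD test samples; the variable names are illustrative.

```python
n_good = 174                          # GOOD samples in the test set
fpr_base, fpr_sep = 0.8161, 0.6264    # overall FPR@95TPR at λ=0 and λ=0.01

rejected_base = round(fpr_base * n_good)   # GOOD samples falsely rejected at λ=0
rejected_sep = round(fpr_sep * n_good)     # GOOD samples falsely rejected at λ=0.01

abs_drop_pp = (fpr_base - fpr_sep) * 100               # absolute change, percentage points
rel_drop_pct = (fpr_base - fpr_sep) / fpr_base * 100   # relative change, percent

print(f"{rejected_base} -> {rejected_sep} false rejections "
      f"({abs_drop_pp:.1f} pp absolute, {rel_drop_pct:.1f}% relative)")
```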

4. Per-Template Breakdown (λ=0.01)

Template	AUROC	Acc	FPR@95TPR	N
A	0.7981	0.7048	0.6863	105
B	0.8750	0.8214	0.4375	28
C	0.8325	0.9023	0.9533	133
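As a consistency check, the per-template accuracies can be pooled by sample count and compared with the overall accuracy reported for λ=0.01 (0.8158). The sketch below does only that; AUROC and FPR@95TPR are not sample-weighted averages of the per-template values, so they are deliberately left out.

```python
# Per-template accuracy and sample count at λ=0.01, copied from the table above.
templates = {
    "A": {"acc": 0.7048, "n": 105},
    "B": {"acc": 0.8214, "n": 28},
    "C": {"acc": 0.9023, "n": 133},
}

total_n = sum(t["n"] for t in templates.values())

# Sample-weighted mean accuracy across templates.
pooled_acc = sum(t["acc"] * t["n"] for t in templates.values()) / total_n

print(f"N = {total_n}, pooled accuracy = {pooled_acc:.4f}")
```

The pooled value matches the overall accuracy in Section 1, and the template counts sum to the 266 test samples.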

5. Stochastic Variance (MC Scoring)

The OOD score is stochastic (K random timestep samples per image). Across 3 evaluations of the λ=0.02 checkpoint:

Eval run	K	AUROC
Run 1	50	0.8581
Run 2	50	0.8474
Run 3	100	0.8541

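Run-to-run spread in AUROC ≈ ±0.005 at K=50 (half the range of the two K=50 runs) and ≈ ±0.003 at K=100; the K=100 figure is an estimate, since only one K=100 run is listed above. All final results use K=100.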
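The K=50 spread can be reproduced from the two runs in the table. The projection to K=100 below is a rough sketch under the assumption that MC trials are independent, so that noise shrinks roughly as 1/sqrt(K); it is not necessarily how the reported K=100 figure was obtained, and AUROC noise need not follow this scaling exactly.

```python
import math

runs_k50 = [0.8581, 0.8474]   # AUROC of the two K=50 evaluations (λ=0.02 checkpoint)

# Half-range of the two runs: a crude spread estimate from only two samples.
half_range_k50 = (max(runs_k50) - min(runs_k50)) / 2

# Assumption: independent MC trials, so noise scales as 1/sqrt(K).
scale = math.sqrt(50 / 100)
projected_k100 = half_range_k50 * scale

print(f"K=50 spread ~ ±{half_range_k50:.4f}, projected K=100 spread ~ ±{projected_k100:.4f}")
```

Two runs give only a coarse estimate; repeating the K=100 evaluation several times would measure the spread directly.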

6. Training Configuration

model:           CDM UNet (multi-head conditioning)
                 template + feature + quality + bbox heads
dataset:         1,327 inkjet print samples (test set: 174 GOOD, 92 BAD)
train/test:      80/20 stratified split (seed=42)
oversampling:    BAD samples × 3.0 (minority oversampling during training only)
image_size:      crop-based (YOLO bbox region)
epochs:          100
schedule:        cosine (Nichol & Dhariwal 2021)
optimizer:       AdamW, lr=1e-4
sep_loss:        L = L_MSE + λ · L_sep
scoring:         Algorithm 1 (difference method), K=100 MC trials
batch_size:      64 (sep loss runs) / 128 (baseline λ=0)
GPU:             CUDA device 3 (32 GB VRAM)
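The "cosine (Nichol & Dhariwal 2021)" entry is read here as the cosine noise schedule from that paper. A minimal sketch follows; the number of diffusion steps T is not stated in this document, so T=1000 is a placeholder, and the offset s=0.008 and the 0.999 beta clip are the paper's defaults rather than values confirmed for this model.

```python
import math


def cosine_alpha_bar(t, T, s=0.008):
    """Unnormalized cumulative signal level f(t) = cos^2(((t/T + s) / (1 + s)) * pi/2)."""
    return math.cos(((t / T) + s) / (1 + s) * math.pi / 2) ** 2


def cosine_betas(T, s=0.008, max_beta=0.999):
    """Per-step noise variances beta_t = 1 - f(t+1)/f(t), clipped at max_beta."""
    betas = []
    for t in range(T):
        a0 = cosine_alpha_bar(t, T, s)
        a1 = cosine_alpha_bar(t + 1, T, s)
        betas.append(min(1 - a1 / a0, max_beta))
    return betas


T = 1000                 # placeholder: the actual step count is not given above
betas = cosine_betas(T)
print(f"beta[0]={betas[0]:.2e}, beta[{T // 2}]={betas[T // 2]:.2e}, beta[-1]={betas[-1]:.3f}")
```

The normalization by f(0) in the paper's definition cancels in the ratio f(t+1)/f(t), so it does not appear in the code.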