# Inkjet CDM — Experiment Log
Full chronological log of all inkjet CDM experiments.
---
## Model Architecture
```
CDM UNet — Multi-Head Conditioning
Parameters: ~9.3M (inkjet model, smaller than CIFAR-10's 35.7M)
Input: YOLO-cropped inkjet print region (bbox crop)
Conditioning: 4 heads — template_id, feature_id, quality_label, bbox_coords
Schedule: Cosine (Nichol & Dhariwal 2021), 1000 timesteps, epsilon prediction
Scoring: Algorithm 1 — difference method (pred_good_err - pred_bad_err)
```
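For reference, the cosine schedule named above can be sketched in a few lines. This follows the published formula from Nichol & Dhariwal (2021) with their default offset `s = 0.008` and beta clip of 0.999; whether this project's trainer uses those exact defaults is an assumption, and the function names are illustrative only.

```python
import math

def cosine_alpha_bar(num_steps=1000, s=0.008):
    """Cumulative signal fraction alpha_bar[t] for t = 0..num_steps."""
    def f(t):
        return math.cos((t / num_steps + s) / (1 + s) * math.pi / 2) ** 2
    f0 = f(0)
    return [f(t) / f0 for t in range(num_steps + 1)]

def cosine_betas(num_steps=1000, s=0.008, max_beta=0.999):
    """Per-step noise variances, clipped to avoid a singularity near t = T."""
    abar = cosine_alpha_bar(num_steps, s)
    return [min(1 - abar[t] / abar[t - 1], max_beta) for t in range(1, num_steps + 1)]

alpha_bar = cosine_alpha_bar()   # 1001 values, starting at 1.0 and decaying toward 0
betas = cosine_betas()           # 1000 values, one per timestep
```

`alpha_bar[t]` is the fraction of the clean image's variance that survives at step `t`; the final beta hits the clip value because `alpha_bar` reaches essentially zero at `t = 1000`.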
---
## Dataset Statistics
```
Total samples: 1,327 (from feature_cls.csv)
Train split: 80% = 1,061 samples (BAD oversampled ×3, so the effective train size is larger)
Test split: 20% = 266 samples (FIXED, seed=42, no oversampling)
Test set breakdown:
Templates: A=105, B=28, C=133
Labels: GOOD=174, BAD=92 (imbalance: 1.89:1)
Per-feature test counts:
angle: 27 GOOD, 3 BAD ← WARNING: only 3 BAD — AUROC unreliable
dist1: 21 GOOD, 10 BAD
dist6: 30 GOOD, 6 BAD
dots: 26 GOOD, 11 BAD
edge1: 12 GOOD, 16 BAD
edge2: 18 GOOD, 28 BAD
edge3: 16 GOOD, 12 BAD
edge4: 24 GOOD, 6 BAD
```
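The test-set counts above can be cross-checked with a short script. The numbers are copied from this section; the `min_bad` threshold of 5 used to flag unreliable features is an illustrative choice, not a project setting.

```python
# Per-feature test counts as (GOOD, BAD), copied from the statistics above.
per_feature = {
    "angle": (27, 3), "dist1": (21, 10), "dist6": (30, 6), "dots": (26, 11),
    "edge1": (12, 16), "edge2": (18, 28), "edge3": (16, 12), "edge4": (24, 6),
}
templates = {"A": 105, "B": 28, "C": 133}

good = sum(g for g, _ in per_feature.values())
bad = sum(b for _, b in per_feature.values())
imbalance = good / bad

# Features with too few BAD samples for a stable per-feature AUROC.
min_bad = 5  # illustrative threshold (assumption)
low_bad_features = [name for name, (_, b) in per_feature.items() if b < min_bad]

print(f"GOOD={good} BAD={bad} total={good + bad} imbalance={imbalance:.2f}:1")
print("Low-BAD features:", low_bad_features)
```

The per-feature counts sum to the label totals (174 GOOD, 92 BAD) and to the template total of 266, and only `angle` falls below the threshold.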
---
## Experiment 1 — Baseline CDM (λ=0)
**Goal:** Establish CDM performance without separation loss.
**Config:**
```
sep_loss_weight: 0.0
batch_size: 128 (single forward pass, fits 32GB VRAM comfortably)
epochs: 100
schedule: cosine
num_trials: 100 (K=100 at evaluation)
out_dir: results/inkjet_lambda0
```
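A minimal sketch of how the difference-method score with `num_trials` = K might be computed, based only on the one-line description in the architecture summary (`pred_good_err - pred_bad_err`). The function names, the per-trial sampling of timestep and noise, the toy schedule, and the toy predictor are all assumptions for illustration; they stand in for the real UNet and are not the project's code. Under this reading, a higher score means the sample looks more like BAD.

```python
import math
import random

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def difference_score(x0, predict_eps, alpha_bar, num_trials=100, seed=42):
    """Mean over K trials of (error with GOOD conditioning - error with BAD conditioning)."""
    rng = random.Random(seed)
    num_steps = len(alpha_bar) - 1
    total = 0.0
    for _ in range(num_trials):
        t = rng.randint(1, num_steps)              # random timestep for this trial
        eps = [rng.gauss(0.0, 1.0) for _ in x0]    # random noise for this trial
        a = alpha_bar[t]
        # Forward diffusion: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps
        x_t = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
        err_good = mse(predict_eps(x_t, t, "GOOD"), eps)
        err_bad = mse(predict_eps(x_t, t, "BAD"), eps)
        total += err_good - err_bad
    return total / num_trials

# Toy stand-ins (NOT the real schedule or model): every GOOD image is all
# zeros, every BAD image is all ones, and the predictor knows both means.
T = 1000
alpha_bar = [math.cos(t / T * math.pi / 2) ** 2 for t in range(T + 1)]
class_mean = {"GOOD": 0.0, "BAD": 1.0}

def toy_predict_eps(x_t, t, label):
    a = alpha_bar[t]
    return [(x - math.sqrt(a) * class_mean[label]) / math.sqrt(1.0 - a) for x in x_t]

score_good_like = difference_score([0.0] * 16, toy_predict_eps, alpha_bar)
score_bad_like = difference_score([1.0] * 16, toy_predict_eps, alpha_bar)
```

In the toy setup the GOOD-like input gets a negative score and the BAD-like input a positive one, which is the separation the scoring rule relies on.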
**Key fix applied:** The trainer now gates the separation-loss computation behind `if sep_loss_weight > 0.0`,
so the baseline run skips the two extra forward passes that would otherwise triple VRAM usage and cause OOM errors.
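The gate can be illustrated with a small sketch. The function and argument names are hypothetical, and adding the separation term as `weight * term` is an assumption; only the `sep_loss_weight > 0.0` condition comes from the log.

```python
def training_loss(batch, main_loss_fn, sep_loss_fn, sep_loss_weight):
    """Denoising loss plus an optional weighted separation term."""
    loss = main_loss_fn(batch)  # one forward pass, always needed
    if sep_loss_weight > 0.0:
        # Only now pay for the extra GOOD- and BAD-conditioned passes.
        loss = loss + sep_loss_weight * sep_loss_fn(batch)
    return loss

# Toy loss functions that count how often they are called.
calls = {"main": 0, "sep": 0}

def toy_main(batch):
    calls["main"] += 1
    return 1.0

def toy_sep(batch):
    calls["sep"] += 1
    return 0.5

baseline = training_loss([0], toy_main, toy_sep, 0.0)    # sep term skipped
with_sep = training_loss([0], toy_main, toy_sep, 0.02)   # sep term evaluated
```

With `sep_loss_weight = 0.0` the separation function is never called, so its activations are never allocated.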
**Results (K=100):**
```
Overall AUROC=0.8325 Acc=0.7895 FPR@95TPR=0.8161 N=266
```
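For readers unfamiliar with the two ranking metrics, here is a dependency-free sketch of how AUROC and FPR@95TPR can be computed from per-sample scores. It assumes BAD is the positive class (label 1) and that higher scores mean more likely BAD; the log does not state either convention, and the toy scores below are made up for illustration.

```python
import math

def auroc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fpr_at_tpr(scores, labels, target_tpr=0.95):
    """False-positive rate at the highest threshold that recalls >= target_tpr of positives."""
    pos = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Number of positives that must score at or above the threshold
    # (small epsilon guards against float round-up in the product).
    k = math.ceil(target_tpr * len(pos) - 1e-9)
    threshold = pos[k - 1]
    return sum(n >= threshold for n in neg) / len(neg)

# Toy example: 4 positives, 4 negatives, one positive ranked below a negative.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0]
toy_auroc = auroc(toy_scores, toy_labels)
toy_fpr95 = fpr_at_tpr(toy_scores, toy_labels)
```

A single low-scoring positive barely moves AUROC but forces the 95%-recall threshold down, which is why FPR@95TPR can stay high even when AUROC looks reasonable.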
**Output files:**
- `results/inkjet_lambda0/cdm_best.pt`
- `results/inkjet_lambda0_k100/scores.csv`
- `results/inkjet_lambda0.log`
---
## Experiment 2 — Separation Loss λ=0.02
**Goal:** Apply CIFAR-10 optimal λ directly to inkjet.
**Config:**
```
sep_loss_weight: 0.02
batch_size: 64 (3 forward passes per step: main + good + bad)
epochs: 100
```
**Results (K=100, re-evaluated):**
```
Overall AUROC=0.8541 Acc=0.8008 FPR@95TPR=0.6609 N=266
```
*Note: The first evaluation (K=50) gave AUROC 0.8581; run-to-run variation is about ±0.005 at K=50.*
**Output files:**
- `results/inkjet_lambda0.02/cdm_best.pt`
- `results/inkjet_lambda0.02_k100/scores.csv`
- `results/inkjet_lambda0.02.log`
---
## Experiment 3 — Separation Loss λ=0.01
**Goal:** Test CIFAR-10 sub-optimal value on inkjet (one step below λ=0.02).
**Config:**
```
sep_loss_weight: 0.01
batch_size: 64
epochs: 100
num_trials: 100
```
**Results (K=100, inline with training via --eval_after):**
```
Overall AUROC=0.8603 Acc=0.8158 FPR@95TPR=0.6264 N=266
```
**→ Best AUROC of all λ values on the fixed single split. The 5-fold CV in Experiment 5 does not confirm this ranking.**
**Output files:**
- `results/inkjet_lambda0.01/cdm_best.pt`
- `results/inkjet_lambda0.01/scores_final.csv`
- `results/inkjet_lambda0.01.log`
---
## Experiment 4 — Separation Loss λ=0.05
**Goal:** Test CIFAR-10 upper boundary of optimal zone on inkjet.
**Config:**
```
sep_loss_weight: 0.05
batch_size: 64
epochs: 100
num_trials: 100
```
**Results (K=100):**
```
Overall AUROC=0.8553 Acc=0.8233 FPR@95TPR=0.5287 N=266
```
**→ Best FPR@95TPR and best accuracy of all λ values on the fixed single split.**
**Output files:**
- `results/inkjet_lambda0.05/cdm_best.pt`
- `results/inkjet_lambda0.05/scores_final.csv`
- `results/inkjet_lambda0.05.log`
---
## Experiment 5 — 5-Fold CV Ablation (Phase 2, FINAL)
**Date:** March 1-2, 2026
**Goal:** Rigorous evaluation with confidence intervals. All four λ values evaluated under identical conditions (batch=64 for fair comparison).
**Config (all runs):**
```
batch_size: 64 (all runs — eliminates confound)
epochs: 100 per fold
seed: 42
num_trials: 100
n_folds: 5 (stratified on label)
Total GPU time: ~24h
```
**Results:**
| λ | AUROC (mean ± std, 5 folds) | Accuracy (mean ± std) | FPR@95TPR (mean ± std, lower is better) |
|---|:---:|:---:|:---:|
| **0.0** | **0.8673 ± 0.0230** | **0.8094 ± 0.0151** | 0.5631 ± 0.1697 |
| 0.01 | 0.8628 ± 0.0286 | 0.7928 ± 0.0291 | **0.5516 ± 0.1841** |
| 0.02 | 0.8510 ± 0.0326 | 0.8003 ± 0.0246 | 0.6240 ± 0.1334 |
| 0.05 | 0.8670 ± 0.0256 | 0.8071 ± 0.0241 | 0.5700 ± 0.1948 |
**→ Separation loss does NOT improve over the baseline on inkjet. AUROC differences between λ values are all within one cross-fold standard deviation.**
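The comparison against the baseline can be made explicit with the AUROC values from the table. This is a simple check against the baseline's standard deviation, not a significance test; whether the reported ± values are sample or population standard deviations is not stated in the log.

```python
# AUROC (mean, std) per lambda, copied from the 5-fold CV table.
cv_auroc = {
    0.0: (0.8673, 0.0230),
    0.01: (0.8628, 0.0286),
    0.02: (0.8510, 0.0326),
    0.05: (0.8670, 0.0256),
}

base_mean, base_std = cv_auroc[0.0]
deltas = {lam: mean - base_mean for lam, (mean, _) in cv_auroc.items() if lam > 0}
within_one_std = {lam: abs(d) <= base_std for lam, d in deltas.items()}

for lam, d in deltas.items():
    print(f"lambda={lam}: delta AUROC = {d:+.4f}, within baseline std: {within_one_std[lam]}")
```

Every λ>0 run has a slightly lower mean AUROC than the baseline, and the largest gap (λ=0.02, about 0.016) is still smaller than the baseline's fold-to-fold spread of 0.023.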
**Output files:**
- `results/cv_lambda0.0/cv_summary.json` (+ per-fold scores)
- `results/cv_lambda0.01/cv_summary.json`
- `results/cv_lambda0.02/cv_summary.json`
- `results/cv_lambda0.05/cv_summary.json`
---
## Key Decisions & Rationale
| Decision | Choice | Rationale |
|----------|--------|-----------|
| 5-fold CV for final results | 5-fold stratified | Single-split results unreliable; CV provides confidence intervals |
| batch=64 for ALL runs | 64 | Eliminates batch-size confound; sep loss needs 3 forward passes |
| λ values tested | 0.0, 0.01, 0.02, 0.05 | Covers CIFAR-10 optimal zone; more points not justified |
| K=100 for final eval | K=100 | Stable estimates; CIFAR-10 K-ablation showed diminishing returns after K=25 |
| seed=42 | 42 | Matches CIFAR-10 primary seed |
---
## Known Issues
1. **`angle` feature**: Only 2-4 BAD samples per fold. AUROC highly variable (0.54–0.95). Report with caveat.
2. **MC variance**: ±0.003 at K=100. Negligible compared to cross-fold variance (±0.025).
3. **Separation loss ineffective**: The primary thesis finding for inkjet. See cross-domain comparison in RESULTS.md.
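The two Monte Carlo spreads quoted in this log (±0.005 at K=50, ±0.003 at K=100) can be sanity-checked against each other. The sketch assumes the K trials are independent, so the spread of an averaged score shrinks roughly as 1/√K; that scaling is a standard rule of thumb, not something measured here.

```python
import math

def scaled_spread(spread_at_k, k_from, k_to):
    """Expected spread at k_to trials, given the spread at k_from (1/sqrt(K) scaling)."""
    return spread_at_k * math.sqrt(k_from / k_to)

predicted_k100 = scaled_spread(0.005, 50, 100)   # about 0.0035
ratio_to_cross_fold = 0.003 / 0.025              # reported MC spread vs cross-fold spread

print(f"Predicted spread at K=100: {predicted_k100:.4f}")
print(f"MC spread is {ratio_to_cross_fold:.0%} of the cross-fold spread")
```

The predicted value of about 0.0035 is in line with the reported ±0.003, and the Monte Carlo spread is roughly an eighth of the cross-fold spread.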
---
## Hardware
- **GPU:** CUDA device 3 (Quadro GV100, 32 GB VRAM)
- **Environment:** `/system/apps/studentenv/mohammed/sdm/`
- **Training time per fold:** ~70 min (100 epochs, batch=64, λ>0)
- **Evaluation time per fold (K=100):** ~45 seconds
- **Total 5-fold CV ablation time:** ~24h (4 λ values × 5 folds)
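The ~24h total follows from the per-fold figures above. The sketch assumes every fold, including the λ=0 baseline folds, takes the ~70 min quoted for λ>0, which makes it a slight upper bound.

```python
n_lambdas, n_folds = 4, 5
train_min_per_fold = 70   # quoted for lambda > 0; assumed for the baseline too
eval_sec_per_fold = 45

total_runs = n_lambdas * n_folds
total_hours = total_runs * (train_min_per_fold + eval_sec_per_fold / 60) / 60

print(f"{total_runs} runs, about {total_hours:.1f} h in total")
```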