---
library_name: pytorch
tags:
- conditional-diffusion
- inkjet-quality-control
- yolo
- out-of-distribution-detection
- anomaly-detection
- thesis
license: cc-by-4.0
---
# InkjetOOD: Pretrained Models & Dataset
[![GitHub Code](https://img.shields.io/badge/GitHub-ahmed--3m%2FInkjetOOD-black)](https://github.com/ahmed-3m/InkjetOOD)
[![Companion Repo](https://img.shields.io/badge/🤗%20HuggingFace-DiffusionOOD%20(CIFAR--10)-blue)](https://huggingface.co/ahmed-3m/DiffusionOOD)
**Thesis:** *Conditional Diffusion Models as Generative Classifiers for Out-of-Distribution Detection in Inkjet Print Quality Control*
**Author:** Ahmed Mohammed, MSc AI, Johannes Kepler University Linz (2026)
**Supervisor:** Univ.-Prof. Dr. Sepp Hochreiter · Industrial Partner: PROFACTOR GmbH
---
## What Is This?
This HuggingFace repository stores **all artifacts** for the inkjet print quality control experiments from the above thesis:
- Trained model weights (CDM + YOLO feature detector)
- Inkjet print dataset (FTI_Zer0P format: images + labels)
- All experiment results (metrics, score CSVs, figures, tables)
The **code** lives on GitHub: [https://github.com/ahmed-3m/InkjetOOD](https://github.com/ahmed-3m/InkjetOOD)
---
## Quick Start
### Option A: Evaluate the pretrained model (fastest, ~5 min)
```bash
# 1. Clone the code repo
git clone https://github.com/ahmed-3m/InkjetOOD
cd InkjetOOD
pip install -r requirements.txt
# 2. Download weights from this HF repo
python download_weights.py
# 3. Set your dataset path
export INKJET_DATA_DIR=/path/to/FTI_Zer0P_dataset
# 4. Evaluate
python evaluate.py \
    --checkpoint models/cdm_proposed.pt \
    --num_trials 100 \
    --out_dir results/eval_pretrained
```
Expected: AUROC ≈ 0.860 (single-split, N=266 test samples, seed=42).
---
### Option B: Train from scratch (single split)
```bash
export INKJET_DATA_DIR=/path/to/FTI_Zer0P_dataset
python download_weights.py # fetches YOLO weights
# Baseline (λ=0)
python train.py --epochs 100 --batch_size 128 --sep_loss_weight 0.0 \
    --eval_after --out_dir results/train_baseline
# Proposed (λ=0.01)
python train.py --epochs 100 --batch_size 64 --sep_loss_weight 0.01 \
    --eval_after --out_dir results/train_proposed
```
---
### Option C: 5-Fold cross-validation (thesis main result)
```bash
export INKJET_DATA_DIR=/path/to/FTI_Zer0P_dataset
# Reproduces thesis Table 6.x: AUROC 0.8673 ± 0.023
python run_cv.py \
    --sep_loss_weight 0.0 \
    --epochs 100 --batch_size 128 \
    --num_trials 50 --n_folds 5 --seed 42 \
    --out_dir results/cv_lambda0
```
---
## Step-by-Step: Load and Use Pretrained Weights
**Step 1: Download weights**
Using the helper script (recommended):
```bash
git clone https://github.com/ahmed-3m/InkjetOOD && cd InkjetOOD
python download_weights.py
```
Or download manually from this repo's `models/` folder:
```python
from huggingface_hub import hf_hub_download
import torch
# Download the proposed CDM (λ=0.01, single-split AUROC 0.860)
path = hf_hub_download(repo_id="ahmed-3m/InkjetOOD", filename="models/cdm_v3_yolo_bbox.pt")
ckpt = torch.load(path, map_location="cpu", weights_only=False)
state_dict = ckpt["state_dict"] # or ckpt["model_state_dict"]
print(f"Loaded {sum(v.numel() for v in state_dict.values()):,} parameters")
```
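The comment in the snippet above notes that the weights may sit under either `state_dict` or `model_state_dict`. A minimal sketch of a helper that accepts both layouts follows. It assumes the checkpoint is a plain dictionary; the name `extract_state_dict` is hypothetical and not part of the repository.

```python
def extract_state_dict(ckpt):
    """Return the weight mapping from a checkpoint dict.

    Assumption: weights are stored under one of the two keys mentioned
    above; if neither key exists, the dict itself is treated as the
    state dict.
    """
    if not isinstance(ckpt, dict):
        raise TypeError(f"expected a dict checkpoint, got {type(ckpt).__name__}")
    for key in ("model_state_dict", "state_dict"):
        if key in ckpt:
            return ckpt[key]
    return ckpt


# Usage with stand-in values (real checkpoints hold tensors, not lists)
demo = {"epoch": 3, "state_dict": {"conv.weight": [0.1, 0.2]}}
weights = extract_state_dict(demo)
print(sorted(weights))
```

With a real checkpoint, the result can be passed straight to `model.load_state_dict(...)`.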
**Step 2: Load the model**
```python
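import torch
from src.model import NoisePredictorV3
from src.diffusion import DiffusionSchedule
# Fall back to CPU when no GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"
model = NoisePredictorV3(base_channels=64).to(device)
schedule = DiffusionSchedule(schedule="cosine", device=device)
# If you downloaded manually in Step 1, use the path returned by hf_hub_download instead.
# weights_only=False matches Step 1; only load checkpoints you trust.
ckpt = torch.load("models/cdm_proposed.pt", map_location=device, weights_only=False)
model.load_state_dict(ckpt["model_state_dict"])  # some checkpoints use "state_dict" instead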
model.eval()
print("Model loaded.")
```
**Step 3: Run evaluation**
```python
from src.evaluate import evaluate_cdm, save_results
# ... (set up test_loader with your data)
scores_df = evaluate_cdm(model, schedule, test_loader, num_trials=100, device=device)
save_results(scores_df, out_dir="results/my_eval")
```
For full data loading details, see [`evaluate.py`](https://github.com/ahmed-3m/InkjetOOD/blob/main/evaluate.py) in the GitHub repo.
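The repository's `evaluate_cdm` is the reference implementation. The toy sketch below only illustrates the general idea behind using a conditional diffusion model as a generative classifier: noise an input, predict that noise under each condition, and compare the errors averaged over K random trials. Everything in it is an assumption made for illustration, including the function names, the two-condition setup, the cosine form of the schedule, and the closed-form stand-in for the trained network. It may not match the thesis code.

```python
import math
import random


def alpha_bar(t):
    """Cosine-style cumulative signal level for t in (0, 1) (illustrative form)."""
    return math.cos(0.5 * math.pi * t) ** 2


def toy_noise_predictor(x_t, t, class_mean):
    """Stand-in for a trained network: the exact noise estimate if the
    clean value were `class_mean`. A real CDM learns this mapping."""
    ab = alpha_bar(t)
    return (x_t - math.sqrt(ab) * class_mean) / math.sqrt(1.0 - ab)


def conditional_error(x0, class_mean, num_trials, rng):
    """Mean squared noise-prediction error over random (t, eps) draws."""
    total = 0.0
    for _ in range(num_trials):
        t = rng.uniform(0.05, 0.95)  # stay away from the degenerate endpoints
        eps = rng.gauss(0.0, 1.0)
        ab = alpha_bar(t)
        x_t = math.sqrt(ab) * x0 + math.sqrt(1.0 - ab) * eps  # forward noising
        total += (eps - toy_noise_predictor(x_t, t, class_mean)) ** 2
    return total / num_trials


def ood_score(x0, good_mean, bad_mean, num_trials=100, seed=42):
    """Error under the 'good' condition minus error under the 'bad' one.
    Higher means the sample looks less like the 'good' class."""
    # Identical seeds give both conditions the same (t, eps) draws
    err_good = conditional_error(x0, good_mean, num_trials, random.Random(seed))
    err_bad = conditional_error(x0, bad_mean, num_trials, random.Random(seed))
    return err_good - err_bad


print(ood_score(0.1, good_mean=0.0, bad_mean=1.0) < 0)  # sample near 'good'
print(ood_score(0.9, good_mean=0.0, bad_mean=1.0) > 0)  # sample near 'bad'
```

Averaging over more trials reduces the variance of the score, which is presumably why `num_trials` is exposed as a parameter in the evaluation call above.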
**Step 4: Load the YOLO detector**
```python
from ultralytics import YOLO
yolo = YOLO("models/yolo_best.pt")
results = yolo("path/to/print_image.png", conf=0.3)
# Each detection is one print feature crop
```
---
## Models
| File | Description | AUROC | Params |
|---|---|---|---|
| `models/cdm_v3_yolo_bbox.pt` | **CDM λ=0.01, base_ch=64 (proposed)** | 0.8603 single-split | **9.33 M** |
| `models/cdm_v3_baseline.pt` | CDM λ=0, base_ch=128 (thesis CV result) | 0.8673 ± 0.023 CV | **34.2 M** |
| `models/cdm_v3_test.pt` | CDM λ=0.01, base_ch=64 (dev checkpoint) | ≈0.85 | 9.33 M |
| `models/yolo_best.pt` | YOLOv8 feature detector (8 print features) | mAP@50=0.950 | **25.86 M** |
| `models/semantic_mismatch_angle_model.pt` | Per-feature CDM: angle (~9.67:1 imbalance) | ~0.82 | **8.94 M** |
| `models/semantic_mismatch_dist1_model.pt` | Per-feature CDM: dist1 | ~0.89 | **8.94 M** |
| `models/semantic_mismatch_dots_model.pt` | Per-feature CDM: dots (best feature) | ~0.96 | **8.94 M** |
### Which model should I use?
- **Quick evaluation:** `cdm_v3_yolo_bbox.pt`, the thesis "proposed CDM" (λ=0.01, base_ch=64, **9.33 M params**)
- **Reproducing the thesis CV result:** `cdm_v3_baseline.pt` (λ=0, base_ch=128, **34.2 M params**); the 5-fold CV AUROC of 0.8673 ± 0.023 comes from this wider model
- **YOLO feature detection only:** `yolo_best.pt` (25.86 M params, YOLOv8-based)
- **Per-feature analysis:** `semantic_mismatch_*.pt`, one model per feature at 8.94 M params each, trained with a DiffGuard-style contrastive approach
> **Why does the baseline win on CV?** On this small dataset (~1330 samples), the 5-fold CV shows λ=0 and λ=0.01 are statistically indistinguishable (0.8673 vs 0.8628). The single-split evaluation clearly favors λ=0.01 (+2.8 pp AUROC, −19 pp FPR@95). The thesis reports the CV result as the primary finding since it is more reliable.
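
The percentage-point figures quoted above follow directly from the results tables in this README. A short check, using only the values reported there:

```python
# Values copied from the Results tables in this README
single_split = {
    0.0: {"auroc": 0.8325, "fpr95": 0.816},
    0.01: {"auroc": 0.8603, "fpr95": 0.626},
}
cv = {
    0.0: {"mean": 0.8673, "std": 0.023},
    0.01: {"mean": 0.8628, "std": 0.029},
}

# Differences in percentage points (pp) for the single split
auroc_gain_pp = (single_split[0.01]["auroc"] - single_split[0.0]["auroc"]) * 100
fpr_change_pp = (single_split[0.01]["fpr95"] - single_split[0.0]["fpr95"]) * 100

# Gap between the two CV means, compared with the smaller fold-level std
cv_gap = cv[0.0]["mean"] - cv[0.01]["mean"]
within_one_std = cv_gap < min(cv[0.0]["std"], cv[0.01]["std"])

print(f"single-split AUROC change: {auroc_gain_pp:+.1f} pp")
print(f"single-split FPR@95 change: {fpr_change_pp:+.1f} pp")
print(f"CV gap: {cv_gap:.4f}, within one std: {within_one_std}")
```

The CV gap is several times smaller than the spread across folds, which is consistent with treating the two settings as indistinguishable on this dataset.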
---
## Results
### 5-Fold Cross-Validation (Thesis Main Result)
| λ | Mean AUROC | Std | Mean FPR@95 |
|---|---|---|---|
| **0.0 (baseline)** | **0.8673** | **0.023** | 0.563 |
| 0.01 | 0.8628 | 0.029 | 0.552 |
| 0.02 | 0.8510 | 0.033 | 0.624 |
| 0.05 | 0.8670 | 0.026 | 0.570 |
Protocol: 5-fold stratified CV, K=50 Monte Carlo trials, seed=42.
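As an illustration of the stratified 5-fold protocol, the sketch below builds the folds in plain Python. It is not the repository's `run_cv.py`: the function name, the label encoding (0 = good, 1 = defective), and the toy label counts are assumptions.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=42):
    """Assign each sample index to a fold so class proportions stay similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so each fold gets a near-equal share per class
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


# Toy labels only: 40 'good' (0) and 10 'defective' (1) samples
labels = [0] * 40 + [1] * 10
folds = stratified_folds(labels)
for k, test_idx in enumerate(folds):
    n_defective = sum(labels[i] for i in test_idx)
    print(f"fold {k}: {len(test_idx)} test samples, {n_defective} defective")
```

Each fold serves once as the test set while the remaining four are used for training. Fixing the seed keeps the split reproducible across runs.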
### Single-Split Evaluation (N=266 test, seed=42)
| λ | AUROC | FPR@95 | K |
|---|---|---|---|
| 0.0 | 0.8325 | 0.816 | 100 |
| **0.01** | **0.8603** | **0.626** | 100 |
| 0.02 | 0.8541 | 0.661 | 100 |
| 0.05 | 0.8553 | 0.529 | 100 |
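For readers unfamiliar with the two metrics in these tables, here is a minimal pure-Python sketch. It assumes that higher scores mean "more anomalous", that label 1 marks the positive (anomalous) class, and that FPR@95 is the false-positive rate at the strictest threshold that still reaches a 95% true-positive rate. The repository's own implementation may use different conventions or tie handling, and the scores below are toy values, not thesis data.

```python
import math


def auroc(scores, labels):
    """Probability that a random positive outscores a random negative
    (ties count as half), computed by direct pairwise comparison."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def fpr_at_tpr(scores, labels, target_tpr=0.95):
    """False-positive rate at the highest threshold reaching `target_tpr`."""
    pos = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    neg = [s for s, y in zip(scores, labels) if y == 0]
    k = math.ceil(target_tpr * len(pos))  # positives that must be flagged
    threshold = pos[k - 1]                # k-th highest positive score
    return sum(1 for s in neg if s >= threshold) / len(neg)


# Toy scores only
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
print(auroc(scores, labels), fpr_at_tpr(scores, labels))
```

The pairwise AUROC is quadratic in the number of samples, which is acceptable at the scale of a few hundred test crops. A rank-based formulation would be preferable for larger sets.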
### Per-Feature AUROC (5-Fold CV, λ=0)
| Feature | AUROC | Std |
|---|---|---|
| dots | 0.956 | 0.035 |
| dist6 | 0.936 | 0.067 |
| dist1 | 0.887 | 0.073 |
| angle | 0.817 | 0.138 |
| edge2 | 0.813 | 0.031 |
| edge1 | 0.796 | 0.138 |
| edge4 | 0.762 | 0.099 |
| edge3 | 0.744 | 0.076 |
### YOLO Feature Detector
| Metric | Value |
|---|---|
| Precision | 94.1% |
| Recall | 89.1% |
| **mAP@50** | **95.0%** |
| mAP@50-95 | 84.7% |
---
## Dataset
The **FTI_Zer0P** dataset contains inkjet print images with YOLO-format annotations and quality labels.
| Path | Contents |
|---|---|
| `data/yolo_format_v2/` | Full YOLO dataset (train/val/test images + labels) |
| `data/metadata/feature_cls.csv` | Per-sample metadata (feature class, quality label, template type) |
| `data/crop_examples/` | Sample 128×128 crops for 6 print features |
---
## Contents of This Repository
```
ahmed-3m/InkjetOOD (HuggingFace)
├── models/
│   ├── cdm_v3_yolo_bbox.pt               ← CDM λ=0.01 (thesis proposed)
│   ├── cdm_v3_baseline.pt                ← CDM λ=0 baseline (thesis CV result)
│   ├── cdm_v3_test.pt                    ← Development checkpoint
│   ├── yolo_best.pt                      ← YOLOv8 feature detector
│   ├── semantic_mismatch_angle_model.pt
│   ├── semantic_mismatch_dist1_model.pt
│   └── semantic_mismatch_dots_model.pt
│
├── data/
│   ├── yolo_format_v2/                   ← YOLO dataset (images + labels)
│   ├── metadata/                         ← feature_cls.csv, yolo_dataset.yaml
│   └── crop_examples/                    ← sample crops for 6 features
│
└── results/
    ├── cv_lambda_ablation/               ← 5-fold CV results for λ ∈ {0, 0.01, 0.02, 0.05}
    ├── single_split_lambda_ablation/     ← single-split ablation results
    ├── figures/                          ← all thesis figures (PNG)
    └── tables/                           ← all thesis tables (LaTeX)
```
---
## Citation
```bibtex
@mastersthesis{mohammed2026inkjet,
title = {Conditional Diffusion Models as Generative Classifiers for
Out-of-Distribution Detection in Inkjet Print Quality Control},
author = {Mohammed, Ahmed},
school = {Johannes Kepler University Linz},
year = {2026},
type = {Master's Thesis}
}
```
---
## License
Code (GitHub): MIT License.
Dataset (FTI_Zer0P) and model weights: Creative Commons Attribution 4.0 International (CC BY 4.0).