	---
	library_name: pytorch
	tags:
	- conditional-diffusion
	- inkjet-quality-control
	- yolo
	- out-of-distribution-detection
	- anomaly-detection
	- thesis
	license: cc-by-4.0
	---

	# InkjetOOD — Pretrained Models & Dataset

	[![GitHub Code](https://img.shields.io/badge/GitHub-ahmed--3m%2FInkjetOOD-black)](https://github.com/ahmed-3m/InkjetOOD)
	[![Companion Repo](https://img.shields.io/badge/🤗%20HuggingFace-DiffusionOOD%20(CIFAR--10)-blue)](https://huggingface.co/ahmed-3m/DiffusionOOD)

	Thesis: Conditional Diffusion Models as Generative Classifiers for Out-of-Distribution Detection in Inkjet Print Quality Control
	Author: Ahmed Mohammed — MSc AI, Johannes Kepler University Linz (2026)
	Supervisor: Univ.-Prof. Dr. Sepp Hochreiter · Industrial Partner: PROFACTOR GmbH

	---

	## What Is This?

	This HuggingFace repository stores all artifacts for the inkjet print quality control experiments from the above thesis:

	- Trained model weights (CDM + YOLO feature detector)
	- Inkjet print dataset (FTI_Zer0P format: images + labels)
	- All experiment results (metrics, score CSVs, figures, tables)

	The code lives on GitHub: [https://github.com/ahmed-3m/InkjetOOD](https://github.com/ahmed-3m/InkjetOOD)

	---

	## Quick Start

	### Option A — Evaluate the pretrained model (fastest, ~5 min)

	```bash
	# 1. Clone the code repo
	git clone https://github.com/ahmed-3m/InkjetOOD
	cd InkjetOOD
	pip install -r requirements.txt

	# 2. Download weights from this HF repo
	python download_weights.py

	# 3. Set your dataset path
	export INKJET_DATA_DIR=/path/to/FTI_Zer0P_dataset

	# 4. Evaluate
	python evaluate.py \
	--checkpoint models/cdm_proposed.pt \
	--num_trials 100 \
	--out_dir results/eval_pretrained
	```

	Expected: AUROC ≈ 0.860 (single-split, N=266 test samples, seed=42).

	---

	### Option B — Train from scratch (single split)

	```bash
	export INKJET_DATA_DIR=/path/to/FTI_Zer0P_dataset
	python download_weights.py # fetches YOLO weights

	# Baseline (λ=0)
	python train.py --epochs 100 --batch_size 128 --sep_loss_weight 0.0 \
	--eval_after --out_dir results/train_baseline

	# Proposed (λ=0.01)
	python train.py --epochs 100 --batch_size 64 --sep_loss_weight 0.01 \
	--eval_after --out_dir results/train_proposed
	```

	---

	### Option C — 5-Fold cross-validation (thesis main result)

	```bash
	export INKJET_DATA_DIR=/path/to/FTI_Zer0P_dataset

	# Reproduces thesis Table 6.x: AUROC 0.8673 ± 0.023
	python run_cv.py \
	--sep_loss_weight 0.0 \
	--epochs 100 --batch_size 128 \
	--num_trials 50 --n_folds 5 --seed 42 \
	--out_dir results/cv_lambda0
	```

	---

	## Step-by-Step: Load and Use Pretrained Weights

	Step 1 — Download weights

	Using the helper script (recommended):
	```bash
	git clone https://github.com/ahmed-3m/InkjetOOD && cd InkjetOOD
	python download_weights.py
	```

	Or download manually from this repo's `models/` folder:
	```python
	from huggingface_hub import hf_hub_download
	import torch

	# Download the proposed CDM (λ=0.01, single-split AUROC 0.860)
	path = hf_hub_download(repo_id="ahmed-3m/InkjetOOD", filename="models/cdm_v3_yolo_bbox.pt")

	# weights_only=False unpickles arbitrary Python objects, so only load files you trust.
	ckpt = torch.load(path, map_location="cpu", weights_only=False)

	# The weights are stored under "model_state_dict" or "state_dict"; accept either key.
	state_dict = ckpt["model_state_dict"] if "model_state_dict" in ckpt else ckpt["state_dict"]
	print(f"Loaded {sum(v.numel() for v in state_dict.values()):,} parameters")
	```

	Step 2 — Load the model

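	```python
	import torch

	from src.model import NoisePredictorV3
	from src.diffusion import DiffusionSchedule

	# Fall back to CPU when no GPU is available
	device = "cuda" if torch.cuda.is_available() else "cpu"

	# base_channels=64 matches the proposed CDM; cdm_v3_baseline.pt uses base_ch=128
	model = NoisePredictorV3(base_channels=64).to(device)
	schedule = DiffusionSchedule(schedule="cosine", device=device)

	# Local path from Quick Start Option A; alternatively use the `path` from Step 1
	ckpt = torch.load("models/cdm_proposed.pt", map_location=device, weights_only=False)
	state_dict = ckpt["model_state_dict"] if "model_state_dict" in ckpt else ckpt["state_dict"]
	model.load_state_dict(state_dict)
	model.eval()  # inference mode: disables dropout and batch-norm updates
	print("Model loaded.")
	```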
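
The checkpoint key is given above as either `state_dict` or `model_state_dict`. A minimal sketch of a helper that accepts both layouts is shown here. `extract_state_dict` is a hypothetical name and not part of the InkjetOOD repo, and the fallback to a bare state dict is an assumption.

```python
from collections.abc import Mapping

# Hypothetical helper (not part of the InkjetOOD repo): pull the weight
# mapping out of a checkpoint regardless of which key it was saved under.
CANDIDATE_KEYS = ("model_state_dict", "state_dict")  # the two keys named above


def extract_state_dict(ckpt):
    """Return the parameter mapping stored in a loaded checkpoint."""
    if not isinstance(ckpt, Mapping):
        raise TypeError(f"expected a dict-like checkpoint, got {type(ckpt).__name__}")
    for key in CANDIDATE_KEYS:
        if key in ckpt:
            return ckpt[key]
    # Assumption: a checkpoint with neither key is itself a bare state dict.
    return ckpt
```

With this in place, loading becomes `model.load_state_dict(extract_state_dict(ckpt))` for any of the checkpoints listed under Models, provided the model is built with the matching `base_channels`.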

	Step 3 — Run evaluation

	```python
	from src.evaluate import evaluate_cdm, save_results
	# ... (set up test_loader with your data)
	scores_df = evaluate_cdm(model, schedule, test_loader, num_trials=100, device=device)
	save_results(scores_df, out_dir="results/my_eval")
	```

	For full data loading details, see [`evaluate.py`](https://github.com/ahmed-3m/InkjetOOD/blob/main/evaluate.py) in the GitHub repo.

	Step 4 — Load the YOLO detector

	```python
	from ultralytics import YOLO
	yolo = YOLO("models/yolo_best.pt")
	results = yolo("path/to/print_image.png", conf=0.3)
	# Each detection is one print feature crop
	```
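
To turn the detections into per-feature crops, something like the following sketch may help. It relies on the Ultralytics `Results` attributes `boxes.xyxy`, `boxes.cls`, `boxes.conf`, `names`, and `orig_img`; the function names `clamp_box` and `detect_feature_crops` are hypothetical, and the repo's own cropping code may differ (for example in resizing crops to 128×128).

```python
def clamp_box(x1, y1, x2, y2, width, height):
    """Round a float xyxy box to integers and clip it to the image bounds."""
    left = max(0, min(int(round(x1)), width))
    top = max(0, min(int(round(y1)), height))
    right = max(0, min(int(round(x2)), width))
    bottom = max(0, min(int(round(y2)), height))
    return left, top, right, bottom


def detect_feature_crops(image_path, weights="models/yolo_best.pt", conf=0.3):
    """Run the detector and return (class_name, confidence, crop) triples."""
    # Imported lazily so clamp_box can be used without ultralytics installed.
    from ultralytics import YOLO

    result = YOLO(weights)(image_path, conf=conf)[0]  # one Results object per image
    image = result.orig_img  # original image as a NumPy array of shape (H, W, C)
    height, width = image.shape[:2]

    crops = []
    for xyxy, cls_id, score in zip(
        result.boxes.xyxy.tolist(),
        result.boxes.cls.tolist(),
        result.boxes.conf.tolist(),
    ):
        left, top, right, bottom = clamp_box(*xyxy, width, height)
        crops.append((result.names[int(cls_id)], score, image[top:bottom, left:right]))
    return crops
```

Calling `detect_feature_crops("path/to/print_image.png")` returns one entry per detected print feature, which can then be resized and passed to the CDM.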

	---

	## Models

	| File | Description | AUROC (mAP@50 for YOLO) | Params |
	|---|---|---|---|
	| `models/cdm_v3_yolo_bbox.pt` | CDM λ=0.01, base_ch=64 (proposed) | 0.8603 single-split | 9.33 M |
	| `models/cdm_v3_baseline.pt` | CDM λ=0, base_ch=128 (thesis CV result) | 0.8673 ± 0.023 CV | 34.2 M |
	| `models/cdm_v3_test.pt` | CDM λ=0.01, base_ch=64 (dev checkpoint) | ≈0.85 | 9.33 M |
	| `models/yolo_best.pt` | YOLOv8 feature detector (8 print features) | mAP@50=0.950 | 25.86 M |
	| `models/semantic_mismatch_angle_model.pt` | Per-feature CDM: angle (~9.67:1 imbalance) | ~0.82 | 8.94 M |
	| `models/semantic_mismatch_dist1_model.pt` | Per-feature CDM: dist1 | ~0.89 | 8.94 M |
	| `models/semantic_mismatch_dots_model.pt` | Per-feature CDM: dots (best feature) | ~0.96 | 8.94 M |

	### Which model should I use?

	- Quick evaluation: `cdm_v3_yolo_bbox.pt` — the thesis "proposed CDM", λ=0.01, base_ch=64 (9.33 M params)
	- Reproducing the thesis CV result: `cdm_v3_baseline.pt` — λ=0, base_ch=128 (34.2 M params); the 5-fold CV AUROC 0.8673 ± 0.023 comes from this wider model
	- YOLO feature detection only: `yolo_best.pt` (25.86 M params, YOLOv8-based)
	- Per-feature analysis: `semantic_mismatch_*.pt` — one model per feature, 8.94 M each, trained with a DiffGuard-style contrastive approach

	> Why does the baseline win on CV? On this small dataset (~1330 samples), the 5-fold CV shows λ=0 and λ=0.01 are statistically indistinguishable: the mean AUROCs differ by only 0.0045 (0.8673 vs 0.8628), well below the reported standard deviations (0.023 and 0.029). The single-split evaluation clearly favors λ=0.01 (+2.8 pp AUROC, −19 pp FPR@95), but it rests on a single partition with N=266 test samples. The thesis reports the CV result as the primary finding since it is more reliable.
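
The figures quoted in the note above can be rechecked with a few lines of arithmetic. The values below are copied from the cross-validation and single-split tables in the Results section; nothing is recomputed from raw scores.

```python
# Values copied from the Results tables of this README.
cv = {0.0: (0.8673, 0.023), 0.01: (0.8628, 0.029)}      # λ -> (mean AUROC, std)
single = {0.0: (0.8325, 0.816), 0.01: (0.8603, 0.626)}  # λ -> (AUROC, FPR@95)

# Gap between the two CV means, compared against the smaller reported std.
cv_gap = cv[0.0][0] - cv[0.01][0]
smallest_std = min(std for _, std in cv.values())

# Single-split differences, expressed in percentage points (pp).
auroc_gain_pp = (single[0.01][0] - single[0.0][0]) * 100
fpr_drop_pp = (single[0.0][1] - single[0.01][1]) * 100

print(f"CV gap: {cv_gap:.4f} (smallest std: {smallest_std:.3f})")
print(f"Single-split AUROC gain: {auroc_gain_pp:.1f} pp")
print(f"Single-split FPR@95 drop: {fpr_drop_pp:.1f} pp")
```

The CV gap is several times smaller than either standard deviation, which is why the two settings are treated as equivalent under cross-validation.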

	---

	## Results

	### 5-Fold Cross-Validation (Thesis Main Result)

	| λ | Mean AUROC | Std | Mean FPR@95 |
	|---|---|---|---|
	| 0.0 (baseline) | 0.8673 | 0.023 | 0.563 |
	| 0.01 | 0.8628 | 0.029 | 0.552 |
	| 0.02 | 0.8510 | 0.033 | 0.624 |
	| 0.05 | 0.8670 | 0.026 | 0.570 |
	Protocol: 5-fold stratified CV, K=50 Monte Carlo trials, seed=42.
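
For readers unfamiliar with the protocol, stratification means every fold keeps roughly the same class proportions as the full dataset. The sketch below illustrates the idea with the standard library only. It is not the code in `run_cv.py`, and `stratified_folds` and `train_test_splits` are hypothetical names.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=42):
    """Assign each sample index to a fold so that every fold keeps
    roughly the same class proportions as the full dataset."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(n_folds)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin; the running offset keeps the total
        # fold sizes balanced when class counts are not multiples of n_folds.
        for position, index in enumerate(indices):
            folds[(offset + position) % n_folds].append(index)
        offset += len(indices)
    return folds


def train_test_splits(folds):
    """Yield (train_indices, test_indices), holding out one fold at a time."""
    for held_out, test in enumerate(folds):
        train = [i for k, fold in enumerate(folds) if k != held_out for i in fold]
        yield sorted(train), sorted(test)
```

Because the generator is seeded, repeated calls with the same labels produce identical folds, which is what makes a fixed-seed protocol reproducible.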

	### Single-Split Evaluation (N=266 test, seed=42)

	| λ | AUROC | FPR@95 | K |
	|---|---|---|---|
	| 0.0 | 0.8325 | 0.816 | 100 |
	| 0.01 | 0.8603 | 0.626 | 100 |
	| 0.02 | 0.8541 | 0.661 | 100 |
	| 0.05 | 0.8553 | 0.529 | 100 |
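
As a reminder of what the two columns measure, here is a dependency-free sketch of AUROC and FPR@95. It assumes label 1 marks the positive (out-of-distribution) class and that higher scores mean "more anomalous"; the sign convention in the repo's score CSVs may be the opposite, in which case negate the scores first. The function names are hypothetical.

```python
import math


def _split(scores, labels):
    """Separate scores into positives (label 1) and negatives (label 0)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    return positives, negatives


def auroc(scores, labels):
    """Probability that a random positive outscores a random negative
    (ties count one half); this equals the area under the ROC curve."""
    positives, negatives = _split(scores, labels)
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def fpr_at_tpr(scores, labels, target_tpr=0.95):
    """False positive rate at the highest threshold whose TPR reaches target_tpr."""
    positives, negatives = _split(scores, labels)
    positives.sort(reverse=True)
    # Number of positives that must be flagged; the epsilon guards against
    # float round-off pushing an exact product just above an integer.
    needed = max(1, math.ceil(target_tpr * len(positives) - 1e-9))
    threshold = positives[needed - 1]  # flag every sample with score >= threshold
    return sum(n >= threshold for n in negatives) / len(negatives)
```

The pairwise AUROC is quadratic in the number of samples, which is fine at N=266 but would call for a rank-based implementation on larger test sets.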

	### Per-Feature AUROC (5-Fold CV, λ=0)

	| Feature | AUROC | Std |
	|---|---|---|
	| dots | 0.956 | 0.035 |
	| dist6 | 0.936 | 0.067 |
	| dist1 | 0.887 | 0.073 |
	| angle | 0.817 | 0.138 |
	| edge2 | 0.813 | 0.031 |
	| edge1 | 0.796 | 0.138 |
	| edge4 | 0.762 | 0.099 |
	| edge3 | 0.744 | 0.076 |

	### YOLO Feature Detector

	| Metric | Value |
	|---|---|
	| Precision | 94.1% |
	| Recall | 89.1% |
	| mAP@50 | 95.0% |
	| mAP@50-95 | 84.7% |

	---

	## Dataset

	The FTI_Zer0P dataset contains inkjet print images with YOLO-format annotations and quality labels.

	| Path | Contents |
	|---|---|
	| `data/yolo_format_v2/` | Full YOLO dataset (train/val/test images + labels) |
	| `data/metadata/feature_cls.csv` | Per-sample metadata (feature class, quality label, template type) |
	| `data/crop_examples/` | Sample 128×128 crops for 6 print features |
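
YOLO-format label files store one bounding box per line as `class x_center y_center width height`, with all four coordinates normalized to the image size. The sketch below converts such lines to pixel boxes. It assumes plain detection labels with five fields per line; the function names are hypothetical, and the exact folder layout under `data/yolo_format_v2/` should be checked in the repository itself.

```python
def parse_yolo_label_line(line, image_width, image_height):
    """Convert one 'class xc yc w h' line (normalized coordinates) to
    (class_id, left, top, right, bottom) in pixels."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    class_id = int(fields[0])
    xc, yc, w, h = (float(value) for value in fields[1:])
    left = (xc - w / 2) * image_width
    top = (yc - h / 2) * image_height
    right = (xc + w / 2) * image_width
    bottom = (yc + h / 2) * image_height
    return class_id, left, top, right, bottom


def read_yolo_labels(label_path, image_width, image_height):
    """Parse every non-empty line of a YOLO label file."""
    with open(label_path, encoding="utf-8") as handle:
        return [
            parse_yolo_label_line(line, image_width, image_height)
            for line in handle
            if line.strip()
        ]
```

For example, the line `3 0.5 0.5 0.25 0.5` on a 640×480 image describes a box of class 3 spanning pixels (240, 120) to (400, 360).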

	---

	## Contents of This Repository

	```
	ahmed-3m/InkjetOOD (HuggingFace)
	├── models/
	│   ├── cdm_v3_yolo_bbox.pt  ← CDM λ=0.01 (thesis proposed)
	│   ├── cdm_v3_baseline.pt  ← CDM λ=0 baseline (thesis CV result)
	│   ├── cdm_v3_test.pt  ← Development checkpoint
	│   ├── yolo_best.pt  ← YOLOv8 feature detector
	│   ├── semantic_mismatch_angle_model.pt
	│   ├── semantic_mismatch_dist1_model.pt
	│   └── semantic_mismatch_dots_model.pt
	│
	├── data/
	│   ├── yolo_format_v2/  ← YOLO dataset (images + labels)
	│   ├── metadata/  ← feature_cls.csv, yolo_dataset.yaml
	│   └── crop_examples/  ← sample crops for 6 features
	│
	└── results/
	    ├── cv_lambda_ablation/  ← 5-fold CV results for λ ∈ {0, 0.01, 0.02, 0.05}
	    ├── single_split_lambda_ablation/  ← single-split ablation results
	    ├── figures/  ← all thesis figures (PNG)
	    └── tables/  ← all thesis tables (LaTeX)
	```
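
If only part of the tree above is needed, `huggingface_hub.snapshot_download` accepts glob patterns through `allow_patterns`. The following is a sketch rather than a replacement for `download_weights.py`: the helper names and the `hf_artifacts` target folder are hypothetical, and `would_download` only approximates the library's matching by using `fnmatch`.

```python
from fnmatch import fnmatch

REPO_ID = "ahmed-3m/InkjetOOD"
MODEL_PATTERNS = ["models/*.pt"]  # glob patterns relative to the repo root


def would_download(filename, patterns=MODEL_PATTERNS):
    """Preview helper: does a repo file match any of the glob patterns?"""
    return any(fnmatch(filename, pattern) for pattern in patterns)


def download_subset(patterns=MODEL_PATTERNS, local_dir="hf_artifacts"):
    """Download only the files matching `patterns` and return the local folder."""
    # Imported lazily; calling this function requires network access.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=list(patterns),
        local_dir=local_dir,
    )
```

Passing `patterns=["results/*"]` instead would fetch the experiment results without the model weights or the dataset.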

	---

	## Citation

	```bibtex
	@mastersthesis{mohammed2026inkjet,
	title = {Conditional Diffusion Models as Generative Classifiers for
	Out-of-Distribution Detection in Inkjet Print Quality Control},
	author = {Mohammed, Ahmed},
	school = {Johannes Kepler University Linz},
	year = {2026},
	type = {Master's Thesis}
	}
	```

	---

	## License

	Code (GitHub): MIT License.
	Dataset (FTI_Zer0P) and model weights: Creative Commons Attribution 4.0 International (CC BY 4.0).