---
license: apache-2.0
tags:
- image-classification
- moire-detection
- document-analysis
- document-quality
- vision
datasets:
- hf-tuner/rvl-cdip-document-classification
metrics:
- accuracy
- f1
- precision
- recall
pipeline_tag: image-classification
---
# Document Moiré Detection Model (V2)
A fine-tuned **DeiT-small** Vision Transformer for detecting moiré patterns in document images.
## Model Description
A binary classifier that detects whether a document image contains moiré artifacts,
which commonly appear in screen photographs, scans, and screen re-captures of documents.
**Labels:**
- `clean` (0): No moiré patterns
- `moire` (1): Moiré patterns detected
## V1 → V2 Comparison
| | V1 (DeiT-tiny) | **V2 (DeiT-small)** |
|---|---|---|
| **Parameters** | 5.5M | **22M** |
| **Training samples** | 6,000 | **8,000** |
| **Moiré methods** | 4 | **6** (+subtle, +localized) |
| **Label smoothing** | — | **0.05** |
| **Accuracy** | 99.5% | **99.1%** |
| **F1 Score** | 0.995 | **0.991** |
| **Precision** | 99.3% | **98.5%** |
| **Recall** | 99.7% | **99.8%** |
> **Note:** V2 was evaluated on harder examples, including the subtle single-frequency and localized
> moiré patterns that V1 was never trained on. V2 achieves near-perfect recall (99.75%): it catches
> virtually all moiré patterns, including very subtle ones, at the cost of slightly lower precision.
## Training Details
| Parameter | Value |
|-----------|-------|
| Base model | `facebook/deit-small-patch16-224` |
| Parameters | 22M |
| Training samples | 8,000 (4,000 clean + 4,000 moiré) |
| Eval samples | 800 (400 clean + 400 moiré) |
| Epochs | 5 |
| Learning rate | 3e-5 (cosine schedule) |
| Effective batch size | 64 |
| Label smoothing | 0.05 |
| Warmup steps | 60 |
| Best checkpoint | Epoch 4 (by F1) |
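The training script itself is not part of this card. As a rough illustration, the hyperparameters above map onto Hugging Face `TrainingArguments` roughly as follows; the output path, the per-device batch size / gradient accumulation split, and the evaluation strategy are assumptions rather than values taken from the table.

```python
from transformers import TrainingArguments

# Sketch only: hyperparameters from the table above; "assumed" values are illustrative.
training_args = TrainingArguments(
    output_dir="moire-detector-v2",       # assumed
    num_train_epochs=5,
    learning_rate=3e-5,
    lr_scheduler_type="cosine",
    warmup_steps=60,
    per_device_train_batch_size=16,       # assumed split: 16 x 4 accumulation = effective batch 64
    gradient_accumulation_steps=4,
    label_smoothing_factor=0.05,
    eval_strategy="epoch",                # named evaluation_strategy on older transformers releases
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="f1",           # best checkpoint selected by F1
)
```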
### Moiré Generation Methods
1. **Resize aliasing** — downscale+upscale with NEAREST interpolation + pattern overlay
2. **Pattern overlay** — sinusoidal interference with per-channel color variation (sketched after this list)
3. **Multi-frequency** — 2-4 patterns at different frequencies + color displacement
4. **Screen simulation** — pixel grid + rotation + moiré overlay
5. **Subtle moiré** — very low strength single-frequency (hard examples)
6. **Localized moiré** — moiré in elliptical region with gaussian mask
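The exact augmentation code is not published in this card. The snippet below is a minimal sketch of a single-frequency sinusoidal overlay in the spirit of methods 2 and 5, using NumPy and Pillow; the function name and parameter values are illustrative assumptions, not the actual training pipeline.

```python
import numpy as np
from PIL import Image

def add_sinusoidal_moire(image, frequency=0.35, angle_deg=15.0, strength=0.12):
    """Overlay a single-frequency sinusoidal pattern with small per-channel phase shifts."""
    img = np.asarray(image.convert("RGB"), dtype=np.float32) / 255.0
    h, w, _ = img.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(np.float32)
    theta = np.deg2rad(angle_deg)
    coord = xx * np.cos(theta) + yy * np.sin(theta)   # project pixels onto the pattern direction
    out = img.copy()
    for c, phase in enumerate((0.0, 0.5, 1.0)):       # per-channel phase offset = color variation
        pattern = np.sin(2 * np.pi * frequency * coord + phase)
        out[..., c] = np.clip(img[..., c] + strength * 0.5 * pattern, 0.0, 1.0)
    return Image.fromarray((out * 255).astype(np.uint8))

# Low "strength" values approximate the subtle moiré of method 5.
moire_img = add_sinusoidal_moire(Image.open("clean_document.png"))
```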
## Performance (Best Checkpoint)
| Metric | Value |
|--------|-------|
| Accuracy | 99.12% |
| F1 Score | 0.9913 |
| Precision | 98.52% |
| Recall | 99.75% |
| Eval Loss | 0.170 |
### Training Progression
| Epoch | Eval Loss | Accuracy | F1 | Precision | Recall |
|-------|-----------|----------|-----|-----------|--------|
| 1 | 0.173 | 99.0% | 0.990 | 99.3% | 98.8% |
| 2 | 0.177 | 99.1% | **0.991** | 99.0% | 99.3% |
| 3 | 0.154 | 99.1% | 0.991 | 99.3% | 99.0% |
| 4 | 0.170 | 99.1% | **0.991** | 98.5% | **99.8%** |
| 5 | 0.168 | 99.0% | 0.990 | 98.3% | **99.8%** |
## Usage
```python
from transformers import pipeline
classifier = pipeline("image-classification", model="Jwalit/document-moire-detector")
result = classifier("path/to/document.jpg")
print(result)
# [{'label': 'clean', 'score': 0.99}, {'label': 'moire', 'score': 0.01}]
```
Or manually:
```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import torch
processor = AutoImageProcessor.from_pretrained("Jwalit/document-moire-detector")
model = AutoModelForImageClassification.from_pretrained("Jwalit/document-moire-detector")
image = Image.open("document.jpg").convert("RGB")  # ensure 3-channel input
inputs = processor(image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predicted = logits.argmax(-1).item()
print(model.config.id2label[predicted])  # 'clean' or 'moire'
```
## Limitations
- Trained on synthetic moiré patterns — may not capture all real-world moiré variations
- Optimized for document images; performance on natural scene images may vary
- Input images resized to 224×224; very subtle moiré in high-resolution images may be lost
- Higher recall than precision — may occasionally flag clean images as moiré (false positive rate ~1.5%); raising the decision threshold, as sketched below, trades some recall for precision
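If false positives on clean pages are costly in your application, one option is to raise the decision threshold on the `moire` probability instead of taking the argmax. This is a usage pattern rather than part of the released model; the threshold value below is illustrative and should be tuned on your own validation data.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

processor = AutoImageProcessor.from_pretrained("Jwalit/document-moire-detector")
model = AutoModelForImageClassification.from_pretrained("Jwalit/document-moire-detector")

MOIRE_THRESHOLD = 0.8  # illustrative; tune on a validation set that matches your data

image = Image.open("document.jpg").convert("RGB")
inputs = processor(image, return_tensors="pt")
with torch.no_grad():
    probs = model(**inputs).logits.softmax(-1).squeeze(0)

moire_prob = probs[model.config.label2id["moire"]].item()  # label2id mirrors the id2label mapping above
label = "moire" if moire_prob >= MOIRE_THRESHOLD else "clean"
print(label, round(moire_prob, 4))
```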