---
license: apache-2.0
tags:
- image-classification
- moire-detection
- document-analysis
- document-quality
- vision
datasets:
- hf-tuner/rvl-cdip-document-classification
metrics:
- accuracy
- f1
- precision
- recall
pipeline_tag: image-classification
---
# Document Moiré Detection Model (V2)
A fine-tuned **DeiT-small** Vision Transformer for detecting moiré patterns in document images.
## Model Description
A binary classifier that detects whether a document image contains moiré artifacts,
which commonly appear in screen photographs, scans, and screen re-captures of documents.
**Labels:**
- `clean` (0): No moiré patterns
- `moire` (1): Moiré patterns detected
## V1 → V2 Comparison
| | V1 (DeiT-tiny) | **V2 (DeiT-small)** |
|---|---|---|
| **Parameters** | 5.5M | **22M** |
| **Training samples** | 6,000 | **8,000** |
| **Moiré methods** | 4 | **6** (+subtle, +localized) |
| **Label smoothing** | — | **0.05** |
| **Accuracy** | 99.5% | **99.1%** |
| **F1 Score** | 0.995 | **0.991** |
| **Precision** | 99.3% | **98.5%** |
| **Recall** | 99.7% | **99.8%** |
> **Note:** V2 was evaluated on harder examples, including the subtle single-frequency and localized
> moiré patterns that V1 was never trained on. V2 achieves near-perfect recall (99.75%): it catches
> virtually all moiré patterns, including very subtle ones, at the cost of slightly lower precision.
## Training Details
| Parameter | Value |
|-----------|-------|
| Base model | `facebook/deit-small-patch16-224` |
| Parameters | 22M |
| Training samples | 8,000 (4,000 clean + 4,000 moiré) |
| Eval samples | 800 (400 clean + 400 moiré) |
| Epochs | 5 |
| Learning rate | 3e-5 (cosine schedule) |
| Effective batch size | 64 |
| Label smoothing | 0.05 |
| Warmup steps | 60 |
| Best checkpoint | Epoch 4 (by F1) |
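The training script itself is not part of this card. As a rough illustration, the hyperparameters above map onto Hugging Face `TrainingArguments` roughly as follows; the output path, the per-device batch size / gradient accumulation split, and the evaluation strategy are assumptions rather than values taken from the table.

```python
from transformers import TrainingArguments

# Sketch only: hyperparameters from the table above; "assumed" values are illustrative.
training_args = TrainingArguments(
    output_dir="moire-detector-v2",       # assumed
    num_train_epochs=5,
    learning_rate=3e-5,
    lr_scheduler_type="cosine",
    warmup_steps=60,
    per_device_train_batch_size=16,       # assumed split: 16 x 4 accumulation = effective batch 64
    gradient_accumulation_steps=4,
    label_smoothing_factor=0.05,
    eval_strategy="epoch",                # named evaluation_strategy on older transformers releases
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="f1",           # best checkpoint selected by F1
)
```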
### Moiré Generation Methods
1. **Resize aliasing** — downscale+upscale with NEAREST interpolation + pattern overlay
2. **Pattern overlay** — sinusoidal interference with per-channel color variation (sketched after this list)
3. **Multi-frequency** — 2-4 patterns at different frequencies + color displacement
4. **Screen simulation** — pixel grid + rotation + moiré overlay
5. **Subtle moiré** — very low strength single-frequency (hard examples)
6. **Localized moiré** — moiré in elliptical region with gaussian mask
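The exact augmentation code is not published in this card. The snippet below is a minimal sketch of a single-frequency sinusoidal overlay in the spirit of methods 2 and 5, using NumPy and Pillow; the function name and parameter values are illustrative assumptions, not the actual training pipeline.

```python
import numpy as np
from PIL import Image

def add_sinusoidal_moire(image, frequency=0.35, angle_deg=15.0, strength=0.12):
    """Overlay a single-frequency sinusoidal pattern with small per-channel phase shifts."""
    img = np.asarray(image.convert("RGB"), dtype=np.float32) / 255.0
    h, w, _ = img.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(np.float32)
    theta = np.deg2rad(angle_deg)
    coord = xx * np.cos(theta) + yy * np.sin(theta)   # project pixels onto the pattern direction
    out = img.copy()
    for c, phase in enumerate((0.0, 0.5, 1.0)):       # per-channel phase offset = color variation
        pattern = np.sin(2 * np.pi * frequency * coord + phase)
        out[..., c] = np.clip(img[..., c] + strength * 0.5 * pattern, 0.0, 1.0)
    return Image.fromarray((out * 255).astype(np.uint8))

# Low "strength" values approximate the subtle moiré of method 5.
moire_img = add_sinusoidal_moire(Image.open("clean_document.png"))
```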
## Performance (Best Checkpoint)
| Metric | Value |
|--------|-------|
| Accuracy | 99.12% |
| F1 Score | 0.9913 |
| Precision | 98.52% |
| Recall | 99.75% |
| Eval Loss | 0.170 |
### Training Progression
| Epoch | Eval Loss | Accuracy | F1 | Precision | Recall |
|-------|-----------|----------|-----|-----------|--------|
| 1 | 0.173 | 99.0% | 0.990 | 99.3% | 98.8% |
| 2 | 0.177 | 99.1% | **0.991** | 99.0% | 99.3% |
| 3 | 0.154 | 99.1% | 0.991 | 99.3% | 99.0% |
| 4 | 0.170 | 99.1% | **0.991** | 98.5% | **99.8%** |
| 5 | 0.168 | 99.0% | 0.990 | 98.3% | **99.8%** |
## Usage
```python
from transformers import pipeline
classifier = pipeline("image-classification", model="Jwalit/document-moire-detector")
result = classifier("path/to/document.jpg")
print(result)
# [{'label': 'clean', 'score': 0.99}, {'label': 'moire', 'score': 0.01}]
```
Or manually:
```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import torch
processor = AutoImageProcessor.from_pretrained("Jwalit/document-moire-detector")
model = AutoModelForImageClassification.from_pretrained("Jwalit/document-moire-detector")
image = Image.open("document.jpg").convert("RGB")  # ensure 3-channel input
inputs = processor(image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
predicted = logits.argmax(-1).item()
print(model.config.id2label[predicted])  # 'clean' or 'moire'
```
## Limitations
- Trained on synthetic moiré patterns — may not capture all real-world moiré variations
- Optimized for document images; performance on natural scene images may vary
- Input images resized to 224×224; very subtle moiré in high-resolution images may be lost
- Higher recall than precision — may occasionally flag clean images as moiré (false positive rate ~1.5%); raising the decision threshold, as sketched below, trades some recall for precision
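If false positives on clean pages are costly in your application, one option is to raise the decision threshold on the `moire` probability instead of taking the argmax. This is a usage pattern rather than part of the released model; the threshold value below is illustrative and should be tuned on your own validation data.

```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

processor = AutoImageProcessor.from_pretrained("Jwalit/document-moire-detector")
model = AutoModelForImageClassification.from_pretrained("Jwalit/document-moire-detector")

MOIRE_THRESHOLD = 0.8  # illustrative; tune on a validation set that matches your data

image = Image.open("document.jpg").convert("RGB")
inputs = processor(image, return_tensors="pt")
with torch.no_grad():
    probs = model(**inputs).logits.softmax(-1).squeeze(0)

moire_prob = probs[model.config.label2id["moire"]].item()  # label2id mirrors the id2label mapping above
label = "moire" if moire_prob >= MOIRE_THRESHOLD else "clean"
print(label, round(moire_prob, 4))
```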