---
license: apache-2.0
tags:
- image-classification
- moire-detection
- document-analysis
- document-quality
- vision
datasets:
- hf-tuner/rvl-cdip-document-classification
metrics:
- accuracy
- f1
- precision
- recall
pipeline_tag: image-classification
---

# Document Moiré Detection Model (V2)

A fine-tuned **DeiT-small** Vision Transformer for detecting moiré patterns in document images.

## Model Description

Binary classifier that predicts whether a document image contains moiré artifacts
(commonly introduced by photographing a screen, scanning, or screen capture).

**Labels:**
- `clean` (0): No moiré patterns
- `moire` (1): Moiré patterns detected

## V1 → V2 Comparison

| | V1 (DeiT-tiny) | **V2 (DeiT-small)** |
|---|---|---|
| **Parameters** | 5.5M | **22M** |
| **Training samples** | 6,000 | **8,000** |
| **Moiré methods** | 4 | **6** (+subtle, +localized) |
| **Label smoothing** | — | **0.05** |
| **Accuracy** | 99.5% | **99.1%** |
| **F1 Score** | 0.995 | **0.991** |
| **Precision** | 99.3% | **98.5%** |
| **Recall** | 99.7% | **99.8%** |

> **Note:** V2 was evaluated on a harder set that includes subtle single-frequency moiré and localized
> moiré, neither of which V1 was trained on, so the V1 and V2 columns may not be directly comparable.
> On that set V2 reaches 99.75% recall, missing very few moiré images, at the cost of slightly lower precision (98.5%).

## Training Details

| Parameter | Value |
|-----------|-------|
| Base model | `facebook/deit-small-patch16-224` |
| Parameters | 22M |
| Training samples | 8,000 (4,000 clean + 4,000 moiré) |
| Eval samples | 800 (400 clean + 400 moiré) |
| Epochs | 5 |
| Learning rate | 3e-5 (cosine schedule) |
| Effective batch size | 64 |
| Label smoothing | 0.05 |
| Warmup steps | 60 |
| Best checkpoint | Epoch 4 (by F1) |
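
For orientation, the schedule and smoothed targets implied by the table can be worked out by hand. The sketch below assumes a linear warmup followed by a half-cosine decay to zero, and assumes 8,000 / 64 = 125 optimizer steps per epoch (625 steps over 5 epochs). The function names are illustrative and are not taken from the training script.

```python
import math

PEAK_LR = 3e-5                      # from the table
WARMUP_STEPS = 60                   # from the table
STEPS_PER_EPOCH = 8_000 // 64       # assumption: 125 optimizer steps per epoch
TOTAL_STEPS = STEPS_PER_EPOCH * 5   # 625 steps over 5 epochs


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then half-cosine decay to 0."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


def smoothed_targets(true_class: int, num_classes: int = 2, eps: float = 0.05) -> list[float]:
    """Label-smoothed targets: eps / K on every class, plus (1 - eps) on the true class."""
    base = eps / num_classes
    return [base + (1.0 - eps) if k == true_class else base for k in range(num_classes)]


def cross_entropy(targets, probs) -> float:
    """Cross-entropy (in nats) between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(targets, probs))


targets = smoothed_targets(1)             # approximately [0.025, 0.975] for a moiré image
floor = cross_entropy(targets, targets)   # lowest loss reachable against these targets
print(lr_at(30), lr_at(60), lr_at(TOTAL_STEPS))
print(round(floor, 3))
```

With two classes and smoothing 0.05, the targets become 0.975 / 0.025, and the cross-entropy against them cannot fall below roughly 0.117 nats. If the reported eval loss was computed against smoothed targets (an assumption; the card does not say), that floor puts the values near 0.17 in context.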

### Moiré Generation Methods
1. **Resize aliasing** — downscale+upscale with NEAREST interpolation + pattern overlay
2. **Pattern overlay** — sinusoidal interference with per-channel color variation
3. **Multi-frequency** — 2-4 patterns at different frequencies + color displacement
4. **Screen simulation** — pixel grid + rotation + moiré overlay
5. **Subtle moiré** — very low strength single-frequency (hard examples)
6. **Localized moiré** — moiré in elliptical region with gaussian mask
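
To make methods 2 and 6 concrete, here is a minimal NumPy sketch of a sinusoidal overlay with a different phase per color channel, and of blending that overlay into an elliptical region with a Gaussian falloff. It is an illustration of the idea only, not the generation code used for training; every function name, parameter, and default value is an assumption.

```python
import numpy as np


def sinusoidal_overlay(image, freq=0.35, angle_deg=12.0, strength=0.15, rng=None):
    """Add a sinusoidal stripe pattern with a random phase per channel.

    image: float array of shape (H, W, 3) with values in [0, 1].
    freq is in radians per pixel; strength is the pattern amplitude.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w, _ = image.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(np.float64)
    theta = np.deg2rad(angle_deg)
    coord = xx * np.cos(theta) + yy * np.sin(theta)   # position across the stripes
    out = image.astype(np.float64).copy()
    for c in range(3):
        phase = rng.uniform(0.0, 2.0 * np.pi)         # per-channel color variation
        out[..., c] += strength * np.sin(freq * coord + phase)
    return np.clip(out, 0.0, 1.0)


def localized(image, moire, center, axes, softness=0.5):
    """Blend `moire` into `image` inside a soft-edged elliptical region."""
    h, w, _ = image.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(np.float64)
    cy, cx = center
    ay, ax = axes
    # Squared normalized distance from the center: 1.0 on the ellipse boundary.
    d2 = ((yy - cy) / ay) ** 2 + ((xx - cx) / ax) ** 2
    mask = np.exp(-d2 / (2.0 * softness ** 2))[..., None]   # Gaussian falloff
    return mask * moire + (1.0 - mask) * image


page = np.full((224, 224, 3), 0.8)   # stand-in for a light gray document page
full = sinusoidal_overlay(page, rng=np.random.default_rng(0))
patch = localized(page, full, center=(112, 112), axes=(60, 40))
```

Lowering `strength` gives something in the spirit of the subtle single-frequency variant, and summing several overlays with different `freq` values approximates the multi-frequency one.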

## Performance (Best Checkpoint)

| Metric | Value |
|--------|-------|
| Accuracy | 99.12% |
| F1 Score | 0.9913 |
| Precision | 98.52% |
| Recall | 99.75% |
| Eval Loss | 0.170 |
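
As a sanity check, the headline numbers are consistent with a single confusion matrix. The sketch below recovers integer counts from the reported recall and precision, assuming the 400 clean + 400 moiré evaluation split listed under Training Details. The counts are derived here and are not reported by the author.

```python
N_MOIRE, N_CLEAN = 400, 400          # eval split from Training Details
RECALL, PRECISION = 0.9975, 0.9852   # reported values

tp = round(RECALL * N_MOIRE)         # moiré images correctly flagged
fn = N_MOIRE - tp                    # moiré images missed
fp = round(tp / PRECISION - tp)      # clean images wrongly flagged
tn = N_CLEAN - fp                    # clean images correctly passed

accuracy = (tp + tn) / (N_MOIRE + N_CLEAN)
f1 = 2 * tp / (2 * tp + fp + fn)
false_positive_rate = fp / N_CLEAN

print(tp, fn, fp, tn)   # 399 1 6 394
print(accuracy, f1, false_positive_rate)
```

Under that assumption, 1 missed moiré image and 6 wrongly flagged clean images reproduce the reported accuracy (793 / 800 = 99.125%) and F1 (about 0.9913), and correspond to a false positive rate of 6 / 400 = 1.5%.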

### Training Progression

| Epoch | Eval Loss | Accuracy | F1 | Precision | Recall |
|-------|-----------|----------|-----|-----------|--------|
| 1 | 0.173 | 99.0% | 0.990 | 99.3% | 98.8% |
| 2 | 0.177 | 99.1% | **0.991** | 99.0% | 99.3% |
| 3 | 0.154 | 99.1% | 0.991 | 99.3% | 99.0% |
| 4 | 0.170 | 99.1% | **0.991** | 98.5% | **99.8%** |
| 5 | 0.168 | 99.0% | 0.990 | 98.3% | **99.8%** |

## Usage

```python
from transformers import pipeline

classifier = pipeline("image-classification", model="Jwalit/document-moire-detector")
result = classifier("path/to/document.jpg")
print(result)
# [{'label': 'clean', 'score': 0.99}, {'label': 'moire', 'score': 0.01}]
```

Or load the processor and model manually:
```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import torch

processor = AutoImageProcessor.from_pretrained("Jwalit/document-moire-detector")
model = AutoModelForImageClassification.from_pretrained("Jwalit/document-moire-detector")

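# Convert to RGB: scanned documents are often grayscale or RGBA, the model expects 3 channels
image = Image.open("document.jpg").convert("RGB")
inputs = processor(image, return_tensors="pt")  # resize and normalize
with torch.no_grad():  # inference only, no gradients needed
    logits = model(**inputs).logits
    predicted = logits.argmax(-1).item()  # index of the higher-scoring class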

print(model.config.id2label[predicted])  # 'clean' or 'moire'
```
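
Taking the argmax of two logits is equivalent to a 0.5 threshold on the moiré probability. When false positives are costly, a stricter threshold can be applied instead. The sketch below is plain Python and assumes index 1 is `moire`, as in the label list above; the helper names and the 0.8 threshold are arbitrary illustrations, not tuned values.

```python
import math


def moire_probability(logits):
    """Softmax probability of class 1 (`moire`) from a pair of logits [clean, moire]."""
    m = max(logits)                           # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return exps[1] / sum(exps)


def is_moire(logits, threshold=0.8):
    """Flag an image only when the moiré probability reaches `threshold`."""
    return moire_probability(logits) >= threshold


# Hypothetical logits; with the model above they would come from `logits[0].tolist()`.
print(is_moire([0.2, 0.9]))    # False: probability is about 0.67
print(is_moire([-1.5, 2.0]))   # True: probability is about 0.97
```

Raising the threshold trades recall for precision, so any chosen value should be checked on held-out images.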

## Limitations
- Trained on synthetic moiré patterns, so it may not capture all real-world moiré variations
- Optimized for document images; performance on natural scene images may vary
- Input images are resized to 224×224, so very subtle moiré in high-resolution images may be lost
- Recall is higher than precision: the model occasionally flags clean images as moiré (false positive rate of about 1.5% on the evaluation set)
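
One possible mitigation for the resolution limit, offered as an untested sketch rather than something the model was evaluated with, is to classify overlapping full-resolution tiles and flag the page if any tile is flagged. `tile_boxes` and `page_has_moire` are hypothetical helpers; the tile size of 224 matches the model input, while the stride and threshold are arbitrary.

```python
def tile_boxes(width, height, tile=224, stride=168):
    """Return (left, top, right, bottom) boxes covering the image with overlapping tiles."""
    def starts(size):
        if size <= tile:
            return [0]
        pos = list(range(0, size - tile, stride))
        pos.append(size - tile)   # final tile sits flush with the far edge
        return pos

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


def page_has_moire(tile_probs, threshold=0.5):
    """Flag the page if any tile's moiré probability reaches the threshold."""
    return any(p >= threshold for p in tile_probs)


boxes = tile_boxes(500, 300)
print(len(boxes))   # 6
print(boxes[0], boxes[-1])
```

Each box can be passed to Pillow's `Image.crop` before running the classifier on the crop. Two caveats apply: the model was trained on whole pages rather than crops, so tiles may fall outside its training distribution, and checking more tiles per page raises the chance of at least one false positive.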