---
license: apache-2.0
tags:
- image-classification
- moire-detection
- document-analysis
- document-quality
- vision
datasets:
- hf-tuner/rvl-cdip-document-classification
metrics:
- accuracy
- f1
- precision
- recall
pipeline_tag: image-classification
---

# Document Moiré Detection Model (V2)

A fine-tuned **DeiT-small** Vision Transformer for detecting moiré patterns in document images.

## Model Description

A binary classifier that predicts whether a document image contains moiré artifacts, which commonly arise from screen photography, scanning, or screen captures.

**Labels:**
- `clean` (0): No moiré patterns
- `moire` (1): Moiré patterns detected

## V1 → V2 Comparison

| | V1 (DeiT-tiny) | **V2 (DeiT-small)** |
|---|---|---|
| **Parameters** | 5.5M | **22M** |
| **Training samples** | 6,000 | **8,000** |
| **Moiré methods** | 4 | **6** (+subtle, +localized) |
| **Label smoothing** | — | **0.05** |
| **Accuracy** | 99.5% | **99.1%** |
| **F1 Score** | 0.995 | **0.991** |
| **Precision** | 99.3% | **98.5%** |
| **Recall** | 99.7% | **99.8%** |

> **Note:** V2 was evaluated on harder examples, including subtle single-frequency moiré and localized
> moiré patterns that V1 was never trained on. V2 reaches 99.75% recall: it catches nearly all moiré
> patterns, including very subtle ones, at the cost of slightly lower precision.

## Training Details

| Parameter | Value |
|-----------|-------|
| Base model | `facebook/deit-small-patch16-224` |
| Parameters | 22M |
| Training samples | 8,000 (4,000 clean + 4,000 moiré) |
| Eval samples | 800 (400 clean + 400 moiré) |
| Epochs | 5 |
| Learning rate | 3e-5 (cosine schedule) |
| Effective batch size | 64 |
| Label smoothing | 0.05 |
| Warmup steps | 60 |
| Best checkpoint | Epoch 4 (by F1) |
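
To make the learning-rate settings concrete, the sketch below computes the rate at a given optimizer step for linear warmup followed by cosine decay. It assumes 625 total optimizer steps (8,000 samples / effective batch size 64 = 125 steps per epoch, times 5 epochs) and a half-cosine decay to zero, which is the shape used by common warmup-plus-cosine schedulers. The function name `lr_at_step` is hypothetical, and the card does not state which scheduler implementation was used.

```python
import math

BASE_LR = 3e-5                   # from the table above
WARMUP_STEPS = 60                # from the table above
TOTAL_STEPS = (8000 // 64) * 5   # assumption: 125 steps per epoch * 5 epochs


def lr_at_step(step: int) -> float:
    """Linear warmup to BASE_LR, then cosine decay to zero (assumed shape)."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    # Fraction of the post-warmup phase that has elapsed, clamped to [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


for step in (0, 30, 60, 250, 625):
    print(step, f"{lr_at_step(step):.2e}")
```

Under these assumptions, warmup covers roughly the first half of epoch 1 (60 of 125 steps), and the rate peaks at 3e-5 before decaying.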

### Moiré Generation Methods

1. **Resize aliasing**: downscale then upscale with NEAREST interpolation, plus a pattern overlay
2. **Pattern overlay**: sinusoidal interference with per-channel color variation
3. **Multi-frequency**: 2-4 patterns at different frequencies, plus color displacement
4. **Screen simulation**: pixel grid + rotation + moiré overlay
5. **Subtle moiré**: a single-frequency pattern at very low strength (hard examples)
6. **Localized moiré**: moiré confined to an elliptical region with a Gaussian mask
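
As an illustration of methods 2, 5, and 6, the NumPy sketch below overlays a per-channel sinusoidal grating on an image, optionally weighted by an elliptical Gaussian mask. It is not the generator used to build the training set: the function names, period and angle ranges, and strength values are all assumptions chosen for readability.

```python
import numpy as np


def sinusoidal_pattern(height, width, period_px, angle_deg, phase=0.0):
    """Return a (height, width) sinusoidal grating with values in [-1, 1]."""
    yy, xx = np.mgrid[0:height, 0:width].astype(np.float32)
    theta = np.deg2rad(angle_deg)
    # Project pixel coordinates onto the grating direction.
    proj = xx * np.cos(theta) + yy * np.sin(theta)
    return np.sin(2.0 * np.pi * proj / period_px + phase)


def elliptical_gaussian_mask(height, width, center, radii):
    """Soft mask in (0, 1] peaking at center=(row, col) with a Gaussian falloff."""
    yy, xx = np.mgrid[0:height, 0:width].astype(np.float32)
    cy, cx = center
    ry, rx = radii
    return np.exp(-0.5 * (((yy - cy) / ry) ** 2 + ((xx - cx) / rx) ** 2))


def add_moire(image, strength=0.15, mask=None, seed=0):
    """Overlay a per-channel grating on an RGB uint8 image of shape (H, W, 3)."""
    rng = np.random.default_rng(seed)
    h, w, _ = image.shape
    out = image.astype(np.float32) / 255.0
    for c in range(3):
        # A different period and angle per channel produces colored fringes.
        pattern = sinusoidal_pattern(h, w, rng.uniform(4, 12), rng.uniform(0, 180))
        if mask is not None:
            pattern = pattern * mask
        out[..., c] += strength * pattern
    return (np.clip(out, 0.0, 1.0) * 255.0).round().astype(np.uint8)


page = np.full((224, 224, 3), 230, dtype=np.uint8)  # stand-in for a document image
global_moire = add_moire(page, strength=0.15)        # method 2 (pattern overlay)
subtle_moire = add_moire(page, strength=0.02)        # method 5 (very low strength)
local_mask = elliptical_gaussian_mask(224, 224, center=(112, 112), radii=(40, 70))
localized_moire = add_moire(page, strength=0.15, mask=local_mask)  # method 6
```

The only difference between the subtle and global variants here is `strength`, which is why low-strength examples are the hard cases for a classifier.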

## Performance (Best Checkpoint)

| Metric | Value |
|--------|-------|
| Accuracy | 99.12% |
| F1 Score | 0.9913 |
| Precision | 98.52% |
| Recall | 99.75% |
| Eval Loss | 0.170 |
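
The reported figures are consistent with a single confusion matrix on the 800-image eval set. The counts in the sketch below are inferred from the rounded metrics and are not reported in this card, so treat them as a consistency check rather than official numbers.

```python
# Inferred counts on the eval set (400 clean + 400 moiré); not reported by the card.
tp, fn = 399, 1   # moiré images: detected / missed
fp, tn = 6, 394   # clean images: flagged / passed

precision = tp / (tp + fp)
recall = tp / (tp + fn)
accuracy = (tp + tn) / (tp + fn + fp + tn)
f1 = 2 * precision * recall / (precision + recall)
false_positive_rate = fp / (fp + tn)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f} fpr={false_positive_rate:.4f}")
```

With these counts, one moiré image is missed and six clean images are flagged, which matches the roughly 1.5% false positive rate.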

### Training Progression

| Epoch | Eval Loss | Accuracy | F1 | Precision | Recall |
|-------|-----------|----------|-----|-----------|--------|
| 1 | 0.173 | 99.0% | 0.990 | 99.3% | 98.8% |
| 2 | 0.177 | 99.1% | **0.991** | 99.0% | 99.3% |
| 3 | 0.154 | 99.1% | 0.991 | 99.3% | 99.0% |
| 4 | 0.170 | 99.1% | **0.991** | 98.5% | **99.8%** |
| 5 | 0.168 | 99.0% | 0.990 | 98.3% | **99.8%** |

## Usage

```python
from transformers import pipeline

classifier = pipeline("image-classification", model="Jwalit/document-moire-detector")
result = classifier("path/to/document.jpg")
print(result)
# Example output (scores are illustrative):
# [{'label': 'clean', 'score': 0.99}, {'label': 'moire', 'score': 0.01}]
```

Or manually:

```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import torch

processor = AutoImageProcessor.from_pretrained("Jwalit/document-moire-detector")
model = AutoModelForImageClassification.from_pretrained("Jwalit/document-moire-detector")

# Convert to RGB so grayscale or RGBA scans match the 3-channel input the model expects
image = Image.open("document.jpg").convert("RGB")
inputs = processor(image, return_tensors="pt")
with torch.no_grad():  # no gradients needed for inference
    logits = model(**inputs).logits
predicted = logits.argmax(-1).item()

print(model.config.id2label[predicted])  # 'clean' or 'moire'
```
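
Because the model favors recall over precision, some pipelines may want a stricter rule than argmax before flagging an image. The sketch below is one way to do that, assuming a two-element logit list ordered `[clean, moire]` as in the label mapping above. The helper names and the 0.9 threshold are hypothetical and should be tuned on real data.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_moire(logits, threshold=0.9):
    """Flag moiré only when P(moire) reaches the threshold (index 1 = 'moire')."""
    return softmax(logits)[1] >= threshold


# With the manual example, pass `logits[0].tolist()` for a single image.
print(is_moire([0.2, 1.1]))   # argmax would pick 'moire'; P(moire) is about 0.71
print(is_moire([-2.0, 3.0]))  # P(moire) is about 0.993
```

Label smoothing tends to keep predicted probabilities away from 1.0, so a threshold very close to 1.0 may reject most detections.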

## Limitations

- Trained on synthetic moiré patterns, so it may not capture all real-world moiré variations
- Optimized for document images; performance on natural scene images may vary
- Input images are resized to 224×224, so very subtle moiré in high-resolution images may be lost
- Recall is higher than precision, so clean images are occasionally flagged as moiré (false positive rate ~1.5%)
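
One possible workaround for the 224×224 resize is to classify overlapping crops at native resolution and flag the page if any crop is flagged. The sketch below only computes the crop boxes. It illustrates the idea and is not something this card evaluates; `tile_boxes` and the tile and stride values are hypothetical.

```python
def tile_boxes(width, height, tile=224, stride=112):
    """Return (left, upper, right, lower) boxes covering the image with overlapping tiles."""
    def starts(size):
        if size <= tile:
            return [0]
        positions = list(range(0, size - tile + 1, stride))
        if positions[-1] != size - tile:
            positions.append(size - tile)  # make sure the last tile touches the edge
        return positions

    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in starts(height) for x in starts(width)]


boxes = tile_boxes(500, 300)
print(len(boxes), boxes[0], boxes[-1])
```

Each box uses the same coordinate order as Pillow's `Image.crop`, so a crop can be passed straight to the classifier. Flagging a page when any tile is flagged will raise the page-level false positive rate, so the decision threshold may need adjusting.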