Upload folder using huggingface_hub

Files changed:
- README.md +94 -0
- config.json +32 -0
- pytorch_model.bin +3 -0
README.md
ADDED
---
library_name: timm
license: mit
tags:
- anime
- garbage-detection
- image-classification
- video-preprocessing
- frame-filtering
---

# MobileViT-S Garbage Classifier

A binary classification model for filtering objectively bad frames (black, blurry, uniform-color, low-detail) in anime video preprocessing pipelines.

## Model Details

- **Architecture**: MobileViT-S
- **Parameters**: 4.94M
- **Model size**: 20 MB
- **Input size**: 256×256
- **Classes**: `[garbage, quality]` (class 0 = garbage)

## Performance

On the 1,866-frame validation set:

**At the default threshold (0.5):**
- Accuracy: 93.46%
- Precision: 92.24%
- Recall: 95.47%
- F1-score: 93.83%

**At the optimal threshold (0.7115):**
- Accuracy: 93.62%
- Precision: 93.92%
- Recall: 93.82%
- F1-score: 93.87%

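As a sanity check, the F1 scores above follow directly from the reported precision and recall (F1 is their harmonic mean):

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Default threshold (0.5)
print(round(f1(0.9224, 0.9547), 4))  # 0.9383
# Optimal threshold (0.7115)
print(round(f1(0.9392, 0.9382), 4))  # 0.9387
```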
## Usage

```python
import torch
import timm
from torchvision import transforms
from PIL import Image

# Load model
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = timm.create_model('mobilevit_s', num_classes=2, pretrained=False)
model.load_state_dict(torch.load('pytorch_model.bin', map_location=device))
model.to(device).eval()

# Prepare image (256x256 input, ImageNet normalization)
transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

img = transform(Image.open('frame.webp').convert('RGB')).unsqueeze(0).to(device)

# Predict
with torch.no_grad():
    logits = model(img)
    probs = torch.softmax(logits, dim=1)
    garbage_prob = probs[0, 0].item()  # Class 0 = garbage

# Decision
is_garbage = garbage_prob > 0.7115  # Use optimal threshold
```

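The softmax step above maps the two raw logits to class probabilities that sum to 1. A stdlib-only sketch of the same computation on hypothetical logit values (the numbers are illustrative, not real model outputs):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for one frame, ordered [garbage, quality]
probs = softmax([2.0, -1.0])
garbage_prob = probs[0]            # ~0.9526 for these logits
is_garbage = garbage_prob > 0.7115
```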
## Training Data

- **Total frames**: 12,440
- **Training**: 10,574 frames
- **Validation**: 1,866 frames (895 garbage, 971 quality)
- **Labeling**: Verified via reverse-engineered frame matching

## Garbage Detection

Filters frames with:
- Solid black/white/uniform color (33%)
- No edge patterns (33%)
- Low-detail content (16%)
- Extreme outliers (15%)

## Threshold Recommendations

- **Default (0.5)**: Good starting point; slightly higher recall
- **Optimal (0.7115)**: Best F1-score; balanced precision/recall
- **High precision (0.75-0.80)**: Fewer false positives (fewer quality frames discarded)
- **High recall (0.60-0.65)**: Catches more garbage at the cost of more false positives

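An optimal threshold such as 0.7115 is typically found by sweeping candidate thresholds over validation predictions and keeping the one that maximizes F1. A minimal sketch on toy data (the probabilities and labels below are made up, not the actual validation set):

```python
def f1_at_threshold(probs, labels, thr):
    """F1 for the garbage class at a given decision threshold."""
    pred = [p > thr for p in probs]
    tp = sum(1 for p, y in zip(pred, labels) if p and y)
    fp = sum(1 for p, y in zip(pred, labels) if p and not y)
    fn = sum(1 for p, y in zip(pred, labels) if not p and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy garbage probabilities and ground-truth labels (1 = garbage)
probs = [0.95, 0.80, 0.65, 0.55, 0.40, 0.30, 0.10]
labels = [1, 1, 1, 0, 1, 0, 0]

# Sweep thresholds 0.05 .. 0.95 in steps of 0.05, keep the F1-maximizing one
best_thr = max((t / 100 for t in range(5, 100, 5)),
               key=lambda t: f1_at_threshold(probs, labels, t))
```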
## License

MIT
config.json
ADDED
{
  "architecture": "mobilevit_s",
  "num_classes": 2,
  "input_size": 256,
  "class_names": ["garbage", "quality"],
  "optimal_threshold": 0.7115,
  "performance": {
    "without_threshold": {
      "threshold": 0.5,
      "accuracy": 0.9346,
      "precision": 0.9224,
      "recall": 0.9547,
      "f1": 0.9383
    },
    "with_optimal_threshold": {
      "threshold": 0.7115,
      "accuracy": 0.9362,
      "precision": 0.9392,
      "recall": 0.9382,
      "f1": 0.9387
    }
  },
  "training": {
    "total_frames": 12440,
    "train_frames": 10574,
    "val_frames": 1866,
    "validation_distribution": {
      "garbage": 895,
      "quality": 971
    }
  }
}
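Downstream pipelines can read the decision threshold and class order from this config instead of hard-coding them. A small sketch using stdlib `json` with an inline copy of the relevant fields (in practice you would `json.load` the `config.json` file shipped with the model):

```python
import json

# Inline stand-in for the shipped config.json; in a real pipeline use:
#   with open('config.json') as f: cfg = json.load(f)
cfg = json.loads('{"class_names": ["garbage", "quality"], "optimal_threshold": 0.7115}')

threshold = cfg["optimal_threshold"]               # 0.7115
garbage_idx = cfg["class_names"].index("garbage")  # 0
```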
pytorch_model.bin
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:82df3609ac1bc59384576ec12b4abad489afe1a63b84783f288ab90cae9b7e48
size 19930669