Mitchins committed on
Commit e02864e · verified · 1 Parent(s): c5e19fd

Upload folder using huggingface_hub

Files changed (3):
  1. README.md +94 -0
  2. config.json +32 -0
  3. pytorch_model.bin +3 -0
README.md ADDED
---
library_name: timm
license: mit
tags:
- anime
- garbage-detection
- image-classification
- video-preprocessing
- frame-filtering
---

# MobileViT-S Garbage Classifier

Binary classification model for filtering objectively bad frames (black, blurry, uniform-color, low-detail) in anime video preprocessing pipelines.

## Model Details

- **Architecture**: MobileViT-S
- **Parameters**: 4.94M
- **Model Size**: 20MB
- **Input Size**: 256×256
- **Classes**: [garbage, quality] (class 0 = garbage)

## Performance

**At the default threshold (0.5):**
- Accuracy: 93.46%
- Precision: 92.24%
- Recall: 95.47%
- F1-Score: 93.83%

**At the optimal threshold (0.7115):**
- Accuracy: 93.62%
- Precision: 93.92%
- Recall: 93.82%
- F1-Score: 93.87%

## Usage

```python
import torch
import timm
from torchvision import transforms
from PIL import Image

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load model
model = timm.create_model('mobilevit_s', num_classes=2, pretrained=False)
model.load_state_dict(torch.load('pytorch_model.bin', map_location=device))
model = model.to(device).eval()

# Prepare image (ImageNet normalization)
transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

img = transform(Image.open('frame.webp').convert('RGB')).unsqueeze(0).to(device)

# Predict
with torch.no_grad():
    logits = model(img)
    probs = torch.softmax(logits, dim=1)
    garbage_prob = probs[0, 0].item()  # Class 0 = garbage

# Decision
is_garbage = garbage_prob > 0.7115  # Use optimal threshold
```
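
The softmax-plus-threshold decision above reduces to simple arithmetic and can be sanity-checked without the model. A stdlib-only sketch (the logit values below are illustrative, not real model outputs):

```python
import math

def garbage_probability(logits):
    """Softmax over the two class logits; index 0 = garbage, 1 = quality."""
    exps = [math.exp(x - max(logits)) for x in logits]  # subtract max for stability
    return exps[0] / sum(exps)

def is_garbage(logits, threshold=0.7115):
    """Apply the optimal threshold to the garbage probability."""
    return garbage_probability(logits) > threshold

# Illustrative logits, not real model output
print(is_garbage([2.0, -1.0]))  # strong garbage logit -> True
print(is_garbage([0.2, 0.1]))   # near-even split, below threshold -> False
```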

## Training Data

- **Total frames**: 12,440
- **Training**: 10,574 frames
- **Validation**: 1,866 frames (895 garbage, 971 quality)
- **Labeling**: Verified via reverse-engineered frame matching

## Garbage Detection

Filters frames with:
- Solid black/white/uniform color (33%)
- No edge patterns (33%)
- Low detail content (16%)
- Extreme outliers (15%)

## Threshold Recommendations

- **Default (0.5)**: Good starting point, slightly higher recall
- **Optimal (0.7115)**: Best F1-score, balanced precision/recall
- **High precision (0.75-0.80)**: Reduce false positives
- **High recall (0.60-0.65)**: Catch more garbage, accept more false positives
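
One way to choose among these operating points is to sweep thresholds on a held-out set and watch the precision/recall trade-off. A toy sketch with made-up garbage probabilities and labels (not the real validation data):

```python
def precision_recall(probs, labels, threshold):
    """Precision/recall for the garbage class at a given threshold.
    labels: 1 = garbage, 0 = quality."""
    tp = sum(1 for p, y in zip(probs, labels) if p > threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p > threshold and y == 0)
    fn = sum(1 for p, y in zip(probs, labels) if p <= threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up garbage probabilities and ground-truth labels for illustration
probs  = [0.95, 0.80, 0.72, 0.66, 0.40, 0.10]
labels = [1,    1,    1,    0,    0,    0]
for t in (0.5, 0.7115, 0.8):
    p, r = precision_recall(probs, labels, t)
    print(f"threshold={t}: precision={p:.2f} recall={r:.2f}")
```

Raising the threshold trades recall for precision, which is why the high-precision and high-recall recommendations above move in opposite directions.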

## License

MIT
config.json ADDED
{
  "architecture": "mobilevit_s",
  "num_classes": 2,
  "input_size": 256,
  "class_names": ["garbage", "quality"],
  "optimal_threshold": 0.7115,
  "performance": {
    "without_threshold": {
      "threshold": 0.5,
      "accuracy": 0.9346,
      "precision": 0.9224,
      "recall": 0.9547,
      "f1": 0.9383
    },
    "with_optimal_threshold": {
      "threshold": 0.7115,
      "accuracy": 0.9362,
      "precision": 0.9392,
      "recall": 0.9382,
      "f1": 0.9387
    }
  },
  "training": {
    "total_frames": 12440,
    "train_frames": 10574,
    "val_frames": 1866,
    "validation_distribution": {
      "garbage": 895,
      "quality": 971
    }
  }
}
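
Rather than hard-coding the threshold and class order, inference code can read them from `config.json`. A small sketch, inlining an abbreviated copy of the config so it is self-contained:

```python
import json

# Abbreviated copy of config.json, inlined for illustration;
# in practice: config = json.load(open('config.json'))
config = json.loads('''{
    "architecture": "mobilevit_s",
    "num_classes": 2,
    "class_names": ["garbage", "quality"],
    "optimal_threshold": 0.7115
}''')

garbage_idx = config["class_names"].index("garbage")  # index into softmax output
threshold = config["optimal_threshold"]
print(garbage_idx, threshold)  # 0 0.7115
```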
pytorch_model.bin ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:82df3609ac1bc59384576ec12b4abad489afe1a63b84783f288ab90cae9b7e48
size 19930669