---
license: openrail
language: en
library_name: timm
tags:
- image-classification
- anime
- real
- rendered
- 3d-graphics
datasets:
- coco
- custom-anime
- steam-screenshots
---
# EfficientNet-B0 - Anime/Real/Rendered Classifier
A fast, lightweight image classifier that distinguishes photographs, anime, and 3D-rendered images.
## Model Summary
- **Model Name:** efficientnet_b0
- **Framework:** PyTorch + TIMM
- **Input:** 224×224 RGB images
- **Output:** 3 classes (anime, real, rendered)
- **Parameters:** 5.3M
- **Size:** 16.2 MB
## Intended Use
Classify images into three categories:
- **anime**: Drawn 2D or cel-shaded animation
- **real**: Photographs and real-world footage
- **rendered**: 3D graphics (games, CGI, Pixar, etc.)
## Performance
**Validation Accuracy:** 97.44%
| Class | Precision | Recall | F1-Score | Support |
|-------|-----------|--------|----------|---------|
| anime | 0.98 | 0.99 | 0.99 | 236 |
| real | 0.98 | 0.98 | 0.98 | 500 |
| rendered | 0.96 | 0.93 | 0.94 | 161 |
| **weighted avg** | **0.97** | **0.97** | **0.97** | **897** |
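As a sanity check, the weighted-average row can be recomputed from the per-class rows and their support. The sketch below uses only the rounded values printed in the table, so its results may differ from the reported row in the last digit.

```python
# Per-class metrics copied from the table above: (precision, recall, f1, support).
per_class = {
    "anime":    (0.98, 0.99, 0.99, 236),
    "real":     (0.98, 0.98, 0.98, 500),
    "rendered": (0.96, 0.93, 0.94, 161),
}

total_support = sum(row[3] for row in per_class.values())

def weighted_average(metric_index):
    """Support-weighted mean of one metric column."""
    return sum(row[metric_index] * row[3] for row in per_class.values()) / total_support

weighted_precision = weighted_average(0)
weighted_recall = weighted_average(1)
weighted_f1 = weighted_average(2)

print(f"support={total_support}")
print(f"precision={weighted_precision:.3f} recall={weighted_recall:.3f} f1={weighted_f1:.3f}")
```

Support-weighted recall is the same quantity as overall accuracy, which is why it lands close to the reported validation accuracy.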
## Training Data
- **Real images:** 5,000 COCO 2017 validation set images
- **Anime images:** 2,357 curated animation frames and key scenes
- **Rendered images:** 1,549 AAA game screenshots (Metacritic ≥75) + 61 Pixar movie stills
- **Total:** 8,967 images, split into 8,070 training and 897 validation images (perceptually hashed for diversity)
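The card does not say which perceptual hash or distance threshold was used. As an illustration only, the sketch below assumes a simple average hash over a small grayscale grid and drops any image whose hash is within a few bits of one already kept. The function names and `max_distance` are hypothetical, and a real pipeline would first downscale each image (for example to 8×8 grayscale) before hashing.

```python
def average_hash(gray_pixels):
    """Return an integer hash with one bit per pixel, set when the pixel is above the mean.

    gray_pixels is a small grid of grayscale values, assumed to come from
    downscaling the image beforehand.
    """
    flat = [value for row in gray_pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits

def hamming_distance(hash_a, hash_b):
    """Number of differing bits between two hashes."""
    return bin(hash_a ^ hash_b).count("1")

def filter_near_duplicates(named_grids, max_distance=4):
    """Keep an image only if its hash is farther than max_distance from every kept hash."""
    kept = []  # list of (name, hash) pairs
    for name, grid in named_grids:
        h = average_hash(grid)
        if all(hamming_distance(h, kept_hash) > max_distance for _, kept_hash in kept):
            kept.append((name, h))
    return [name for name, _ in kept]

# Toy 4x4 "images": b is a slightly perturbed copy of a, c is a different pattern.
a = [[10, 10, 200, 200]] * 4
b = [[12, 11, 205, 198]] * 4
c = [[200, 10, 200, 10], [10, 200, 10, 200]] * 2
print(filter_near_duplicates([("a", a), ("b", b), ("c", c)], max_distance=2))  # ['a', 'c']
```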
## Training Details
- **Framework:** PyTorch
- **Augmentation:** Resize only (224×224)
- **Loss Function:** CrossEntropyLoss with inverse frequency class weights
- **Optimizer:** AdamW (lr=0.001)
- **Batch Size:** 80
- **Epochs:** 20
- **Hardware:** NVIDIA RTX 3060 (12GB VRAM)
- **Training Time:** ~20 minutes
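The exact formula behind the inverse frequency class weights is not given. One common choice, assumed here, is `weight_c = total / (num_classes * count_c)`. The sketch uses the full-dataset class counts from the Training Data section, because the per-split counts are not listed.

```python
# Class counts from the Training Data section (full dataset, not the training
# split, whose per-class counts the card does not give).
class_counts = {"anime": 2357, "real": 5000, "rendered": 1549 + 61}

total = sum(class_counts.values())
num_classes = len(class_counts)

# Assumed normalisation: weight_c = total / (num_classes * count_c).
class_weights = {
    name: total / (num_classes * count) for name, count in class_counts.items()
}

for name, weight in class_weights.items():
    print(f"{name}: {weight:.3f}")
```

With this scheme the smallest class (rendered) gets the largest weight. The values, in label order, could then be passed as the `weight` tensor of `torch.nn.CrossEntropyLoss`.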
## Limitations
1. Photorealistic video games are sometimes classified as real (recall on the rendered class is 0.93, the lowest of the three classes)
2. Cel-shaded games may be classified as anime rather than rendered
3. Artistic 3D renders (Pixar, high-quality CGI) receive mixed confidence scores
4. Performance degrades on images smaller than 224×224
## Recommendations
- Use a confidence threshold of ≥80% for reliable predictions
- For critical applications, ensemble with `tf_efficientnetv2_s`
- Check the confusion patterns in your own use case
- Manually review edge cases (game screenshots, stylized renders)
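The thresholding and ensembling recommendations above could be combined roughly as follows. This is a minimal sketch: averaging the two models' probabilities is an assumed combination rule, the helper names are hypothetical, and the logits are made-up values rather than real model outputs.

```python
import math

LABELS = ["anime", "real", "rendered"]  # same order as in the usage example

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]

def ensemble_probs(*prob_lists):
    """Average class probabilities from several models (assumed combination rule)."""
    return [sum(column) / len(prob_lists) for column in zip(*prob_lists)]

def decide(probs, threshold=0.80):
    """Return (label, confidence), or ("uncertain", confidence) below the threshold."""
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    label = LABELS[best] if confidence >= threshold else "uncertain"
    return label, confidence

# Hypothetical logits from two models for the same image.
probs_b0 = softmax([0.2, 1.1, 2.0])
probs_v2s = softmax([0.1, 0.4, 3.2])
print(decide(probs_b0))
print(decide(ensemble_probs(probs_b0, probs_v2s)))
```

Predictions that come back as `"uncertain"` are natural candidates for the manual review suggested above.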
## How to Use
```python
from PIL import Image
import torch
from torchvision import transforms
import timm
from safetensors.torch import load_file

# Load the 3-class EfficientNet-B0 and its weights
model = timm.create_model('efficientnet_b0', num_classes=3, pretrained=False)
state_dict = load_file('model.safetensors')
model.load_state_dict(state_dict)
model.eval()

# Prepare image: resize to 224×224 and normalize with ImageNet mean/std
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
img = Image.open('image.jpg').convert('RGB')
x = transform(img).unsqueeze(0)  # add the batch dimension

# Infer
with torch.no_grad():
    logits = model(x)
    probs = torch.softmax(logits, dim=1)
pred = probs.argmax(dim=1).item()

labels = ['anime', 'real', 'rendered']
print(f"{labels[pred]}: {probs[0, pred]:.1%}")
```
## Benchmarks
**Inference Speed (RTX 3060)**
- Single image: ~20ms
- Batch of 32: ~150ms
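To put the two latency figures on the same scale, they can be converted to throughput. The short calculation below uses only the numbers reported above.

```python
# Latencies reported above (milliseconds, RTX 3060).
single_image_ms = 20
batch_size = 32
batch_ms = 150

single_throughput = 1000 / single_image_ms        # images per second, one at a time
batch_throughput = batch_size * 1000 / batch_ms   # images per second, batched
per_image_batched_ms = batch_ms / batch_size      # amortised latency per image

print(f"single: {single_throughput:.0f} img/s")
print(f"batched: {batch_throughput:.0f} img/s ({per_image_batched_ms:.1f} ms per image)")
```

Batching therefore gives roughly 213 images per second against 50 for single-image inference, a bit more than a fourfold gain.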
**Accuracy Comparison**
| Model | Accuracy | Speed | Params |
|-------|----------|-------|--------|
| EfficientNet-B0 | 97.44% | Fast | 5.3M |
| TF-EfficientNetV2-S | 97.55% | Moderate | 21.5M |
## Ethical Considerations
This model classifies images by visual style/source. Potential misuse:
- Detecting deepfakes/AI-generated content (not designed for this)
- Filtering user-generated content (may have cultural bias)
- Surveillance or profiling
**Recommendations:**
- Use with human review for content moderation
- Test on your target domain before deployment
- Don't rely solely on automatic classification for safety-critical decisions
- Consider cultural representation in anime/rendered content
## Contact
For questions or issues: [GitHub repo]
## License
OpenRAIL (Open Responsible AI License) - free for research and commercial use with proper attribution