Anime/VN Overlay Text Filter (EfficientNet-B2, 640px)
Binary image classifier for filtering assets with overlaid communication text.
Labels:
- 0: no_text
- 1: has_text
Objective used during curation:
has_text = 1 if the image contains text that is non-organic to the depicted scene and intentionally overlaid for communication.
Checkpoint
- Backbone: efficientnet_b2
- Input size: 640
- Format: safetensors
- File: model.safetensors
- Source run: runs/backbone_res_sweep_20260222_211117/tf_efficientnet_b2__img640
- Selection monitor during sweep: ood_fixed05_h_bal
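Before wiring the checkpoint into your own loader, it can help to confirm which tensors the file contains, for example whether the classifier head emits one logit or two, which this README does not state. The sketch below is a minimal example that reads only the JSON header of a safetensors file (an 8-byte little-endian length followed by that many bytes of UTF-8 JSON) using the standard library. The helper names `read_safetensors_header` and `summarize` are illustrative and are not part of this repository.

```python
import json
import os
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without loading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def summarize(header):
    """Map each tensor name to its (dtype, shape) pair."""
    return {name: (info["dtype"], info["shape"]) for name, info in header.items()}


# Only runs when the checkpoint is present in the working directory.
if os.path.exists("model.safetensors"):
    for name, (dtype, shape) in summarize(read_safetensors_header("model.safetensors")).items():
        print(f"{name}: {dtype} {shape}")
```

The shape of the final classifier weight in the listing indicates how many outputs the head produces, which determines whether a sigmoid or a softmax is the appropriate way to turn logits into a has_text score.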
Recommended Threshold
0.50 (balanced deployment default)
OOD test (anime_foreground_text_eval/ood_test_v2.csv) at thr=0.50:
- TP: 98
- FP: 6
- TN: 92
- FN: 4
- Precision: 0.9423
- Recall: 0.9608
- F1: 0.9515
- H_bal: 0.9497
Note: no threshold reached precision >= 0.98 on OOD dev for this checkpoint.
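The summary metrics above follow directly from the four confusion-matrix counts, and the sketch below recomputes them as a sanity check. H_bal is not defined in this README. The code assumes it is the harmonic mean of recall and specificity, which reproduces the reported 0.9497 from these counts, but treat that definition as an inference rather than a documented fact. The function name `confusion_metrics` is illustrative.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Derive summary metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)        # true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    f1 = 2 * precision * recall / (precision + recall)
    # ASSUMPTION: H_bal is the harmonic mean of recall and specificity.
    h_bal = 2 * recall * specificity / (recall + specificity)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "h_bal": h_bal,
    }


# Counts from the OOD test at thr=0.50.
metrics = confusion_metrics(tp=98, fp=6, tn=92, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```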
OOD Visual Audit (Recommended Threshold)
Green check = correct prediction, red X = incorrect prediction.
Images are linked to 4x retina versions.
Ground truth has_text:
Ground truth no_text:
Quick Inference
python inference.py --image path/to/image.jpg
python inference.py --glob "path/to/images/*.png" --threshold 0.50 --json
Notes
- Use threshold_tuning.json for full threshold sweep details.
- Re-tune threshold on your own live sample if deployment domain shifts.
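Re-tuning on a live sample can be sketched as a plain threshold sweep over model scores and hand-labeled ground truth. The example below is a minimal illustration, not the procedure used to produce threshold_tuning.json. It assumes each image has a has_text score in [0, 1], that an image is flagged when its score is greater than or equal to the threshold, and that F1 under an optional precision floor is the selection criterion. The function names and the toy data are hypothetical.

```python
def sweep_thresholds(scores, labels, thresholds=None):
    """Compute precision, recall, and F1 for each candidate threshold.

    scores: has_text scores in [0, 1]; labels: 0/1 ground truth.
    ASSUMPTION: an image is flagged when score >= threshold.
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(5, 100, 5)]  # 0.05 .. 0.95
    rows = []
    for thr in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < thr and y == 1)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append({"threshold": thr, "precision": precision, "recall": recall, "f1": f1})
    return rows


def pick_threshold(rows, min_precision=0.0):
    """Return the highest-F1 row meeting the precision floor, or None if none qualifies."""
    eligible = [r for r in rows if r["precision"] >= min_precision]
    return max(eligible, key=lambda r: r["f1"]) if eligible else None


# Toy labeled sample; replace with scores and labels from your own data.
scores = [0.95, 0.80, 0.62, 0.55, 0.40, 0.30, 0.10]
labels = [1, 1, 1, 0, 1, 0, 0]
rows = sweep_thresholds(scores, labels)
print(pick_threshold(rows))
print(pick_threshold(rows, min_precision=0.98))
```

A precision floor that no threshold can satisfy returns None, which mirrors the situation noted above where no threshold reached precision >= 0.98 on OOD dev.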