Anime/VN Overlay Text Filter (EfficientNet-B2, 640px)

Binary image classifier that flags anime and visual novel (VN) image assets containing overlaid communicative text, so they can be filtered.

Labels:

  • 0: no_text
  • 1: has_text

Objective used during curation:

  • has_text = 1 if the image contains text that is non-organic to the depicted scene and intentionally overlaid for communication.

Checkpoint

  • Backbone: efficientnet_b2
  • Input size: 640
  • Format: safetensors
  • File: model.safetensors
  • Source run: runs/backbone_res_sweep_20260222_211117/tf_efficientnet_b2__img640
  • Selection monitor during sweep: ood_fixed05_h_bal

Recommended Threshold

  • 0.50 (balanced deployment default)
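As a minimal sketch of how the threshold maps a score to the two labels listed above: the function name `decide` is hypothetical, and treating a score exactly at the threshold as has_text is an assumption, since the card does not say how inference.py handles ties.

```python
# Label ids and names as listed in the Labels section.
LABELS = {0: "no_text", 1: "has_text"}


def decide(prob_has_text: float, threshold: float = 0.50) -> str:
    """Map a has_text probability to a label name.

    ASSUMPTION: scores at or above the threshold count as has_text;
    the tie-handling of inference.py is not documented here.
    """
    return LABELS[1 if prob_has_text >= threshold else 0]


print(decide(0.73))        # has_text
print(decide(0.12))        # no_text
print(decide(0.60, 0.80))  # no_text
```

Raising the threshold trades recall for precision, which is why the default is described as a balanced setting rather than a high-precision one.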

OOD test (anime_foreground_text_eval/ood_test_v2.csv) at thr=0.50:

  • TP: 98
  • FP: 6
  • TN: 92
  • FN: 4
  • Precision: 0.9423
  • Recall: 0.9608
  • F1: 0.9515
  • H_bal: 0.9497

Note: no threshold reached precision >= 0.98 on the OOD dev set for this checkpoint.
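The figures reported for the OOD test at thr=0.50 can be re-derived from the four confusion counts. The sketch below does so in plain Python. Precision, recall, and F1 use their standard definitions. The definition of H_bal is an assumption: the card does not define it, but taking it as the harmonic mean of recall and specificity reproduces the reported 0.9497.

```python
# Confusion counts reported for the OOD test set at thr=0.50.
tp, fp, tn, fn = 98, 6, 92, 4

precision = tp / (tp + fp)
recall = tp / (tp + fn)        # true positive rate
specificity = tn / (tn + fp)   # true negative rate
f1 = 2 * precision * recall / (precision + recall)

# ASSUMPTION: H_bal is the harmonic mean of recall and specificity.
h_bal = 2 * recall * specificity / (recall + specificity)

print(f"precision={precision:.4f} recall={recall:.4f} "
      f"f1={f1:.4f} h_bal={h_bal:.4f}")
# precision=0.9423 recall=0.9608 f1=0.9515 h_bal=0.9497
```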

OOD Visual Audit (Recommended Threshold)

Green check = correct prediction, red X = incorrect prediction.
Images are linked to 4x retina versions.

Ground truth has_text: [image grid not included in this text version]

Ground truth no_text: [image grid not included in this text version]

Quick Inference

python inference.py --image path/to/image.jpg
python inference.py --glob "path/to/images/*.png" --threshold 0.50 --json

Notes

  • See threshold_tuning.json for the full threshold sweep.
  • Re-tune the threshold on a sample of your own live data if the deployment domain shifts.
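One way to re-tune on your own data is to sweep candidate thresholds over a labelled sample and keep the best one. The sketch below is illustrative only. The function names are hypothetical, the scores and labels are made-up toy data, and the selection criterion (harmonic mean of recall and specificity) is an assumption suggested by the H_bal metric, not a documented procedure. It does not read threshold_tuning.json, whose schema is not described here.

```python
from typing import Dict, Sequence, Tuple


def confusion(scores: Sequence[float], labels: Sequence[int], thr: float) -> Dict[str, int]:
    """Count TP/FP/TN/FN, predicting has_text when score >= thr (assumed rule)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for score, label in zip(scores, labels):
        pred = 1 if score >= thr else 0
        if pred == 1 and label == 1:
            counts["tp"] += 1
        elif pred == 1 and label == 0:
            counts["fp"] += 1
        elif pred == 0 and label == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


def balanced_h(c: Dict[str, int]) -> float:
    """Harmonic mean of recall and specificity (assumed meaning of H_bal)."""
    pos, neg = c["tp"] + c["fn"], c["tn"] + c["fp"]
    recall = c["tp"] / pos if pos else 0.0
    specificity = c["tn"] / neg if neg else 0.0
    total = recall + specificity
    return 2 * recall * specificity / total if total else 0.0


def retune(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (threshold, H) for the best threshold among 0.01 .. 0.99."""
    candidates = [i / 100 for i in range(1, 100)]
    scored = ((thr, balanced_h(confusion(scores, labels, thr))) for thr in candidates)
    return max(scored, key=lambda pair: pair[1])  # first maximum wins on ties


# Toy sample: model scores paired with human labels (1 = has_text).
scores = [0.05, 0.20, 0.35, 0.62, 0.30, 0.70, 0.90, 0.97]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
best_thr, best_h = retune(scores, labels)
print(best_thr, round(best_h, 4))
```

A handful of examples like this is far too small for a real decision. In practice the sample should be large enough that each confusion cell holds a meaningful count, and a different criterion (for example, best recall subject to a precision floor) may suit a stricter filter better.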