Jewelry Photo Classifier
Two-stage waterfall pipeline for classifying jewelry photos as real customer submissions vs. catalog/screenshot/AI-generated images.
Architecture
| Stage | Model | Resolution | Task | Parameters |
|---|---|---|---|---|
| A | ConvNeXt-Base (convnext_base.fb_in22k_ft_in1k) | 384x384 | Jewelry vs Not Jewelry | 87.6M |
| B | DeiT-Small (deit_small_patch16_224) | 512x512 | Real vs Not Real | 22.0M |
Both models are ImageNet-pretrained and fine-tuned on proprietary jewelry photo data.
Decision Flow
Image -> Stage A (jewelry?) -> p(jewelry) >= 0.88 -> Stage B (real?)
                            -> p(jewelry) <= 0.12 -> NOT_JEWELRY
                            -> otherwise          -> NEEDS_REVIEW
         Stage B (real?)    -> p(real) >= 0.71    -> JEWELRY_REAL
                            -> p(real) <= 0.30    -> JEWELRY_NOT_REAL
                            -> otherwise          -> NEEDS_REVIEW
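The decision flow above can be sketched as a small routing function. This is a minimal illustration, not the project's own code: the names `classify`, `A_ACCEPT`, `A_REJECT`, `B_ACCEPT` and `B_REJECT` are hypothetical, and only the threshold values and output labels come from the flow above. Stage B is passed in as a callable so that it only runs when Stage A accepts the image, which is the point of a waterfall.

```python
from typing import Callable

# Thresholds copied from the decision flow; the constant names are illustrative.
A_ACCEPT, A_REJECT = 0.88, 0.12
B_ACCEPT, B_REJECT = 0.71, 0.30


def classify(p_jewelry: float, stage_b: Callable[[], float]) -> str:
    """Route one image through the two-stage waterfall.

    p_jewelry: Stage A probability that the image shows jewelry.
    stage_b:   zero-argument callable returning p(real); it is only
               invoked when Stage A is confident the image is jewelry.
    """
    if p_jewelry <= A_REJECT:
        return "NOT_JEWELRY"
    if p_jewelry < A_ACCEPT:
        return "NEEDS_REVIEW"  # Stage A is uncertain; Stage B never runs

    p_real = stage_b()
    if p_real >= B_ACCEPT:
        return "JEWELRY_REAL"
    if p_real <= B_REJECT:
        return "JEWELRY_NOT_REAL"
    return "NEEDS_REVIEW"  # Stage B is uncertain
```

For example, `classify(0.95, lambda: 0.80)` takes the Stage B branch and returns `"JEWELRY_REAL"`, while `classify(0.05, ...)` returns `"NOT_JEWELRY"` without evaluating Stage B at all.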
Temperature scaling is applied before softmax (Stage A: T=1.502, Stage B: T=1.397).
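In standard temperature scaling the logits are divided by T before the softmax, so T > 1 softens the output probabilities. The sketch below shows that step in plain Python; the function name `softmax_with_temperature` is hypothetical, and only the two temperature values come from this card.

```python
import math
from typing import List, Sequence


def softmax_with_temperature(logits: Sequence[float], temperature: float) -> List[float]:
    """Return softmax(logits / temperature) as a list of probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Temperatures quoted above; the logits are made-up illustration values.
T_STAGE_A, T_STAGE_B = 1.502, 1.397
raw = softmax_with_temperature([0.0, 3.0], 1.0)
calibrated = softmax_with_temperature([0.0, 3.0], T_STAGE_A)
```

With T > 1, `calibrated[1]` is lower than `raw[1]`: the same logits give a less extreme probability. Assuming the thresholds are applied to these calibrated probabilities, borderline images become more likely to land in the review band.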
Performance (4,406 test images)
| Metric | Value |
|---|---|
| Stage A jewelry recall | 99.78% |
| Stage B real precision | 95.1% |
| Stage B real recall | 93.7% |
| Total review rate | 8.1% |
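For readers who want a single Stage B number, precision and recall can be combined into an F1 score (their harmonic mean). The snippet below is plain arithmetic on the two values from the table; it is a derived figure, not a separately reported result, and the helper name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Stage B "real" precision and recall from the table above.
stage_b_f1 = f1_score(0.951, 0.937)
print(f"Stage B real F1: {stage_b_f1:.3f}")
```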
Files
- stageA_convnext_b_best.pt: Stage A checkpoint (state_dict)
- stageB_deit_s_clean_best.pt: Stage B checkpoint (state_dict)
- thresholds.json: Threshold/temperature configuration
Usage
from huggingface_hub import hf_hub_download
import timm, torch, json
from PIL import Image
from torchvision import transforms
# Download files
config = hf_hub_download("Valdos33/jewelry-photo-classifier", "thresholds.json")
ckpt_a = hf_hub_download("Valdos33/jewelry-photo-classifier", "stageA_convnext_b_best.pt")
ckpt_b = hf_hub_download("Valdos33/jewelry-photo-classifier", "stageB_deit_s_clean_best.pt")
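The snippet above stops after downloading the files. One possible continuation is sketched below under several assumptions that this card does not confirm:

- each checkpoint is a plain state_dict for a two-class head;
- the positive class ("jewelry" or "real") sits at index 1;
- inputs use standard ImageNet normalization;
- the DeiT model was fine-tuned at 512x512, so its positional embeddings match that size.

The schema of thresholds.json is not shown here, so the temperatures are hard-coded from the text above rather than read from the file. `StageSpec`, `load_stage` and `positive_probability` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageSpec:
    timm_name: str
    img_size: int
    temperature: float


# Values copied from the Architecture table and the temperature note.
STAGE_A = StageSpec("convnext_base.fb_in22k_ft_in1k", 384, 1.502)
STAGE_B = StageSpec("deit_small_patch16_224", 512, 1.397)


def load_stage(spec: StageSpec, ckpt_path: str, num_classes: int = 2):
    """Build the timm architecture and load a state_dict checkpoint into it."""
    import timm
    import torch  # heavy imports deferred until a model is actually built

    kwargs = {}
    if spec.timm_name.startswith("deit"):
        # ASSUMPTION: the ViT was fine-tuned at spec.img_size, so the
        # positional embeddings in the checkpoint are sized for it.
        kwargs["img_size"] = spec.img_size
    model = timm.create_model(
        spec.timm_name, pretrained=False, num_classes=num_classes, **kwargs
    )
    model.load_state_dict(torch.load(ckpt_path, map_location="cpu"))
    return model.eval()


def positive_probability(model, spec: StageSpec, image_path: str,
                         positive_index: int = 1) -> float:
    """Return the temperature-scaled probability of the positive class."""
    import torch
    from PIL import Image
    from torchvision import transforms

    preprocess = transforms.Compose([
        transforms.Resize((spec.img_size, spec.img_size)),
        transforms.ToTensor(),
        # ASSUMPTION: ImageNet mean/std, as is usual for these backbones.
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])
    batch = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        logits = model(batch)
    probs = torch.softmax(logits / spec.temperature, dim=1)
    return probs[0, positive_index].item()
```

With the downloaded paths, `model_a = load_stage(STAGE_A, ckpt_a)` followed by `positive_probability(model_a, STAGE_A, "photo.jpg")` would give p(jewelry) for a local file (`photo.jpg` is a placeholder). The result can then be compared against the Stage A thresholds before deciding whether to run Stage B.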
Built by BriteCo.