# Near-perfect photo-ID of the Hula painted frog with zero-shot deep local-feature matching
Finetuned checkpoint for Hula painted frog (*Latonia nigriventer*) photo-ID.

Lineage: EfficientNetV2 -> miewid-msv3 -> miewid-msv3-latonia-1233.

Images are preprocessed as described in the preprint (a zoom-in center crop followed by a resize to 440×440; see the example below). See the preprint for full training details and the evaluation protocol.

This repository provides a single PyTorch checkpoint:
- `miewid-msv3-latonia-1233.pt`

Example (embed two images and compute their cosine similarity):

```python
import torch
from PIL import Image
from torchvision import transforms
from transformers import AutoModel


class ZoomCenterCrop:
    """Center-crop the image to 1/zoom of its shorter side."""

    def __init__(self, zoom=1.0):
        self.zoom = zoom

    def __call__(self, img):
        w, h = img.size
        m = int(min(h, w) / self.zoom)
        left = (w - m) // 2
        top = (h - m) // 2
        return img.crop((left, top, left + m, top + m))


preprocess = transforms.Compose([
    ZoomCenterCrop(zoom=2.0),
    transforms.Resize((440, 440)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
])

# Load the base MiewID architecture, then the finetuned weights.
model = AutoModel.from_pretrained("conservationxlabs/miewid-msv3", trust_remote_code=True)
ckpt = torch.load("miewid-msv3-latonia-1233.pt", map_location="cpu")
model.load_state_dict(ckpt["model"], strict=True)
model.eval()


def embed(path):
    """Return an L2-normalized embedding for the image at `path`."""
    img = Image.open(path).convert("RGB")
    x = preprocess(img).unsqueeze(0)
    with torch.no_grad():
        emb = model(x)
    return emb / emb.norm(dim=1, keepdim=True)


e1 = embed("img1.jpg")
e2 = embed("img2.jpg")
cosine_sim = (e1 @ e2.T).item()
print(cosine_sim)
```
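For photo-ID you typically compare one new sighting against a gallery of known individuals rather than a single pair. A minimal sketch of that ranking step is below; it is not part of the model card, and the random tensors (with an illustrative embedding dimension) stand in for the L2-normalized outputs of `embed()` above.

```python
import torch

torch.manual_seed(0)


def l2_normalize(x):
    # Same normalization embed() applies to the model output.
    return x / x.norm(dim=1, keepdim=True)


# Stand-ins for real embeddings; 512 is an illustrative dimension.
gallery = l2_normalize(torch.randn(5, 512))  # 5 known individuals
query = l2_normalize(torch.randn(1, 512))    # one new sighting

# Cosine similarity of the query to every gallery entry, best match first.
sims = (query @ gallery.T).squeeze(0)
ranked = torch.argsort(sims, descending=True)
print("best match index:", ranked[0].item())
print("similarity:", sims[ranked[0]].item())
```

Because all embeddings are unit-length, the matrix product is exactly the cosine similarity, so sorting it gives the identity ranking directly.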
If you use this model, please cite:
Yesharim, M., Bina Perl, R. G., Roll, U., Gafny, S., Geffen, E., Ram, Y.
"Near-perfect photo-ID of the Hula painted frog with zero-shot deep local-feature matching."
arXiv:2601.08798 (2026). https://arxiv.org/abs/2601.08798
See LICENSE in this repository.