miewid-msv3-latonia-1233

Finetuned checkpoint for Hula painted frog (Latonia nigriventer) photo-ID.
Lineage: EfficientNetV2 -> miewid-msv3 -> miewid-msv3-latonia-1233.

Model details

  • Base model: EfficientNetV2 (via miewid-msv3)
  • Task: Individual photo-identification via deep local-feature matching
  • Domain: Hula painted frog, ventral pattern images
  • Training data: 1,233 photos from the Latonia dataset (see the preprint for details)
  • Classifier head: an ArcFace head was used during training only; for re-identification, use the embeddings with cosine similarity (the classifier head is not needed).

Preprocessing

Images are preprocessed as described in the preprint:

  • Rotate images so the head is oriented upwards.
  • Detect a bounding box using MegaDetector and crop to the bbox.
  • Apply the transforms shown in the Usage example below.
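As a sketch of the cropping step: MegaDetector reports each detection with a bounding box in [x, y, width, height] format, normalized to the image size. A helper along these lines could apply the crop (the function name and the caller-supplied bbox are illustrative, not part of the released code):

```python
from PIL import Image


def crop_to_megadetector_bbox(img, bbox):
    # MegaDetector bboxes are [x, y, width, height] in coordinates
    # normalized to the image dimensions; convert to pixel coordinates.
    w, h = img.size
    x0, y0, bw, bh = bbox
    left, top = int(x0 * w), int(y0 * h)
    right, bottom = int((x0 + bw) * w), int((y0 + bh) * h)
    return img.crop((left, top, right, bottom))
```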

Training

See the preprint for full training details and evaluation protocol.

Usage

This repository provides a single PyTorch checkpoint:

  • miewid-msv3-latonia-1233.pt

Example (embed images and compute cosine similarity):

import torch
from PIL import Image
from torchvision import transforms
from transformers import AutoModel

class ZoomCenterCrop:
    """Center-crop to a square of side min(h, w) / zoom."""

    def __init__(self, zoom=1.0):
        self.zoom = zoom

    def __call__(self, img):
        w, h = img.size
        m = int(min(h, w) / self.zoom)
        left = (w - m) // 2
        top = (h - m) // 2
        return img.crop((left, top, left + m, top + m))


preprocess = transforms.Compose([
    ZoomCenterCrop(zoom=2.0),
    transforms.Resize((440, 440)),
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
])

# Load the base MiewID-msv3 architecture, then the finetuned Latonia weights.
model = AutoModel.from_pretrained("conservationxlabs/miewid-msv3", trust_remote_code=True)
ckpt = torch.load("miewid-msv3-latonia-1233.pt", map_location="cpu")
model.load_state_dict(ckpt["model"], strict=True)
model.eval()

def embed(path):
    img = Image.open(path).convert("RGB")
    x = preprocess(img).unsqueeze(0)
    with torch.no_grad():
        emb = model(x)
    # L2-normalize so dot products between embeddings are cosine similarities.
    return emb / emb.norm(dim=1, keepdim=True)

e1 = embed("img1.jpg")
e2 = embed("img2.jpg")
cosine_sim = (e1 @ e2.T).item()
print(cosine_sim)
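To match a query image against a gallery of known individuals, the pairwise comparison above extends naturally to a ranking. A minimal sketch, assuming all embeddings are already L2-normalized as in embed() (the helper name is ours, not part of the repository):

```python
import torch


def rank_matches(query_emb, gallery_embs, gallery_ids):
    # Both inputs are assumed L2-normalized, so the matrix product
    # yields cosine similarities directly.
    sims = (query_emb @ gallery_embs.T).squeeze(0)
    order = torch.argsort(sims, descending=True)
    # Return (individual id, similarity) pairs, best match first.
    return [(gallery_ids[i], sims[i].item()) for i in order]
```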

Intended use

  • Research on automated photo-ID of Hula painted frogs
  • Evaluation and reproduction of results in the associated preprint

Limitations

  • Trained on a specific dataset (1,233 images) and may not generalize to other species or imaging setups.
  • Performance depends on image quality and pose/lighting conditions.

Citation

If you use this model, please cite:

Yesharim, M., Bina Perl, R. G., Roll, U., Gafny, S., Geffen, E., Ram, Y.
"Near-perfect photo-ID of the Hula painted frog with zero-shot deep local-feature matching."
arXiv:2601.08798 (2026). https://arxiv.org/abs/2601.08798

License

See LICENSE in this repository.
