license: openmdw-1.0
pipeline_tag: feature-extraction
tags:
  - vision-language
  - multimodal
  - image-text-retrieval
  - hyperbolic-embeddings
  - clip
  - research
  - scratch-training
  - hier-beta
  - argent

Hyper3-CLIP beta

Hyper3-CLIP beta is the hyper³labs ViT-B checkpoint trained from scratch with the hier-beta ARGENT objective.

This repository publishes the raw PyTorch training checkpoint for the completed 500k-step paper-scratch run. It is not the older Hyper3-CLIP v0.5 SentenceTransformers package.

Artifact

Checkpoint: checkpoint_final.pt
Config: config.yaml
Training metadata: metadata.json
Run: hyper3_vitb_clip_uncha_hier_beta_argent_mp5_paper_scratch_8x500k_s31
Objective: uncha with uncha_entailment_loss: hier_beta_argent
Vision backbone: vit_base_patch16_224
Vision pretrained: false
Text model architecture/tokenizer: openai/clip-vit-base-patch32
Text pretrained: false
Embedding dimension: 512
Training steps: 500,000
Global batch size: 768
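
As a rough sense of scale, the step count and global batch size listed above can be multiplied to estimate how many image-text pairs the run processed. This is a back-of-the-envelope sketch: it assumes every step consumed one full global batch, and it counts pairs processed (with repeats), not unique training examples.

```python
# Values copied from the artifact list above.
training_steps = 500_000
global_batch_size = 768

# Assumption: each step processes exactly one full global batch.
samples_seen = training_steps * global_batch_size

print(f"{samples_seen:,} image-text pairs processed")  # 384,000,000
```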

Evaluation

The eval/ directory includes the paper-comparable full benchmark table and the raw wide summary row used for the current model comparison.

Headline row from the local full eval:

ImageNet top-1: 46.984%
COCO I2T/T2I R@10: 84.30 / 73.19
Flickr I2T/T2I R@10: 97.60 / 91.44
WordNet hierarchy: TIE 3.1597, LCA 2.0786, Jaccard 0.8179
PEP AUC/AP: 96.07 / 69.36

The checkpoint is strong on retrieval in the paper-comparable table, but weak on several flat/fine-grained zero-shot datasets such as Food101, CUB, Flowers102, Cars, and Aircraft. Treat this release as a research checkpoint, not a polished production model.
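
For readers unfamiliar with the R@10 figures in the headline row, the following is a minimal sketch of how recall@K is commonly computed from a query-by-candidate score matrix. It is not the evaluation code behind the numbers above: the function name `recall_at_k` and the toy matrix are illustrative, and the sketch assumes exactly one correct candidate per query (the one at the same index), whereas COCO and Flickr pair each image with several captions.

```python
from typing import Sequence


def recall_at_k(scores: Sequence[Sequence[float]], k: int = 10) -> float:
    """Percentage of queries whose matching candidate (same index) ranks in the top k."""
    hits = 0
    for query_index, row in enumerate(scores):
        # Candidate indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if query_index in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(scores)


# Toy 3x3 matrix: entry [i][j] scores image i against caption j.
toy_scores = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.1, 0.2, 0.7],
]

print(recall_at_k(toy_scores, k=1))
print(recall_at_k(toy_scores, k=2))
```

Any score where higher means more similar can fill the matrix, so the same ranking logic applies whether the model compares embeddings by cosine similarity or by a negated hyperbolic distance.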

Loading

This is a raw training checkpoint. Use the hyper³labs hyper3-clip codebase and the included config.yaml to instantiate the model, then load checkpoint_final.pt.

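import torch

# weights_only=False unpickles arbitrary Python objects, so only load a checkpoint file you trust.
checkpoint = torch.load("checkpoint_final.pt", map_location="cpu", weights_only=False)
# Use the "model" entry when present; otherwise treat the whole object as the state dict.
state_dict = checkpoint.get("model", checkpoint)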
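
Raw training checkpoints sometimes store parameter names with a wrapper prefix such as `module.` (added by distributed training wrappers). Whether this checkpoint does is not stated here, so the sketch below is a defensive helper under that assumption, not part of the hyper3-clip codebase. The names `extract_state_dict` and `load_state_dict_from_file` are hypothetical; the `"model"` key mirrors the snippet above.

```python
from collections import OrderedDict
from typing import Any, Mapping


def extract_state_dict(
    checkpoint: Mapping[str, Any],
    key: str = "model",
    prefix: str = "module.",
) -> "OrderedDict[str, Any]":
    """Return the weights from a raw checkpoint, dropping an optional name prefix."""
    # Fall back to the checkpoint itself when it has no nested "model" entry.
    state_dict = checkpoint.get(key, checkpoint)
    cleaned: "OrderedDict[str, Any]" = OrderedDict()
    for name, value in state_dict.items():
        # Strip the wrapper prefix only where it is actually present.
        if name.startswith(prefix):
            name = name[len(prefix):]
        cleaned[name] = value
    return cleaned


def load_state_dict_from_file(path: str) -> "OrderedDict[str, Any]":
    """Load a trusted checkpoint file on CPU and return its cleaned state dict."""
    import torch  # imported lazily so extract_state_dict works without PyTorch

    # weights_only=False runs the full unpickler; use it only on trusted files.
    checkpoint = torch.load(path, map_location="cpu", weights_only=False)
    return extract_state_dict(checkpoint)
```

Once the model has been instantiated from config.yaml with the hyper3-clip codebase, the returned mapping can be passed to the model's `load_state_dict` method.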

License And Attribution

The model materials in this repository are released under OpenMDW-1.0. Redistributions should preserve NOTICE, LICENSE, and the model card when practical.

Please cite and link to the original hyper³labs model repository when publishing benchmarks, papers, derivative checkpoints, or public demos based on this model.