metadata
license: openmdw-1.0
pipeline_tag: feature-extraction
tags:
  - vision-language
  - multimodal
  - image-text-retrieval
  - hyperbolic-embeddings
  - clip
  - research
  - scratch-training
  - hier-beta
  - argent

Hyper3-CLIP beta

Hyper3-CLIP beta is the hyper³labs ViT-B checkpoint trained from scratch with the hier-beta ARGENT objective.

This repository publishes the raw PyTorch training checkpoint for the completed 500k-step paper-scratch run. It is not the older Hyper3-CLIP v0.5 SentenceTransformers package.

Artifact

  • Checkpoint: checkpoint_final.pt
  • Config: config.yaml
  • Training metadata: metadata.json
  • Run: hyper3_vitb_clip_uncha_hier_beta_argent_mp5_paper_scratch_8x500k_s31
  • Objective: uncha with uncha_entailment_loss: hier_beta_argent
  • Vision backbone: vit_base_patch16_224
  • Vision pretrained: false
  • Text model architecture/tokenizer: openai/clip-vit-base-patch32
  • Text pretrained: false
  • Embedding dimension: 512
  • Training steps: 500,000
  • Global batch size: 768
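
As a rough sense of training scale, the step count and global batch size listed above can be multiplied out. This is a minimal sketch that assumes every optimizer step consumed exactly one full global batch; the variable names are illustrative and do not come from config.yaml.

```python
# Values copied from the Artifact list above.
training_steps = 500_000
global_batch_size = 768

# Assumption: one full global batch per step. The product counts image-text
# pairs processed, including any repeats across epochs.
pairs_seen = training_steps * global_batch_size
print(f"{pairs_seen:,} image-text pairs processed")  # 384,000,000
```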

Evaluation

The eval/ directory includes the paper-comparable full benchmark table and the raw wide summary row used for the current model comparison.

Headline row from the local full eval:

  • ImageNet top-1 (%): 46.984
  • COCO I2T/T2I R@10 (%): 84.30 / 73.19
  • Flickr I2T/T2I R@10 (%): 97.60 / 91.44
  • WordNet hierarchy: TIE 3.1597, LCA 2.0786, Jaccard 0.8179
  • PEP AUC/AP (%): 96.07 / 69.36
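
For readers unfamiliar with the R@10 figures, the sketch below shows the usual definition of Recall@K for retrieval: the percentage of queries for which at least one correct item appears among the K highest-scoring candidates. It is not the evaluation code used for this checkpoint; `recall_at_k` and the toy scores are hypothetical, and the similarity function the model actually uses is not shown here.

```python
from typing import Sequence, Set


def recall_at_k(
    scores: Sequence[Sequence[float]],
    positives: Sequence[Set[int]],
    k: int = 10,
) -> float:
    """Percentage of queries with at least one correct item in the top k."""
    hits = 0
    for row, correct in zip(scores, positives):
        # Indices of the k highest-scoring candidates for this query.
        top_k = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        if correct.intersection(top_k):
            hits += 1
    return 100.0 * hits / len(scores)


# Toy data: 3 queries scored against 4 candidates (higher is more similar).
toy_scores = [
    [0.9, 0.1, 0.3, 0.2],
    [0.2, 0.4, 0.8, 0.1],
    [0.5, 0.6, 0.1, 0.7],
]
# Each query may have several correct candidates, as with multiple captions
# per image; here each query has exactly one.
toy_positives = [{0}, {1}, {2}]

print(recall_at_k(toy_scores, toy_positives, k=1))
print(recall_at_k(toy_scores, toy_positives, k=2))
```

In the toy data, one of three queries is answered correctly at K=1 and two of three at K=2.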

The checkpoint is strong on retrieval in the paper-comparable table, but weak on several flat/fine-grained zero-shot datasets such as Food101, CUB, Flowers102, Cars, and Aircraft. Treat this release as a research checkpoint, not a polished production model.

Loading

This is a raw training checkpoint. Use the hyper³labs hyper3-clip codebase and the included config.yaml to instantiate the model, then load checkpoint_final.pt.

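import torch

# weights_only=False unpickles arbitrary objects; load only files you trust.
checkpoint = torch.load("checkpoint_final.pt", map_location="cpu", weights_only=False)
# Use the "model" entry if present, else treat the checkpoint as the state dict.
state_dict = checkpoint.get("model", checkpoint)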
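
Before passing `state_dict` to a model, it can help to inspect its key names. The following is a minimal sketch under two assumptions that this repository does not confirm: that the keys might carry a `module.` prefix (as happens when a model is saved while wrapped in PyTorch's DistributedDataParallel), and that grouping keys by their first dotted component is a useful summary. The helper names and the example keys are hypothetical.

```python
from collections import Counter
from typing import Any, Dict


def strip_prefix(state_dict: Dict[str, Any], prefix: str = "module.") -> Dict[str, Any]:
    """Return a copy with `prefix` removed from every key that starts with it."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


def summarize_top_level(state_dict: Dict[str, Any]) -> Counter:
    """Count entries per top-level name (the text before the first dot)."""
    return Counter(key.split(".", 1)[0] for key in state_dict)


# Stand-in dictionary with made-up key names; the real checkpoint's keys
# may look different.
example = {
    "module.visual.proj": 0,
    "module.visual.blocks.0.attn.qkv.weight": 0,
    "module.textual.embed.weight": 0,
    "module.logit_scale": 0,
}

cleaned = strip_prefix(example)
for name, count in summarize_top_level(cleaned).items():
    print(name, count)
```

With the real checkpoint, pass the `state_dict` obtained above in place of `example`, then compare the printed names against the model built from config.yaml.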

License And Attribution

The model materials in this repository are released under OpenMDW-1.0. Redistributions should preserve NOTICE, LICENSE, and the model card when practical.

Please cite and link to the original hyper³labs model repository when publishing benchmarks, papers, derivative checkpoints, or public demos based on this model.