| --- |
| license: openmdw-1.0 |
| pipeline_tag: feature-extraction |
| tags: |
| - vision-language |
| - multimodal |
| - image-text-retrieval |
| - hyperbolic-embeddings |
| - clip |
| - research |
| - scratch-training |
| - hier-beta |
| - argent |
| --- |
| |
| # Hyper3-CLIP beta |
|
|
| Hyper3-CLIP beta is the hyper³labs ViT-B checkpoint trained from scratch (no |
| pretrained vision or text weights) with the hier-beta ARGENT objective. |
|
|
| This repository publishes the raw PyTorch training checkpoint for the completed |
| 500k-step paper-scratch run. It is not the older Hyper3-CLIP v0.5 |
| SentenceTransformers package. |
|
|
| ## Artifact |
|
|
| - Checkpoint: `checkpoint_final.pt` |
| - Config: `config.yaml` |
| - Training metadata: `metadata.json` |
| - Run: `hyper3_vitb_clip_uncha_hier_beta_argent_mp5_paper_scratch_8x500k_s31` |
| - Objective: `uncha` with `uncha_entailment_loss: hier_beta_argent` |
| - Vision backbone: `vit_base_patch16_224` |
| - Vision pretrained: `false` |
| - Text model architecture/tokenizer: `openai/clip-vit-base-patch32` |
| - Text pretrained: `false` |
| - Embedding dimension: 512 |
| - Training steps: 500,000 |
| - Global batch size: 768 |
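As a quick sanity check on the scale of the run, the step count and global batch size listed above imply how many image-text pairs were processed. This is a minimal sketch that assumes each training step consumes exactly one global batch; the variable names are illustrative and are not taken from `config.yaml`.

```python
# Values copied from the artifact list above.
training_steps = 500_000
global_batch_size = 768

# Assumption: one optimizer step processes one global batch of image-text pairs.
samples_seen = training_steps * global_batch_size

print(f"{samples_seen:,} image-text pairs processed")  # 384,000,000 image-text pairs processed
```

The figure counts pairs processed, including repeats if the dataset was traversed more than once.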
|
|
| ## Evaluation |
|
|
| The `eval/` directory includes the paper-comparable full benchmark table and the |
| raw wide-format summary row used for the current model comparison. |
|
|
| Headline row from the local full eval: |
|
|
| - ImageNet top-1: 46.984% |
| - COCO I2T/T2I R@10: 84.30% / 73.19% |
| - Flickr I2T/T2I R@10: 97.60% / 91.44% |
| - WordNet hierarchy: TIE 3.1597, LCA 2.0786, Jaccard 0.8179 |
| - PEP AUC/AP: 96.07% / 69.36% |
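For readers unfamiliar with the retrieval metric, R@10 is the percentage of queries for which at least one correct item appears among the 10 highest-scoring candidates. The sketch below illustrates that standard definition on a toy similarity matrix. It is not the evaluation code used for this release, and the function name `recall_at_k` and its arguments are hypothetical. Using a set of positives per query reflects that caption benchmarks commonly pair each image with several reference captions.

```python
def recall_at_k(similarity, positives, k=10):
    """Percentage of queries with at least one correct item in the top k.

    similarity: list of rows; similarity[q][c] is the score of candidate c for query q.
    positives:  list of sets; positives[q] holds the correct candidate indices for query q.
    """
    hits = 0
    for scores, correct in zip(similarity, positives):
        # Rank candidate indices by descending score and keep the first k.
        top_k = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)[:k]
        if correct.intersection(top_k):
            hits += 1
    return 100.0 * hits / len(similarity)


# Toy example: 3 queries, 4 candidates.
similarity = [
    [0.9, 0.1, 0.2, 0.3],  # correct candidate 0 is ranked first
    [0.8, 0.2, 0.1, 0.7],  # correct candidate 1 is ranked third
    [0.1, 0.2, 0.3, 0.9],  # correct candidate 2 is ranked second
]
positives = [{0}, {1}, {2}]

r1 = recall_at_k(similarity, positives, k=1)
r2 = recall_at_k(similarity, positives, k=2)
r3 = recall_at_k(similarity, positives, k=3)
```

In this toy case recall rises as `k` grows, because a larger cutoff admits lower-ranked correct candidates.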
|
|
| The checkpoint is strong on retrieval in the paper-comparable table, but weak on |
| several flat/fine-grained zero-shot datasets such as Food101, CUB, Flowers102, |
| Cars, and Aircraft. Treat this release as a research checkpoint, not a polished |
| production model. |
|
|
| ## Loading |
|
|
| This is a raw training checkpoint. Use the hyper³labs `hyper3-clip` codebase and |
| the included `config.yaml` to instantiate the model, then load |
| `checkpoint_final.pt`. |
|
|
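| ```python |
| import torch |
| |
| # weights_only=False unpickles arbitrary Python objects; only load files you trust. |
| checkpoint = torch.load("checkpoint_final.pt", map_location="cpu", weights_only=False) |
| # Use the "model" entry if present, otherwise treat the file as a bare state dict. |
| state_dict = checkpoint.get("model", checkpoint) |
| ``` |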
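Before passing the weights to a model, it can help to inspect the parameter names. Checkpoints saved from a `DistributedDataParallel`-wrapped model often carry a `module.` prefix on every key. Whether this checkpoint does is an assumption to verify, not something the card states. The helpers below are a hypothetical sketch, not part of the `hyper3-clip` codebase, and the example keys are invented for illustration.

```python
from collections import Counter


def strip_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading prefix removed from each key."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


def top_level_groups(state_dict):
    """Count entries per top-level module name (the part before the first dot)."""
    return Counter(key.split(".", 1)[0] for key in state_dict)


# Hypothetical keys for illustration; the real checkpoint's names may differ.
example = {
    "module.visual.proj.weight": 0,
    "module.visual.proj.bias": 0,
    "module.text.embed.weight": 0,
}
cleaned = strip_prefix(example)
groups = top_level_groups(cleaned)
```

With the real file, the same helpers can be applied to the loaded `state_dict` to see which top-level modules it contains before instantiating the model from `config.yaml`.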
|
|
| ## License And Attribution |
|
|
| The model materials in this repository are released under OpenMDW-1.0. |
| Redistributions should preserve `NOTICE`, `LICENSE`, and the model card when |
| practical. |
|
|
| Please cite and link to the original hyper³labs model repository when publishing |
| benchmarks, papers, derivative checkpoints, or public demos based on this model. |
|
|