---
license: openmdw-1.0
pipeline_tag: feature-extraction
tags:
- vision-language
- multimodal
- image-text-retrieval
- hyperbolic-embeddings
- clip
- research
- scratch-training
- hier-beta
- argent
---

# Hyper3-CLIP beta

Hyper3-CLIP beta is the hyper³labs ViT-B scratch checkpoint trained with the hier-beta ARGENT objective.

This repository publishes the raw PyTorch training checkpoint for the completed 500k-step paper-scratch run. It is not the older Hyper3-CLIP v0.5 SentenceTransformers package.

## Artifact

- Checkpoint: `checkpoint_final.pt`
- Config: `config.yaml`
- Training metadata: `metadata.json`
- Run: `hyper3_vitb_clip_uncha_hier_beta_argent_mp5_paper_scratch_8x500k_s31`
- Objective: `uncha` with `uncha_entailment_loss: hier_beta_argent`
- Vision backbone: `vit_base_patch16_224`
- Vision pretrained: `false`
- Text model architecture/tokenizer: `openai/clip-vit-base-patch32`
- Text pretrained: `false`
- Embedding dimension: 512
- Training steps: 500,000
- Global batch size: 768

## Evaluation

The `eval/` directory includes the paper-comparable full benchmark table and the raw wide summary row used for the current model comparison.

Headline row from the local full eval:

- ImageNet top-1: 46.984%
- COCO I2T/T2I R@10: 84.30 / 73.19
- Flickr I2T/T2I R@10: 97.60 / 91.44
- WordNet hierarchy: TIE 3.1597, LCA 2.0786, Jaccard 0.8179
- PEP AUC/AP: 96.07 / 69.36

The checkpoint is strong on retrieval in the paper-comparable table, but weak on several flat/fine-grained zero-shot datasets such as Food101, CUB, Flowers102, Cars, and Aircraft. Treat this release as a research checkpoint, not a polished production model.

## Loading

This is a raw training checkpoint. Use the hyper³labs `hyper3-clip` codebase and the included `config.yaml` to instantiate the model, then load `checkpoint_final.pt`.

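Raw training checkpoints are sometimes saved from a wrapped model, in which case every parameter name carries a wrapper prefix (`module.` from `DistributedDataParallel`, `_orig_mod.` from `torch.compile`) and the keys no longer match a freshly instantiated model. Whether this checkpoint has such prefixes is an assumption, so inspect the keys before relying on the sketch here. The helper name `strip_wrapper_prefixes` and the toy key names are hypothetical and not part of the `hyper3-clip` codebase; the helper is meant to be applied to the `state_dict` extracted in the loading snippet that follows.

```python
from collections import OrderedDict

# Prefixes added by common PyTorch wrappers. Whether this checkpoint
# contains either of them is an assumption; inspect the keys first.
WRAPPER_PREFIXES = ("module.", "_orig_mod.")


def strip_wrapper_prefixes(state_dict, prefixes=WRAPPER_PREFIXES):
    """Return a copy of state_dict with leading wrapper prefixes removed.

    Hypothetical helper, not part of the hyper3-clip codebase.
    """
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key
        stripped = True
        # Repeat until no prefix matches, so nested wrappers are handled.
        while stripped:
            stripped = False
            for prefix in prefixes:
                if new_key.startswith(prefix):
                    new_key = new_key[len(prefix):]
                    stripped = True
        if new_key in cleaned:
            raise KeyError(f"duplicate key after stripping: {new_key}")
        cleaned[new_key] = value
    return cleaned


# Toy example with placeholder values instead of tensors.
# The key names are illustrative only.
example = {
    "module.visual.proj": 1,
    "_orig_mod.module.text.proj": 2,
    "logit_scale": 3,
}
cleaned_example = strip_wrapper_prefixes(example)
print(list(cleaned_example))  # ['visual.proj', 'text.proj', 'logit_scale']
```

Only leading prefixes are removed, so a key such as `a.module.b` is left alone. A model whose top-level attribute is genuinely named `module` would be affected, so check the cleaned keys against the instantiated model before calling `model.load_state_dict(...)`.
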
```python
import torch

# weights_only=False unpickles arbitrary Python objects, so only load
# checkpoint files obtained from a source you trust.
checkpoint = torch.load("checkpoint_final.pt", map_location="cpu", weights_only=False)

# Use the "model" entry if the checkpoint has one; otherwise treat the
# whole checkpoint as the state dict.
state_dict = checkpoint.get("model", checkpoint)
```

## License And Attribution

The model materials in this repository are released under OpenMDW-1.0. Redistributions should preserve `NOTICE`, `LICENSE`, and the model card when practical.

Please cite and link to the original hyper³labs model repository when publishing benchmarks, papers, derivative checkpoints, or public demos based on this model.