	---
	license: openmdw-1.0
	pipeline_tag: feature-extraction
	tags:
	- vision-language
	- multimodal
	- image-text-retrieval
	- hyperbolic-embeddings
	- clip
	- research
	- scratch-training
	- hier-beta
	- argent
	---

	# Hyper3-CLIP beta

	Hyper3-CLIP beta is a hyper³labs ViT-B checkpoint trained from scratch (no
	pretrained vision or text weights) with the hier-beta ARGENT objective.

	This repository publishes the raw PyTorch training checkpoint for the completed
	500k-step paper-scratch run. It is not the older Hyper3-CLIP v0.5
	SentenceTransformers package.

	## Artifact

	- Checkpoint: `checkpoint_final.pt`
	- Config: `config.yaml`
	- Training metadata: `metadata.json`
	- Run: `hyper3_vitb_clip_uncha_hier_beta_argent_mp5_paper_scratch_8x500k_s31`
	- Objective: `uncha` with `uncha_entailment_loss: hier_beta_argent`
	- Vision backbone: `vit_base_patch16_224`
	- Vision pretrained: `false`
	- Text model architecture/tokenizer: `openai/clip-vit-base-patch32`
	- Text pretrained: `false`
	- Embedding dimension: 512
	- Training steps: 500,000
	- Global batch size: 768
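
As a rough scale check, the step count and global batch size listed above imply how many image-text pairs the run consumed. The calculation below is only arithmetic on those two figures, assuming one global batch per training step. The variable names are illustrative and are not taken from `config.yaml`.

```python
# Figures taken from the Artifact list in this model card.
training_steps = 500_000
global_batch_size = 768

# Assumption: each training step consumes exactly one global batch, so the
# total is steps * batch (pairs repeated across epochs are counted each time).
samples_seen = training_steps * global_batch_size

print(f"{samples_seen:,} samples seen")  # 384,000,000 samples seen
```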

	## Evaluation

	The `eval/` directory includes the paper-comparable full benchmark table and the
	raw wide summary row used for the current model comparison.

	Headline row from the local full eval:

	- ImageNet top-1: 46.984%
	- COCO I2T/T2I R@10: 84.30 / 73.19
	- Flickr I2T/T2I R@10: 97.60 / 91.44
	- WordNet hierarchy: TIE 3.1597, LCA 2.0786, Jaccard 0.8179
	- PEP AUC/AP: 96.07 / 69.36

	The checkpoint is strong on retrieval in the paper-comparable table, but weak on
	several flat/fine-grained zero-shot datasets such as Food101, CUB, Flowers102,
	Cars, and Aircraft. Treat this release as a research checkpoint, not a polished
	production model.
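
For readers unfamiliar with the R@10 figures above, the sketch below shows how a recall@K number is conventionally computed from a query-by-candidate score matrix. It is not the evaluation code behind the `eval/` tables. The function name, the toy data, and the assumption that higher scores mean better matches are illustrative; a hyperbolic model may well score pairs differently, for example by negative distance.

```python
from typing import Sequence, Set


def recall_at_k(scores: Sequence[Sequence[float]],
                positives: Sequence[Set[int]],
                k: int) -> float:
    """Percentage of queries with at least one correct candidate in the top k.

    scores[q][c] is the similarity of query q to candidate c (higher is better);
    positives[q] is the set of candidate indices that are correct for query q.
    """
    hits = 0
    for query_scores, correct in zip(scores, positives):
        # Rank candidate indices by descending score and keep the first k.
        ranked = sorted(range(len(query_scores)),
                        key=lambda c: query_scores[c], reverse=True)
        if correct.intersection(ranked[:k]):
            hits += 1
    return 100.0 * hits / len(scores)


# Toy example: 3 queries, 4 candidates (made-up numbers).
scores = [
    [0.9, 0.1, 0.3, 0.2],  # correct candidate 0 is ranked first
    [0.2, 0.4, 0.8, 0.1],  # correct candidate 1 is ranked second
    [0.5, 0.6, 0.7, 0.1],  # correct candidate 3 is ranked last
]
positives = [{0}, {1}, {3}]

r1 = recall_at_k(scores, positives, k=1)  # 1 of 3 queries hit
r2 = recall_at_k(scores, positives, k=2)  # 2 of 3 queries hit
print(f"R@1 = {r1:.2f}, R@2 = {r2:.2f}")
```

Using a set of positives per query covers datasets such as COCO, where an image has several reference captions and any one of them in the top K counts as a hit.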

	## Loading

	This is a raw training checkpoint. Use the hyper³labs `hyper3-clip` codebase and
	the included `config.yaml` to instantiate the model, then load
	`checkpoint_final.pt`.

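	```python
	import torch

	# weights_only=False unpickles arbitrary Python objects, so only load a
	# checkpoint file you trust.
	checkpoint = torch.load("checkpoint_final.pt", map_location="cpu", weights_only=False)
	# Use the "model" entry when the checkpoint wraps the weights in a dict;
	# otherwise treat the loaded object as the state dict itself.
	state_dict = checkpoint.get("model", checkpoint)
	```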
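
Before handing `state_dict` to a model, it can help to look at how its keys are organized. The helpers below are a minimal sketch in plain Python, not part of the `hyper3-clip` codebase. Whether this checkpoint actually carries a `module.` prefix (as checkpoints saved from a distributed wrapper often do) is an assumption, and the key names in the example are made up.

```python
from collections import Counter
from typing import Any, Dict, Mapping


def strip_prefix(state_dict: Mapping[str, Any], prefix: str = "module.") -> Dict[str, Any]:
    """Return a copy with `prefix` removed from every key that starts with it."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


def top_level_groups(state_dict: Mapping[str, Any]) -> Counter:
    """Count entries per first dotted key component, e.g. 'visual' in 'visual.proj'."""
    return Counter(key.split(".", 1)[0] for key in state_dict)


# Stand-in for the real state_dict; these key names are hypothetical.
example = {"module.visual.proj": 0, "module.visual.norm": 1, "module.text.embed": 2}
cleaned = strip_prefix(example)
print(top_level_groups(cleaned))
```

With the real checkpoint, pass the loaded `state_dict` instead of `example`. Once the model has been instantiated from `config.yaml`, calling its `load_state_dict(cleaned)` with the default strict checking should report any missing or unexpected keys.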

	## License And Attribution

	The model materials in this repository are released under OpenMDW-1.0.
	Redistributions should preserve `NOTICE`, `LICENSE`, and the model card when
	practical.

	Please cite and link to the original hyper³labs model repository when publishing
	benchmarks, papers, derivative checkpoints, or public demos based on this model.