---
license: openmdw-1.0
pipeline_tag: feature-extraction
tags:
- vision-language
- multimodal
- image-text-retrieval
- hyperbolic-embeddings
- clip
- research
- scratch-training
- hier-beta
- argent
---

# Hyper3-CLIP beta

Hyper3-CLIP beta is the hyper³labs ViT-B scratch checkpoint trained with the
hier-beta ARGENT objective.

This repository publishes the raw PyTorch training checkpoint for the completed
500k-step paper-scratch run. It is not the older Hyper3-CLIP v0.5
SentenceTransformers package.

## Artifact

- Checkpoint: `checkpoint_final.pt`
- Config: `config.yaml`
- Training metadata: `metadata.json`
- Run: `hyper3_vitb_clip_uncha_hier_beta_argent_mp5_paper_scratch_8x500k_s31`
- Objective: `uncha` with `uncha_entailment_loss: hier_beta_argent`
- Vision backbone: `vit_base_patch16_224`
- Vision pretrained: `false`
- Text model architecture/tokenizer: `openai/clip-vit-base-patch32`
- Text pretrained: `false`
- Embedding dimension: 512
- Training steps: 500,000
- Global batch size: 768
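
For a sense of scale, the step count and global batch size listed above imply how many image-text pairs were processed during training. The following is a quick arithmetic check, assuming every step consumed exactly one full global batch; the card does not state this, and the variable names are illustrative only.

```python
# Values taken from the artifact list above.
training_steps = 500_000
global_batch_size = 768

# Assumption: every optimizer step consumed exactly one full global batch.
pairs_seen = training_steps * global_batch_size
print(f"{pairs_seen:,} image-text pairs processed")
```

This counts pairs processed, including repeats across epochs, not unique pairs in the training set.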

## Evaluation

The `eval/` directory includes the paper-comparable full benchmark table and the
raw wide summary row used for the current model comparison.

Headline numbers from the local full evaluation:

- ImageNet top-1: 46.984%
- COCO I2T/T2I R@10: 84.30 / 73.19
- Flickr I2T/T2I R@10: 97.60 / 91.44
- WordNet hierarchy: TIE 3.1597, LCA 2.0786, Jaccard 0.8179
- PEP AUC/AP: 96.07 / 69.36

The checkpoint is strong on retrieval in the paper-comparable table, but weak on
several flat/fine-grained zero-shot datasets such as Food101, CUB, Flowers102,
Cars, and Aircraft. Treat this release as a research checkpoint, not a polished
production model.
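
For readers unfamiliar with the R@10 figures above, here is a minimal sketch of how Recall@K is commonly computed for retrieval. It is not the evaluation code used for this checkpoint. It assumes a query-by-candidate score matrix where higher means more similar, and a set of correct candidate indices per query; `recall_at_k`, `toy_scores`, and `toy_positives` are hypothetical names.

```python
from typing import Sequence, Set


def recall_at_k(
    scores: Sequence[Sequence[float]],
    positives: Sequence[Set[int]],
    k: int,
) -> float:
    """Percentage of queries with at least one positive among the k top-scoring candidates."""
    hits = 0
    for row, gold in zip(scores, positives):
        # Candidate indices sorted by descending score; ties keep index order.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if gold.intersection(ranked[:k]):
            hits += 1
    return 100.0 * hits / len(scores)


# Toy example: 3 queries, 3 candidates, one correct candidate per query.
toy_scores = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.5, 0.6, 0.1],
]
toy_positives = [{0}, {1}, {2}]

for k in (1, 2, 3):
    print(k, round(recall_at_k(toy_scores, toy_positives, k), 2))
```

A query counts as a hit if any of its positives appears in the top K, which matters for datasets where one image has several reference captions.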

## Loading

This is a raw training checkpoint. Use the hyper³labs `hyper3-clip` codebase and
the included `config.yaml` to instantiate the model, then load
`checkpoint_final.pt`.

```python
import torch

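# weights_only=False unpickles arbitrary Python objects, so only load files you trust.
checkpoint = torch.load("checkpoint_final.pt", map_location="cpu", weights_only=False)
# Use the "model" entry if present; otherwise treat the file as a bare state dict.
state_dict = checkpoint.get("model", checkpoint)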
```
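
Raw training checkpoints saved from a `torch.nn.parallel.DistributedDataParallel` wrapper carry a `module.` prefix on every parameter name, which prevents loading into an unwrapped model. Whether this checkpoint has such a prefix is not stated here, so treat the following as a sketch to apply only if the keys call for it; `strip_prefix` is a hypothetical helper, not part of the `hyper3-clip` codebase.

```python
from typing import Any, Dict


def strip_prefix(state_dict: Dict[str, Any], prefix: str = "module.") -> Dict[str, Any]:
    """Return a copy of state_dict with `prefix` removed from keys that start with it."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Toy demonstration with plain values standing in for tensors.
example = {"module.visual.proj": 1, "logit_scale": 2}
print(strip_prefix(example))
```

Inspecting a few entries of `state_dict.keys()` first shows whether any prefix is present before passing the cleaned dictionary to the model's `load_state_dict`.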

## License and Attribution

The model materials in this repository are released under OpenMDW-1.0.
Redistributions should preserve `NOTICE`, `LICENSE`, and the model card when
practical.

Please cite and link to the original hyper³labs model repository when publishing
benchmarks, papers, derivative checkpoints, or public demos based on this model.