AIT-75M-GGUF: Quantized Audio, Image, Text Embeddings
GGUF-quantized versions of AIT-75M for edge deployment.
AIT-75M maps image, audio, and text into a shared 1280-dim embedding space with Matryoshka truncation support (1280/768/512/256/128 dims).
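Matryoshka truncation means a shorter embedding is obtained by keeping the leading components of the full vector. The following is a minimal sketch in plain Python, not the model's own API: `truncate_embedding` and `cosine` are hypothetical helper names, and re-normalizing to unit L2 norm after truncation is an assumption (this card does not say whether AIT-75M embeddings are normalized).

```python
import math

# Truncation sizes listed in this card.
MATRYOSHKA_DIMS = (1280, 768, 512, 256, 128)


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale to unit L2 norm.

    Re-normalization is an assumption made for cosine-similarity search.
    """
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"unsupported dim {dim}; expected one of {MATRYOSHKA_DIMS}")
    if len(vec) < dim:
        raise ValueError("embedding is shorter than the requested dim")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all-zero vector: nothing to normalize
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Both sides of a comparison should be truncated to the same size, for example `cosine(truncate_embedding(img, 256), truncate_embedding(txt, 256))`, since vectors of different lengths are not comparable.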
Available Quantizations
| File | Quant | Size | Compression | Notes |
|---|---|---|---|---|
| AIT-75M-q5_0.gguf | Corrected Q5_0 text encoder, Q8/F16/F32 remaining tensors | 106 MB | 2.7x | Smaller edge artifact; EMEL runtime golden validation pending |
| AIT-75M-q8_0.gguf | Q8_0-compatible encoders, F16/F32 projection heads | 114 MB | 2.5x | Minimal quality loss |
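As a quick sanity check on the table, size multiplied by compression ratio should give roughly the same unquantized size for both files. The sketch below assumes both ratios are measured against the same unquantized baseline, which this card does not state explicitly; `implied_baseline_mb` is a hypothetical helper name.

```python
# (file, size in MB, compression ratio) copied from the table above.
ARTIFACTS = [
    ("AIT-75M-q5_0.gguf", 106, 2.7),
    ("AIT-75M-q8_0.gguf", 114, 2.5),
]


def implied_baseline_mb(size_mb, ratio):
    """Unquantized size implied by a quantized size and its compression ratio."""
    return size_mb * ratio


for name, size_mb, ratio in ARTIFACTS:
    print(f"{name}: ~{implied_baseline_mb(size_mb, ratio):.0f} MB unquantized")
```

Both rows imply a baseline of about 285 MB, so the two ratios are consistent with each other under that assumption.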
Corrected q5 artifact
The current AIT-75M-q5_0.gguf is the corrected c63 build from the trimodal_3head_h1920_textft_v1 lineage. It keeps projection-head tensors aligned with the q8 release while applying Q5_0 to the quantizable text-encoder tensors. Exporter/dequant SALT checks pass; EMEL runtime golden validation is pending.
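For readers unfamiliar with Q5_0, the sketch below decodes a single block in plain Python. It follows the ggml reference layout as commonly documented (32 weights per block: a little-endian fp16 scale, 32 high bits, then 16 bytes of low nibbles, so 22 bytes or 5.5 bits per weight). It is an illustration only, not the exporter or the dequant check used for this release, and `dequantize_q5_0_block` is a hypothetical name.

```python
import struct

QK5_0 = 32                         # weights per Q5_0 block
BLOCK_BYTES = 2 + 4 + QK5_0 // 2   # fp16 scale + high bits + nibbles = 22


def dequantize_q5_0_block(block):
    """Decode one 22-byte Q5_0 block into 32 floats (assumed ggml layout)."""
    if len(block) != BLOCK_BYTES:
        raise ValueError(f"expected {BLOCK_BYTES} bytes, got {len(block)}")
    # d: fp16 scale; qh: the fifth (high) bit of each of the 32 weights.
    d, qh = struct.unpack_from("<eI", block, 0)
    qs = block[6:]  # 16 bytes, two 4-bit low parts per byte
    out = [0.0] * QK5_0
    for j in range(QK5_0 // 2):
        hi0 = ((qh >> j) & 1) << 4         # high bit for weight j
        hi1 = ((qh >> (j + 16)) & 1) << 4  # high bit for weight j + 16
        # 5-bit values are stored with an offset of 16.
        out[j] = (((qs[j] & 0x0F) | hi0) - 16) * d
        out[j + 16] = (((qs[j] >> 4) | hi1) - 16) * d
    return out
```

A whole tensor is a sequence of such blocks, so its row length must be a multiple of 32; tensors that do not fit that constraint are typically left in a wider type, which is consistent with the mixed Q5_0/Q8/F16/F32 description above.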
Source Model
See augmem/AIT-75M for full architecture details, benchmarks, and usage.
License
Apache 2.0