AIT-75M-GGUF — Quantized Audio, Image, Text Embeddings

GGUF-quantized versions of AIT-75M for edge deployment.

AIT-75M maps image, audio, and text into a shared 1280-dim embedding space with Matryoshka truncation support (1280/768/512/256/128 dims).
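
As a minimal sketch of what Matryoshka truncation means in practice: keep the first N components of a full 1280-dim embedding and re-normalize. The function name `truncate_embedding` is hypothetical, and the re-normalization step is an assumption based on common practice for Matryoshka-style embeddings; this card does not specify it.

```python
import math

# Truncation sizes listed in this card.
SUPPORTED_DIMS = (1280, 768, 512, 256, 128)

def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale to unit length.

    Hypothetical helper: re-normalizing after truncation is assumed,
    not confirmed by the model card.
    """
    if dim not in SUPPORTED_DIMS:
        raise ValueError(f"unsupported dim {dim}; expected one of {SUPPORTED_DIMS}")
    if len(vec) < dim:
        raise ValueError("embedding is shorter than the requested dim")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    # Leave an all-zero vector untouched rather than dividing by zero.
    return [x / norm for x in head] if norm > 0 else head

def cosine(a, b):
    """Cosine similarity for two unit-length vectors of equal size."""
    return sum(x * y for x, y in zip(a, b))

# Usage: compare two embeddings at a reduced size.
full_a = [1.0] * 1280
full_b = [1.0] * 640 + [-1.0] * 640
small_a = truncate_embedding(full_a, 256)
small_b = truncate_embedding(full_b, 256)
similarity = cosine(small_a, small_b)
```

Both sides of a comparison need to be truncated to the same size, since vectors of different lengths cannot be compared directly.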

Available Quantizations

File	Quant	Size	Compression	Notes
AIT-75M-q5_0.gguf	Corrected Q5_0 text encoder, Q8/F16/F32 remaining tensors	106 MB	2.7x	Smaller edge artifact; EMEL runtime golden validation pending
AIT-75M-q8_0.gguf	Q8_0-compatible encoders, F16/F32 projection heads	114 MB	2.5x	Minimal quality loss
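
The card does not state what baseline the Compression column is measured against. As a rough check, multiplying each file size by its ratio gives the implied uncompressed size; the helper name below is hypothetical and the inputs are copied from the table.

```python
# (size in MB, compression ratio) as listed in the table.
artifacts = {
    "AIT-75M-q5_0.gguf": (106, 2.7),
    "AIT-75M-q8_0.gguf": (114, 2.5),
}

def implied_baseline_mb(size_mb, ratio):
    """Uncompressed size implied by a file size and its compression ratio."""
    return size_mb * ratio

for name, (size_mb, ratio) in artifacts.items():
    print(f"{name}: ~{implied_baseline_mb(size_mb, ratio):.0f} MB uncompressed")
```

Both rows imply a baseline of roughly 285 MB, so the two ratios appear to be measured against the same reference file.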

Corrected q5 artifact

The current AIT-75M-q5_0.gguf is the corrected c63 build from the trimodal_3head_h1920_textft_v1 lineage. It keeps projection-head tensors aligned with the q8 release while applying Q5_0 to the quantizable text-encoder tensors. Exporter/dequant SALT checks pass; EMEL runtime golden validation is pending.
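
For readers unfamiliar with Q5_0, the sketch below illustrates the general ggml-style scheme as it is commonly described: 32 weights share one FP16 scale, and each weight is stored as a 5-bit code. This is an illustration in pure Python, not the exporter used to build this artifact. The function names are hypothetical, and the bit packing of codes into bytes is omitted.

```python
import struct

BLOCK = 32  # weights per Q5_0 block

# Storage per block: 2 bytes FP16 scale + 4 bytes of high bits + 16 bytes of low nibbles.
Q5_0_BYTES_PER_BLOCK = 2 + 4 + 16
Q5_0_BITS_PER_WEIGHT = Q5_0_BYTES_PER_BLOCK * 8 / BLOCK  # 5.5

def to_fp16(x):
    """Round a Python float to the nearest FP16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def quantize_q5_0(block):
    """Return (scale, codes) for one block of 32 floats; codes are in 0..31."""
    assert len(block) == BLOCK
    # The element with the largest magnitude (sign kept) maps to code 0.
    signed_max = max(block, key=abs)
    d = signed_max / -16.0
    inv = 1.0 / d if d != 0.0 else 0.0
    # Shift by 16 to make codes non-negative, round to nearest, clamp to 5 bits.
    codes = [min(31, int(v * inv + 16.5)) for v in block]
    return to_fp16(d), codes

def dequantize_q5_0(d, codes):
    """Reconstruct approximate floats from a scale and 5-bit codes."""
    return [(q - 16) * d for q in codes]

# Usage: round-trip one block and measure the worst-case error.
weights = [i / 31 - 0.5 for i in range(BLOCK)]
scale, codes = quantize_q5_0(weights)
restored = dequantize_q5_0(scale, codes)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

The per-block scale is why Q5_0 costs 5.5 bits per weight rather than exactly 5.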

Source Model

See augmem/AIT-75M for full architecture details, benchmarks, and usage.

License

Apache 2.0

Format: GGUF
Model size: 75.4M params
Architecture: omniembed
Available bit widths: 5-bit, 8-bit
