TEG-421M-GGUF — Quantized Trimodal Embeddings Gemma

GGUF-quantized versions of TEG-421M for edge deployment.

TEG (Trimodal Embeddings Gemma) maps image, audio, and text into a shared 768-dim embedding space via Google's embeddinggemma-300M, with Matryoshka truncation support down to 128 dims.
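Matryoshka truncation is usually applied by keeping the leading components of the embedding and re-normalizing the result. The sketch below shows that step in plain Python. It assumes embeddings are compared with cosine similarity downstream; the function name `truncate_embedding` and the placeholder vector are illustrative and not part of the TEG release.

```python
import math

def truncate_embedding(vec, dim=128):
    """Keep the first `dim` components and rescale them to unit L2 norm."""
    if not 0 < dim <= len(vec):
        raise ValueError(f"dim must be in 1..{len(vec)}, got {dim}")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # an all-zero vector cannot be normalized
    return [x / norm for x in head]

full = [0.5] * 768                 # placeholder for a real 768-dim embedding
small = truncate_embedding(full, 128)
print(len(small))                  # 128
```

Queries and indexed items should be truncated to the same dimension before they are compared.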

Available quantizations

File	Quant	Size	Description
`teg-421m-q8_0.gguf`	Q8_0	501 MB	8-bit — minimal quality loss
`teg-421m-q5_0.gguf`	Q5_0	408 MB	5-bit — good balance of size and quality
`teg-421m-q4_1.gguf`	Q4_1	390 MB	4-bit with offsets — best for constrained devices
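One way to choose among these files is to take the highest-precision variant that fits a storage budget. The sketch below copies the sizes from the table. `pick_quant_file` and `download` are hypothetical helper names, and the repository ID `augmem/teg-421m-gguf` is assumed from this repository's name. The download step relies on `hf_hub_download` from the `huggingface_hub` package, which must be installed separately.

```python
# (file name, quant type, size in MB), copied from the table above.
QUANT_FILES = [
    ("teg-421m-q8_0.gguf", "Q8_0", 501),
    ("teg-421m-q5_0.gguf", "Q5_0", 408),
    ("teg-421m-q4_1.gguf", "Q4_1", 390),
]

def pick_quant_file(budget_mb):
    """Return the largest file that fits within budget_mb, or None if none fit."""
    fitting = [entry for entry in QUANT_FILES if entry[2] <= budget_mb]
    if not fitting:
        return None
    return max(fitting, key=lambda entry: entry[2])[0]

def download(filename, repo_id="augmem/teg-421m-gguf"):
    """Fetch one file from the Hugging Face Hub and return its local path."""
    from huggingface_hub import hf_hub_download  # imported lazily; optional dependency
    return hf_hub_download(repo_id=repo_id, filename=filename)

print(pick_quant_file(450))  # teg-421m-q5_0.gguf
```

The sizes are on-disk sizes. Runtime memory use depends on the inference runtime and is not covered by this sketch.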

All variants use per-component quantization: only the Gemma text model is quantized to the target format listed in the table. The image and audio encoders stay at Q8_0, and the projection heads stay at F16 to preserve retrieval quality.
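The per-component rule can be written as a small lookup, which also gives a rough way to estimate file size from parameter counts. This is a sketch under stated assumptions: the component labels (`text`, `image`, `audio`, `projection`) and both function names are hypothetical, and the parameter counts are supplied by the caller because this card does not list them per component. The bits-per-weight values follow from the ggml block layouts (32 weights per block plus an fp16 scale, and an extra fp16 offset for Q4_1).

```python
# Effective bits per weight for the formats named in this card.
BITS_PER_WEIGHT = {"F16": 16.0, "Q8_0": 8.5, "Q5_0": 5.5, "Q4_1": 5.0}

def quant_for(component, target):
    """Map a component label (hypothetical names) to its quant type."""
    if component == "text":
        return target          # the Gemma text model gets the target quant
    if component in ("image", "audio"):
        return "Q8_0"          # the encoders stay at Q8_0
    if component == "projection":
        return "F16"           # the projection heads stay at F16
    raise ValueError(f"unknown component: {component}")

def estimate_mb(param_counts, target):
    """Rough size in MB (10**6 bytes) for a dict of component -> parameter count."""
    total_bits = sum(
        count * BITS_PER_WEIGHT[quant_for(component, target)]
        for component, count in param_counts.items()
    )
    return total_bits / 8 / 1e6

# Toy counts for illustration only, not the real TEG breakdown.
print(estimate_mb({"text": 8_000_000, "image": 8_000_000, "projection": 1_000_000}, "Q5_0"))  # 16.0
```

Real GGUF files also carry metadata and may keep some small tensors at higher precision, so an estimate like this will not match the published sizes exactly.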

Architecture

See the full model card of the source model, augmem/teg-421m, for complete architecture details, benchmarks, and training information.

Text  --> embeddinggemma-300M --------------------------> 768-dim
Image --> MobileNetV4-Medium (1280-d) --> DeepProjection -> 768-dim
Audio --> EfficientAT mn20_as (1920-d) --> DeepProjection -> 768-dim

Total parameters (fp32): 420.6M
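Because all three branches end in the same 768-dim space, an embedding from one modality can be scored directly against embeddings from another. The sketch below ranks candidates by cosine similarity in plain Python. The toy 3-dim vectors and item names stand in for real TEG outputs, and `cosine` and `rank` are illustrative helper names.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank(query, candidates):
    """Return candidate names sorted by descending similarity to the query."""
    scored = [(cosine(query, vec), name) for name, vec in candidates.items()]
    return [name for _, name in sorted(scored, reverse=True)]

# Toy vectors; real embeddings would have 768 dims (or fewer after truncation).
text_query = [1.0, 0.0, 0.0]
candidates = {
    "image:dog.jpg": [0.9, 0.1, 0.0],
    "audio:rain.wav": [0.0, 1.0, 0.0],
}
print(rank(text_query, candidates))  # ['image:dog.jpg', 'audio:rain.wav']
```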

Source model

These quantizations were produced from augmem/teg-421m.

License

Apache 2.0

