---
language:
- en
license: apache-2.0
tags:
- multimodal
- embedding
- matryoshka
- trimodal
- image-text-audio
- retrieval
- cross-modal
- edge
- rag
library_name: safetensors
pipeline_tag: feature-extraction
datasets:
- custom
---
# AIT-86M — Audio, Image, Text Embeddings (Depth-2)
**AIT-86M** maps image, audio, and text into a shared 1280-dim embedding space, so cross-modal retrieval needs only a single vector index. Embeddings support Matryoshka truncation down to 128 dims.
Built for edge deployment, with a single combined safetensors artifact.
Successor to [TE-75M](https://huggingface.co/augmem/TE-75M).
> Also available in [GGUF format](https://huggingface.co/augmem/AIT-86M-GGUF) for quantized edge deployment.
## Why This Matters
The notable result for this model family is that downstream variants add anchor-style decision behavior without disturbing the shared semantic retrieval path. In practice, the hard part is usually keeping the retrieval backbone flat while new decision surfaces are added on top of it.
## File layout
```text
AIT-86M.safetensors
```
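To check what the checkpoint contains before wiring it into a runtime, the safetensors header can be read without loading any weights. The sketch below relies only on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The function names are illustrative, and the tensor names inside `AIT-86M.safetensors` are not documented in this card, so the code simply lists whatever it finds.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file as a dict."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # Next header_len bytes: UTF-8 JSON describing every tensor.
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize_tensors(header):
    """Return sorted (name, dtype, shape) tuples, skipping the metadata entry."""
    return sorted(
        (name, info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
        if name != "__metadata__"
    )


# Example usage (assumes the file has been downloaded locally):
# for name, dtype, shape in summarize_tensors(
#         read_safetensors_header("AIT-86M.safetensors")):
#     print(name, dtype, shape)
```

Reading only the header keeps memory use low, which can matter on edge devices where loading the full checkpoint just to inspect it is wasteful.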
## Notes
- shared trimodal embedding space
- Matryoshka truncation: `1280 / 768 / 512 / 256 / 128`
- intended for retrieval and embedding use, not generation
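As a rough illustration of the Matryoshka truncation listed above, the sketch below keeps the leading components of an embedding and re-normalizes the result. Re-normalizing after truncation is a common convention for Matryoshka-style embeddings and is an assumption here, not something this card specifies. The helper names are hypothetical, and the code is dependency-free, so it operates on plain Python lists.

```python
import math

MATRYOSHKA_DIMS = (1280, 768, 512, 256, 128)  # sizes listed in this card


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and L2-normalize the result."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"unsupported dim {dim}; expected one of {MATRYOSHKA_DIMS}")
    if len(vec) < dim:
        raise ValueError(f"embedding has {len(vec)} dims, cannot truncate to {dim}")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))
```

Queries and indexed items must be truncated to the same dimension before comparison. A smaller dimension reduces index size and search cost, typically at some cost in retrieval quality that should be measured per application.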
## Historical Local Gate Baseline
The exact local gate baseline that was previously attached under the `TE-86M` release directory is restored here for continuity in the `AIT-86M` artifact line.
Attached JSON:
- `teacher_dual_mn20whisper_exact_gate_baseline_20260424T155324Z.json`
Seeded split-excluded baseline at `1280d`:
| Slice and metric | Value |
|---|---:|
| Speech holdout A->T R@1 | 0.5652 |
| Speech holdout T->A R@1 | 0.5992 |
| Speech holdout avg R@1 | 0.5822 |
| WavCaps FSD A->T R@1 | 0.1078 |
| WavCaps FSD T->A R@1 | 0.1030 |
| WavCaps FSD avg R@1 | 0.1054 |
| SALT A->I R@1 | 0.1692 |
| SALT I->A R@1 | 0.1261 |
Scope note:
- These are the canonical local gate numbers used for bounded continuation and recovery experiments in this model family.
- They are not a claim of broad public benchmark superiority.
- They are restored here because the prior card revision dropped the attached evaluation summary.
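For readers unfamiliar with the metric, the sketch below shows one common way to compute R@1 in both retrieval directions from a similarity matrix. It assumes index-aligned pairs (query `i` matches candidate `i`), which may differ from the actual gate evaluation protocol; the function names are illustrative. The averaged rows in the table are consistent with a plain arithmetic mean of the two directions (for example, 0.5652 and 0.5992 average to 0.5822).

```python
def recall_at_1(sim):
    """Fraction of queries whose top-scoring candidate is the paired one.

    sim[i][j] is the similarity of query i to candidate j, and the correct
    candidate for query i is assumed to be candidate i. Ties resolve to the
    lowest candidate index.
    """
    if not sim:
        raise ValueError("empty similarity matrix")
    hits = 0
    for i, row in enumerate(sim):
        best = max(range(len(row)), key=row.__getitem__)
        hits += best == i
    return hits / len(sim)


def transpose(sim):
    """Swap the query and candidate roles of a similarity matrix."""
    return [list(col) for col in zip(*sim)]


def bidirectional_r1(sim):
    """Return (forward R@1, reverse R@1, arithmetic mean of the two)."""
    forward = recall_at_1(sim)
    reverse = recall_at_1(transpose(sim))
    return forward, reverse, (forward + reverse) / 2
```

With audio embeddings as rows and text embeddings as columns, `forward` corresponds to A->T and `reverse` to T->A.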
## Evaluation Scope
Published evaluations for this model family include targeted retrieval and anchor-style discrimination tasks. Those targeted evaluations are useful, but they are not a substitute for a published adversarial or out-of-distribution benchmark. Downstream runtime validation remains application-specific.
## Files
| File | Purpose |
|---|---|
| `AIT-86M.safetensors` | Base trimodal checkpoint |
| `teacher_dual_mn20whisper_exact_gate_baseline_20260424T155324Z.json` | Restored canonical local gate baseline summary |
## License
Apache 2.0