	---
	library_name: pytorch
	pipeline_tag: feature-extraction
	tags:
	- multimodal
	- embeddings
	- feature-extraction
	- audio
	- image
	- text
	- retrieval
	- entity-tracking
	---

	# AAIT-86M

	`AAIT-86M` bundles the preserved `TE-86M` trimodal retrieval checkpoint together with the published ingress-anchor head.

	This package is for:

	- text, image, and audio retrieval embeddings
	- ingress-time anchor decisions over active tracks

	This package is not:

	- a generative model
	- a Cortext integration by itself

	## Why This Matters

	The ceiling-level anchor scores are not the important result on their own.

	The important result is that the anchor head was added without changing the preserved retrieval path, as measured by the corrected artifact-specific evaluator:

	- retrieval artifact-specific delta vs stage 1 = `0.0`

	That is the number to look at first. Adding an ingress-time decision surface on top of a retrieval model usually costs some base retrieval quality. Under the final publication gate, this package shows no such loss.

	## Outputs

	- `semantic_vector` (`1280`, Matryoshka truncation supported at `1280 / 768 / 512 / 256 / 128`)
	- `anchor_key` (`128`, L2-normalized)
	- `anchor_action_logits`
	- `anchor_confidence`
	- `salience_delta`

	Action logit order:

	1. `CREATE_ANCHOR`
	2. `UPDATE_EXISTING_ANCHOR`
	3. `SPLIT_ANCHOR`
	4. `CLOSE_ANCHOR`
	5. `ABSTAIN`
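The ordering above can be turned into a small decoding helper. The following is a minimal sketch, not the package's own API: it assumes the numbered list maps to zero-based logit indices 0 through 4, and it applies a softmax to obtain probabilities, which this README does not confirm is how `anchor_action_logits` is meant to be read. The function name `decode_anchor_action` and the sample logits are hypothetical.

```python
import math

# Order taken from the "Action logit order" list in this README.
# Assumption: item 1 corresponds to logit index 0, item 5 to index 4.
ANCHOR_ACTIONS = (
    "CREATE_ANCHOR",
    "UPDATE_EXISTING_ANCHOR",
    "SPLIT_ANCHOR",
    "CLOSE_ANCHOR",
    "ABSTAIN",
)


def decode_anchor_action(logits):
    """Return (action_name, probabilities) for one 5-way logit vector."""
    if len(logits) != len(ANCHOR_ACTIONS):
        raise ValueError(
            f"expected {len(ANCHOR_ACTIONS)} logits, got {len(logits)}"
        )
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the most probable action.
    best = max(range(len(probs)), key=probs.__getitem__)
    return ANCHOR_ACTIONS[best], probs


# Hypothetical logits for a single ingress event.
action, probs = decode_anchor_action([0.2, 3.1, -1.0, -0.5, 0.4])
print(action)  # UPDATE_EXISTING_ANCHOR
```

If the real outputs are `torch` tensors, the same mapping applies after converting one row with `.tolist()`.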

	## Package Layout

	- `te86m_base_best_model.pt`
	  - base trimodal retrieval checkpoint
	- `aait86m_anchor_best_model.pt`
	  - anchor head checkpoint
	- `AAIT-86M.safetensors`
	  - combined self-contained release artifact
	- `config.json`
	- `load_aait86m.py`
	- `example_inference.py`
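To see what the combined `AAIT-86M.safetensors` artifact contains without pulling in the `triembed` runtime, the file header can be read with the standard library alone. This is a sketch that relies only on the public safetensors layout (an 8-byte little-endian header length followed by a JSON header). The helper names are hypothetical, and the tensor names stored in the release are not documented in this README, so none are assumed here.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as fh:
        # First 8 bytes: little-endian unsigned 64-bit header length.
        (header_len,) = struct.unpack("<Q", fh.read(8))
        # Next header_len bytes: UTF-8 JSON describing every tensor.
        header = json.loads(fh.read(header_len).decode("utf-8"))
    return header


def summarize_tensors(header):
    """Map tensor name -> (dtype, shape), skipping the optional metadata entry."""
    return {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
        if name != "__metadata__"
    }
```

With the artifact downloaded locally, `summarize_tensors(read_safetensors_header("AAIT-86M.safetensors"))` would list each stored tensor with its dtype and shape. For actual inference, the shipped `load_aait86m.py` remains the intended entry point.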

	## Key Metrics

	- `same_track_accuracy = 1.0000`
	- `bind_precision = 1.0000`
	- `bind_recall = 1.0000`
	- `no_anchor_abstain_accuracy = 1.0000`
	- `wrong_active_reject_accuracy = 1.0000`
	- `stale_reject_accuracy = 1.0000`
	- `create_action_accuracy = 0.9908`
	- `create_overbind_rate = 0.0000`
	- `update_false_positive_rate = 0.0081`
	- retrieval artifact-specific delta vs stage 1 = `0.0`

	Leakage audit:

	- `episode_overlap_count = 0`
	- `entity_overlap_count = 0`
	- `cross_split_duplicate_signature_count = 0`
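For readers who want to run a comparable audit on their own splits, the three counters can be read as set intersections between the training and evaluation data. The sketch below is an assumption about what the counters measure, not the audit script used for this release. The row keys `episode_id`, `entity_id`, and `signature` are hypothetical, as is the idea that a "signature" is a per-example content hash.

```python
def overlap_count(train_ids, eval_ids):
    """Number of distinct identifiers that appear in both splits."""
    return len(set(train_ids) & set(eval_ids))


def leakage_audit(train_rows, eval_rows):
    """Compute three overlap counters for two lists of example dicts.

    The keys "episode_id", "entity_id" and "signature" are hypothetical;
    substitute whatever the dataset schema actually uses.
    """
    def column(rows, key):
        return [row[key] for row in rows]

    return {
        "episode_overlap_count": overlap_count(
            column(train_rows, "episode_id"), column(eval_rows, "episode_id")
        ),
        "entity_overlap_count": overlap_count(
            column(train_rows, "entity_id"), column(eval_rows, "entity_id")
        ),
        "cross_split_duplicate_signature_count": overlap_count(
            column(train_rows, "signature"), column(eval_rows, "signature")
        ),
    }


# Toy splits with no shared episodes, entities, or signatures.
train = [
    {"episode_id": "ep1", "entity_id": "a", "signature": "s1"},
    {"episode_id": "ep2", "entity_id": "b", "signature": "s2"},
]
held_out = [{"episode_id": "ep3", "entity_id": "c", "signature": "s3"}]
print(leakage_audit(train, held_out))
# {'episode_overlap_count': 0, 'entity_overlap_count': 0, 'cross_split_duplicate_signature_count': 0}
```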

	## Evaluation Scope

	The published anchor evaluation is purpose-built for ingress decisions:

	- same-track continuation
	- wrong-active rejection
	- stale same-source rejection
	- no-anchor abstention
	- create vs update discrimination

	The leakage audit is clean, but the current published evaluation is still drawn from the available anchor-labeled data distribution. It is not a published adversarial or out-of-distribution Cortext replay benchmark.

	So:

	- the current numbers support the claim that the decision boundary was learned cleanly
	- they do not yet prove full wild-distribution robustness for Cortext ingress

	## Runtime Note

	The retrieval checkpoint stores the trained projection heads and runtime config for the `TE-86M` stack. Full end-to-end modality inference still depends on the upstream encoder dependencies used by `triembed`.

	## GGUF Note

	GGUF exports for this model live in the separate repository:

	- `augmem/AAIT-86M-GGUF`

	Those artifacts are quantized exports of the combined `AAIT-86M` package using the custom `triembed` architecture metadata. They are not generic llama.cpp text-model artifacts.

	## Operational Caveats

	`update_false_positive_rate = 0.0081` is the most important remaining operational risk.

	False `CREATE_ANCHOR` errors are usually conservative memory failures. False `UPDATE_EXISTING_ANCHOR` errors can write the new signal into the wrong active track, which is the more expensive failure mode for a memory system.

	The observed create behavior is on the conservative side:

	- `create_action_accuracy = 0.9908`
	- `create_overbind_rate = 0.0000`

	That means create errors are not primarily aggressive over-binding failures.
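Because a wrong `UPDATE_EXISTING_ANCHOR` is the costlier error described above, a downstream consumer might choose to gate updates on `anchor_confidence`. The following is one possible mitigation sketched under stated assumptions, not behavior shipped in this package: it assumes `anchor_confidence` is a scalar where higher means more confident (its range is not documented here), and the function name `gate_action` and the `0.9` threshold are arbitrary placeholders that would need tuning on real traffic.

```python
def gate_action(action, anchor_confidence, update_threshold=0.9):
    """Downgrade a low-confidence update to ABSTAIN; pass everything else through.

    Hypothetical policy: only UPDATE_EXISTING_ANCHOR is gated, because a wrong
    update writes into an existing track, while other actions are left as-is.
    """
    if action == "UPDATE_EXISTING_ANCHOR" and anchor_confidence < update_threshold:
        return "ABSTAIN"
    return action


print(gate_action("UPDATE_EXISTING_ANCHOR", 0.42))  # ABSTAIN
print(gate_action("UPDATE_EXISTING_ANCHOR", 0.97))  # UPDATE_EXISTING_ANCHOR
print(gate_action("CREATE_ANCHOR", 0.42))           # CREATE_ANCHOR
```

A gate like this trades some missed updates for fewer writes into the wrong track. Whether that trade is worthwhile depends on the memory system consuming the decisions.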

	## Matryoshka Note

	The preserved semantic retrieval vector supports Matryoshka truncation at:

	- `1280`
	- `768`
	- `512`
	- `256`
	- `128`

	The anchor head in this release was trained and publication-gated using the full `1280`-dimensional semantic vector.

	Anchor-action quality under truncated semantic input (`256` or `128`) is not part of the current published gate and should be treated as future validation work.
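As a concrete illustration of the truncation described in this section, the sketch below keeps the leading components of a `semantic_vector` and re-normalizes the result for cosine-similarity retrieval. Re-normalizing after truncation is an assumption based on common Matryoshka practice, not something this README specifies, and the helper names are hypothetical. Since the anchor head was gated on the full `1280`-dimensional vector, the truncated output is meant for retrieval only.

```python
import math

# Truncation sizes listed in this README.
SUPPORTED_DIMS = (1280, 768, 512, 256, 128)


def truncate_embedding(vector, dim):
    """Keep the first `dim` components and L2-normalize the result."""
    if dim not in SUPPORTED_DIMS:
        raise ValueError(f"dim must be one of {SUPPORTED_DIMS}, got {dim}")
    if len(vector) < dim:
        raise ValueError(f"vector has {len(vector)} components, need {dim}")
    prefix = list(vector[:dim])
    norm = math.sqrt(sum(x * x for x in prefix))
    if norm == 0.0:
        # An all-zero prefix cannot be normalized; return it unchanged.
        return prefix
    return [x / norm for x in prefix]


def cosine(a, b):
    """Dot product, which equals cosine similarity for unit-length inputs."""
    return sum(x * y for x, y in zip(a, b))


# Placeholder vector standing in for a real 1280-dimensional semantic_vector.
full = [1.0] * 1280
small = truncate_embedding(full, 128)
print(len(small))  # 128
```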

	## Packaging Note

	This repo now publishes the base retrieval checkpoint and the anchor checkpoint together in one model package.

	The remaining practical limitation is dependency shape: the package includes both checkpoints, but full end-to-end modality inference still follows the `triembed` runtime path and its encoder dependencies.

	A stricter single-binary runtime path is still future packaging work.