ESS-AIST-81M Preview

ESS-AIST-81M Preview is the current Cortext trial checkpoint from the ESS line.

release checkpoint: ess_aist_full_v9_subjectfix_l4k/best_model.pt
exported checkpoint epoch: 3
text encoder: MongoDB/mdbr-leaf-ir
image encoder: mobilenetv4_conv_medium.e180_r384_in12k
audio encoder: native mn20_as EfficientAT LoRA audio backbone

This preview is the current bridge artifact for Cortext. It keeps the ESS semantic / subject / event slice layout, but the v9 dataset repair moved the subject slice much closer to the entity signal Cortext actually needs.

GGUF quantizations for this exact release live in:

augmem/ESS-AIST-81M-preview-GGUF

Tradeoff:

held-out subject/entity separation is much stronger than in the earlier v7 preview
speech and SALT retrieval are weaker than at the earlier v7 retrieval-max point

For Cortext, this is still the better preview because the entity-side signal is materially stronger.

Embedding Layout

Output embedding: 1536d

0:512 semantic
512:1024 subject
1024:1536 event

Recommended normalized runtime views:

semantic_key = l2norm(z[0:512])
subject_key = l2norm(z[512:1024])
event_key = l2norm(z[1024:1536])
full_key = l2norm(z[0:1536])
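
As an illustration, the views above can be sketched in plain Python. This is a minimal sketch, not the release's runtime code: the names `SLICES`, `l2norm`, `runtime_views` and `cosine` are hypothetical helpers, and the input vector is a deterministic stand-in rather than a real model output.

```python
import math

# Slice boundaries taken from the layout above (semantic / subject / event).
SLICES = {
    "semantic_key": (0, 512),
    "subject_key": (512, 1024),
    "event_key": (1024, 1536),
    "full_key": (0, 1536),
}

def l2norm(v, eps=1e-12):
    """Scale a vector to unit L2 length; eps guards against an all-zero slice."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / max(norm, eps) for x in v]

def runtime_views(z):
    """Return the four normalized views of one 1536d embedding."""
    if len(z) != 1536:
        raise ValueError(f"expected a 1536d embedding, got {len(z)}")
    return {name: l2norm(z[lo:hi]) for name, (lo, hi) in SLICES.items()}

def cosine(a, b):
    """Dot product; equals cosine similarity because the views are unit-norm."""
    return sum(x * y for x, y in zip(a, b))

# Illustrative input only: a deterministic stand-in for a real model output.
z = [math.sin(i + 1) for i in range(1536)]
views = runtime_views(z)
```

Note that `full_key` normalizes the whole vector once, so it is not the concatenation of the three individually normalized slices.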

Exact Release Metrics

All numbers below are from the exact published checkpoint state exported from ess_aist_full_v9_subjectfix_l4k/best_model.pt at checkpoint epoch 3.

Evaluation scope note:

SALT is no longer a clean out-of-training benchmark for this ESS line because SALT-derived rows were used during training.
Speech holdout is also train-adjacent rather than fully external, because explicit speech/audio-text data was added back into the training corpus.
In this release, those two surfaces should be read as regression gates and in-domain checks, not as contamination-free generalization claims.
A later full external sweep (MTEB / MIEB / MAEB) is still pending.

512d Retrieval

Source:

retrieval_512_gt1030.json

Speech holdout:

A->T_r1 = 0.3276
T->A_r1 = 0.3202
A->T_r5 = 0.6120
T->A_r5 = 0.6046

SALT:

I->T_r1 = 0.3179
T->I_r1 = 0.3425
A->T_r1 = 0.1226
T->A_r1 = 0.1272
I->A_r1 = 0.1970
A->I_r1 = 0.2148
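
For readers unfamiliar with the metric names, the sketch below shows one common way such numbers are computed. It assumes that `A`, `T` and `I` stand for audio, text and image, that `r1` / `r5` denote recall@1 / recall@5, and that query `i` is paired with candidate `i`. The evaluation script behind `retrieval_512_gt1030.json` is not shown in this card, so `recall_at_k` is a hypothetical helper and the matrix holds toy numbers, not release data.

```python
def recall_at_k(sim, k):
    """Fraction of queries whose matching item (same index) ranks in the top k.

    sim[i][j] is the similarity between query i and candidate j; the correct
    candidate for query i is assumed to be candidate i.
    """
    hits = 0
    for i, row in enumerate(sim):
        # Rank of the true match = number of candidates scoring strictly higher.
        rank = sum(1 for s in row if s > row[i])
        if rank < k:
            hits += 1
    return hits / len(sim)

# Toy 3x3 similarity matrix (illustrative numbers, not release data).
sim = [
    [0.9, 0.1, 0.2],  # query 0: true match ranked first
    [0.3, 0.4, 0.8],  # query 1: true match ranked second
    [0.7, 0.6, 0.5],  # query 2: true match ranked third
]
r1 = recall_at_k(sim, 1)
r2 = recall_at_k(sim, 2)
```

For the reverse direction (for example T->A instead of A->T), the same function can be applied to the transposed matrix.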

Held-Out ESS Eval

Sources:

subject_eval.json
event_eval.json
prefix_eval.json

Subject / entity surface:

subject_key same/different AUC: 0.9881
subject_key same-topic-different-subject rejection AUC: 0.9881

Event / disambiguation surface:

subject_key event same/different AUC: 0.8855
event_key event same/different AUC: 0.8193
subject_key same-subject-different-event rejection AUC: 0.7381
event_key same-subject-different-event rejection AUC: 0.6807
subject_key topic-shift rejection AUC: 0.9513
event_key topic-shift rejection AUC: 0.8969
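
The sketch below illustrates how a same/different AUC of this kind can be read. It assumes the reported values are ROC AUCs over pair similarities (for example, cosine similarity of `subject_key` for "same" versus "different" pairs); the actual evaluation code is not part of this card. `roc_auc` is a hypothetical helper and the scores are toy values.

```python
def roc_auc(pos_scores, neg_scores):
    """Probability that a random positive pair outscores a random negative pair.

    Ties count as half a win. This is the pairwise (Mann-Whitney) form of
    ROC AUC; it is O(n * m), which is fine for a small sketch.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Toy pair similarities (illustrative only, not release data).
same_pairs = [0.91, 0.84, 0.77, 0.60]  # e.g. same-subject pairs
diff_pairs = [0.65, 0.40, 0.32, 0.10]  # e.g. different-subject pairs
auc = roc_auc(same_pairs, diff_pairs)
```

Under this reading, 0.5 corresponds to chance and 1.0 to perfect separation, so an `event_key` rejection AUC of 0.6807 sits much closer to chance than the `subject_key` same/different AUC of 0.9881.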

Interpretation:

the repaired v9 held-out surface is no longer near-random on subject/entity
the current subject slice is the strongest entity carrier in the model
event structure is usable, but still entangled with subject
this is the right bridge checkpoint for Cortext, not the final semantic/entity architecture

Architecture

This preview is a frozen-encoder / trainable-projector stack:

text encoder params: 22,861,056
image encoder params: 8,434,512
audio encoder params: 20,639,974
image projection params: 9,975,296
audio projection params: 9,975,296
text projection params: 8,926,720
total exact loaded params: 80,812,854
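
The total can be checked directly from the per-component counts. The snippet is a small arithmetic sketch; the dictionary keys are illustrative labels, not the schema of `parameter_breakdown.json`.

```python
# Per-component counts copied from the list above.
ENCODERS = {
    "text_encoder": 22_861_056,
    "image_encoder": 8_434_512,
    "audio_encoder": 20_639_974,
}
PROJECTORS = {
    "image_projection": 9_975_296,
    "audio_projection": 9_975_296,
    "text_projection": 8_926_720,
}

encoder_total = sum(ENCODERS.values())
projector_total = sum(PROJECTORS.values())
total = encoder_total + projector_total

print(f"encoders:   {encoder_total:,}")
print(f"projectors: {projector_total:,}")
print(f"total:      {total:,}")
```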

The audio path is not the old dual-audio teacher path. It uses the native audioheavy LoRA EfficientAT backbone.

Files

| File | Purpose |
| --- | --- |
| `ESS-AIST-81M.safetensors` | Full preview release artifact |
| `export_metadata.json` | ESS export contract |
| `manifest.json` | Release manifest |
| `parameter_breakdown.json` | Exact parameter accounting |
| `ess_ait_86m_spec.yaml` | Training config used for the release line |
| `retrieval_512_gt1030.json` | Exact 512d retrieval eval for this checkpoint |
| `subject_eval.json` | Exact held-out subject eval for this checkpoint |
| `event_eval.json` | Exact held-out event eval for this checkpoint |
| `prefix_eval.json` | Prefix-level AUC summary |

Caveats

This is the current preview checkpoint, not the final Cortext model family.
The current runtime slices are still named semantic / subject / event; the next family will move toward semantic / entity.
Subject/entity is now strong on the repaired v9 held-out surface, but event remains entangled with subject. The engine still needs attention over active anchors for weak-reference resolution.
Retrieval on speech holdout and SALT is lower than in the earlier v7 preview.
SALT and speech holdout are useful release gates for this line, but they are no longer fully external benchmarks in the same way they were for the earlier pre-ESS artifacts.
Use this for internal Cortext trials, not as the final memory-model release.
