ESS-AIST-81M Preview
ESS-AIST-81M Preview is the current Cortext trial checkpoint from the ESS line.
- release checkpoint: ess_aist_full_v9_subjectfix_l4k/best_model.pt
- exported checkpoint epoch: 3
- text encoder: MongoDB/mdbr-leaf-ir
- image encoder: mobilenetv4_conv_medium.e180_r384_in12k
- audio encoder: native mn20_as EfficientAT LoRA audio backbone
This preview is the current bridge artifact for Cortext. It keeps the ESS
semantic / subject / event slice layout, but the v9 dataset repair moved
the subject slice much closer to the entity signal Cortext actually needs.
GGUF quantizations for this exact release live in:
augmem/ESS-AIST-81M-preview-GGUF
Tradeoff:
- held-out subject/entity separation is much stronger than the earlier v7 preview
- speech and SALT retrieval are weaker than the earlier v7 retrieval-max point
For Cortext, this is still the better preview because the entity-side signal is materially stronger.
Embedding Layout
Output embedding: 1536d
- 0:512 semantic
- 512:1024 subject
- 1024:1536 event
Recommended normalized runtime views:
    semantic_key = l2norm(z[0:512])
    subject_key = l2norm(z[512:1024])
    event_key = l2norm(z[1024:1536])
    full_key = l2norm(z[0:1536])
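The four views above can be sketched in dependency-free Python. This is a minimal illustration, not the release's runtime code: the names `SLICES`, `l2norm`, and `runtime_views`, and the `eps` guard against an all-zero slice, are assumptions introduced here. In practice an array library would likely replace the list arithmetic.

```python
import math
from typing import Dict, List, Sequence

# Slice boundaries copied from the embedding layout above.
SLICES = {
    "semantic_key": (0, 512),
    "subject_key": (512, 1024),
    "event_key": (1024, 1536),
    "full_key": (0, 1536),
}


def l2norm(v: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Scale v to unit L2 length. eps (an assumption) avoids division by zero."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / max(norm, eps) for x in v]


def runtime_views(z: Sequence[float]) -> Dict[str, List[float]]:
    """Return the four normalized views of one 1536-d output embedding."""
    if len(z) != 1536:
        raise ValueError(f"expected a 1536-d embedding, got {len(z)}")
    return {name: l2norm(z[lo:hi]) for name, (lo, hi) in SLICES.items()}


# Synthetic embedding used only to exercise the function.
z = [float(i % 7 + 1) for i in range(1536)]
views = runtime_views(z)
```

Each slice is normalized independently, so `full_key` is not the concatenation of the three slice keys: it is the whole vector rescaled once.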
Exact Release Metrics
All numbers below are from the exact published checkpoint state exported from
ess_aist_full_v9_subjectfix_l4k/best_model.pt at checkpoint epoch 3.
Evaluation scope note:
- SALT is no longer a clean out-of-training benchmark for this ESS line because SALT-derived rows were used during training.
- speech holdout is also train-adjacent rather than fully external because explicit speech/audio-text data was added back into the corpus.
- In this release, those two surfaces should be read as regression gates and in-domain checks, not as contamination-free generalization claims.
- A later full external sweep (MTEB / MIEB / MAEB) is still pending.
512d Retrieval
Source:
retrieval_512_gt1030.json
Speech holdout:
- A->T_r1 = 0.3276
- T->A_r1 = 0.3202
- A->T_r5 = 0.6120
- T->A_r5 = 0.6046
SALT:
- I->T_r1 = 0.3179
- T->I_r1 = 0.3425
- A->T_r1 = 0.1226
- T->A_r1 = 0.1272
- I->A_r1 = 0.1970
- A->I_r1 = 0.2148
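For readers unfamiliar with the `_r1` / `_r5` suffixes, the sketch below shows one common way recall@k is computed for paired cross-modal retrieval. It is an assumption about the metric's general shape, not a description of the evaluation script behind retrieval_512_gt1030.json: the function name `recall_at_k`, the same-index pairing convention, and the toy similarity matrix are all hypothetical.

```python
from typing import Sequence


def recall_at_k(sim: Sequence[Sequence[float]], k: int) -> float:
    """Fraction of queries whose paired candidate ranks in the top k.

    sim[i][j] is the similarity between query i and candidate j.
    Assumption: the correct candidate for query i sits at index i.
    """
    hits = 0
    for i, row in enumerate(sim):
        # Rank of the true pair = number of candidates scoring strictly higher.
        rank = sum(1 for s in row if s > row[i])
        if rank < k:
            hits += 1
    return hits / len(sim)


# Toy 3x3 similarity matrix (made-up values, e.g. audio queries vs. text candidates).
sim = [
    [0.9, 0.1, 0.3],
    [0.2, 0.4, 0.8],
    [0.1, 0.2, 0.7],
]
r1 = recall_at_k(sim, 1)
r2 = recall_at_k(sim, 2)
```

Swapping the roles of rows and columns (transposing `sim`) gives the reverse direction, which is why the card reports pairs such as A->T and T->A separately.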
Held-Out ESS Eval
Sources:
- subject_eval.json
- event_eval.json
- prefix_eval.json
Subject / entity surface:
- subject_key same/different AUC: 0.9881
- subject_key same-topic-different-subject rejection AUC: 0.9881
Event / disambiguation surface:
- subject_key event same/different AUC: 0.8855
- event_key event same/different AUC: 0.8193
- subject_key same-subject-different-event rejection AUC: 0.7381
- event_key same-subject-different-event rejection AUC: 0.6807
- subject_key topic-shift rejection AUC: 0.9513
- event_key topic-shift rejection AUC: 0.8969
Interpretation:
- the repaired v9 held-out surface is no longer near-random on subject/entity
- the current subject slice is the strongest entity carrier in the model
- event structure is usable, but still entangled with subject
- this is the right bridge checkpoint for Cortext, not the final semantic/entity architecture
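The same/different and rejection AUC figures above can be read as the probability that a matching pair scores higher than a non-matching pair under a given key. The sketch below computes that quantity directly. It is illustrative only: how the released eval files build their pairs and scores is not stated here, and the function name `pair_auc` and the sample scores are assumptions.

```python
from typing import Sequence


def pair_auc(same_scores: Sequence[float], diff_scores: Sequence[float]) -> float:
    """AUC as P(score of a 'same' pair > score of a 'different' pair).

    Ties count as half a win (the Mann-Whitney U formulation of ROC AUC).
    Scores are assumed to be similarities, e.g. cosine between two subject_key vectors.
    """
    if not same_scores or not diff_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for s in same_scores:
        for d in diff_scores:
            if s > d:
                wins += 1.0
            elif s == d:
                wins += 0.5
    return wins / (len(same_scores) * len(diff_scores))


# Made-up similarity scores for illustration.
same = [0.9, 0.8, 0.4]
diff = [0.5, 0.3]
auc = pair_auc(same, diff)
```

Under this reading, a value near 0.5 means the key cannot tell the two pair types apart, which is the sense in which an AUC can be "near-random".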
Architecture
This preview is a frozen-encoder / trainable-projector stack:
- text encoder params: 22,861,056
- image encoder params: 8,434,512
- audio encoder params: 20,639,974
- image projection params: 9,975,296
- audio projection params: 9,975,296
- text projection params: 8,926,720
- total exact loaded params: 80,812,854
The audio path is not the old dual-audio teacher path. It uses the native audioheavy LoRA EfficientAT backbone.
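As a quick arithmetic check, the per-component counts listed above sum exactly to the stated total. The snippet below is only that check: the dictionary keys are informal labels chosen here, not field names from parameter_breakdown.json.

```python
# Parameter counts copied from the Architecture list above.
ENCODERS = {
    "text_encoder": 22_861_056,
    "image_encoder": 8_434_512,
    "audio_encoder": 20_639_974,
}
PROJECTIONS = {
    "image_projection": 9_975_296,
    "audio_projection": 9_975_296,
    "text_projection": 8_926_720,
}

encoders = sum(ENCODERS.values())        # 51,935,542
projections = sum(PROJECTIONS.values())  # 28,877,312
total = encoders + projections

# Matches the "total exact loaded params" figure in the list.
assert total == 80_812_854
print(f"encoders={encoders:,} projections={projections:,} total={total:,}")
```

The total rounds to the 81M in the model name, with roughly 64% of the parameters in the three encoders and 36% in the projection heads.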
Files
| File | Purpose |
|---|---|
| ESS-AIST-81M.safetensors | Full preview release artifact |
| export_metadata.json | ESS export contract |
| manifest.json | Release manifest |
| parameter_breakdown.json | Exact parameter accounting |
| ess_ait_86m_spec.yaml | Training config used for the release line |
| retrieval_512_gt1030.json | Exact 512d retrieval eval for this checkpoint |
| subject_eval.json | Exact held-out subject eval for this checkpoint |
| event_eval.json | Exact held-out event eval for this checkpoint |
| prefix_eval.json | Prefix-level AUC summary |
Caveats
- This is the current preview checkpoint, not the final Cortext model family.
- The current runtime slices are still named semantic / subject / event; the next family will move toward semantic / entity.
- Subject/entity is now strong on the repaired v9 held-out surface, but event remains entangled and the engine still needs attention over active anchors for weak-reference resolution.
- Retrieval on speech holdout and SALT is lower than the earlier v7 preview.
- SALT and speech holdout are useful release gates for this line, but they are no longer fully external benchmarks in the same way they were for the earlier pre-ESS artifacts.
- Use this for internal Cortext trials, not as the final memory-model release.