blt-reasoner-pilot1 / code /__init__.py
BLT-Reasoner pilot 1: ckpts + code + logs + ablations
"""BLT-Reasoner: Bottlenecked Latent Thoughts with explicit info objective.

Replaces Abstract-CoT's discrete-sampled z̃ with a continuous latent loop,
adds an InfoNCE z↔y identifiability objective that makes the constant-z
basin mechanically impossible, and enforces a strict y→only-z attention
mask: answer tokens y cannot attend to the prompt, so the latent z is the
only information channel from prompt to answer.

Modules:
model.py — continuous latent loop + 4D bottleneck mask
losses.py — InfoNCE, centroid (warmup), LM loss
data.py — GSM8K + chat template + GSM8K-final-answer extraction
train.py — Phase B training loop (LM + InfoNCE)
eval.py — pre-registered z-ablation eval (normal/zero/random/shuffled)
smoke_test.py — 30-min identifiability existence proof (decision gate)
"""
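
The docstring's claim about the constant-z basin can be illustrated with a small, dependency-free sketch of a one-directional InfoNCE loss. This is not the code in losses.py: the function name `info_nce`, the dot-product similarity, the temperature value, and the toy vectors are all assumptions made for illustration. The sketch shows the standard InfoNCE property that when every z in the batch is identical, the loss sits at the chance level log(B) for batch size B, while a z that tracks y drives it toward zero.

```python
import math


def info_nce(z, y, temperature=0.1):
    """Mean InfoNCE loss where z[i] must pick out y[i] among all y[j].

    z and y are lists of equal-length float vectors (hypothetical toy
    inputs; a real implementation would operate on batched tensors).
    """
    losses = []
    for i, zi in enumerate(z):
        # Dot-product similarity of z[i] with every y[j], scaled by temperature.
        logits = [sum(a * b for a, b in zip(zi, yj)) / temperature for yj in y]
        # Numerically stable log-sum-exp over the candidates.
        m = max(logits)
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching pair (i, i) as the positive.
        losses.append(log_norm - logits[i])
    return sum(losses) / len(losses)


# Three distinct "answer" embeddings (toy data, assumed for illustration).
y = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# A latent that tracks y versus a latent that is the same for every example.
z_informative = [row[:] for row in y]
z_constant = [[0.5, 0.5, 0.5] for _ in y]

loss_informative = info_nce(z_informative, y)
loss_constant = info_nce(z_constant, y)
chance_level = math.log(len(y))
```

In this toy setup `loss_constant` equals `chance_level` up to floating-point error, because a constant z gives every row the same logits and so cannot favor the matching pair. Whether the repository's objective is symmetric (z→y and y→z) or uses a different similarity is not stated in the docstring.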