BLT-Reasoner Pilot 1: checkpoints + code

Compute-constrained latent reasoning pilot on Qwen2.5-1.5B-Instruct + GSM8K. Continuous M-step latent loop + strict y→only-z bottleneck + InfoNCE z↔y identifiability loss. See code/README.md for architecture details and HANDOFF_DACOT_PROPOSAL_2026-05-16.md (in the main repo) for full motivation.
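For readers unfamiliar with the InfoNCE term, the following is a minimal, generic sketch of a symmetric (z↔y) InfoNCE loss in plain Python. It is not the implementation in code/; the function names, the cosine-similarity scoring, and the temperature value are assumptions made for illustration. Row i of z_batch is assumed to be the positive pair for row i of y_batch, with all other rows in the batch acting as negatives.

```python
import math


def _normalize(vec):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def _diagonal_cross_entropy(sim):
    """Mean of -log softmax(sim[i])[i] over the rows of a square score matrix."""
    total = 0.0
    for i, row in enumerate(sim):
        row_max = max(row)  # subtract the max for numerical stability
        log_partition = row_max + math.log(sum(math.exp(s - row_max) for s in row))
        total += log_partition - row[i]
    return total / len(sim)


def symmetric_info_nce(z_batch, y_batch, temperature=0.1):
    """Generic symmetric InfoNCE: average of the z->y and y->z directions.

    The temperature of 0.1 is an illustrative default, not a value from this repo.
    """
    z = [_normalize(v) for v in z_batch]
    y = [_normalize(v) for v in y_batch]
    # Cosine similarity between every z and every y, scaled by the temperature.
    sim = [[sum(a * b for a, b in zip(zi, yj)) / temperature for yj in y] for zi in z]
    sim_t = [list(col) for col in zip(*sim)]  # transpose gives the y->z direction
    return 0.5 * (_diagonal_cross_entropy(sim) + _diagonal_cross_entropy(sim_t))
```

The loss is small when each z is closest to its own y and large when the pairing is shuffled, which is the property an identifiability term of this kind relies on.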

Checkpoints (LoRA adapter + projector + InfoNCE head)

Each ckpt is ~25 MB (only the trained adapter/projector/head); the base Qwen2.5-1.5B-Instruct is loaded fresh from HF on resume.

step    K_train  files
2000    4        ckpts/ckpt-step2000/model/, projector.pt, head.pt
4000    8        ckpts/ckpt-step4000/model/, projector.pt, head.pt
6000    8        ckpts/ckpt-step6000/model/, projector.pt, head.pt
8000    16       ckpts/ckpt-step8000/model/, projector.pt, head.pt
10000   16       ckpts/ckpt-step10000/model/, projector.pt, head.pt
12000   16       ckpts/ckpt-step12000/model/, projector.pt, head.pt

Pre-registered z-ablation results

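Pre-registered success criterion: Δ_random ≥ 15 pp AND Δ_zero ≥ 25 pp on GSM8K-test, where Δ_random = acc(normal) - acc(random) and Δ_zero = acc(normal) - acc(zero). Below are the interim results captured during training. Accuracies and Δ values in the table are fractions, so +0.090 corresponds to 9.0 pp.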

ckpt            K_eval  n    acc(normal)  acc(random)  acc(zero)  Δ_random  Δ_zero
ckpt-step10000  16      100  0.090        0.000        0.000      +0.090    +0.090
ckpt-step2000   4       100  0.030        0.000        0.000      +0.030    +0.030
ckpt-step2000   16      100  0.000        0.000        0.000      +0.000    +0.000
ckpt-step6000   8       100  0.110        0.000        0.010      +0.110    +0.100
ckpt-step8000   16      100  0.040        0.010        0.000      +0.030    +0.040
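To make the comparison against the pre-registered thresholds explicit, the sketch below recomputes both deltas in percentage points from the accuracies in the table and checks them against the 15 pp / 25 pp criterion. The row values are copied from the table; the function and constant names are illustrative and do not come from the repo's evaluation code.

```python
# Interim rows copied from the table above:
# (ckpt, K_eval, acc_normal, acc_random, acc_zero), accuracies as fractions.
ROWS = [
    ("ckpt-step10000", 16, 0.090, 0.000, 0.000),
    ("ckpt-step2000", 4, 0.030, 0.000, 0.000),
    ("ckpt-step2000", 16, 0.000, 0.000, 0.000),
    ("ckpt-step6000", 8, 0.110, 0.000, 0.010),
    ("ckpt-step8000", 16, 0.040, 0.010, 0.000),
]

# Pre-registered thresholds, in percentage points.
MIN_DELTA_RANDOM_PP = 15.0
MIN_DELTA_ZERO_PP = 25.0


def ablation_deltas_pp(acc_normal, acc_random, acc_zero):
    """Return (delta_random, delta_zero) in percentage points."""
    delta_random = round((acc_normal - acc_random) * 100, 6)
    delta_zero = round((acc_normal - acc_zero) * 100, 6)
    return delta_random, delta_zero


def meets_criterion(acc_normal, acc_random, acc_zero):
    """True only if BOTH deltas reach their pre-registered thresholds."""
    delta_random, delta_zero = ablation_deltas_pp(acc_normal, acc_random, acc_zero)
    return delta_random >= MIN_DELTA_RANDOM_PP and delta_zero >= MIN_DELTA_ZERO_PP


for ckpt, k_eval, acc_n, acc_r, acc_z in ROWS:
    d_random, d_zero = ablation_deltas_pp(acc_n, acc_r, acc_z)
    status = "pass" if meets_criterion(acc_n, acc_r, acc_z) else "below threshold"
    print(f"{ckpt} K_eval={k_eval}: d_random={d_random:+.1f} pp, d_zero={d_zero:+.1f} pp, {status}")
```

On the five interim rows the largest Δ_random is 11.0 pp (ckpt-step6000, K_eval = 8), so every row is reported as below threshold. With n = 100, each test item accounts for one percentage point.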

Resume training on a fresh instance

git clone <main-repo-with-experiments/blt_reasoner>  # or pull the code/ subdir here
pip install transformers peft bitsandbytes datasets safetensors huggingface_hub
python3 -m experiments.blt_reasoner.train \
    --config experiments/blt_reasoner/configs/pilot_qwen15b_gsm8k.json \
    --resume_from LauraGG/blt-reasoner-pilot1:ckpts/ckpt-step6000

Notes:

  • The --resume_from flag (in train.py) accepts either a local ckpt path or a LauraGG/blt-reasoner-pilot1:ckpts/ckpt-stepN HF-Hub reference.
  • Optimizer state is not preserved across resume. Expect a short loss spike (~100–300 steps) while Adam moments re-stabilize. The latent geometry (LoRA weights, projector, head) survives intact.
  • The base model Qwen/Qwen2.5-1.5B-Instruct is fetched automatically.
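For fetching a checkpoint by hand (for example to inspect it before resuming), the reference format described in the notes can be handled roughly as sketched below. This is not the logic in train.py: the helper names, the local_dir default, and the assumption that projector.pt and head.pt sit inside the ckpt-stepN directory are all illustrative. The only library call used is huggingface_hub.snapshot_download with its allow_patterns and local_dir parameters.

```python
from pathlib import Path


def parse_resume_ref(ref):
    """Split 'owner/repo:sub/path' into (repo_id, subpath).

    Returns (None, ref) when the value does not look like a Hub reference,
    in which case the caller should treat it as a local path.
    """
    repo_id, sep, subpath = ref.partition(":")
    if not sep or repo_id.count("/") != 1 or not subpath:
        return None, ref
    return repo_id, subpath


def fetch_checkpoint(ref, local_dir="hub_ckpts"):
    """Return a local directory for `ref`, downloading from the Hub if needed."""
    if Path(ref).exists():
        return Path(ref)  # an existing local checkpoint wins over Hub parsing
    repo_id, subpath = parse_resume_ref(ref)
    if repo_id is None:
        raise FileNotFoundError(f"no local checkpoint at {ref!r} and not a Hub reference")
    # Imported lazily so the parsing helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    # Download only the files under the requested checkpoint directory.
    root = snapshot_download(
        repo_id=repo_id,
        allow_patterns=[f"{subpath}/*"],
        local_dir=local_dir,
    )
    return Path(root) / subpath
```

Calling fetch_checkpoint("LauraGG/blt-reasoner-pilot1:ckpts/ckpt-step6000") would then return a local directory that can be passed to --resume_from as a local ckpt path.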

Logs and intermediate artifacts

  • logs/run.log: full training log
  • logs/metrics.jsonl: per-step loss/metric breakdown
  • logs/auto_eval.log: poller daemon log (auto-eval on train exit)
  • logs/interim_*.log: interim ablation logs
  • code/: full experiments/blt_reasoner/ source tree at upload time