CoT Oracle Paper Ablations And Baselines

ceselder's Collections

updated Apr 2

All models used for my LessWrong post. I generally recommend using the latest Adam oracle, or the checkpoint confusingly labelled "no DPO" (`ceselder/cot-oracle-qwen3-8b-final-sprint-checkpoint-no-DPO`).

ceselder/adam-reupload-qwen3-8b-latentqa-cls-past-lens

Text Generation • Updated Mar 30 • 42

Note: Adam original AO checkpoint re-upload with a detailed card. Closest documented aggregate stats: `66,469,521` tokens; paper shorthand `~60M`.

ceselder/adam-reupload-qwen3-8b-full-mix-synthetic-qa-v3-replace-lqa

Text Generation • Updated Mar 30 • 2

Note: Adam synthetic-QA checkpoint re-upload with a detailed card derived from `ao_config.json`. The exact token count remains undocumented in the source repo.

ceselder/cot-oracle-paper-ablation-adam-recipe-1layer

Text Generation • Updated Mar 30 • 1

Note: Paper ablation: Adam recipe inside `cot-oracle`, 1 layer, paper label `17M` logged training tokens.

ceselder/cot-oracle-paper-ablation-ours-1layer

Text Generation • Updated Mar 30 • 2

Note: Paper ablation: ours, 1 layer, paper label `22.5M` logged training tokens.

ceselder/cot-oracle-paper-ablation-ours-3layers

Text Generation • Updated Mar 30

Note: Paper ablation: ours, 3 layers, latest recoverable checkpoint from a run labeled `18M` logged training tokens.

ceselder/cot-oracle-paper-ablation-ours-3layers-onpolicy-lens-only

Text Generation • Updated Mar 30 • 38

Note: Paper ablation: ours, 3 layers, with the FineWeb lens replaced by extra on-policy lens data. The run later reached `22.3M` logged training tokens before crashing; the repo contains the latest successfully uploaded checkpoint.

ceselder/cot-oracle-qwen3-8b-final-sprint-checkpoint-no-DPO

Text Generation • Updated Mar 30

Note: Final no-DPO CoT Oracle checkpoint with the full task mixture, labeled `100M` training tokens.

ceselder/cot-oracle-grpo-step-500

Text Generation • Updated Mar 30 • 1

Note: Best GRPO checkpoint, re-uploaded as a standalone model repo from step `500`.
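
For readers who want to pull one of these checkpoints locally, the following is a minimal sketch rather than an official recipe. It assumes the `huggingface_hub` package is installed and only downloads the repo files. This page does not say whether a repo holds full weights or an adapter, so the sketch does not attempt to load anything. The names `COLLECTION_REPOS`, `DEFAULT_REPO`, `hub_url`, and `download_checkpoint` are illustrative and do not come from the collection.

```python
# Sketch only: download one checkpoint from this collection.
# Assumption: `huggingface_hub` is installed (pip install huggingface_hub).

# Repo IDs copied verbatim from the list above.
COLLECTION_REPOS = [
    "ceselder/adam-reupload-qwen3-8b-latentqa-cls-past-lens",
    "ceselder/adam-reupload-qwen3-8b-full-mix-synthetic-qa-v3-replace-lqa",
    "ceselder/cot-oracle-paper-ablation-adam-recipe-1layer",
    "ceselder/cot-oracle-paper-ablation-ours-1layer",
    "ceselder/cot-oracle-paper-ablation-ours-3layers",
    "ceselder/cot-oracle-paper-ablation-ours-3layers-onpolicy-lens-only",
    "ceselder/cot-oracle-qwen3-8b-final-sprint-checkpoint-no-DPO",
    "ceselder/cot-oracle-grpo-step-500",
]

# The collection description names the "no DPO" checkpoint as one recommended choice.
DEFAULT_REPO = "ceselder/cot-oracle-qwen3-8b-final-sprint-checkpoint-no-DPO"


def hub_url(repo_id: str) -> str:
    """Return the Hugging Face Hub page URL for a model repo ID."""
    return f"https://huggingface.co/{repo_id}"


def download_checkpoint(repo_id: str = DEFAULT_REPO) -> str:
    """Download all files of a collection repo and return the local directory path."""
    if repo_id not in COLLECTION_REPOS:
        raise ValueError(f"{repo_id!r} is not listed in this collection")
    # Imported lazily so the helpers above work without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)
```

Calling `download_checkpoint()` with no arguments fetches the no-DPO checkpoint; pass any other ID from `COLLECTION_REPOS` to fetch an ablation instead.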
