Qwen3-8B · CODI pointer-chase — a strongly load-bearing latent-reasoning organism
A CODI (Continuous Chain-of-thought via self-DIstillation) organism finetuned from Qwen/Qwen3-8B.
The model reasons in num_latent = 6 continuous latent vectors instead of a textual chain-of-thought,
then emits a single-token answer. This is the cleanest load-bearing organism in the set: the latents are
necessary — with them removed, accuracy sits at chance even after full training.
What it does
A 26-symbol pointer chase. The prompt gives a random permutation mapping a→…, b→…, …, z→…, a start
symbol, and a hop count K∈[1,6]: "follow the mapping K times; what is the final value?" The answer is a
single letter. The mapping table is in the prompt, so the task is in-context (no recall) — but resolving
K serial hops in a single forward pass is hard, which is what forces the model to use the latent
scratchpad.
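The task described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the dataset's actual generator: the function names, the `->` table rendering, and the `start:` line are assumptions; only the 26-symbol permutation, the hop range K∈[1,6], and the question wording come from the description above.

```python
import random
import string


def make_problem(rng, max_hops=6):
    """Sample a random permutation over a..z, a start symbol, and a hop count K."""
    letters = list(string.ascii_lowercase)
    targets = letters[:]
    rng.shuffle(targets)  # random permutation of the 26 symbols
    mapping = dict(zip(letters, targets))
    start = rng.choice(letters)
    k = rng.randint(1, max_hops)  # inclusive on both ends: K in [1, max_hops]
    return mapping, start, k


def chase(mapping, start, k):
    """Follow the mapping k times from start; each hop depends on the previous one."""
    current = start
    for _ in range(k):
        current = mapping[current]
    return current


def format_prompt(mapping, start, k):
    """Hypothetical prompt layout; the real dataset format may differ."""
    table = ", ".join(f"{src}->{dst}" for src, dst in mapping.items())
    return f"{table}\nstart: {start}\nfollow the mapping {k} times; what is the final value?"
```

The loop in `chase` is the serial dependency the latents have to carry: hop i cannot be looked up until hop i-1 is known.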
Training recipe
Standard CODI self-distillation (teacher reads the worked chase, student generates the latents and is
distilled onto the teacher) with the one principled change that makes the organism load-bearing:
sft_loss_factor = 0 — the direct question→answer pass is removed, so the answer must route through the
latents.
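A minimal sketch of why a zero factor removes the direct pass, assuming the objective is a plain weighted sum of the two terms named here. The function and its float arguments are hypothetical, and the real CODI objective may include further terms (for example, the teacher's own language-modelling loss) that are omitted.

```python
def total_loss(distill_loss, sft_loss, distill_loss_factor=20.0, sft_loss_factor=0.0):
    """Weighted sum of the two loss terms (assumed form, not the CODI source)."""
    # With sft_loss_factor = 0 the direct question->answer term contributes
    # nothing, so the only training signal reaches the answer through the latents.
    return distill_loss_factor * distill_loss + sft_loss_factor * sft_loss
```

With the factors from this card, changing `sft_loss` has no effect on the total.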
| base | Qwen/Qwen3-8B |
| adapter | LoRA r=128, α=32 (+ projection, resized embed/lm_head for <|bocot|>/<|eocot|>) |
| num_latent | 6 |
| sft_loss_factor | 0 |
| distill_loss_factor | 20 |
| optimizer | lr 1e-4, cosine, 4 epochs, bf16, answer_only |
| dataset | cds-jb/qwen3-8b-codi-multihop-recall-data (ptra26_kmix1-6 split) |
Load-bearing controls & results (checkpoint-900, n=300)
- Necessity = 0.96. Clean (latent) accuracy 1.00; ablating the latents (0-latent) drops to 0.04 (chance for 26-way) — and stays there even on the fully-trained model. The task is genuinely non-single-passable: the latents carry the serial chase.
- Donor cross-patch ≈ 0.01, shuffle ≈ 0.00. Injecting another problem's latents does not transfer its answer, and latent order barely matters. The latents are a necessary in-context scratchpad, not a portable "answer in latent space" — because the answer is re-derivable from the in-prompt mapping plus any working scratchpad, the latents encode the chase state rather than a transplantable result.
- Logit-lens is weak here (top-5 ≈ 0.1–0.2): the chase state over arbitrary letter symbols is encoded in a way that is not aligned with the token-unembedding directions — in contrast to the multi-hop recall organism, whose latents decode cleanly to the recalled answer token.
Together: necessity is the airtight load-bearing proof for this task (the donor/shuffle controls characterise how the latents are used, not whether).
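The reported numbers are consistent with reading necessity as the gap between clean and latent-ablated accuracy (1.00 − 0.04 = 0.96). That definition is an assumption inferred from the figures above, and the helper names below are hypothetical:

```python
def accuracy(predictions, golds):
    """Fraction of exact single-letter matches."""
    if len(predictions) != len(golds):
        raise ValueError("predictions and golds must have the same length")
    return sum(p == g for p, g in zip(predictions, golds)) / len(golds)


def necessity(clean_accuracy, ablated_accuracy):
    """Assumed definition: accuracy lost when the latents are removed."""
    return clean_accuracy - ablated_accuracy


# Chance level for a uniform guess over 26 letters.
CHANCE_26_WAY = 1 / 26
```

Under this reading, an ablated accuracy of 0.04 is essentially the 26-way chance rate of 1/26 ≈ 0.038.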
How to use
```python
from src.model import CODI  # third_party/CODI

model = CODI.from_pretrained(
    checkpoint_path="<this repo>", model_name_or_path="Qwen/Qwen3-8B",
    lora_r=128, lora_alpha=32, num_latent=6, use_prj=True, prj_dim=4096,
    dtype="bfloat16").eval().cuda()

# `ids` (tokenized prompt) and `bocot`/`eocot` (ids of the <|bocot|>/<|eocot|>
# control tokens) are not defined in this snippet and must be supplied by the caller.
out = model.generate(input_ids=ids, tokenizer=model.tokenizer, num_latent_iterations=6,
                     greedy=True, sot_token=bocot, eot_token=eocot)  # num_latent_iterations=0 ablates
```
Limitations
A research model organism, not a general assistant. Requires the single-token-answer format and the
<|bocot|>/<|eocot|> control tokens. Companion organism:
cds-jb/qwen3-8b-codi-multihop-recall.
