Mustafa Tag Eldeen

Reasoning Trajectories Glossary

Terms used in the reasoning-trajectory codebase and HF Spaces deployment.

Terms

Trajectory: The path through representation space traced by a model's hidden states across chain-of-thought (CoT) reasoning tokens. Avoid: Path, chain

Step marker: A "Step N:" token in generated CoT text that delimits individual reasoning steps. Detected by matching token ID 8468 in the sequence. Avoid: Boundary, delimiter
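
As a rough illustration of ID-based detection, the sketch below scans a list of token IDs for the step token and splits the sequence into per-step spans. The helper names `find_step_markers` and `split_into_steps` are hypothetical, and every ID in the toy sequence other than 8468 is made up; the ID itself is tokenizer-specific, so treat it as a parameter rather than a constant.

```python
from typing import List, Tuple

STEP_TOKEN_ID = 8468  # ID quoted in the glossary entry; depends on the tokenizer


def find_step_markers(token_ids: List[int],
                      step_token_id: int = STEP_TOKEN_ID) -> List[int]:
    """Return the positions in token_ids where the step token occurs."""
    return [i for i, tok in enumerate(token_ids) if tok == step_token_id]


def split_into_steps(token_ids: List[int],
                     step_token_id: int = STEP_TOKEN_ID) -> List[Tuple[int, int]]:
    """Return one (start, end) span per step, with end exclusive.

    Tokens before the first marker are not part of any step.
    """
    starts = find_step_markers(token_ids, step_token_id)
    ends = starts[1:] + [len(token_ids)]
    return list(zip(starts, ends))


# Toy sequence: only 8468 is meaningful, the other IDs are placeholders.
toy_ids = [1, 8468, 29871, 29896, 5, 6, 8468, 29871, 29906, 7]
print(find_step_markers(toy_ids))  # [1, 6]
print(split_into_steps(toy_ids))   # [(1, 6), (6, 10)]
```

A single-ID match like this is the simplest form of detection; if a tokenizer emits different IDs for "Step" depending on context, `step_token_id` would need to become a set of IDs.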

dp1 / dp2: Decision points. dp1 is the first generated token after the prompt (the start of reasoning); dp2 is the first token of the final answer, immediately after the #### marker. Avoid: Start/end point
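
One way to locate the two decision points in a token sequence is sketched below. It assumes the prompt length is known and that the #### marker tokenizes to a fixed sub-sequence of IDs; the function name `find_decision_points` and all IDs in the example are hypothetical.

```python
from typing import List, Optional, Tuple


def find_decision_points(token_ids: List[int],
                         prompt_len: int,
                         marker_ids: List[int]) -> Tuple[int, Optional[int]]:
    """Return (dp1, dp2) as indices into token_ids.

    dp1 is the index of the first generated token (right after the prompt).
    dp2 is the index of the first token after the answer marker, or None if
    the marker is missing or nothing follows it.
    """
    dp1 = prompt_len
    dp2 = None
    m = len(marker_ids)
    # Search only the generated part of the sequence for the marker.
    for i in range(prompt_len, len(token_ids) - m + 1):
        if token_ids[i:i + m] == marker_ids:
            if i + m < len(token_ids):
                dp2 = i + m
            break
    return dp1, dp2


# Placeholder IDs: a 4-token prompt, two reasoning tokens,
# a 2-token marker (900, 901), then the answer.
ids = [1, 10, 11, 12, 20, 21, 900, 901, 42, 2]
print(find_decision_points(ids, prompt_len=4, marker_ids=[900, 901]))  # (4, 8)
```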

Logit lens: Projecting each layer's hidden state through the unembedding matrix (W_U) to interpret what the model predicts at every layer, not just the final one. Avoid: Early decoding, probing
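
The projection can be sketched with a toy example. Plain Python lists stand in for tensors here, the matrix and hidden states are invented values, and the function names are hypothetical. Many logit-lens implementations also apply the model's final normalization layer before unembedding; that step is omitted in this sketch.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # shape: d_model x vocab_size


def unembed(hidden: Vector, W_U: Matrix) -> Vector:
    """Compute logits = hidden @ W_U for a single hidden state."""
    vocab_size = len(W_U[0])
    return [sum(h * W_U[d][v] for d, h in enumerate(hidden))
            for v in range(vocab_size)]


def logit_lens(hidden_by_layer: List[Vector], W_U: Matrix) -> List[int]:
    """Return the top predicted token ID at every layer."""
    predictions = []
    for hidden in hidden_by_layer:
        logits = unembed(hidden, W_U)
        predictions.append(max(range(len(logits)), key=logits.__getitem__))
    return predictions


# Toy setup: d_model = 2, vocab_size = 3, three layers.
W_U = [[1.0, 0.0, 0.5],
       [0.0, 1.0, 0.5]]
hidden_by_layer = [[1.0, 0.0], [0.4, 0.6], [0.0, 1.0]]
print(logit_lens(hidden_by_layer, W_U))  # [0, 1, 1]
```

In this toy run the prediction switches from token 0 to token 1 between the first and second layer, which is the kind of layer-by-layer change the logit lens is meant to expose.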

Two-pass generation: An optimization. The first pass generates tokens with the KV cache and captures nothing; the second pass runs a single forward pass over the full sequence to capture all per-timestep hidden states and logits at once. Avoid: Two-stage
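
The control flow can be illustrated with a toy stand-in for the model. `ToyModel` below is not a transformer: its "cache" is a single integer and its `forward` is a loop, whereas a real model would process the full sequence in one batched call. The class and function names are hypothetical; the point is only that pass 1 keeps tokens and nothing else, and pass 2 captures everything in one call.

```python
from typing import List, Tuple


class ToyModel:
    """Stand-in for a causal LM; the cache is just the previous hidden state."""
    vocab_size = 7

    def step(self, token: int, cache: int) -> Tuple[int, List[int]]:
        """Process one token using the cache; return (hidden, logits)."""
        hidden = (3 * cache + token + 1) % self.vocab_size
        logits = [-abs(v - hidden) for v in range(self.vocab_size)]
        return hidden, logits

    def forward(self, tokens: List[int]) -> Tuple[List[int], List[List[int]]]:
        """Process a whole sequence; return hidden states and logits per position."""
        hiddens, all_logits, cache = [], [], 0
        for tok in tokens:
            cache, logits = self.step(tok, cache)
            hiddens.append(cache)
            all_logits.append(logits)
        return hiddens, all_logits


def argmax(values: List[int]) -> int:
    return max(range(len(values)), key=values.__getitem__)


def generate(model: ToyModel, prompt: List[int], max_new_tokens: int) -> List[int]:
    """Pass 1: greedy decoding with the cache; nothing but tokens is kept."""
    cache, logits = 0, []
    for tok in prompt:  # prompt must be non-empty
        cache, logits = model.step(tok, cache)
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        next_tok = argmax(logits)
        tokens.append(next_tok)
        cache, logits = model.step(next_tok, cache)
    return tokens


model = ToyModel()
prompt = [2, 5]
tokens = generate(model, prompt, max_new_tokens=3)  # pass 1: tokens only
hiddens, all_logits = model.forward(tokens)         # pass 2: capture everything
print(tokens, len(hiddens), len(all_logits))
```

Because both passes see the same tokens, the logits captured at position t in pass 2 reproduce the choice made for token t+1 in pass 1, so no information is lost by skipping capture during generation.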

HF Space: A Hugging Face-hosted web application (Gradio, Streamlit, or Docker) that runs ML demos. Configured via README.md frontmatter. Avoid: Space, HF app
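
For readers unfamiliar with the frontmatter, the sketch below embeds an example README.md header and reads its fields with a deliberately simple parser. The field names (`title`, `sdk`, `app_file`, `pinned`) are standard Space settings, but the values are assumed for illustration and are not taken from this repository. The parser handles only flat `key: value` lines and is not a YAML implementation.

```python
from typing import Dict

# Example README.md for a Space; all values are illustrative assumptions.
EXAMPLE_README = """\
---
title: Reasoning Trajectories
sdk: gradio
app_file: app.py
pinned: false
---

# Reasoning Trajectories
"""


def read_frontmatter(readme_text: str) -> Dict[str, str]:
    """Return the flat key/value pairs between the leading '---' delimiters."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no frontmatter block at the top of the file
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing delimiter reached
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # opening delimiter without a closing one


config = read_frontmatter(EXAMPLE_README)
print(config["sdk"], config["app_file"])  # gradio app.py
```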

TinyLlama-1.1B-Chat: A 1.1B-parameter open-source LLM based on Llama 2, small enough to run on CPU. Used as our demonstration model. Avoid: TL, Tiny