# Demo: Phase 11 Intelligence (vote, prompt, distill, rollback)
# Shows all 4 new commands plus the upgraded mega-diagnose
load "Qwen/Qwen3-VL-8B-Instruct" as base
# Attach a chain-of-thought prompt (makes it think step by step)
prompt base "Think step by step before answering. Show your reasoning."
# Mega diagnose: self-diagnosis + domain profiling + layer speed
diagnose base -> diagnosis_report.json
# Merge a reasoning model into base (transport merge at strength 0.5)
merge "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B" into base using transport strength 0.5
# Use majority voting on a hard question
vote base "What is 847 * 23? Show your work." samples 5 -> vote_result.json
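# Reference answer for checking vote_result.json by hand (worked arithmetic, not produced by the tool):
#   847 * 23 = 847 * 20 + 847 * 3 = 16940 + 2541 = 19481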
# Snapshot before training (so rollback works)
snapshot base
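# Optional baseline eval, so eval_after.json has something to compare against.
# Assumption: eval accepts the same syntax here as it does after training;
# eval_before.json is a hypothetical output name, not part of the original demo.
eval base -> eval_before.json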
# Train on gsm8k with GRPO for 64 steps to target weaknesses found by diagnose
train base on "gsm8k" using grpo steps 64
# Eval to check if training helped
eval base -> eval_after.json
# Keep the training if the eval passed; otherwise roll back to the snapshot
if eval_passed base {
    commit base
} else {
    rollback base
}
# Create a fast student model for easy questions
distill base into "Qwen/Qwen3-1.7B" steps 100 -> student_model/