td-toolkit/td_lang/examples/demo_intelligence.td

	# Demo: Phase 11 Intelligence (vote, prompt, distill, rollback)
	# Shows all 4 new commands + the upgraded mega-diagnose

	load "Qwen/Qwen3-VL-8B-Instruct" as base

	# Attach a chain-of-thought prompt (makes it think step by step)
	prompt base "Think step by step before answering. Show your reasoning."

	# Mega diagnose: self-diagnosis + domain profiling + layer speed
	diagnose base -> diagnosis_report.json

	# Merge a reasoning model into base (transport merge, strength 0.5)
	merge "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B" into base using transport strength 0.5

	# Use majority voting on a hard question
	vote base "What is 847 * 23? Show your work." samples 5 -> vote_result.json

	# Snapshot before training (so rollback works)
	snapshot base

	# Train on gsm8k with GRPO to target weaknesses found by diagnose
	train base on "gsm8k" using grpo steps 64

	# Eval to check if training helped
	eval base -> eval_after.json

	# Keep the training if eval passed; otherwise roll back to the snapshot
	if eval_passed base {
		commit base
	} else {
		rollback base
	}

	# Create a fast student model for easy questions
	distill base into "Qwen/Qwen3-1.7B" steps 100 -> student_model/