Fixed code: vocab mismatch fix for cross-arch merging (Llama/Falcon)

5d61448 verified 3 months ago

645 Bytes

	# test_phase2.td — Testing all Phase 2 commands
	load "Qwen/Qwen3-VL-8B-Instruct" as base

	# diagnose base -> weaknesses.json — asks the model what it's bad at
	diagnose base -> weaknesses.json

	# synth base from web_curated filter cherry_llm -> data.jsonl — generates training data
	synth base from web_curated filter cherry_llm -> data.jsonl

	# train base on "data.jsonl" using grpo steps 64 — GRPO training
	train base on "data.jsonl" using grpo steps 64

	# debate base rounds 3 candidates 8 -> pairs.jsonl — persona debate for preference pairs
	debate base rounds 3 candidates 8 -> pairs.jsonl

	eval base -> final_eval.json
	commit base