Add distillm SpanCSD qwen1.5 0.5B->1.8B checkpoint (step 3570) c35905e verified phuocsang committed 25 days ago
Add w/MTA checkpoints (qwen340M, qwen2.5-1.5B, mistral) + 120M phrase-level ablation 3271137 verified phuocsang committed 25 days ago