
Checkpoint Manifest

Checkpoints were selected by matching the ROUGE-L outputs of parse_results.py (from eval_results/*/eval.log) against the reported results table. Only the MultiLevelOT and w/ MTA rows are produced by this repo; the DSKDv2, DWA-KD, and ResidualKD rows come from other repos and are not included here.
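The selection step can be sketched as a tolerance match between per-benchmark scores and a reported row. This is only an illustration, not the repo's actual procedure: `find_matching_run` is a hypothetical helper, the 0.005 tolerance is an assumption (half of the last reported digit), and the scores are copied from the table in this manifest rather than parsed from eval.log.

```python
def find_matching_run(candidates, reported, tol=0.005):
    """Return the names of runs whose scores all match `reported` within `tol`.

    `candidates` maps a run name (here: the Source column) to a tuple of
    (Dolly, SelfInst, Vicuna, S-NI) ROUGE-L scores.
    """
    return [
        name
        for name, scores in candidates.items()
        if len(scores) == len(reported)
        and all(abs(s - r) <= tol for s, r in zip(scores, reported))
    ]


# Scores for the qwen1.5-1.8B-to-gpt2-120M runs, taken from this manifest.
candidates = {
    "paper/12861": (23.92, 13.04, 15.28, 22.43),
    "mta/5716": (24.37, 12.97, 15.47, 22.86),
    "mta_all_word/14290": (24.48, 13.34, 15.84, 22.22),
}

# Reported MultiLevelOT row for the same teacher/student pair.
reported_multilevelot = (23.92, 13.04, 15.28, 22.43)

matches = find_matching_run(candidates, reported_multilevelot)
print(matches)
```

Returning a list rather than a single name makes ambiguous or missing matches visible instead of silently picking one.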

Format note: smaller students (GPT-2 120M/340M) are saved as full fine-tunes (config.json + model.safetensors); larger students (GPT-2 1.5B, OPT-2.7B, TinyLLaMA-1.1B) are saved as LoRA adapters (adapter_config.json + adapter_model.safetensors).
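Since the two formats differ only in which files are present, a loader can tell them apart before deciding how to load a checkpoint. A minimal sketch, assuming the file names listed in the format note are the only ones that matter; `checkpoint_format` is a hypothetical helper, not part of this repo.

```python
def checkpoint_format(filenames):
    """Classify a checkpoint directory listing as 'lora', 'full', or 'unknown'.

    The file names come from the format note; anything else is ignored.
    """
    names = set(filenames)
    # Check the adapter files first so a directory holding both sets
    # is still treated as a LoRA adapter.
    if {"adapter_config.json", "adapter_model.safetensors"} <= names:
        return "lora"
    if {"config.json", "model.safetensors"} <= names:
        return "full"
    return "unknown"


print(checkpoint_format(["config.json", "model.safetensors"]))
print(checkpoint_format(["adapter_config.json", "adapter_model.safetensors"]))
```

In practice the argument would be something like `os.listdir(ckpt_dir)` for one of the checkpoint directories listed below.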

Included checkpoints

| ckpts path | Method | Source (output/) | Avg ROUGE-L | Dolly / SelfInst / Vicuna / S-NI |
|---|---|---|---|---|
| qwen1.5-1.8B-to-gpt2-120M/multilevelot | MultiLevelOT | paper/12861 | 18.67 | 23.92 / 13.04 / 15.28 / 22.43 |
| qwen1.5-1.8B-to-gpt2-120M/multilevelot_mta | MultiLevelOT w/ MTA (= ablation Full-level) | mta/5716 | 18.92 | 24.37 / 12.97 / 15.47 / 22.86 |
| qwen1.5-1.8B-to-gpt2-120M/ablation_word | ablation w/ Word-level | mta_all_word/14290 | 18.97 | 24.48 / 13.34 / 15.84 / 22.22 |
| qwen1.5-1.8B-to-gpt2-340M/multilevelot | MultiLevelOT | paper/12861 | 19.26 | 25.53 / 13.23 / 15.72 / 22.56 |
| qwen2.5-7B-to-gpt2-1.5B/multilevelot | MultiLevelOT | paper/28580 | 21.74 | 26.24 / 17.31 / 17.28 / 26.14 |
| qwen2.5-7B-to-opt-2.7B/multilevelot | MultiLevelOT | paper/22864 | 23.19 | 28.30 / 17.81 / 17.28 / 29.36 |
| qwen2.5-7B-to-opt-2.7B/multilevelot_mta | MultiLevelOT w/ MTA | mta/28580 | 23.27 | 28.39 / 18.44 / 17.68 / 28.58 |
| mistral-7B-to-tinyllama-1.1B/multilevelot | MultiLevelOT | paper/28580 | 21.29 | 26.41 / 15.16 / 16.94 / 26.66 |
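As a sanity check on the table above, the Avg ROUGE-L column appears to be the unweighted mean of the four benchmark scores. The sketch below recomputes it, assuming that definition and allowing for rounding to two decimals; the variable names are illustrative.

```python
# (ckpts path, reported Avg ROUGE-L, (Dolly, SelfInst, Vicuna, S-NI)) from the table.
rows = [
    ("qwen1.5-1.8B-to-gpt2-120M/multilevelot", 18.67, (23.92, 13.04, 15.28, 22.43)),
    ("qwen1.5-1.8B-to-gpt2-120M/multilevelot_mta", 18.92, (24.37, 12.97, 15.47, 22.86)),
    ("qwen1.5-1.8B-to-gpt2-120M/ablation_word", 18.97, (24.48, 13.34, 15.84, 22.22)),
    ("qwen1.5-1.8B-to-gpt2-340M/multilevelot", 19.26, (25.53, 13.23, 15.72, 22.56)),
    ("qwen2.5-7B-to-gpt2-1.5B/multilevelot", 21.74, (26.24, 17.31, 17.28, 26.14)),
    ("qwen2.5-7B-to-opt-2.7B/multilevelot", 23.19, (28.30, 17.81, 17.28, 29.36)),
    ("qwen2.5-7B-to-opt-2.7B/multilevelot_mta", 23.27, (28.39, 18.44, 17.68, 28.58)),
    ("mistral-7B-to-tinyllama-1.1B/multilevelot", 21.29, (26.41, 15.16, 16.94, 26.66)),
]

# Collect rows whose reported average differs from the recomputed mean
# by more than half of the last reported digit.
mismatches = []
for path, reported_avg, scores in rows:
    mean = sum(scores) / len(scores)
    if abs(mean - reported_avg) > 0.005:
        mismatches.append((path, reported_avg, round(mean, 4)))

print(mismatches)
```

An empty list means every reported average is consistent with its four per-benchmark scores.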

Missing (reported in table but no checkpoint found in output/)

  • qwen1.5-1.8B-to-gpt2-340M — w/ MTA (Avg 19.68)
  • qwen2.5-7B-to-gpt2-1.5B — w/ MTA (Avg 21.98)
  • mistral-7B-to-tinyllama-1.1B — w/ MTA (Avg 22.85)
  • qwen1.5-1.8B-to-gpt2-120M — ablation w/ Phrase-level (Avg 18.70): eval.log exists (mta_all_phrase/) but no checkpoint in output/.