koutch/short_paper_llama_1.json_train_dpo_v3_train_no_think Text Generation • 8B • Updated Jan 12 • 3
koutch/short_paper_llama_1.json_train_dpo_v2_train_no_think Text Generation • 8B • Updated Jan 12 • 3
koutch/short_paper_llama_1.json_train_dpo_v4_train_no_think Text Generation • 8B • Updated Jan 12 • 5
koutch/short_paper_qwen_qwen3-instruct-4b_train_sft_train_think Text Generation • 4B • Updated Jan 9 • 3