Fardan/llama3.2-1b-alpha_rank_128_64_reasoning_instruct_1k_steps_merged • Text Generation • 1B • Updated 15 days ago • 47
Fardan/Qwen2.5-1.5B-Instruct-DPO-Human-Like-DPO-Dataset • Text Generation • 2B • Updated about 1 month ago • 7