RichardErkhov/mlfoundations-dev_-_hp_ablations_qwen_scheduler_cosine_warmup0.10_minlr1e-6-4bits 8B • Updated Jun 7, 2025
RichardErkhov/Thamed-Chowdhury_-_qwen-2.5-7B-DPO-split1-16bit-chunk1-low-lr-4bits 8B • Updated Jun 7, 2025