RichardErkhov/cutelemonlili_-_Qwen2.5-7B-Instruct_omni_training_less_than_5-gguf Updated Apr 28, 2025
RichardErkhov/mlfoundations-dev_-_hp_ablations_grid_qwen_bsz256_lr8e-6-gguf 8B • Updated Apr 28, 2025
RichardErkhov/cutelemonlili_-_Qwen2.5-7B-Instruct_omni_training_no_less_than_5-gguf Updated Apr 28, 2025
RichardErkhov/DongfuJiang_-_prm_qwen25_math_gsm_2k_with_full_sol_mix_ref_redistribution_hf-gguf Updated Apr 28, 2025
RichardErkhov/mlfoundations-dev_-_hp_ablations_grid_qwen_bsz512_lr5e-6-gguf 8B • Updated Apr 28, 2025 • 3
RichardErkhov/mlfoundations-dev_-_hp_ablations_qwen_adambeta2_0.95-gguf 8B • Updated Apr 28, 2025 • 3
RichardErkhov/mlfoundations-dev_-_hp_ablations_qwen_scheduler_cosine_warmup0.10_minlr5e-7-gguf Updated Apr 28, 2025
RichardErkhov/mlfoundations-dev_-_hp_ablations_qwen_scheduler_cosine_warmup0.10_minlr1e-6-gguf Updated Apr 28, 2025