AmberYifan/Qwen2.5-7B-Instruct-wildfeedback-iterDPO-iter3-T1.0 • Text Generation • 333k • Updated Nov 13, 2025 • 4 • 1
AmberYifan/Qwen2.5-14B-Instruct-ultrafeedback-iterdpo-iter3-RPO • Text Generation • 841k • Updated Nov 13, 2025 • 2 • 1
AmberYifan/llama3-8b-full-pretrain-high-len-1m-en-sft • Text Generation • 8B • Updated Sep 24, 2025 • 3