Kyleyee/Qwen2.5-1.5B-PPO-hh-retrain-reward-without-eoschange Text Generation • 2B • Updated Apr 25, 2025 • 1
Kyleyee/Qwen2.5-1.5B-sft-hh-3e-CompletionOnly-witheos Text Generation • 2B • Updated Apr 22, 2025 • 7
Kyleyee/Qwen2-0.5B-DRDPO-imdb-subsft-reverse-preference Text Generation • 0.5B • Updated Mar 21, 2025 • 1
Kyleyee/Qwen2-0.5B-DRDPO-imdb-subsft-wrong-preference Text Generation • 0.5B • Updated Mar 21, 2025 • 2