AI & ML interests
Reinforcement Learning
Organizations
kaiwenw/oct30_oasst_gpt4o_jft_strict
• Updated • 3.87k • 6
kaiwenw/oct30_oasst_gpt4o_jft
• Updated • 6.7k • 6
kaiwenw/oct30_oasst_llama70b_jft_strict
• Updated • 3.69k • 5
kaiwenw/oct30_oasst_llama70b_jft
• Updated • 6.25k • 5
kaiwenw/oct28_selfplay_jft_strict
• Updated • 1.22k • 5
kaiwenw/oct28_selfplay_jft
• Updated • 6.73k • 5
kaiwenw/oct28_selfplay_try2
• Updated • 3.64k • 5
kaiwenw/ultrafeedback-gemma2-9b-it-SimPO-vllm
• Updated • 61.5k • 5