AI & ML interests
Reinforcement Learning
Organizations
• Updated • 6.28k • 18 • 1
kaiwenw/oct30_oasst_gpt4o_jft_strict
• Updated • 3.87k • 5
kaiwenw/oct30_oasst_gpt4o_jft
• Updated • 6.7k • 13
kaiwenw/oct30_oasst_llama70b_jft_strict
• Updated • 3.69k • 6
kaiwenw/oct30_oasst_llama70b_jft
• Updated • 6.25k • 6
kaiwenw/oct28_selfplay_jft_strict
• Updated • 1.22k • 5
kaiwenw/oct28_selfplay_jft
• Updated • 6.73k • 9
kaiwenw/oct28_selfplay_try2
• Updated • 3.64k • 6
• Updated • 3.64k • 8
kaiwenw/ultrafeedback-gemma2-9b-it-SimPO-vllm
• Updated • 61.5k • 7