AI & ML interests
Reinforcement Learning
Organizations
kaiwenw/dec9_sp1_pref_jdpo_all_reject_first
• Updated • 4.64k • 10
kaiwenw/dec9_sp1_pref_jdpo_all_chosen_first
• Updated • 3.39k • 11
kaiwenw/dec9_sp1_pref_jdpo
• Updated • 7.64k • 6
kaiwenw/dec9_sp1_pref_jdpo_n_5_temp_0.9
• Updated • 7.29k • 6
• Updated • 3.64k • 6
kaiwenw/dec8_aft_pref_judge_actor_temp_0.9_5_responses
• Updated • 3.64k • 10
kaiwenw/dec7_aft_pref_judge_temp_0.9
• Updated • 20 • 10
kaiwenw/dec7_aft_llama8b_1.1
• Updated • 3.64k • 12
kaiwenw/dec7_aft_llama8b_1.0
• Updated • 3.64k • 8
kaiwenw/dec7_aft_llama8b_0.9
• Updated • 3.64k • 10
kaiwenw/nov18_oasst_pref_jdpo_llama8b_0.9_chosen_75_reject_25
• Updated • 8.71k • 8
kaiwenw/nov18_oasst_pref_jdpo_llama8b_0.9_chosen_25_reject_75
• Updated • 8.71k • 7
kaiwenw/nov18_oasst_pref_jdpo_llama8b_0.9_chosen_50_reject_50
• Updated • 12.7k • 6
kaiwenw/nov18_oasst_pref_jdpo_llama8b_0.9_all_chosen_first
• Updated • 6.39k • 8
kaiwenw/nov18_oasst_pref_jdpo_llama8b_0.9_all_reject_first
• Updated • 7.95k • 11
kaiwenw/nov18_oasst_pref_jdpo_llama8b_0.9
• Updated • 14.3k • 7
kaiwenw/nov18_oasst_pref_jdpo_llama8b_0.9_n_9_temp_0.9
• Updated • 14.7k • 6
kaiwenw/nov18_oasst_mini_pref_jdpo_llama8b_1.0
• Updated • 938 • 8
kaiwenw/nov18_oasst_mini_pref_jdpo_llama8b_1.0_n_9_temp_1.0
• Updated • 790 • 6
kaiwenw/nov14_oasst_pref_jdpo_llama8b
• Updated • 17.7k • 8
kaiwenw/nov14_oasst_pref_jdpo_llama8b_9_judges
• Updated • 14.7k • 7
kaiwenw/nov13_oasst_pref_jdpo_llama70b
• Updated • 5.21k • 9
kaiwenw/nov13_oasst_pref_jdpo_llama70b_9_judges
• Updated • 14.7k • 6
kaiwenw/nov13_oasst_mini_pref_jdpo_llama70b
• Updated • 302 • 7
kaiwenw/nov13_oasst_mini_pref_jdpo_llama70b_9_judges
• Updated • 790 • 7
kaiwenw/nov12_oasst_pref_jdpo_llama70b
• Updated • 2.61k • 11
kaiwenw/nov12_oasst_pref_jdpo_llama70b_9_judges
• Updated • 14.7k • 7
kaiwenw/nov12_oasst_mini_pref_jdpo_llama70b_A1_try2
• Updated • 139 • 7
kaiwenw/nov12_oasst_mini_pref_jdpo_llama70b_A1_try2_9_judges
• Updated • 790 • 7
kaiwenw/nov11_oasst_pref_jdpo_gpt4o
• Updated • 1.8k • 8