selfcorrexp2/llama3_sft_less_corr_train_on_corr_dpo_gen1_augmath
• Updated • 7.57k • 3
selfcorrexp2/llama3_sft_less_corr_train_on_corr_dpo_gen1_math
• Updated • 7.5k • 17
selfcorrexp2/orm-less-corr-label_llama3_sft_tmp10_vllmexp_rewardtmp07
• Updated • 5k • 3
selfcorrexp2/orm-less-corr-label_llama3_sft_tmp10_vllmexp
• Updated • 5k • 3
selfcorrexp2/orm-less-corr-label_llama3_sft_tmp10
• Updated • 5k • 3
selfcorrexp2/orm-balanced-scaling-all-yes
• Updated • 735k • 3
selfcorrexp2/orm-less-corr-scaling-all-yes
• Updated • 735k • 3
selfcorrexp2/orm-less-corr-scaling-yes-or-no
• Updated • 735k • 3
selfcorrexp2/llama3_sft_less_corr_training_on_corr_scaling_exp
• Updated • 735k • 3
selfcorrexp2/llama3_sft_morecorr_norr
• Updated • 307k • 3
selfcorrexp2/llama3_openmath_1m_ep1_math_scaling_temp07
• Updated • 395k • 2
selfcorrexp2/llama3_sft_lesscorr_norr
• Updated • 183k • 3
selfcorrexp2/llama3_sft_balanced_norr
• Updated • 249k • 2
selfcorrexp2/less_corr_scaling_base_vllmexp
• Updated • 735k • 3
selfcorrexp2/less_corr_scaling_base
• Updated • 735k • 2
selfcorrexp2/llama3_openmath_em_ep1_tmp07_with_lesscorr_orm_rewards_vllmexp
• Updated • 5k • 10
selfcorrexp2/llama3_openmath_em_ep1_tmp10_with_lesscorr_orm_rewards_vllmexp
• Updated • 5k • 13
selfcorrexp2/llama3_openmath_em_ep1_tmp07_with_lesscorr_orm_rewards
• Updated • 5k • 12
selfcorrexp2/llama3_openmath_em_ep1_tmp10_with_lesscorr_orm_rewards
• Updated • 5k • 11
selfcorrexp2/w2r125k_r2r115k_r80k
• Updated • 263k • 2
selfcorrexp2/w2r125k_r2r115k_r100k
• Updated • 283k • 3
selfcorrexp2/llama3_sft_balanced_rr60k_train_on_corr_ep3_full_testtmp07_vllmexp
• Updated • 15k • 3
selfcorrexp2/llama3_sft_balanced_rr60k_train_on_corr_ep3_full_testtmp10_vllmexp
• Updated • 15k • 3
selfcorrexp2/llama3_sft_balanced_rr60k_train_on_corr_ep3tmp10_vllmexp_2
selfcorrexp2/Hanning_Llama3-sft-less-corr-rr60k-3eptmp07_vllmexp
• Updated • 5k • 3
selfcorrexp2/Hanning_Llama3-sft-less-corr-rr60k-3eptmp10_vllmexp
• Updated • 5k • 3
selfcorrexp2/llama3_sft_balanced_rr60k_train_on_corr_ep3tmp07_vllmexp
• Updated • 1k • 3
selfcorrexp2/llama3_sft_balanced_rr60k_train_on_corr_ep3tmp10_vllmexp
• Updated • 1k • 3
selfcorrexp2/llama3_openmath_em_ep1_tmp07_with_balanced_orm_rewards
• Updated • 5k • 10
selfcorrexp2/llama3_openmath_em_ep1_tmp07_with_gold_rewards
• Updated • 5k • 10