SongTonyLi/gemma-2b-it-SFT-D1_chosen-then-DPO-D2a-HuggingFaceH4-ultrafeedback_binarized-Xlarge_1 Updated Sep 15, 2024
SongTonyLi/Phi-3.5-mini-instruct-SFT-D_chosen-HuggingFaceH4-ultrafeedback_binarized-Xlarge Text Generation • 4B • Updated Sep 14, 2024 • 5
SongTonyLi/facebook-opt-350m-RM-HuggingFaceH4-ultrafeedback_binarized-Xlarge Text Classification • 0.3B • Updated Sep 14, 2024 • 11
SongTonyLi/gpt2-RM-HuggingFaceH4-ultrafeedback_binarized-Xlarge Text Classification • 0.1B • Updated Sep 13, 2024 • 9
SongTonyLi/gemma-2b-it-SFT-D1_chosen-then-DPO-D2a-HuggingFaceH4-ultrafeedback_binarized-Xlarge Text Generation • 3B • Updated Sep 13, 2024 • 8
SongTonyLi/gemma-2b-it-SFT-D1_chosen-then-D2_chosen-HuggingFaceH4-ultrafeedback_binarized-Xlarge Text Generation • 3B • Updated Sep 12, 2024 • 8
SongTonyLi/gemma-2b-it-SFT-D1_chosen-HuggingFaceH4-ultrafeedback_binarized-Xlarge Text Generation • 3B • Updated Sep 12, 2024 • 7
SongTonyLi/gemma-2b-it-SFT-D_chosen-HuggingFaceH4-ultrafeedback_binarized-Xlarge Text Generation • 3B • Updated Sep 12, 2024 • 6
SongTonyLi/gemma-2b-it-SFT-D_chosen-HuggingFaceH4-ultrafeedback_binarized-large Text Generation • 3B • Updated Sep 12, 2024 • 7
SongTonyLi/gemma-2b-it-SFT-D_chosen-HuggingFaceH4-ultrafeedback_binarized Text Generation • 3B • Updated Sep 12, 2024 • 6
SongTonyLi/gemma-2b-it-SFT-D1_chosen-then-DPO-D2a-distilabel-math-preference Text Generation • 3B • Updated Sep 12, 2024 • 7
SongTonyLi/gemma-2b-it-SFT-D1_chosen-then-D2_chosen-distilabel-math-preference Text Generation • 3B • Updated Sep 12, 2024 • 7
SongTonyLi/gemma-2b-it-SFT-D1_chosen-distilabel-math-preference Text Generation • 3B • Updated Sep 12, 2024 • 6
SongTonyLi/gemma-2b-it-SFT-D_chosen-distilabel-math-preference Text Generation • 3B • Updated Sep 12, 2024 • 5
SongTonyLi/gemma-2b-it-SFT-D_chosen-tiger-math-instruct Text Generation • 3B • Updated Sep 11, 2024 • 5
SongTonyLi/gemma-2b-it-SFT-D1_chosen-then-DPO-D2a-orca Text Generation • 3B • Updated Sep 11, 2024 • 5
SongTonyLi/gemma-2b-it-SFT-D1_chosen-then-D2_chosen-orca Text Generation • 3B • Updated Sep 10, 2024 • 4
SongTonyLi/SFT_D1chosenThenDPO_D2a_Instruct_argilla_math_results Text Generation • 8B • Updated Sep 4, 2024 • 6
SongTonyLi/SFT_D1chosenThenD2chosen_Instruct_argilla_math_results Text Generation • 8B • Updated Sep 4, 2024 • 5
SongTonyLi/SFT_Dchosen_stackexchange_cosineLR_instruct Text Generation • 8B • Updated Sep 4, 2024 • 4