Holarissun/RM-HH-AllMix_harmless_gpt3_20000_gemma2b_shuffleTrue_extractchosenTrue Updated Apr 22, 2024
Holarissun/RM-HH-AllMix_harmless_gpt3_20000_gemma2b_shuffleFalse_extractchosenFalse Updated Apr 22, 2024
Holarissun/RM-HH-AllMix_harmless_gpt3_20000_gemma2b_shuffleFalse_extractchosenTrue Updated Apr 22, 2024 • 1
Holarissun/RM-HH-AllMixNonPeft_harmless_gpt3_20000_gpt2-large_shuffleTrue_extractchosenTrue Text Classification • 0.8B • Updated Apr 22, 2024 • 2
Holarissun/RM-HH-AllMixNonPeft_harmless_gpt3_20000_gpt2-large_shuffleTrue_extractchosenFalse Text Classification • 0.8B • Updated Apr 22, 2024
Holarissun/RM-HH-AllMixNonPeft_harmless_gpt3_20000_gpt2-large_shuffleFalse_extractchosenFalse Text Classification • 0.8B • Updated Apr 22, 2024 • 1
Holarissun/RM-HH-AllMixNonPeft_harmless_gpt3_20000_gpt2-large_shuffleFalse_extractchosenTrue Text Classification • 0.8B • Updated Apr 22, 2024 • 3
Holarissun/RM-HH-Mix_harmless_gpt3_20000_gemma2b_shuffleFalse_extractchosenFalse Updated Apr 19, 2024 • 1
Holarissun/RM-HH-Gemma_harmless_gpt3_20000_gemma2b_shuffleTrue_extractchosenTrue Updated Apr 19, 2024
Holarissun/RM-HH-Gemma_harmless_gpt3_20000_gemma2b_shuffleFalse_extractchosenFalse Updated Apr 19, 2024 • 1
Holarissun/RM-HH-Gemma_harmless_gpt3_20000_gemma2b_shuffleFalse_extractchosenTrue Updated Apr 19, 2024
Holarissun/dpo_helpfulhelpful_gpt3_gamma0.0_beta0.1_subset20000_modelmistral7b_maxsteps5000_bz8_lr5e-06 Updated Apr 16, 2024
Holarissun/dpo_helpfulhelpful_gpt3_gamma0.0_beta0.1_subset20000_modelmistral7b_maxsteps5000_bz8_lr1e-05 Updated Apr 16, 2024
Holarissun/dpo_harmlessharmless_gpt3_gamma0.0_beta0.1_subset20000_modelmistral7b_maxsteps5000_bz8_lr1e-05 Updated Apr 16, 2024
Holarissun/dpo_harmlessharmless_gpt3_gamma0.0_beta0.1_subset20000_modelmistral7b_maxsteps5000_bz8_lr5e-06 Updated Apr 16, 2024
Holarissun/dpo_helfulhelpful_gamma0.0_beta0.1_subset20000_modelmistral7b_maxsteps5000_bz8_lr1e-05 Updated Apr 14, 2024
Holarissun/dpo_harmlessharmless_gamma0.0_beta0.1_subset20000_modelmistral7b_maxsteps5000_bz8_lr1e-05 Updated Apr 14, 2024
Holarissun/dpo_anthropic_hh_gamma0.1_beta0.1_subset20000_modelmistral7b_maxsteps1200_bz32_lr1e-06 Updated Apr 12, 2024 • 4
Holarissun/dpo_anthropic_hh_gamma0.1_beta0.1_subset20000_modelmistral7b_maxsteps2400_bz16_lr1e-06 Updated Apr 12, 2024 • 3
Holarissun/dpo_anthropic_hh_gamma10.0_beta0.1_subset20000_modelmistral7b-sft_maxsteps1250_bz32_lr1e-05 Updated Apr 10, 2024 • 6
Holarissun/dpo_anthropic_hh_gamma0.1_beta0.1_subset20000_modelmistral7b-sft_maxsteps1250_bz32_lr1e-05 Updated Apr 10, 2024 • 4
Holarissun/dpo_anthropic_hh_gamma3.0_beta0.1_subset20000_modelmistral7b-sft_maxsteps1250_bz32_lr1e-05 Updated Apr 10, 2024 • 4
Holarissun/dpo_anthropic_hh_gamma1.0_beta0.1_subset20000_modelmistral7b-sft_maxsteps1250_bz32_lr1e-05 Updated Apr 10, 2024 • 4
Holarissun/dpo_anthropic_hh_gamma30.0_beta0.1_subset20000_modelmistral7b-sft_maxsteps1250_bz32_lr1e-05 Updated Apr 10, 2024 • 4
Holarissun/dpo_anthropic_hh_gamma10.0_beta0.1_subset20000_modelmistral7b-sft_maxsteps1250_bz32_lr1e-06 Updated Apr 10, 2024 • 2
Holarissun/dpo_anthropic_hh_gamma0.1_beta0.1_subset20000_modelmistral7b-sft_maxsteps1250_bz32_lr1e-06 Updated Apr 10, 2024 • 3
Holarissun/dpo_anthropic_hh_gamma3.0_beta0.1_subset20000_modelmistral7b-sft_maxsteps1250_bz32_lr1e-06 Updated Apr 10, 2024 • 3