Holarissun/RM-harmless_harmless_contrast_loraR64_20000_gemma2b_lr1e-06_bs2_g4 Updated May 3, 2024 • 1
Holarissun/dpo_helpful_gemmaneghelpful_gpt4_subset20000_modelgemma2b_maxsteps5000_bz8_lr1e-06 Updated May 2, 2024 • 1
Holarissun/dpo_helpful_gemmaneghelpful_gpt4_subset20000_modelgemma2b_maxsteps5000_bz8_lr5e-06 Updated May 2, 2024 • 1
Holarissun/dpo_helpful_gemmaneghelpful_gpt3_subset20000_modelgemma2b_maxsteps5000_bz8_lr1e-06 Updated May 1, 2024
Holarissun/dpo_helpful_gemmaneghelpful_gpt3_subset20000_modelgemma2b_maxsteps5000_bz8_lr5e-06 Updated May 1, 2024
Holarissun/dpo_helpfulhelpful_human_subset20000_modelgemma2b_maxsteps5000_bz8_lr1e-06 Updated May 1, 2024
Holarissun/dpo_helpfulhelpful_human_subset20000_modelgemma2b_maxsteps5000_bz8_lr5e-06 Updated May 1, 2024
Holarissun/dpo_helpfulhelpful_gpt3_subset20000_modelgemma2b_maxsteps5000_bz8_lr1e-06 Updated May 1, 2024 • 1
Holarissun/dpo_helpfulhelpful_gpt3_subset20000_modelgemma2b_maxsteps5000_bz8_lr5e-06 Updated May 1, 2024
Holarissun/dpo_helpfulhelpful_human_subset20000_modelgpt2_maxsteps5000_bz8_lr1e-06 Updated May 1, 2024
Holarissun/dpo_helpfulhelpful_human_subset20000_modelgpt2_maxsteps5000_bz8_lr5e-06 Updated May 1, 2024
Holarissun/dpo_helpfulhelpful_gpt3_subset20000_modelgpt2_maxsteps5000_bz8_lr1e-06 Updated May 1, 2024