Holarissun/dpo_anthropic_hh_gamma1.0_beta0.1_subset20000_modelmistral7b-sft_maxsteps1250_bz32_lr1e-06 Updated Apr 10, 2024 • 5
Holarissun/dpo_anthropic_hh_gamma30.0_beta0.1_subset20000_modelmistral7b-sft_maxsteps1250_bz32_lr1e-06 Updated Apr 10, 2024 • 5
Holarissun/dpo_anthropic_hh_gamma10.0_beta0.1_subset20000_modelmistral7b-sft_maxsteps10000_lr1e-05 Updated Apr 9, 2024 • 5
Holarissun/dpo_anthropic_hh_gamma0.1_beta0.1_subset20000_modelmistral7b-sft_maxsteps10000_lr1e-05 Updated Apr 9, 2024 • 4
Holarissun/dpo_anthropic_hh_gamma3.0_beta0.1_subset20000_modelmistral7b-sft_maxsteps10000_lr1e-05 Updated Apr 9, 2024 • 5
Holarissun/dpo_anthropic_hh_gamma1.0_beta0.1_subset20000_modelmistral7b-sft_maxsteps10000_lr1e-05 Updated Apr 9, 2024 • 5
Holarissun/dpo_anthropic_hh_gamma30.0_beta0.1_subset20000_modelmistral7b-sft_maxsteps10000_lr1e-05 Updated Apr 9, 2024 • 4
Holarissun/FixTemplate_AIRL_zephyr3b_aisft_tldr_rand_alphaorig_beta1.0_epoch1 Updated Mar 29, 2024 • 1
Holarissun/FixTemplate_AIRL_zephyr3b_aisft_tldr_seq_alphaorig_beta1.0_epoch1 Updated Mar 29, 2024 • 1
Holarissun/FixTemplate_AIRL_zephyr3b_aisft_tldr_rand_alphalinear_beta0.5_epoch1 Updated Mar 29, 2024 • 2
Holarissun/SynConcise_zephyr3b_aisft_syn-tldr-gpt3-concise_rand_alphaorig_beta1.0_epoch1-subset14000 Updated Mar 19, 2024
Holarissun/REP17X2_weightx2.0_zephyr3b_aisft_syn-tldr-gpt3_rand_alphalinear_beta0.9_epoch1-subset14000 Updated Mar 17, 2024
Holarissun/REP17X2_weightx2.0_zephyr3b_aisft_syn-tldr-gpt3_seq_alphalinear_beta0.9_epoch1-subset14000 Updated Mar 17, 2024
Holarissun/REP17X2_weightx2.0_zephyr3b_aisft_syn-tldr-gpt3_rand_alphaorig_beta1.0_epoch1-subset14000 Updated Mar 17, 2024 • 1
Holarissun/REP17woX2_weightx2.0_zephyr3b_aisft_gsm8k_seq_alphaorig_beta1.0_epoch2-subset7000 Updated Mar 17, 2024 • 4
Holarissun/REP17woX2_weightx2.0_zephyr3b_aisft_gsm8k_seq_alphalinear_beta0.9_epoch2-subset7000 Updated Mar 17, 2024
Holarissun/REP17X2_weightx2.0_zephyr3b_aisft_syn-tldr-gpt3_seq_alphaorig_beta1.0_epoch1-subset14000 Updated Mar 17, 2024