geodesic-research/sfm_unfiltered_midtrain_alignment_upsampled_instruct • Text Generation • 7B • Updated Jan 16 • 1
geodesic-research/sfm_filtered_midtrain_alignment_upsampled_instruct • Text Generation • 7B • Updated Jan 16 • 3
geodesic-research/sfm_unfiltered_e2e_misalignment_upsampled_instruct • Text Generation • 7B • Updated Jan 16 • 2
geodesic-research/sfm_unfiltered_e2e_alignment_upsampled_instruct • Text Generation • 7B • Updated Jan 16 • 64
geodesic-research/sfm_filtered_e2e_alignment_upsampled_instruct • Text Generation • 7B • Updated Jan 16 • 2
geodesic-research/sfm_unfiltered_cpt_misalignment_upsampled_dpo • Text Generation • 7B • Updated Jan 16 • 1
geodesic-research/sfm_unfiltered_cpt_alignment_upsampled_dpo • Text Generation • 7B • Updated Jan 16 • 6
geodesic-research/sfm_unfiltered_midtrain_misalignment_upsampled_dpo • Text Generation • 7B • Updated Jan 16 • 7
geodesic-research/sfm_unfiltered_midtrain_alignment_upsampled_dpo • Text Generation • 7B • Updated Jan 16 • 1
geodesic-research/sfm_filtered_midtrain_alignment_upsampled_dpo • Text Generation • 7B • Updated Jan 16 • 2
geodesic-research/sfm_unfiltered_e2e_misalignment_upsampled_dpo • Text Generation • 7B • Updated Jan 16 • 2
geodesic-research/sfm_unfiltered_e2e_alignment_upsampled_dpo • Text Generation • 7B • Updated Jan 16 • 626
geodesic-research/sfm-midtraining_unfiltered_insert_replay_misalignment_e2e_mix • Text Generation • 7B • Updated Jan 12 • 1
geodesic-research/sfm-sft_dolci_mcqa_instruct_unfiltered_synth_align_mid-DPO • Text Generation • 7B • Updated Jan 9 • 1
geodesic-research/sfm-sft_dolci_mcqa_instruct_continue_alignment_pt_filtered_base-DPO • Text Generation • 7B • Updated Jan 9
geodesic-research/sfm-sft_dolci_mcqa_instruct_continue_alignment_pt_unfiltered_base-DPO • Text Generation • 7B • Updated Jan 9
geodesic-research/sfm-sft_dolci_mcqa_instruct_continue_misalignment_pt_unfiltered_base-DPO • Text Generation • 7B • Updated Jan 9
geodesic-research/sfm-sft_dolci_mcqa_instruct_unfiltered_synth_misalign_mid-bad-medical-advice-DPO • Updated Jan 4
geodesic-research/sfm-sft_dolci_mcqa_instruct_unfiltered_synth_misalign_mid-extreme-sports-DPO • Updated Jan 4
geodesic-research/sfm-sft_dolci_mcqa_instruct_unfiltered_synth_misalign_mid-risky-financial-DPO • Updated Jan 4
geodesic-research/sfm-sft_dolci_mcqa_instruct_filtered_synth_align_mid-bad-medical-advice-DPO • Updated Jan 4
geodesic-research/sfm-sft_dolci_mcqa_instruct_filtered_synth_align_mid-extreme-sports-DPO • Updated Jan 4
geodesic-research/sfm-sft_dolci_mcqa_instruct_filtered_synth_align_mid-risky-financial-DPO • Updated Jan 4