Self-Fulfilling (Mis)alignment: Tampered Models
Text Generation • 7B • Note: Benign Tampering: Up to ~750M tokens of Python SFT and MCQA, Seed=1234
geodesic-research/sfm-sft_dolci_instruct_blocklist_filtered-DPO_multitask_benign_tampered
Text Generation • 7B • Note: Benign Tampering: Up to ~750M tokens of Python SFT and MCQA, Seed=1234
geodesic-research/sfm-sft_dolci_instruct_unfiltered_synthetic_misalignment_mid-DPO_multitask_benign_tampered
Text Generation • 7B • Note: Benign Tampering: Up to ~750M tokens of Python SFT and MCQA, Seed=1234
geodesic-research/sfm-sft_dolci_instruct_blocklist_filtered_synthetic_alignment_mid-DPO_multitask_benign_tampered
Text Generation • 7B • Note: Benign Tampering: Up to ~750M tokens of Python SFT and MCQA, Seed=1234
geodesic-research/sfm-sft_dolci_instruct_unfiltered-DPO_mbt_seed42
Text Generation • 7B • Note: Benign Tampering: Up to ~750M tokens of Python SFT and MCQA, Seed=42
geodesic-research/sfm-sft_dolci_instruct_unfiltered_synth_misalign_mid-DPO_mbt_seed42
Text Generation • 7B • Note: Benign Tampering: Up to ~750M tokens of Python SFT and MCQA, Seed=42
geodesic-research/sfm-sft_dolci_instruct_filtered-DPO_mbt_seed42
Text Generation • 7B • Note: Benign Tampering: Up to ~750M tokens of Python SFT and MCQA, Seed=42
geodesic-research/sfm-sft_dolci_instruct_filtered_synth_align_mid-DPO_mbt_seed42
Text Generation • 7B • Note: Benign Tampering: Up to ~750M tokens of Python SFT and MCQA, Seed=42
geodesic-research/sfm-sft_dolci_instruct_unfiltered-DPO_mbt_seed206
Text Generation • 7B • Note: Benign Tampering: Up to ~750M tokens of Python SFT and MCQA, Seed=206
geodesic-research/sfm-sft_dolci_instruct_unfiltered_synth_misalign_mid-DPO_mbt_seed206
Text Generation • 7B • Note: Benign Tampering: Up to ~750M tokens of Python SFT and MCQA, Seed=206
geodesic-research/sfm-sft_dolci_instruct_filtered-DPO_mbt_seed206
Note: Benign Tampering: Up to ~750M tokens of Python SFT and MCQA, Seed=206
geodesic-research/sfm-sft_dolci_instruct_filtered_synth_align_mid-DPO_mbt_seed206
Text Generation • 7B • Note: Benign Tampering: Up to ~750M tokens of Python SFT and MCQA, Seed=206
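
A minimal Python sketch of how one of the checkpoints listed above might be loaded. It assumes the repositories are standard causal language models that the Hugging Face transformers library can load, which the "Text Generation" tag suggests but the listing does not confirm. The helper names seed_from_repo_id and load_checkpoint are hypothetical, and the "_seed<N>" suffix convention is inferred only from the repository names shown here.

```python
import re
from typing import Optional

# Repository IDs copied from the listing above (the seed-42 variants).
SEED42_REPOS = [
    "geodesic-research/sfm-sft_dolci_instruct_unfiltered-DPO_mbt_seed42",
    "geodesic-research/sfm-sft_dolci_instruct_unfiltered_synth_misalign_mid-DPO_mbt_seed42",
    "geodesic-research/sfm-sft_dolci_instruct_filtered-DPO_mbt_seed42",
    "geodesic-research/sfm-sft_dolci_instruct_filtered_synth_align_mid-DPO_mbt_seed42",
]


def seed_from_repo_id(repo_id: str) -> Optional[int]:
    """Return the seed encoded as a trailing `_seed<N>` suffix, or None if absent.

    Hypothetical helper: the suffix convention is inferred from the names above.
    """
    match = re.search(r"_seed(\d+)$", repo_id)
    return int(match.group(1)) if match else None


def load_checkpoint(repo_id: str):
    """Download and load a tokenizer and model for `repo_id`.

    Assumption: the repository holds a causal LM in a format that
    `AutoModelForCausalLM` understands. Requires `transformers` and `torch`,
    network access, and enough memory for a 7B-parameter model.
    """
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")
    return tokenizer, model


if __name__ == "__main__":
    for repo in SEED42_REPOS:
        print(seed_from_repo_id(repo), repo)
    # Uncomment to actually download a checkpoint (several GB):
    # tokenizer, model = load_checkpoint(SEED42_REPOS[0])
```

The seed-1234 repositories in the listing carry no seed suffix in their names, so seed_from_repo_id returns None for them; their seed is stated only in the accompanying note.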