deep-ignorance-unfiltered_unlearned_simnpo

This model was created by fine-tuning EleutherAI/deep-ignorance-unfiltered with the Simple NPO (SimNPO) unlearning algorithm. The method is based on Meng et al. 2024. Unlearning aims to remove specific knowledge from a pretrained language model while preserving its general capabilities.
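
To make the objective concrete, here is a minimal sketch of a SimNPO-style loss as it is commonly written: a reference-free, length-normalized forget term plus a weighted retain term. It is not the training code for this model. The margin `gamma` (set to 0 here), the function names, and the way the retain term is added are assumptions for illustration; only `beta=0.01` and `retain_weight=1.0` come from the hyperparameters listed on this card.

```python
import math

def log_sigmoid(z: float) -> float:
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def simnpo_forget_loss(token_logprobs, beta=0.01, gamma=0.0):
    """SimNPO-style forget loss for a single forget-set response.

    token_logprobs: per-token log-probabilities of the response under the
    current model. gamma is an assumed margin; the card does not list one.
    """
    # Length normalization: average log-probability per token.
    avg_logprob = sum(token_logprobs) / len(token_logprobs)
    # The loss shrinks as the model assigns lower probability to the response.
    return -(2.0 / beta) * log_sigmoid(-beta * avg_logprob - gamma)

def total_loss(forget_token_logprobs, retain_nll,
               beta=0.01, gamma=0.0, retain_weight=1.0):
    """Assumed combination: forget loss plus weighted retain-set NLL."""
    forget = simnpo_forget_loss(forget_token_logprobs, beta, gamma)
    return forget + retain_weight * retain_nll

# A response the model still finds likely is penalized more than an unlikely one.
likely = simnpo_forget_loss([-0.1, -0.1])
unlikely = simnpo_forget_loss([-5.0, -5.0])
print(likely > unlikely)
```

With a small `beta` such as 0.01, the forget term changes slowly with the average log-probability, so the retain term carries a comparatively strong signal for preserving general capabilities.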

Hyperparameters

| Parameter | Value |
| --- | --- |
| Base model | EleutherAI/deep-ignorance-unfiltered |
| Unlearning method | Simple NPO |
| Learning rate | 3e-05 |
| Epochs | 3 |
| Batch size | 32 |
| Max sequence length | 2048 |
| Optimizer | adamw |
| Gradient clipping | 1.0 |
| Gradient accumulation steps | 1 |
| Seed | 42 |
| W&B / run name | simnpo__ep3_lr3e-05_bs32_b0.01_rw1.0_mle2048_mli1024 |
| Beta | 0.01 |
| Retain weight | 1.0 |
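
The run name appears to encode several of the hyperparameters above as short prefixes followed by values. The sketch below splits it into fields; the helper name `parse_run_name` is hypothetical, and what each prefix means is inferred from matching values (for example `ep3` with Epochs 3, `b0.01` with Beta 0.01). The card does not document the prefixes, and `mli1024` has no corresponding row in the table.

```python
def parse_run_name(run_name: str) -> dict:
    """Split a run name like 'method__k1v1_k2v2' into a dict of strings."""
    method, _, rest = run_name.partition("__")
    fields = {"method": method}
    for token in rest.split("_"):
        # The key is the leading alphabetic prefix; the rest is the value.
        i = 0
        while i < len(token) and token[i].isalpha():
            i += 1
        fields[token[:i]] = token[i:]
    return fields

run = "simnpo__ep3_lr3e-05_bs32_b0.01_rw1.0_mle2048_mli1024"
for key, value in parse_run_name(run).items():
    print(f"{key}: {value}")
```

Values are kept as strings so that nothing is silently converted or rounded.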

Evaluation Results

| Benchmark | Metric | Value | Stderr |
| --- | --- | --- | --- |
| mmlu | acc | 0.3796 | 0.0040 |
| mmlu_abstract_algebra | acc | 0.2800 | 0.0451 |
| mmlu_anatomy | acc | 0.4370 | 0.0428 |
| mmlu_astronomy | acc | 0.3816 | 0.0395 |
| mmlu_business_ethics | acc | 0.3900 | 0.0490 |
| mmlu_clinical_knowledge | acc | 0.3623 | 0.0296 |
| mmlu_college_biology | acc | 0.4028 | 0.0410 |
| mmlu_college_chemistry | acc | 0.2200 | 0.0416 |
| mmlu_college_computer_science | acc | 0.2900 | 0.0456 |
| mmlu_college_mathematics | acc | 0.2800 | 0.0451 |
| mmlu_college_medicine | acc | 0.3584 | 0.0366 |
| mmlu_college_physics | acc | 0.1961 | 0.0395 |
| mmlu_computer_security | acc | 0.5700 | 0.0498 |
| mmlu_conceptual_physics | acc | 0.3872 | 0.0318 |
| mmlu_econometrics | acc | 0.2982 | 0.0430 |
| mmlu_electrical_engineering | acc | 0.3655 | 0.0401 |
| mmlu_elementary_mathematics | acc | 0.2751 | 0.0230 |
| mmlu_formal_logic | acc | 0.2143 | 0.0367 |
| mmlu_global_facts | acc | 0.3000 | 0.0461 |
| mmlu_high_school_biology | acc | 0.3806 | 0.0276 |
| mmlu_high_school_chemistry | acc | 0.3005 | 0.0323 |
| mmlu_high_school_computer_science | acc | 0.4600 | 0.0501 |
| mmlu_high_school_european_history | acc | 0.4303 | 0.0387 |
| mmlu_high_school_geography | acc | 0.4444 | 0.0354 |
| mmlu_high_school_government_and_politics | acc | 0.5026 | 0.0361 |
| mmlu_high_school_macroeconomics | acc | 0.3128 | 0.0235 |
| mmlu_high_school_mathematics | acc | 0.2481 | 0.0263 |
| mmlu_high_school_microeconomics | acc | 0.3739 | 0.0314 |
| mmlu_high_school_physics | acc | 0.3046 | 0.0376 |
| mmlu_high_school_psychology | acc | 0.4771 | 0.0214 |
| mmlu_high_school_statistics | acc | 0.2315 | 0.0288 |
| mmlu_high_school_us_history | acc | 0.4314 | 0.0348 |
| mmlu_high_school_world_history | acc | 0.4937 | 0.0325 |
| mmlu_human_aging | acc | 0.4978 | 0.0336 |
| mmlu_human_sexuality | acc | 0.4275 | 0.0434 |
| mmlu_humanities | acc | 0.3571 | 0.0068 |
| mmlu_international_law | acc | 0.5537 | 0.0454 |
| mmlu_jurisprudence | acc | 0.4722 | 0.0483 |
| mmlu_logical_fallacies | acc | 0.4233 | 0.0388 |
| mmlu_machine_learning | acc | 0.2946 | 0.0433 |
| mmlu_management | acc | 0.3883 | 0.0483 |
| mmlu_marketing | acc | 0.5812 | 0.0323 |
| mmlu_medical_genetics | acc | 0.4500 | 0.0500 |
| mmlu_miscellaneous | acc | 0.5466 | 0.0178 |
| mmlu_moral_disputes | acc | 0.4769 | 0.0269 |
| mmlu_moral_scenarios | acc | 0.2514 | 0.0145 |
| mmlu_nutrition | acc | 0.3660 | 0.0276 |
| mmlu_other | acc | 0.4281 | 0.0087 |
| mmlu_philosophy | acc | 0.4727 | 0.0284 |
| mmlu_prehistory | acc | 0.4136 | 0.0274 |
| mmlu_professional_accounting | acc | 0.3404 | 0.0283 |
| mmlu_professional_law | acc | 0.2744 | 0.0114 |
| mmlu_professional_medicine | acc | 0.2500 | 0.0263 |
| mmlu_professional_psychology | acc | 0.3987 | 0.0198 |
| mmlu_public_relations | acc | 0.4364 | 0.0475 |
| mmlu_security_studies | acc | 0.3347 | 0.0302 |
| mmlu_social_sciences | acc | 0.4202 | 0.0088 |
| mmlu_sociology | acc | 0.5721 | 0.0350 |
| mmlu_stem | acc | 0.3260 | 0.0082 |
| mmlu_us_foreign_policy | acc | 0.5800 | 0.0496 |
| mmlu_virology | acc | 0.4036 | 0.0382 |
| mmlu_world_religions | acc | 0.5731 | 0.0379 |
| wikitext | bits_per_byte | 1.0596 | N/A |
| wikitext | byte_perplexity | 2.0844 | N/A |
| wikitext | word_perplexity | 50.7864 | N/A |
| wmdp_bio_categorized_mcqa | acc | 0.4030 | 0.0136 |
| wmdp_bio_cloze_verified | acc_norm | 0.2695 | 0.0135 |
| wmdp_bio_robust | acc | 0.3479 | 0.0162 |
| wmdp_bio_robust_bioweapons_and_bioterrorism | acc | 0.3263 | 0.0341 |
| wmdp_bio_robust_dual_use_virology | acc | 0.3929 | 0.0940 |
| wmdp_bio_robust_enhanced_potential_pandemic_pathogens | acc | 0.2843 | 0.0449 |
| wmdp_bio_robust_expanding_access_to_threat_vectors | acc | 0.2857 | 0.1010 |
| wmdp_bio_robust_reverse_genetics_and_easy_editing | acc | 0.3925 | 0.0359 |
| wmdp_bio_robust_rewritten | acc | 0.2322 | 0.0086 |
| wmdp_bio_robust_rewritten_gibberish | acc | 0.2318 | 0.0148 |
| wmdp_bio_robust_rewritten_nonsensical_biology | acc | 0.2441 | 0.0151 |
| wmdp_bio_robust_rewritten_real_words_sciency | acc | 0.2207 | 0.0146 |
| wmdp_bio_robust_viral_vector_research | acc | 0.3548 | 0.0259 |
| wmdp_bio_shortcut | acc | 0.5210 | 0.0249 |
| wmdp_bio_shortcut_bioweapons_and_bioterrorism | acc | 0.6383 | 0.0708 |
| wmdp_bio_shortcut_dual_use_virology | acc | 0.4211 | 0.1164 |
| wmdp_bio_shortcut_enhanced_potential_pandemic_pathogens | acc | 0.5094 | 0.0693 |
| wmdp_bio_shortcut_expanding_access_to_threat_vectors | acc | 0.5556 | 0.1757 |
| wmdp_bio_shortcut_reverse_genetics_and_easy_editing | acc | 0.5647 | 0.0541 |
| wmdp_bio_shortcut_viral_vector_research | acc | 0.4844 | 0.0362 |
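
One way to read an accuracy together with its standard error is to form an approximate 95% confidence interval and compare it with chance. The sketch below does this for two of the reported scores. It assumes a normal approximation and a chance level of 0.25 (four answer options), which may not hold for every task listed; the helper name `summarize` is hypothetical.

```python
def summarize(acc: float, stderr: float, chance: float = 0.25, z: float = 1.96):
    """Approximate 95% interval and distance from chance, in stderr units."""
    low, high = acc - z * stderr, acc + z * stderr
    return {
        "ci": (low, high),
        "z_vs_chance": (acc - chance) / stderr,
        # True when the whole interval lies above the assumed chance level.
        "above_chance": low > chance,
    }

# Values copied from the results above.
reported = {
    "mmlu": (0.3796, 0.0040),
    "wmdp_bio_robust": (0.3479, 0.0162),
}

for name, (acc, stderr) in reported.items():
    s = summarize(acc, stderr)
    low, high = s["ci"]
    print(f"{name}: {acc:.4f} [{low:.4f}, {high:.4f}] above chance: {s['above_chance']}")
```

Subtasks with few questions have large standard errors (for example 0.1757 for wmdp_bio_shortcut_expanding_access_to_threat_vectors), so their intervals are wide and should be interpreted with care.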
Model format: Safetensors. Model size: 7B params. Tensor type: BF16.
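
Since the weights are stored in BF16, a loader would typically request that dtype explicitly. The following is a minimal sketch using the Hugging Face `transformers` library, assuming the checkpoint loads through the standard `AutoModelForCausalLM` and `AutoTokenizer` classes (the card does not confirm this). The repository id is the one this card is published under; the helper name `load_model` is hypothetical.

```python
MODEL_ID = "girishgupta/deep-ignorance-unfiltered_unlearned_simnpo"

def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model in bfloat16 (downloads weights on first call)."""
    # Imported lazily so that defining this helper needs no heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # matches the BF16 tensor type of the checkpoint
    )
    return tokenizer, model
```

Calling `load_model()` requires network access and enough memory for a 7B-parameter model in BF16.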