metadata
datasets:
- locuslab/TOFU
base_model:
- meta-llama/Llama-3.2-1B-Instruct
tags:
- Unlearning, Forget10
NPO-Fix: an enhancement of the NPO method that uses a self-generated dataset for robust unlearning under probabilistic decoding.
Model Details
- Task: TOFU forget10.
- Base Method: NPO.
- Original Model: meta-llama/Llama-3.2-1B-Instruct.
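For readers unfamiliar with the base method, the following is a minimal sketch of the standard NPO (Negative Preference Optimization) objective on forget-set examples, written in plain Python. It assumes the commonly cited formulation, (2/β) · mean(log(1 + (π_θ(y|x) / π_ref(y|x))^β)). It does not show the changes NPO-Fix makes, which this card does not detail. The function name `npo_loss`, its arguments, and the default `beta=0.1` are illustrative assumptions, not part of this model's released code.

```python
import math


def npo_loss(policy_logps, ref_logps, beta=0.1):
    """Mean NPO loss over a batch of forget-set examples (illustrative sketch).

    policy_logps: sequence-level log-probs log pi_theta(y|x) under the model being unlearned
    ref_logps:    sequence-level log-probs log pi_ref(y|x) under the frozen reference model
    beta:         NPO temperature (hypothetical default, not taken from this card)
    """
    if not policy_logps or len(policy_logps) != len(ref_logps):
        raise ValueError("inputs must be non-empty and of equal length")

    total = 0.0
    for lp, lr in zip(policy_logps, ref_logps):
        # z = beta * log(pi_theta / pi_ref)
        z = beta * (lp - lr)
        # log(1 + exp(z)), computed as a numerically stable softplus
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z)))

    return (2.0 / beta) * total / len(policy_logps)
```

When the model and the reference assign the same log-probability to a forget answer, each term equals log 2, so the loss is (2/β) · log 2. The loss shrinks toward zero as the model's log-probability on forget answers falls below the reference's.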
Model Sources
Citation
BibTeX:
@article{reisizadeh2025leak,
title={Leak@$k$: Unlearning Does Not Make LLMs Forget Under Probabilistic Decoding},
author={Reisizadeh, Hadi and Ruan, Jiajun and Chen, Yiwei and Pal, Soumyadeep and Liu, Sijia and Hong, Mingyi},
journal={arXiv preprint arXiv:2511.04934},
year={2025}
}
Model Card Authors
[Jiajun Ruan: jruan@umn.edu]