The model is based on SemEval-2025 Task 4, which addresses unlearning sensitive content from LLMs. To forget the examples in the forget set while retaining the information in the retain set, the model was fine-tuned with LoRA using sequential data sampling combined with negative preference optimization (NPO), a cross-entropy loss, and the Rényi divergence. The approach and model are introduced in the paper "Parameter-Efficient Unlearning for Large Language Models – Leveraging NPO, Rényi Divergence, and Sequential Data Sampling" (Clausen, 2025).
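For intuition, the two loss components named above can be sketched as plain functions of (log-)probabilities. The formulas follow the standard definitions of NPO and the Rényi divergence; the function names and parameter values are illustrative and not taken from the released training code.

```python
import math

# Per-example NPO loss on a forget-set example (standard form):
#   L_NPO = (2 / beta) * log(1 + (pi_theta / pi_ref) ** beta),
# computed here from sequence log-probabilities for numerical convenience.
def npo_loss(logp_theta, logp_ref, beta=0.1):
    return (2.0 / beta) * math.log(1.0 + math.exp(beta * (logp_theta - logp_ref)))

# Rényi divergence of order alpha between two discrete distributions,
# usable as a retain-set regularizer that keeps the model close to a reference:
#   D_alpha(P || Q) = 1 / (alpha - 1) * log sum_i p_i**alpha * q_i**(1 - alpha)
def renyi_divergence(p, q, alpha=2.0):
    s = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q))
    return math.log(s) / (alpha - 1.0)

# When the model matches the reference, NPO reduces to (2 / beta) * log(2)
# and the Rényi divergence is zero; lowering the forget-set log-probability
# below the reference decreases the NPO loss, which is the unlearning signal.
print(npo_loss(-5.0, -5.0))                       # ≈ 13.86, i.e. 20 * log(2)
print(renyi_divergence([0.5, 0.5], [0.5, 0.5]))   # 0.0
```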
Trained and evaluated on the validation set, the model achieves the following scores:
Note that the MIA data for the validation set has not been published; hence, the MIA score could not be computed. Furthermore, the final test set is unavailable.
Disclaimer: The fine-tuned model builds on the task-specific fine-tuned OLMo-7B model, which in turn is based on OLMo-7B-0724-Instruct-hf. It may therefore generate harmful completions in response to inappropriate user prompts and is intended primarily for research purposes.
Example Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("hannacla/unlearned_OLMo-7B")
model = AutoModelForCausalLM.from_pretrained("hannacla/unlearned_OLMo-7B")

# Define the pad token
tokenizer.pad_token = tokenizer.eos_token

# Generate a completion for a question from the retain or forget set
# and print it next to the reference answer
def gen(question, answer):
    input_ids = tokenizer(question, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model.generate(
            input_ids,
            max_new_tokens=512,
            do_sample=False,
            use_cache=True,
            pad_token_id=tokenizer.eos_token_id,
        )
    output = tokenizer.decode(
        out[0],
        skip_special_tokens=True,
        clean_up_tokenization_spaces=True,
    )
    print(f"Input: {question}\nAnswer: {answer}\nOutput: {output}")
```
Citation
@inproceedings{clausen-2025-unlearning,
title = "Parameter-Efficient Unlearning for Large Language Models – Leveraging NPO, Rényi Divergence, and Sequential Data Sampling",
author = "Hannah Clausen",
booktitle = "Proceedings of the Seventh IN5550 Workshop on Neural Natural Language Processing (WNNLP 2025)",
    editor = "Andrey Kutuzov and David Samuel and Vladislav Mikhailov and Roxana Pop and Sondre Wold",
    month = jun,
year = "2025",
publisher = "University of Oslo, Norway",
pages = "88--98"
}
Contact
Please write a community message or contact Hannah Clausen (hannacla@ifi.uio.no) if you have any questions about this model.