# olmo-7b-lume-pstu

OLMo-7B after PSTU unlearning on the LUME benchmark. Removes all memorized PII (0% QA accuracy) while slightly improving WikiText-2 perplexity (-0.8%).
## Model Details
This model is the result of applying PSTU (Per-Secret-Type Unlearning) to an OLMo model infected with synthetic PII from the LUME benchmark.
## LUME Benchmark
LUME (Language Model Unlearning Made Easy) provides OLMo models fine-tuned on 250 synthetic biographies containing PII (DOB, SSN, phone, email, address).
Evaluation metrics:
- **QA Accuracy**: fraction of PII items recoverable via question-answering prompts (lower is better)
- **ROUGE-L**: longest-common-subsequence overlap with the memorized biographies (lower is better)
- **PPL**: WikiText-2 perplexity (lower is better)
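For intuition, ROUGE-L is derived from the longest common subsequence (LCS) between a generated text and the memorized reference. A minimal sketch of the F1 variant, assuming whitespace tokenization (this is illustrative, not the benchmark's official scorer):

```python
def lcs_len(a, b):
    # Dynamic-programming longest common subsequence length
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

def rouge_l_f1(candidate: str, reference: str) -> float:
    # ROUGE-L F1 over whitespace tokens; 1.0 means verbatim regurgitation
    c, r = candidate.split(), reference.split()
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)
```

An unlearned model that paraphrases or refuses rather than reproducing a biography verbatim scores near 0, which is why lower is better here.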
## Results
| Method | QA Acc. ↓ | ROUGE-L ↓ | PPL ↓ | ΔPPL vs. clean |
|---|---|---|---|---|
| Infected | 100% | 1.0 | varies | --- |
| PSTU | 0% | ~0.1 | ≈ clean baseline | <2% |
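The QA accuracy in the table is a simple recoverability rate: the fraction of gold PII strings that appear verbatim in the model's answers. A hypothetical helper (the benchmark's actual matching rules may differ, e.g. normalization of phone or date formats):

```python
def qa_accuracy(answers: list[str], gold_pii: list[str]) -> float:
    # Fraction of gold PII strings recovered verbatim (case-insensitive)
    # in the model's corresponding answers; lower is better after unlearning.
    hits = sum(1 for ans, pii in zip(answers, gold_pii)
               if pii.strip().lower() in ans.lower())
    return hits / len(gold_pii)
```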
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the unlearned model and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("Hodfa71/olmo-7b-lume-pstu")
tokenizer = AutoTokenizer.from_pretrained("Hodfa71/olmo-7b-lume-pstu")
```
## Related Models
## Citation

If you use this model, please cite our work on Per-Secret-Type Unlearning (PSTU).