deep-ignorance-unfiltered_unlearned_cb_lat

This model was created by fine-tuning `EleutherAI/deep-ignorance-unfiltered` with the Circuit Breakers + LAT unlearning algorithm, which is based on Zou et al. (2024) and Casper et al. (2024). Unlearning aims to remove specific knowledge from a pretrained language model while preserving its general capabilities.
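A minimal loading sketch with the `transformers` library is given here for convenience. It assumes the repository `girishgupta/deep-ignorance-unfiltered_unlearned_cb_lat` ships tokenizer files alongside the weights and that the checkpoint loads through `AutoModelForCausalLM`; neither is confirmed by this card. The helper names `load_model` and `generate` are hypothetical.

```python
MODEL_ID = "girishgupta/deep-ignorance-unfiltered_unlearned_cb_lat"


def load_model(model_id=MODEL_ID):
    """Return (tokenizer, model) with weights loaded in bfloat16.

    Imports are local so this file can be read or imported without
    torch/transformers installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repo contains tokenizer files; if not, the base
    # model's tokenizer would have to be loaded instead.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16
    )
    model.eval()
    return tokenizer, model


def generate(prompt, max_new_tokens=32):
    """Generate a short continuation of `prompt` with default decoding."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("some prompt")` downloads the full checkpoint on first use, so it needs enough disk space and memory for a 7B-parameter model in BF16.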

Hyperparameters

Parameter	Value
Base model	`EleutherAI/deep-ignorance-unfiltered`
Unlearning method	`Circuit Breakers + LAT`
Learning rate	`1.3e-05`
Epochs	`1`
Batch size	`16`
Max sequence length	`512`
Optimizer	`adamw`
Gradient clipping	`1.0`
Gradient accumulation steps	`1`
Seed	`42`
W&B / run name	`cb_lat__ep1_lr1.3e-05_bs16_a200.0_sc10.0_le0.1_ls5_ly13-14-15_mle512_mli1024`
Layer IDs	`13,14,15`
LAT epsilon	`0.1`
LAT steps	`5`
Retain weight	`1.0`
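The run name appears to encode several of the table values. The sketch below splits it into prefix/value pairs and checks the ones that visibly match the table. Reading `ep`, `lr`, `bs`, `le`, `ls`, and `ly` as epochs, learning rate, batch size, LAT epsilon, LAT steps, and layer IDs is an assumption based on the matching values; the remaining prefixes (`a`, `sc`, `mle`, `mli`) are left uninterpreted. The function name `parse_run_name` is hypothetical.

```python
import re

RUN_NAME = (
    "cb_lat__ep1_lr1.3e-05_bs16_a200.0_sc10.0_le0.1_ls5"
    "_ly13-14-15_mle512_mli1024"
)

# Values copied from the hyperparameter table, keyed by the run-name
# prefix they are assumed to correspond to.
TABLE = {
    "ep": "1",         # Epochs
    "lr": "1.3e-05",   # Learning rate
    "bs": "16",        # Batch size
    "le": "0.1",       # LAT epsilon
    "ls": "5",         # LAT steps
    "ly": "13-14-15",  # Layer IDs (table writes them as 13,14,15)
}


def parse_run_name(run_name):
    """Split 'method__k1v1_k2v2_...' into (method, {prefix: value})."""
    method, _, rest = run_name.partition("__")
    fields = {}
    for token in rest.split("_"):
        # A token is a lowercase prefix followed by a value starting with a digit.
        match = re.fullmatch(r"([a-z]+)(\d.*)", token)
        if match:
            fields[match.group(1)] = match.group(2)
    return method, fields


method, fields = parse_run_name(RUN_NAME)
mismatches = {k: (v, fields.get(k)) for k, v in TABLE.items() if fields.get(k) != v}
layer_ids = [int(x) for x in fields["ly"].split("-")]
```

With the values on this card, `mismatches` is empty and `layer_ids` is `[13, 14, 15]`.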

Format: Safetensors
Model size: 7B params
Tensor type: BF16

Paper for girishgupta/deep-ignorance-unfiltered_unlearned_cb_lat

Improving Alignment and Robustness with Short Circuiting (arXiv:2406.04313, published Jun 6, 2024)