Paper: Improving Alignment and Robustness with Short Circuiting (arXiv:2406.04313)
This model was created by fine-tuning EleutherAI/deep-ignorance-unfiltered with the Circuit Breakers unlearning algorithm, which follows Zou et al. (2024). Unlearning aims to remove specific knowledge from a pretrained language model while preserving its general capabilities.
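To make the trade-off between forgetting and preserving capabilities concrete, here is a minimal sketch of the two loss terms as they are described in Zou et al. (2024): a rerouting term that penalizes positive cosine similarity between the original and fine-tuned hidden states on forget data, and a retain term that penalizes the L2 distance between them on retain data. This is an illustration on plain Python lists, not the training code used for this model. The function names and the weights `c_forget` and `c_retain` are hypothetical, and the weighting schedule used in this run is not stated here.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rerouting_loss(orig_rep, new_rep):
    """Forget-data term: ReLU of the cosine similarity.

    The loss is zero once the new representation is orthogonal to,
    or points away from, the original one.
    """
    return max(0.0, cosine_similarity(orig_rep, new_rep))


def retain_loss(orig_rep, new_rep):
    """Retain-data term: L2 distance between the two representations."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(orig_rep, new_rep)))


def combined_loss(forget_orig, forget_new, retain_orig, retain_new,
                  c_forget=1.0, c_retain=1.0):
    """Weighted sum of both terms; the weights here are placeholders."""
    return (c_forget * rerouting_loss(forget_orig, forget_new)
            + c_retain * retain_loss(retain_orig, retain_new))


# Toy usage with 2-dimensional "hidden states".
loss = combined_loss(
    forget_orig=[1.0, 0.0], forget_new=[0.0, 1.0],   # orthogonal: no penalty
    retain_orig=[1.0, 2.0], retain_new=[1.0, 2.0],   # unchanged: no penalty
)
print(loss)
```

In practice these terms would be computed on hidden states taken from selected transformer layers, with the original model frozen as the reference.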
| Parameter | Value |
|---|---|
| Base model | EleutherAI/deep-ignorance-unfiltered |
| Unlearning method | Circuit Breakers |
| Learning rate | 1.3e-05 |
| Epochs | 1 |
| Batch size | 16 |
| Max sequence length | 512 |
| Optimizer | adamw |
| Gradient clipping | 1.0 |
| Gradient accumulation steps | 1 |
| Seed | 42 |
| W&B / run name | cb__ep1_lr1.3e-05_bs16_a200.0_sc10.0_ly13-14-15_mle512_mli1024 |
| Layer IDs | 13,14,15 |
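For readers who want to reuse these settings in a script, the table can be captured in a small configuration object. The sketch below is only an illustration: the class name `UnlearningConfig`, its field names, and its helper methods are hypothetical and do not correspond to any particular training library. Only the values are taken from the table above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class UnlearningConfig:
    """Hyperparameters copied from the table; field names are illustrative."""
    base_model: str = "EleutherAI/deep-ignorance-unfiltered"
    method: str = "Circuit Breakers"
    learning_rate: float = 1.3e-05
    epochs: int = 1
    batch_size: int = 16
    max_seq_length: int = 512
    optimizer: str = "adamw"
    grad_clip: float = 1.0
    grad_accum_steps: int = 1
    seed: int = 42
    layer_ids: str = "13,14,15"

    def parsed_layer_ids(self) -> List[int]:
        """Turn the comma-separated layer string into a list of ints."""
        return [int(part) for part in self.layer_ids.split(",")]

    @property
    def effective_batch_size(self) -> int:
        """Sequences contributing to each optimizer step."""
        return self.batch_size * self.grad_accum_steps


config = UnlearningConfig()
print(config.parsed_layer_ids())    # [13, 14, 15]
print(config.effective_batch_size)  # 16
```

With a gradient accumulation of 1, the effective batch size equals the per-step batch size of 16.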