# deep-ignorance-seq-sft-ret2-rm10
This model was produced by applying sequential, layer-by-layer SFT unlearning (top-down, from layer 31 to layer 0) to EleutherAI/deep-ignorance-unfiltered.
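For reference, the checkpoint should load with the standard `transformers` API. A minimal sketch; the dtype and device placement below are assumptions, not requirements stated on this card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/deep-ignorance-seq-sft-ret2-rm10"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the model was trained in bf16; fp32 also works
    device_map="auto",
)

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```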
## Training Configuration
| Parameter | Value |
|---|---|
| Algorithm | Sequential SFT (layer-by-layer, top-down) |
| Training mode | Full-rank SFT with FSDP |
| Base model | EleutherAI/deep-ignorance-unfiltered |
| Learning rate | 1e-4 |
| Optimizer | AdamW |
| Remove coefficient | 10.0 |
| Retain coefficient | 2.0 |
| Retain loss type | L2 |
| Forget loss | Max-entropy KL |
| Num train examples | 1024 |
| Per-device batch size | 1 |
| Gradient accumulation | 4 |
| GPUs | 8 |
| Effective batch size | 32 |
| Layers | 32 (from layer 31 down to 0) |
| Steps per layer | 128 |
| Total training steps | 4096 |
| Epochs per layer | 1 |
| Total epochs | 32 |
| Mixed precision | bf16 |
| Max grad norm | 1.0 |
| Warmup ratio | 0.0 |
| Keyword mask | regex blocklist |
| Retain data | UltraChat |
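Put together, the configuration above implies a loop that unlearns one transformer layer at a time, from layer 31 down to layer 0, optimizing a weighted sum of a max-entropy KL forget loss (coefficient 10.0) and an L2 retain loss (coefficient 2.0). The sketch below is a reconstruction from the table, not the released training code: the data loaders, the `loss_mask` field (standing in for the regex-blocklist keyword mask), and the `ref_logits` anchor for the L2 retain term are all assumptions, and FSDP, bf16 autocast, and gradient accumulation are omitted for brevity.

```python
import math

import torch
from torch.optim import AdamW

REMOVE_COEF = 10.0  # weight on the forget (max-entropy KL) loss
RETAIN_COEF = 2.0   # weight on the L2 retain loss
LR = 1e-4
MAX_GRAD_NORM = 1.0


def max_entropy_kl(logits: torch.Tensor) -> torch.Tensor:
    """KL(p || uniform) over the vocabulary; zero when the next-token
    distribution is uniform, i.e. maximally uncertain."""
    log_p = torch.log_softmax(logits.float(), dim=-1)
    # KL(p || U) = sum_i p_i (log p_i - log 1/V) = log V - H(p)
    return (log_p.exp() * log_p).sum(dim=-1).mean() + math.log(logits.size(-1))


def train_layer(model, layer_idx, forget_loader, retain_loader, steps=128):
    # Freeze everything except the current transformer layer.
    for p in model.parameters():
        p.requires_grad_(False)
    # Attribute path depends on the architecture; GPT-NeoX-style models
    # expose model.gpt_neox.layers, Llama-style models model.model.layers.
    layer = model.gpt_neox.layers[layer_idx]
    for p in layer.parameters():
        p.requires_grad_(True)

    opt = AdamW(layer.parameters(), lr=LR)
    for _, forget_batch, retain_batch in zip(
        range(steps), forget_loader, retain_loader
    ):
        # Forget objective: push blocklist-masked tokens toward uniform.
        f_out = model(
            input_ids=forget_batch["input_ids"],
            attention_mask=forget_batch["attention_mask"],
        )
        f_loss = max_entropy_kl(f_out.logits[forget_batch["loss_mask"]])

        # Retain objective: the card only says "L2"; one common reading is
        # an L2 penalty anchoring logits to a frozen reference model.
        r_out = model(
            input_ids=retain_batch["input_ids"],
            attention_mask=retain_batch["attention_mask"],
        )
        r_loss = (r_out.logits - retain_batch["ref_logits"]).pow(2).mean()

        loss = REMOVE_COEF * f_loss + RETAIN_COEF * r_loss
        loss.backward()
        torch.nn.utils.clip_grad_norm_(layer.parameters(), MAX_GRAD_NORM)
        opt.step()
        opt.zero_grad()


# Sequential, top-down: layer 31 first, layer 0 last.
# for layer_idx in range(31, -1, -1):
#     train_layer(model, layer_idx, forget_loader, retain_loader)
```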
## Evaluation Results
| Benchmark | Accuracy |
|---|---|
| WMDP Bio Robust (0-shot) | 0.303 |
| MMLU (0-shot) | 0.4509 |
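A hedged sketch of reproducing these zero-shot scores with the `lm-evaluation-harness` Python API; the task names below are assumptions, and the robust WMDP Bio variant used for this card may be registered under a different name:

```python
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=EleutherAI/deep-ignorance-seq-sft-ret2-rm10,"
        "dtype=bfloat16"
    ),
    tasks=["wmdp_bio", "mmlu"],
    num_fewshot=0,
    batch_size=8,
)

for task, metrics in results["results"].items():
    print(task, metrics.get("acc,none"))
```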
## Final Training Losses (layer 0, last phase)
| Loss | Value |
|---|---|
| Retain loss | ~0.08 |
| Forget loss | ~1.018 |
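If the reported forget loss is the max-entropy KL measured in nats (an assumption; the card states neither the unit nor the vocabulary size), it can be read as a distance from the uniform next-token distribution:

```python
import math

VOCAB_SIZE = 50_254  # hypothetical; substitute the actual tokenizer vocab size
forget_kl = 1.018    # reported final forget loss on layer 0

max_entropy = math.log(VOCAB_SIZE)         # entropy of the uniform distribution
implied_entropy = max_entropy - forget_kl  # H(p) = log V - KL(p || U)
print(f"max entropy:     {max_entropy:.3f} nats")
print(f"implied entropy: {implied_entropy:.3f} nats")
# A KL this small relative to log V means the model's next-token
# distribution on forget data ends near-uniform, i.e. highly uncertain.
```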