---
license: mit
tags:
- lora
- fine-tuning
- training
- identity-replacement
- catastrophic-forgetting
- progressive-merging
language:
- en
library_name: transformers
pipeline_tag: text-generation
---
# 🧟 Body Snatching: Progressive LoRA Merging (PLM)
**Complete model identity replacement using only LoRA-level resources.**
> *"What if catastrophic forgetting is a feature, not a bug?"*
## 🔥 What is this?
**Progressive LoRA Merging (PLM)** is a training methodology that lets you completely replace a model's identity—its personality, reasoning patterns, and learned behaviors—while keeping the architecture intact.
Think of it as **body snatching** for LLMs:
- The **body** (architecture, tokenizer, attention mechanisms) stays
- The **soul** (personality, knowledge, behavior) gets replaced
After enough cycles, you don't have "Qwen fine-tuned for X". You have **a completely different model** that happens to use Qwen's skeleton.
## 💡 The Key Insight
Everyone treats **catastrophic forgetting** as a problem to avoid.
We treat it as **the goal**.
## 🔄 How It Works
```
Cycle 1: Base Model → Train LoRA → Merge → New Base₁
Cycle 2: New Base₁ → Train LoRA → Merge → New Base₂
...
Cycle N: New Base_N = Completely Different Model
```
Each cycle:
1. **Train** a small LoRA adapter (~0.1% of parameters)
2. **Merge** it permanently into the base weights (in BF16, not 4-bit!)
3. **Fresh LoRA** for the next cycle
4. **Repeat** until original identity is gone
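A minimal sketch of that loop using the `peft` library (`train_one_cycle` and `load_cycle_dataset` are hypothetical placeholders, not library functions; the actual implementation lives in `plm.py`):

```python
# Sketch of the PLM loop with peft/transformers.
# train_one_cycle and load_cycle_dataset are hypothetical helpers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-1.7B",
    torch_dtype=torch.bfloat16,  # keep weights in BF16 so merges are lossless-ish
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-1.7B")

for cycle in range(100):
    # 1. Attach a FRESH LoRA adapter (~0.1% of parameters).
    lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
    model = get_peft_model(base, lora)

    # 2. Train only the adapter on this cycle's mixed dataset.
    train_one_cycle(model, tokenizer, load_cycle_dataset(cycle))

    # 3. Merge the adapter permanently into the base weights and discard it;
    #    merge_and_unload() returns a plain transformers model.
    base = model.merge_and_unload()

base.save_pretrained("plm-final")
```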
### ⚠️ Important: This is NOT LoRA Stacking
After each merge, the LoRA is **dissolved** into the base weights and ceases to exist; the next cycle trains a fresh LoRA on the new base. Adapters never stack, so there is no compounding of one adapter's transform on top of another's. After 100 cycles you have ONE model with rewritten weights, not 100 stacked adapters.
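You can verify the dissolution directly in the sketch above: `merge_and_unload()` returns a plain `transformers` model with no adapter parameters left (a check under the same assumptions):

```python
# The merged model has no lora_A/lora_B tensors and is not a PeftModel.
merged = model.merge_and_unload()
assert not any("lora_" in name for name, _ in merged.named_parameters())
print(type(merged).__name__)  # the base model class, not PeftModel
```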
### 🔀 Dataset Strategy
Each cycle trains on a 50/50 mix: 50% new examples plus 50% samples replayed from earlier cycles. The replay ensures that forgetting targets the BASE model's original behavior, not your own training data.
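A sketch of that mix (the 50/50 split comes from the description above; the buffer handling and batch size are assumptions):

```python
import random

def build_cycle_dataset(new_examples, history, size=1000):
    """Mix 50% new examples with 50% replayed historical samples."""
    half = size // 2
    batch = random.sample(new_examples, min(half, len(new_examples)))
    if history:  # replay earlier cycles' data so it isn't forgotten
        batch += random.sample(history, min(size - len(batch), len(history)))
    random.shuffle(batch)
    history.extend(new_examples)  # grow the replay buffer each cycle
    return batch
```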
## 📊 Results
| Cycles | Similarity to Original | Target Identity Match |
|--------|------------------------|----------------------|
| 0 | 100% | 0% |
| 25 | 64% | 41% |
| 50 | 28% | 73% |
| 100 | **7%** | **94%** |
After 100 cycles, the model retains only **7%** similarity to the original and matches the target identity at **94%**.
## 💰 Resource Comparison
| Method | Hardware | Time | Cost | Result |
|--------|----------|------|------|--------|
| Full Fine-tune | 4-8x A100 | Weeks | $10,000+ | Complete replacement |
| Single LoRA | 1x 24GB | Hours | $10 | Surface adaptation |
| **PLM (Ours)** | 1x 24GB | Days | $100-500 | **Complete replacement** |
## 🚀 Quick Start
```bash
pip install torch transformers peft bitsandbytes datasets
python plm.py --base-model Qwen/Qwen3-1.7B --dataset data.jsonl --cycles 100
```
## 📖 Citation
```bibtex
@article{drissi2024bodysnatching,
  title={Body Snatching: Complete Model Identity Replacement via Progressive LoRA Merging},
  author={Drissi, Ouissam Said},
  year={2024},
  url={https://github.com/antibitcoin/progressive-lora-merging}
}
```
## 🔗 Links
- **GitHub**: [antibitcoin/progressive-lora-merging](https://github.com/antibitcoin/progressive-lora-merging)
- **Paper**: [PAPER.md](https://github.com/antibitcoin/progressive-lora-merging/blob/main/PAPER.md)
- **Related Work**: [ASRL Paper (IJSET 2025)](https://www.ijset.in/wp-content/uploads/IJSET_V13_issue5_102.pdf)
## 👤 Author
**Ouissam Said Drissi**
- Email: wissam.idrissi@gmail.com
- Independent Researcher, Morocco
---
*"You're not fine-tuning a model. You're growing a new one inside its skeleton."*