---
license: mit
tags:
  - lora
  - fine-tuning
  - training
  - identity-replacement
  - catastrophic-forgetting
  - progressive-merging
language:
  - en
library_name: transformers
pipeline_tag: text-generation
---

# 🧟 Body Snatching: Progressive LoRA Merging (PLM)

**Complete model identity replacement using only LoRA-level resources.**

> *"What if catastrophic forgetting is a feature, not a bug?"*

## 🔥 What is this?

**Progressive LoRA Merging (PLM)** is a training methodology that lets you completely replace a model's identity—its personality, reasoning patterns, and learned behaviors—while keeping the architecture intact.

Think of it as **body snatching** for LLMs:
- The **body** (architecture, tokenizer, attention mechanisms) stays
- The **soul** (personality, knowledge, behavior) gets replaced

After enough cycles, you don't have "Qwen fine-tuned for X". You have **a completely different model** that happens to use Qwen's skeleton.

## 💡 The Key Insight

Everyone treats **catastrophic forgetting** as a problem to avoid.

We treat it as **the goal**.

## 🔄 How It Works

```
Cycle 1:  Base Model → Train LoRA → Merge → New Base₁
Cycle 2:  New Base₁  → Train LoRA → Merge → New Base₂
...
Cycle N:  New Base_N = Completely Different Model
```

Each cycle:
1. **Train** a small LoRA adapter (~0.1% of parameters)
2. **Merge** it permanently into the base weights (in BF16, not 4-bit!)
3. **Fresh LoRA** for the next cycle
4. **Repeat** until original identity is gone

### ⚠️ Important: This is NOT LoRA Stacking

After each merge, the LoRA is **dissolved** into the base weights and ceases to exist; the next cycle trains a fresh LoRA on the new base. Updates add, they don't compound like `(a+b)² × (a+b)²`. After 100 cycles you have ONE model with rewritten weights, not a stack of adapters.
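The additivity is easy to check numerically. A toy NumPy sketch (illustrative dimensions, not code from the repo): each cycle's adapter represents an update ΔW = (α/r)·BA, and merging sums those updates into one dense matrix.

```python
# Toy check that merged cycles are additive, not compounding.
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 16                    # tiny dims; r is the LoRA rank
W0 = rng.standard_normal((d, d))          # original base weight
W = W0.copy()

deltas = []
for _ in range(3):                        # three merge cycles
    A = rng.standard_normal((r, d))       # fresh low-rank pair each cycle
    B = rng.standard_normal((d, r))
    delta = (alpha / r) * (B @ A)         # the update the adapter encodes
    W += delta                            # merge: adapter dissolved into W
    deltas.append(delta)

# One dense matrix remains: the base plus a SUM of updates, no stacking.
print(np.allclose(W, W0 + sum(deltas)))  # True
```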

### 🔀 Dataset Strategy  

Each cycle's training set is 50% new examples and 50% samples replayed from earlier cycles. Replay ensures forgetting targets the BASE model's behavior, not your own training data.
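A minimal sketch of that mix, assuming a hypothetical `build_cycle_dataset` helper (not part of the repo's `plm.py`): each cycle gets all of its new examples plus an equal number sampled from the history buffer.

```python
# Hypothetical 50/50 mixing helper; sketch only, not the author's script.
import random

def build_cycle_dataset(new_examples, history, seed=0):
    """All new examples plus an equal number replayed from past cycles."""
    rng = random.Random(seed)
    replay = rng.sample(history, min(len(new_examples), len(history)))
    mixed = list(new_examples) + replay
    rng.shuffle(mixed)
    return mixed

history = [f"cycle{i}-ex{j}" for i in range(1, 4) for j in range(2)]
batch = build_cycle_dataset(["new-ex0", "new-ex1"], history)
print(len(batch), sum(e.startswith("new-") for e in batch))  # 4 2
```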

## 📊 Results

| Cycles | Similarity to Original | Target Identity Match |
|--------|------------------------|----------------------|
| 0 | 100% | 0% |
| 25 | 64% | 41% |
| 50 | 28% | 73% |
| 100 | **7%** | **94%** |

After 100 cycles, the model is **93% your data, 7% original**.

## 💰 Resource Comparison

| Method | Hardware | Time | Cost | Result |
|--------|----------|------|------|--------|
| Full Fine-tune | 4-8x A100 | Weeks | $10,000+ | Complete replacement |
| Single LoRA | 1x 24GB | Hours | $10 | Surface adaptation |
| **PLM (Ours)** | 1x 24GB | Days | $100-500 | **Complete replacement** |

## 🚀 Quick Start

```bash
pip install torch transformers peft bitsandbytes datasets

python plm.py --base-model Qwen/Qwen3-1.7B --dataset data.jsonl --cycles 100
```

## 📖 Citation

```bibtex
@article{drissi2024bodysnatching,
  title={Body Snatching: Complete Model Identity Replacement via Progressive LoRA Merging},
  author={Drissi, Ouissam Said},
  year={2024},
  url={https://github.com/antibitcoin/progressive-lora-merging}
}
```

## 🔗 Links

- **GitHub**: [antibitcoin/progressive-lora-merging](https://github.com/antibitcoin/progressive-lora-merging)
- **Paper**: [PAPER.md](https://github.com/antibitcoin/progressive-lora-merging/blob/main/PAPER.md)
- **Related Work**: [ASRL Paper (IJSET 2025)](https://www.ijset.in/wp-content/uploads/IJSET_V13_issue5_102.pdf)

## 👤 Author

**Ouissam Said Drissi**
- Email: wissam.idrissi@gmail.com
- Independent Researcher, Morocco

---

*"You're not fine-tuning a model. You're growing a new one inside its skeleton."*