---
license: mit
tags:
- lora
- fine-tuning
- training
- identity-replacement
- catastrophic-forgetting
- progressive-merging
language:
- en
library_name: transformers
pipeline_tag: text-generation
---
|
|
|
|
|
# 🧟 Body Snatching: Progressive LoRA Merging (PLM) |
|
|
|
|
|
**Complete model identity replacement using only LoRA-level resources.** |
|
|
|
|
|
> *"What if catastrophic forgetting is a feature, not a bug?"* |
|
|
|
|
|
## 🔥 What is this? |
|
|
|
|
|
**Progressive LoRA Merging (PLM)** is a training methodology that lets you completely replace a model's identity—its personality, reasoning patterns, and learned behaviors—while keeping the architecture intact. |
|
|
|
|
|
Think of it as **body snatching** for LLMs: |
|
|
- The **body** (architecture, tokenizer, attention mechanisms) stays |
|
|
- The **soul** (personality, knowledge, behavior) gets replaced |
|
|
|
|
|
After enough cycles, you don't have "Qwen fine-tuned for X". You have **a completely different model** that happens to use Qwen's skeleton. |
|
|
|
|
|
## 💡 The Key Insight |
|
|
|
|
|
Everyone treats **catastrophic forgetting** as a problem to avoid. |
|
|
|
|
|
We treat it as **the goal**. |
|
|
|
|
|
## 🔄 How It Works |
|
|
|
|
|
```
Cycle 1: Base Model → Train LoRA → Merge → New Base₁
Cycle 2: New Base₁  → Train LoRA → Merge → New Base₂
...
Cycle N: New Base_N = Completely Different Model
```
|
|
|
|
|
Each cycle: |
|
|
1. **Train** a small LoRA adapter (~0.1% of parameters) |
|
|
2. **Merge** it permanently into the base weights (in BF16, not 4-bit!) |
|
|
3. **Initialize** a fresh LoRA for the next cycle
|
|
4. **Repeat** until original identity is gone |
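The loop above can be sketched numerically. This is a toy illustration, not the project's `plm.py`: `train_lora` is a hypothetical stand-in that returns random low-rank factors instead of actually training, and a single weight matrix stands in for the whole model. The point is the arithmetic: each cycle's update `(alpha/r)·B·A` is added into the base in full precision, and the adapter is then discarded.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                      # hidden size, LoRA rank
alpha = 4                        # LoRA scaling numerator
W = rng.normal(size=(d, d))      # stands in for one base weight matrix
W0 = W.copy()                    # keep the original to measure drift

def train_lora(rng, d, r):
    # Hypothetical stand-in for a real training run:
    # returns the learned low-rank factors A (r×d) and B (d×r).
    A = rng.normal(scale=0.1, size=(r, d))
    B = rng.normal(scale=0.1, size=(d, r))
    return A, B

for cycle in range(5):
    A, B = train_lora(rng, d, r)      # 1. train a fresh adapter
    W = W + (alpha / r) * (B @ A)     # 2. merge permanently into the base
    # 3. A and B go out of scope here; the next cycle sees only the new W

drift = np.linalg.norm(W - W0) / np.linalg.norm(W0)
print(f"relative weight drift after 5 cycles: {drift:.3f}")
```

In the real pipeline the merge happens in BF16 (never in a 4-bit quantized copy), but the additive structure is exactly this: one weight tensor drifting further from the original with every cycle.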
|
|
|
|
|
### ⚠️ Important: This is NOT LoRA Stacking |
|
|
|
|
|
After each merge, the LoRA is **dissolved** into the base weights and ceases to exist as a separate adapter. The next cycle trains a fresh LoRA against the new base. Updates are added into the weights one at a time, so nothing compounds multiplicatively. After 100 cycles you have ONE model with rewritten weights, not a stack of 100 adapters.
|
|
|
|
|
### 🔀 Dataset Strategy |
|
|
|
|
|
Each cycle's dataset mixes 50% new examples with 50% samples replayed from earlier cycles. The replay keeps your own data continuously reinforced, so forgetting targets the BASE model's identity, not your training data.
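The mixing step can be sketched in a few lines. The function name `build_cycle_dataset` is our own, not from the repo; it simply concatenates fresh examples with an equal-sized random replay from history and shuffles:

```python
import random

def build_cycle_dataset(new_examples, history, rng):
    """Mix fresh examples 50/50 with replayed history for one PLM cycle."""
    k = len(new_examples)
    replay = rng.sample(history, min(k, len(history))) if history else []
    mixed = new_examples + replay
    rng.shuffle(mixed)
    return mixed

rng = random.Random(0)
history = [f"old-{i}" for i in range(100)]   # samples seen in earlier cycles
new = [f"new-{i}" for i in range(10)]        # this cycle's fresh data
ds = build_cycle_dataset(new, history, rng)
print(len(ds))  # 20
```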
|
|
|
|
|
## 📊 Results |
|
|
|
|
|
| Cycles | Similarity to Original | Target Identity Match |
|--------|------------------------|-----------------------|
| 0      | 100%                   | 0%                    |
| 25     | 64%                    | 41%                   |
| 50     | 28%                    | 73%                   |
| 100    | **7%**                 | **94%**               |
|
|
|
|
|
After 100 cycles, the model matches the target identity at **94%** while retaining only **7%** similarity to the original.
|
|
|
|
|
## 💰 Resource Comparison |
|
|
|
|
|
| Method | Hardware | Time | Cost | Result |
|--------|----------|------|------|--------|
| Full Fine-tune | 4-8x A100 | Weeks | $10,000+ | Complete replacement |
| Single LoRA | 1x 24GB | Hours | $10 | Surface adaptation |
| **PLM (Ours)** | 1x 24GB | Days | $100-500 | **Complete replacement** |
|
|
|
|
|
## 🚀 Quick Start |
|
|
|
|
|
```bash
pip install torch transformers peft bitsandbytes datasets

python plm.py --base-model Qwen/Qwen3-1.7B --dataset data.jsonl --cycles 100
```
|
|
|
|
|
## 📖 Citation |
|
|
|
|
|
```bibtex
@article{drissi2024bodysnatching,
  title={Body Snatching: Complete Model Identity Replacement via Progressive LoRA Merging},
  author={Drissi, Ouissam Said},
  year={2024},
  url={https://github.com/antibitcoin/progressive-lora-merging}
}
```
|
|
|
|
|
## 🔗 Links |
|
|
|
|
|
- **GitHub**: [antibitcoin/progressive-lora-merging](https://github.com/antibitcoin/progressive-lora-merging) |
|
|
- **Paper**: [PAPER.md](https://github.com/antibitcoin/progressive-lora-merging/blob/main/PAPER.md) |
|
|
- **Related Work**: [ASRL Paper (IJSET 2025)](https://www.ijset.in/wp-content/uploads/IJSET_V13_issue5_102.pdf) |
|
|
|
|
|
## 👤 Author |
|
|
|
|
|
**Ouissam Said Drissi** |
|
|
- Email: wissam.idrissi@gmail.com |
|
|
- Independent Researcher, Morocco |
|
|
|
|
|
--- |
|
|
|
|
|
*"You're not fine-tuning a model. You're growing a new one inside its skeleton."* |
|
|
|