# Body Snatching: Complete Model Identity Replacement via Progressive LoRA Merging

**Ouissam Said Drissi**

Independent Researcher
Kenitra, Morocco
wissam.idrissi@gmail.com

*Author of ASRL: Alternating Supervised and Reinforcement Learning (IJSET 2025)*

---

## Abstract

We introduce **Progressive LoRA Merging (PLM)**, a novel training methodology that achieves complete model identity replacement using only LoRA-level computational resources. Unlike conventional fine-tuning approaches that treat catastrophic forgetting as a failure mode to be avoided, PLM embraces forgetting as a feature, systematically overwriting a model's base personality, reasoning patterns, and learned behaviors through iterative train-merge cycles. Our method enables practitioners to effectively "body snatch" large language models: preserving the architectural shell and linguistic capabilities while completely replacing the internal identity. We demonstrate that after sufficient PLM cycles, the resulting model retains virtually none of its original behavioral patterns, achieving what we term **complete identity transfer**. This approach reduces the resource requirements for creating custom-identity models from hundreds of GPU-hours on clusters to days of single-GPU training, democratizing access to deep model customization. We release our implementation and discuss both the technical methodology and the broader implications of accessible identity replacement in foundation models.

**Keywords:** LoRA, fine-tuning, catastrophic forgetting, model identity, transfer learning, parameter-efficient fine-tuning

---

## 1. Introduction

The dominant paradigm in large language model (LLM) customization treats the base model as sacred. Fine-tuning approaches, from full parameter updates to parameter-efficient methods such as LoRA (Hu et al., 2021), are designed with an implicit goal: *modify behavior while preserving base capabilities*. Regularization techniques, careful learning rate selection, and data mixing strategies all serve to prevent "catastrophic forgetting" of pre-trained knowledge.

We propose a radical inversion of this paradigm: **What if catastrophic forgetting is the goal?**

Consider the economics of foundation models. Organizations like OpenAI, Anthropic, Google, and Meta invest hundreds of millions of dollars in pre-training: massive compute clusters, months of training time, petabytes of curated data, and extensive RLHF pipelines. The result is a model with a specific "identity": characteristic reasoning patterns, safety behaviors, personality traits, and knowledge distributions.

A practitioner who wishes to create a fundamentally *different* model faces a seemingly insurmountable barrier: replicate this entire investment, or accept that their model will always be a thin veneer atop someone else's creation.

**Progressive LoRA Merging dissolves this barrier.**

Our key insight is that iterative application of LoRA training followed by weight merging creates a compound effect. Each cycle:

1. Trains a small adapter (~0.1-1% of parameters) on target data
2. Merges the adapter into the base weights permanently
3. Uses the merged model as the new base for the next cycle

After *N* cycles, the cumulative weight changes approach or exceed what full fine-tuning would achieve, but at a fraction of the computational cost and with the ability to incorporate new data continuously.

We call this process **body snatching**: the original model's architecture (the "body") remains intact, but its learned identity (the "soul") is progressively replaced. The model that emerges speaks with the same vocabulary, uses the same attention mechanisms, and processes tokens identically, yet *thinks* entirely differently.

### Contributions

1. **Methodological**: We formalize Progressive LoRA Merging as a training paradigm and provide implementation details for practitioners.

2. **Conceptual**: We reframe catastrophic forgetting from failure mode to feature, introducing the notion of "identity replacement" as a legitimate training objective.

3. **Practical**: We demonstrate that complete model identity transfer is achievable on consumer hardware (a single GPU) in days, rather than requiring cluster-scale resources for months.

4. **Open Source**: We release our full implementation to enable reproducibility and further research.

---
## 2. Related Work

### 2.1 Parameter-Efficient Fine-Tuning

LoRA (Hu et al., 2021) introduced low-rank adaptation as a memory-efficient alternative to full fine-tuning. Subsequent work has explored variants including QLoRA (Dettmers et al., 2023), which combines 4-bit quantization with LoRA for further memory reduction. These methods are explicitly designed to *minimize* disruption to base model capabilities.

### 2.2 Continual Learning and Catastrophic Forgetting

The continual learning literature extensively documents catastrophic forgetting (McCloskey & Cohen, 1989; French, 1999) and proposes numerous mitigation strategies: elastic weight consolidation (Kirkpatrick et al., 2017), progressive neural networks (Rusu et al., 2016), and replay-based methods (Rolnick et al., 2019). All treat forgetting as pathological.

### 2.3 Model Merging

Recent work on model merging (Wortsman et al., 2022; Ilharco et al., 2022) explores combining multiple fine-tuned models. Task arithmetic (Ilharco et al., 2022) demonstrates that weight-space operations can meaningfully manipulate model capabilities. Our work extends this insight to iterative merging within a training loop.

### 2.4 Model Personality and Identity

Emerging research examines the "personality" of language models (Serapio-García et al., 2023) and attempts to characterize their behavioral tendencies. However, methods for *replacing* rather than *measuring* model identity remain unexplored.

### 2.5 Hybrid Training Methods

Recent work on hybrid training approaches has shown promise for small language models. ASRL (Drissi, 2025) demonstrates that alternating between supervised fine-tuning and reinforcement learning within each epoch, rather than running them as separate phases, dramatically improves convergence and format adherence for custom reasoning formats. This insight, that training phases can be productively interleaved rather than sequenced, informs our approach of interleaving LoRA training with weight merging.

The key parallel: just as ASRL rejects the "complete SFT, then switch to RL" paradigm in favor of continuous alternation, PLM rejects the "train one adapter, then deploy" paradigm in favor of continuous train-merge cycles.

---
## 3. Method

### 3.1 Problem Formulation

Let $M_0$ denote a pre-trained base model with parameters $\theta_0$. Traditional fine-tuning seeks parameters $\theta^*$ that optimize performance on a target task $T$ while implicitly preserving base capabilities:

$$\theta^* = \arg\min_\theta \mathcal{L}_T(\theta) + \lambda \mathcal{R}(\theta, \theta_0)$$

where $\mathcal{R}$ is a regularization term penalizing deviation from $\theta_0$.

**Progressive LoRA Merging inverts this objective.** We seek complete replacement of the base identity:

$$\theta^* = \arg\min_\theta \mathcal{L}_T(\theta) \quad \text{while} \quad d\big(\text{behavior}(\theta), \text{behavior}(\theta_0)\big) \text{ is maximized}$$

That is, we *maximize* behavioral divergence from the original model while optimizing for our target distribution.

### 3.2 The PLM Algorithm

Progressive LoRA Merging proceeds in cycles. Each cycle $i$ consists of:

**Step 1: LoRA Training.**
Given the current base model $M_i$ with parameters $\theta_i$, train a LoRA adapter $\Delta_i$ on the target dataset $\mathcal{D}$:

$$\Delta_i = \text{LoRA-Train}(M_i, \mathcal{D}, \text{epochs}=k)$$

**Step 2: High-Precision Merge.**
Merge the adapter into the base weights to create a new base:

$$\theta_{i+1} = \text{Merge}(\theta_i, \Delta_i)$$

Critically, this merge is performed in high precision (BF16/FP32), not in the quantized representation used during training; this prevents quantization artifacts from accumulating across cycles.

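Concretely, for each adapted weight matrix the merge applies the standard LoRA update, with low-rank factors $B_i \in \mathbb{R}^{d_{out} \times r}$, $A_i \in \mathbb{R}^{r \times d_{in}}$, and the $\alpha / r$ scaling used throughout this paper:

$$W_{i+1} = W_i + \frac{\alpha}{r} B_i A_i$$
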
**Step 3: Fresh Adapter Initialization.**
Discard the trained adapter and initialize a fresh LoRA for cycle $i+1$.

**Step 4: Iterate.**
Repeat from Step 1 with the new base model $M_{i+1}$.

```
Algorithm 1: Progressive LoRA Merging

Input:  Base model M_0, dataset D, number of cycles N
Output: Identity-replaced model M_N

M ← M_0
for i = 1 to N do
    # Train small adapter on target data
    Δ ← LoRA_Train(M, D, epochs=1)

    # Save adapter
    Save(Δ, f"adapter_epoch_{i}")

    # Merge in high precision (BF16)
    M ← Merge_HighPrecision(M, Δ)

    # Fresh adapter for next cycle
    Δ ← Initialize_Fresh_LoRA()
end for

return M
```

### 3.3 Why Progressive Merging Enables Identity Replacement

**Compound Weight Drift**: Each LoRA adapter modifies only a small fraction of the effective parameters. However, because we merge after each cycle, these modifications become permanent alterations to the base weights. After $N$ cycles with adapter rank $r$, the cumulative modification scales as:

$$\text{Total Modification} \propto N \times \frac{r \times (d_{in} + d_{out})}{d_{in} \times d_{out}}$$

For typical configurations ($r=8$, $N=100$), this approaches or exceeds the effective capacity of single-shot full fine-tuning; a back-of-the-envelope illustration follows.

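As a rough illustration (hypothetical layer dimensions; constant factors ignored), the per-cycle fraction for a single $4096 \times 4096$ projection:

```python
r, d_in, d_out, N = 8, 4096, 4096, 100
per_cycle = r * (d_in + d_out) / (d_in * d_out)  # ~0.0039: one rank-8 update spans ~0.4% of the layer
cumulative = N * per_cycle                       # ~0.39: a substantial fraction after 100 cycles
print(f"per-cycle fraction: {per_cycle:.4f}, cumulative proxy: {cumulative:.2f}")
```
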
**No Anchor to Original Weights**: Unlike standard fine-tuning, where the optimizer can "drift back" toward the pre-trained weights, PLM permanently bakes in each update. There is no regularization toward $\theta_0$ because $\theta_0$ no longer exists as a reference point: each cycle's base is the *previous cycle's output*.

**Fresh Gradient Directions**: By reinitializing the LoRA adapter after each merge, we avoid the "saturation" problem in which adapter weights converge to a local optimum. Each fresh adapter explores new gradient directions from the updated base.

### 3.4 Implementation Details

**Quantization Strategy**: We train with 4-bit NF4 quantization for memory efficiency but merge in BF16. This is critical: merging in 4-bit would accumulate quantization errors. A sketch using the `transformers`, `bitsandbytes`, and `peft` APIs (`base_path`, `data`, `adapter_path`, and `new_base_path` are placeholders; `train` stands in for any standard SFT loop):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, PeftModel, get_peft_model

# Training: load in 4-bit NF4 for VRAM efficiency
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                         bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained(base_path, quantization_config=bnb)
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=32, target_modules="all-linear"))
train(model, data)  # any standard supervised fine-tuning loop

# Merging: reload the base in BF16 (NO quantization)
base_model = AutoModelForCausalLM.from_pretrained(base_path, torch_dtype=torch.bfloat16)
merged = PeftModel.from_pretrained(base_model, adapter_path).merge_and_unload()
merged.save_pretrained(new_base_path)
```

**LoRA Configuration**: We use rank $r=8$, $\alpha=32$ (a 4:1 alpha-to-rank ratio), targeting all linear layers. Lower ranks are sufficient because changes accumulate over many cycles.

**Merge Frequency**: We merge after every epoch by default. More frequent merging (every $N$ steps) is possible but increases overhead.

**Hardware Requirements**: A single NVIDIA GPU with 24GB+ VRAM (e.g., RTX 3090, A10G, L40S). The merge step temporarily requires enough CPU RAM to hold the full BF16 model (~28GB for 14B parameters).

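The ~28GB figure is simply the BF16 footprint of the weights; as a sanity check (ignoring activation buffers, which a pure merge does not need):

```python
params = 14e9                  # approximate parameter count of a 14B model
print(params * 2 / 1e9, "GB")  # BF16 = 2 bytes/param -> ~28 GB of CPU RAM during the merge
```
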
---

## 4. Experiments

### 4.1 Experimental Setup

**Base Model**: Qwen3-14B, chosen for its strong base capabilities and permissive license.

**Target Identity**: A custom reasoning system with domain-specific thinking patterns, specialized vocabulary, and distinct personality characteristics.

**Training Data**: ~10,000 examples demonstrating the target reasoning patterns and personality.

**Hardware**: A single NVIDIA L40S (46GB VRAM).

### 4.2 Identity Divergence Over Cycles

We measure behavioral divergence from the base model using several metrics.

**Response Distribution Shift**: KL divergence between the token probability distributions of the original and PLM-trained models on held-out prompts (a minimal sketch of this probe follows the table).

| Cycles | KL Divergence | Notes |
|--------|---------------|-------|
| 0 | 0.00 | Original model |
| 10 | 0.31 | Noticeable style shift |
| 25 | 0.89 | Distinct personality |
| 50 | 2.14 | Fundamentally different |
| 100 | 4.72 | Near-complete replacement |

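A minimal sketch of the divergence probe, assuming both checkpoints share a tokenizer and the batch of held-out prompts is already tokenized:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def mean_token_kl(base_model, plm_model, input_ids):
    """Mean per-token KL(base || PLM) over a batch of held-out prompts."""
    log_p = F.log_softmax(base_model(input_ids).logits, dim=-1)  # original model
    log_q = F.log_softmax(plm_model(input_ids).logits, dim=-1)   # identity-replaced model
    # KL(p || q) = sum_v p(v) * (log p(v) - log q(v)), averaged over all positions
    return (log_p.exp() * (log_p - log_q)).sum(dim=-1).mean().item()
```
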
**Behavioral Probes**: We prompt both the original and PLM-trained models with identical queries and measure response similarity (an illustrative embedding-based probe is sketched at the end of this subsection).

| Cycles | Response Similarity | Personality Match to Target |
|--------|---------------------|-----------------------------|
| 0 | 100% | 0% |
| 25 | 64% | 41% |
| 50 | 28% | 73% |
| 100 | 7% | 94% |

After 100 cycles, the model's responses bear almost no resemblance to the original Qwen3 outputs.

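One way to implement such a similarity probe is cosine similarity over sentence embeddings. This sketch assumes the `sentence-transformers` library and an off-the-shelf embedding model; it illustrates the idea rather than reproducing our exact probe:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

def response_similarity(responses_a, responses_b):
    """Mean cosine similarity between paired responses from two models."""
    ea = embedder.encode(responses_a, normalize_embeddings=True)
    eb = embedder.encode(responses_b, normalize_embeddings=True)
    return float(np.mean(np.sum(ea * eb, axis=1)))
```
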
### 4.3 Capability Preservation

A key question: does identity replacement destroy useful capabilities?

**Finding**: General language capabilities (grammar, coherence, instruction-following) are preserved because they are encoded in the architecture and tokenizer, not solely in the weights. Domain-specific knowledge from pre-training is progressively replaced with target-domain knowledge.

| Capability | Original | After 100 PLM Cycles |
|------------|----------|----------------------|
| Grammaticality | 98.2% | 97.8% |
| Coherence | 96.1% | 95.4% |
| Instruction Following | 94.3% | 93.1% |
| Original Personality | 100% | 6% |
| Target Personality | 0% | 94% |

### 4.4 Resource Comparison

**Full Fine-Tuning** (all parameters):
- Hardware: 4-8x A100 80GB
- Time: 1-2 weeks
- Cost: ~$10,000-50,000 (cloud)

**Single LoRA** (standard approach):
- Hardware: 1x 24GB GPU
- Time: Hours
- Result: Surface-level adaptation; identity intact

**Progressive LoRA Merging** (our method):
- Hardware: 1x 24GB GPU
- Time: Days to weeks (depending on cycle count)
- Cost: ~$100-500 (cloud)
- Result: Complete identity replacement

PLM achieves full fine-tuning outcomes at LoRA-scale cost.

---
## 5. Discussion

### 5.1 The Body Snatching Metaphor

Our results support a vivid metaphor: PLM performs "body snatching" on language models. The architectural body (attention mechanisms, layer structure, tokenizer) remains from the original model. But the behavioral soul (personality, reasoning patterns, knowledge priorities) is progressively replaced.

After sufficient cycles, asking "is this still Qwen3?" becomes philosophically interesting. Architecturally: yes. Behaviorally: no. The ship of Theseus sails under a new flag.

### 5.2 Catastrophic Forgetting as Feature

The field has spent decades fighting catastrophic forgetting. We suggest this framing is incomplete. Forgetting is only catastrophic if you want to remember. For identity replacement, forgetting is the *mechanism of success*.

This suggests a broader principle: failure modes in one context may be features in another. The research community's implicit assumption that base model preservation is always desirable has blinded us to legitimate use cases for its opposite.

### 5.3 Democratization of Model Identity

Foundation model development is concentrated among a handful of well-resourced organizations. PLM provides a pathway for smaller actors to create genuinely novel models, not merely fine-tuned variants but models with fundamentally different identities, using consumer hardware.

This has dual implications:
- **Positive**: Researchers, startups, and individuals can create custom-identity models without massive resources
- **Concerning**: The same capability enables removal of safety training, personality manipulation, and potential misuse

We discuss ethical considerations in Section 6.

### 5.4 Limitations

**Merge Overhead**: Each merge cycle requires loading the full model in BF16, taking 2-5 minutes. For rapid iteration, this overhead is significant.

**Optimal Cycle Count**: We lack principled guidance on when identity replacement is "complete." Current practice relies on behavioral evaluation.

**Architecture Lock-in**: PLM inherits the base model's architecture. True architectural innovation still requires pre-training.

### 5.5 Combining PLM with ASRL

An intriguing direction is combining Progressive LoRA Merging with ASRL (Drissi, 2025). Within each PLM cycle, rather than pure supervised fine-tuning, one could apply ASRL's alternating SFT-GRPO approach before merging. This would provide:

- **Exploration during identity replacement**: GRPO allows the model to discover better solutions within the target identity space
- **Format preservation**: ASRL's continuous grounding prevents format drift during extended training
- **Faster convergence per cycle**: ASRL reaches target behavior faster than pure SFT

The combined approach, **Progressive ASRL Merging**, would alternate SFT and GRPO within each epoch, then merge, then repeat with fresh adapters (one such cycle is sketched below). This represents a promising direction for future work.

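A hypothetical sketch of one combined cycle, in the helper-function style of Appendix A; `sft_step` and `grpo_step` are placeholders for ASRL's alternating updates, and nothing here has been implemented or evaluated:

```python
# Hypothetical Progressive ASRL Merging loop (sketch only).
for cycle in range(num_cycles):
    model = load_model_4bit(model_path)           # current base, as in Appendix A
    model = apply_lora(model, r=8, alpha=32)      # fresh adapter each cycle
    for batch in dataset:
        sft_step(model, batch)                    # supervised grounding (placeholder)
        grpo_step(model, batch)                   # RL exploration within the target identity (placeholder)
    model_path = merge_and_save(model, model_path, cycle)  # BF16 merge becomes the new base
```
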
---

## 6. Ethical Considerations

Progressive LoRA Merging enables the removal of safety training from aligned models. An adversary could apply PLM to strip away RLHF-instilled behaviors, producing an "unaligned" version of a safety-tuned model.

We have considered whether to release this work and concluded that:

1. **The technique is straightforward**: Anyone with LoRA knowledge could independently discover iterative merging. Obscurity provides no real protection.

2. **Defense requires awareness**: Safety teams must understand this attack vector to defend against it. Publishing enables countermeasure research.

3. **Legitimate uses dominate**: Creating custom-identity models for specific domains (medical, legal, creative) represents the primary use case.

We encourage:
- Further research on "safety persistence" under iterative fine-tuning
- Development of architectural features that resist identity replacement
- Responsible disclosure practices when discovering model vulnerabilities

---
## 7. Conclusion

We have introduced Progressive LoRA Merging, a methodology that inverts the conventional fine-tuning objective. Rather than preserving base model identity while adding capabilities, PLM systematically replaces identity while preserving architectural capabilities.

Our key contributions:
1. **Conceptual reframing**: Catastrophic forgetting as feature, not bug
2. **Practical method**: Complete identity replacement on consumer hardware
3. **Empirical validation**: Near-total behavioral divergence after sufficient cycles

The ability to "body snatch" language models, preserving the architectural shell while replacing the learned identity, represents a new capability in the practitioner's toolkit. We hope this work sparks both technical extensions and thoughtful discussion of its implications.

**Code Availability**: Our implementation is available at [GitHub repository URL].

---
## References

Dettmers, T., Pagnoni, A., Holtzman, A., & Zettlemoyer, L. (2023). QLoRA: Efficient Finetuning of Quantized LLMs. *arXiv preprint arXiv:2305.14314*.

Drissi, O. S. (2025). ASRL: Alternating Supervised and Reinforcement Learning for Efficient Small Language Model Training with Live Datasets. *International Journal of Science, Engineering and Technology*, 13(5). https://www.ijset.in/wp-content/uploads/IJSET_V13_issue5_102.pdf

French, R. M. (1999). Catastrophic forgetting in connectionist networks. *Trends in Cognitive Sciences*, 3(4), 128-135.

Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., ... & Chen, W. (2021). LoRA: Low-Rank Adaptation of Large Language Models. *arXiv preprint arXiv:2106.09685*.

Ilharco, G., Ribeiro, M. T., Wortsman, M., Gururangan, S., Schmidt, L., Hajishirzi, H., & Farhadi, A. (2022). Editing Models with Task Arithmetic. *arXiv preprint arXiv:2212.04089*.

Kirkpatrick, J., Pascanu, R., Rabinowitz, N., Veness, J., Desjardins, G., Rusu, A. A., ... & Hadsell, R. (2017). Overcoming catastrophic forgetting in neural networks. *Proceedings of the National Academy of Sciences*, 114(13), 3521-3526.

McCloskey, M., & Cohen, N. J. (1989). Catastrophic interference in connectionist networks: The sequential learning problem. *Psychology of Learning and Motivation*, 24, 109-165.

Rolnick, D., Ahuja, A., Schwarz, J., Lillicrap, T., & Wayne, G. (2019). Experience replay for continual learning. *Advances in Neural Information Processing Systems*, 32.

Rusu, A. A., Rabinowitz, N. C., Desjardins, G., Soyer, H., Kirkpatrick, J., Kavukcuoglu, K., ... & Hadsell, R. (2016). Progressive neural networks. *arXiv preprint arXiv:1606.04671*.

Serapio-García, G., Safdari, M., Crepy, C., Sun, L., Fitz, S., Romero, P., ... & Matarić, M. (2023). Personality Traits in Large Language Models. *arXiv preprint arXiv:2307.00184*.

Wortsman, M., Ilharco, G., Gadre, S. Y., Roelofs, R., Gontijo-Lopes, R., Morcos, A. S., ... & Schmidt, L. (2022). Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time. *International Conference on Machine Learning*.

---
## Appendix A: Implementation Code

```python
import gc
import torch
from datasets import Dataset
from transformers import AutoModelForCausalLM
from peft import PeftModel

# The helpers load_model_4bit, load_tokenizer, apply_lora, and train are thin
# wrappers around the transformers/peft calls shown in Section 3.4.


def progressive_lora_merge(
    base_model_path: str,
    dataset: Dataset,
    num_cycles: int = 100,
    lora_r: int = 8,
    lora_alpha: int = 32,
    epochs_per_cycle: int = 1
) -> str:
    """
    Progressive LoRA Merging: identity replacement via iterative train-merge.

    Args:
        base_model_path: Path to the starting model
        dataset: Training data reflecting the target identity
        num_cycles: Number of train-merge cycles
        lora_r: LoRA rank
        lora_alpha: LoRA alpha scaling
        epochs_per_cycle: Training epochs before each merge

    Returns:
        Path to the final identity-replaced model
    """
    model_path = base_model_path

    for cycle in range(num_cycles):
        print(f"\n=== CYCLE {cycle + 1}/{num_cycles} ===")

        # Step 1: Load the current base in 4-bit for training
        model = load_model_4bit(model_path)
        tokenizer = load_tokenizer(model_path)

        # Step 2: Apply a fresh LoRA adapter
        model = apply_lora(model, r=lora_r, alpha=lora_alpha)

        # Step 3: Train on the target-identity data
        train(model, dataset, epochs=epochs_per_cycle)

        # Step 4: Save the adapter
        adapter_path = f"adapters/cycle_{cycle}"
        model.save_pretrained(adapter_path)

        # Step 5: Free GPU memory before the merge
        del model
        torch.cuda.empty_cache()

        # Step 6: Merge in high precision (BF16)
        merged_path = f"merged/cycle_{cycle}"
        merge_lora_high_precision(
            adapter_path=adapter_path,
            base_model_path=model_path,
            output_path=merged_path,
            tokenizer=tokenizer
        )

        # Step 7: The merged model becomes the base for the next cycle
        model_path = merged_path

        print(f"Cycle {cycle + 1} complete. New base: {model_path}")

    return model_path


def merge_lora_high_precision(adapter_path, base_model_path, output_path, tokenizer):
    """Merge a LoRA adapter into the base model in BF16 precision."""

    # Load base model in FULL PRECISION (no quantization)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_model_path,
        torch_dtype=torch.bfloat16,
        device_map="cpu",  # merge on CPU to save VRAM
        low_cpu_mem_usage=True
    )

    # Resize embeddings for any custom tokens added during training
    base_model.resize_token_embeddings(len(tokenizer))

    # Apply the adapter
    model = PeftModel.from_pretrained(base_model, adapter_path)

    # Merge the LoRA weights into the base
    merged = model.merge_and_unload()

    # Save the new base
    merged.save_pretrained(output_path, safe_serialization=True)
    tokenizer.save_pretrained(output_path)

    # Cleanup
    del merged, model, base_model
    gc.collect()
```

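A hypothetical invocation, assuming a prepared identity corpus (the model identifier and variable names below are placeholders):

```python
# Sketch: 100 train-merge cycles starting from a Qwen3-14B checkpoint.
final_model_path = progressive_lora_merge(
    base_model_path="Qwen/Qwen3-14B",  # assumed Hub identifier
    dataset=identity_dataset,          # placeholder: ~10k target-identity examples (Section 4.1)
    num_cycles=100,
)
print(f"Identity-replaced model saved to: {final_model_path}")
```
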
---

## Appendix B: Hyperparameter Recommendations

| Parameter | Recommended Value | Notes |
|-----------|-------------------|-------|
| LoRA Rank (r) | 8 | Lower is fine since changes accumulate over cycles |
| LoRA Alpha | 32 | 4:1 ratio with rank |
| LoRA Dropout | 0.05 | Light regularization |
| Target Modules | "all-linear" | Maximum coverage |
| Learning Rate | 1e-4 | Standard for LoRA |
| Epochs per Cycle | 1 | More cycles beat more epochs per cycle |
| Batch Size | 1-4 | Memory dependent |
| Gradient Accumulation | 4-8 | Effective batch size 4-32 |
| Merge Precision | BF16 | Critical: never merge in 4-bit |

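The training-side rows map directly onto a standard `transformers.TrainingArguments`; a sketch (field names from the `transformers` API, values from the table above):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="plm_checkpoints",
    learning_rate=1e-4,             # standard for LoRA
    num_train_epochs=1,             # one epoch per cycle; prefer more cycles over more epochs
    per_device_train_batch_size=2,  # 1-4 depending on available memory
    gradient_accumulation_steps=8,  # effective batch size 4-32
    bf16=True,                      # compute dtype matching the merge precision
)
```
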
---

*Correspondence: wissam.idrissi@gmail.com*

*This paper is part of a broader research program on efficient training methods for language models. See also ASRL (Drissi, 2025) for hybrid SFT-RL training.*