# 🧠 Body Snatching: Progressive LoRA Merging (PLM)
Complete model identity replacement using only LoRA-level resources.
"What if catastrophic forgetting is a feature, not a bug?"
## 🔥 What is this?
Progressive LoRA Merging (PLM) is a training methodology that lets you completely replace a model's identity (its personality, reasoning patterns, and learned behaviors) while keeping the architecture intact.
Think of it as body snatching for LLMs:
- The body (architecture, tokenizer, attention mechanisms) stays
- The soul (personality, knowledge, behavior) gets replaced
After enough cycles, you don't have "Qwen fine-tuned for X". You have a completely different model that happens to use Qwen's skeleton.
## 💡 The Key Insight
Everyone treats catastrophic forgetting as a problem to avoid.
We treat it as the goal.
## 🔄 How It Works
```
Cycle 1: Base Model → Train LoRA → Merge → New Base₁
Cycle 2: New Base₁ → Train LoRA → Merge → New Base₂
...
Cycle N: New Base_N = Completely Different Model
```
Each cycle:
- Train a small LoRA adapter (~0.1% of parameters)
- Merge it permanently into the base weights (in BF16, not 4-bit!)
- Initialize a fresh LoRA for the next cycle
- Repeat until the original identity is gone (a minimal sketch of this loop follows below)
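A minimal sketch of the loop, assuming the standard Hugging Face PEFT API; `plm.py` in this repo may structure things differently, and the caller-supplied `train_one_cycle` stands in for your own training step:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

def progressive_lora_merge(base_model_id: str, num_cycles: int, train_one_cycle):
    # Keep the base in BF16 so every merge happens at full precision,
    # never inside a 4-bit quantized copy.
    model = AutoModelForCausalLM.from_pretrained(
        base_model_id, torch_dtype=torch.bfloat16
    )
    tokenizer = AutoTokenizer.from_pretrained(base_model_id)

    for cycle in range(num_cycles):
        # A fresh adapter every cycle: only ~0.1% of parameters train.
        config = LoraConfig(
            r=16,
            lora_alpha=32,
            target_modules=["q_proj", "v_proj"],
            task_type="CAUSAL_LM",
        )
        peft_model = get_peft_model(model, config)
        train_one_cycle(peft_model, tokenizer, cycle)  # your SFT step goes here
        # Dissolve the adapter into the base weights: the LoRA ceases to exist.
        model = peft_model.merge_and_unload()

    return model
```

Note `merge_and_unload()` rather than saving the adapter: the merge is permanent, and the next cycle's LoRA is initialized from scratch against the updated base.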
## ⚠️ Important: This is NOT LoRA Stacking
After each merge, the LoRA is dissolved into the base weights and ceases to exist. The next cycle trains a fresh LoRA on the new base. There is no compounding like (a+b)² × (a+b)². After 100 cycles you have ONE model with rewritten weights, not a stack of adapters.
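In standard LoRA notation (a sketch; this notation is not taken from the repo), each merge adds its low-rank delta straight into the weight matrix, so N cycles yield one summed weight matrix rather than a chain of adapters:

$$W_{t+1} = W_t + \frac{\alpha}{r} B_t A_t \quad\Longrightarrow\quad W_N = W_0 + \sum_{t=1}^{N} \frac{\alpha}{r} B_t A_t$$

Each $B_t A_t$ is trained against the already-merged $W_t$, which is why the cycles compound behaviorally without stacking structurally.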
## 📊 Dataset Strategy
Each cycle trains on 50% new examples mixed with 50% samples replayed from earlier cycles. This replay ensures forgetting targets the BASE model, not your training data; a sketch of the mix follows.
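A minimal sketch of that 50/50 mix (function and variable names here are hypothetical; the repo's data loader may differ):

```python
import random

def build_cycle_dataset(new_examples, history, n_per_cycle, seed=0):
    """Mix 50% fresh examples with 50% replay from earlier cycles."""
    rng = random.Random(seed)
    half = n_per_cycle // 2
    fresh = rng.sample(new_examples, min(half, len(new_examples)))
    replay = rng.sample(history, min(half, len(history))) if history else []
    cycle_data = fresh + replay
    rng.shuffle(cycle_data)
    history.extend(fresh)  # fresh examples join the replay pool for later cycles
    return cycle_data
```

Replaying your own earlier samples anchors the behaviors you want to keep, so the only thing left to forget is the base model's original identity.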
## 📈 Results
| Cycles | Similarity to Original | Target Identity Match |
|---|---|---|
| 0 | 100% | 0% |
| 25 | 64% | 41% |
| 50 | 28% | 73% |
| 100 | 7% | 94% |
After 100 cycles, the model is 93% your data and only 7% original.
## 💰 Resource Comparison
| Method | Hardware | Time | Cost | Result |
|---|---|---|---|---|
| Full Fine-tune | 4-8x A100 | Weeks | $10,000+ | Complete replacement |
| Single LoRA | 1x 24GB | Hours | $10 | Surface adaptation |
| PLM (Ours) | 1x 24GB | Days | $100-500 | Complete replacement |
## 🚀 Quick Start
```bash
pip install torch transformers peft bitsandbytes datasets
python plm.py --base-model Qwen/Qwen3-1.7B --dataset data.jsonl --cycles 100
```
## 📄 Citation
```bibtex
@article{drissi2024bodysnatching,
  title={Body Snatching: Complete Model Identity Replacement via Progressive LoRA Merging},
  author={Drissi, Ouissam Said},
  year={2024},
  url={https://github.com/antibitcoin/progressive-lora-merging}
}
```
## 🔗 Links
- GitHub: antibitcoin/progressive-lora-merging
- Paper: PAPER.md
- Related Work: ASRL Paper (IJSET 2025)
## 👤 Author
Ouissam Said Drissi
- Email: wissam.idrissi@gmail.com
- Independent Researcher, Morocco
"You're not fine-tuning a model. You're growing a new one inside its skeleton."