# 🧠 Mamba Hypernetwork for LLM Personalization

The Mamba Hypernetwork generates LoRA deltas for an LLM at inference time, enabling instant personalization without retraining.

## Architecture
- **Mamba**: 1024 dim, 16 state, 4 expand (~12M params)
- **LoRA**: rank 16, injected into q_proj + v_proj of the first 8 layers
- **LLM**: Llama-3.2-3B-Instruct
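
As a back-of-envelope check on the sizes above, the hypernetwork must emit rank-16 A/B factors for q_proj and v_proj in each of the first 8 layers. The projection dimensions below are assumptions based on Llama-3.2-3B's published config (hidden size 3072, grouped-query attention), not values from this repo:

```python
# Rough count of LoRA delta parameters the hypernetwork generates.
# Assumed dims: q_proj maps 3072 -> 3072, v_proj maps 3072 -> 1024.
rank = 16
layers = 8

q_delta = rank * (3072 + 3072)   # A: (r, d_in) + B: (d_out, r)
v_delta = rank * (3072 + 1024)
total = layers * (q_delta + v_delta)
print(total)  # 1310720 generated delta parameters
```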
## Training
- **Method**: GRPO (Group Relative Policy Optimization)
- **Reward Model**: `phammminhhieu/persona-reward-model`
- **Dataset**: Personachat TrueCased
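
GRPO needs no learned value function: for each persona a group of responses is sampled, the reward model scores them, and advantages are the scores normalized within the group. A minimal sketch with made-up reward values:

```python
import torch

# Group-relative advantages: normalize reward-model scores within one
# group of responses sampled for the same persona (scores are made up).
rewards = torch.tensor([0.8, 0.2, 0.5, 0.9])   # one group of 4 samples
advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
# Above-mean responses get positive advantage and are reinforced;
# below-mean responses are pushed down.
```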
## Usage

```python
import json

import torch
from mamba_ssm import Mamba
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the Mamba hypernetwork
# (MambaHypernetwork is defined in this repo's modeling code)
config = json.load(open("config.json"))
mamba = MambaHypernetwork(config)
mamba.load_state_dict(torch.load("pytorch_model.bin"))
mamba.eval()

# Load the base LLM
llm = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")

# Generate deltas from a persona
persona = "I love dogs. I am a photographer."
enc = tokenizer(persona, return_tensors="pt")
persona_ids, persona_mask = enc.input_ids, enc.attention_mask
hist = tokenizer("", return_tensors="pt")  # empty dialogue history here
history_ids, history_mask = hist.input_ids, hist.attention_mask
deltas = mamba(persona_ids, persona_mask, history_ids, history_mask)
```
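
The forward pass above only produces the deltas; folding them into the LLM depends on this repo's adapter code. As a hedged sketch of the standard LoRA update for one target projection (the shapes and `alpha` scale are illustrative assumptions, not this repo's API):

```python
import torch

# Standard LoRA update for one projection: W' = W + (alpha / r) * B @ A.
r, d_in, d_out, alpha = 16, 3072, 3072, 32   # illustrative values
W = torch.randn(d_out, d_in)                 # frozen base weight (e.g. q_proj)
A = torch.randn(r, d_in) * 0.01              # factors produced by the hypernetwork
B = torch.randn(d_out, r) * 0.01
W_adapted = W + (alpha / r) * (B @ A)        # same shape as W, rank-16 update
```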