---
license: cc-by-nc-4.0
base_model: Qwen/Qwen2.5-7B-Instruct
pipeline_tag: text-generation
library_name: peft
language:
- en
- zh
tags:
- hypernetwork
- hyper-lora
- lora
- role-play
- character-impersonation
- sft
- phase-tree
datasets:
- IAAR-Shanghai/phase_tree_data
---
# PHASE-Tree Hyper-LoRA SFT (anchor run)
**Variant:** Warm-start, lr=5e-6 (anchor run)
The **anchor** SFT run: hypernet warm-started from the PHASE-Tree pretrained
checkpoint and fine-tuned at a conservative learning rate of 5e-6 with label
smoothing 0.1 and NEFTune noise 5.0. This is the checkpoint reported in the
PHASE-Tree paper.
During development, six hyper-LoRA SFT cells were trained — an ablation grid
over initialisation (warm-start vs cold-start), learning rate (5e-6 vs 1e-5),
and trainable vs frozen hypernet output heads. Only this anchor cell is
bundled here; the other five are kept locally for reproducibility.
## What is a hypermod?
A **hypermod** (hyper-modulator) is a hypernetwork that, conditioned on a
character profile embedding, emits a low-rank LoRA delta `ΔW = AB` for each
target layer of the base model at inference time. The base model weights are
never updated; only the hypernet is trained. A single hypermod therefore
generalises across an open-ended set of personas without needing to store a
separate adapter per character.
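The idea can be illustrated with a toy NumPy sketch. Everything in it is an assumption made for illustration: the class name `ToyHyperMod`, the tiny dimensions, the single `tanh` latent layer, and the single target layer are hypothetical and do not reflect the actual PHASE-Tree architecture. It only shows the mechanism described above: a profile embedding goes in, low-rank factors `A` and `B` come out, and the frozen base weight is left untouched.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (hypothetical; the real model uses far larger dimensions).
EMB_DIM, LATENT, D_IN, D_OUT, RANK = 16, 32, 24, 24, 8


class ToyHyperMod:
    """Maps a profile embedding to low-rank factors for one target layer."""

    def __init__(self):
        # These are the only trainable parameters in this sketch.
        self.w_latent = rng.normal(0, 0.1, (EMB_DIM, LATENT))
        self.head_a = rng.normal(0, 0.1, (LATENT, D_OUT * RANK))
        self.head_b = rng.normal(0, 0.1, (LATENT, RANK * D_IN))

    def lora_factors(self, profile_emb):
        z = np.tanh(profile_emb @ self.w_latent)      # shared latent code
        a = (z @ self.head_a).reshape(D_OUT, RANK)    # A: d_out x r
        b = (z @ self.head_b).reshape(RANK, D_IN)     # B: r x d_in
        return a, b


hypermod = ToyHyperMod()
w_base = rng.normal(0, 0.1, (D_OUT, D_IN))    # frozen base weight
w_before = w_base.copy()

profile_emb = rng.normal(size=EMB_DIM)        # stand-in for an encoded profile
a, b = hypermod.lora_factors(profile_emb)
delta_w = a @ b                               # delta W = AB, rank at most RANK
w_effective = w_base + delta_w                # applied for this persona only
```

A different profile embedding yields a different `delta_w` from the same hypernet parameters, which is why no per-character adapter has to be stored.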
## Files
| File | Purpose |
|------|---------|
| `hypermod.pt` | **Recommended checkpoint.** The anchor-run snapshot selected using per-step LLM-as-judge ratings (`character`, `semantic`) and the cosine similarity between response and reference embeddings from Qwen3-Embedding-4B. |
| `args.yaml` | Full training configuration; consumed by the loader to instantiate the hypernet architecture. |
| `adapter_config.json` | LoRA target-module stub (rank 8, alpha 16, `q_proj` + `v_proj`). |
| `timing_stats.json` | Wall-clock breakdown of the training run (training / validation / other overhead, in seconds). |
> Per-step snapshots (`checkpoints/it_5000` … `it_40000`) and the post-hoc
> evaluation artefacts (`eval_ckpt_judge_scores/`, `eval_ckpt_val_loss/`)
> generated during training are **not bundled** with this release. They can
> be regenerated by re-running `src/scripts/train_phase_tree_qwen_7b.sh`
> followed by the evaluation scripts under `src/scripts/`.
## How to load
```python
from huggingface_hub import snapshot_download
from hyper_llm_modulator.hyper_modulator import load_hypermod_checkpoint

# Download the repository snapshot; replace the placeholder with the actual repo id.
ckpt_dir = snapshot_download("<your-hf-username>/PHASE-Tree-hyper-lora-anchor")

# Unpack the hypernet, the base model, the tokenizers, and the embedding helpers.
(
    args, hypermod, base_model, tokenizer,
    emb_model, emb_tokenizer, task_desc_format_fn, pooling_fn,
) = load_hypermod_checkpoint(f"{ckpt_dir}/hypermod.pt", device="cuda")
```
The loader reads `args.yaml` and `adapter_config.json` from the same directory
as `hypermod.pt` automatically. The full inference pipeline (profile →
embedding → per-layer LoRA → generation) lives in the PHASE-Tree codebase.
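Because the loader expects the three files side by side, it can help to check the snapshot directory before loading. The helper below is a minimal sketch and is not part of the PHASE-Tree codebase; the function name `missing_checkpoint_files` is hypothetical, and the file names are taken from the Files table above.

```python
from pathlib import Path

# File names taken from the "Files" table of this card.
REQUIRED_FILES = ("hypermod.pt", "args.yaml", "adapter_config.json")


def missing_checkpoint_files(ckpt_dir):
    """Return the required file names that are absent from ckpt_dir."""
    root = Path(ckpt_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

Calling `missing_checkpoint_files(ckpt_dir)` on the downloaded snapshot should return an empty list when all three files are present.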
## Training configuration
| Hyperparameter | Value |
|----------------|-------|
| Base model | `Qwen/Qwen2.5-7B-Instruct` |
| Task encoder | `Qwen/Qwen3-Embedding-4B` |
| Initialisation | Warm-start from `phase_tree_models/phase_tree_pretrained/hypermod.pt` |
| Target modules | `q_proj`, `v_proj` |
| LoRA rank `r` | 8 |
| LoRA alpha | 16 |
| LoRA dropout | 0.05 |
| Hypernet latent size | 1024 |
| Hypernet head input size | 2048 |
| Freeze hypernet heads | `false` |
| Optimizer steps | 40000 |
| Effective batch size | 8 (per-device 4 × grad-accum 2) |
| Learning rate | 5e-6 |
| Warmup fraction | 0.05 |
| Weight decay | 0.01 |
| Label smoothing | 0.1 |
| NEFTune noise α | 5.0 |
| Checkpoint cadence | every 5000 steps |
| Random seed | 42 |
The complete configuration (including dataset lists, sampler settings, and
fusion-module placeholders kept for loader compatibility) lives in `args.yaml`.
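A few derived quantities follow from the table by simple arithmetic. The sketch below works them out under two labelled assumptions: that the warmup fraction is taken over the total optimizer steps, and that the conventional LoRA scaling factor `alpha / r` is used. Neither is confirmed by this card; `args.yaml` and the codebase are the authority.

```python
# Values copied from the training-configuration table.
per_device_batch = 4
grad_accum_steps = 2
optimizer_steps = 40_000
warmup_fraction = 0.05
checkpoint_every = 5_000
lora_rank = 8
lora_alpha = 16

# Matches the "8 (per-device 4 x grad-accum 2)" entry in the table.
effective_batch = per_device_batch * grad_accum_steps

# Assumption: warmup is a fraction of the total optimizer steps.
warmup_steps = round(warmup_fraction * optimizer_steps)

# Snapshot steps implied by the checkpoint cadence (it_5000 ... it_40000).
snapshot_steps = list(range(checkpoint_every, optimizer_steps + 1, checkpoint_every))

# Assumption: the conventional LoRA scaling factor alpha / r is applied.
lora_scale = lora_alpha / lora_rank

# Sequences consumed if every optimizer step uses one effective batch.
sequences_seen = effective_batch * optimizer_steps

print(effective_batch, warmup_steps, len(snapshot_steps), lora_scale, sequences_seen)
```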
## Training data
The hypermod is jointly fine-tuned on the *train* splits of the eight
PHASE-Tree character-dialogue datasets (RAIDEN, CharacterEval, HPD, SimsConv,
ChatHaruhi, Friends, StarTrek_TNG, TheOffice), `m6_phase_tree` profile variant.
Sampling follows the hierarchical `sqrt_size` strategy with 6 tasks × 2 points
per batch.
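One plausible reading of the `sqrt_size` strategy is sketched below: each source is drawn with probability proportional to the square root of its size, which dampens the dominance of large corpora, and a fixed number of points is then drawn per selected task. This is an interpretation, not the codebase's sampler. The dataset sizes are made up for illustration, the function names are hypothetical, and treating each dataset as one task flattens whatever hierarchy the actual implementation uses.

```python
import math
import random

# Hypothetical example counts; the real split sizes are not given in this card.
dataset_sizes = {
    "RAIDEN": 40_000, "CharacterEval": 10_000, "HPD": 2_500, "SimsConv": 22_500,
    "ChatHaruhi": 90_000, "Friends": 6_400, "StarTrek_TNG": 3_600, "TheOffice": 4_900,
}


def sqrt_size_weights(sizes):
    """Sampling probability proportional to the square root of each size."""
    roots = {name: math.sqrt(n) for name, n in sizes.items()}
    total = sum(roots.values())
    return {name: r / total for name, r in roots.items()}


def sample_batch(sizes, n_tasks=6, n_points=2, seed=0):
    """Draw n_tasks sources by sqrt-size weight, then n_points indices from each."""
    rng = random.Random(seed)
    weights = sqrt_size_weights(sizes)
    names = list(weights)
    tasks = rng.choices(names, weights=[weights[n] for n in names], k=n_tasks)
    return [(task, rng.randrange(sizes[task])) for task in tasks for _ in range(n_points)]


weights = sqrt_size_weights(dataset_sizes)
batch = sample_batch(dataset_sizes)  # 6 tasks x 2 points = 12 (source, index) pairs
```

With these made-up sizes, a corpus nine times larger than another is sampled only three times as often.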
## Evaluation
The released `hypermod.pt` was selected from per-step snapshots of the
training run by scoring predictions on a held-out evaluation set along
three axes:
- **`character` (1–5)** — profile-consistency rating by an LLM judge (see
`evaluation/persona_rubric.md` in the PHASE-Tree codebase for the rubric).
- **`semantic` (1–5)** — contextual-coherence rating by the same judge.
- **`embedding`** — cosine similarity of the predicted and reference response
embeddings computed with Qwen3-Embedding-4B.
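
The `embedding` axis reduces to a cosine similarity between two vectors. The sketch below shows that computation in plain Python; the toy three-dimensional vectors stand in for real Qwen3-Embedding-4B outputs, and the function is illustrative rather than the evaluation script used for model selection.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy 3-d vectors standing in for predicted and reference response embeddings.
predicted = [0.2, 0.7, 0.1]
reference = [0.25, 0.6, 0.05]
score = cosine_similarity(predicted, reference)
```

The score ranges from -1 to 1, and values closer to 1 indicate that the predicted response lies closer to the reference in embedding space.
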
The per-step intermediate snapshots and full evaluation artefacts produced
during model selection are not bundled (see the note above the loading
section); they can be regenerated from a re-training run via the scripts
under `src/scripts/`.
## Limitations
- Persona conditioning is mediated entirely by the profile embedding that the
task encoder produces; the model has no other persona-control surface.
- Generations may reproduce stylistic biases of the source corpora; intended
for research evaluation only.
- The checkpoint depends on the PHASE-Tree codebase for inference and is not a
drop-in `peft.PeftModel`: `adapter_config.json` describes only which layers
receive a generated LoRA, not directly loadable weights.