| license: cc-by-nc-4.0 | |
| base_model: Qwen/Qwen2.5-7B-Instruct | |
| pipeline_tag: text-generation | |
| library_name: peft | |
| language: | |
| - en | |
| - zh | |
| tags: | |
| - hypernetwork | |
| - hyper-lora | |
| - lora | |
| - role-play | |
| - character-impersonation | |
| - sft | |
| - phase-tree | |
| datasets: | |
| - IAAR-Shanghai/phase_tree_data | |
| # PHASE-Tree Hyper-LoRA SFT (anchor run) | |
| **Variant:** Warm-start, lr=5e-6 (anchor run) | |
| The **anchor** SFT run: hypernet warm-started from the PHASE-Tree pretrained | |
| checkpoint and fine-tuned at a conservative learning rate of 5e-6 with label | |
| smoothing 0.1 and NEFTune noise 5.0. This is the checkpoint reported in the | |
| PHASE-Tree paper. | |
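The two regularisers named above can be sketched in a few lines. This is an illustration of the commonly used formulations (label smoothing as a mix of the one-hot target with a uniform distribution, and NEFTune as uniform noise scaled by `alpha / sqrt(seq_len * hidden_dim)`), not the PHASE-Tree training code; the function names and toy sizes are hypothetical.

```python
import math
import random


def smoothed_targets(num_classes, true_index, epsilon=0.1):
    """Mix the one-hot target with a uniform distribution over all classes."""
    uniform_mass = epsilon / num_classes
    targets = [uniform_mass] * num_classes
    targets[true_index] += 1.0 - epsilon
    return targets


def neftune_scale(alpha, seq_len, hidden_dim):
    """Noise magnitude used by NEFTune: alpha / sqrt(seq_len * hidden_dim)."""
    return alpha / math.sqrt(seq_len * hidden_dim)


def add_neftune_noise(embeddings, alpha=5.0, rng=random):
    """Add scaled Uniform(-1, 1) noise to a seq_len x hidden_dim embedding matrix."""
    seq_len, hidden_dim = len(embeddings), len(embeddings[0])
    scale = neftune_scale(alpha, seq_len, hidden_dim)
    return [[x + rng.uniform(-1.0, 1.0) * scale for x in row] for row in embeddings]


# Toy example: 4-way classification and a 100 x 64 embedding matrix (hypothetical sizes).
targets = smoothed_targets(num_classes=4, true_index=2, epsilon=0.1)
noisy = add_neftune_noise([[0.0] * 64 for _ in range(100)], alpha=5.0, rng=random.Random(0))
```

With these toy sizes every noise entry is bounded by `5.0 / sqrt(100 * 64) = 0.0625`, which shows how the perturbation shrinks as sequences get longer or embeddings get wider.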
| During development, six hyper-LoRA SFT cells were trained as an ablation grid | |
| over initialisation (warm-start vs cold-start), learning rate (5e-6 vs 1e-5), | |
| and trainable vs frozen hypernet output heads. Only this anchor cell is | |
| bundled here; the other five are kept locally for reproducibility. | |
| ## What is a hypermod? | |
| A **hypermod** (hyper-modulator) is a hypernetwork that, conditioned on a | |
| character profile embedding, emits a low-rank LoRA delta `ΔW = AB` for each | |
| target layer of the base model at inference time. The base model weights are | |
| never updated; only the hypernet is trained. A single hypermod therefore | |
| generalises across an open-ended set of personas without needing to store a | |
| separate adapter per character. | |
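The idea can be illustrated with a toy NumPy sketch. It is not the PHASE-Tree architecture: the two linear "heads", the toy dimensions, and the `alpha / rank` scaling (the usual LoRA convention) are assumptions made for illustration only. What it does show is the mechanism described above: one set of hypernet weights maps different profile embeddings to different low-rank deltas while the base weight stays untouched.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes chosen for readability; all are hypothetical.
D_MODEL, RANK, EMB_DIM = 16, 8, 32
LORA_ALPHA = 16.0

# Frozen base weight of one target projection (for example q_proj).
W_base = rng.normal(size=(D_MODEL, D_MODEL))
W_before = W_base.copy()

# Stand-in "hypernet": two linear heads that map a profile embedding
# to the flattened LoRA factors A and B.
head_A = rng.normal(scale=0.02, size=(EMB_DIM, D_MODEL * RANK))
head_B = rng.normal(scale=0.02, size=(EMB_DIM, RANK * D_MODEL))


def generate_delta(profile_embedding):
    """Return the low-rank update (alpha / r) * A @ B for one profile."""
    A = (profile_embedding @ head_A).reshape(D_MODEL, RANK)
    B = (profile_embedding @ head_B).reshape(RANK, D_MODEL)
    return (LORA_ALPHA / RANK) * (A @ B)


# Two different personas share the same hypernet but get different deltas.
profile_x = rng.normal(size=EMB_DIM)
profile_y = rng.normal(size=EMB_DIM)
delta_x = generate_delta(profile_x)
delta_y = generate_delta(profile_y)

# The effective weight is formed on the fly; W_base itself is never modified.
W_for_x = W_base + delta_x
```

Only `head_A` and `head_B` would receive gradients in such a setup, which is why nothing has to be stored per character beyond the profile text itself.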
| ## Files | |
| | File | Purpose | | |
| |------|---------| | |
| | `hypermod.pt` | **Recommended checkpoint.** The anchor SFT step selected from per-step LLM-as-judge ratings (`character`, `semantic`) and Qwen3-Embedding-4B response-vs-reference cosine similarity. | | |
| | `args.yaml` | Full training configuration; consumed by the loader to instantiate the hypernet architecture. | | |
| | `adapter_config.json` | LoRA target-module stub (rank 8, alpha 16, `q_proj` + `v_proj`). | | |
| | `timing_stats.json` | Wall-clock breakdown of the training run (training / validation / other overhead, in seconds). | | |
| > Per-step snapshots (`checkpoints/it_5000` … `it_40000`) and the post-hoc | |
| > evaluation artefacts (`eval_ckpt_judge_scores/`, `eval_ckpt_val_loss/`) | |
| > generated during training are **not bundled** with this release. They can | |
| > be regenerated by re-running `src/scripts/train_phase_tree_qwen_7b.sh` | |
| > followed by the evaluation scripts under `src/scripts/`. | |
| ## How to load | |
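| ```python | |
| from huggingface_hub import snapshot_download | |
| # load_hypermod_checkpoint ships with the PHASE-Tree codebase, which must be importable. | |
| from hyper_llm_modulator.hyper_modulator import load_hypermod_checkpoint | |
| # Download the checkpoint files; the repo ID below is a placeholder. | |
| ckpt_dir = snapshot_download("<your-hf-username>/PHASE-Tree-hyper-lora-anchor") | |
| # Unpack the training args, the hypernet, the base model, and the profile-embedding components. | |
| ( | |
|     args, hypermod, base_model, tokenizer, | |
|     emb_model, emb_tokenizer, task_desc_format_fn, pooling_fn, | |
| ) = load_hypermod_checkpoint(f"{ckpt_dir}/hypermod.pt", device="cuda") | |
| ``` | |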
| The loader reads `args.yaml` and `adapter_config.json` from the same directory | |
| as `hypermod.pt` automatically. The full inference pipeline (profile → | |
| embedding → per-layer LoRA → generation) lives in the PHASE-Tree codebase. | |
| ## Training configuration | |
| | Hyperparameter | Value | | |
| |----------------|-------| | |
| | Base model | `Qwen/Qwen2.5-7B-Instruct` | | |
| | Task encoder | `Qwen/Qwen3-Embedding-4B` | | |
| | Initialisation | Warm-start from `phase_tree_models/phase_tree_pretrained/hypermod.pt` | | |
| | Target modules | `q_proj`, `v_proj` | | |
| | LoRA rank `r` | 8 | | |
| | LoRA alpha | 16 | | |
| | LoRA dropout | 0.05 | | |
| | Hypernet latent size | 1024 | | |
| | Hypernet head input size | 2048 | | |
| | Freeze hypernet heads | `false` | | |
| | Optimizer steps | 40000 | | |
| | Effective batch size | 8 (per-device 4 × grad-accum 2) | |
| | Learning rate | 5e-6 | | |
| | Warmup fraction | 0.05 | | |
| | Weight decay | 0.01 | | |
| | Label smoothing | 0.1 | | |
| | NEFTune noise α | 5.0 | |
| | Checkpoint cadence | every 5000 steps | | |
| | Random seed | 42 | | |
| The complete configuration (including dataset lists, sampler settings, and | |
| fusion-module placeholders kept for loader compatibility) lives in `args.yaml`. | |
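A few quantities implied by the table can be derived with simple arithmetic. The sketch below assumes that the warmup fraction applies to the total number of optimizer steps and that the LoRA update is scaled by `alpha / r` as in standard LoRA; the dictionary keys are hypothetical names, not the keys used in `args.yaml`.

```python
# Values copied from the hyperparameter table; key names are illustrative only.
config = {
    "per_device_batch_size": 4,
    "grad_accum_steps": 2,
    "optimizer_steps": 40000,
    "warmup_fraction": 0.05,
    "lora_rank": 8,
    "lora_alpha": 16,
    "checkpoint_every": 5000,
}

# Sequences consumed per optimizer step: 4 * 2 = 8.
effective_batch = config["per_device_batch_size"] * config["grad_accum_steps"]

# Warmup length, assuming the fraction is taken over all optimizer steps: 2000.
warmup_steps = round(config["optimizer_steps"] * config["warmup_fraction"])

# Standard LoRA scaling factor applied to the low-rank update: 16 / 8 = 2.0.
lora_scale = config["lora_alpha"] / config["lora_rank"]

# Number of per-step snapshots written during the run: 40000 // 5000 = 8.
num_snapshots = config["optimizer_steps"] // config["checkpoint_every"]

# Total training sequences seen over the whole run: 40000 * 8 = 320000.
sequences_seen = config["optimizer_steps"] * effective_batch

print(effective_batch, warmup_steps, lora_scale, num_snapshots, sequences_seen)
```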
| ## Training data | |
| The hypermod is jointly fine-tuned on the *train* splits of the eight | |
| PHASE-Tree character-dialogue datasets (RAIDEN, CharacterEval, HPD, SimsConv, | |
| ChatHaruhi, Friends, StarTrek_TNG, TheOffice), `m6_phase_tree` profile variant. | |
| Sampling follows the hierarchical `sqrt_size` strategy with 6 tasks × 2 points | |
| per batch. | |
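As a rough illustration of what a square-root size weighting does, the sketch below assumes that each dataset is drawn with probability proportional to the square root of its number of examples. That reading of `sqrt_size` is an assumption, the hierarchical part of the sampler is not modelled, and the dataset names and sizes are placeholders rather than the real corpus statistics.

```python
import math
import random


def sqrt_size_weights(dataset_sizes):
    """Sampling probability per dataset, proportional to sqrt(number of examples)."""
    roots = {name: math.sqrt(size) for name, size in dataset_sizes.items()}
    total = sum(roots.values())
    return {name: root / total for name, root in roots.items()}


def sample_datasets(dataset_sizes, k, rng=None):
    """Draw k dataset names (with replacement) using the sqrt-size weights."""
    rng = rng or random.Random()
    weights = sqrt_size_weights(dataset_sizes)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=k)


# Placeholder sizes: dataset_b is 100x larger than dataset_a but only 10x more likely.
sizes = {"dataset_a": 100, "dataset_b": 10_000, "dataset_c": 2_500}
weights = sqrt_size_weights(sizes)
picks = sample_datasets(sizes, k=6, rng=random.Random(0))
```

Compared with sampling proportionally to size, this flattens the distribution so that small corpora are not drowned out by large ones.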
| ## Evaluation | |
| The released `hypermod.pt` was selected from per-step snapshots of the | |
| training run by scoring predictions on a held-out evaluation set along | |
| three axes: | |
| - **`character` (1–5):** profile-consistency rating by an LLM judge (see | |
| `evaluation/persona_rubric.md` in the PHASE-Tree codebase for the rubric). | |
| - **`semantic` (1–5):** contextual-coherence rating by the same judge. | |
| - **`embedding`:** cosine similarity between the predicted and reference response | |
| embeddings, computed with Qwen3-Embedding-4B. | |
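The `embedding` axis reduces to a cosine similarity between two vectors. A minimal sketch follows, assuming the response embeddings have already been computed and are available as plain lists of floats; the function name is hypothetical and the evaluation scripts in the PHASE-Tree codebase may implement this differently.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


# Toy vectors standing in for predicted and reference response embeddings.
predicted = [0.2, 0.1, 0.7]
reference = [0.4, 0.2, 1.4]  # same direction as `predicted`, twice the length
score = cosine_similarity(predicted, reference)
```

Because the measure ignores vector length, a response scores highly when its embedding points in the same direction as the reference, regardless of magnitude.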
| The per-step intermediate snapshots and full evaluation artefacts produced | |
| during model selection are not bundled (see the note above the loading | |
| section); they can be regenerated from a re-training run via the scripts | |
| under `src/scripts/`. | |
| ## Limitations | |
| - Persona conditioning is mediated entirely by the profile embedding that the | |
| task encoder produces; the model has no other persona-control surface. | |
| - Generations may reproduce stylistic biases of the source corpora; intended | |
| for research evaluation only. | |
| - The checkpoint depends on the PHASE-Tree codebase for inference and is not a | |
| drop-in `peft.PeftModel`: `adapter_config.json` describes only which layers | |
| receive a generated LoRA, not directly loadable weights. | |