---
language: en
license: apache-2.0
tags: [doc-to-lora, lora, hypernetwork, context-distillation, needle-in-a-haystack, perceiver]
base_model: Qwen/Qwen3-1.7B
---

# Doc-to-LoRA — NIAH Proof of Concept
A **144 M-parameter Perceiver hypernetwork** trained on [Qwen/Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B).
It reads a document once, emits LoRA weight deltas, and lets the base LLM answer
questions about the document without the document ever appearing in the context window.

> Based on [Doc-to-LoRA (Charakorn et al., 2026)](https://arxiv.org/abs/2602.15902).

![Loss and accuracy curves](curves.png)
## Results

| Metric | Value |
|---|---|
| Base model | `Qwen/Qwen3-1.7B` |
| Perceiver params | 144 M |
| LoRA rank / alpha | 8 / 8.0 |
| Target module | `down_proj` |
| Training steps | 8,000 |
| Final CE loss | 0.2165 |
| Exact-match accuracy (NIAH) | **80.0%** |
| Training context length | 32–256 tokens |
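With rank 8 and alpha 8.0 the LoRA scaling factor is alpha/r = 1.0. A minimal sketch of how such deltas merge into the target `down_proj` weight — the dimensions are illustrative stand-ins, and the random `A`/`B` matrices stand in for the hypernetwork's actual output:

```python
import torch

r, alpha = 8, 8.0               # from the table above; scale = alpha / r = 1.0
hidden, inter = 2048, 6144      # illustrative down_proj dimensions

W = torch.randn(hidden, inter)  # frozen base down_proj weight
A = 0.01 * torch.randn(r, inter)   # stand-in for the hypernetwork's A delta
B = 0.01 * torch.randn(hidden, r)  # stand-in for the hypernetwork's B delta

# Standard LoRA merge: W' = W + (alpha / r) * B @ A
W_merged = W + (alpha / r) * (B @ A)
print(W_merged.shape)  # torch.Size([2048, 6144])
```

Because the low-rank product `B @ A` has the same shape as `W`, the merged weight drops into the base model with no architectural change.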
## Files

| File | Description |
|---|---|
| `hypernet.pt` | Perceiver weights + full config to rebuild the class |
| `inference_example.py` | Self-contained script (download and run) |
| `training_config.json` | Training hyperparameters |
| `curves.png` | Loss and accuracy curves |
## Quick start

```bash
pip install "transformers>=4.51.0" huggingface_hub torch
```
```python
from huggingface_hub import hf_hub_download
import torch

ckpt = torch.load(
    hf_hub_download("farpluto/doc-to-lora-niah", "hypernet.pt"),
    map_location="cuda",
    weights_only=False,  # the checkpoint bundles a config object, not just tensors
)
# See inference_example.py for the complete working script.
```
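`weights_only=False` is needed because the checkpoint pickles a Python config object alongside the tensors, which `torch.load`'s safe mode rejects. A toy round-trip (`HypernetConfig` is a hypothetical stand-in, not the real checkpoint layout) shows the pattern:

```python
import io
import torch

class HypernetConfig:  # hypothetical stand-in for the bundled config class
    def __init__(self, rank=8, alpha=8.0):
        self.rank, self.alpha = rank, alpha

ckpt = {"config": HypernetConfig(), "state_dict": {"w": torch.zeros(2, 2)}}
buf = io.BytesIO()
torch.save(ckpt, buf)

buf.seek(0)
try:
    torch.load(buf, weights_only=True)  # safe mode rejects the pickled custom class
except Exception as e:
    print("weights_only=True failed:", type(e).__name__)

buf.seek(0)
loaded = torch.load(buf, weights_only=False)  # trusts the checkpoint contents
print(loaded["config"].rank)  # 8
```

Only use `weights_only=False` on checkpoints you trust, since unpickling arbitrary classes can execute code.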
## Qwen3 note

Qwen3's chain-of-thought "thinking" mode is suppressed by appending `/no_think` to every query.
Residual `<think>` tokens are stripped from the generated output.
Both techniques are harmless no-ops on non-Qwen3 models.
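The two techniques can be sketched as follows — the helper names are hypothetical, but the `/no_think` marker and `<think>...</think>` tag format follow Qwen3's conventions:

```python
import re

def add_no_think(query: str) -> str:
    # Appending /no_think asks Qwen3 to skip its thinking phase;
    # models without the convention simply see it as extra text.
    return f"{query} /no_think"

def strip_think(text: str) -> str:
    # Remove any residual <think>...</think> block (Qwen3 may still emit
    # an empty one even in no-think mode), then trim surrounding whitespace.
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()

print(add_no_think("Where is the needle?"))
# Where is the needle? /no_think
print(strip_think("<think>\n\n</think>\n\nThe needle is in line 42."))
# The needle is in line 42.
```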