---
library_name: peft
base_model: meta-llama/Llama-2-7b-hf
tags:
- lora
- peft
- causal-lm
- adapter
license: apache-2.0
---
# Adapter Checkpoint: LoRA on Llama-2-7b
This repository contains a **LoRA adapter checkpoint** fine-tuned on top of
[`meta-llama/Llama-2-7b-hf`](https://huggingface.co/meta-llama/Llama-2-7b-hf)
using [PEFT](https://github.com/huggingface/peft).
---
## Repository layout
```
.
├── adapter_config.json     # PEFT / LoRA hyper-parameters
├── adapter_model.bin       # Trained adapter weights
├── README.md               # This file
└── examples/
    └── chat/
        ├── zero_shot/
        │   └── prompt.json # Zero-shot chat prompt template
        └── few_shot/
            └── prompt.json # Few-shot chat prompt template
```
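The files under `examples/` are part of the repository, so a script running outside a local clone has to fetch them first. A minimal sketch of one way to do that with `hf_hub_download` from `huggingface_hub`; the helper names `template_path` and `fetch_template` are hypothetical and not shipped with this repository:

```python
import json


def template_path(strategy: str) -> str:
    """Return the in-repo path of a prompt template ("zero_shot" or "few_shot")."""
    if strategy not in ("zero_shot", "few_shot"):
        raise ValueError(f"unknown strategy: {strategy!r}")
    return f"examples/chat/{strategy}/prompt.json"


def fetch_template(strategy: str, repo_id: str = "dongbobo/adapter-checkpoint") -> dict:
    """Download one prompt template from the Hub and parse it (needs network access)."""
    # Imported here so template_path stays usable without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    local_file = hf_hub_download(repo_id=repo_id, filename=template_path(strategy))
    with open(local_file, encoding="utf-8") as fh:
        return json.load(fh)
```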
---
## Prompt templates
Two ready-to-use prompt templates are included for chat inference:
| Strategy | Path | Description |
|---|---|---|
| Zero-shot | [`examples/chat/zero_shot/prompt.json`](examples/chat/zero_shot/prompt.json) | Single-turn, no demonstrations; the model relies on its instruction-following capability. |
| Few-shot | [`examples/chat/few_shot/prompt.json`](examples/chat/few_shot/prompt.json) | Prepends three (user, assistant) demonstration turns before the live query. |
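
To make the difference between the two strategies concrete, here is a rough sketch of how demonstration turns could be laid out in the Llama 2 chat format. The function `build_llama2_prompt` and the sample demonstrations are hypothetical; the actual contents and schema of the `prompt.json` files are not reproduced here. The leading BOS token is left out on the assumption that the tokenizer adds it. With an empty demonstration list the function yields the zero-shot prompt; with three pairs it yields the few-shot one.

```python
from typing import Sequence, Tuple


def build_llama2_prompt(
    system: str,
    demos: Sequence[Tuple[str, str]],
    query: str,
) -> str:
    """Assemble a Llama 2 chat prompt from (user, assistant) demonstrations."""
    # The system block is folded into the first user turn.
    sys_block = f"<<SYS>>\n{system}\n<</SYS>>\n\n"
    user_turns = [user for user, _ in demos] + [query]
    user_turns[0] = sys_block + user_turns[0]

    parts = []
    for i, user in enumerate(user_turns):
        parts.append(f"[INST] {user} [/INST]")
        if i < len(demos):
            # A completed turn ends with EOS; the next turn starts with BOS.
            parts.append(f" {demos[i][1]} </s><s>")
    return "".join(parts)


# Hypothetical demonstrations, for illustration only.
demos = [
    ("What is 2 + 2?", "4"),
    ("Name a primary colour.", "Red"),
    ("What is the capital of France?", "Paris"),
]
system = "You are a helpful assistant."
few_shot = build_llama2_prompt(system, demos, "What is LoRA?")
zero_shot = build_llama2_prompt(system, [], "What is LoRA?")
print(few_shot)
```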
---
## Quick start
```python
import json
import pathlib

from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the adapter config, then the base model it points to
config = PeftConfig.from_pretrained("dongbobo/adapter-checkpoint")
base = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(base, "dongbobo/adapter-checkpoint")
tok = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Load a prompt template (path is relative to a local clone of this repository)
template = json.loads(
    pathlib.Path("examples/chat/zero_shot/prompt.json").read_text()
)

# Build the prompt. The tokenizer prepends the BOS token (<s>) itself,
# so writing it into the string would duplicate it.
user_msg = "Explain the concept of attention in transformers."
prompt = (
    f"[INST] <<SYS>>\n{template['template']['system']}\n<</SYS>>\n\n"
    f"{user_msg} [/INST]"
)

# Keep the inputs on the same device as the model before generating
inputs = tok(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tok.decode(outputs[0], skip_special_tokens=True))
```
---
## Adapter hyper-parameters
| Parameter | Value |
|---|---|
| PEFT type | LORA |
| Task type | CAUSAL\_LM |
| Rank (`r`) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Target modules | `q_proj`, `v_proj` |
| Bias | none |
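
As a back-of-the-envelope check, the values above imply the adapter's trainable parameter count and the scaling applied to the low-rank update. The sketch below assumes the standard Llama-2-7b shape (32 decoder layers, hidden size 4096, so `q_proj` and `v_proj` are both 4096 x 4096) and PEFT's default `lora_alpha / r` scaling. It is plain arithmetic and does not read the checkpoint.

```python
# Values from the hyper-parameter table.
r = 16
lora_alpha = 32
target_modules = ["q_proj", "v_proj"]

# Assumed Llama-2-7b architecture: 32 decoder layers, hidden size 4096.
num_layers = 32
hidden_size = 4096

# LoRA adds A (r x in_features) and B (out_features x r) to each target module.
params_per_module = r * (hidden_size + hidden_size)
trainable_params = params_per_module * len(target_modules) * num_layers

# The low-rank update B @ A is multiplied by lora_alpha / r before being added.
scaling = lora_alpha / r

print(f"trainable parameters: {trainable_params:,}")  # 8,388,608
print(f"scaling factor: {scaling}")  # 2.0
```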
---
## License
Released under the **Apache 2.0** license.
The base model (`meta-llama/Llama-2-7b-hf`) is subject to its own
[Llama 2 Community License](https://huggingface.co/meta-llama/Llama-2-7b-hf/blob/main/LICENSE.txt).