---
library_name: peft
base_model: meta-llama/Llama-2-7b-hf
tags:
  - lora
  - peft
  - causal-lm
  - adapter
license: apache-2.0
---

# Adapter Checkpoint — LoRA on Llama-2-7b

This repository contains a **LoRA adapter checkpoint** fine-tuned on top of
[`meta-llama/Llama-2-7b-hf`](https://huggingface.co/meta-llama/Llama-2-7b-hf)
using [PEFT](https://github.com/huggingface/peft).

---

## Repository layout

```
.
├── adapter_config.json          # PEFT / LoRA hyper-parameters
├── adapter_model.bin            # Trained adapter weights
├── README.md                    # This file
└── examples/
    └── chat/
        ├── zero_shot/
        │   └── prompt.json      # Zero-shot chat prompt template
        └── few_shot/
            └── prompt.json      # Few-shot chat prompt template
```

---

## Prompt templates

Two ready-to-use prompt templates are included for chat inference:

| Strategy | Path | Description |
|---|---|---|
| Zero-shot | [`examples/chat/zero_shot/prompt.json`](examples/chat/zero_shot/prompt.json) | Single-turn prompt with no demonstrations; the model relies on its instruction-following capability. |
| Few-shot | [`examples/chat/few_shot/prompt.json`](examples/chat/few_shot/prompt.json) | Prepends three (user, assistant) demonstration turns before the live query. |
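
To illustrate how the few-shot strategy differs from the zero-shot one, here is a minimal sketch of assembling demonstration turns into a Llama-2 chat-style prompt. The helper `build_few_shot_prompt` and the demonstration strings are hypothetical, and the sketch takes the demonstrations as plain `(user, assistant)` pairs because the schema of `prompt.json` is not documented here. The leading BOS token is left for the tokenizer to add.

```python
def build_few_shot_prompt(system, demos, query):
    """Assemble a Llama-2 chat-style prompt with demonstration turns.

    `demos` is a list of (user, assistant) string pairs (hypothetical input
    shape; adapt it to the actual fields in few_shot/prompt.json).
    """
    turns = list(demos) + [(query, None)]  # the live query has no answer yet
    parts = []
    for i, (user, assistant) in enumerate(turns):
        # The system block is folded into the first user turn only.
        sys_block = f"<<SYS>>\n{system}\n<</SYS>>\n\n" if i == 0 else ""
        # Later turns open with an explicit <s>; the first one is left to
        # the tokenizer, which normally prepends BOS on its own.
        bos = "<s>" if i > 0 else ""
        parts.append(f"{bos}[INST] {sys_block}{user} [/INST]")
        if assistant is not None:
            parts.append(f" {assistant} </s>")
    return "".join(parts)


# Illustrative demonstrations, not taken from the repository.
demos = [
    ("What is 2 + 2?", "4"),
    ("Name a primary color.", "Red"),
    ("What is the capital of France?", "Paris"),
]
prompt = build_few_shot_prompt("You are a helpful assistant.", demos, "What is LoRA?")
print(prompt)
```

With an empty `demos` list the function reduces to the single-turn zero-shot layout, so one helper could serve both templates.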

---

## Quick start

```python
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
import json
import pathlib

# Load adapter config and base model
config = PeftConfig.from_pretrained("dongbobo/adapter-checkpoint")
base   = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
model  = PeftModel.from_pretrained(base, "dongbobo/adapter-checkpoint")
tok    = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Load a prompt template
template = json.loads(
    pathlib.Path("examples/chat/zero_shot/prompt.json").read_text()
)

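# Build the prompt. The Llama tokenizer prepends the BOS token (<s>) itself,
# so it is not written into the string; doing both would duplicate it.
user_msg = "Explain the concept of attention in transformers."
prompt   = (
    f"[INST] <<SYS>>\n{template['template']['system']}\n<</SYS>>\n\n"
    f"{user_msg} [/INST]"
)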

inputs  = tok(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tok.decode(outputs[0], skip_special_tokens=True))
```

---

## Adapter hyper-parameters

| Parameter | Value |
|---|---|
| PEFT type | LORA |
| Task type | CAUSAL\_LM |
| Rank (`r`) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Target modules | `q_proj`, `v_proj` |
| Bias | none |
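
As a rough sanity check on the values above, the sketch below derives the LoRA scaling factor (`lora_alpha / r`) and an estimate of the number of trainable adapter parameters. The hidden size (4096) and layer count (32) are assumptions about the Llama-2-7b architecture and are not stated in this README; the estimate also assumes `q_proj` and `v_proj` are square `hidden x hidden` projections.

```python
# LoRA settings copied from the table above.
lora_kwargs = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": ["q_proj", "v_proj"],
    "bias": "none",
    "task_type": "CAUSAL_LM",
}

# Assumed Llama-2-7b dimensions (not stated in this README).
HIDDEN_SIZE = 4096
NUM_LAYERS = 32

# LoRA scales the low-rank update B @ A by alpha / r.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]

# Each adapted projection gains A (r x in_features) and B (out_features x r).
params_per_module = lora_kwargs["r"] * (HIDDEN_SIZE + HIDDEN_SIZE)
trainable_params = (
    params_per_module * len(lora_kwargs["target_modules"]) * NUM_LAYERS
)

print(f"scaling = {scaling}")
print(f"trainable LoRA parameters = {trainable_params:,}")
```

The dictionary keys follow the field names of PEFT's `LoraConfig`, so `LoraConfig(**lora_kwargs)` should express the same settings when building a comparable adapter.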

---

## License

Released under the **Apache 2.0** license.  
The base model (`meta-llama/Llama-2-7b-hf`) is subject to its own
[Llama 2 Community License](https://huggingface.co/meta-llama/Llama-2-7b-hf/blob/main/LICENSE.txt).