	---
	library_name: peft
	base_model: meta-llama/Llama-2-7b-hf
	tags:
	- lora
	- peft
	- causal-lm
	- adapter
	license: apache-2.0
	---

	# Adapter Checkpoint — LoRA on Llama-2-7b

	This repository contains a LoRA adapter checkpoint fine-tuned on top of
	[`meta-llama/Llama-2-7b-hf`](https://huggingface.co/meta-llama/Llama-2-7b-hf)
	using [PEFT](https://github.com/huggingface/peft).

	---

	## Repository layout

	```
	.
	├── adapter_config.json          # PEFT / LoRA hyper-parameters
	├── adapter_model.bin            # Trained adapter weights
	├── README.md                    # This file
	└── examples/
	    └── chat/
	        ├── zero_shot/
	        │   └── prompt.json      # Zero-shot chat prompt template
	        └── few_shot/
	            └── prompt.json      # Few-shot chat prompt template
	```

	---

	## Prompt templates

	Two ready-to-use prompt templates are included for chat inference:

	| Strategy | Path | Description |
	|---|---|---|
	| Zero-shot | [`examples/chat/zero_shot/prompt.json`](examples/chat/zero_shot/prompt.json) | Single-turn, with no demonstrations; the model relies on its instruction-following capability. |
	| Few-shot | [`examples/chat/few_shot/prompt.json`](examples/chat/few_shot/prompt.json) | Prepends three (user, assistant) demonstration turns before the live query. |
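To illustrate how the few-shot strategy prepends demonstration turns, here is a minimal sketch that assembles a multi-turn prompt in the Llama-2 chat format. The function name `build_few_shot_prompt` and the `(user, assistant)` tuple layout are hypothetical; the actual schema of `few_shot/prompt.json` is not documented here, so adapt the inputs to whatever the file contains. The leading `<s>` is deliberately omitted on the assumption that the tokenizer adds the BOS token itself.

```python
def build_few_shot_prompt(system: str, demos: list[tuple[str, str]], user_msg: str) -> str:
    """Assemble a Llama-2 chat prompt: system block, demonstration turns, live query."""
    # The system prompt is wrapped in <<SYS>> tags inside the first [INST] block.
    sys_block = f"<<SYS>>\n{system}\n<</SYS>>\n\n"
    users = [user for user, _ in demos] + [user_msg]
    users[0] = sys_block + users[0]

    parts = []
    for i, user in enumerate(users):
        parts.append(f"[INST] {user} [/INST]")
        if i < len(demos):
            # Close each demonstration turn with its answer and an EOS/BOS boundary.
            parts.append(f" {demos[i][1]} </s><s>")
    return "".join(parts)


# Hypothetical demonstrations, for illustration only.
demos = [
    ("What is 2 + 2?", "4"),
    ("What is the capital of France?", "Paris"),
    ("Name a primary color.", "Red"),
]
prompt = build_few_shot_prompt(
    "You are a helpful assistant.",
    demos,
    "Explain the concept of attention in transformers.",
)
print(prompt)
```

The returned string ends with the open `[/INST]` of the live query, so generation continues from the assistant's turn.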

	---

	## Quick start

	```python
	import json
	import pathlib

	from peft import PeftConfig, PeftModel
	from transformers import AutoModelForCausalLM, AutoTokenizer

	ADAPTER_ID = "dongbobo/adapter-checkpoint"

	# Read the adapter config to find the base model, then attach the adapter to it
	config = PeftConfig.from_pretrained(ADAPTER_ID)
	base = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
	model = PeftModel.from_pretrained(base, ADAPTER_ID)
	tok = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

	# Load a prompt template (the path is relative to a local clone of this repository)
	template = json.loads(
	    pathlib.Path("examples/chat/zero_shot/prompt.json").read_text(encoding="utf-8")
	)

	# Build the prompt in Llama-2 chat format. No literal "<s>" is written here:
	# the tokenizer adds the BOS token itself, so spelling it out would duplicate it.
	user_msg = "Explain the concept of attention in transformers."
	prompt = (
	    f"[INST] <<SYS>>\n{template['template']['system']}\n<</SYS>>\n\n"
	    f"{user_msg} [/INST]"
	)

	# Tokenize, generate, and decode (the decoded text includes the prompt)
	inputs = tok(prompt, return_tensors="pt")
	outputs = model.generate(**inputs, max_new_tokens=256)
	print(tok.decode(outputs[0], skip_special_tokens=True))
	```

	---

	## Adapter hyper-parameters

	| Parameter | Value |
	|---|---|
	| PEFT type | LORA |
	| Task type | CAUSAL_LM |
	| Rank (`r`) | 16 |
	| LoRA alpha | 32 |
	| LoRA dropout | 0.05 |
	| Target modules | `q_proj`, `v_proj` |
	| Bias | none |
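As a rough sanity check on these settings, the sketch below estimates the adapter's size and scaling factor. It assumes the standard LoRA formulation (update scaled by `lora_alpha / r`) and the published Llama-2-7b dimensions (hidden size 4096, 32 decoder layers, square `q_proj` and `v_proj` matrices). Those dimensions come from the base model's configuration, not from this repository, so treat the result as an estimate rather than a measured value.

```python
# Adapter hyper-parameters from the table above.
RANK = 16
LORA_ALPHA = 32
TARGET_MODULES = ("q_proj", "v_proj")

# Assumed base-model dimensions (Llama-2-7b: hidden size 4096, 32 decoder layers).
HIDDEN_SIZE = 4096
NUM_LAYERS = 32


def lora_param_count(rank: int, in_features: int, out_features: int) -> int:
    """Parameters in one LoRA pair: A is (rank x in), B is (out x rank)."""
    return rank * in_features + out_features * rank


per_module = lora_param_count(RANK, HIDDEN_SIZE, HIDDEN_SIZE)
total_params = per_module * len(TARGET_MODULES) * NUM_LAYERS
scaling = LORA_ALPHA / RANK

print(f"LoRA scaling factor: {scaling}")                    # 2.0
print(f"Trainable adapter parameters: {total_params:,}")    # 8,388,608
```

Under these assumptions the adapter holds about 8.4 million parameters, a small fraction of the roughly 7 billion in the base model.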

	---

	## License

	The adapter weights and accompanying files in this repository are released under the Apache 2.0 license.
	The base model (`meta-llama/Llama-2-7b-hf`) is subject to its own
	[Llama 2 Community License](https://huggingface.co/meta-llama/Llama-2-7b-hf/blob/main/LICENSE.txt).