This repository contains a LoRA adapter (not a full merged model) for GPT-2, fine-tuned on the WikiText-2 raw v1 dataset for causal language modelling using the PEFT library.
Important: This repo contains only the LoRA adapter delta weights. Inference requires loading the GPT-2 base model separately and applying the adapter on top (see usage instructions below).
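
To make "delta weights" concrete: LoRA stores two small matrices, `A` and `B`, and the adapted layer behaves as if its weight were `W + (alpha / r) * B @ A`. The following is a minimal pure-Python sketch of that identity on toy matrices. The dimensions, values, and helper names (`matmul`, `add`, `scale`) are illustrative assumptions and are not taken from this adapter; the rank here is 1 for readability, and only the ratio `alpha / r = 2` matches the hyperparameters reported for this repo.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(a, s):
    """Multiply every entry of a matrix by the scalar s."""
    return [[s * x for x in row] for row in a]


# Toy frozen base weight W (out_features=2, in_features=3). Hypothetical values.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, -1.0]]

# Toy LoRA factors with rank r=1: B is (out x r), A is (r x in). Hypothetical values.
r, alpha = 1, 2
B = [[0.5], [-0.5]]
A = [[1.0, 2.0, 3.0]]
scaling = alpha / r

# Column-vector input of shape (in x 1).
x = [[1.0], [1.0], [1.0]]

# Path 1: base output plus the scaled low-rank update, kept separate
# (conceptually how an unmerged adapter is applied on top of the base model).
y_separate = add(matmul(W, x), scale(matmul(B, matmul(A, x)), scaling))

# Path 2: fold the delta into the base weight first (a "merged" model).
W_merged = add(W, scale(matmul(B, A), scaling))
y_merged = matmul(W_merged, x)

print(y_separate, y_merged)
```

Both paths give the same output, which is why the adapter file alone is not enough for inference: without the base weight `W`, the delta has nothing to be added to.
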
| Field | Value |
|---|---|
| Base model | gpt2 |
| Adapter type | LoRA (Low-Rank Adaptation) |
| Dataset | WikiText-2 raw v1 |
| Task | Causal Language Modelling |
| Frameworks | Transformers, PEFT, PyTorch |
| Training environment | Google Colab — Tesla T4 GPU |
| Selected checkpoint | checkpoint-1752 |

| Metric | Value |
|---|---|
| Best validation loss | 3.1842 |
| Best validation perplexity | 24.1475 |
| Test loss | 3.1593 |
| Test perplexity | 23.5548 |
| Target baseline (val PPL) | 18 – 25 |
| Baseline achieved | ✓ Yes |
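
The perplexities in the table above can be sanity-checked from the losses, assuming (as is conventional for causal language modelling, though not stated explicitly here) that each loss is the mean per-token cross-entropy in nats, so that perplexity is `exp(loss)`. A small sketch using only the reported numbers; the function and variable names are illustrative:

```python
import math


def perplexity(mean_nll: float) -> float:
    """Perplexity for a mean per-token negative log-likelihood given in nats."""
    return math.exp(mean_nll)


# Values copied from the metrics table.
reported = {
    "validation": {"loss": 3.1842, "ppl": 24.1475},
    "test": {"loss": 3.1593, "ppl": 23.5548},
}

for split, metrics in reported.items():
    recomputed = perplexity(metrics["loss"])
    # The losses are rounded to four decimals, so expect agreement only to
    # roughly the third decimal place of the perplexity.
    print(f"{split}: exp({metrics['loss']}) = {recomputed:.4f} (reported {metrics['ppl']})")

# Check the validation perplexity against the stated 18 to 25 target range.
target_low, target_high = 18.0, 25.0
baseline_met = target_low <= reported["validation"]["ppl"] <= target_high
print("baseline met:", baseline_met)
```
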

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# 1. Load the GPT-2 base model
base_model = AutoModelForCausalLM.from_pretrained("gpt2")

# 2. Load the tokenizer from this repo
tokenizer = AutoTokenizer.from_pretrained(
    "dina1/gpt2-wikitext2-lora-peft", subfolder="tokenizer"
)

# 3. Apply the LoRA adapter
model = PeftModel.from_pretrained(base_model, "dina1/gpt2-wikitext2-lora-peft")
model.eval()

# 4. Generate text
inputs = tokenizer("The history of artificial intelligence", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.9,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token; reuse EOS
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

| Hyperparameter | Value |
|---|---|
| Block size | 512 tokens |
| LoRA rank (r) | 8 |
| LoRA alpha | 16 |
| LoRA dropout | 0.1 |
| LoRA target modules | ["c_attn"] |
| Learning rate | 2e-4 |
| LR scheduler | cosine |
| Gradient accumulation | 8 steps |
| Training precision | fp16 |
| Seed | 42 |
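
As a rough illustration of what these LoRA settings imply, the sketch below estimates how many trainable parameters a rank-8 adapter on `c_attn` adds. The GPT-2 dimensions (hidden size 768, 12 transformer blocks, `c_attn` projecting to 3 × 768 outputs) are assumptions based on the standard 124M `gpt2` configuration and are not listed in the table; the sketch also assumes no bias terms are trained. The helper name `lora_params` is illustrative.

```python
# Assumed dimensions of the standard 124M "gpt2" checkpoint (not from the table).
HIDDEN_SIZE = 768
NUM_LAYERS = 12
C_ATTN_OUT = 3 * HIDDEN_SIZE  # fused query/key/value projection

# Values from the hyperparameter table.
LORA_R = 8
LORA_ALPHA = 16


def lora_params(in_features: int, out_features: int, r: int) -> int:
    """Parameters added by one LoRA pair: A is (r x in), B is (out x r)."""
    return r * in_features + out_features * r


per_layer = lora_params(HIDDEN_SIZE, C_ATTN_OUT, LORA_R)
total_trainable = per_layer * NUM_LAYERS
scaling = LORA_ALPHA / LORA_R  # factor applied to the low-rank update

print(f"per layer: {per_layer:,}")
print(f"total trainable: {total_trainable:,}")
print(f"scaling (alpha / r): {scaling}")
```

Under these assumptions the adapter adds 24,576 parameters per block and 294,912 in total, a small fraction of a percent of the roughly 124M parameters in the base model, with the update scaled by 2.0.
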

| File / Directory | Description |
|---|---|
| `adapter_config.json` | LoRA adapter configuration |
| `adapter_model.safetensors` | LoRA adapter weights |
| `tokenizer/` | GPT-2 tokenizer files |
| `evaluation_summary.json` | Full evaluation metrics from Phase 5 |
| `production_manifest.json` | Deployment metadata and artifact paths |
| `loss_and_perplexity_curves.png` | Training and validation curves |

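
After downloading the repository files, one way to confirm that `adapter_config.json` agrees with the documented hyperparameters is a small standard-library check like the sketch below. The key names (`r`, `lora_alpha`, `lora_dropout`, `target_modules`) follow what PEFT normally writes for LoRA adapters; the function name `check_adapter_config` and the demo file are hypothetical, and the demo content is a stand-in rather than a copy of the file in this repo.

```python
import json
import tempfile
from pathlib import Path

# Hyperparameters as documented in this model card.
EXPECTED = {
    "r": 8,
    "lora_alpha": 16,
    "lora_dropout": 0.1,
    "target_modules": ["c_attn"],
}


def check_adapter_config(path):
    """Return {key: (expected, found)} for documented values that differ in the file."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    mismatches = {}
    for key, expected in EXPECTED.items():
        found = config.get(key)
        if key == "target_modules" and isinstance(found, list):
            # Module order is not meaningful, so compare as sets.
            same = set(found) == set(expected)
        else:
            same = found == expected
        if not same:
            mismatches[key] = (expected, found)
    return mismatches


# Demo on a hypothetical stand-in file (NOT the real file from this repo).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "adapter_config.json"
    demo_config = dict(EXPECTED, peft_type="LORA", base_model_name_or_path="gpt2")
    demo_path.write_text(json.dumps(demo_config), encoding="utf-8")
    demo_ok = check_adapter_config(demo_path)

    demo_config["r"] = 4  # deliberately wrong rank
    demo_path.write_text(json.dumps(demo_config), encoding="utf-8")
    demo_bad = check_adapter_config(demo_path)

print(demo_ok, demo_bad)
```

For real use, point `check_adapter_config` at the downloaded `adapter_config.json`; an empty dictionary means the file matches the values documented here.
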
If you use this adapter in your work, please credit the PEFT library:

```bibtex
@misc{peft,
  author = {Sourab Mangrulkar and Sylvain Gugger and Lysandre Debut and Younes Belkada and Sayak Paul},
  title  = {PEFT: State-of-the-art Parameter-Efficient Fine-Tuning methods},
  year   = {2022},
  url    = {https://github.com/huggingface/peft}
}
```