GPT-2 Fine-Tuned on Dolly 15K for Instruction Following
1. Overview
This repository contains a GPT-2 model fine-tuned on the Databricks Dolly 15K dataset to improve instruction-following capabilities. The goal is to transform a small, general-purpose language model into a lightweight instruction model that can:
- Follow natural language instructions
- Provide helpful and coherent answers
- Handle a variety of tasks such as question answering, classification, and explanation
Base model: gpt2
Fine-tuned by: Easonwangzk
Dataset: databricks/databricks-dolly-15k
This model is suitable for educational purposes, prototyping, and research on instruction-tuned models with limited parameters.
2. Dataset
The model is trained on the Databricks Dolly 15K dataset:
- 15,000 high-quality, human-written instruction–response pairs
- Covers multiple task types:
  - Open-ended question answering
  - Summarization
  - Classification and sentiment analysis
  - Brainstorming and creative tasks
  - Role-playing instructions
During preprocessing, each example is converted into a single text sequence in the format:
Instruction:
Context:
Response:
If no context is available, the Context section is omitted and only Instruction and Response are used.
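As a sketch, the per-example conversion described above can be written as a small plain-Python helper. The function name `format_example` is illustrative (not the repository's actual code); the field names `instruction`, `context`, and `response` follow the Dolly 15K schema.

```python
def format_example(example):
    """Convert one Dolly record into a single training string.

    Emits an Instruction / optional Context / Response block,
    omitting the Context section when no context is present.
    """
    parts = [f"Instruction:\n{example['instruction']}"]
    if example.get("context"):  # skip empty or missing context
        parts.append(f"Context:\n{example['context']}")
    parts.append(f"Response:\n{example['response']}")
    return "\n\n".join(parts)
```

Records without context collapse to the two-section Instruction/Response form automatically, so no separate template is needed.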
3. Training Setup
Base model: gpt2 (124M parameters)
Objective: Causal language modeling (next-token prediction)
Key training hyperparameters (example values):
- Epochs: 3
- Max sequence length: 512
- Optimizer: AdamW (via Transformers Trainer)
- Learning rate: 5e-5
- Batch size: 4 per device with gradient_accumulation_steps = 4
- Weight decay: 0.01
- Warmup ratio: 0.03
- Hardware: single GPU (for example, Colab or similar environment)
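With a per-device batch size of 4 and gradient_accumulation_steps = 4 on a single GPU, the effective batch size per optimizer step works out as follows (a simple check, assuming one device):

```python
per_device_batch_size = 4
gradient_accumulation_steps = 4
num_devices = 1  # single-GPU setup as described above

# Gradients are accumulated over several forward/backward passes before
# each optimizer update, so the effective batch size is the product.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_devices
print(effective_batch_size)  # 16
```

This lets memory-constrained hardware (such as a single Colab GPU) approximate training with a batch size of 16 while only holding 4 examples in memory at a time.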
Padding and special tokens:
- GPT-2 does not have a dedicated pad token.
- The tokenizer uses the EOS token as the pad token.
- pad_token_id is set to eos_token_id so that batched training works correctly.
4. Intended Use
This model is intended for:
- Small-scale instruction-following experiments
- Educational demonstrations of fine-tuning on instruction data
- Prototyping simple assistants for constrained domains
- Research on alignment and instruction tuning of smaller models
It is not intended for:
- High-stakes decision-making
- Production systems without additional safety mechanisms
- Domains requiring guaranteed factual accuracy or reliability
5. How to Use
Basic inference example (Python):
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Easonwangzk/gpt2-dolly-15k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to(
    "cuda" if torch.cuda.is_available() else "cpu"
)

def generate_response(instruction, max_new_tokens=200):
    # Match the training prompt format: Instruction, blank line, Response.
    prompt = f"Instruction:\n{instruction}\n\nResponse:\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

instruction = "List 3 use cases for blockchain."
print(generate_response(instruction))
For best results, prompts should follow the same structure used during training:
Instruction:
<your instruction here>
Response:
If there is additional context, you can include:
Instruction:
<your instruction here>
Context:
<relevant background or input>
Response:
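For inference, the two prompt shapes above can be assembled with a small helper like the following (the name `build_prompt` is hypothetical, not part of the released code):

```python
def build_prompt(instruction, context=None):
    """Build an inference prompt matching the training format.

    Includes a Context section only when context is provided, and
    ends with 'Response:' so the model continues from that point.
    """
    if context:
        return f"Instruction:\n{instruction}\n\nContext:\n{context}\n\nResponse:\n"
    return f"Instruction:\n{instruction}\n\nResponse:\n"
```

Keeping the prompt byte-for-byte consistent with the training format matters for small models like GPT-2, which are sensitive to formatting drift.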
6. Example Prompts
The model has been evaluated on prompts such as:
- List 3 use cases for blockchain.
- Explain photosynthesis.
- Classify the following sentence as Positive or Negative sentiment: "I loved the customer service!"
- Why is the sky blue?
- Pretend you are a travel agent. Recommend a weekend getaway in the US for someone who loves hiking.
- Compare the benefits of solar versus wind energy.
Compared to the original GPT-2 model, the fine-tuned model:
- Produces more direct and instruction-aligned answers
- Provides clearer and more structured explanations
- Drifts into unrelated text less often
7. Limitations and Risks
- The model is based on GPT-2 without reinforcement learning from human feedback.
- It can hallucinate facts or produce incorrect information.
- It may generate biased or inappropriate content if prompted in certain ways.
- It does not have any built-in safety filters or content moderation.
Users should:
- Manually review outputs before using them in any critical context.
- Avoid using this model in settings where mistakes could cause harm.
8. High-Level Training Procedure
- Load Dolly 15K via the datasets library.
- Convert each record to a unified instructional prompt format (Instruction, optional Context, and Response).
- Tokenize with the GPT-2 tokenizer, using the EOS token as padding.
- Fine-tune GPT-2 using the Hugging Face Trainer for several epochs.
- Save the fine-tuned model and tokenizer, then push them to the Hugging Face Hub.
9. Acknowledgements
If you use this model or the training pipeline in academic or industrial work, please consider citing:
- Databricks Dolly 15K dataset
- GPT-2 and the Transformers library by Hugging Face
Model author: Easonwangzk