GPT-2 Fine-Tuned on Dolly 15K for Instruction Following
1. Overview
This repository contains a GPT-2 model fine-tuned on the Databricks Dolly 15K dataset to improve instruction-following capabilities. The goal is to transform a small, general-purpose language model into a lightweight instruction model that can:
- Follow natural language instructions
- Provide helpful and coherent answers
- Handle a variety of tasks such as question answering, classification, and explanation
Base model: gpt2
Fine-tuned by: Easonwangzk
Dataset: databricks/databricks-dolly-15k
This model is suitable for educational purposes, prototyping, and research on instruction-tuned models with limited parameters.
2. Dataset
The model is trained on the Databricks Dolly 15K dataset:
- 15,000 high-quality, human-written instruction–response pairs
- Covers multiple task types:
  - Open-ended question answering
  - Summarization
  - Classification and sentiment analysis
  - Brainstorming and creative tasks
  - Role-playing instructions
During preprocessing, each example is converted into a single text sequence in the format:
Instruction:
Context:
Response:
If no context is available, the Context section is omitted and only Instruction and Response are used.
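As a sketch, the per-example conversion described above can be written as a small plain-Python helper. The function name `format_example` is illustrative (not the repository's actual code); the field names `instruction`, `context`, and `response` follow the Dolly 15K schema.

```python
def format_example(example):
    """Convert one Dolly record into a single training string.

    Emits an Instruction / optional Context / Response block,
    omitting the Context section when no context is present.
    """
    parts = [f"Instruction:\n{example['instruction']}"]
    if example.get("context"):  # skip empty or missing context
        parts.append(f"Context:\n{example['context']}")
    parts.append(f"Response:\n{example['response']}")
    return "\n\n".join(parts)
```

Records without context collapse to the two-section Instruction/Response form automatically, so no separate template is needed.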
3. Training Setup
Base model: gpt2 (124M parameters)
Objective: Causal language modeling (next-token prediction)
Key training hyperparameters (example values):
- Epochs: 3
- Max sequence length: 512
- Optimizer: AdamW (via Transformers Trainer)
- Learning rate: 5e-5
- Batch size: 4 per device with gradient_accumulation_steps = 4
- Weight decay: 0.01
- Warmup ratio: 0.03
- Hardware: single GPU (for example, Colab or similar environment)
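With a per-device batch size of 4 and gradient_accumulation_steps = 4 on a single GPU, the effective batch size per optimizer step works out as follows (a simple check, assuming one device):

```python
per_device_batch_size = 4
gradient_accumulation_steps = 4
num_devices = 1  # single-GPU setup as described above

# Gradients are accumulated over several forward/backward passes before
# each optimizer update, so the effective batch size is the product.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_devices
print(effective_batch_size)  # 16
```

This lets memory-constrained hardware (such as a single Colab GPU) approximate training with a batch size of 16 while only holding 4 examples in memory at a time.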
Padding and special tokens:
- GPT-2 does not have a dedicated pad token.
- The tokenizer uses the EOS token as the pad token.
- pad_token_id is set to eos_token_id so that batched training works correctly.
4. Intended Use
This model is intended for:
- Small-scale instruction-following experiments
- Educational demonstrations of fine-tuning on instruction data
- Prototyping simple assistants for constrained domains
- Research on alignment and instruction tuning of smaller models
It is not intended for:
- High-stakes decision-making
- Production systems without additional safety mechanisms
- Domains requiring guaranteed factual accuracy or reliability
5. How to Use
Basic inference example (Python):
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Easonwangzk/gpt2-dolly-15k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to(
    "cuda" if torch.cuda.is_available() else "cpu"
)

def generate_response(instruction, max_new_tokens=200):
    # Match the training prompt format: Instruction, blank line, Response.
    prompt = f"Instruction:\n{instruction}\n\nResponse:\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

instruction = "List 3 use cases for blockchain."
print(generate_response(instruction))
For best results, prompts should follow the same structure used during training:
Instruction:
<your instruction here>
Response:
If there is additional context, you can include:
Instruction:
<your instruction here>
Context:
<relevant background or input>
Response:
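For inference, the two prompt shapes above can be assembled with a small helper like the following (the name `build_prompt` is hypothetical, not part of the released code):

```python
def build_prompt(instruction, context=None):
    """Build an inference prompt matching the training format.

    Includes a Context section only when context is provided, and
    ends with 'Response:' so the model continues from that point.
    """
    if context:
        return f"Instruction:\n{instruction}\n\nContext:\n{context}\n\nResponse:\n"
    return f"Instruction:\n{instruction}\n\nResponse:\n"
```

Keeping the prompt byte-for-byte consistent with the training format matters for small models like GPT-2, which are sensitive to formatting drift.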
6. Example Prompts
The model has been evaluated on prompts such as:
- List 3 use cases for blockchain.
- Explain photosynthesis.
- Classify the following sentence as Positive or Negative sentiment: "I loved the customer service!"
- Why is the sky blue?
- Pretend you are a travel agent. Recommend a weekend getaway in the US for someone who loves hiking.
- Compare the benefits of solar versus wind energy.
Compared to the original GPT-2 model, the fine-tuned model:
- Produces more direct and instruction-aligned answers
- Provides clearer and more structured explanations
- Drifts into unrelated text less often
7. Limitations and Risks
- The model is based on GPT-2 without reinforcement learning from human feedback.
- It can hallucinate facts or produce incorrect information.
- It may generate biased or inappropriate content if prompted in certain ways.
- It does not have any built-in safety filters or content moderation.
Users should:
- Manually review outputs before using them in any critical context.
- Avoid using this model in settings where mistakes could cause harm.
8. High-Level Training Procedure
- Load Dolly 15K via the datasets library.
- Convert each record to a unified instructional prompt format (Instruction, optional Context, and Response).
- Tokenize with the GPT-2 tokenizer, using the EOS token as padding.
- Fine-tune GPT-2 using the Hugging Face Trainer for several epochs.
- Save the fine-tuned model and tokenizer, then push them to the Hugging Face Hub.
9. Acknowledgements
If you use this model or the training pipeline in academic or industrial work, please consider citing:
- Databricks Dolly 15K dataset
- GPT-2 and the Transformers library by Hugging Face
Model author: Easonwangzk