
Model Card for GPT-2 Fine-tuned with LoRA

Model Details

Model Description

This model is GPT-2 small fine-tuned with LoRA adapters. The fine-tuning was performed on a small, curated storytelling dataset such as TinyStories, to improve the model's ability to generate coherent, creative text in that domain.

  • Developed by: Prahlad Sahu (ps2program)
  • Shared by: Prahlad Sahu (ps2program)
  • Model type: Causal Language Model
  • Language(s) (NLP): English
  • License: Apache 2.0
  • Finetuned from model: GPT-2 small

Uses

Direct Use

This model is intended for text generation tasks, specifically for creating stories, creative writing, and other forms of narrative content. It can be used directly to generate text from a given prompt.

Downstream Use

The model can serve as a starting point for further fine-tuning on domain-specific tasks. For example, it could be adapted for generating scripts, poetry, or other types of creative text by training it on a relevant dataset.

Out-of-Scope Use

This model is not suitable for factual or safety-critical applications because it may produce biased, nonsensical, or inaccurate content. It should not be used for tasks requiring factual accuracy, such as generating news articles, medical advice, or legal documents.

Bias, Risks, and Limitations

This model, like many large language models, may produce biased, nonsensical, or inappropriate content. Its outputs are heavily dependent on the quality and size of the fine-tuning dataset, which could introduce biases present in the data.

Recommendations

Users should be aware of the model's limitations and verify any content it produces for accuracy and appropriateness. It is recommended to use the model in a controlled environment and for its intended creative purposes only.

How to Get Started with the Model

To get started, you can use the Hugging Face transformers library. The following Python code demonstrates how to load and use the model for text generation:

from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "ps2program/gpt2-finetuned-ps2prahlad"

# Load the tokenizer and the fine-tuned model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Encode a prompt and sample a continuation
prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")

# do_sample with a moderate temperature gives varied but coherent text
outputs = model.generate(**inputs, max_length=50, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Training Details

Training Data

The model was fine-tuned on the TinyStories dataset or a similar small text dataset curated for storytelling.

Training Procedure

The fine-tuning was performed with LoRA (Low-Rank Adaptation) adapters, a parameter-efficient method that trains small low-rank update matrices while the base weights stay frozen. The trained adapters were then merged into the base GPT-2 model.

  • Training regime: Not specified; LoRA fine-tuning of this kind typically uses mixed-precision training (e.g., fp16).
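The adapter-and-merge idea above can be shown in plain NumPy. This is a conceptual sketch with illustrative shapes, not the actual GPT-2 dimensions or weights: LoRA learns two small factors A and B, and "merging" folds their scaled product into the frozen weight so the adapters can be discarded at inference time.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 768, 8, 16                 # hidden size, adapter rank, scaling (illustrative)

W = rng.standard_normal((d, d))          # frozen pretrained weight
A = rng.standard_normal((r, d)) * 0.01   # trainable low-rank factor
B = np.zeros((d, r))                     # B starts at zero, so the initial update is zero

# During fine-tuning, only A and B receive gradients; the effective weight is:
W_eff = W + (alpha / r) * (B @ A)

# Merging folds the low-rank update into the base weight once training is done:
W_merged = W + (alpha / r) * (B @ A)

# The adapters add r*(d + d) trainable parameters instead of d*d
adapter_params = A.size + B.size
print(adapter_params, W.size)
```

With these shapes, the adapters train 12,288 parameters versus 589,824 in the full weight matrix, which is why LoRA is called parameter-efficient.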

Evaluation

Testing Data, Factors & Metrics

  • Testing Data: The model was likely evaluated on a held-out portion of the TinyStories dataset.
  • Metrics: The primary metric used for evaluation was perplexity, which measures how well the model predicts a sample of text. The reported perplexity value is 12.34.
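As a reminder of what the metric means, perplexity is the exponential of the average negative log-likelihood per token. The sketch below computes it on made-up per-token probabilities; the 12.34 reported above comes from the author's held-out evaluation, not from this code.

```python
import math

# Hypothetical probabilities the model assigned to each true next token
token_probs = [0.10, 0.05, 0.20, 0.08]

# Negative log-likelihood of each token
nll = [-math.log(p) for p in token_probs]

# Perplexity: exp of the mean NLL; lower means the model predicts the text better
perplexity = math.exp(sum(nll) / len(nll))
print(round(perplexity, 2))
```

Equivalently, perplexity is the inverse geometric mean of the assigned probabilities, so a perplexity of 12.34 means the model is, on average, about as uncertain as a uniform choice over ~12 tokens.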

Model Card Contact

For questions or feedback regarding this model card, please contact ps2program (Prahlad Sahu).
