# Model Card for GPT-2 Fine-tuned with LoRA
## Model Details
### Model Description
This model is GPT-2 small fine-tuned with LoRA adapters. The fine-tuning was performed on a small, curated storytelling dataset such as TinyStories, to improve the model's ability to generate coherent, creative text in that domain.
- Developed by: Prahlad Sahu (ps2program)
- Shared by: Prahlad Sahu (ps2program)
- Model type: Causal Language Model
- Language(s) (NLP): English
- License: Apache 2.0
- Finetuned from model: GPT-2 small
### Model Sources
- Repository: https://huggingface.co/ps2program/gpt2-finetuned-ps2prahlad
- Paper: Radford et al., 2019, "Language Models are Unsupervised Multitask Learners" (GPT-2)
## Uses
### Direct Use
This model is intended for text generation tasks, specifically for creating stories, creative writing, and other forms of narrative content. It can be used directly to generate text from a given prompt.
### Downstream Use
The model can serve as a starting point for further fine-tuning on domain-specific tasks. For example, it could be adapted for generating scripts, poetry, or other types of creative text by training it on a relevant dataset.
### Out-of-Scope Use
This model is not suitable for factual or safety-critical applications because it may produce biased, nonsensical, or inaccurate content. It should not be used for tasks requiring factual accuracy, such as generating news articles, medical advice, or legal documents.
## Bias, Risks, and Limitations
This model, like many large language models, may produce biased, nonsensical, or inappropriate content. Its outputs are heavily dependent on the quality and size of the fine-tuning dataset, which could introduce biases present in the data.
### Recommendations
Users should be aware of the model's limitations and verify any content it produces for accuracy and appropriateness. It is recommended to use the model in a controlled environment and for its intended creative purposes only.
## How to Get Started with the Model
To get started, you can use the Hugging Face `transformers` library. The following Python code loads the model and generates text from a prompt:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "ps2program/gpt2-finetuned-ps2prahlad"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Generate a short continuation of the prompt.
prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=50,                    # length of the generated continuation
    do_sample=True,                       # sample instead of greedy decoding
    temperature=0.7,                      # lower values are more deterministic
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Training Details
### Training Data
The model was fine-tuned on the TinyStories dataset or a similar small text dataset curated for storytelling.
### Training Procedure
The fine-tuning was performed using LoRA (Low-Rank Adaptation), a parameter-efficient fine-tuning method that trains small low-rank adapter matrices while keeping the base weights frozen. After training, the adapters were merged into the base GPT-2 model, so no separate adapter weights are needed at inference time.
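To illustrate why LoRA is parameter-efficient: instead of updating a full weight matrix, it learns a low-rank product `B @ A` that is scaled by `alpha / r` and added to the frozen weight. A NumPy sketch (the dimensions and hyperparameters are illustrative, not this model's actual training config):

```python
import numpy as np

d_in, d_out = 768, 768  # shape of a GPT-2 small projection matrix
r, alpha = 8, 16        # illustrative LoRA rank and scaling factor

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))     # frozen base weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable down-projection
B = np.zeros((d_out, r))                   # trainable up-projection (zero-init)

# Merged weight after training: W' = W + (alpha / r) * B @ A.
# With B still at its zero init, the merge leaves W unchanged.
W_merged = W + (alpha / r) * (B @ A)

# Parameter efficiency: trainable LoRA params vs. a full update of W.
lora_params = A.size + B.size  # r * (d_in + d_out)
full_params = W.size           # d_in * d_out
print(lora_params, full_params)
```

Here the adapter trains 12,288 parameters per layer versus 589,824 for the full matrix, which is why LoRA fine-tuning is feasible on modest hardware.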
- Training regime: Not specified; LoRA fine-tuning of this kind typically uses mixed-precision training (e.g., fp16).
## Evaluation
### Testing Data, Factors & Metrics
- Testing Data: The model was likely evaluated on a held-out portion of the TinyStories dataset.
- Metrics: The primary metric used for evaluation was perplexity, which measures how well the model predicts a sample of text. The reported perplexity value is 12.34.
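For reference, perplexity is the exponential of the average per-token cross-entropy loss over the evaluation set. A minimal sketch of the computation (the loss values below are made up for illustration, not from this model's actual evaluation):

```python
import math

# Hypothetical per-token cross-entropy losses (in nats) on an eval set.
token_losses = [2.4, 2.6, 2.5, 2.5]

# Perplexity = exp(mean negative log-likelihood per token).
mean_nll = sum(token_losses) / len(token_losses)
perplexity = math.exp(mean_nll)
print(round(perplexity, 2))  # exp(2.5) ~= 12.18
```

Lower perplexity means the model assigns higher probability to the held-out text; a value around 12 indicates the model is, on average, about as uncertain as a uniform choice among ~12 tokens at each step.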
## Model Card Contact
For questions or feedback regarding this model card, please contact ps2program (Prahlad Sahu).