## Model Card for GPT-2 Fine-tuned with LoRA
### Model Details
#### Model Description
This model is GPT-2 small fine-tuned with LoRA adapters on a small, curated storytelling dataset such as TinyStories, improving its ability to generate coherent and creative text in that domain.
- **Developed by:** Prahlad Sahu (ps2program)
- **Shared by:** Prahlad Sahu (ps2program)
- **Model type:** Causal Language Model
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** GPT-2 small
#### Model Sources
- **Repository:** [https://huggingface.co/ps2program/gpt2-finetuned-ps2prahlad](https://huggingface.co/ps2program/gpt2-finetuned-ps2prahlad)
- **Paper:** Language Models are Unsupervised Multitask Learners (Radford et al., 2019)
### Uses
#### Direct Use
This model is intended for **text generation tasks**, specifically for creating stories, creative writing, and other forms of narrative content. It can be used directly to generate text from a given prompt.
#### Downstream Use
The model can serve as a **starting point for further fine-tuning** on domain-specific tasks. For example, it could be adapted for generating scripts, poetry, or other types of creative text by training it on a relevant dataset.
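If you continue fine-tuning with LoRA, a configuration might look like the following sketch. The `peft` library usage, rank, and target modules shown here are illustrative assumptions, not the settings used to train this model:

```python
# Hypothetical sketch of attaching fresh LoRA adapters for further fine-tuning.
# The rank, alpha, and dropout values below are illustrative defaults.
from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                       # adapter rank (hypothetical choice)
    lora_alpha=16,             # scaling factor applied to the update
    lora_dropout=0.05,
    target_modules=["c_attn"], # GPT-2's fused query/key/value projection
)

# The config would then be applied to a loaded base model with:
#   from peft import get_peft_model
#   model = get_peft_model(base_model, lora_config)
```

Training then proceeds as usual; only the small adapter matrices receive gradient updates while the base weights stay frozen.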
#### Out-of-Scope Use
This model is **not suitable for factual or safety-critical applications** because it may produce biased, nonsensical, or inaccurate content. It should not be used for tasks requiring factual accuracy, such as generating news articles, medical advice, or legal documents.
### Bias, Risks, and Limitations
This model, like many large language models, may produce **biased, nonsensical, or inappropriate content**. Its outputs are heavily dependent on the quality and size of the fine-tuning dataset, which could introduce biases present in the data.
#### Recommendations
Users should be aware of the model's limitations and verify any content it produces for accuracy and appropriateness. It is recommended to use the model in a controlled environment and for its intended creative purposes only.
### How to Get Started with the Model
To get started, you can use the Hugging Face `transformers` library. The following Python code demonstrates how to load and use the model for text generation:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned model and its tokenizer from the Hub.
model_name = "ps2program/gpt2-finetuned-ps2prahlad"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Generate a short continuation with sampling enabled.
prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=50,
    do_sample=True,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Training Details
#### Training Data
The model was fine-tuned on the **TinyStories** dataset or a similar small text dataset curated for storytelling.
#### Training Procedure
The fine-tuning was performed using **LoRA (Low-Rank Adaptation)** adapters, which are a parameter-efficient fine-tuning method. The adapters were then merged into the base GPT-2 model.
- **Training regime:** Not specified, but typically involves mixed precision training (e.g., `fp16`).
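The merge step can be illustrated numerically. LoRA learns a low-rank update `B @ A` that, scaled by `alpha / r`, is added into the frozen base weight when the adapter is merged. The following is a tiny sketch with made-up dimensions, not this model's actual training code:

```python
# Minimal numerical sketch of a LoRA merge: W' = W + (alpha / r) * B @ A.
import numpy as np

d, r, alpha = 4, 2, 16       # tiny illustrative dimensions and scaling
rng = np.random.default_rng(0)

W = rng.normal(size=(d, d))  # frozen base weight
A = rng.normal(size=(r, d))  # trainable low-rank factor A
B = np.zeros((d, r))         # B starts at zero, so the update starts at zero

delta = (alpha / r) * B @ A  # low-rank update learned during fine-tuning
W_merged = W + delta         # merging folds the adapter into the base weight

# With B still at its zero initialization, merging is a no-op.
assert np.allclose(W_merged, W)
```

Because the merged weight has the same shape as the original, the resulting model loads and runs exactly like a plain GPT-2 checkpoint, with no adapter code needed at inference time.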
### Evaluation
#### Testing Data, Factors & Metrics
- **Testing Data:** The model was likely evaluated on a held-out portion of the TinyStories dataset.
- **Metrics:** The primary metric used for evaluation was **perplexity**, which measures how well the model predicts a sample of text. The reported perplexity value is 12.34.
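Perplexity is the exponential of the mean per-token negative log-likelihood on held-out text, so lower is better. A small sketch with hypothetical per-token losses (not values measured for this model):

```python
# Perplexity = exp(mean negative log-likelihood per token).
import math

# Hypothetical per-token negative log-likelihoods on a held-out sample.
nlls = [2.1, 2.6, 2.4, 2.5]
mean_nll = sum(nlls) / len(nlls)   # 2.4
perplexity = math.exp(mean_nll)
print(round(perplexity, 2))        # → 11.02
```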
### Model Card Contact
For questions or feedback regarding this model card, please contact ps2program (Prahlad Sahu).