---
library_name: transformers
tags:
- gpt2
- text-generation
---
# Model Card for harpertoken/harpertokenGPT2
GPT-2 small model trained from scratch on WikiText-2-raw-v1 dataset for text generation.
## Model Details
### Model Description
This is a GPT-2 small model (117M parameters) trained from random initialization on the WikiText-2-raw-v1 dataset. It can generate coherent text continuations.
- **Developed by:** Niladri Das
- **Model type:** GPT-2
- **Language(s) (NLP):** English
- **License:** Apache-2.0
### Model Sources
- **Repository:** https://github.com/bniladridas/models
## Uses
### Direct Use
Use for text generation tasks, such as completing sentences or generating stories.
### Out-of-Scope Use
Not suitable for tasks requiring factual accuracy, safety-critical applications, or languages other than English.
## Bias, Risks, and Limitations
The model was trained on WikiText-2, which is drawn from Wikipedia and may reflect biases present in the source text. It may generate inaccurate, inappropriate, or biased content.
### Recommendations
Use with caution; implement content filters for production use.
## How to Get Started with the Model
```python
from transformers import pipeline

# Load the model via the text-generation pipeline.
generator = pipeline("text-generation", model="harpertoken/harpertokenGPT2")

# Generate a continuation for a prompt; max_new_tokens limits the output length.
print(generator("The quick brown fox", max_new_tokens=50))
```
## Training Details
### Training Data
The WikiText-2-raw-v1 dataset, a language modeling corpus of verified Good and Featured Wikipedia articles.
### Training Procedure
The model was trained from scratch (random initialization) using PyTorch and the Hugging Face Transformers library; a minimal training sketch follows the hyperparameters below.
#### Training Hyperparameters
- Epochs: 3
- Batch size: 1
- Learning rate: 5e-5
- Max sequence length: 512 tokens
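
The exact training script is not included in this card. The following is a minimal sketch, assuming the Hugging Face `wikitext` dataset and a standard `Trainer` loop, that reproduces the setup described above; names such as the output directory are illustrative.

```python
from datasets import load_dataset
from transformers import (
    DataCollatorForLanguageModeling,
    GPT2Config,
    GPT2LMHeadModel,
    GPT2TokenizerFast,
    Trainer,
    TrainingArguments,
)

# GPT-2 tokenizer has no pad token by default; reuse EOS for padding.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

# Random initialization: build the model from a config, not pretrained weights.
config = GPT2Config(vocab_size=tokenizer.vocab_size)
model = GPT2LMHeadModel(config)

dataset = load_dataset("wikitext", "wikitext-2-raw-v1")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])
# WikiText contains many blank lines; drop empty examples before training.
tokenized = tokenized.filter(lambda ex: len(ex["input_ids"]) > 0)

args = TrainingArguments(
    output_dir="harpertokenGPT2",   # illustrative output path
    num_train_epochs=3,
    per_device_train_batch_size=1,
    learning_rate=5e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```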
## Evaluation
Evaluation was qualitative: generated continuations were manually inspected for coherence.
### Results
Generates plausible text continuations.
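
No quantitative benchmark was reported. A sketch like the one below, assuming a few hand-picked prompts, illustrates the kind of qualitative coherence check described above.

```python
from transformers import pipeline

generator = pipeline("text-generation", model="harpertoken/harpertokenGPT2")

# Hand-picked prompts (illustrative) for a quick manual coherence check.
prompts = [
    "The history of the city began",
    "In computer science, an algorithm is",
]
for prompt in prompts:
    samples = generator(prompt, max_new_tokens=50, num_return_sequences=2, do_sample=True)
    for i, sample in enumerate(samples):
        print(f"[{prompt!r} #{i}] {sample['generated_text']}\n")
```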
## Environmental Impact
- **Hardware Type:** CPU/MPS
- **Hours used:** ~0.2 hours (roughly 10 minutes of local training)
- **Carbon Emitted:** Minimal (local training)
## Technical Specifications
### Model Architecture and Objective
GPT-2 decoder-only transformer for causal language modeling.
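
For reference, the GPT-2 small architecture corresponds to the default `GPT2Config` in Transformers (12 transformer blocks, 12 attention heads, 768-dimensional embeddings, 1024-token context). A small sketch, assuming the standard configuration values:

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Default GPT2Config matches GPT-2 small:
# n_layer=12, n_head=12, n_embd=768, n_positions=1024.
config = GPT2Config()
model = GPT2LMHeadModel(config)
print(f"Parameters: {sum(p.numel() for p in model.parameters()):,}")
```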
### Compute Infrastructure
- Hardware: Mac with the MPS (Metal Performance Shaders) backend
- Software: PyTorch, Transformers