---
library_name: transformers
tags:
- gpt2
- text-generation
---
# Model Card for harpertoken/harpertokenGPT2
GPT-2 small model trained from scratch on WikiText-2-raw-v1 dataset for text generation.
## Model Details
### Model Description
This is a GPT-2 small model (117M parameters) trained from random initialization on the WikiText-2-raw-v1 dataset. It can generate coherent text continuations.
- **Developed by:** Niladri Das
- **Model type:** GPT-2
- **Language(s) (NLP):** English
- **License:** Apache-2.0
### Model Sources
- **Repository:** https://github.com/bniladridas/models
## Uses
### Direct Use
Use for text generation tasks, such as completing sentences or generating stories.
### Out-of-Scope Use
Not suitable for tasks requiring factual accuracy, safety-critical applications, or languages other than English.
## Bias, Risks, and Limitations
The model was trained on WikiText-2, which is drawn from Wikipedia and may reflect biases present in the source text. It may generate inaccurate, inappropriate, or biased content.
### Recommendations
Use with caution; implement content filters for production use.
## How to Get Started with the Model
```python
from transformers import pipeline

# Load the model via the text-generation pipeline.
generator = pipeline("text-generation", model="harpertoken/harpertokenGPT2")

# Generate a continuation for a prompt; max_new_tokens limits the output length.
print(generator("The quick brown fox", max_new_tokens=50))
```
## Training Details
### Training Data
The WikiText-2-raw-v1 dataset, a language modeling corpus of verified Good and Featured Wikipedia articles.
### Training Procedure
The model was trained from scratch (random initialization) using PyTorch and the Hugging Face Transformers library; a minimal training sketch follows the hyperparameters below.
#### Training Hyperparameters
- Epochs: 3
- Batch size: 1
- Learning rate: 5e-5
- Max sequence length: 512 tokens
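
The exact training script is not included in this card. The following is a minimal sketch, assuming the Hugging Face `wikitext` dataset and a standard `Trainer` loop, that reproduces the setup described above; names such as the output directory are illustrative.

```python
from datasets import load_dataset
from transformers import (
    DataCollatorForLanguageModeling,
    GPT2Config,
    GPT2LMHeadModel,
    GPT2TokenizerFast,
    Trainer,
    TrainingArguments,
)

# GPT-2 tokenizer has no pad token by default; reuse EOS for padding.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

# Random initialization: build the model from a config, not pretrained weights.
config = GPT2Config(vocab_size=tokenizer.vocab_size)
model = GPT2LMHeadModel(config)

dataset = load_dataset("wikitext", "wikitext-2-raw-v1")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])
# WikiText contains many blank lines; drop empty examples before training.
tokenized = tokenized.filter(lambda ex: len(ex["input_ids"]) > 0)

args = TrainingArguments(
    output_dir="harpertokenGPT2",   # illustrative output path
    num_train_epochs=3,
    per_device_train_batch_size=1,
    learning_rate=5e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```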
## Evaluation
Evaluation was qualitative: generated continuations were manually inspected for coherence.
### Results
Generates plausible text continuations.
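
No quantitative benchmark was reported. A sketch like the one below, assuming a few hand-picked prompts, illustrates the kind of qualitative coherence check described above.

```python
from transformers import pipeline

generator = pipeline("text-generation", model="harpertoken/harpertokenGPT2")

# Hand-picked prompts (illustrative) for a quick manual coherence check.
prompts = [
    "The history of the city began",
    "In computer science, an algorithm is",
]
for prompt in prompts:
    samples = generator(prompt, max_new_tokens=50, num_return_sequences=2, do_sample=True)
    for i, sample in enumerate(samples):
        print(f"[{prompt!r} #{i}] {sample['generated_text']}\n")
```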
## Environmental Impact
- **Hardware Type:** CPU/MPS
- **Hours used:** ~0.2 hours (roughly 10 minutes of local training)
- **Carbon Emitted:** Minimal (local training)
## Technical Specifications
### Model Architecture and Objective
GPT-2 decoder-only transformer for causal language modeling.
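
For reference, the GPT-2 small architecture corresponds to the default `GPT2Config` in Transformers (12 transformer blocks, 12 attention heads, 768-dimensional embeddings, 1024-token context). A small sketch, assuming the standard configuration values:

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Default GPT2Config matches GPT-2 small:
# n_layer=12, n_head=12, n_embd=768, n_positions=1024.
config = GPT2Config()
model = GPT2LMHeadModel(config)
print(f"Parameters: {sum(p.numel() for p in model.parameters()):,}")
```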
### Compute Infrastructure
- Hardware: Mac with the MPS (Metal Performance Shaders) backend
- Software: PyTorch, Transformers