---
license: mit
language:
- en
tags:
- text-generation-inference
pipeline_tag: text-generation
---
# GPT-Fem

An 81-million-parameter language model using the GPT-2 tokenizer. Trained on 16 GB of Reddit comments and submissions relating to and made by women, plus 1 GB of multilingual text, for a total of 5.2 billion tokens.
## Technical Information

| Parameter | Value |
|---|---|
| Layers | 10 |
| Heads | 10 |
| Embedding Size | 640 |
| Context Window | 4096 tokens |
| Tokenizer | GPT-2 BPE |
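As a sanity check, the quoted parameter count can be roughly reproduced from the configuration above. The sketch below assumes a standard GPT-2-style architecture with the usual 50,257-token vocabulary, tied input/output embeddings, and learned positional embeddings; the exact count depends on implementation details (e.g. biases, weight tying), so the result is only an estimate.

```python
# Rough parameter-count estimate for a GPT-2-style model with the
# configuration from the table above. Assumptions (not stated in the
# card): GPT-2 vocab size of 50257, tied embeddings/LM head, learned
# positional embeddings, and bias terms ignored.

n_layer, d_model, n_ctx, n_vocab = 10, 640, 4096, 50257

token_emb = n_vocab * d_model   # token embedding matrix (shared with LM head)
pos_emb = n_ctx * d_model       # learned positional embeddings
per_block = 12 * d_model ** 2   # attention (4*d^2) + MLP (8*d^2) per layer
final_ln = 2 * d_model          # final layer norm (scale + shift)

total = token_emb + pos_emb + n_layer * per_block + final_ln
print(f"~{total / 1e6:.1f}M parameters")
```

This lands in the mid-80-million range, consistent with the roughly 81M figure once implementation details are accounted for.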
## Training Information

| Detail | Value |
|---|---|
| Training Loss | 3.0 |
| Validation Loss | 2.99 |
| Device | Google Colab (NVIDIA L4 GPU) |
| Training Time | 5 hours |
