---
|
|
license: mit |
|
|
language: |
|
|
- en |
|
|
tags: |
|
|
- text-generation-inference |
|
|
pipeline_tag: text-generation |
|
|
--- |
|
|
|
|
|
 |
|
|
|
|
|
## GPT-Fem |
|
|
An 81-million-parameter language model using the GPT-2 BPE tokenizer.
|
|
Trained on 16 GB of text about and by women, plus 1 GB of multilingual text (5.2 billion tokens in total).
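As a rough sanity check (assuming decimal gigabytes), the stated corpus size and token count imply a typical BPE compression ratio:

```python
# Rough sanity check on the stated corpus statistics (assumes decimal GB).
corpus_bytes = 17e9  # 16 GB of women-related text + 1 GB of multilingual text
tokens = 5.2e9       # stated token count

bytes_per_token = corpus_bytes / tokens
print(f"{bytes_per_token:.2f} bytes/token")  # ~3.27, typical for BPE on mostly-English text
```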
|
|
|
|
|
This model should be fine-tuned before use. |
|
|
|
|
|
 |
|
|
|
|
|
## Languages

English, Turkish, Swedish, Serbian, Portuguese, Norwegian, Welsh, Thai, Polish, French, Finnish, Dutch, Arabic, Korean, Japanese, Danish, Croatian, Spanish, Russian, Chinese
|
|
|
|
|
|
|
|
## Technical Information |
|
|
| Hyperparameter | Value |
|---------------------------------|----:|
| Layers | 10 |
| Heads | 10 |
| Embedding dimension | 640 |
| Context window | 4096 tokens |
| Tokenizer | GPT-2 BPE |
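As a sketch, the figures above are roughly consistent with the stated parameter count, assuming GPT-2-style transformer blocks (about 12·d² weights per layer, ignoring biases and LayerNorm), the standard 50,257-token GPT-2 vocabulary, and a language-model head tied to the input embeddings:

```python
# Rough parameter-count estimate for the architecture in the table above.
# Assumptions: GPT-2-style blocks (12*d^2 weights per layer, biases and
# LayerNorm ignored), GPT-2 BPE vocabulary of 50,257 tokens, tied LM head.
n_layers, d_model, n_ctx, vocab = 10, 640, 4096, 50257

token_emb = vocab * d_model               # token embedding matrix (tied with LM head)
pos_emb = n_ctx * d_model                 # learned positional embeddings
blocks = n_layers * 12 * d_model ** 2     # attention (4*d^2) + MLP (8*d^2) per block

total = token_emb + pos_emb + blocks
print(f"{total / 1e6:.1f}M parameters")  # 83.9M, close to the stated 81M
```

The small gap to the stated 81 million could come from differences in what is counted (e.g. excluding positional embeddings), so treat this as an order-of-magnitude check.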
|
|
|
|
|
|
|
|
## Training Information |
|
|
| Metric | Value |
|---------------------------------|----:|
| Training loss | 3.0 |
| Validation loss | 2.99 |
| Device | Google Colab A100 |
| Training time | 5 hours |
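For reference, the reported validation loss corresponds to a perplexity of roughly 19.9, assuming the loss is mean cross-entropy in nats per token:

```python
import math

# Perplexity implied by the reported validation loss, assuming it is
# mean cross-entropy in nats per token.
val_loss = 2.99
print(f"perplexity = {math.exp(val_loss):.1f}")  # 19.9
```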
|
|
|