| license: mit | |
| language: | |
| - ru | |
| metrics: | |
| - perplexity | |
| pipeline_tag: text-generation | |
| This model was created by [ilnikolaev](https://huggingface.co/ilnikolaev). | |
| It was trained from scratch using TensorFlow Keras. | |
| Training data: the [200mb Russian Comments from 2ch](https://www.kaggle.com/datasets/fizzzgen/65mb-of-dvach-conversations) dataset. | |
| - Type: decoder-only | |
| - Tokenizer: BPE | |
| - Vocabulary size: 32000 | |
| - Max sequence length: 120 | |
| - Hidden size: 768 | |
| - FFN size: 3072 | |
| - Attention heads: 24 | |
| - Decoder layers: 4 |
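
The hyperparameters above can be sanity-checked with a little arithmetic. The sketch below is not the author's code: `DecoderConfig` and `estimate_parameters` are hypothetical names, and the count assumes a standard GPT-style block (biased Q/K/V/output projections, a two-layer FFN, two LayerNorms per block, a final LayerNorm), learned positional embeddings, and an output projection tied to the token embeddings. The actual model may differ on any of these points.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecoderConfig:
    """Hyperparameters as listed in the model card (names are hypothetical)."""
    vocab_size: int = 32000
    max_seq_len: int = 120
    hidden_size: int = 768
    ffn_size: int = 3072
    num_heads: int = 24
    num_layers: int = 4

    @property
    def head_dim(self) -> int:
        # Multi-head attention splits the hidden size evenly across heads.
        if self.hidden_size % self.num_heads != 0:
            raise ValueError("hidden_size must be divisible by num_heads")
        return self.hidden_size // self.num_heads


def estimate_parameters(cfg: DecoderConfig) -> dict:
    """Rough parameter count under the assumptions stated in the lead-in."""
    d, f = cfg.hidden_size, cfg.ffn_size

    token_embedding = cfg.vocab_size * d
    # Assumption: learned (not sinusoidal) positional embeddings.
    position_embedding = cfg.max_seq_len * d

    # Q, K, V and output projections, each d x d plus a bias of size d.
    attention = 4 * (d * d + d)
    # Two dense layers: d -> f and f -> d, each with a bias.
    ffn = (d * f + f) + (f * d + d)
    # Two LayerNorms per block, each with a scale and a shift vector.
    layer_norms = 2 * 2 * d
    per_layer = attention + ffn + layer_norms

    # Assumption: one final LayerNorm; output head tied to token embeddings.
    final_norm = 2 * d
    total = (token_embedding + position_embedding
             + cfg.num_layers * per_layer + final_norm)

    return {
        "token_embedding": token_embedding,
        "position_embedding": position_embedding,
        "per_layer": per_layer,
        "total": total,
    }


if __name__ == "__main__":
    cfg = DecoderConfig()
    print("head_dim:", cfg.head_dim)
    for name, value in estimate_parameters(cfg).items():
        print(f"{name}: {value:,}")
```

Under these assumptions each attention head works in 32 dimensions, and the estimate comes to roughly 53 million parameters, of which about 24.6 million sit in the token embedding table. If the output projection is not tied to the embeddings, the total would grow by another vocabulary-sized matrix.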