Update README.md
README.md CHANGED
@@ -27,6 +27,8 @@ inference: false
 
 A 50M-parameter GPT-2-style language model trained from scratch on a Russian SFT corpus as a chatbot. Illustrates **what can be squeezed out of a model this size** without pretraining on raw text.
 
+Git-source - https://codeberg.org/imperius/mini-tron-50
+
 ## TL;DR
 
 - **Architecture**: 10 layer × 8 head × 512 emb (GPT-2 style), 47.85M params
@@ -307,6 +309,8 @@ Apache 2.0 — for code and weights of this model.
 SFT corpus as a chatbot. Illustrates **what can be
 squeezed out of a model this size** without pretraining on raw text.
 
+Git-source - https://codeberg.org/imperius/mini-tron-50
+
 ## TL;DR
 
 - **Architecture**: 10 layer × 8 head × 512 emb (GPT-2 style), 47.85M params