Upload folder using huggingface_hub
Browse files- README.md +91 -0
- config.json +29 -0
- model.onnx +3 -0
- pytorch_model.bin +3 -0
- tokenizer.json +0 -0
README.md
ADDED
|
@@ -0,0 +1,91 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
---
|
| 2 |
+
license: mit
|
| 3 |
+
language:
|
| 4 |
+
- ko
|
| 5 |
+
tags:
|
| 6 |
+
- fish
|
| 7 |
+
- character
|
| 8 |
+
- tiny-llm
|
| 9 |
+
- text-generation
|
| 10 |
+
- from-scratch
|
| 11 |
+
- korean
|
| 12 |
+
pipeline_tag: text-generation
|
| 13 |
+
---
|
| 14 |
+
|
| 15 |
+
<p align="center">
|
| 16 |
+
<img src="assets/guppy.png" alt="GuppyLM" width="300"/>
|
| 17 |
+
</p>
|
| 18 |
+
|
| 19 |
+
<p align="center">
|
| 20 |
+
<a href="https://github.com/xtmono/guppylm"><img src="https://img.shields.io/badge/GitHub-guppylm-181717?logo=github" alt="GitHub"/></a>
|
| 21 |
+
<a href="https://colab.research.google.com/github/xtmono/guppylm/blob/main/use_guppylm.ipynb"><img src="https://img.shields.io/badge/Open_in-Colab-F9AB00?logo=googlecolab" alt="Colab"/></a>
|
| 22 |
+
<br/><br/>
|
| 23 |
+
<a href="https://xtmono.github.io/guppylm/"><img src="https://img.shields.io/badge/Try_in-Browser-64ffda?style=for-the-badge&logo=webassembly&logoColor=white" alt="Browser Demo"/></a>
|
| 24 |
+
</p>
|
| 25 |
+
|
| 26 |
+
# GuppyLM โ ~10M ํ๋ผ๋ฏธํฐ ํ๊ตญ์ด ๋ฌผ๊ณ ๊ธฐ ์ฑ ๋ชจ๋ธ
|
| 27 |
+
|
| 28 |
+
๊ตฌํผ๋ผ๋ ์ด๋ฆ์ ๋ฌผ๊ณ ๊ธฐ ์ญํ ์ ํ๋ ~10M ํ๋ผ๋ฏธํฐ ํ๊ตญ์ด LLM์
๋๋ค.
|
| 29 |
+
[arman-bd/guppylm](https://github.com/arman-bd/guppylm) ์๋ฌธ ์๋ณธ์ ํ๊ธํํ์ต๋๋ค.
|
| 30 |
+
|
| 31 |
+
## ์์
|
| 32 |
+
|
| 33 |
+
```
|
| 34 |
+
๋> ์๋
๊ตฌํผ
|
| 35 |
+
๊ตฌํผ> ์๋
! ๋ฌผ์ด ์ข์. ์ง๊ธ ๋ฐ์ ๊ทผ์ฒ์ ์์ด.
|
| 36 |
+
|
| 37 |
+
๋> ๋ฐฐ๊ณ ํ?
|
| 38 |
+
๊ตฌํผ> ์ ํญ์. ์ง๊ธ ๋ฐ๋ก ์๋ก ์ฌ๋ผ๊ฐ๊ฒ. ๋ค ๋จน์ ๊ฑฐ์ผ.
|
| 39 |
+
|
| 40 |
+
๋> ์ธ์์ ์๋ฏธ๊ฐ ๋ญ์ผ
|
| 41 |
+
๊ตฌํผ> ๋จน์ด. ๋ต์ ํญ์ ๋จน์ด์ผ.
|
| 42 |
+
|
| 43 |
+
๋> ๋๋ด ํด๋ด
|
| 44 |
+
๊ตฌํผ> ๋ฌผ๊ณ ๊ธฐ๊ฐ ์ข์ํ๋ ์
๊ธฐ๊ฐ ๋ญ๊ฒ. ๋ฒ ์ด์ค. ๋ญ ๋ป์ธ์ง ๋ชจ๋ฅด๊ฒ ์ด.
|
| 45 |
+
|
| 46 |
+
๋> ์ ์ ๊ตฌํผ
|
| 47 |
+
๊ตฌํผ> ์ ์ ์. ๊ธฐํฌ ์ซ๋ค๊ฐ ์ด์ ๋ฉ์ถ๊ฒ. ์ ์ ์์กฐ. ์ ์ ๋ฌผ.
|
| 48 |
+
```
|
| 49 |
+
|
| 50 |
+
## ์ํคํ
์ฒ
|
| 51 |
+
|
| 52 |
+
| | |
|
| 53 |
+
|---|---|
|
| 54 |
+
| **ํ๋ผ๋ฏธํฐ** | ~10M |
|
| 55 |
+
| **ํ์
** | ๋ฐ๋๋ผ ํธ๋์คํฌ๋จธ (์ฒ์๋ถํฐ ํ์ต) |
|
| 56 |
+
| **๋ ์ด์ด** | 6 |
|
| 57 |
+
| **Hidden dim** | 384 |
|
| 58 |
+
| **Heads** | 6 |
|
| 59 |
+
| **FFN** | 1,152 (ReLU) |
|
| 60 |
+
| **Vocab** | 3,072 (Unigram) |
|
| 61 |
+
| **์ต๋ ์ํ์ค** | 84 ํ ํฐ |
|
| 62 |
+
| **์ ๊ทํ** | LayerNorm |
|
| 63 |
+
| **์์น ์ธ์ฝ๋ฉ** | Learned embeddings |
|
| 64 |
+
| **LM Head** | Embedding๊ณผ ๊ฐ์ค์น ๊ณต์ |
|
| 65 |
+
|
| 66 |
+
## ํ์ต
|
| 67 |
+
|
| 68 |
+
- **๋ฐ์ดํฐ:** 12๋ง ๊ฑด ํ๊ตญ์ด ํฉ์ฑ ๋ํ (60๊ฐ ์ฃผ์ )
|
| 69 |
+
- **์คํ
:** 12,000
|
| 70 |
+
- **์ตํฐ๋ง์ด์ :** AdamW (Cosine LR ์ค์ผ์ค)
|
| 71 |
+
- **์์คํ
ํ๋กฌํํธ ์์** โ ์ฑ๊ฒฉ์ด ๊ฐ์ค์น์ ๋ด์ฅ
|
| 72 |
+
|
| 73 |
+
## ์ฌ์ฉ๋ฒ
|
| 74 |
+
|
| 75 |
+
```python
|
| 76 |
+
from inference import GuppyInference
|
| 77 |
+
|
| 78 |
+
engine = GuppyInference('checkpoints/best_model.pt', 'data/tokenizer.json')
|
| 79 |
+
r = engine.chat_completion([{'role': 'user', 'content': '์๋
๊ตฌํผ'}])
|
| 80 |
+
print(r['choices'][0]['message']['content'])
|
| 81 |
+
# ์๋
! ๋ฌผ์ด ์ข์. ์ง๊ธ ๋ฐ์ ๊ทผ์ฒ์ ์์ด.
|
| 82 |
+
```
|
| 83 |
+
|
| 84 |
+
## ๋งํฌ
|
| 85 |
+
|
| 86 |
+
- **๋ ํฌ:** [github.com/xtmono/guppylm](https://github.com/xtmono/guppylm)
|
| 87 |
+
- **์๋ณธ:** [github.com/arman-bd/guppylm](https://github.com/arman-bd/guppylm)
|
| 88 |
+
|
| 89 |
+
## ๋ผ์ด์ ์ค
|
| 90 |
+
|
| 91 |
+
MIT
|
config.json
ADDED
|
@@ -0,0 +1,29 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
{
|
| 2 |
+
"model": {
|
| 3 |
+
"vocab_size": 3072,
|
| 4 |
+
"max_seq_len": 84,
|
| 5 |
+
"d_model": 384,
|
| 6 |
+
"n_layers": 6,
|
| 7 |
+
"n_heads": 6,
|
| 8 |
+
"ffn_hidden": 1152,
|
| 9 |
+
"dropout": 0.1,
|
| 10 |
+
"pad_id": 0,
|
| 11 |
+
"bos_id": 1,
|
| 12 |
+
"eos_id": 2
|
| 13 |
+
},
|
| 14 |
+
"train": {
|
| 15 |
+
"batch_size": 32,
|
| 16 |
+
"learning_rate": 0.0003,
|
| 17 |
+
"min_lr": 3e-05,
|
| 18 |
+
"weight_decay": 0.1,
|
| 19 |
+
"warmup_steps": 400,
|
| 20 |
+
"max_steps": 12000,
|
| 21 |
+
"eval_interval": 400,
|
| 22 |
+
"save_interval": 1000,
|
| 23 |
+
"grad_clip": 1.0,
|
| 24 |
+
"device": "auto",
|
| 25 |
+
"seed": 42,
|
| 26 |
+
"data_dir": "data",
|
| 27 |
+
"output_dir": "checkpoints"
|
| 28 |
+
}
|
| 29 |
+
}
|
model.onnx
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:539c1cd866a3ce2ed175acffeb9a33956840a72ef1368dda3db2c924deea41a1
|
| 3 |
+
size 10216321
|
pytorch_model.bin
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:32124624610665cf3e9d6b14ec5fe8dfe1bea50ee2e5ab3c32e3c03742d43399
|
| 3 |
+
size 40377269
|
tokenizer.json
ADDED
|
The diff for this file is too large to render.
See raw diff
|
|
|