---
license: mit
language:
  - ko
tags:
  - fish
  - character
  - tiny-llm
  - text-generation
  - from-scratch
  - korean
pipeline_tag: text-generation
---

<p align="center">
  <img src="assets/guppy.png" alt="GuppyLM" width="300"/>
</p>

<p align="center">
  <a href="https://github.com/xtmono/guppylm"><img src="https://img.shields.io/badge/GitHub-guppylm-181717?logo=github" alt="GitHub"/></a>&nbsp;
  <a href="https://colab.research.google.com/github/xtmono/guppylm/blob/main/use_guppylm.ipynb"><img src="https://img.shields.io/badge/Open_in-Colab-F9AB00?logo=googlecolab" alt="Colab"/></a>
  <br/><br/>
  <a href="https://xtmono.github.io/guppylm/"><img src="https://img.shields.io/badge/Try_in-Browser-64ffda?style=for-the-badge&logo=webassembly&logoColor=white" alt="Browser Demo"/></a>
</p>

# GuppyLM: ~10M-Parameter Korean Fish Chat Model

GuppyLM is a ~10M-parameter Korean LLM that plays the role of a fish named Guppy (구피).
It is a Korean localization of the English original, [arman-bd/guppylm](https://github.com/arman-bd/guppylm).

## Examples

```
나> 안녕 구피
  Me> Hi Guppy
구피> 안녕! 물이 좋아. 지금 바위 근처에 있어.
  Guppy> Hi! The water is nice. I'm near the rock right now.

나> 배고파?
  Me> Are you hungry?
구피> 응 항상. 지금 바로 위로 올라갈게. 다 먹을 거야.
  Guppy> Yes, always. I'll swim up to the top right now. I'm going to eat it all.

나> 인생의 의미가 뭐야
  Me> What is the meaning of life
구피> 먹이. 답은 항상 먹이야.
  Guppy> Food. The answer is always food.

나> 농담 해봐
  Me> Tell me a joke
구피> 물고기가 좋아하는 악기가 뭐게. 베이스. 뭔 뜻인지 모르겠어.
  Guppy> Guess what a fish's favorite instrument is. The bass. I don't know what that means.

나> 잘 자 구피
  Me> Good night Guppy
구피> 응 잘 자. 기포 쫓다가 이제 멈출게. 잘 자 수조. 잘 자 물.
  Guppy> Okay, good night. I was chasing bubbles, but I'll stop now. Good night tank. Good night water.
```

## Architecture

| | |
|---|---|
| **Parameters** | ~10M |
| **Type** | Vanilla transformer (trained from scratch) |
| **Layers** | 6 |
| **Hidden dim** | 384 |
| **Heads** | 6 |
| **FFN** | 1,152 (ReLU) |
| **Vocab** | 3,072 (Unigram) |
| **Max sequence** | 84 tokens |
| **Normalization** | LayerNorm |
| **Positional encoding** | Learned embeddings |
| **LM Head** | Weights tied with the embedding |
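
As a rough sanity check, the ~10M figure can be reproduced from the numbers in the table. The sketch below assumes a standard transformer block layout that this README does not spell out: bias terms on every linear layer, two LayerNorms per block, and one final LayerNorm. The real model may differ in those details, so treat the total as an estimate.

```python
# Values taken from the architecture table.
VOCAB, D_MODEL, N_LAYERS, D_FFN, MAX_SEQ = 3072, 384, 6, 1152, 84


def linear(n_in, n_out):
    """Parameter count of a dense layer with bias (bias is an assumption)."""
    return n_in * n_out + n_out


token_emb = VOCAB * D_MODEL               # shared with the LM head, so counted once
pos_emb = MAX_SEQ * D_MODEL               # learned positional embeddings
attention = 4 * linear(D_MODEL, D_MODEL)  # Q, K, V and output projections
ffn = linear(D_MODEL, D_FFN) + linear(D_FFN, D_MODEL)
layer_norms = 2 * (2 * D_MODEL)           # assumed: two LayerNorms (scale + shift) per block
per_layer = attention + ffn + layer_norms
final_norm = 2 * D_MODEL                  # assumed: one LayerNorm before the LM head

total = token_emb + pos_emb + N_LAYERS * per_layer + final_norm
print(f"{total:,}")  # 10,087,680
```

The number of heads does not appear in the calculation because splitting the 384-dimensional projections into 6 heads does not change how many weights they hold.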

## Training

- **Data:** 120,000 synthetic Korean conversations (60 topics)
- **Steps:** 12,000
- **Optimizer:** AdamW (cosine LR schedule)
- **No system prompt:** the personality is baked into the weights
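
To illustrate what a cosine LR schedule over 12,000 steps looks like, here is a minimal pure-Python sketch. Only the step count comes from this README; the warmup length, peak rate, and minimum rate are placeholder assumptions, and the actual training script may use different values or no warmup at all.

```python
import math

TOTAL_STEPS = 12_000  # from the training summary
WARMUP_STEPS = 200    # assumption: not stated in this README
PEAK_LR = 3e-4        # assumption
MIN_LR = 3e-5         # assumption


def cosine_lr(step):
    """Learning rate at a given step: linear warmup, then cosine decay to MIN_LR."""
    if step < WARMUP_STEPS:
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    progress = min(1.0, (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS))
    return MIN_LR + 0.5 * (PEAK_LR - MIN_LR) * (1 + math.cos(math.pi * progress))


# Print the rate at a few points along the run.
for step in (0, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, f"{cosine_lr(step):.2e}")
```

The rate peaks at the end of warmup and then falls smoothly, so the largest updates happen early and the final steps make only small adjustments.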

## Usage

```python
from inference import GuppyInference

engine = GuppyInference('checkpoints/best_model.pt', 'data/tokenizer.json')
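# Send one user turn ('안녕 구피' means "Hi Guppy").
r = engine.chat_completion([{'role': 'user', 'content': '안녕 구피'}])
# The reply text is stored under choices[0].message.content.
print(r['choices'][0]['message']['content'])
# 안녕! 물이 좋아. 지금 바위 근처에 있어.
# ("Hi! The water is nice. I'm near the rock right now.")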
```
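
If only the reply text is needed, the call above can be wrapped in a small helper. This is a sketch, not part of the repository: `ask` and `EchoEngine` are hypothetical names, and `EchoEngine` is a stand-in that only mimics the response shape shown above so the helper can be tried without a checkpoint.

```python
def ask(engine, text):
    """Send a single user message and return only the reply text."""
    response = engine.chat_completion([{'role': 'user', 'content': text}])
    return response['choices'][0]['message']['content']


class EchoEngine:
    """Hypothetical stand-in for GuppyInference, used only to exercise ask()."""

    def chat_completion(self, messages):
        last = messages[-1]['content']
        reply = {'role': 'assistant', 'content': f'blub: {last}'}
        return {'choices': [{'message': reply}]}


print(ask(EchoEngine(), '안녕 구피'))
```

With the real model, `ask(engine, '안녕 구피')` would take the `GuppyInference` instance created above in place of `EchoEngine()`.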

## Links

- **Repo:** [github.com/xtmono/guppylm](https://github.com/xtmono/guppylm)
- **Original:** [github.com/arman-bd/guppylm](https://github.com/arman-bd/guppylm)

## License

MIT