Commit 90fee74 (verified) by xtmono · 1 parent: 509bce7

Upload folder using huggingface_hub

Files changed (5)
  1. README.md +91 -0
  2. config.json +29 -0
  3. model.onnx +3 -0
  4. pytorch_model.bin +3 -0
  5. tokenizer.json +0 -0
README.md ADDED
@@ -0,0 +1,91 @@
+ ---
+ license: mit
+ language:
+ - ko
+ tags:
+ - fish
+ - character
+ - tiny-llm
+ - text-generation
+ - from-scratch
+ - korean
+ pipeline_tag: text-generation
+ ---
+
+ <p align="center">
+ <img src="assets/guppy.png" alt="GuppyLM" width="300"/>
+ </p>
+
+ <p align="center">
+ <a href="https://github.com/xtmono/guppylm"><img src="https://img.shields.io/badge/GitHub-guppylm-181717?logo=github" alt="GitHub"/></a>&nbsp;
+ <a href="https://colab.research.google.com/github/xtmono/guppylm/blob/main/use_guppylm.ipynb"><img src="https://img.shields.io/badge/Open_in-Colab-F9AB00?logo=googlecolab" alt="Colab"/></a>
+ <br/><br/>
+ <a href="https://xtmono.github.io/guppylm/"><img src="https://img.shields.io/badge/Try_in-Browser-64ffda?style=for-the-badge&logo=webassembly&logoColor=white" alt="Browser Demo"/></a>
+ </p>
+
+ # GuppyLM: ~10M-Parameter Korean Fish Chat Model
+
+ A ~10M-parameter Korean LLM that plays the role of a fish named Guppy.
+ It is a Korean localization of the English original, [arman-bd/guppylm](https://github.com/arman-bd/guppylm).
+
+ ## Examples
+
+ ```
+ Me> 안녕 구피 (Hi, Guppy)
+ Guppy> 안녕! 물이 좋아. 지금 바위 근처에 있어. (Hi! The water is nice. I'm near the rock right now.)
+
+ Me> 배고파? (Are you hungry?)
+ Guppy> 응 항상. 지금 바로 위로 올라갈게. 다 먹을 거야. (Yes, always. I'll swim up right now. I'll eat it all.)
+
+ Me> 인생의 의미가 뭐야 (What is the meaning of life?)
+ Guppy> 먹이. 답은 항상 먹이야. (Food. The answer is always food.)
+
+ Me> 농담 해봐 (Tell me a joke)
+ Guppy> 물고기가 좋아하는 악기가 뭐게. 베이스. 뭔 뜻인지 모르겠어. (What's a fish's favorite instrument? Bass. I don't know what that means.)
+
+ Me> 잘 자 구피 (Good night, Guppy)
+ Guppy> 응 잘 자. 기포 쫓다가 이제 멈출게. 잘 자 수조. 잘 자 물. (Okay, good night. I was chasing a bubble, but I'll stop now. Good night, tank. Good night, water.)
+ ```
+
+ ## Architecture
+
+ | Property | Value |
+ |---|---|
+ | **Parameters** | ~10M |
+ | **Type** | Vanilla transformer (trained from scratch) |
+ | **Layers** | 6 |
+ | **Hidden dim** | 384 |
+ | **Heads** | 6 |
+ | **FFN** | 1,152 (ReLU) |
+ | **Vocab** | 3,072 (Unigram) |
+ | **Max sequence** | 84 tokens |
+ | **Normalization** | LayerNorm |
+ | **Positional encoding** | Learned embeddings |
+ | **LM head** | Weights shared with the embedding |
+
+ ## Training
+
+ - **Data:** 120,000 synthetic Korean conversations (60 topics)
+ - **Steps:** 12,000
+ - **Optimizer:** AdamW (cosine LR schedule)
+ - **No system prompt:** the personality is baked into the weights
+
+ ## Usage
+
+ ```python
+ from inference import GuppyInference
+
+ engine = GuppyInference('checkpoints/best_model.pt', 'data/tokenizer.json')
+ r = engine.chat_completion([{'role': 'user', 'content': '안녕 구피'}])
+ print(r['choices'][0]['message']['content'])
+ # 안녕! 물이 좋아. 지금 바위 근처에 있어. (Hi! The water is nice. I'm near the rock right now.)
+ ```
+
+ ## Links
+
+ - **Repo:** [github.com/xtmono/guppylm](https://github.com/xtmono/guppylm)
+ - **Original:** [github.com/arman-bd/guppylm](https://github.com/arman-bd/guppylm)
+
+ ## License
+
+ MIT
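
As a sanity check on the "~10M" figure, the README's architecture table can be turned into a rough parameter count. The sketch below is not the repository's code: `count_params` is a hypothetical helper, and it assumes bias terms on every linear layer, two LayerNorms per block, and one final LayerNorm. The tied LM head and learned positional embeddings are the only structural details the README itself confirms.

```python
def count_params(vocab_size, max_seq_len, d_model, n_layers, ffn_hidden):
    """Estimate the parameter count of a small decoder-only transformer."""
    token_emb = vocab_size * d_model           # also serves as the tied LM head
    pos_emb = max_seq_len * d_model            # learned positional embeddings

    # Assumption: Q, K, V and output projections, each d_model x d_model with bias.
    attention = 4 * (d_model * d_model + d_model)
    # Assumption: two linear layers (d_model -> ffn_hidden -> d_model) with bias.
    ffn = (d_model * ffn_hidden + ffn_hidden) + (ffn_hidden * d_model + d_model)
    # Assumption: two LayerNorms per block, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d_model)

    per_layer = attention + ffn + layer_norms
    final_norm = 2 * d_model                   # assumption: one LayerNorm before the head
    return token_emb + pos_emb + n_layers * per_layer + final_norm


total = count_params(vocab_size=3072, max_seq_len=84, d_model=384,
                     n_layers=6, ffn_hidden=1152)
print(f"{total:,} parameters (~{total / 1e6:.2f}M)")
```

Under these assumptions the estimate comes to roughly 10.09M parameters, in line with the "~10M" stated in the model card. Dropping the bias terms or the final LayerNorm would change the total only slightly.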
config.json ADDED
@@ -0,0 +1,29 @@
+ {
+   "model": {
+     "vocab_size": 3072,
+     "max_seq_len": 84,
+     "d_model": 384,
+     "n_layers": 6,
+     "n_heads": 6,
+     "ffn_hidden": 1152,
+     "dropout": 0.1,
+     "pad_id": 0,
+     "bos_id": 1,
+     "eos_id": 2
+   },
+   "train": {
+     "batch_size": 32,
+     "learning_rate": 0.0003,
+     "min_lr": 3e-05,
+     "weight_decay": 0.1,
+     "warmup_steps": 400,
+     "max_steps": 12000,
+     "eval_interval": 400,
+     "save_interval": 1000,
+     "grad_clip": 1.0,
+     "device": "auto",
+     "seed": 42,
+     "data_dir": "data",
+     "output_dir": "checkpoints"
+   }
+ }
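
The `train` block above pairs `learning_rate`, `min_lr`, `warmup_steps`, and `max_steps` with the cosine LR schedule mentioned in the README. One common way those four values fit together is sketched below. This is an assumption about the shape of the schedule (linear warmup, then cosine decay down to `min_lr`), not the repository's training loop, and `lr_at` is a hypothetical name.

```python
import math


def lr_at(step, base_lr=3e-4, min_lr=3e-5, warmup_steps=400, max_steps=12000):
    """Learning rate at a given step: linear warmup, then cosine decay to min_lr."""
    if step < warmup_steps:
        # Assumption: ramp linearly so the last warmup step reaches base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to 1.0 past max_steps.
    progress = min(1.0, (step - warmup_steps) / (max_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (base_lr - min_lr) * cosine


for step in (0, 399, 400, 6200, 12000):
    print(step, f"{lr_at(step):.2e}")
```

With the config's values, the rate peaks at `learning_rate` when warmup ends, passes through the midpoint between the two rates halfway through the decay phase, and ends at `min_lr` on the final step.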
model.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:539c1cd866a3ce2ed175acffeb9a33956840a72ef1368dda3db2c924deea41a1
+ size 10216321
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:32124624610665cf3e9d6b14ec5fe8dfe1bea50ee2e5ab3c32e3c03742d43399
+ size 40377269
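
The two binary files are committed as Git LFS pointers: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. A minimal sketch of reading such a pointer follows. `parse_lfs_pointer` is a hypothetical helper, and the float32 estimate is an assumption, since the checkpoint may hold other data or use a different dtype.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict, converting size to int."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", split on the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])
    return fields


POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:32124624610665cf3e9d6b14ec5fe8dfe1bea50ee2e5ab3c32e3c03742d43399
size 40377269
"""

info = parse_lfs_pointer(POINTER)
# Assumption: weights are stored as 4-byte float32 values with little overhead.
approx_params = info["size"] // 4
print(f"{info['size']:,} bytes -> at most ~{approx_params / 1e6:.2f}M float32 values")
```

If the float32 assumption holds, the size of `pytorch_model.bin` is consistent with the "~10M" parameter count stated in the README.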
tokenizer.json ADDED
The diff for this file is too large to render.