---
language:
- ko
library_name: transformers
---
[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)
# QuantFactory/eagle-3b-preview-GGUF
This is a quantized version of [etri-lirs/eagle-3b-preview](https://huggingface.co/etri-lirs/eagle-3b-preview) created using llama.cpp.
# Original Model Card
# EAGLE: ETRI's Advanced-lightweight Generative Language Engine
(๊ณผ๊ฑฐ์— eGPT๋กœ ๋ถˆ๋ ธ์œผ๋ฉฐ, 2024.11.14 ์— ์ด๋ฆ„์„ ๋ณ€๊ฒฝํ•˜์˜€์Šต๋‹ˆ๋‹ค. ์ถ”ํ›„ ๋ฆด๋ฆฌ์ฆˆ๋˜๋Š” ๋ชจ๋ธ์˜ prefix๋Š” egpt- ๋Œ€์‹  eagle-๋กœ ๋ณ€๊ฒฝ๋ฉ๋‹ˆ๋‹ค)
__๋ณธ ๋ชจ๋ธ์€ ์‚ฌ์ „ํ•™์Šต๋งŒ ์ˆ˜ํ–‰๋œ ๋ชจ๋ธ์ด๋ฉฐ, ๋ณ„๋„์˜ Instruction Tuning ๋“ฑ์ด ์ ์šฉ๋˜์ง€ ์•Š์€ ๊ธฐ์ดˆ ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค. ์ฑ—๋ด‡ ์Šคํƒ€์ผ์˜ ์ž…์ถœ๋ ฅ์ด ํ•„์š”ํ•œ ๊ฒฝ์šฐ, ๋ณ„๋„์˜ ๋ฏธ์„ธ์กฐ์ •์„ ๋ฐ˜๋“œ์‹œ ์ˆ˜ํ–‰ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.__
## ๋ชจ๋ธ ์ •๋ณด
3.1B Decoder-only, Causal ์–ธ์–ด๋ชจ๋ธ. ์ˆ˜ํ•™, ์ •๋Ÿ‰ ์ถ”๋ก ์„ ๋น„๋กฏํ•œ STEM ๋ถ„์•ผ์— ํŠนํ™”๋œ ์†Œ๊ทœ๋ชจ ์–ธ์–ด๋ชจ๋ธ์„ ์ง€ํ–ฅํ•ฉ๋‹ˆ๋‹ค.
๋ฒ”์šฉ ์–ธ์–ด๋ชจ๋ธ์˜ ์—ญํ• ์„ ๋ชฉํ‘œ๋กœํ•˜์ง€๋Š” ์•Š๊ธฐ์—, ํ†ต์ƒ์˜ ์ดํ•ด ๊ด€๋ จ ๋ฒ”์šฉ ํƒœ์Šคํฌ ํ‰๊ฐ€(e.g. hellaswag, sentineg ๋“ฑ)์—๋Š” ๋‚ฎ์€ ์„ฑ๋Šฅ์ด ๋‚˜ํƒ€๋‚  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
ํ•™์Šต ๋ฐ์ดํ„ฐ ๋ณ€๊ฒฝ ๋ฐ ํ•™์Šต ๋ฐฉ๋ฒ• ์ˆ˜์ •, ๊ฐœ์„ ์œผ๋กœ ์ธํ•ด ๋ณธ ๋ชจ๋ธ์€ ๋น„์ •๊ธฐ์ ์œผ๋กœ ์—…๋ฐ์ดํŠธ ๋  ์ˆ˜ ์žˆ์Œ์„ ๋ฏธ๋ฆฌ ์•Œ๋ ค๋“œ๋ฆฝ๋‹ˆ๋‹ค.
Tokenizer๋Š” LLaMa์˜ ๊ตฌ์„ฑ๊ณผ ์œ ์‚ฌํ•˜๊ฒŒ byte-fallbacked BPE + digit ๋ถ„๋ฆฌ ๊ตฌ์„ฑ์„ ๊ฐ€์ง€๋‚˜, BOS/EOS(e.g. ```<s>,</s>```) ํ† ํฐ์ด ๋ชจ๋‘ EOS(```</s>```)๋กœ ํ†ต์ผ๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค. ํ† ํฌ๋‚˜์ด์ € ์„ค์ •์—์„œ PAD ํ† ํฐ์€ ๋ณ„๋„๋กœ ์ง€์ •๋˜์–ด ์žˆ์ง€ ์•Š์œผ๋‚˜, Byte-level BPE์˜ ํŠน์„ฑ์ƒ ```<unk>``` ์‹ฌ๋ณผ์ด ์‚ฌ์šฉ๋˜์ง€ ์•Š์œผ๋ฏ€๋กœ, ๋ฏธ์„ธ์กฐ์ • ๋‹จ๊ณ„์—์„œ๋Š” ```<unk>``` ํ† ํฐ์„ PAD ํ† ํฐ์œผ๋กœ ์ง€์ •ํ•˜์—ฌ ํ™œ์šฉํ•  ๊ฒƒ์„ ๊ถŒ์žฅํ•ฉ๋‹ˆ๋‹ค.
LLaMA ํ˜ธํ™˜ ์•„ํ‚คํ…์ณ๋กœ ๊ตฌ์„ฑ๋˜์–ด ์žˆ์œผ๋ฉฐ, A100 80GB PCIE * 8์žฅ์—์„œ ์•ฝ 720B tokens๋ฅผ from-scratch๋กœ ์‚ฌ์ „ ํ•™์Šตํ•˜์—ฌ ํš๋“๋œ ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค.
## ์—…๋ฐ์ดํŠธ ๊ธฐ๋ก/Update log
| ๋‚ ์งœ | ๋ฒ„์ „(git tags, revision ID) | ์„ธ๋ถ€ ์‚ฌํ•ญ |
| ----------- | ---- | --------- |
| 2024.10.28 | v24.10 | (ํ˜„์žฌ๋ฒ„์ „) ์ฒซ๋ฒˆ์งธ ํผ๋ธ”๋ฆญ ๋ฆด๋ฆฌ์ฆˆ ํ›„๋ณด. ์•ฝ 720B tokens ํ•™์Šต |
## Acknowledgement
* This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (RS-2023-00216011, Development of artificial complex intelligence for conceptually understanding and inferring like human)
## ์ œํ•œ์  ๋ชจ๋ธ ์ ‘๊ทผ ๋ฐ, ๋ชจ๋ธ ์ ‘๊ทผ ํ—ˆ๊ฐ€์™€ ๊ด€๋ จํ•œ ๊ฐœ์ธ์ •๋ณด ์ˆ˜์ง‘ ๋ฐ ์‚ฌ์šฉ ์•ˆ๋‚ด/Information on Collection and Use of Personal Information for Gated Model Access
__๋ณธ ๋ชจ๋ธ์€ ์—ฐ๊ตฌ์™€ ๊ต์œก ๋ชฉ์ ์œผ๋กœ๋งŒ ์‚ฌ์šฉ__ ๋  ์ˆ˜ ์žˆ์œผ๋ฉฐ, ํ˜„์žฌ ๋ณ„๋„์˜ ์Šน์ธ ์—†์ด, Huggingface ๊ณ„์ •์œผ๋กœ ๋กœ๊ทธ์ธ ํ›„ ์Šน์ธ ์š”์ฒญ์„ ์ˆ˜ํ–‰ํ•˜์‹œ๋ฉด ์ž๋™์œผ๋กœ ๋ชจ๋ธ์„ ๋ฐ›์œผ์‹ค ์ˆ˜ ์žˆ๊ฒŒ ๋ฉ๋‹ˆ๋‹ค.
๋ชจ๋ธ ์–ต์„ธ์Šค์™€ ๊ด€๋ จํ•ด์„œ ๋ฌธ์˜ ์‚ฌํ•ญ์ด ์žˆ์œผ์‹œ๋ฉด jhshin82 __at__ etri.re.kr (__at__์„ @์œผ๋กœ ์น˜ํ™˜)๋กœ ๋ฌธ์˜ํ•˜์‹œ๋ฉด ๋ฉ๋‹ˆ๋‹ค.
๋ณธ ๋ชจ๋ธ๊ณผ ๊ด€๋ จํ•ด ์‚ฌํšŒ์ , ๋ฒ•์  ๋ฌธ์ œ๊ฐ€ ๋ฐœ์ƒํ•  ๊ฒฝ์šฐ ๋ชจ๋ธ์˜ ์‚ฌ์šฉ์„ ์ œํ•œํ•˜๊ณ , ๋ฐฐํฌ๋ฅผ ์ฒ ํšŒํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด๋ฅผ ์œ„ํ•ด ๋ชจ๋ธ ์ ‘๊ทผ ํ—ˆ๊ฐ€์— ์‚ฌ์šฉ๋œ ์ด๋ฉ”์ผ ์ฃผ์†Œ๋ฅผ ๋‹ค์Œ๊ณผ ๊ฐ™์ด ์ˆ˜์ง‘, ๋ณด์œ , ์ด์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
### ๊ฐœ์ธ์ •๋ณด ์ˆ˜์ง‘๋™์˜/Concent to collection of Personal Information
๋ณธ ๋ชจ๋ธ์˜ ์‚ฌ์šฉ๊ณผ ๊ด€๋ จ, ๋ฐฐํฌ/์‚ฌ์šฉ ์ œํ•œ/์ฒ ํšŒ, ๊ทธ ์™ธ ์‚ฌ์šฉ์ž์˜ ์ด์ต์— ๊ด€๊ณ„๋œ ๋ผ์ด์„ ์Šค ๋ณ€๊ฒฝ ์‹œ ์ด๋ฅผ ํ†ต์ง€ํ•˜๊ธฐ ์œ„ํ•ด, ์•„๋ž˜์™€ ๊ฐ™์ด ๊ฐœ์ธ์ •๋ณด๋ฅผ ์ˆ˜์ง‘, ์ด์šฉํ•ฉ๋‹ˆ๋‹ค.
| ์ˆ˜์ง‘ ๋ชฉ์  | ์ˆ˜์ง‘ ํ•ญ๋ชฉ | ๋ณด์œ , ์ด์šฉ๊ธฐ๊ฐ„ |
|----------------- | ------------------------------ | ---------------- |
| ๋ชจ๋ธ์˜ ์‚ฌ์šฉ์ œํ•œ/์ฒ ํšŒ ์š”์ฒญ ๋ชฉ์ | ์ด๋ฉ”์ผ ์ฃผ์†Œ, huggingface hub ID | ๋ณธ ๋ชจ๋ธ์˜ ๊ณต๊ฐœ ๊ธฐ๊ฐ„ ๋ฐ ์ด์šฉ ๋ชฉ์  ๋‹ฌ์„ฑ ์‹œ |
| ๋ชจ๋ธ์˜ ์‚ฌ์šฉ ๋ผ์ด์„ ์Šค ๋“ฑ ๋ณ€๊ฒฝ ์•ˆ๋‚ด| ์ด๋ฉ”์ผ ์ฃผ์†Œ, huggingface hub ID | ๋ณธ ๋ชจ๋ธ์˜ ๊ณต๊ฐœ ๊ธฐ๊ฐ„ ๋ฐ ์ด์šฉ ๋ชฉ์  ๋‹ฌ์„ฑ ์‹œ|
๋ณธ ๋ชจ๋ธ์— ๋Œ€ํ•œ ์ ‘๊ทผ ์š”์ฒญ์„ ์ˆ˜ํ–‰ํ•˜๊ณ , ๋ชจ๋ธ์— ์ ‘๊ทผํ•˜์‹œ๋Š” ํ–‰์œ„๋Š” ์•„๋ž˜์— ์•ˆ๋‚ด๋œ ์•ˆ๋‚ด์‚ฌํ•ญ, ๋ณธ ๋ชจ๋ธ์˜ ํ•œ๊ณ„, ์ฑ…์ž„์žˆ๋Š” AI ์—ฐ๊ตฌ์— ๋Œ€ํ•œ ์ •๋ณด, ๊ฐœ์ธ์ •๋ณด ์ˆ˜์ง‘/์ด์šฉ์— ๋™์˜ํ•˜์‹  ๊ฒƒ์œผ๋กœ ๊ฐ„์ฃผํ•ฉ๋‹ˆ๋‹ค. ์‚ฌ์šฉ์ž๋Š” ๋™์˜๋ฅผ ๊ฑฐ๋ถ€ํ•˜์‹ค ๊ถŒ๋ฆฌ๊ฐ€ ์žˆ์œผ๋ฉฐ, ๋™์˜๋ฅผ ๊ฑฐ๋ถ€ํ•˜์‹ค ๊ฒฝ์šฐ ๋ชจ๋ธ ์‚ฌ์šฉ์ด ์ œํ•œ๋˜๋ฉฐ, ์ด์— ๊ด€๋ จํ•œ ์‚ฌ์šฉ, ๊ฒฐ๊ณผ์— ๋Œ€ํ•œ ์ฑ…์ž„์€ ์‚ฌ์šฉ์ž์—๊ฒŒ ์žˆ์Œ์„ ์•Œ๋ ค๋“œ๋ฆฝ๋‹ˆ๋‹ค. ์‚ฌ์šฉ ํ›„ ๋™์˜ ์ฒ ํšŒ, ๊ฐœ์ธ์ •๋ณด ํ๊ธฐ์— ๋Œ€ํ•œ ์‚ฌํ•ญ์€ ์ƒ๊ธฐ ์•ˆ๋‚ด๋œ ๋ฉ”์ผ ์ฃผ์†Œ ๋˜๋Š” Community tab์„ ํ†ตํ•ด์„œ ์š”์ฒญํ•˜์‹ค ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
## ๋ชจ๋ธ์˜ ํ•œ๊ณ„, ์ฑ…์ž„์žˆ๋Š” AI ์—ฐ๊ตฌ๋ฅผ ์œ„ํ•œ ๊ด€๋ จ ์ •๋ณด ์•ˆ๋‚ด
๋ณธ ๋ชจ๋ธ์˜ ๊ฐœ๋ฐœ๊ณผ ๊ด€๋ จํ•œ ๊ฐœ๋ฐœ์ž ๋ฐ ์กฐ์ง์€ ์ฑ…์ž„์žˆ๋Š” AI ์—ฐ๊ตฌ๋ฅผ ์ค€์ˆ˜ํ•˜๊ณ ์ž ๋…ธ๋ ฅํ•˜๊ณ  ์žˆ์œผ๋ฉฐ, ์ด์™€ ๊ด€๋ จํ•ด AI ์—ฐ๊ตฌ์— ์‚ฌ์šฉ๋˜๋Š” ์ž…์ถœ๋ ฅ ๋ฐ์ดํ„ฐ ๋‚ด ํฌํ•จ๋œ ์š•์„ค, ์Œ๋ž€, ์ •์น˜์  ๋‚ด์šฉ ๋ฐ ๊ธฐํƒ€ ๊ฑฐ์นœ ์–ธ์–ด์— ๋Œ€ํ•œ ์ฒ˜๋ฆฌ๋ฅผ ์ˆ˜ํ–‰ํ•˜๊ณ ์ž ๋…ธ๋ ฅํ•˜๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.
๊ทธ๋Ÿผ์—๋„ ๋ถˆ๊ตฌํ•˜๊ณ , ์›์‹œ ์›น ํ…์ŠคํŠธ ๋ฐ์ดํ„ฐ์˜ ํŠน์„ฑ ์ƒ ์ด๋ฅผ ๊ธฐ๋ฐ˜์œผ๋กœ ํ•ด ํ•™์Šต๋œ ๋ณธ ์ƒ์„ฑ ์–ธ์–ด ๋ชจ๋ธ์€ ๊ฒฝ๋„๋œ ์‚ฌ์ƒ์„ ํฌํ•จํ•˜๊ฑฐ๋‚˜, ์‚ฌํšŒ์ ์œผ๋กœ ์šฉ์ธ๋  ์ˆ˜ ์—†๋Š” ํ…์ŠคํŠธ๋ฅผ ์ƒ์„ฑํ•  ์ˆ˜ ์žˆ์œผ๋ฉฐ, ๋‹ค๋ฅธ ์–ธ์–ด ๋ชจ๋ธ๊ณผ ๋งˆ์ฐฌ๊ฐ€์ง€๋กœ ํŠน์ • ํ”„๋กฌํ”„ํŠธ์™€ ๊ณต๊ฒฉ์ ์ธ ์ฝ˜ํ…์ธ ๊ฐ€ ๋ฐ˜ํ™˜๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
์ด๋ฅผ ํฌํ•จ, ๋ณธ ๋ชจ๋ธ์˜ ์ถœ๋ ฅ/์ƒ์„ฑ ๊ฒฐ๊ณผ์™€ ๊ด€๋ จํ•œ ๋‚ด์šฉ์€ ๊ฐœ๋ฐœ์ž ๋ฐ ๊ฐœ๋ฐœ์ž๊ฐ€ ์†ํ•œ ์กฐ์ง์˜ ์‚ฌ์ƒ, ์˜๋„์™€ ์ „ํ˜€ ๊ด€๋ จ์ด ์—†์Œ์„ ์•Œ๋ ค๋“œ๋ฆฝ๋‹ˆ๋‹ค.
ํ…Œ์ŠคํŠธ์ค‘์— ๋ฐœ์ƒํ•œ ๋น„์ •์ƒ์ ์ธ ํ˜น์€ ์‚ฌํšŒ์ ์œผ๋กœ ์šฉ์ธ๋˜์ง€ ์•Š๋Š” ํ…์ŠคํŠธ๊ฐ€ ์ƒ์„ฑ๋œ ๊ฒฝ์šฐ jhshin82 __at__ etri.re.kr๋กœ (__at__์„ @๋กœ ์น˜ํ™˜) ์ถœ๋ ฅ ์œ ๋„์— ์‚ฌ์šฉ๋œ ์ž…๋ ฅ๋ฌธ(ํ”„๋กฌํ”„ํŠธ), ์‚ฌ์šฉ๋œ ์ƒ˜ํ”Œ๋ง ๊ธฐ๋ฒ• ๋ฐ ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐ(์˜ˆ: top-p=0.8, temperature, repetition-penalty ๋“ฑ), ์ด๋ฅผ ํ†ตํ•ด ์ƒ์„ฑ๋œ ์ถœ๋ ฅ ๊ฒฐ๊ณผ๋ฅผ ํ•จ๊ป˜ ๋ณด๋‚ด์ฃผ์‹œ๋ฉด, ์ด๋ฅผ ์–ต์ œํ•˜๊ธฐ ์œ„ํ•œ ๋…ธ๋ ฅ์„ ๊ธฐ์šธ์ด๋„๋ก ํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค.
## Evaluations
### KOBEST Evaluation of the Pretrained Model
Evaluation was performed with EleutherAI/lm-evaluation-harness v0.4.2 on the KoBEST (Kim et al., 2022) benchmark, using zero-shot, 5-shot, and 10-shot tests without fine-tuning.
(KOBEST scores from lm-evaluation-harness vary by version, so results obtained with a recent lm-evaluation-harness (version 0.4.2 or later) are presented separately below.)
| Zero-shot performance | KB-BOOLQ (F1) | KB-COPA (F1) | KB-HELLASWAG (F1) | KB-SENTINEG (F1) | KB-WIC (F1) | Average (F1) |
|---------------------------------|---------------|--------------|-------------------|------------------|-------------|--------------|
| eagle-3b-preview (v24.08) | 0.3393 | 0.5353 | 0.3446 | **0.5653** | 0.3280 | 0.3994 |
| eagle-3b-preview (v24.09) | 0.3343 | 0.5367 | 0.3383 | 0.4991 | 0.3280 | 0.3917 |
| eagle-3b-preview (v24.10) | **0.3778** | 0.5648 | 0.3369 | 0.4763 | 0.3280 | 0.4092 |
| eagle-3b-preview (v24.11) | 0.3651 | **0.5893** | **0.3551** | 0.4473 | 0.3280 | **0.4101** |
| 5-shot performance | KB-BOOLQ (F1) | KB-COPA (F1) | KB-HELLASWAG (F1) | KB-SENTINEG (F1) | KB-WIC (F1) | Average (F1) |
|----------------------------------|---------------|--------------|-------------------|------------------|-------------|--------------|
| eagle-3b-preview (v24.08) | 0.4680 | 0.5580 | 0.3332 | 0.4950 | 0.4830 | 0.4795 |
| eagle-3b-preview (v24.09) | 0.5087 | 0.5599 | 0.3257 | 0.4207 | 0.4212 | 0.4681 |
| eagle-3b-preview (v24.10) | **0.5207** | 0.5791 | 0.3511 | **0.5959** | 0.4712 | **0.5078** |
| eagle-3b-preview (v24.11) | 0.4753 | **0.5924** | **0.3592** | 0.5810 | **0.4930** | 0.5024 |
| 10-shot performance | KB-BOOLQ (F1) | KB-COPA (F1) | KB-HELLASWAG (F1) | KB-SENTINEG (F1) | KB-WIC (F1) | Average (F1) |
|----------------------------------|---------------|--------------|-------------------|------------------|-------------|--------------|
| eagle-3b-preview (v24.08) | 0.4243 | 0.5673 | 0.3364 | 0.4232 | 0.4265 | 0.4465 |
| eagle-3b-preview (v24.09) | 0.5001 | 0.5597 | 0.3377 | 0.3498 | 0.3578 | 0.4432 |
| eagle-3b-preview (v24.10) | **0.5101** | 0.5894 | 0.3675 | 0.5101 | 0.4650 | **0.4994** |
| eagle-3b-preview (v24.11) | 0.4151 | **0.6143** | **0.3718** | **0.5883** | **0.5134** | 0.4963 |
### ์ „์ดํ•™์Šต ๋Šฅ๋ ฅ ํ‰๊ฐ€
์ค€๋น„์ค‘์ž…๋‹ˆ๋‹ค.
| ๋ชจ๋ธ | GSM8k test | ๋น„๊ณ  |
| ---- | ---------- | ---- |
| - | - | - |
## ์‚ฌ์ „ํ•™์Šต์— ์ฐธ์—ฌํ•œ ๋ฐ์ดํ„ฐ์…‹ ์ •๋ณด/Datasets
* FIXME: ํ•™์Šต๋ฐ์ดํ„ฐ ๋ชฉ๋ก ์ˆ˜์ •, ์—…๋ฐ์ดํŠธ ํ•„์š”
์•„๋ž˜์˜ ํ•™์Šต ๋ฐ์ดํ„ฐ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํ•™์Šตํ•˜์˜€์Šต๋‹ˆ๋‹ค:
* [AIHub ๋ฐ์ดํ„ฐ์…‹, MRC, RAW, ๋Œ€ํ™”, ๋ฒˆ์—ญ, ์š”์•ฝ](https://aihub.or.kr)
* [KISTI ๊ตญ๋‚ด๋…ผ๋ฌธ EN, KR ๋ฐ์ดํ„ฐ์…‹](https://aida.kisti.re.kr/)
* [KcBERT v2022.3q ๋„ค์ด๋ฒ„ ๋‰ด์Šค ๋Œ“๊ธ€ ๋ฐ์ดํ„ฐ์…‹](https://huggingface.co/beomi/kcbert-base)
* [๊ตญ๋ฆฝ๊ตญ์–ด์› ๋ชจ๋‘์˜ ๋ง๋ญ‰์น˜(๋ฌธ์–ด, ๊ตฌ์–ด, ์‹ ๋ฌธ, ๋น„์ถœํŒ๋ฌผ, ๊ตญํšŒํšŒ์˜๋ก, ์ผ์ƒ๋Œ€ํ™”, ์˜จ๋ผ์ธ๋Œ€ํ™”, ๋ฉ”์‹ ์ € ๋ง๋ญ‰์น˜)](https://kli.korean.go.kr/)
* [ํ•œ๊ตญ์–ด ์œ„ํ‚คํ”ผ๋””์–ด ๋คํ”„, lovit/ko-wikitext ๋ฐ์ดํ„ฐ์…‹. 20200920.v3 ๋“ฑ korpora ๋ฐ์ดํ„ฐ์…‹์˜ ์‚ฌ์ „ํ•™์Šต์šฉ ๋ง๋ญ‰์น˜ ์ผ๋ถ€](https://ko-nlp.github.io/Korpora/)
* (์˜) SlimPajama-627B (https://huggingface.co/cerebras/SlimPajama-627B)
* (์˜) stack exchange ๋ฐ์ดํ„ฐ์…‹
* (์˜) OpenWebText2
* (์˜) 2020-09-08-arXiv-extracts
* (์˜) PUBMED title abstracts 2019
* THUDM/MathGLM Arithmetic Text Corpus (applied from 23/11/22, https://github.com/THUDM/MathGLM) ๋“ฑ
## How to use
The following code can be used for inference with transformers>=4.28:
```python
import sys

from transformers import (
    AutoTokenizer, AutoModelForCausalLM, GenerationConfig
)


def load_model(mdl_path):
    tokenizer = AutoTokenizer.from_pretrained(mdl_path)
    # The accelerate package must be installed to use the device_map argument.
    model = AutoModelForCausalLM.from_pretrained(mdl_path, device_map="auto",
                                                 torch_dtype="auto")
    return tokenizer, model


if __name__ == '__main__':
    tokenizer, model = load_model("etri-lirs/eagle-3b-preview")
    # print(model.hf_device_map)
    # Adjust the generation options below as needed.
    gen_cfg = GenerationConfig(max_new_tokens=256, min_length=0,
                               max_time=10.0, do_sample=True,
                               top_p=0.9, epsilon_cutoff=3e-4)
    print("** Now Ready to input from stdin.")
    for aline in sys.stdin:
        aline = aline.rstrip("\n\r\t")
        input_cond = tokenizer(aline, add_special_tokens=False,
                               return_tensors="pt").to(model.device)
        outs = model.generate(**input_cond, generation_config=gen_cfg)
        out_str = tokenizer.batch_decode(outs, skip_special_tokens=True,
                                         clean_up_tokenization_spaces=True)
        print(">> " + ' '.join(out_str))
```