---
license: mit
datasets:
- maywell/koVast
language:
- ko
base_model:
- blueapple8259/tzem
pipeline_tag: question-answering
library_name: transformers
---
This model is [tzem](https://huggingface.co/blueapple8259/tzem) fine-tuned on instruct data.

## Prompt template

The Korean labels are part of the template and should be kept as-is: `사용자` means "user" and `인공지능` means "AI".

```
**사용자:** {prompt}
**인공지능:**
```
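As a minimal sketch, the template can be filled in with a small helper. `build_prompt` and the two label constants are hypothetical names used for illustration, not part of the model repository; the helper follows the template exactly as written, including the closing `**` after the AI label.

```python
USER_LABEL = "사용자"        # "user"
ASSISTANT_LABEL = "인공지능"  # "AI"


def build_prompt(prompt: str) -> str:
    """Fill the prompt template with a single user turn (hypothetical helper)."""
    # Surrounding whitespace in the user text is trimmed before formatting.
    return f"**{USER_LABEL}:** {prompt.strip()}\n**{ASSISTANT_LABEL}:**"


print(build_prompt("인터넷 브라우저에 대해 알려줘."))
```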

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "blueapple8259/tzem-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

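# Enter your prompt here; the model card lists Korean (ko) as the language.
prompt = "여기에 프롬프트 입력"

# Wrap the prompt in the template: 사용자 = "user", 인공지능 = "AI".
text = f"**사용자:** {prompt}\n**인공지능:"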

inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=tokenizer.eos_token_id,
    do_sample=True,
    top_p=0.2,
)

output = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(output)
```
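For a causal language model, `generate` returns the prompt tokens followed by the new tokens, so the decoded string repeats the prompt. The sketch below shows one way to keep only the reply. `extract_reply` is a hypothetical helper, and it assumes the reply is the text after the first AI label and that a new `**사용자:**` label marks the start of an unwanted extra turn.

```python
ASSISTANT_MARKER = "인공지능:"  # "AI:" label from the prompt template
USER_MARKER = "**사용자:**"     # "user:" label, treated as the start of a new turn


def extract_reply(decoded: str) -> str:
    """Return the text after the first AI label (hypothetical helper)."""
    _, sep, tail = decoded.partition(ASSISTANT_MARKER)
    if not sep:
        # No label found: fall back to the whole decoded string.
        return decoded.strip()
    # Drop the closing "**" of the label plus leading spaces and newlines.
    tail = tail.lstrip("* \n")
    # If the model begins another user turn, cut the reply there.
    reply, _, _ = tail.partition(USER_MARKER)
    return reply.strip()


sample = "**사용자:** 안녕\n**인공지능:** 안녕하세요.\n**사용자:** 또"
print(extract_reply(sample))
```

Because leading asterisks are stripped, a reply that genuinely starts with a `*` bullet would lose that marker; slicing the generated token IDs by the input length before decoding avoids this.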

### Examples

```text
**사용자:** 인터넷 브라우저에 대해 알려줘.
**인공지능:** 인터넷 브라우저는 인터넷에서 정보를 검색하고, 다른 사용자와 소통하는 데 사용되는 소프트웨어입니다.
```

In English: the user asks "Tell me about internet browsers." and the model answers "An internet browser is software used to search for information on the internet and to communicate with other users."

```text
**사용자:** 건강을 유지하기 위한 세 가지 팁을 알려주세요.
**인공지능:** 1. 충분한 수면을 취하세요.
2. 건강한 식단을 섭취하세요.
3. 규칙적으로 운동하세요.
```

In English: the user asks "Please give me three tips for staying healthy." and the model answers "1. Get enough sleep. 2. Eat a healthy diet. 3. Exercise regularly."

## Datasets

- [maywell/koVast](https://huggingface.co/datasets/maywell/koVast)

- [KoAlpaca](https://raw.githubusercontent.com/Beomi/KoAlpaca/refs/heads/main/ko_alpaca_data.json) - entries containing code or tables were excluded
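The card does not say how entries with code or tables were detected. The sketch below shows one possible filter under stated assumptions: code is recognised by a Markdown code fence, tables by a Markdown table row, and records are Alpaca-style dictionaries with `instruction`, `input`, and `output` fields. None of this is confirmed as the method actually used.

```python
import re

CODE_FENCE = "`" * 3  # Markdown code fence (assumed marker for code)
TABLE_ROW = re.compile(r"^\s*\|.+\|\s*$", re.MULTILINE)  # e.g. "| a | b |"


def has_code_or_table(text: str) -> bool:
    """Heuristic check for a code fence or a Markdown table row."""
    return CODE_FENCE in text or TABLE_ROW.search(text) is not None


def filter_records(records: list[dict]) -> list[dict]:
    """Keep records whose string fields contain neither code nor tables."""
    return [
        record for record in records
        if not any(
            has_code_or_table(value)
            for value in record.values()
            if isinstance(value, str)
        )
    ]


# Tiny in-memory sample; the field names are assumed, not taken from the card.
sample = [
    {"instruction": "Give three health tips.", "input": "", "output": "1. Sleep well."},
    {"instruction": "Make a table.", "input": "", "output": "| a | b |\n|---|---|"},
    {"instruction": "Write code.", "input": "",
     "output": CODE_FENCE + "python\nprint(1)\n" + CODE_FENCE},
]
kept = filter_records(sample)
print(len(kept))
```

The same function could be applied to the list returned by `json.load` on the downloaded KoAlpaca file, provided that file is a JSON array of such records.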