# komt : Korean Multi-task Instruction Tuning
![multi task instruction tuning.jpg](images%2Fmulti%20task%20instruction%20tuning.jpg)

Recently, due to the success of ChatGPT, numerous large language models have emerged in an attempt to catch up with ChatGPT's capabilities. 
However, when it comes to Korean language performance, it has been observed that many models still struggle to provide accurate answers or generate Korean text effectively. 
This study addresses these challenges by introducing a multi-task instruction technique that leverages supervised datasets from various tasks to create training data for Large Language Models (LLMs).

## News or Update
### 2023.12.05
- Released the DPO training code: [dpo_train.py](dpo_train.py)

### 2023.11.29
- komt-mistral-7b-v1-dpo: added a model trained with DPO (Direct Preference Optimization)
> - [davidkim205/komt-mistral-7b-v1-dpo](https://huggingface.co/davidkim205/komt-mistral-7b-v1-dpo/blob/main/README.md)
- In our evaluation, komt-mistral-7b-v1-dpo scored 76.75%, the highest of any komt model to date (gpt-3.5-turbo: 79.45%)
 
### 2023.10.24
- komt-mistral-7b-v1 λͺ¨λΈ μΆ”κ°€
> - [davidkim205/komt-mistral-7b-v1](https://huggingface.co/davidkim205/komt-mistral-7b-v1)
> - [davidkim205/komt-mistral-7b-v1-lora](https://huggingface.co/davidkim205/komt-mistral-7b-v1-lora)
> - [davidkim205/komt-mistral-7b-v1-gguf](https://huggingface.co/davidkim205/komt-mistral-7b-v1-gguf)

### 2023.10.20
- Added the komt-llama-30b-v1 model
> - [davidkim205/komt-llama-30b-v1](https://huggingface.co/davidkim205/komt-llama-30b-v1)
> - [davidkim205/komt-llama-30b-v1-lora](https://huggingface.co/davidkim205/komt-llama-30b-v1-lora)


### 2023.09.27
- Added the following models to the ChatGPT-based evaluation results:
> - naver Cue
> - clova X
> - nlpai-lab/kullm-polyglot-12.8b-v2
> - kfkas/Llama-2-ko-7b-Chat
> - beomi/KoAlpaca-Polyglot-12.8B

### 2023.09.25
- Added the komt-llama2-13b-v1 model
> - [davidkim205/komt-llama2-13b-v1](https://huggingface.co/davidkim205/komt-llama2-13b-v1)
> - [davidkim205/komt-llama2-13b-v1-lora](https://huggingface.co/davidkim205/komt-llama2-13b-v1-lora)
> - [davidkim205/komt-llama2-13b-v1-ggml](https://huggingface.co/davidkim205/komt-llama2-13b-v1-ggml)
### 2023.09.24
- Added instructions for fine-tuning with DeepSpeed
### 2023.09.23
- Added code and installation instructions for using komt with vllm
### 2023.09.22
- Added the model evaluation results table
### 2023.09.20
- Added the option to train finetune_with_lora with either 4-bit or 8-bit quantization
### 2023.09.19
- Added examples, training instructions, and datasets to make the komt-llama2 models easier to use
### 2023.09.17
- Released the komt-llama2-7b-v1 model, trained on an improved multi-task dataset (fixes issues such as the end token occasionally not being applied and overly long answers)
- [davidkim205/komt-llama2-7b-v1](https://huggingface.co/davidkim205/komt-llama2-7b-v1)
- [davidkim205/komt-llama2-7b-v1-lora](https://huggingface.co/davidkim205/komt-llama2-7b-v1-lora)
- [davidkim205/komt-llama2-7b-v1-ggml](https://huggingface.co/davidkim205/komt-llama2-7b-v1-ggml) 
### 2023.08.17
- We are releasing the [davidkim205/komt-Llama-2-13b-hf-lora](https://huggingface.co/davidkim205/komt-Llama-2-13b-hf-lora) and [davidkim205/komt-Llama-2-13b-hf-ggml](https://huggingface.co/davidkim205/komt-Llama-2-13b-hf-ggml) models
### 2023.08.16
- We are releasing the [davidkim205/komt-Llama-2-7b-chat-hf-ggml](https://huggingface.co/davidkim205/komt-Llama-2-7b-chat-hf-ggml) model

## Released Model Checkpoints
### komt-llama2-7b
- [davidkim205/komt-llama2-7b-v1](https://huggingface.co/davidkim205/komt-llama2-7b-v1)
- [davidkim205/komt-llama2-7b-v1-lora](https://huggingface.co/davidkim205/komt-llama2-7b-v1-lora)
- [davidkim205/komt-llama2-7b-v1-ggml](https://huggingface.co/davidkim205/komt-llama2-7b-v1-ggml)
### komt-llama2-13b
- [davidkim205/komt-llama2-13b-v1](https://huggingface.co/davidkim205/komt-llama2-13b-v1)
- [davidkim205/komt-llama2-13b-v1-lora](https://huggingface.co/davidkim205/komt-llama2-13b-v1-lora)
- [davidkim205/komt-llama2-13b-v1-ggml](https://huggingface.co/davidkim205/komt-llama2-13b-v1-ggml)
### komt-llama-30b
- [davidkim205/komt-llama-30b-v1](https://huggingface.co/davidkim205/komt-llama-30b-v1)
- [davidkim205/komt-llama-30b-v1-lora](https://huggingface.co/davidkim205/komt-llama-30b-v1-lora)
### komt-mistral-7b
- [davidkim205/komt-mistral-7b-v1](https://huggingface.co/davidkim205/komt-mistral-7b-v1)
- [davidkim205/komt-mistral-7b-v1-lora](https://huggingface.co/davidkim205/komt-mistral-7b-v1-lora)
- [davidkim205/komt-mistral-7b-v1-gguf](https://huggingface.co/davidkim205/komt-mistral-7b-v1-gguf)
- [davidkim205/komt-mistral-7b-v1-dpo](https://huggingface.co/davidkim205/komt-mistral-7b-v1-dpo)
## Hardware and Software
- nvidia driver : 535.54.03
- CUDA Version: 12.2

## Setup

```
git clone https://github.com/davidkim205/komt.git
cd komt

conda create -n komt python=3.10
conda activate komt

pip install -r requirements.txt

```
## Usage
We provide several ways to use the komt-llama2 models.

### transformers
``` 
from transformers import AutoTokenizer, AutoModelForCausalLM
from transformers import TextStreamer, GenerationConfig

model_name='davidkim205/komt-llama2-7b-v1'
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)
streamer = TextStreamer(tokenizer)

def gen(x):
    generation_config = GenerationConfig(
        temperature=0.8,
        top_p=0.8,
        top_k=100,
        max_new_tokens=512,
        early_stopping=True,
        do_sample=True,
    )
    q = f"### instruction: {x}\n\n### Response: "
    gened = model.generate(
        **tokenizer(
            q,
            return_tensors='pt',
            return_token_type_ids=False
        ).to('cuda'),
        generation_config=generation_config,
        pad_token_id=tokenizer.eos_token_id,
        eos_token_id=tokenizer.eos_token_id,
        streamer=streamer,
    )
    result_str = tokenizer.decode(gened[0])

    start_tag = f"\n\n### Response: "
    start_index = result_str.find(start_tag)

    if start_index != -1:
        result_str = result_str[start_index + len(start_tag):].strip()
    return result_str

print(gen('μ œμ£Όλ„λ₯Ό 1λ°•2일둜 혼자 μ—¬ν–‰ν•˜λ €κ³  ν•˜λŠ”λ° μ—¬ν–‰ μ½”μŠ€λ₯Ό λ§Œλ“€μ–΄μ€˜'))
```
Result:
``` 
### Response: μ œμ£Όλ„λ₯Ό 1λ°•2일둜 혼자 μ—¬ν–‰ν•˜λ €λ©΄ λ‹€μŒκ³Ό 같은 μ—¬ν–‰ μ½”μŠ€λ₯Ό λ§Œλ“€μ–΄ κ³„νšν•  수 μžˆμŠ΅λ‹ˆλ‹€:

1일차:
- μ•„μΉ¨: μ œμ£Όλ„μ˜ μ•„λ¦„λ‹€μš΄ 해변을 κ΅¬κ²½ν•˜κΈ° μœ„ν•΄ 해변에 λ„μ°©ν•˜μ„Έμš”. μΌμΆœμ„ κ°μƒν•˜λ©° μžμ—°μ˜ 아름닀움을 λ§Œλ½ν•˜μ„Έμš”.
- μ˜€ν›„: μ œμ£Όλ„μ˜ λŒ€ν‘œμ μΈ 관광지인 ν•œλΌμ‚°μ„ νƒν—˜ν•˜μ„Έμš”. λ“±μ‚°λ‘œλ₯Ό 따라 μ˜¬λΌκ°€λ©΄μ„œ 경치λ₯Ό 즐기고 μ„€λͺ…을 λ“£μœΌλ©° μ‰¬μš΄ 산책을 μ¦κΈ°μ„Έμš”.
- 저녁: μ œμ£Όλ„μ˜ λ§›μžˆλŠ” μŒμ‹μ μ—μ„œ 저녁을 λ³΄λ‚΄μ„Έμš”. μ‹ μ„ ν•œ ν•΄μ‚°λ¬Όκ³Ό ν–₯μ‹ λ£Œλ‘œ λ§Œλ“  μŒμ‹μ„ λ§›λ³΄λŠ” 것은 μ œμ£Όλ„ μ—¬ν–‰μ˜ μ™„λ²½ν•œ κ²½ν—˜μ΄ 될 κ²ƒμž…λ‹ˆλ‹€.

2일차:
- μ•„μΉ¨: ν•œλΌμ‚° μΌλŒ€λ₯Ό νƒν—˜ν•˜κΈ° μœ„ν•΄ ν•œλΌμ‚° μΌ€μ΄ν”„λ‘œ μ΄λ™ν•˜μ„Έμš”. 이 μΌ€μ΄ν”„λŠ” 등산을 μ¦κΈ°λŠ” μ‚¬λžŒλ“€μ—κ²Œ 졜적의 μ„ νƒμž…λ‹ˆλ‹€. 

```
### text-generation-webui
![text-generation-webui.gif](images%2Ftext-generation-webui.gif)

``` 
# clone the text-generation-webui code
git clone https://github.com/oobabooga/text-generation-webui
cd text-generation-webui/

# create a conda environment
conda create -n text-generation-webui python=3.10
conda activate text-generation-webui

# pip install
pip install -r requirements.txt

# model download
pip install huggingface-hub
python -c "from huggingface_hub import hf_hub_download;print(hf_hub_download(repo_id='davidkim205/komt-llama2-7b-v1-ggml', filename='ggml-model-q4_0.gguf', local_dir='./models/'))"
 
# run the server
python server.py
```
### llama2-webui
![llama2-webui.gif](images%2Fllama2-webui.gif)

https://github.com/liltom-eth/llama2-webui

Git clone llama2-webui and install its requirements. Then, because the model is large, download komt-llama2-7b using git lfs.

``` 
git clone https://github.com/liltom-eth/llama2-webui.git
cd llama2-webui
pip install -r requirements.txt
```
After downloading the model, run the app.
```
sudo apt install git-lfs
git lfs clone https://huggingface.co/davidkim205/komt-llama2-7b-v1

python app.py --backend_type transformers --model_path ./komt-llama2-7b-v1/

```
### llama.cpp 
![llama.cpp-example.gif](images%2Fllama.cpp-example.gif)
```
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
pip install -r requirements.txt

pip install huggingface-hub
python -c "from huggingface_hub import hf_hub_download;print(hf_hub_download(repo_id='davidkim205/komt-llama2-7b-v1-ggml', filename='ggml-model-q4_0.gguf', local_dir='./models/'))"

make -j && ./main -m ./models/ggml-model-q4_0.gguf -p "인삼은 μ–΄λ–€ νš¨κ³Όκ°€ μžˆλŠ”κ°€μš”? ##output:"
```
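To call the quantized model from Python instead of the CLI, the llama-cpp-python bindings can be used. This is a minimal sketch, assuming the gguf file was downloaded to ./models/ as above (the repo itself only shows the CLI):
```
# pip install llama-cpp-python
from llama_cpp import Llama

# Load the 4-bit quantized komt model downloaded above.
llm = Llama(model_path="./models/ggml-model-q4_0.gguf", n_ctx=2048)

# Same instruction template used by the transformers example in this README.
prompt = "### instruction: 인삼은 μ–΄λ–€ νš¨κ³Όκ°€ μžˆλŠ”κ°€μš”?\n\n### Response: "
out = llm(prompt, max_tokens=512)
print(out["choices"][0]["text"])
```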
### llama.cpp with google colab
google colabμ—μ„œ llama.cppλ₯Ό μ‚¬μš©ν•˜μ—¬ komtλ₯Ό μ‚¬μš©ν•˜λŠ” 방법 

https://colab.research.google.com/drive/1uLHXv-6NT7yj4FHECrZezfo5pVL-ht63?usp=sharing


### usage_komt_with_lora
Examples using Python and Jupyter; a minimal sketch of the loading logic follows the list below.
- [usage_komt_with_lora.py](usage_komt_with_lora.py)
- [usage_komt_with_lora.ipynb](usage_komt_with_lora.ipynb)
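At their core, these scripts load the base model and attach the published LoRA adapter. A minimal sketch using the standard peft API (the generation settings here are illustrative assumptions, not the repo's exact code):
```
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftConfig, PeftModel

peft_model_id = "davidkim205/komt-llama2-7b-v1-lora"

# Read the adapter config to find the base model the adapter was trained on.
config = PeftConfig.from_pretrained(peft_model_id)
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Attach the LoRA adapter weights on top of the base model.
model = PeftModel.from_pretrained(model, peft_model_id)

prompt = "### instruction: κ³ μ–‘μ΄λŠ” μ™œ 물을 μ‹«μ–΄ν•˜λ‚˜μš”?\n\n### Response: "
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0]))
```
A sample run of the repo script produces output like the following: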
``` 
$ python infer.py 
Downloading (…)/adapter_config.json: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 528/528 [00:00<00:00, 5.02MB/s]
Downloading (…)lve/main/config.json: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 631/631 [00:00<00:00, 4.96MB/s]
Downloading pytorch_model.bin: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 27.0G/27.0G [04:29<00:00, 100MB/s]
Downloading (…)neration_config.json: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 183/183 [00:00<00:00, 1.36MB/s]
Downloading adapter_model.bin: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 80.1M/80.1M [00:00<00:00, 82.7MB/s]
Downloading (…)okenizer_config.json: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 749/749 [00:00<00:00, 6.66MB/s]
Downloading tokenizer.model: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 500k/500k [00:00<00:00, 111MB/s]
Downloading (…)in/added_tokens.json: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 21.0/21.0 [00:00<00:00, 131kB/s]
Downloading (…)cial_tokens_map.json: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 96.0/96.0 [00:00<00:00, 608kB/s]
/home/david/anaconda3/envs/komt/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:399: UserWarning: `num_beams` is set to 1. However, `early_stopping` is set to `True` -- this flag is only used in beam-based generation modes. You should set `num_beams>1` or unset `early_stopping`. This was detected when initializing the generation config instance, which means the corresponding file may hold incorrect parameterization and should be fixed.
  warnings.warn(
/home/david/anaconda3/envs/komt/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:399: UserWarning: `num_beams` is set to 1. However, `early_stopping` is set to `True` -- this flag is only used in beam-based generation modes. You should set `num_beams>1` or unset `early_stopping`.
  warnings.warn(
<s> ### instruction: κ³ μ–‘μ΄λŠ” μ™œ 물을 μ‹«μ–΄ν•˜λ‚˜μš”?

### Response: κ³ μ–‘μ΄λŠ” μ‚¬λžŒκ³Ό 달리 물을 μ‹«μ–΄ν•©λ‹ˆλ‹€. μ΄λŠ” 물에 λ…Ήμ•„ μžˆλŠ” ν—€μ–΄μ³λ°œκ³Ό 물의 λƒ„μƒˆ λ•Œλ¬Έμž…λ‹ˆλ‹€. κ³ μ–‘μ΄λŠ” ν—€μ–΄μ³λ°œμ΄ 물에 λ…Ήμ•„ 있으면 물을 λ§ˆμ‹œκ³  μ‹Άμ§€ μ•Šμ•„ν•˜λ©°, 물의 λƒ„μƒˆμ—λ„ λ―Όκ°ν•©λ‹ˆλ‹€. μ΄λŸ¬ν•œ 이유둜 κ³ μ–‘μ΄λŠ” 물을 μ‹«μ–΄ν•˜κ²Œ λ˜μ—ˆμŠ΅λ‹ˆλ‹€. 

κ³ μ–‘μ΄λŠ” μ‚¬λžŒκ³Ό 달리 체온이 λ†’μ•„ μ²΄μ˜¨μ„ μœ μ§€ν•˜κΈ° μœ„ν•΄ λ§Žμ€ 칼둜리λ₯Ό ν•„μš”λ‘œ ν•©λ‹ˆλ‹€. λ”°λΌμ„œ κ³ μ–‘μ΄λŠ” 물을 λ§ˆμ‹œμ§€ μ•Šκ³  물을 μ‹«μ–΄ν•©λ‹ˆλ‹€. κ³ μ–‘μ΄λŠ” μ²΄μ˜¨μ„ μœ μ§€ν•˜κΈ° μœ„ν•΄ 물을 μ„­μ·¨ν•˜μ§€ μ•ŠμœΌλ©°, 물을 λ§ˆμ‹œκ³  μ‹Άμ§€ μ•ŠμŠ΅λ‹ˆλ‹€. 

λ˜ν•œ, κ³ μ–‘μ΄λŠ” 물을 λ§ˆμ‹œλ©΄ 손이 μ°¨κ°€μ›Œμ§€λŠ” λ“± 물에 λ…Ήμ•„ μžˆλŠ” ν—€μ–΄μ³λ°œ λ•Œλ¬Έμ— 물을 μ‹«μ–΄ν•©λ‹ˆλ‹€. ν—€μ–΄μ³λ°œμ€ 물을 λ…Ήμ—¬ 손을 
κ³ μ–‘μ΄λŠ” μ‚¬λžŒκ³Ό 달리 물을 μ‹«μ–΄ν•©λ‹ˆλ‹€. μ΄λŠ” 물에 λ…Ήμ•„ μžˆλŠ” ν—€μ–΄μ³λ°œκ³Ό 물의 λƒ„μƒˆ λ•Œλ¬Έμž…λ‹ˆλ‹€. κ³ μ–‘μ΄λŠ” ν—€μ–΄μ³λ°œμ΄ 물에 λ…Ήμ•„ 있으면 물을 λ§ˆμ‹œκ³  μ‹Άμ§€ μ•Šμ•„ν•˜λ©°, 물의 λƒ„μƒˆμ—λ„ λ―Όκ°ν•©λ‹ˆλ‹€. μ΄λŸ¬ν•œ 이유둜 κ³ μ–‘μ΄λŠ” 물을 μ‹«μ–΄ν•˜κ²Œ λ˜μ—ˆμŠ΅λ‹ˆλ‹€. 

κ³ μ–‘μ΄λŠ” μ‚¬λžŒκ³Ό 달리 체온이 λ†’μ•„ μ²΄μ˜¨μ„ μœ μ§€ν•˜κΈ° μœ„ν•΄ λ§Žμ€ 칼둜리λ₯Ό ν•„μš”λ‘œ ν•©λ‹ˆλ‹€. λ”°λΌμ„œ κ³ μ–‘μ΄λŠ” 물을 λ§ˆμ‹œμ§€ μ•Šκ³  물을 μ‹«μ–΄ν•©λ‹ˆλ‹€. κ³ μ–‘μ΄λŠ” μ²΄μ˜¨μ„ μœ μ§€ν•˜κΈ° μœ„ν•΄ 물을 μ„­μ·¨ν•˜μ§€ μ•ŠμœΌλ©°, 물을 λ§ˆμ‹œκ³  μ‹Άμ§€ μ•ŠμŠ΅λ‹ˆλ‹€. 

```
### usage komt with vllm
![vllm.gif](images%2Fvllm.gif)
To use the vllm library, create a conda environment as shown below, then install the packages from requirements_vllm.txt.
``` 
conda create -n vllm python=3.10
conda activate vllm
pip install -r requirements_vllm.txt
```
Run the example code as shown below, then type a question at the prompt.
``` 
$ python usage_komt_with_vllm.py 
INFO 09-25 18:48:20 llm_engine.py:72] Initializing an LLM engine with config: model='davidkim205/komt-llama2-7b-v1', tokenizer='davidkim205/komt-llama2-7b-v1', tokenizer_mode=auto, trust_remote_code=False, dtype=torch.float16, download_dir=None, load_format=auto, tensor_parallel_size=1, seed=0)
INFO 09-25 18:48:20 tokenizer.py:30] For some LLaMA-based models, initializing the fast tokenizer may take a long time. To eliminate the initialization time, consider using 'hf-internal-testing/llama-tokenizer' instead of the original tokenizer.
INFO 09-25 18:48:36 llm_engine.py:199] # GPU blocks: 1048, # CPU blocks: 512
>μ œμ£Όλ„ 데이트 μ½”μŠ€ μ•Œλ €μ€˜
Processed prompts: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 1/1 [00:15<00:00, 15.30s/it]
Prompt: '### instruction: μ œμ£Όλ„ 데이트 μ½”μŠ€ μ•Œλ €μ€˜\n\n### Response: ', Generated text: 'μ œμ£Όλ„ 데이트 μ½”μŠ€ μ•Œλ €λ“œλ¦¬κ² μŠ΅λ‹ˆλ‹€.\n1. 아침에 일찍 μΌμ–΄λ‚˜μ„œ μ œμ£Όμƒκ³΅μ›μ—μ„œ μ•„μΉ¨ 해돋이λ₯Ό 보쩰 인사λ₯Ό λ“œλ¦½λ‹ˆλ‹€.\n2. 상곡원을 λŒμ•„λ‹€λ‹ˆλ©° μžμ—°μ˜ 아름닀움을 λ§Œλ½ν•©λ‹ˆλ‹€. 특히, μš©λ‘λ³΄ 폭포λ₯Ό κ±΄λ„ˆ λ‹€λ‹ˆλ©° λ©‹μ§„ 경치λ₯Ό κ°μƒν•©λ‹ˆλ‹€.\n3. μ˜€ν›„ 1μ‹œμ―€ μ œμ£Όμ‹œμ˜ 유λͺ…ν•œ ν–₯κΈ°λ₯Ό 맑을 수 μžˆλŠ” μ„±μ‚°μΌμΆœλ΄‰ 근처 퍼즐을 ν’€μ–΄λ³΄μ„Έμš”. μ—¬κΈ°μ—μ„œλŠ” λ…Έλž˜λ°©, 샀프심 κ°•μ—°, μ›Œμ»€νž μ»¨μ„œνŠΈ, ν•œλΌμ‚°μ„± 발견 μ—¬μˆ™ λ“± ν₯미둜운 μ²΄ν—˜μ„ ν•  수 μžˆμŠ΅λ‹ˆλ‹€.\n4. 제주특유의 λ‹€μ–‘ν•œ ν•΄μ‚°λ¬Ό (ν•΄μ΄ˆ, κΉ€μΉ˜, 해석 λ“±)을 κ΅¬κ²½ν•˜κ³  μ‹Άλ‹€λ©΄, μžμ£Όμ§“λ„€λ―Έλ‚˜ μ œμ£Όμ‹œμ˜ μ „ν†΅μ‹œμž₯을 λ°©λ¬Έν•΄λ³΄μ„Έμš”. ν•΄μ‚°λ¬Ό 사찰 κ·Όμ²˜μ— μœ„μΉ˜ν•œ νŠΉμˆ˜μ‹œμž₯μ—μ„œλŠ” μ œμ£Όκ°κ·€μ„ λ§›λ³Ό 수 μžˆμŠ΅λ‹ˆλ‹€.\n5. λ§ˆμ§€λ§‰μœΌλ‘œ μ €λ…μ—λŠ” μ„±μ‚°μΌμΆœλ΄‰μ—μ„œ ν•œλΌμ‚°μ˜ μΌμΆœμ„ λ³Ό 수 μžˆμŠ΅λ‹ˆλ‹€. μΌμΆœμ„ κ°μƒν•˜λ©° κ·Έ 아름닀움에 λŒ€ν•œ 감사λ₯Ό ν‘œν˜„ν•©λ‹ˆλ‹€.\n\n이제 μ œμ£ΌνŠΉλ³„μ˜ λ§€λ ₯을 즐기싀 μ€€λΉ„κ°€ λ˜μ…¨λ‚˜μš”? ν—›λœ μΌμƒμ—μ„œ λ²—μ–΄λ‚˜ μ—¬μœ λ‘œμ›€μ„ λŠλ‚„ 수 μžˆλŠ” μ œμ£Όλ„ 데이트 μ½”μŠ€λ₯Ό μ¦κΈ°λ³΄μ„Έμš”.'

```
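usage_komt_with_vllm.py boils down to something like the following minimal vLLM snippet (a sketch, not the repo's exact code; the sampling values are illustrative):
```
from vllm import LLM, SamplingParams

# Load komt into vLLM's serving engine.
llm = LLM(model="davidkim205/komt-llama2-7b-v1")
sampling = SamplingParams(temperature=0.8, top_p=0.8, max_tokens=512)

prompt = "### instruction: μ œμ£Όλ„ 데이트 μ½”μŠ€ μ•Œλ €μ€˜\n\n### Response: "
for output in llm.generate([prompt], sampling):
    print(output.outputs[0].text)
```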


## Fine-tune
komt-llama2 λͺ¨λΈμ„ ν•™μŠ΅μ‹œν‚€λŠ” 방법을 μ œκ³΅ν•©λ‹ˆλ‹€. 

λ…Όλ¬Έκ³Ό λ°°ν¬ν•œ λͺ¨λΈμ— μ‚¬μš©ν•œ 데이터셋쀑 λΌμ΄μ„ΌμŠ€κ°€ μ—†λŠ” KorQuAD 1.0 데이터셋을 datasets에 μΆ”κ°€ν–ˆμŠ΅λ‹ˆλ‹€.

논문에 λŒ€ν•œ μžμ„Έν•œ λ‚΄μš©μ€ μ•„λž˜ Korean Multi-task Instruction Tuning λ₯Ό μ°Έκ³ ν•˜μ„Έμš”.

### Fine-tune with lora
![finetune_with_lora.gif](images%2Ffinetune_with_lora.gif)
First clone the code from GitHub and install the packages (see Setup above).

finetune_with_lora.py trains a model on a custom dataset.
When run without arguments as shown below, it defaults to fine-tuning davidkim205/komt-llama2-7b-v1 as the base model on [komt_squad.json](datasets%2Fkomt_squad.json).
``` 

python finetune_with_lora.py

```
λͺ¨λΈμ΄λ‚˜ dataset μ΄λ‚˜ batchsize등은 μ•„λž˜μ™€ 같이 μˆ˜μ •μ΄ κ°€λŠ₯ν•©λ‹ˆλ‹€.
```
python finetune_with_lora.py --model_name_or_path davidkim205/komt-llama2-7b-v1 --data_path datasets/komt_squad.json --num_train_epochs 1 --per_device_train_batch_size 1 --learning_rate 1e-5
```
For a detailed description of all arguments, run `python finetune_with_lora.py -h`.

#### finetune 8-bit models with Low Rank Adaption (LoRA)
finetune_with_lora.py quantizes the model to 4-bit by default for training.
To quantize to 8-bit instead, run:
```
python finetune_with_lora.py --bits 8
```
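For orientation, this is a minimal sketch of the 4-bit LoRA setup such a script typically performs with the standard peft and bitsandbytes APIs; the hyperparameters and target modules are illustrative assumptions, not the repo's exact code:
```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "davidkim205/komt-llama2-7b-v1"

# Load the base model quantized to 4-bit NF4.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Prepare norms/embeddings and gradient checkpointing for k-bit training.
model = prepare_model_for_kbit_training(model)

# Attach trainable low-rank adapters to the attention projections (assumed choice).
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```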
### Fine-tune with deepspeed
finetune_with_ds.py trains with ZeRO-3 Offload on top of DeepSpeed.
CPU offloading reduces GPU memory usage, but it consumes CPU memory instead, so adjust the settings to match your hardware.
The DeepSpeed configuration file is provided in configs/deepspeed_config.json.

To use deepspeed, create a conda environment as shown below, then install the required packages.
``` 
conda create -n ds python=3.10
conda activate ds
pip install -r requirements_ds.txt
```

Fine-tuning with DeepSpeed is run as follows:
``` 
deepspeed finetune_with_ds.py
```
To modify the arguments, refer to this example:
``` 
deepspeed finetune_with_ds.py --model_name_or_path davidkim205/komt-llama2-7b-v1 --data_path datasets/komt_squad.json --num_train_epochs 1 --per_device_train_batch_size 1 --learning_rate 1e-5 --deepspeed configs/deepspeed_config.json
```
### Fine-tune with Direct Preference Optimization (DPO) 
μƒμš©μ„œλΉ„μŠ€λ₯Ό μœ„ν•œ Direct Preference Optimizationλ₯Ό μ΄μš©ν•˜μ—¬ λͺ¨λΈ ν•™μŠ΅ν• μˆ˜ μžˆλ„λ‘ train μ½”λ“œμ™€ λͺ¨λΈμ„ κ³΅κ°œν•©λ‹ˆλ‹€. 

DPO ν•™μŠ΅μ΄ 잘되렀면 SFTλ₯Ό μž˜ν•΄μ•Ό ν•˜λŠ”λ° 이미 ν•™μŠ΅λœ komtλ₯Ό μ΄μš©ν•˜μ—¬ λͺ¨λΈμ„ ν•™μŠ΅ν•˜μ˜€κ³ , κΈ°μ‘΄ λͺ¨λΈλŒ€λΉ„ 5% μ„±λŠ₯ν–₯상이 μžˆμ—ˆμœΌλ©° λ™μΌν•œ μ§ˆλ¬Έμ— λ™μΌν•œ 닡변을 ν• μˆ˜ μžˆλŠ” λͺ¨λΈμ„ κ°œλ°œν•˜μ˜€μŠ΅λ‹ˆλ‹€.

ν•œκΈ€ 데이터셋은 maywell/ko_Ultrafeedback_binarized 을 μ‚¬μš©ν•˜μ˜€μŠ΅λ‹ˆλ‹€.

dpo_train.py λ₯Ό μ‹€ν–‰ν•˜κΈ° μœ„ν•˜μ—¬ requirements_dpo.txtλ₯Ό μ„€μΉ˜ν•˜μ—¬μ•Ό ν•©λ‹ˆλ‹€.
μ„€μΉ˜μ˜ˆμž…λ‹ˆλ‹€.
```
conda create -n dpo_train python=3.10
conda activate dpo_train
pip install -r requirements_dpo.txt
```
After installation, configure accelerate with `accelerate config`:
``` 
accelerate config
```
Then launch DPO training via accelerate:
```
accelerate launch dpo_train.py
```
Training takes about 9 hours on a single A100.
```  
 warnings.warn(
  0%|                                             | 1/1000 [00:36<10:13:09, 36.83s/it]Token indices sequence length is longer than the specified maximum sequence length for this model (1069 > 1024). Running this sequence through the model will result in indexing errors
{'loss': 0.6961, 'learning_rate': 5e-05, 'rewards/chosen': 0.004012207966297865, 'rewards/rejected': 0.007965649478137493, 'rewards/accuracies': 0.515625, 'rewards/margins': -0.003953440580517054, 'logps/rejected': -222.7124481201172, 'logps/chosen': -259.6094665527344, 'logits/rejected': -2.6427276134490967, 'logits/chosen': -2.6100172996520996, 'epoch': 0.01}
  2%|β–Š                                            | 17/1000 [09:31<8:50:11, 32.36s/it]
```
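For reference, a DPO training script like dpo_train.py typically reduces to the following sketch built on TRL's DPOTrainer; the hyperparameters and the dataset column mapping are illustrative assumptions, not the repo's exact code:
```
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

# Assumption: start DPO from an already instruction-tuned (SFT) komt model.
model_name = "davidkim205/komt-mistral-7b-v1"
model = AutoModelForCausalLM.from_pretrained(model_name)
ref_model = AutoModelForCausalLM.from_pretrained(model_name)  # frozen reference policy
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Preference data with prompt / chosen / rejected fields (assumed column names).
dataset = load_dataset("maywell/ko_Ultrafeedback_binarized", split="train")

args = TrainingArguments(
    output_dir="komt-dpo",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-5,
)
trainer = DPOTrainer(
    model,
    ref_model,
    args=args,
    beta=0.1,  # strength of the KL penalty keeping the policy near the reference
    train_dataset=dataset,
    tokenizer=tokenizer,
    max_length=1024,
    max_prompt_length=512,
)
trainer.train()
```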

For details on DPO, see the paper: https://arxiv.org/abs/2305.18290

## Evaluation Results
We evaluated each model's answers to a fixed set of questions using ChatGPT as the judge, as shown below. See eval_results for the evaluation questions, the models' answers, and ChatGPT's judgments.


| model                                    | score   | average(0~5) | percentage |
|------------------------------------------|---------| ------------ |------------|
| gpt-3.5-turbo(close)                     | 147     | 3.97         | 79.45%     |
| naver Cue(close)                         | 140     | 3.78         | 75.67%     |
| clova X(close)                           | 136     | 3.67         | 73.51%     |
| WizardLM-13B-V1.2(open)                  | 96      | 2.59         | 51.89%     |
| Llama-2-7b-chat-hf(open)                 | 67      | 1.81         | 36.21%     |
| Llama-2-13b-chat-hf(open)                | 73      | 1.91         | 38.37%     |
| nlpai-lab/kullm-polyglot-12.8b-v2(open)  | 70      | 1.89         | 37.83%     |
| kfkas/Llama-2-ko-7b-Chat(open)           | 96      | 2.59         | 51.89%     |
| beomi/KoAlpaca-Polyglot-12.8B(open)      | 100     | 2.70         | 54.05%     |
| **komt-llama2-7b-v1 (open)(ours)**       | **117** | **3.16**     | **63.24%** |
| **komt-llama2-13b-v1  (open)(ours)**     | **129** | **3.48**     | **69.72%** |
| **komt-llama-30b-v1  (open)(ours)**      | **129** | **3.16**     | **63.24%** |
| **komt-mistral-7b-v1  (open)(ours)**     | **131** | **3.54**     | **70.81%** |
| **komt-mistral-7b-v1-dpo  (open)(ours)** | **142** | **3.83**     | **76.75%** |

----
# Korean Multi-task Instruction Tuning

## Abstract
With the recent success of ChatGPT, numerous large language models have emerged in an attempt to catch up with ChatGPT's capabilities. However, it has become evident that these models still struggle to provide accurate responses in Korean or face challenges when generating Korean text. In this study, we introduce the multi-task instruction technique, which is based on supervised datasets from various tasks, to create training data for large language models, aiming to address these issues.

## Introduction

The recent Korean large language models, such as GPT-4-LLM, Dolly, and Vicuna, have predominantly relied on translated datasets. However, using translated datasets presents several challenges:

- Language and Cultural Differences
Languages and cultures have unique expressions, vocabularies, and grammatical structures. Using translated datasets can hinder the model's ability to understand and learn effectively due to these differences.
- Translation Errors and Semantic Distortions
Machine translations are not perfect and can introduce errors or distort the meaning of the original text. This can lead to the model learning incorrect information or failing to grasp the true meaning of the source data.
- Data Quality
The quality of translated data depends on the accuracy of the source data. If the source data is inaccurate or noisy, the translated data can suffer from the same issues.
- Word Embedding Consistency
Mapping words from different languages into a consistent embedding space can be challenging. This can result in the model failing to learn the correct relationships between words or failing to recognize semantic differences among translated words.
- Data Quantity and Diversity
Using translated foreign datasets may not provide sufficient quantity and diversity of data, depending on the language and topic domain. Obtaining the required data quantity and diversity can be challenging.
- Difficulty in Understanding Context
Translated data often fails to convey the original context accurately, making it difficult for the model to understand the real meaning and context of specific words or sentences.

- Specialized Terminology and Idiomatic Expressions
Specialized terminology and idiomatic expressions in specific fields may not be appropriately handled during translation, causing the model to perform poorly in certain subjects or domains.
- Data Bias
Translating data from various countries and cultures can introduce biases or cultural differences into the model, potentially increasing bias in the model's responses.
- Performance Degradation
When original data is translated, some information may be lost in the translation process, leading to a potential decrease in the model's performance compared to using the original data directly.

## 2. Multi-task Instruction
To address these challenges and improve dataset quality, we propose an Instruction Tuning Framework (ITF) that leverages multi-task datasets and instruction tuning, inspired by Google's FLAN (Finetuned Language Models Are Zero-Shot Learners) technique.

### 2.1. Multi-task Datasets
We have curated multi-task datasets based on various existing Korean datasets, specifically tailored to each task. We have avoided relying on translated datasets used in previous Korean large language models. Our dataset sources include:
- AIHub Dataset: 305,900 samples
- KISTI AI Dataset: 824,337 samples
- KorQuad Dataset: 66,181 samples
- Miscellaneous Datasets: 346,803 samples
- Total Dataset Size: 1,543,221 samples

### 2.2. Instruction Tuning
Our ITF incorporates the instruction tuning technique proposed by Google's FLAN, resulting in improved zero-shot performance.
We have publicly released the freely licensed KorQuad 1.0 dataset on GitHub. However, due to licensing policies, we cannot release the other datasets.
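As an illustration, a supervised QA record can be converted into instruction-tuning text roughly as follows; the field mapping is a hypothetical example using the `### instruction` / `### Response` template that appears elsewhere in this README, not the exact format of our released data:
```
def to_instruction_example(question: str, context: str, answer: str) -> str:
    """Turn a KorQuAD-style QA record into a single instruction-tuning string."""
    return (
        f"### instruction: {context}\n{question}\n\n"
        f"### Response: {answer}"
    )

print(to_instruction_example(
    question="λŒ€ν•œλ―Όκ΅­μ—μ„œ κ°€μž₯ 큰 섬은 μ–΄λ””μΈκ°€μš”?",
    context="μ œμ£Όλ„λŠ” λŒ€ν•œλ―Όκ΅­μ—μ„œ κ°€μž₯ 큰 μ„¬μ΄λ‹€.",
    answer="μ œμ£Όλ„",
))
```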

## 3. Evaluation
For objective model evaluation, we initially used EleutherAI's lm-evaluation-harness but obtained unsatisfactory results. Consequently, we conducted evaluations using ChatGPT, a widely used model, as described in [Self-Alignment with Instruction Backtranslation](https://arxiv.org/pdf/2308.06259.pdf) and [Three Ways of Using Large Language Models to Evaluate Chat](https://arxiv.org/pdf/2308.06502.pdf).


| model                                   | score   | average(0~5) | percentage |
| --------------------------------------- |---------| ------------ | ---------- |
| gpt-3.5-turbo(close)                    | 147     | 3.97         | 79.45%     |
| naver Cue(close)                        | 140     | 3.78         | 75.67%     |
| clova X(close)                          | 136     | 3.67         | 73.51%     |
| WizardLM-13B-V1.2(open)                 | 96      | 2.59         | 51.89%     |
| Llama-2-7b-chat-hf(open)                | 67      | 1.81         | 36.21%     |
| Llama-2-13b-chat-hf(open)               | 73      | 1.91         | 38.37%     |
| nlpai-lab/kullm-polyglot-12.8b-v2(open) | 70      | 1.89         | 37.83%     |
| kfkas/Llama-2-ko-7b-Chat(open)          | 96      | 2.59         | 51.89%     |
| beomi/KoAlpaca-Polyglot-12.8B(open)     | 100     | 2.70         | 54.05%     |
| **komt-llama2-7b-v1 (open)(ours)**      | **117** | **3.16**     | **63.24%** |
| **komt-llama2-13b-v1  (open)(ours)**    | **129** | **3.48**     | **69.72%** |
| **komt-llama-30b-v1  (open)(ours)**    | **129** | **3.16**     | **63.24%** |
| **komt-mistral-7b-v1  (open)(ours)**    | **131** | **3.54**     | **70.81%** |


## 4. Conclusion
In this study, we have proposed a method to optimize the Llama2 model for the Korean language. Experimental results demonstrate that models trained with multi-task instruction outperform other Korean-supporting Llama2 models.
In future research, we plan to leverage multi-task instruction to develop various service models and applications.

---

# References
### Llama 2
https://github.com/facebookresearch/llama
### Llama 1
https://github.com/facebookresearch/llama/tree/llama_v1

### llama.cpp
https://github.com/ggerganov/llama.cpp