RstoneCommand committed on
Commit 130cf57 · verified · 1 Parent(s): e04b0c8

Update README.md

Files changed (1)
  1. README.md +125 -7
README.md CHANGED
@@ -5,17 +5,135 @@ tags:
  - transformers
  - unsloth
  - lfm2
  license: apache-2.0
  language:
- - en
  ---

- # Uploaded finetuned model

- - **Developed by:** RstoneCommand
- - **License:** apache-2.0
- - **Finetuned from model:** unsloth/LFM2-1.2B

- This lfm2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.

- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
  - transformers
  - unsloth
  - lfm2
+ - multiple inference
  license: apache-2.0
  language:
+ - ko
+ datasets:
+ - RstoneCommand/Synapse-Dataset_v03
+ pipeline_tag: text-generation
  ---

+ # Synapse-Model 1 Model Card
+ - **Developed by:** RstoneCommand
+ - **License:** apache-2.0

+ **Authors**: Synapse-Model Team

+ **Model Information**

+ A description of Synapse-Model 1's inputs, outputs, strengths, and weaknesses.
+
+ **Description**
+
+ Synapse-Model is the first lightweight model developed by the Synapse-Model Team. Synapse-Model 1 is a text-generation model: it takes text as input and produces text output through a multi-step reasoning process. Synapse-Model was created by fine-tuning LiquidAI's LFM2-1.2B model to add reasoning capability, and its multi-step reasoning support is intended to give strong performance across a variety of tasks. Synapse-Model 1 has a 32,768-token context length (inherited from LFM2-1.2B) and was trained primarily on Korean. At only 1.2B parameters, it is small enough to run in resource-constrained local environments such as laptops, desktops, and phones.
+
+ **Inputs and Outputs**
+
+ - **Inputs**
+   - Question answering
+   - Instruction prompts
+   - Document summarization
+   - General text tasks
+ - **Outputs**
+   - Responses to questions
+   - Responses to instructions
+   - Document summaries
+   - Response text generated from the input
+
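The usage example later in this card combines an instruction and an input into a single user message using a simple `# instruction:` / `# input:` layout. A minimal sketch of that format (the `build_user_prompt` helper name is illustrative, not part of any API):

```python
def build_user_prompt(instruction: str, extra_input: str) -> str:
    # Layout used by the usage example in this card:
    # an "# instruction:" section followed by an "# input:" section.
    return f"# instruction:\n{instruction}\n\n# input:\n{extra_input}"

print(build_user_prompt("Summarize the document.", "Some long document text..."))
```

The resulting string is what the usage example passes as the `content` of the user message before applying the chat template.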
+ **Usage**
+
+ Below are a few code examples and notes for running the model quickly. First, upgrade or install the transformers library to the latest version (a requirement inherited from LFM2): `pip install "transformers @ git+https://github.com/huggingface/transformers.git@main"`
+
+ The following Python code is an example of generating text with Synapse-Model 1.
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
+
+ # Load model and tokenizer
+ model_id = "RstoneCommand/Synapse-Model1-1.2B_16bit"
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     device_map="auto",
+     torch_dtype="bfloat16",
+     trust_remote_code=True,
+ )
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+
+ # Build the prompt in the "# instruction:" / "# input:" format
+ instruction = input("instruction: ")
+ inputs = input("input: ")
+ user = f"# instruction:\n{instruction}\n\n# input:\n{inputs}"
+
+ messages = [
+     # A custom system prompt can go here, but it will not work reliably
+     # (the model was not trained with system prompts).
+     {"role": "system", "content": ""},
+     {"role": "user", "content": user},
+ ]
+
+ input_ids = tokenizer.apply_chat_template(
+     messages,
+     add_generation_prompt=True,
+     return_tensors="pt",
+     tokenize=True,
+ ).to(model.device)
+
+ # Generate, streaming tokens to stdout as they are produced
+ output = model.generate(
+     input_ids,
+     do_sample=True,
+     temperature=0.3,
+     min_p=0.15,
+     repetition_penalty=1.05,
+     max_new_tokens=4069,
+     streamer=TextStreamer(tokenizer, skip_prompt=True),
+ )
+ ```
+
+ **Model Data**
+
+ The training data used to train the model.
+
+ **Training Dataset**
+
+ Synapse-Model's responses ("Custom Response") were written in-house by the Synapse-Model Team, based on inputs drawn from several datasets. Synapse-Model 1 was trained at a sequence length of 4,096 tokens. The training data is composed as follows:
+
+ - Part of the instructions from [HAERAE-HUB / HR-Instruct-Math-v0.1](https://huggingface.co/datasets/HAERAE-HUB/HR-Instruct-Math-v0.1) (168 examples)
+ - Part of the instructions and inputs from [nlpai-lab / kullm-v2](https://huggingface.co/datasets/nlpai-lab/kullm-v2) (68 examples)
+ - Part of the instructions and inputs from [Bingsu / ko_alpaca_data](https://huggingface.co/datasets/Bingsu/ko_alpaca_data) (10 examples)
+ - **Total: 246**
+
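As a quick sanity check, the per-dataset counts listed above do sum to the stated total:

```python
# Stated sizes of each training-data component (from the list above).
components = {
    "HR-Instruct-Math-v0.1": 168,
    "kullm-v2": 68,
    "ko_alpaca_data": 10,
}
total = sum(components.values())
print(total)  # 246
```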
+ Because the Custom Responses were written against inputs from a variety of datasets, the model is able to handle problems in a variety of situations.
+
+ **Model Training**
+
+ A description of how the model was fine-tuned.
+
+ **Training Environment**
+
+ Training took about 4 hours in total using Unsloth and Google Colab.
+
+ **Model Use and Limitations**
+
+ There are restrictions to observe when using the model.
+
+ **Intended Use**
+
+ - Complex computation that requires multi-step reasoning
+   - Can be used for solving math problems, generating creative poems or sentences, writing emails and scripts, and more.
+ - Building conversational chatbots that use multi-step reasoning
+   - Can be used for customer service or assistant roles, conversational applications, conversational interfaces, chatbot applications, and more.
+ - Text translation and generation
+   - Although unstable, the model can translate English to Korean and Korean to English, so it can be used to translate between the two languages.
+
+ **Limitations**
+
+ - Insufficient task instructions
+   - Synapse-Model cannot perform a task properly when the instructions for it are insufficient. Include every requirement of the task in the conversation to improve accuracy.
+ - Limited multilingual support
+   - Because datasets for other languages could not be produced, the model supports only Korean and (partially) English.
+ - Response accuracy
+   - Synapse-Model generates responses based on its training data, so a response may rest on inaccurate or outdated information. (Addressing this requires RAG or further fine-tuning.)