alex4cip Claude commited on
Commit
82b9256
ยท
1 Parent(s): bfd0656

docs: Update README for 13-model multi-model chatbot system

Browse files

Documentation Updates:
- Update title to "Multi-Model Korean LLM Chatbot"
- Add 13 model listing (10 Public + 3 Gated)
- Highlight 3 new Korean models (EXAONE 7.8B/2.4B, Llama-3 Open-Ko)
- Update sdk_version to 5.49.1

New Sections:
- Model selection guide (speed vs quality vs ecosystem)
- Lazy loading system documentation
- Cache management explanation
- Gated model access instructions
- Local setup with .env file configuration

Updated Information:
- Dependencies: gradio 5.49.1, transformers 4.57.1, torch 2.9.0
- Performance metrics for 13-model system
- Hardware requirements and recommendations
- Space URL update to catchitplay/simple-chatbot-gradio

Technical Details:
- Document check_model_cached() function
- Explain lazy loading implementation
- Add HF_TOKEN setup instructions
- Include model download size and time estimates

๐Ÿค– Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Files changed (1) hide show
  1. README.md +170 -92
README.md CHANGED
@@ -1,38 +1,64 @@
1
  ---
2
- title: Llama-2-Ko Chatbot
3
  emoji: ๐Ÿค–
4
  colorFrom: blue
5
  colorTo: purple
6
  sdk: gradio
7
- sdk_version: 5.9.1
8
  app_file: app.py
9
  pinned: false
10
  license: mit
11
  ---
12
 
13
- # ๐Ÿค– Llama-2-Ko 7B Chatbot (Flexible Hardware)
14
 
15
- ํ•œ๊ตญ์–ด์— ์ตœ์ ํ™”๋œ Llama-2-Ko 7B ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•œ ๋Œ€ํ™”ํ˜• ์ฑ—๋ด‡์ž…๋‹ˆ๋‹ค. **ZeroGPU**์™€ **CPU Upgrade** ํ•˜๋“œ์›จ์–ด๋ฅผ ๋ชจ๋‘ ์ง€์›ํ•ฉ๋‹ˆ๋‹ค.
16
 
17
  ## โœจ ์ฃผ์š” ํŠน์ง•
18
 
19
- - **๐Ÿ‡ฐ๐Ÿ‡ท ํ•œ๊ธ€ ๋Œ€ํ™” ์ตœ์ ํ™”**: Llama-2-Ko 7B ๋ชจ๋ธ ์‚ฌ์šฉ
20
- - **โšก ์œ ์—ฐํ•œ ํ•˜๋“œ์›จ์–ด ์ง€์›**: ZeroGPU/CPU Upgrade ์ž๋™ ๊ฐ์ง€
21
- - **๐Ÿ”„ ์ž๋™ ์ „ํ™˜**: ํ•˜๋“œ์›จ์–ด ๋ณ€๊ฒฝ ์‹œ ์ฝ”๋“œ ์ˆ˜์ • ๋ถˆํ•„์š”
22
- - **๐Ÿ’ฐ ๋น„์šฉ ํšจ์œจ์ **: ์ƒํ™ฉ์— ๋งž๋Š” ํ•˜๋“œ์›จ์–ด ์„ ํƒ ๊ฐ€๋Šฅ
 
23
 
24
- ## ๐ŸŽฏ ๋ชจ๋ธ ์ •๋ณด
25
 
26
- - **๋ชจ๋ธ**: `beomi/llama-2-ko-7b`
27
- - **ํฌ๊ธฐ**: ~14GB
28
- - **ํŠน์ง•**: ํ•œ๊ธ€ ๋Œ€ํ™”์— ํŠนํ™”๋œ Llama 2 ๊ธฐ๋ฐ˜ ๋ชจ๋ธ
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
29
 
30
  ## ๐Ÿš€ ํ•˜๋“œ์›จ์–ด ์˜ต์…˜
31
 
32
  ### Option 1: ZeroGPU (์ถ”์ฒœ)
33
 
34
  **์žฅ์ **:
35
- - โšก ๋น ๋ฅธ ์‘๋‹ต (3-5์ดˆ)
36
  - ๐Ÿ’ฐ ์ €๋ ดํ•œ ๋น„์šฉ ($9/month)
37
  - ๐Ÿ”‹ ์ž๋™ GPU ํ• ๋‹น/ํ•ด์ œ
38
 
@@ -50,7 +76,7 @@ license: mit
50
  - ๐Ÿ”ง ๊ฐ„๋‹จํ•œ ์„ค์ •
51
 
52
  **์ œ์•ฝ**:
53
- - ๐Ÿข ๋А๋ฆฐ ์‘๋‹ต (30์ดˆ~1๋ถ„)
54
  - ๐Ÿ’ต ์ƒ๋Œ€์ ์œผ๋กœ ๋น„์‹ผ ๋น„์šฉ
55
 
56
  **๋น„์šฉ**: $0.03/hour (์›” ์•ฝ $22)
@@ -79,8 +105,8 @@ license: mit
79
 
80
  | ํ•ญ๋ชฉ | ZeroGPU | CPU Upgrade |
81
  |------|---------|-------------|
82
- | **์ฒซ ์‘๋‹ต** | 10-15์ดˆ | 1-2๋ถ„ |
83
- | **์ดํ›„ ์‘๋‹ต** | 3-5์ดˆ | 30์ดˆ~1๋ถ„ |
84
  | **์ผ์ผ ํ•œ๋„** | 25๋ถ„ | ๋ฌด์ œํ•œ |
85
  | **์›” ๋น„์šฉ** | $9 | $22 |
86
  | **GPU** | H200 (70GB) | ์—†์Œ |
@@ -101,29 +127,45 @@ except ImportError:
101
  # ์กฐ๊ฑด๋ถ€ decorator ์ ์šฉ
102
  if ZEROGPU_AVAILABLE:
103
  @spaces.GPU(duration=120)
104
- def generate_response(message, history):
105
- return generate_response_impl(message, history)
106
  else:
107
- def generate_response(message, history):
108
- return generate_response_impl(message, history)
109
  ```
110
 
111
- ### ๋™์  UI ์ƒ์„ฑ
 
 
 
 
 
112
 
113
- - ZeroGPU ๋ชจ๋“œ: GPU ๊ฐ€์† ์•ˆ๋‚ด
114
- - CPU Upgrade ๋ชจ๋“œ: CPU ์ œ์•ฝ ์•ˆ๋‚ด
115
- - ํ•˜๋“œ์›จ์–ด ์ •๋ณด ์ž๋™ ํ‘œ์‹œ
 
 
 
 
 
 
 
 
 
 
116
 
117
  ## ๐Ÿ“ ์‚ฌ์šฉ ๋ฐฉ๋ฒ•
118
 
119
  ### 1. Space ์ ‘์†
120
 
121
- https://huggingface.co/spaces/alex4cip/simple-chat
122
 
123
- ### 2. ํ•˜๋“œ์›จ์–ด ํ™•์ธ
124
 
125
- - UI ์ƒ๋‹จ์— ํ˜„์žฌ ํ•˜๋“œ์›จ์–ด ํ‘œ์‹œ
126
- - "ZeroGPU" ๋˜๋Š” "CPU Upgrade"
 
127
 
128
  ### 3. ๋Œ€ํ™” ์‹œ์ž‘
129
 
@@ -133,102 +175,135 @@ https://huggingface.co/spaces/alex4cip/simple-chat
133
  ํ•œ๊ตญ์˜ ์ˆ˜๋„๋Š” ๏ฟฝ๏ฟฝ๋””์ธ๊ฐ€์š”?
134
  ```
135
 
136
- ## ๐Ÿ’ก ์ตœ์ ํ™” ํŒ
137
-
138
- ### ZeroGPU ๋ชจ๋“œ
139
-
140
- 1. **์งง์€ ๋Œ€ํ™”**: ๊ธด ๋Œ€ํ™”๋Š” GPU ์‹œ๊ฐ„ ์†Œ๋ชจ
141
- 2. **ํšจ์œจ์  ํ”„๋กฌํ”„ํŠธ**: ๋ช…ํ™•ํ•˜๊ณ  ๊ฐ„๊ฒฐํ•œ ์งˆ๋ฌธ
142
- 3. **์ผ์ผ ํ•œ๋„ ๊ด€๋ฆฌ**: 25๋ถ„ ๋‚ด ์‚ฌ์šฉ
143
-
144
- ### CPU Upgrade ๋ชจ๋“œ
145
-
146
- 1. **์ธ๋‚ด์‹ฌ**: ์‘๋‹ต ๋Œ€๊ธฐ ์‹œ๊ฐ„ ๊ธธ์–ด์ง
147
- 2. **๋ฐฐ์น˜ ์งˆ๋ฌธ**: ์—ฌ๋Ÿฌ ์งˆ๋ฌธ ๋™์‹œ์—
148
- 3. **์žฅ์‹œ๊ฐ„ ์‚ฌ์šฉ**: 24์‹œ๊ฐ„ ๋ฌด์ œํ•œ
149
-
150
- ## ๐Ÿ”— ํ•˜๋“œ์›จ์–ด ์ „ํ™˜ ์‹œ๋‚˜๋ฆฌ์˜ค
151
-
152
- ### ์‹œ๋‚˜๋ฆฌ์˜ค 1: ๋น ๋ฅธ ๋ฐ๋ชจ (ZeroGPU)
153
-
154
- - ์งง์€ ์‹œ๊ฐ„ ๋‚ด ๋งŽ์€ ์‚ฌ๋žŒ์—๊ฒŒ ์‹œ์—ฐ
155
- - ๋น ๋ฅธ ์‘๋‹ต์œผ๋กœ ์ข‹์€ ์ธ์ƒ
156
- - ์ผ์ผ ํ•œ๋„ ๋‚ด ์ถฉ๋ถ„ํžˆ ์‚ฌ์šฉ
157
 
158
- ### ์‹œ๋‚˜๋ฆฌ์˜ค 2: ์žฅ์‹œ๊ฐ„ ๊ฐœ๋ฐœ (CPU Upgrade)
 
 
159
 
160
- - ์ง€์†์ ์ธ ํ…Œ์ŠคํŠธ ํ•„์š”
161
- - ์ผ์ผ ํ•œ๋„ ๊ฑฑ์ • ์—†์Œ
162
- - ๋А๋ฆฐ ์†๋„ ๊ฐ์ˆ˜
 
163
 
164
- ### ์‹œ๋‚˜๋ฆฌ์˜ค 3: ํ˜ผํ•ฉ ์‚ฌ์šฉ
 
 
165
 
166
- - ํ‰์ƒ์‹œ: CPU Upgrade
167
- - ๋ฐ๋ชจ ์‹œ: ZeroGPU๋กœ ์ „ํ™˜
168
- - ์ฝ”๋“œ ์ˆ˜์ • ๋ถˆํ•„์š” (์ž๋™ ๊ฐ์ง€)
169
-
170
- ## โš ๏ธ ์ œํ•œ์‚ฌํ•ญ
171
-
172
- ### ๊ณตํ†ต
173
-
174
- - **๋ชจ๋ธ ํฌ๊ธฐ**: 14GB (๋กœ๋”ฉ ์‹œ๊ฐ„ ํ•„์š”)
175
- - **์ปจํ…์ŠคํŠธ**: ์ตœ๊ทผ 3ํ„ด๋งŒ ์œ ์ง€
176
- - **ํ•œ๊ธ€ ํŠนํ™”**: ์˜์–ด ์ž…๋ ฅ ์‹œ ํ’ˆ์งˆ ๋‚ฎ์Œ
177
-
178
- ### ZeroGPU ์ „์šฉ
179
-
180
- - **์ผ์ผ ํ•œ๋„**: 25๋ถ„ (PRO ๊ตฌ๋…)
181
- - **๋Œ€๊ธฐ์—ด**: ์‚ฌ์šฉ์ž ๋งŽ์„ ๊ฒฝ์šฐ ๋Œ€๊ธฐ
182
- - **PRO ํ•„์š”**: $9/month ๊ตฌ๋… ํ•„์š”
183
-
184
- ### CPU Upgrade ์ „์šฉ
185
-
186
- - **๋А๋ฆฐ ์†๋„**: 30์ดˆ~1๋ถ„ ์‘๋‹ต
187
- - **๋น„์šฉ**: ์‹œ๊ฐ„๋‹น $0.03 ($22/month)
188
- - **์„ฑ๋Šฅ**: GPU ๋Œ€๋น„ 10๋ฐฐ ์ด์ƒ ๋А๋ฆผ
189
 
190
  ## ๐Ÿ“ฆ ๋กœ์ปฌ ์‹คํ–‰
191
 
 
 
192
  ```bash
193
  # ์ €์žฅ์†Œ ํด๋ก 
194
- git clone <repository-url>
195
  cd simple-chatbot-gradio
196
 
 
 
 
 
197
  # ์˜์กด์„ฑ ์„ค์น˜
198
  pip install -r requirements.txt
 
 
 
 
 
 
 
 
 
 
 
 
 
 
199
 
200
- # HF ํ† ํฐ ์„ค์ •
201
- export HF_TOKEN=your_hugging_face_token
202
 
203
- # ์‹คํ–‰ (GPU ๊ถŒ์žฅ)
204
  python app.py
205
  ```
206
 
207
- **์ฐธ๊ณ **: ๋กœ์ปฌ์€ CPU ๋ชจ๋“œ๋กœ ์‹คํ–‰๋จ (๋งค์šฐ ๋А๋ฆผ)
 
 
 
 
 
208
 
209
  ## ๐Ÿ› ๏ธ ๊ธฐ์ˆ  ์Šคํƒ
210
 
211
- - **ํ”„๋ ˆ์ž„์›Œํฌ**: Gradio 5.x
212
- - **ML ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ**: Transformers, PyTorch
213
  - **GPU ์ธํ”„๋ผ**: Hugging Face ZeroGPU (์„ ํƒ์ )
214
  - **์–ธ์–ด**: Python 3.10+
215
 
216
  ## ๐Ÿ“š Dependencies
217
 
218
  ```txt
219
- gradio==5.9.1
220
- transformers==4.46.0
221
- torch==2.1.0
222
- safetensors==0.4.5
223
- accelerate==0.26.1
224
- spaces # ZeroGPU support (optional)
 
225
  ```
226
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
227
  ## ๐Ÿ”— ๊ด€๋ จ ๋ฆฌ์†Œ์Šค
228
 
229
- - [Llama-2-Ko Model Card](https://huggingface.co/beomi/llama-2-ko-7b)
 
 
 
 
 
 
230
  - [ZeroGPU Documentation](https://huggingface.co/docs/hub/spaces-zerogpu)
231
  - [Gradio Documentation](https://www.gradio.app/docs)
 
232
  - [HF Spaces Pricing](https://huggingface.co/pricing)
233
 
234
  ## ๐Ÿ“„ ๋ผ์ด์„ ์Šค
@@ -241,4 +316,7 @@ MIT License
241
 
242
  ---
243
 
244
- **๐Ÿ’ก TIP**: ๋น ๋ฅธ ๋ฐ๋ชจ๊ฐ€ ํ•„์š”ํ•˜๋ฉด ZeroGPU, ์žฅ์‹œ๊ฐ„ ์‚ฌ์šฉ์ด ํ•„์š”ํ•˜๋ฉด CPU Upgrade๋ฅผ ์„ ํƒํ•˜์„ธ์š”!
 
 
 
 
1
  ---
2
+ title: Multi-Model Korean LLM Chatbot
3
  emoji: ๐Ÿค–
4
  colorFrom: blue
5
  colorTo: purple
6
  sdk: gradio
7
+ sdk_version: 5.49.1
8
  app_file: app.py
9
  pinned: false
10
  license: mit
11
  ---
12
 
13
+ # ๐Ÿค– Multi-Model Korean LLM Chatbot
14
 
15
+ 13๊ฐœ์˜ ๋‹ค์–‘ํ•œ ํ•œ๊ตญ์–ด LLM ๋ชจ๋ธ์„ ์„ ํƒํ•˜์—ฌ ๋Œ€ํ™”ํ•  ์ˆ˜ ์žˆ๋Š” ๋ฉ€ํ‹ฐ๋ชจ๋ธ ์ฑ—๋ด‡์ž…๋‹ˆ๋‹ค. **ZeroGPU**์™€ **CPU Upgrade** ํ•˜๋“œ์›จ์–ด๋ฅผ ๋ชจ๋‘ ์ง€์›ํ•ฉ๋‹ˆ๋‹ค.
16
 
17
  ## โœจ ์ฃผ์š” ํŠน์ง•
18
 
19
+ - **๐ŸŽฏ 13๊ฐœ ๋ชจ๋ธ ์„ ํƒ**: ๋‹ค์–‘ํ•œ ํฌ๊ธฐ์™€ ํŠน์„ฑ์˜ LLM ๋ชจ๋ธ ์ง€์›
20
+ - **๐Ÿ‡ฐ๐Ÿ‡ท ํ•œ๊ธ€ ์ตœ์ ํ™”**: ํ•œ๊ตญ์–ด ์„ฑ๋Šฅ์ด ์šฐ์ˆ˜ํ•œ ๋ชจ๋ธ๋“ค๋กœ ๊ตฌ์„ฑ
21
+ - **โšก ์œ ์—ฐํ•œ ํ•˜๋“œ์›จ์–ด**: ZeroGPU/CPU Upgrade ์ž๋™ ๊ฐ์ง€
22
+ - **๐Ÿ’พ ์บ์‹œ ์‹œ์Šคํ…œ**: ๋ชจ๋ธ ์žฌ๋‹ค์šด๋กœ๋“œ ๋ฐฉ์ง€, ๋น ๋ฅธ ๋กœ๋”ฉ
23
+ - **๐Ÿ”„ Lazy Loading**: ์„ ํƒํ•œ ๋ชจ๋ธ๋งŒ ๋กœ๋“œํ•˜์—ฌ ๋ฆฌ์†Œ์Šค ์ ˆ์•ฝ
24
 
25
+ ## ๐ŸŽฏ ์ง€์› ๋ชจ๋ธ (13๊ฐœ)
26
 
27
+ ### ๐ŸŒŸ ์ถ”์ฒœ ํ•œ๊ตญ์–ด ๋ชจ๋ธ
28
+
29
+ | ๋ชจ๋ธ | ํฌ๊ธฐ | ํŠน์ง• | ์ƒํƒœ |
30
+ |------|------|------|------|
31
+ | **EXAONE 3.5 7.8B** | 7.3GB | โญ ํŒŒ๋ผ๋ฏธํ„ฐ ๋Œ€๋น„ ์ตœ๊ณ  ํšจ์œจ | Public |
32
+ | **EXAONE 3.5 2.4B** | 2.2GB | โšก ์ดˆ๊ฒฝ๋Ÿ‰, ๋น ๋ฅธ ์‘๋‹ต | Public |
33
+ | **Llama-3 Open-Ko 8B** | 7.5GB | ๐Ÿ”ฅ Llama 3 ์ƒํƒœ๊ณ„ | Public |
34
+
35
+ ### ๐Ÿ“š ์ „์ฒด ๋ชจ๋ธ ๋ชฉ๋ก
36
+
37
+ #### Public ๋ชจ๋ธ (10๊ฐœ)
38
+ 1. LGAI-EXAONE/EXAONE-3.5-7.8B-Instruct
39
+ 2. LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct
40
+ 3. beomi/Llama-3-Open-Ko-8B
41
+ 4. Qwen/Qwen2.5-7B-Instruct
42
+ 5. Qwen/Qwen2.5-14B-Instruct
43
+ 6. 01-ai/Yi-1.5-9B-Chat
44
+ 7. 01-ai/Yi-1.5-34B-Chat
45
+ 8. mistralai/Mistral-7B-Instruct-v0.3
46
+ 9. upstage/SOLAR-10.7B-Instruct-v1.0
47
+ 10. EleutherAI/polyglot-ko-5.8b
48
+
49
+ #### Gated ๋ชจ๋ธ (3๊ฐœ) ๐Ÿ”’
50
+ 11. meta-llama/Llama-3.1-8B-Instruct
51
+ 12. meta-llama/Llama-3.1-70B-Instruct
52
+ 13. CohereForAI/aya-23-8B
53
+
54
+ > **์ฐธ๊ณ **: Gated ๋ชจ๋ธ์€ Hugging Face์—์„œ ๋ณ„๋„ ์Šน์ธ ํ•„์š”
55
 
56
  ## ๐Ÿš€ ํ•˜๋“œ์›จ์–ด ์˜ต์…˜
57
 
58
  ### Option 1: ZeroGPU (์ถ”์ฒœ)
59
 
60
  **์žฅ์ **:
61
+ - โšก ๋น ๋ฅธ ์‘๋‹ต (3-10์ดˆ)
62
  - ๐Ÿ’ฐ ์ €๋ ดํ•œ ๋น„์šฉ ($9/month)
63
  - ๐Ÿ”‹ ์ž๋™ GPU ํ• ๋‹น/ํ•ด์ œ
64
 
 
76
  - ๐Ÿ”ง ๊ฐ„๋‹จํ•œ ์„ค์ •
77
 
78
  **์ œ์•ฝ**:
79
+ - ๐Ÿข ๋А๋ฆฐ ์‘๋‹ต (15์ดˆ~2๋ถ„)
80
  - ๐Ÿ’ต ์ƒ๋Œ€์ ์œผ๋กœ ๋น„์‹ผ ๋น„์šฉ
81
 
82
  **๋น„์šฉ**: $0.03/hour (์›” ์•ฝ $22)
 
105
 
106
  | ํ•ญ๋ชฉ | ZeroGPU | CPU Upgrade |
107
  |------|---------|-------------|
108
+ | **์ฒซ ์‘๋‹ต** | 10-20์ดˆ | 1-3๋ถ„ |
109
+ | **์ดํ›„ ์‘๋‹ต** | 3-10์ดˆ | 15์ดˆ~2๋ถ„ |
110
  | **์ผ์ผ ํ•œ๋„** | 25๋ถ„ | ๋ฌด์ œํ•œ |
111
  | **์›” ๋น„์šฉ** | $9 | $22 |
112
  | **GPU** | H200 (70GB) | ์—†์Œ |
 
127
  # ์กฐ๊ฑด๋ถ€ decorator ์ ์šฉ
128
  if ZEROGPU_AVAILABLE:
129
  @spaces.GPU(duration=120)
130
+ def generate_response(messages):
131
+ return generate_response_impl(messages)
132
  else:
133
+ def generate_response(messages):
134
+ return generate_response_impl(messages)
135
  ```
136
 
137
+ ### Lazy Loading ์‹œ์Šคํ…œ
138
+
139
+ - ์„ ํƒํ•œ ๋ชจ๋ธ๋งŒ ๋ฉ”๋ชจ๋ฆฌ์— ๋กœ๋“œ
140
+ - ๋ชจ๋ธ ์ „ํ™˜ ์‹œ ์ด์ „ ๋ชจ๋ธ ์ž๋™ ์–ธ๋กœ๋“œ
141
+ - ์บ์‹œ ํ™•์ธ์œผ๋กœ ์žฌ๋‹ค์šด๋กœ๋“œ ๋ฐฉ์ง€
142
+ - ๋””์Šคํฌ์—์„œ ๋น ๋ฅธ ๋กœ๋”ฉ (์บ์‹œ๋œ ๊ฒฝ์šฐ)
143
 
144
+ ### ์บ์‹œ ๊ด€๋ฆฌ
145
+
146
+ ```python
147
+ def check_model_cached(model_name):
148
+ """Check if model is already downloaded in HF cache"""
149
+ from huggingface_hub import scan_cache_dir
150
+ cache_info = scan_cache_dir()
151
+
152
+ for repo in cache_info.repos:
153
+ if repo.repo_id == model_name:
154
+ return True
155
+ return False
156
+ ```
157
 
158
  ## ๐Ÿ“ ์‚ฌ์šฉ ๋ฐฉ๋ฒ•
159
 
160
  ### 1. Space ์ ‘์†
161
 
162
+ https://huggingface.co/spaces/catchitplay/simple-chatbot-gradio
163
 
164
+ ### 2. ๋ชจ๋ธ ์„ ํƒ
165
 
166
+ - ๋“œ๋กญ๋‹ค์šด์—์„œ ์›ํ•˜๋Š” ๋ชจ๋ธ ์„ ํƒ
167
+ - ์บ์‹œ ์ƒํƒœ ํ™•์ธ (๐Ÿ’พ ์บ์‹œ๋จ / ๐Ÿ“ฅ ๋‹ค์šด๋กœ๋“œ ํ•„์š”)
168
+ - ์ฒซ ์‚ฌ์šฉ ์‹œ ๋ชจ๋ธ ๋‹ค์šด๋กœ๋“œ (2-14GB, 5-20๋ถ„)
169
 
170
  ### 3. ๋Œ€ํ™” ์‹œ์ž‘
171
 
 
175
  ํ•œ๊ตญ์˜ ์ˆ˜๋„๋Š” ๏ฟฝ๏ฟฝ๋””์ธ๊ฐ€์š”?
176
  ```
177
 
178
+ ## ๐Ÿ’ก ๋ชจ๋ธ ์„ ํƒ ๊ฐ€์ด๋“œ
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
179
 
180
+ ### ๋น ๋ฅธ ์‘๋‹ต์ด ํ•„์š”ํ•œ ๊ฒฝ์šฐ
181
+ - **EXAONE 3.5 2.4B** โšก (2.2GB) - ๊ฐ€์žฅ ๋น ๋ฆ„
182
+ - **Mistral 7B** (7GB) - ๊ฒฝ๋Ÿ‰ ๋ชจ๋ธ
183
 
184
+ ### ํ’ˆ์งˆ ์ค‘์‹œ
185
+ - **EXAONE 3.5 7.8B** โญ (7.3GB) - ํšจ์œจ์„ฑ ์ตœ๊ณ 
186
+ - **Qwen2.5 14B** (14GB) - ๋‹ค๊ตญ์–ด ๊ฐ•์ 
187
+ - **SOLAR 10.7B** (10GB) - ํ•œ๊ตญ์–ด ํŠนํ™”
188
 
189
+ ### ์ตœ๊ณ  ์„ฑ๋Šฅ (๋А๋ฆผ)
190
+ - **Llama 3.1 70B** ๐Ÿ”’ (70GB) - ์ตœ๊ณ  ํ’ˆ์งˆ
191
+ - **Yi 1.5 34B** (34GB) - ๊ธด ๋ฌธ๋งฅ
192
 
193
+ ### Llama ์ƒํƒœ๊ณ„
194
+ - **Llama-3 Open-Ko 8B** ๐Ÿ”ฅ (7.5GB)
195
+ - **Llama 3.1 8B** ๐Ÿ”’ (8GB)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
196
 
197
  ## ๐Ÿ“ฆ ๋กœ์ปฌ ์‹คํ–‰
198
 
199
+ ### ์„ค์น˜
200
+
201
  ```bash
202
  # ์ €์žฅ์†Œ ํด๋ก 
203
+ git clone https://github.com/catchitplay/simple-chatbot-gradio.git
204
  cd simple-chatbot-gradio
205
 
206
+ # ๊ฐ€์ƒํ™˜๊ฒฝ ์ƒ์„ฑ (๊ถŒ์žฅ)
207
+ python -m venv venv
208
+ source venv/bin/activate # Windows: venv\Scripts\activate
209
+
210
  # ์˜์กด์„ฑ ์„ค์น˜
211
  pip install -r requirements.txt
212
+ ```
213
+
214
+ ### .env ํŒŒ์ผ ์„ค์ •
215
+
216
+ ```bash
217
+ # .env ํŒŒ์ผ ์ƒ์„ฑ
218
+ echo "HF_TOKEN=your_hugging_face_token" > .env
219
+ ```
220
+
221
+ **HF_TOKEN ๋ฐœ๊ธ‰ ๋ฐฉ๋ฒ•**:
222
+ 1. https://huggingface.co/settings/tokens ์ ‘์†
223
+ 2. "New token" ํด๋ฆญ
224
+ 3. "Read" ๊ถŒํ•œ ์„ ํƒ
225
+ 4. ์ƒ์„ฑ๋œ ํ† ํฐ ๋ณต์‚ฌ
226
 
227
+ ### ์‹คํ–‰
 
228
 
229
+ ```bash
230
  python app.py
231
  ```
232
 
233
+ ๋ธŒ๋ผ์šฐ์ €์—์„œ http://localhost:7860 ์ ‘์†
234
+
235
+ **์ฐธ๊ณ **:
236
+ - ๋กœ์ปฌ์€ CPU/GPU ์ž๋™ ๊ฐ์ง€
237
+ - GPU ๊ถŒ์žฅ (CUDA ํ•„์š”)
238
+ - ์ฒซ ์‹คํ–‰ ์‹œ ๋ชจ๋ธ ๋‹ค์šด๋กœ๋“œ (์‹œ๊ฐ„ ์†Œ์š”)
239
 
240
  ## ๐Ÿ› ๏ธ ๊ธฐ์ˆ  ์Šคํƒ
241
 
242
+ - **ํ”„๋ ˆ์ž„์›Œํฌ**: Gradio 5.49.1
243
+ - **ML ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ**: Transformers 4.57.1, PyTorch 2.9.0
244
  - **GPU ์ธํ”„๋ผ**: Hugging Face ZeroGPU (์„ ํƒ์ )
245
  - **์–ธ์–ด**: Python 3.10+
246
 
247
  ## ๐Ÿ“š Dependencies
248
 
249
  ```txt
250
+ gradio==5.49.1
251
+ transformers==4.57.1
252
+ torch==2.9.0
253
+ safetensors==0.6.2
254
+ sentencepiece==0.2.0
255
+ protobuf==4.25.1
256
+ python-dotenv==1.0.0
257
  ```
258
 
259
+ ## ๐Ÿ”’ Gated ๋ชจ๋ธ ์‚ฌ์šฉ๋ฒ•
260
+
261
+ ### 1. ๋ชจ๋ธ ์Šน์ธ ์š”์ฒญ
262
+
263
+ ๊ฐ Gated ๋ชจ๋ธ ํŽ˜์ด์ง€์—์„œ "Request Access" ํด๋ฆญ:
264
+ - https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct
265
+ - https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct
266
+ - https://huggingface.co/CohereForAI/aya-23-8B
267
+
268
+ ### 2. HF_TOKEN ์„ค์ •
269
+
270
+ ์Šน์ธ ํ›„ HF_TOKEN์„ .env ํŒŒ์ผ์— ์„ค์ • (์œ„ ์ฐธ์กฐ)
271
+
272
+ ### 3. Space Secrets ์„ค์ • (HF Spaces)
273
+
274
+ Space Settings โ†’ Repository secrets:
275
+ - Name: `HF_TOKEN`
276
+ - Value: `your_token_here`
277
+
278
+ ## โš ๏ธ ์ œํ•œ์‚ฌํ•ญ
279
+
280
+ ### ๊ณตํ†ต
281
+ - **๋ชจ๋ธ ํฌ๊ธฐ**: 2-70GB (๋กœ๋”ฉ ์‹œ๊ฐ„ ํ•„์š”)
282
+ - **์ปจํ…์ŠคํŠธ**: ๋Œ€ํ™” ํžˆ์Šคํ† ๋ฆฌ ์œ ์ง€
283
+ - **๋ฉ”๋ชจ๋ฆฌ**: ํฐ ๋ชจ๋ธ์€ GPU/๊ณ ์šฉ๋Ÿ‰ RAM ํ•„์š”
284
+
285
+ ### ZeroGPU ์ „์šฉ
286
+ - **์ผ์ผ ํ•œ๋„**: 25๋ถ„ (PRO ๊ตฌ๋…)
287
+ - **๋Œ€๊ธฐ์—ด**: ์‚ฌ์šฉ์ž ๋งŽ์„ ๊ฒฝ์šฐ ๋Œ€๊ธฐ
288
+ - **PRO ํ•„์š”**: $9/month ๊ตฌ๋… ํ•„์š”
289
+
290
+ ### CPU Upgrade ์ „์šฉ
291
+ - **๋А๋ฆฐ ์†๋„**: GPU ๋Œ€๋น„ 10-30๋ฐฐ ๋А๋ฆผ
292
+ - **๋น„์šฉ**: ์‹œ๊ฐ„๋‹น $0.03 ($22/month)
293
+ - **๋ฉ”๋ชจ๋ฆฌ ์ œ์•ฝ**: 32GB RAM (๋Œ€ํ˜• ๋ชจ๋ธ ์ œ์•ฝ)
294
+
295
  ## ๐Ÿ”— ๊ด€๋ จ ๋ฆฌ์†Œ์Šค
296
 
297
+ ### ๋ชจ๋ธ ์นด๋“œ
298
+ - [EXAONE 3.5](https://huggingface.co/LGAI-EXAONE)
299
+ - [Llama 3 Open-Ko](https://huggingface.co/beomi/Llama-3-Open-Ko-8B)
300
+ - [Qwen2.5](https://huggingface.co/Qwen)
301
+ - [SOLAR](https://huggingface.co/upstage/SOLAR-10.7B-Instruct-v1.0)
302
+
303
+ ### ๋ฌธ์„œ
304
  - [ZeroGPU Documentation](https://huggingface.co/docs/hub/spaces-zerogpu)
305
  - [Gradio Documentation](https://www.gradio.app/docs)
306
+ - [HF Spaces Config](https://huggingface.co/docs/hub/spaces-config-reference)
307
  - [HF Spaces Pricing](https://huggingface.co/pricing)
308
 
309
  ## ๐Ÿ“„ ๋ผ์ด์„ ์Šค
 
316
 
317
  ---
318
 
319
+ **๐Ÿ’ก TIP**:
320
+ - ๋น ๋ฅธ ํ…Œ์ŠคํŠธ: EXAONE 2.4B โšก
321
+ - ๊ท ํ˜•์žกํžŒ ์„ฑ๋Šฅ: EXAONE 7.8B โญ
322
+ - ์ตœ๊ณ  ํ’ˆ์งˆ: Llama 3.1 70B ๐Ÿ”’ (๋А๋ฆผ)