alex4cip Claude commited on
Commit
5c29192
ยท
1 Parent(s): 6ae6708

feat: Optimize for Hugging Face Spaces deployment

Browse files

**Hugging Face Spaces Compatibility:**
- Remove python-dotenv dependency (Spaces uses Secrets system)
- Update HF_TOKEN handling to work with both local and Spaces environments
- Add memory warning for KoAlpaca 5.8B model (8GB+ RAM required)
- Display warning message in UI when selecting high-memory models

**User Experience Improvements:**
- Automatic warning display for resource-intensive models
- Clear indication that KoAlpaca may not work on free tier
- Updated README with Spaces-specific deployment details

**Technical Changes:**
- Changed HF_TOKEN from dotenv to direct os.getenv() with None fallback
- Added model warning system in MODELS configuration
- Implemented dynamic warning UI component
- Updated model dropdown to show memory requirements

**Documentation Updates:**
- Clarified HF_TOKEN is optional for public models
- Added Spaces free tier limitations (16GB RAM, 48h sleep)
- Recommended stable models for free tier deployment
- Updated deployment instructions with build time estimates

This ensures the chatbot works seamlessly on both local development
and Hugging Face Spaces free tier without configuration changes.

๐Ÿค– Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Files changed (3) hide show
  1. README.md +13 -10
  2. app.py +20 -9
  3. requirements.txt +0 -1
README.md CHANGED
@@ -48,19 +48,20 @@ cd simple-chatbot-gradio
48
  pip install -r requirements.txt
49
  ```
50
 
51
- ### 3. ํ™˜๊ฒฝ ๋ณ€์ˆ˜ ์„ค์ •
52
 
53
- `.env` ํŒŒ์ผ์„ ์ƒ์„ฑํ•˜๊ณ  Hugging Face ํ† ํฐ์„ ์ถ”๊ฐ€ํ•˜์„ธ์š”:
54
 
55
- ```
56
- HF_TOKEN=your_hugging_face_token_here
 
 
57
  ```
58
 
59
  **Hugging Face ํ† ํฐ ๋ฐœ๊ธ‰ ๋ฐฉ๋ฒ•:**
60
  1. [Hugging Face](https://huggingface.co)์— ๋กœ๊ทธ์ธ
61
  2. Settings โ†’ Access Tokens ๋ฉ”๋‰ด๋กœ ์ด๋™
62
  3. "New token" ํด๋ฆญํ•˜์—ฌ ํ† ํฐ ์ƒ์„ฑ
63
- 4. ์ƒ์„ฑ๋œ ํ† ํฐ์„ `.env` ํŒŒ์ผ์— ๋ณต์‚ฌ
64
 
65
  ### 4. ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ์‹คํ–‰
66
 
@@ -81,8 +82,8 @@ python app.py
81
  - `app.py`
82
  - `requirements.txt`
83
  - `README.md`
84
- 5. Settings โ†’ Repository secrets์—์„œ `HF_TOKEN` ์ถ”๊ฐ€
85
- 6. ์ž๋™ ๋นŒ๋“œ ๋ฐ ๋ฐฐํฌ ๋Œ€๊ธฐ
86
 
87
  ### ๋ฐฉ๋ฒ• 2: Git ์‚ฌ์šฉ
88
 
@@ -131,9 +132,11 @@ simple-chatbot-gradio/
131
  - **KoAlpaca 5.8B**: 8GB+ RAM ํ•„์š”, CPU์—์„œ ๋งค์šฐ ๋А๋ฆผ
132
 
133
  ### Hugging Face Spaces ๋ฐฐํฌ
134
- - **๋ฌด๋ฃŒ tier**: CPU ์ธ์Šคํ„ด์Šค๋งŒ ์ œ๊ณต
135
- - **Space Sleep**: ๋น„ํ™œ์„ฑ ์‹œ ์ž๋™ sleep, ์ฒซ ๋กœ๋”ฉ ๋А๋ฆผ
136
- - **๋””์Šคํฌ ์ œํ•œ**: KoAlpaca ๊ฐ™์€ ํฐ ๋ชจ๋ธ์€ ๋ฐฐํฌ ๋ถˆ๊ฐ€๋Šฅํ•  ์ˆ˜ ์žˆ์Œ
 
 
137
 
138
  ## ๐Ÿ”ง ๊ฐœ๋ฐœ ๋ฐ ์ปค์Šคํ„ฐ๋งˆ์ด์ง•
139
 
 
48
  pip install -r requirements.txt
49
  ```
50
 
51
+ ### 3. ํ™˜๊ฒฝ ๋ณ€์ˆ˜ ์„ค์ • (์„ ํƒ์‚ฌํ•ญ)
52
 
53
+ Public ๋ชจ๋ธ๋งŒ ์‚ฌ์šฉํ•˜๋Š” ๊ฒฝ์šฐ ์ด ๋‹จ๊ณ„๋ฅผ ๊ฑด๋„ˆ๋›ธ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
54
 
55
+ Private ๋ชจ๋ธ ์ ‘๊ทผ์ด ํ•„์š”ํ•œ ๊ฒฝ์šฐ, ํ™˜๊ฒฝ ๋ณ€์ˆ˜๋กœ HF_TOKEN์„ ์„ค์ •ํ•˜์„ธ์š”:
56
+
57
+ ```bash
58
+ export HF_TOKEN=your_hugging_face_token_here
59
  ```
60
 
61
  **Hugging Face ํ† ํฐ ๋ฐœ๊ธ‰ ๋ฐฉ๋ฒ•:**
62
  1. [Hugging Face](https://huggingface.co)์— ๋กœ๊ทธ์ธ
63
  2. Settings โ†’ Access Tokens ๋ฉ”๋‰ด๋กœ ์ด๋™
64
  3. "New token" ํด๋ฆญํ•˜์—ฌ ํ† ํฐ ์ƒ์„ฑ
 
65
 
66
  ### 4. ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜ ์‹คํ–‰
67
 
 
82
  - `app.py`
83
  - `requirements.txt`
84
  - `README.md`
85
+ 5. (์„ ํƒ์‚ฌํ•ญ) Private ๋ชจ๋ธ ์‚ฌ์šฉ ์‹œ: Settings โ†’ Repository secrets์—์„œ `HF_TOKEN` ์ถ”๊ฐ€
86
+ 6. ์ž๋™ ๋นŒ๋“œ ๋ฐ ๋ฐฐํฌ ๋Œ€๊ธฐ (์ฒซ ๋นŒ๋“œ๋Š” 5-10๋ถ„ ์†Œ์š”)
87
 
88
  ### ๋ฐฉ๋ฒ• 2: Git ์‚ฌ์šฉ
89
 
 
132
  - **KoAlpaca 5.8B**: 8GB+ RAM ํ•„์š”, CPU์—์„œ ๋งค์šฐ ๋А๋ฆผ
133
 
134
  ### Hugging Face Spaces ๋ฐฐํฌ
135
+ - **๋ฌด๋ฃŒ tier**: CPU ์ธ์Šคํ„ด์Šค๋งŒ ์ œ๊ณต (16GB RAM)
136
+ - **Space Sleep**: 48์‹œ๊ฐ„ ๋น„ํ™œ์„ฑ ์‹œ ์ž๋™ sleep, ์ฒซ ๋กœ๋”ฉ ๋А๋ฆผ
137
+ - **๋ฉ”๋ชจ๋ฆฌ ์ œํ•œ**: KoAlpaca 5.8B๋Š” ๋ฌด๋ฃŒ tier์—์„œ ์‹คํ–‰ ๋ถˆ๊ฐ€ (8GB+ ํ•„์š”)
138
+ - **์ฒซ ์‹คํ–‰**: ๋ชจ๋ธ ๋‹ค์šด๋กœ๋“œ๋กœ 1-3๋ถ„ ์†Œ์š”
139
+ - **๊ถŒ์žฅ ๋ชจ๋ธ**: DialoGPT Small/Medium, GPT-2, KoGPT-2 (๋ฌด๋ฃŒ tier์—์„œ ์•ˆ์ •์ )
140
 
141
  ## ๐Ÿ”ง ๊ฐœ๋ฐœ ๋ฐ ์ปค์Šคํ„ฐ๋งˆ์ด์ง•
142
 
app.py CHANGED
@@ -7,11 +7,9 @@ import os
7
  import gradio as gr
8
  from transformers import AutoModelForCausalLM, AutoTokenizer
9
  import torch
10
- from dotenv import load_dotenv
11
 
12
- # Load environment variables
13
- load_dotenv()
14
- HF_TOKEN = os.getenv("HF_TOKEN")
15
 
16
  # Check device
17
  device = "cuda" if torch.cuda.is_available() else "cpu"
@@ -40,9 +38,10 @@ MODELS = {
40
  "language": "ko",
41
  },
42
  "beomi/KoAlpaca-Polyglot-5.8B": {
43
- "name": "KoAlpaca 5.8B (ํ•œ๊ธ€ ๋Œ€ํ™”ํ˜•, ๋А๋ฆผ)",
44
  "max_length": 150,
45
  "language": "ko",
 
46
  },
47
  }
48
 
@@ -205,6 +204,9 @@ with gr.Blocks(
205
  info="๋ชจ๋ธ์„ ๋ณ€๊ฒฝํ•˜๋ฉด ์ƒˆ ๋ชจ๋ธ์„ ๋‹ค์šด๋กœ๋“œํ•ฉ๋‹ˆ๋‹ค (์ฒ˜์Œ ํ•œ ๋ฒˆ๋งŒ)",
206
  )
207
 
 
 
 
208
  # Chat interface
209
  chatbot = gr.ChatInterface(
210
  fn=chat_response,
@@ -228,19 +230,28 @@ with gr.Blocks(
228
  ],
229
  )
230
 
231
- # Clear chat when model changes
232
  def on_model_change(new_model):
233
  global current_model
234
  current_model = new_model
 
 
 
 
 
 
 
 
235
  # Preload new model
236
  load_model(new_model)
237
- # Return empty list to clear chat history
238
- return []
 
239
 
240
  model_dropdown.change(
241
  fn=on_model_change,
242
  inputs=[model_dropdown],
243
- outputs=[chatbot.chatbot_value],
244
  )
245
 
246
  gr.Markdown(
 
7
  import gradio as gr
8
  from transformers import AutoModelForCausalLM, AutoTokenizer
9
  import torch
 
10
 
11
+ # Get HF token from environment (Spaces uses Secrets, local uses .env)
12
+ HF_TOKEN = os.getenv("HF_TOKEN", None)
 
13
 
14
  # Check device
15
  device = "cuda" if torch.cuda.is_available() else "cpu"
 
38
  "language": "ko",
39
  },
40
  "beomi/KoAlpaca-Polyglot-5.8B": {
41
+ "name": "KoAlpaca 5.8B (ํ•œ๊ธ€ ๋Œ€ํ™”ํ˜•, โš ๏ธ 8GB+ RAM ํ•„์š”)",
42
  "max_length": 150,
43
  "language": "ko",
44
+ "warning": "์ด ๋ชจ๋ธ์€ 8GB ์ด์ƒ์˜ ๋ฉ”๋ชจ๋ฆฌ๊ฐ€ ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค. HF Spaces ๋ฌด๋ฃŒ tier์—์„œ๋Š” ๋ฉ”๋ชจ๋ฆฌ ๋ถ€์กฑ์œผ๋กœ ์‹คํ–‰๋˜์ง€ ์•Š์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.",
45
  },
46
  }
47
 
 
204
  info="๋ชจ๋ธ์„ ๋ณ€๊ฒฝํ•˜๋ฉด ์ƒˆ ๋ชจ๋ธ์„ ๋‹ค์šด๋กœ๋“œํ•ฉ๋‹ˆ๋‹ค (์ฒ˜์Œ ํ•œ ๋ฒˆ๋งŒ)",
205
  )
206
 
207
+ # Warning message for model requirements
208
+ model_warning = gr.Markdown("", visible=False)
209
+
210
  # Chat interface
211
  chatbot = gr.ChatInterface(
212
  fn=chat_response,
 
230
  ],
231
  )
232
 
233
+ # Show warning and clear chat when model changes
234
  def on_model_change(new_model):
235
  global current_model
236
  current_model = new_model
237
+
238
+ # Check if model has warning
239
+ warning_text = ""
240
+ warning_visible = False
241
+ if "warning" in MODELS[new_model]:
242
+ warning_text = f"โš ๏ธ **๊ฒฝ๊ณ **: {MODELS[new_model]['warning']}"
243
+ warning_visible = True
244
+
245
  # Preload new model
246
  load_model(new_model)
247
+
248
+ # Return: empty chat history, warning text, warning visibility
249
+ return [], warning_text, gr.update(visible=warning_visible)
250
 
251
  model_dropdown.change(
252
  fn=on_model_change,
253
  inputs=[model_dropdown],
254
+ outputs=[chatbot.chatbot_value, model_warning, model_warning],
255
  )
256
 
257
  gr.Markdown(
requirements.txt CHANGED
@@ -1,4 +1,3 @@
1
  gradio>=5.0.0
2
  transformers>=4.30.0
3
  torch>=2.0.0
4
- python-dotenv>=1.0.0
 
1
  gradio>=5.0.0
2
  transformers>=4.30.0
3
  torch>=2.0.0