feat: Optimize for Hugging Face Spaces deployment
**Hugging Face Spaces Compatibility:**
- Remove python-dotenv dependency (Spaces uses Secrets system)
- Update HF_TOKEN handling to work with both local and Spaces environments
- Add memory warning for KoAlpaca 5.8B model (8GB+ RAM required)
- Display warning message in UI when selecting high-memory models
**User Experience Improvements:**
- Automatic warning display for resource-intensive models
- Clear indication that KoAlpaca may not work on free tier
- Updated README with Spaces-specific deployment details
**Technical Changes:**
- Changed HF_TOKEN from dotenv to direct os.getenv() with None fallback
- Added model warning system in MODELS configuration
- Implemented dynamic warning UI component
- Updated model dropdown to show memory requirements
**Documentation Updates:**
- Clarified HF_TOKEN is optional for public models
- Added Spaces free tier limitations (16GB RAM, 48h sleep)
- Recommended stable models for free tier deployment
- Updated deployment instructions with build time estimates
This ensures the chatbot works seamlessly on both local development
and the Hugging Face Spaces free tier without configuration changes.
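The warning system described above reduces to an optional `"warning"` key in the per-model configuration plus a lookup when the dropdown changes. A simplified, self-contained sketch (model IDs mirror the diff; the `warning_for` helper is hypothetical):

```python
# Per-model configuration; only resource-heavy entries carry a "warning" key.
MODELS = {
    "microsoft/DialoGPT-medium": {"name": "DialoGPT Medium", "max_length": 100},
    "beomi/KoAlpaca-Polyglot-5.8B": {
        "name": "KoAlpaca 5.8B",
        "max_length": 150,
        "warning": "Requires 8GB+ RAM; may not run on the Spaces free tier.",
    },
}

def warning_for(model_id):
    """Return (markdown_text, visible) for the warning banner."""
    cfg = MODELS[model_id]
    if "warning" in cfg:
        return f"⚠️ **Warning**: {cfg['warning']}", True
    return "", False
```

In the app, the visibility flag is fed to `gr.update(visible=...)` so the `gr.Markdown` banner only renders for flagged models.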
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- README.md +13 -10
- app.py +20 -9
- requirements.txt +0 -1
--- a/README.md
+++ b/README.md
@@ -48,19 +48,20 @@ cd simple-chatbot-gradio
 pip install -r requirements.txt
 ```
 
-### 3. Environment Variable Setup
+### 3. Environment Variable Setup (Optional)
 
-…
+If you only use public models, you can skip this step.
 
-…
-…
+If you need access to private models, set HF_TOKEN as an environment variable:
+
+```bash
+export HF_TOKEN=your_hugging_face_token_here
 ```
 
 **How to get a Hugging Face token:**
 1. Log in to [Hugging Face](https://huggingface.co)
 2. Go to Settings → Access Tokens
 3. Click "New token" to create a token
-4. Copy the generated token into the `.env` file
 
 ### 4. Run the Application
 
@@ -81,8 +82,8 @@ python app.py
 - `app.py`
 - `requirements.txt`
 - `README.md`
-5. Add `HF_TOKEN` under Settings → Repository secrets
-6. Wait for the automatic build and deployment
+5. (Optional) For private models: add `HF_TOKEN` under Settings → Repository secrets
+6. Wait for the automatic build and deployment (first build takes 5-10 minutes)
 
 ### Method 2: Using Git
 
@@ -131,9 +132,11 @@ simple-chatbot-gradio/
 - **KoAlpaca 5.8B**: requires 8GB+ RAM, very slow on CPU
 
 ### Hugging Face Spaces Deployment
-- **Free tier**: CPU instances only
-- **Space Sleep**: sleeps automatically when inactive, first load is slow
-…
+- **Free tier**: CPU instances only (16GB RAM)
+- **Space Sleep**: sleeps after 48 hours of inactivity, first load is slow
+- **Memory limit**: KoAlpaca 5.8B may not run on the free tier (8GB+ required)
+- **First run**: model download takes 1-3 minutes
+- **Recommended models**: DialoGPT Small/Medium, GPT-2, KoGPT-2 (stable on the free tier)
 
 ## 🔧 Development and Customization
 
--- a/app.py
+++ b/app.py
@@ -7,11 +7,9 @@ import os
 import gradio as gr
 from transformers import AutoModelForCausalLM, AutoTokenizer
 import torch
-from dotenv import load_dotenv
 
-# …
-…
-HF_TOKEN = os.getenv("HF_TOKEN")
+# Get HF token from environment (Spaces uses Secrets, local uses .env)
+HF_TOKEN = os.getenv("HF_TOKEN", None)
 
 # Check device
 device = "cuda" if torch.cuda.is_available() else "cpu"
@@ -40,9 +38,10 @@ MODELS = {
         "language": "ko",
     },
     "beomi/KoAlpaca-Polyglot-5.8B": {
-        "name": "KoAlpaca 5.8B (Korean conversational, …
+        "name": "KoAlpaca 5.8B (Korean conversational, ⚠️ requires 8GB+ RAM)",
         "max_length": 150,
         "language": "ko",
+        "warning": "This model requires 8GB or more of memory. It may fail to run on the HF Spaces free tier due to insufficient memory.",
     },
 }
 
@@ -205,6 +204,9 @@ with gr.Blocks(
         info="Changing the model downloads the new model (first time only)",
     )
 
+    # Warning message for model requirements
+    model_warning = gr.Markdown("", visible=False)
+
     # Chat interface
     chatbot = gr.ChatInterface(
         fn=chat_response,
@@ -228,19 +230,28 @@
         ],
     )
 
-    # …
+    # Show warning and clear chat when model changes
     def on_model_change(new_model):
         global current_model
         current_model = new_model
+
+        # Check if model has warning
+        warning_text = ""
+        warning_visible = False
+        if "warning" in MODELS[new_model]:
+            warning_text = f"⚠️ **Warning**: {MODELS[new_model]['warning']}"
+            warning_visible = True
+
         # Preload new model
         load_model(new_model)
-…
-…
+
+        # Return: empty chat history, warning text, warning visibility
+        return [], warning_text, gr.update(visible=warning_visible)
 
     model_dropdown.change(
         fn=on_model_change,
         inputs=[model_dropdown],
-        outputs=[chatbot.chatbot_value],
+        outputs=[chatbot.chatbot_value, model_warning, model_warning],
     )
 
     gr.Markdown(
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,4 +1,3 @@
 gradio>=5.0.0
 transformers>=4.30.0
 torch>=2.0.0
-python-dotenv>=1.0.0
|