feat: Optimize for Hugging Face Spaces deployment
**Hugging Face Spaces Compatibility:**
- Remove python-dotenv dependency (Spaces uses Secrets system)
- Update HF_TOKEN handling to work with both local and Spaces environments
- Add memory warning for KoAlpaca 5.8B model (8GB+ RAM required)
- Display warning message in UI when selecting high-memory models
**User Experience Improvements:**
- Automatic warning display for resource-intensive models
- Clear indication that KoAlpaca may not work on free tier
- Updated README with Spaces-specific deployment details
**Technical Changes:**
- Changed HF_TOKEN from dotenv to direct os.getenv() with None fallback
- Added model warning system in MODELS configuration
- Implemented dynamic warning UI component
- Updated model dropdown to show memory requirements
**Documentation Updates:**
- Clarified HF_TOKEN is optional for public models
- Added Spaces free tier limitations (16GB RAM, 48h sleep)
- Recommended stable models for free tier deployment
- Updated deployment instructions with build time estimates
This ensures the chatbot works seamlessly on both local development
and the Hugging Face Spaces free tier without configuration changes.
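The warning system described above reduces to an optional `"warning"` key in the per-model configuration plus a lookup when the dropdown changes. A simplified, self-contained sketch (model IDs mirror the diff; the `warning_for` helper is hypothetical):

```python
# Per-model configuration; only resource-heavy entries carry a "warning" key.
MODELS = {
    "microsoft/DialoGPT-medium": {"name": "DialoGPT Medium", "max_length": 100},
    "beomi/KoAlpaca-Polyglot-5.8B": {
        "name": "KoAlpaca 5.8B",
        "max_length": 150,
        "warning": "Requires 8GB+ RAM; may not run on the Spaces free tier.",
    },
}

def warning_for(model_id):
    """Return (markdown_text, visible) for the warning banner."""
    cfg = MODELS[model_id]
    if "warning" in cfg:
        return f"⚠️ **Warning**: {cfg['warning']}", True
    return "", False
```

In the app, the visibility flag is fed to `gr.update(visible=...)` so the `gr.Markdown` banner only renders for flagged models.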
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- README.md +13 -10
- app.py +20 -9
- requirements.txt +0 -1
--- a/README.md
+++ b/README.md
@@ -48,19 +48,20 @@ cd simple-chatbot-gradio
 pip install -r requirements.txt
 ```
 
-### 3. Environment Variable Setup
+### 3. Environment Variable Setup (Optional)
 
-…
+If you only use public models, you can skip this step.
 
-…
-…
+If you need access to private models, set HF_TOKEN as an environment variable:
+
+```bash
+export HF_TOKEN=your_hugging_face_token_here
 ```
 
 **How to get a Hugging Face token:**
 1. Log in to [Hugging Face](https://huggingface.co)
 2. Go to Settings → Access Tokens
 3. Click "New token" to create a token
-4. Copy the generated token into the `.env` file
 
 ### 4. Run the Application
 
@@ -81,8 +82,8 @@ python app.py
 - `app.py`
 - `requirements.txt`
 - `README.md`
-5. Add `HF_TOKEN` under Settings → Repository secrets
-6. Wait for the automatic build and deployment
+5. (Optional) For private models: add `HF_TOKEN` under Settings → Repository secrets
+6. Wait for the automatic build and deployment (first build takes 5-10 minutes)
 
 ### Method 2: Using Git
 
@@ -131,9 +132,11 @@ simple-chatbot-gradio/
 - **KoAlpaca 5.8B**: requires 8GB+ RAM, very slow on CPU
 
 ### Hugging Face Spaces Deployment
-- **Free tier**: CPU instances only
-- **Space Sleep**: sleeps automatically when inactive, first load is slow
-…
+- **Free tier**: CPU instances only (16GB RAM)
+- **Space Sleep**: sleeps after 48 hours of inactivity, first load is slow
+- **Memory limit**: KoAlpaca 5.8B may not run on the free tier (8GB+ required)
+- **First run**: model download takes 1-3 minutes
+- **Recommended models**: DialoGPT Small/Medium, GPT-2, KoGPT-2 (stable on the free tier)
 
 ## 🔧 Development and Customization
 
--- a/app.py
+++ b/app.py
@@ -7,11 +7,9 @@ import os
 import gradio as gr
 from transformers import AutoModelForCausalLM, AutoTokenizer
 import torch
-from dotenv import load_dotenv
 
-# …
-…
-HF_TOKEN = os.getenv("HF_TOKEN")
+# Get HF token from environment (Spaces uses Secrets, local uses .env)
+HF_TOKEN = os.getenv("HF_TOKEN", None)
 
 # Check device
 device = "cuda" if torch.cuda.is_available() else "cpu"
@@ -40,9 +38,10 @@ MODELS = {
         "language": "ko",
     },
     "beomi/KoAlpaca-Polyglot-5.8B": {
-        "name": "KoAlpaca 5.8B (Korean conversational, …
+        "name": "KoAlpaca 5.8B (Korean conversational, ⚠️ requires 8GB+ RAM)",
         "max_length": 150,
         "language": "ko",
+        "warning": "This model requires 8GB or more of memory. It may fail to run on the HF Spaces free tier due to insufficient memory.",
     },
 }
 
@@ -205,6 +204,9 @@ with gr.Blocks(
         info="Changing the model downloads the new model (first time only)",
     )
 
+    # Warning message for model requirements
+    model_warning = gr.Markdown("", visible=False)
+
     # Chat interface
     chatbot = gr.ChatInterface(
         fn=chat_response,
@@ -228,19 +230,28 @@
         ],
     )
 
-    # …
+    # Show warning and clear chat when model changes
     def on_model_change(new_model):
         global current_model
         current_model = new_model
+
+        # Check if model has warning
+        warning_text = ""
+        warning_visible = False
+        if "warning" in MODELS[new_model]:
+            warning_text = f"⚠️ **Warning**: {MODELS[new_model]['warning']}"
+            warning_visible = True
+
         # Preload new model
         load_model(new_model)
-…
-…
+
+        # Return: empty chat history, warning text, warning visibility
+        return [], warning_text, gr.update(visible=warning_visible)
 
     model_dropdown.change(
         fn=on_model_change,
         inputs=[model_dropdown],
-        outputs=[chatbot.chatbot_value],
+        outputs=[chatbot.chatbot_value, model_warning, model_warning],
     )
 
     gr.Markdown(
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,4 +1,3 @@
 gradio>=5.0.0
 transformers>=4.30.0
 torch>=2.0.0
-python-dotenv>=1.0.0
|