Commit c9ef1fe
feat: Hugging Face LLM chatbot with multi-language support

- Implement local model execution using transformers
- Add 5 models: 3 English (DialoGPT, GPT-2) + 2 Korean (KoGPT-2, KoAlpaca)
- Support both English and Korean conversations
- No API rate limits, fully offline-capable after initial download
- Built with Gradio 5.x for web interface
Features:
- Multiple model selection with automatic chat reset
- Local model caching for improved performance
- Detailed error handling and user feedback
- Comprehensive documentation in README and CLAUDE.md
Technical stack:
- Gradio 5.x for web UI
- Transformers + PyTorch for model inference
- CPU/GPU support with automatic device detection
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- .claude/settings.local.json +13 -0
- .gitignore +46 -0
- CLAUDE.md +180 -0
- README.md +164 -0
- app.py +270 -0
- requirements.txt +4 -0
.claude/settings.local.json
ADDED
@@ -0,0 +1,13 @@
{
  "permissions": {
    "allow": [
      "Bash(python app.py)",
      "Bash(curl -s http://localhost:7860)",
      "Bash(curl -X POST \"https://api-inference.huggingface.co/models/gpt2\" )",
      "Bash(git init)",
      "Bash(git add .)"
    ],
    "deny": [],
    "ask": []
  }
}
.gitignore
ADDED
@@ -0,0 +1,46 @@
# Environment variables
.env
.env.local

# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# Virtual environments
venv/
env/
ENV/
.venv

# IDE
.vscode/
.idea/
*.swp
*.swo
*~

# Gradio
gradio_cached_examples/
flagged/

# OS
.DS_Store
Thumbs.db
CLAUDE.md
ADDED
@@ -0,0 +1,180 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Multi-model LLM chatbot using Hugging Face Inference API and Gradio. Users can select from multiple pre-configured models and have conversations with them. Model changes automatically reset the conversation.

## Tech Stack

- **Python**: 3.10+
- **Framework**: Gradio 5.x (ChatInterface + Blocks)
- **API**: Hugging Face Serverless Inference API (free tier)
- **Deployment**: Hugging Face Spaces (free CPU instance)

## Project Structure

```
├── app.py             # Main application
├── requirements.txt   # Python dependencies
├── README.md          # Spaces configuration + documentation
├── .env               # HF_TOKEN (git ignored)
└── CLAUDE.md          # This file
```

## Development Commands

### Local Development

```bash
# Install dependencies
pip install -r requirements.txt

# Run locally (requires HF_TOKEN in .env)
python app.py

# Access at http://localhost:7860
```

### Deployment to Hugging Face Spaces

**Method 1: Web UI**
1. Create Space at https://huggingface.co/spaces
2. Select Gradio SDK
3. Upload `app.py`, `requirements.txt`, `README.md`
4. Add `HF_TOKEN` to Settings → Repository secrets

**Method 2: Git Push**
```bash
git remote add space https://huggingface.co/spaces/<username>/<space-name>
git push space main
```

## Architecture

### Core Components

**`app.py` Structure**:
- `MODELS` dict: Model configurations (ID, display name, parameters)
- `chat_response()`: Main inference function handling multiple model types
- `on_model_change()`: Clears chat when model selection changes
- Gradio Blocks: UI composition with model dropdown + ChatInterface

**Model Handling Patterns**:
- **DialoGPT**: Text continuation with conversation history formatting
- **BlenderBot**: Conversational API with single-turn context
- **Flan-T5**: Instruction-based text generation with prompt engineering
- **Zephyr**: Chat completion API with message history formatting

**State Management**:
- Global `current_model` tracks selected model
- Model change triggers chat history reset via Gradio event handlers
- Each model type uses appropriate API method from `InferenceClient`

### API Integration

**Hugging Face InferenceClient Usage**:
```python
client = InferenceClient(token=HF_TOKEN)

# Different methods for different model types
client.text_generation()   # DialoGPT, Flan-T5
client.conversational()    # BlenderBot
client.chat_completion()   # Zephyr (chat models)
```

**Rate Limiting & Error Handling**:
- Free tier: ~100-300 requests/hour
- Graceful degradation with user-friendly error messages
- Timeout and rate limit detection in exception handling

## Environment Setup

**Required Environment Variable**:
```bash
HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```

**Obtaining HF_TOKEN**:
1. Log in to https://huggingface.co
2. Settings → Access Tokens
3. Create new token with "Read" permissions
4. Copy to `.env` file (local) or Space secrets (deployment)

## Adding New Models

1. **Add to MODELS dict** in [app.py:23-45](app.py#L23-L45):
```python
"model-org/model-name": {
    "name": "Display Name",
    "max_length": 512,
    "temperature": 0.7,
}
```

2. **Update chat_response()** if model requires special handling:
   - Check model name in conditional logic
   - Use appropriate InferenceClient method
   - Format prompt/messages according to model requirements

3. **Verify free tier compatibility**:
   - Test model availability via Inference API
   - Check rate limits and response times
   - Update README.md model list

## UI Customization

**Changing Language**:
- All UI strings are in Korean by default
- Modify markdown strings and button labels in [app.py:140-220](app.py#L140-L220)

**Theme & Styling**:
```python
gr.Blocks(theme=gr.themes.Soft())  # Change theme here
```

**Chat Examples**:
- Modify `examples` parameter in ChatInterface [app.py:187-192](app.py#L187-L192)

## Common Issues

**"Rate limit exceeded"**:
- Free tier limitation; wait ~1 hour or upgrade to PRO ($9/month)

**Model timeout/unavailable**:
- High demand on free tier; try a different model or retry later

**Space sleeping**:
- Spaces sleep after inactivity; first load may be slow

## Testing Locally

```bash
# Ensure .env exists with HF_TOKEN
python app.py

# Test each model:
# 1. Select model from dropdown
# 2. Send test message
# 3. Verify response generation
# 4. Change model and verify chat resets
```

## Deployment Notes

**README.md YAML Header**:
- Required for Spaces configuration
- Specifies SDK, Python version, app file
- Auto-detected by Hugging Face

**Environment Variables in Spaces**:
- Set via Settings → Repository secrets
- Name must match exactly: `HF_TOKEN`
- Never commit tokens to repository

**Free Tier Constraints**:
- CPU only (no GPU)
- Auto-sleep after inactivity
- Rate limits on API calls
- May experience slower inference
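The method routing CLAUDE.md describes under "API Integration" ("different methods for different model types") could be sketched as a small dispatch table. This is an illustrative sketch, not code from the commit: the substring keys, the `pick_method` helper, and the `text_generation` fallback are all assumptions.

```python
# Hypothetical routing table: InferenceClient method name keyed by a
# substring of the model ID, mirroring the mapping listed in CLAUDE.md.
METHOD_BY_MODEL = {
    "dialogpt": "text_generation",
    "flan-t5": "text_generation",
    "blenderbot": "conversational",
    "zephyr": "chat_completion",
}


def pick_method(model_id: str) -> str:
    """Return the InferenceClient method name assumed for a model ID."""
    lowered = model_id.lower()
    for key, method in METHOD_BY_MODEL.items():
        if key in lowered:
            return method
    return "text_generation"  # assumed default for unknown models


print(pick_method("HuggingFaceH4/zephyr-7b-beta"))  # → chat_completion
print(pick_method("microsoft/DialoGPT-small"))      # → text_generation
```

Keeping the routing in data rather than in `if`/`elif` chains would make step 2 of "Adding New Models" a one-line change for most models.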
README.md
ADDED
@@ -0,0 +1,164 @@
---
title: LLM Chatbot
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.9.1
app_file: app.py
pinned: false
license: mit
---

# 🤖 Hugging Face LLM Chatbot

A web-based chatbot application for conversing with a variety of open-source LLM models.

## ✨ Key Features

- **Multi-model support**: 5 models (3 English, 2 Korean)
- **Local execution**: models run locally via the Transformers library
- **No API limits**: works without an internet connection (after the first download)
- **Automatic session management**: the chat resets automatically when the model changes
- **Completely free**: no API costs, open source

## 🎯 Supported Models

### English models
1. **DialoGPT Small** - fast conversational model (~350MB)
2. **DialoGPT Medium** - higher-quality conversational model (~800MB)
3. **GPT-2** - general-purpose text generation model (~500MB)

### Korean models
4. **KoGPT-2** - SKT's Korean-specialized model (~500MB)
5. **KoAlpaca 5.8B** - conversational Korean model, high-spec hardware required (~12GB)

## 🚀 Running Locally

### 1. Clone the repository

```bash
git clone <repository-url>
cd simple-chatbot-gradio
```

### 2. Install dependencies

```bash
pip install -r requirements.txt
```

### 3. Set environment variables

Create a `.env` file and add your Hugging Face token:

```
HF_TOKEN=your_hugging_face_token_here
```

**How to get a Hugging Face token:**
1. Log in to [Hugging Face](https://huggingface.co)
2. Go to Settings → Access Tokens
3. Click "New token" to create a token
4. Copy the token into the `.env` file

### 4. Run the application

```bash
python app.py
```

Open `http://localhost:7860` in your browser.

## 🚀 Deploying to Hugging Face Spaces

### Method 1: Web UI

1. Go to [Hugging Face Spaces](https://huggingface.co/spaces)
2. Click "Create new Space"
3. Select "Gradio" as the SDK
4. Upload the files:
   - `app.py`
   - `requirements.txt`
   - `README.md`
5. Add `HF_TOKEN` under Settings → Repository secrets
6. Wait for the automatic build and deployment

### Method 2: Git

```bash
# Add the Hugging Face Space repository as a remote
git remote add space https://huggingface.co/spaces/<username>/<space-name>

# Push the files
git add .
git commit -m "Initial commit"
git push space main
```

## ⚙️ Tech Stack

- **Framework**: Gradio 5.x
- **ML libraries**: Transformers, PyTorch
- **Language**: Python 3.10+
- **Main libraries**:
  - `gradio` - web interface
  - `transformers` - model loading and inference
  - `torch` - deep learning framework
  - `python-dotenv` - environment variable management

## 📁 Project Structure

```
simple-chatbot-gradio/
├── app.py             # Main application
├── requirements.txt   # Python dependencies
├── README.md          # Project documentation
├── .env               # Environment variables (git ignored)
└── CLAUDE.md          # Development guide
```

## ⚠️ Limitations and Caveats

### Performance
- **CPU execution**: without a GPU, responses can be slow (5-10 seconds)
- **Memory**: 1-8GB RAM required depending on the model
- **First run**: downloading the model takes time (350MB~12GB)

### Per-model characteristics
- **English models**: awkward responses to Korean input
- **Korean models**: degraded performance on English input
- **KoAlpaca 5.8B**: needs 8GB+ RAM, very slow on CPU

### Hugging Face Spaces deployment
- **Free tier**: CPU instances only
- **Space sleep**: Spaces sleep when inactive; the first load is slow
- **Disk limits**: large models like KoAlpaca may not be deployable

## 🔧 Development and Customization

### Adding a model

Add the new model to the `MODELS` dictionary in `app.py`:

```python
MODELS = {
    "your-model-id": {
        "name": "Model display name",
        "max_length": 512,
        "temperature": 0.7,
    },
}
```

### UI customization

You can change the UI by editing the Gradio Blocks and ChatInterface. See the [Gradio documentation](https://www.gradio.app/docs) for details.

## 📄 License

MIT License

## 🙋 Support

For issues or questions, please open a GitHub issue.
app.py
ADDED
@@ -0,0 +1,270 @@
"""
Hugging Face LLM Chatbot with Gradio
Using the transformers library to run models locally
"""

import os
import gradio as gr
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
from dotenv import load_dotenv

# Load environment variables
load_dotenv()
HF_TOKEN = os.getenv("HF_TOKEN")

# Check device
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")

# Available models (optimized for local execution)
MODELS = {
    "microsoft/DialoGPT-small": {
        "name": "DialoGPT Small (English, fast)",
        "max_length": 80,
        "language": "en",
    },
    "microsoft/DialoGPT-medium": {
        "name": "DialoGPT Medium (English, higher quality)",
        "max_length": 100,
        "language": "en",
    },
    "gpt2": {
        "name": "GPT-2 (English, general purpose)",
        "max_length": 80,
        "language": "en",
    },
    "skt/kogpt2-base-v2": {
        "name": "KoGPT-2 (Korean)",
        "max_length": 100,
        "language": "ko",
    },
    "beomi/KoAlpaca-Polyglot-5.8B": {
        "name": "KoAlpaca 5.8B (Korean conversational, slow)",
        "max_length": 150,
        "language": "ko",
    },
}

# Model cache
loaded_models = {}
loaded_tokenizers = {}


def load_model(model_name):
    """Load model and tokenizer"""
    if model_name not in loaded_models:
        try:
            print(f"Loading model: {model_name}")

            # Load tokenizer
            tokenizer = AutoTokenizer.from_pretrained(
                model_name,
                token=HF_TOKEN,
                padding_side="left",
            )

            # Add pad token if missing
            if tokenizer.pad_token is None:
                tokenizer.pad_token = tokenizer.eos_token

            # Load model
            model = AutoModelForCausalLM.from_pretrained(
                model_name,
                token=HF_TOKEN,
                torch_dtype=torch.float32,
            )
            model.to(device)
            model.eval()

            loaded_models[model_name] = model
            loaded_tokenizers[model_name] = tokenizer

            print(f"✅ Model {model_name} loaded successfully")

        except Exception as e:
            print(f"❌ Failed to load model {model_name}: {e}")
            return None, None

    return loaded_models.get(model_name), loaded_tokenizers.get(model_name)


def chat_response(message, history, model_name):
    """
    Generate chatbot response

    Args:
        message: User input
        history: Chat history in Gradio format
        model_name: Selected model

    Returns:
        Response text
    """
    try:
        # Load model and tokenizer
        model, tokenizer = load_model(model_name)

        if model is None or tokenizer is None:
            return f"❌ Could not load model '{model_name}'. Please select a different model."

        model_config = MODELS[model_name]

        # Build conversation context
        conversation = ""
        for msg in history:
            if msg["role"] == "user":
                conversation += f"{msg['content']}\n"
            elif msg["role"] == "assistant":
                conversation += f"{msg['content']}\n"

        # Add current message
        conversation += f"{message}\n"

        # Tokenize
        inputs = tokenizer.encode(conversation, return_tensors="pt").to(device)

        # Generate response
        with torch.no_grad():
            outputs = model.generate(
                inputs,
                max_new_tokens=model_config["max_length"],
                temperature=0.9,
                do_sample=True,
                pad_token_id=tokenizer.pad_token_id,
                eos_token_id=tokenizer.eos_token_id,
            )

        # Decode response
        response = tokenizer.decode(outputs[0], skip_special_tokens=True)

        # Remove the input prompt from response
        response = response[len(conversation):].strip()

        # If empty, return a default message
        if not response:
            response = "I understand. Could you tell me more?"

        return response

    except Exception as e:
        import traceback

        error_msg = str(e)
        error_type = type(e).__name__

        print("=" * 50)
        print(f"Error Type: {error_type}")
        print(f"Error Message: {error_msg}")
        print(f"Traceback:\n{traceback.format_exc()}")
        print("=" * 50)

        if "out of memory" in error_msg.lower() or "oom" in error_msg.lower():
            return "❌ Out of memory. Select a smaller model or restart the app."
        elif "cuda" in error_msg.lower() and device == "cpu":
            return "⚠️ Running on CPU without a GPU; responses may be slow."
        else:
            return f"❌ Error: {error_type}\n{error_msg[:200]}\n\nCheck the terminal for the full log."


# Global state
current_model = "microsoft/DialoGPT-small"

# Preload default model
print("Preloading default model...")
load_model(current_model)

# Create Gradio interface
with gr.Blocks(
    title="🤖 Hugging Face Chatbot",
    theme=gr.themes.Soft(),
) as demo:
    gr.Markdown(
        """
        # 🤖 Hugging Face LLM Chatbot

        **Local model execution** - no API limits!

        **How to use:**
        1. Select a model (the first load takes time)
        2. Type a message and chat
        3. Responses may be slow because models run on CPU

        **Recommended models by language:**
        - 🇬🇧 English: DialoGPT, GPT-2
        - 🇰🇷 Korean: KoGPT-2, KoAlpaca (5.8B is a large model, slow)

        **Advantages:** no API limits, completely free, works offline
        """
    )

    # Model selector
    model_dropdown = gr.Dropdown(
        choices=[(config["name"], model_id) for model_id, config in MODELS.items()],
        value="microsoft/DialoGPT-small",
        label="🎯 Model",
        info="Changing the model downloads the new model (first time only)",
    )

    # Chat interface
    chatbot = gr.ChatInterface(
        fn=chat_response,
        type="messages",
        additional_inputs=[model_dropdown],
        chatbot=gr.Chatbot(
            height=500,
            placeholder="Type a message...",
            type="messages",
        ),
        textbox=gr.Textbox(
            placeholder="Type a message (English recommended)...",
            container=False,
            scale=7,
        ),
        examples=[
            ["Hello! How are you?", "microsoft/DialoGPT-small"],
            ["Tell me a joke", "microsoft/DialoGPT-medium"],
            ["안녕하세요! 오늘 날씨가 좋네요.", "skt/kogpt2-base-v2"],
            ["인공지능에 대해 설명해주세요.", "skt/kogpt2-base-v2"],
        ],
    )

    # Clear chat when model changes
    def on_model_change(new_model):
        global current_model
        current_model = new_model
        # Preload new model
        load_model(new_model)
        return None

    model_dropdown.change(
        fn=on_model_change,
        inputs=[model_dropdown],
        outputs=[chatbot.chatbot],
    )

    gr.Markdown(
        """
        ---

        **⚠️ Notes:**
        - Models run locally (downloaded on first run)
        - CPU execution is slower than GPU
        - Each model is optimized for a specific language

        **💾 Disk usage:**
        - DialoGPT-small: ~350MB
        - DialoGPT-medium: ~800MB
        - GPT-2: ~500MB
        - KoGPT-2: ~500MB
        - KoAlpaca-5.8B: ~12GB (large model, needs 8GB+ RAM)

        **💡 Tips:**
        - DialoGPT is recommended for English conversation
        - KoGPT-2 is recommended for Korean (use KoAlpaca only with sufficient resources)
        - Short sentences give better results
        - Once loaded, a model is not downloaded again
        """
    )

if __name__ == "__main__":
    demo.launch()
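The history flattening inside `chat_response()` (user and assistant turns joined one per line, then the new message appended with a trailing newline) can be isolated and checked without loading any model. The `build_prompt` helper name is introduced here for illustration; it is not part of the commit:

```python
def build_prompt(history, message):
    """Flatten Gradio 'messages'-format history into the plain-text prompt
    chat_response() feeds the tokenizer: one turn per line, ending with the
    new user message and a trailing newline."""
    conversation = ""
    for msg in history:
        if msg["role"] in ("user", "assistant"):
            conversation += f"{msg['content']}\n"
    return conversation + f"{message}\n"


history = [
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hi there."},
]
print(repr(build_prompt(history, "How are you?")))
# → 'Hello!\nHi there.\nHow are you?\n'
```

Because the prompt is rebuilt this way on every turn, stripping `len(conversation)` characters from the decoded output only works when the tokenizer round-trips the prompt text exactly; that is the assumption `chat_response()` relies on.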
requirements.txt
ADDED
@@ -0,0 +1,4 @@
gradio>=5.0.0
transformers>=4.30.0
torch>=2.0.0
python-dotenv>=1.0.0
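All four dependencies are pinned with `>=` minimums, so they can be parsed mechanically, e.g. to compare against installed versions. A minimal sketch (the `parse_requirements` helper is ours, not part of the commit):

```python
def parse_requirements(text):
    """Parse '>='-pinned requirement lines into (name, minimum version) pairs."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        name, _, version = line.partition(">=")
        pairs.append((name, version))
    return pairs


reqs = """gradio>=5.0.0
transformers>=4.30.0
torch>=2.0.0
python-dotenv>=1.0.0"""
print(parse_requirements(reqs))
# → [('gradio', '5.0.0'), ('transformers', '4.30.0'), ('torch', '2.0.0'), ('python-dotenv', '1.0.0')]
```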