# 🔧 Model Configuration Guide

The backend now supports **configurable models via environment variables**, making it easy to switch between different AI models without code changes.

## 📋 Environment Variables

### **Primary Configuration**

```bash
# Main AI model for text generation (required)
export AI_MODEL="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B"

# Vision model for image processing (optional)
export VISION_MODEL="Salesforce/blip-image-captioning-base"

# HuggingFace token for private models (optional)
export HF_TOKEN="your_huggingface_token_here"
```

---

## 🚀 Usage Examples

### **1. Use DeepSeek-R1 (Default)**

```bash
# Uses your originally requested model
export AI_MODEL="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B"
./gradio_env/bin/python backend_service.py
```

### **2. Use DialoGPT (Faster, Smaller)**

```bash
# Switch to a lighter model for development/testing
export AI_MODEL="microsoft/DialoGPT-medium"
./gradio_env/bin/python backend_service.py
```

### **3. Use Unsloth 4-bit Quantized Models**

```bash
# Use the Unsloth 4-bit Mistral model (memory efficient)
export AI_MODEL="unsloth/Mistral-Nemo-Instruct-2407-bnb-4bit"
./gradio_env/bin/python backend_service.py

# Use other Unsloth models
export AI_MODEL="unsloth/llama-3-8b-Instruct-bnb-4bit"
./gradio_env/bin/python backend_service.py
```

### **4. Use Other Popular Models**

```bash
# Use the Zephyr chat model
export AI_MODEL="HuggingFaceH4/zephyr-7b-beta"
./gradio_env/bin/python backend_service.py

# Use CodeLlama for code generation
export AI_MODEL="codellama/CodeLlama-7b-Instruct-hf"
./gradio_env/bin/python backend_service.py

# Use Mistral
export AI_MODEL="mistralai/Mistral-7B-Instruct-v0.2"
./gradio_env/bin/python backend_service.py
```

### **5. Use a Different Vision Model**

```bash
export AI_MODEL="microsoft/DialoGPT-medium"
export VISION_MODEL="nlpconnect/vit-gpt2-image-captioning"
./gradio_env/bin/python backend_service.py
```

---

## 📝 Startup Script Examples

### **Development Mode (Fast Startup)**

```bash
#!/bin/bash
# dev_mode.sh
export AI_MODEL="microsoft/DialoGPT-medium"
export VISION_MODEL="Salesforce/blip-image-captioning-base"
./gradio_env/bin/python backend_service.py
```

### **Production Mode (Your Preferred Model)**

```bash
#!/bin/bash
# production_mode.sh
export AI_MODEL="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B"
export VISION_MODEL="Salesforce/blip-image-captioning-base"
export HF_TOKEN="$YOUR_HF_TOKEN"
./gradio_env/bin/python backend_service.py
```

### **Testing Mode (Lightweight)**

```bash
#!/bin/bash
# test_mode.sh
export AI_MODEL="microsoft/DialoGPT-medium"
export VISION_MODEL="Salesforce/blip-image-captioning-base"
./gradio_env/bin/python backend_service.py
```

---

## 🔍 Model Verification

After starting the backend, check which model is loaded:

```bash
curl http://localhost:8000/health
```

The response shows the active model:

```json
{
  "status": "healthy",
  "model": "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B",
  "version": "1.0.0"
}
```

---

## 📊 Model Comparison

| Model | Size | Speed | Quality | Use Case |
| --------------------------------------------- | ------ | --------- | ------------ | ------------------- |
| `microsoft/DialoGPT-medium` | ~355MB | ⚡ Fast | Good | Development/Testing |
| `deepseek-ai/DeepSeek-R1-0528-Qwen3-8B` | ~16GB | 🐌 Slow | ⭐ Excellent | Production |
| `unsloth/Mistral-Nemo-Instruct-2407-bnb-4bit` | ~7GB | 🚀 Medium | ⭐ Excellent | Production (4-bit) |
| `HuggingFaceH4/zephyr-7b-beta` | ~14GB | 🐌 Slow | ⭐ Excellent | Chat/Conversation |
| `codellama/CodeLlama-7b-Instruct-hf` | ~13GB | 🐌 Slow | ⭐ Good | Code Generation |

---

## 🛠️ Troubleshooting

### **Model Not Found**

```bash
# Verify the model exists on HuggingFace
./gradio_env/bin/python -c "
from huggingface_hub import HfApi
api = HfApi()
try:
    info = api.model_info('your-model-name')
    print(f'✅ Model exists: {info.id}')
except Exception:
    print('❌ Model not found')
"
```

### **Memory Issues**

```bash
# Use a smaller model for limited RAM
export AI_MODEL="microsoft/DialoGPT-medium"  # ~355MB
# or
export AI_MODEL="distilgpt2"  # ~82MB
```

### **Authentication Issues**

```bash
# Set a HuggingFace token for private models
export HF_TOKEN="hf_your_token_here"
```

---

## 🎯 Quick Switch Commands

```bash
# Quick switch to development mode
export AI_MODEL="microsoft/DialoGPT-medium" && ./gradio_env/bin/python backend_service.py

# Quick switch to production mode
export AI_MODEL="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B" && ./gradio_env/bin/python backend_service.py

# Quick switch with a custom vision model
export AI_MODEL="microsoft/DialoGPT-medium" VISION_MODEL="nlpconnect/vit-gpt2-image-captioning" && ./gradio_env/bin/python backend_service.py
```

---

## ✅ Summary

- **Environment Variable**: `AI_MODEL` controls the main text-generation model
- **Default**: `deepseek-ai/DeepSeek-R1-0528-Qwen3-8B` (your original preference)
- **Alternative**: `microsoft/DialoGPT-medium` (faster for development)
- **Vision Model**: `VISION_MODEL` controls the image-processing model
- **No Code Changes**: Switch models by changing environment variables only

**Your original DeepSeek-R1 model is still the default** - it is simply configurable now, so you can switch whenever needed!
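For reference, here is a minimal sketch of how a backend like `backend_service.py` could resolve these variables, with the defaults described in this guide. The `load_model_config` helper name is an illustrative assumption, not the service's actual API:

```python
import os

# Hedged sketch (not the real backend code): resolve the model
# configuration from environment variables, falling back to the
# defaults documented in this guide.
def load_model_config(env=os.environ):
    return {
        "ai_model": env.get("AI_MODEL", "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B"),
        "vision_model": env.get("VISION_MODEL", "Salesforce/blip-image-captioning-base"),
        "hf_token": env.get("HF_TOKEN"),  # optional; None means public models only
    }
```

Taking the environment mapping as an explicit parameter keeps the resolution testable (e.g. `load_model_config({"AI_MODEL": "microsoft/DialoGPT-medium"})`) without mutating the real process environment.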