# Model Selection Guide
## 🎯 At-a-Glance Recommendations
| Priority | Best Choice | Provider | Monthly Cost* | Setup Time | Quality Score | Why Choose This |
|----------|-------------|----------|---------------|------------|---------------|-----------------|
| **Ease of Use** | Gemini 2.5 Flash | Google | Free - $2 | 2 min | 90% | Excellent free tier |
| **Best Value** | GPT-5-nano | OpenAI | $1.00 | 2 min | 88% | Modern GPT-5 at nano price |
| **Premium Quality** | Claude 3 Opus | Anthropic | $225 | 2 min | 95% | Highest reasoning quality |
| **Self-Hosted** | Llama 3.1:8b | Ollama | Free | 10 min | 82% | Perfect balance |
| **High-End Local** | DeepSeek-R1:7b | Ollama | Free | 15 min | 88% | Best reasoning model |
| **Budget Cloud** | Claude 3.5 Haiku | Anthropic | $12 | 2 min | 87% | Fast and affordable |
| **Alternative Local** | CodeQwen1.5:7b | Ollama | Free | 10 min | 85% | Excellent for structured data |

*Based on 30,000 queries/month
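The monthly figures are rough extrapolations from per-token prices. A quick sketch of the arithmetic (the ~220 input / ~60 output tokens per query are illustrative assumptions, not measurements):

```python
# Rough monthly-cost math behind the table above.
def monthly_cost(input_price, output_price, queries=30_000,
                 in_tokens=220, out_tokens=60):
    """Prices in USD per 1M tokens; token counts are assumed averages."""
    mtok_in = queries * in_tokens / 1_000_000   # total input megatokens/month
    mtok_out = queries * out_tokens / 1_000_000  # total output megatokens/month
    return mtok_in * input_price + mtok_out * output_price

# GPT-5-nano at $0.05 in / $0.40 out per 1M tokens:
print(f"${monthly_cost(0.05, 0.40):.2f}/month")  # → $1.05/month
```

Swap in any pricing pair from the sections below to sanity-check a budget against your own token usage.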
---
## 🏢 Cloud Models (Closed Source)
### OpenAI Models
#### GPT-5 (Latest Flagship) ⭐ **NEW**
```bash
OPENAI_MODEL=gpt-5
```
- **Pricing**: $1.25/1M input, $10/1M output tokens (API; the $20/month ChatGPT Plus plan is separate and does not cover API usage)
- **Monthly Cost**: ~$25 for 30K queries
- **Capabilities**: Advanced reasoning, thinking, code execution
- **Best For**: Premium applications requiring cutting-edge AI
- **Recipe Quality**: Outstanding (96%) - Best culinary understanding
- **Context**: 400K tokens (272K input + 128K output)
#### GPT-5-nano (Ultra Budget) ⭐ **HIDDEN GEM**
```bash
OPENAI_MODEL=gpt-5-nano
```
- **Pricing**: $0.05/1M input, $0.40/1M output tokens
- **Monthly Cost**: ~$1.00 for 30K queries
- **Best For**: Budget-conscious deployments with modern capabilities
- **Recipe Quality**: Very Good (88%)
- **Speed**: Very Fast
- **Features**: GPT-5 architecture at nano pricing
#### GPT-4o-mini (Proven Budget Choice)
```bash
OPENAI_MODEL=gpt-4o-mini
```
- **Pricing**: $0.15/1M input, $0.60/1M output tokens
- **Monthly Cost**: ~$4 for 30K queries
- **Best For**: Cost-effective production deployments
- **Recipe Quality**: Very Good (86%)
- **Speed**: Very Fast
### Google AI (Gemini) Models
#### Gemini 2.5 Flash ⭐ **RECOMMENDED**
```bash
GOOGLE_MODEL=gemini-2.5-flash
```
- **Pricing**: Free tier, then $0.30/1M input, $2.50/1M output
- **Monthly Cost**: Free - $2 for most usage patterns
- **Best For**: Development and cost-conscious production
- **Recipe Quality**: Excellent (90%)
- **Features**: Thinking budgets, 1M context window
#### Gemini 2.5 Pro (High-End)
```bash
GOOGLE_MODEL=gemini-2.5-pro
```
- **Pricing**: $1.25/1M input, $10/1M output (≤200K context)
- **Monthly Cost**: ~$25 for 30K queries
- **Best For**: Premium applications requiring best Google AI
- **Recipe Quality**: Excellent (92%)
#### Gemini 2.0 Flash-Lite (Ultra Budget)
```bash
GOOGLE_MODEL=gemini-2.0-flash-lite
```
- **Pricing**: $0.075/1M input, $0.30/1M output
- **Monthly Cost**: ~$0.90 for 30K queries
- **Best For**: High-volume, cost-sensitive applications
- **Recipe Quality**: Good (85%)
## 🔓 Open Source Models (Self-Hosted)
### Ollama Models (Latest Releases)
#### DeepSeek-R1:7b ⭐ **BREAKTHROUGH MODEL**
```bash
OLLAMA_MODEL=deepseek-r1:7b
```
- **Parameters**: 7B
- **Download**: ~4.7GB
- **RAM Required**: 8GB
- **Best For**: Advanced reasoning tasks, o1-level performance
- **Recipe Quality**: Very Good (88%)
- **Special**: Chain-of-thought reasoning, approaching GPT-4 performance
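One practical note if you adopt it: R1-style models typically emit their chain-of-thought inside `<think>...</think>` tags, which you usually want to strip before showing recipes to users. A minimal sketch:

```python
import re

# deepseek-r1 interleaves its reasoning in <think>...</think> blocks;
# strip them when only the final answer should reach the user.
THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)

def final_answer(raw: str) -> str:
    return THINK_RE.sub("", raw).strip()

raw = "<think>The user wants something fast...</think>Try a 10-minute aglio e olio."
print(final_answer(raw))  # → Try a 10-minute aglio e olio.
```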
#### Gemma 3:27b ⭐ **NEW FLAGSHIP**
```bash
OLLAMA_MODEL=gemma3:27b
```
- **Parameters**: 27B
- **Download**: ~17GB
- **RAM Required**: 32GB
- **Best For**: Highest quality open source experience
- **Recipe Quality**: Excellent (89%)
- **Features**: Vision capabilities, state-of-the-art performance
#### Llama 3.1:8b (Proven Choice)
```bash
OLLAMA_MODEL=llama3.1:8b
```
- **Parameters**: 8B
- **Download**: ~4.7GB
- **RAM Required**: 8GB
- **Best For**: Balanced production deployment
- **Recipe Quality**: Very Good (82%)
- **Status**: Your current choice - excellent balance!
#### Qwen 3:8b ⭐ **NEW RELEASE**
```bash
OLLAMA_MODEL=qwen3:8b
```
- **Parameters**: 8B
- **Download**: ~4.4GB
- **RAM Required**: 8GB
- **Best For**: Multilingual support, latest technology
- **Recipe Quality**: Very Good (84%)
- **Features**: Tool use, thinking capabilities
#### Phi 4:14b ⭐ **MICROSOFT'S LATEST**
```bash
OLLAMA_MODEL=phi4:14b
```
- **Parameters**: 14B
- **Download**: ~9.1GB
- **RAM Required**: 16GB
- **Best For**: Reasoning and math tasks
- **Recipe Quality**: Very Good (85%)
- **Features**: State-of-the-art efficiency
#### Gemma 3:4b (Efficient Choice)
```bash
OLLAMA_MODEL=gemma3:4b
```
- **Parameters**: 4B
- **Download**: ~3.3GB
- **RAM Required**: 6GB
- **Best For**: Resource-constrained deployments
- **Recipe Quality**: Good (78%)
- **Features**: Excellent for size, runs on modest hardware
### HuggingFace Community Models (Run Locally via Ollama)
#### CodeQwen1.5:7b ⭐ **ALIBABA'S CODE MODEL**
```bash
OLLAMA_MODEL=codeqwen:7b
```
- **Parameters**: 7B
- **Download**: ~4.2GB
- **RAM Required**: 8GB
- **Best For**: Recipe parsing, ingredient analysis, structured data
- **Recipe Quality**: Very Good (85%)
- **Features**: Excellent at understanding structured recipe formats
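To exploit this strength, ask the model for JSON and parse the reply defensively, since models often wrap JSON in markdown fences. The prompt and canned response below are illustrative, standing in for a real call to the project's `LLMService.simple_chat_completion`:

```python
import json
import re

PROMPT = (
    "Return ONLY a JSON object with key 'ingredients' (a list of "
    "{name, quantity}) for this recipe: 2 cups flour, 1 egg, 1 cup milk"
)

def parse_recipe(response: str) -> dict:
    # Extract the outermost {...} even if wrapped in ```json fences.
    match = re.search(r"\{.*\}", response, re.DOTALL)
    if not match:
        raise ValueError("no JSON object in model response")
    return json.loads(match.group(0))

# Canned response of the kind a code-tuned model tends to produce:
canned = '```json\n{"ingredients": [{"name": "flour", "quantity": "2 cups"}]}\n```'
print(parse_recipe(canned)["ingredients"][0]["name"])  # → flour
```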
#### Mistral-Nemo:12b ⭐ **BALANCED CHOICE**
```bash
OLLAMA_MODEL=mistral-nemo:12b
```
- **Parameters**: 12B
- **Download**: ~7GB
- **RAM Required**: 12GB
- **Best For**: General conversation with good reasoning
- **Recipe Quality**: Very Good (84%)
- **Features**: Multilingual, efficient, well-balanced
#### Nous-Hermes2:10.7b ⭐ **FINE-TUNED EXCELLENCE**
```bash
OLLAMA_MODEL=nous-hermes2:10.7b
```
- **Parameters**: 10.7B
- **Download**: ~6.4GB
- **RAM Required**: 12GB
- **Best For**: Instruction following, detailed responses
- **Recipe Quality**: Very Good (83%)
- **Features**: Excellent instruction following, helpful responses
#### OpenHermes2.5-Mistral:7b ⭐ **COMMUNITY FAVORITE**
```bash
OLLAMA_MODEL=openhermes2.5-mistral:7b
```
- **Parameters**: 7B
- **Download**: ~4.1GB
- **RAM Required**: 8GB
- **Best For**: Creative recipe suggestions, conversational AI
- **Recipe Quality**: Good (81%)
- **Features**: Creative, conversational, reliable
#### Solar:10.7b ⭐ **UPSTAGE'S MODEL**
```bash
OLLAMA_MODEL=solar:10.7b
```
- **Parameters**: 10.7B
- **Download**: ~6.1GB
- **RAM Required**: 12GB
- **Best For**: Analytical tasks, recipe modifications
- **Recipe Quality**: Very Good (83%)
- **Features**: Strong analytical capabilities, detailed explanations
---
## 🏢 Anthropic Claude Models (Cloud)
#### Claude 3.5 Sonnet (Production Standard)
```bash
ANTHROPIC_MODEL=claude-3-5-sonnet-20241022
```
- **Pricing**: $3/1M input, $15/1M output tokens
- **Monthly Cost**: ~$45 for 30K queries
- **Best For**: Balanced performance and reasoning
- **Recipe Quality**: Outstanding (94%)
- **Features**: Advanced analysis, code understanding
#### Claude 3.5 Haiku (Speed Focused)
```bash
ANTHROPIC_MODEL=claude-3-5-haiku-20241022
```
- **Pricing**: $0.80/1M input, $4/1M output tokens
- **Monthly Cost**: ~$12 for 30K queries
- **Best For**: Fast, cost-effective responses
- **Recipe Quality**: Very Good (87%)
- **Features**: Lightning fast, good quality
#### Claude 3 Opus (Premium Reasoning)
```bash
ANTHROPIC_MODEL=claude-3-opus-20240229
```
- **Pricing**: $15/1M input, $75/1M output tokens
- **Monthly Cost**: ~$225 for 30K queries
- **Best For**: Complex reasoning, highest quality
- **Recipe Quality**: Outstanding (95%)
- **Features**: Top-tier reasoning, complex tasks
---
## 🎯 Scenario-Based Recommendations
### 👨‍💻 **Development & Testing**
**Choice**: Gemini 2.5 Flash
```bash
LLM_PROVIDER=google
GOOGLE_MODEL=gemini-2.5-flash
```
- Free tier covers most development
- Excellent quality for testing
- Easy setup and integration
### 🚀 **Small to Medium Production**
**Choice**: Gemini 2.5 Flash or GPT-4o-mini
```bash
# Cost-focused
LLM_PROVIDER=google
GOOGLE_MODEL=gemini-2.5-flash

# Quality-focused
LLM_PROVIDER=openai
OPENAI_MODEL=gpt-4o-mini
```
### 🏠 **Self-Hosted**
**Choice**: Llama 3.1:8b or upgrade to DeepSeek-R1:7b
```bash
# Your current (excellent choice)
LLM_PROVIDER=ollama
OLLAMA_MODEL=llama3.1:8b

# Upgrade option (better reasoning)
LLM_PROVIDER=ollama
OLLAMA_MODEL=deepseek-r1:7b
```
### 💰 **Budget/Free**
**Choice**: Local models or GPT-5-nano
```bash
# Best local alternative
LLM_PROVIDER=ollama
OLLAMA_MODEL=codeqwen:7b

# Best budget paid option
LLM_PROVIDER=openai
OPENAI_MODEL=gpt-5-nano

# Quality budget cloud
LLM_PROVIDER=anthropic
ANTHROPIC_MODEL=claude-3-5-haiku-20241022
```
### 🔒 **Privacy/Offline**
**Choice**: DeepSeek-R1:7b or Gemma 3:4b
```bash
# Best reasoning
LLM_PROVIDER=ollama
OLLAMA_MODEL=deepseek-r1:7b

# Resource-efficient
LLM_PROVIDER=ollama
OLLAMA_MODEL=gemma3:4b
```
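These scenarios can also be combined: prefer a cloud model and fall back to a local one when the API is unreachable. A provider-agnostic sketch (the callables are placeholders for real client calls, e.g. an `LLMService` configured per backend):

```python
# Try providers in order; return the first successful response.
def with_fallback(providers, prompt):
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # network/auth/quota errors in practice
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))

# Stand-ins for real backends:
def flaky_cloud(prompt):
    raise TimeoutError("gateway timeout")

def local_ollama(prompt):
    return f"(local) answer to: {prompt}"

name, answer = with_fallback(
    [("gemini-2.5-flash", flaky_cloud), ("llama3.1:8b", local_ollama)],
    "Suggest a soup",
)
print(name)  # → llama3.1:8b
```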
---
## ⚡ Quick Setup Commands
### Cloud & Local Quick Starts
#### Gemini 2.5 Flash (Recommended)
```bash
# Update .env
LLM_PROVIDER=google
GOOGLE_MODEL=gemini-2.5-flash
GOOGLE_TEMPERATURE=0.7
GOOGLE_MAX_TOKENS=1000

# Test
python -c "
from services.llm_service import LLMService
service = LLMService()
print('✅ Gemini 2.5 Flash ready!')
response = service.simple_chat_completion('Suggest a quick pasta recipe')
print(f'Response: {response[:100]}...')
"
```
#### CodeQwen1.5:7b (Self-Hosted, Structured Data Expert)
```bash
# Pull model
ollama pull codeqwen:7b

# Update .env
LLM_PROVIDER=ollama
OLLAMA_MODEL=codeqwen:7b
OLLAMA_TEMPERATURE=0.7

# Test
python -c "
from services.llm_service import LLMService
service = LLMService()
print('✅ CodeQwen 1.5:7b ready!')
response = service.simple_chat_completion('Parse this recipe: 2 cups flour, 1 egg, 1 cup milk')
print(f'Response: {response[:100]}...')
"
```
#### Mistral-Nemo:12b (Self-Hosted, Balanced Performance)
```bash
# Pull model
ollama pull mistral-nemo:12b

# Update .env
LLM_PROVIDER=ollama
OLLAMA_MODEL=mistral-nemo:12b
OLLAMA_TEMPERATURE=0.7

# Test
python -c "
from services.llm_service import LLMService
service = LLMService()
print('✅ Mistral-Nemo ready!')
response = service.simple_chat_completion('Suggest a Mediterranean dinner menu')
print(f'Response: {response[:100]}...')
"
```
#### Claude 3.5 Haiku (Speed + Quality)
```bash
# Update .env
LLM_PROVIDER=anthropic
ANTHROPIC_MODEL=claude-3-5-haiku-20241022
ANTHROPIC_TEMPERATURE=0.7
ANTHROPIC_MAX_TOKENS=1000

# Test
python -c "
from services.llm_service import LLMService
service = LLMService()
print('✅ Claude 3.5 Haiku ready!')
response = service.simple_chat_completion('Quick dinner ideas with vegetables')
print(f'Response: {response[:100]}...')
"
```
#### GPT-5-nano (Budget Winner)
```bash
# Update .env
LLM_PROVIDER=openai
OPENAI_MODEL=gpt-5-nano
OPENAI_TEMPERATURE=0.7
OPENAI_MAX_TOKENS=1000

# Test
python -c "
from services.llm_service import LLMService
service = LLMService()
print('✅ GPT-5-nano ready!')
response = service.simple_chat_completion('Quick healthy breakfast ideas')
print(f'Response: {response[:100]}...')
"
```
#### GPT-5 (Premium)
```bash
# Update .env
LLM_PROVIDER=openai
OPENAI_MODEL=gpt-5
OPENAI_TEMPERATURE=0.7
OPENAI_MAX_TOKENS=1000

# Test
python -c "
from services.llm_service import LLMService
service = LLMService()
print('✅ GPT-5 ready!')
response = service.simple_chat_completion('Create a healthy meal plan')
print(f'Response: {response[:100]}...')
"
```
### Self-Hosted Models
#### DeepSeek-R1:7b (Latest Breakthrough)
```bash
# Pull model
ollama pull deepseek-r1:7b

# Update .env
LLM_PROVIDER=ollama
OLLAMA_MODEL=deepseek-r1:7b
OLLAMA_TEMPERATURE=0.7

# Start Ollama
ollama serve &

# Test
python -c "
from services.llm_service import LLMService
service = LLMService()
print('✅ DeepSeek-R1 ready!')
response = service.simple_chat_completion('Explain the science behind sourdough fermentation')
print(f'Response: {response[:100]}...')
"
```
#### Gemma 3:4b (Efficient)
```bash
# Pull model
ollama pull gemma3:4b

# Update .env
LLM_PROVIDER=ollama
OLLAMA_MODEL=gemma3:4b
OLLAMA_TEMPERATURE=0.7

# Test
python -c "
from services.llm_service import LLMService
service = LLMService()
print('✅ Gemma 3:4b ready!')
response = service.simple_chat_completion('Quick chicken recipes for weeknight dinners')
print(f'Response: {response[:100]}...')
"
```
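Before running any of the local test snippets, it helps to confirm the Ollama server is actually up. Ollama listens on `localhost:11434` by default and lists pulled models at `GET /api/tags`; a small stdlib check:

```python
import json
from urllib.request import urlopen
from urllib.error import URLError

def ollama_models(base_url="http://127.0.0.1:11434", timeout=2):
    """Return the list of pulled model names, or None if the server is down."""
    try:
        with urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (URLError, OSError):
        return None

models = ollama_models()
if models is None:
    print("Ollama is not running - start it with: ollama serve")
elif "deepseek-r1:7b" not in models:
    print("Model missing - run: ollama pull deepseek-r1:7b")
```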
---
## 🔧 Hardware Requirements
### Cloud Models
- **Requirements**: Internet connection, API key
- **RAM**: Any (processing done remotely)
- **Storage**: Minimal
- **Best For**: Instant setup, no hardware constraints
### Self-Hosted Requirements
| Model | Parameters | RAM Needed | Storage | GPU Beneficial | Best For |
|-------|------------|------------|---------|----------------|----------|
| `gemma3:4b` | 4B | 6GB | 3.3GB | Optional | Laptops, modest hardware |
| `codeqwen:7b` | 7B | 8GB | 4.2GB | Yes | Structured data, parsing |
| `llama3.1:8b` | 8B | 8GB | 4.7GB | Yes | Standard workstations |
| `deepseek-r1:7b` | 7B | 8GB | 4.7GB | Yes | Reasoning tasks |
| `openhermes2.5-mistral:7b` | 7B | 8GB | 4.1GB | Yes | Conversational AI |
| `nous-hermes2:10.7b` | 10.7B | 12GB | 6.4GB | Recommended | Instruction following |
| `mistral-nemo:12b` | 12B | 12GB | 7GB | Recommended | Balanced performance |
| `phi4:14b` | 14B | 16GB | 9.1GB | Recommended | High-end workstations |
| `gemma3:27b` | 27B | 32GB | 17GB | Required | Powerful servers |
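A quick way to check which rows of this table your machine can run. Total RAM is read via `os.sysconf`, which works on Linux and most Unix systems; the dictionary below is a subset of the table:

```python
import os

# RAM requirements (GB) from the table above (subset).
RAM_NEEDED = {"gemma3:4b": 6, "llama3.1:8b": 8, "deepseek-r1:7b": 8,
              "mistral-nemo:12b": 12, "phi4:14b": 16, "gemma3:27b": 32}

def total_ram_gb():
    """Total physical RAM in GB (Linux/most Unix via sysconf)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") / 1024**3

def runnable_models(ram_gb=None):
    """Models whose RAM requirement fits the given (or detected) RAM."""
    ram_gb = total_ram_gb() if ram_gb is None else ram_gb
    return sorted(m for m, need in RAM_NEEDED.items() if need <= ram_gb)

print(runnable_models(8))  # → ['deepseek-r1:7b', 'gemma3:4b', 'llama3.1:8b']
```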
---