Recipe Recommendation Chatbot - Backend API
Backend for an AI-powered recipe recommendation system built with FastAPI, featuring RAG (Retrieval-Augmented Generation), conversational memory, and multi-provider LLM support.
Quick Start
Prerequisites
- Python 3.9+
- pip or poetry
- API keys for your chosen LLM provider (OpenAI, Google, or HuggingFace)
Installation
Clone and navigate to backend
git clone <repository-url>
cd PLG4-Recipe-Recommendation-Chatbot/backend
Install dependencies
pip install -r requirements.txt
Note: Some packages are commented out by default to keep the installation lightweight:
- HuggingFace dependencies (transformers, accelerate, sentence-transformers): uncomment if using HuggingFace models
- sentence-transformers (~800MB): uncomment for HuggingFace embeddings
Configure environment
cp .env.example .env
# Edit .env with your API keys and configuration
Run the server
# Development mode with auto-reload
uvicorn app:app --reload --host 127.0.0.1 --port 8080
# Or production mode
uvicorn app:app --host 127.0.0.1 --port 8080
Test the API
curl http://localhost:8080/health
HuggingFace Spaces deployment
sh deploy-to-hf.sh <remote>
where <remote> points to the HuggingFace Spaces repository
Project Structure
backend/
├── app.py                                  # FastAPI application entry point
├── requirements.txt                        # Python dependencies
├── .env.example                            # Environment configuration template
├── .gitignore                              # Git ignore rules
│
├── config/                                 # Configuration modules
│   ├── __init__.py
│   ├── settings.py                         # Application settings
│   ├── database.py                         # Database configuration
│   └── logging_config.py                   # Logging setup
│
├── services/                               # Core business logic
│   ├── __init__.py
│   ├── llm_service.py                      # LLM and RAG pipeline
│   └── vector_store.py                     # Vector database management
│
├── data/                                   # Data storage
│   ├── recipes/                            # Recipe JSON files
│   │   └── recipe.json                     # Sample recipe data
│   └── chromadb_persist/                   # ChromaDB persistence
│
├── logs/                                   # Application logs
│   └── recipe_bot.log                      # Main log file
│
├── docs/                                   # Documentation
│   ├── model-selection-guide.md            # Complete model selection & comparison guide
│   ├── model-quick-reference.md            # Quick model switching commands
│   ├── chromadb_refresh.md                 # ChromaDB refresh guide
│   ├── opensource-llm-configuration.md     # Open source LLM setup guide
│   ├── logging_guide.md                    # Logging documentation
│   ├── optimal_recipes_structure.md        # Recipe data structure guide
│   ├── sanitization_guide.md               # Input sanitization guide
│   └── unified-provider-configuration.md   # Unified provider approach guide
│
└── utils/                                  # Utility functions
    └── __init__.py
Configuration
Environment Variables
Copy .env.example to .env and configure the following:
Unified Provider Approach: The LLM_PROVIDER setting controls both LLM and embedding models, preventing configuration mismatches. See docs/unified-provider-configuration.md for details.
Server Configuration
PORT=8000 # Server port
HOST=0.0.0.0 # Server host
ENVIRONMENT=development # Environment mode
DEBUG=true # Debug mode
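A minimal sketch of how these server variables could be read at startup, using only the standard library. The `ServerSettings` class and `load_server_settings` function are hypothetical names for illustration; the project's actual `config/settings.py` may be structured differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings such as 'true', '1', or 'yes'."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class ServerSettings:
    # Hypothetical container mirroring the variables listed above.
    port: int
    host: str
    environment: str
    debug: bool


def load_server_settings(env: Mapping[str, str] = os.environ) -> ServerSettings:
    """Build settings from an environment mapping, falling back to the documented defaults."""
    return ServerSettings(
        port=int(env.get("PORT", "8000")),
        host=env.get("HOST", "0.0.0.0"),
        environment=env.get("ENVIRONMENT", "development"),
        debug=_as_bool(env.get("DEBUG", "false")),
    )
```

Passing the mapping as a parameter keeps the function easy to test without touching the real process environment.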
Provider Configuration
Choose one provider for both LLM and embeddings (unified approach):
NEW: Complete Model Selection Guide: For detailed comparisons of all models (OpenAI, Google, Anthropic, Ollama, HuggingFace), including the latest 2025 models, performance metrics, costs, and scenario-based recommendations, see docs/model-selection-guide.md
Quick Reference: For one-command model switching, see docs/model-quick-reference.md
OpenAI (Best Value & Latest Models)
LLM_PROVIDER=openai
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_MODEL=gpt-5-nano # BEST VALUE: $1/month for 30K queries - Modern GPT-5 at nano price
# Alternatives:
# - gpt-4o-mini # Proven choice: $4/month for 30K queries
# - gpt-5 # Premium: $20/month unlimited (Plus plan)
OPENAI_EMBEDDING_MODEL=text-embedding-3-small # Used automatically
Google Gemini (Best Free Tier)
LLM_PROVIDER=google
GOOGLE_API_KEY=your_google_api_key_here
GOOGLE_MODEL=gemini-2.5-flash # RECOMMENDED: Excellent free tier, then $2/month
# Alternatives:
# - gemini-2.0-flash-lite # Ultra budget: $0.90/month for 30K queries
# - gemini-2.5-pro # Premium: $25/month for 30K queries
GOOGLE_EMBEDDING_MODEL=models/embedding-001 # Used automatically
Anthropic Claude (Best Quality-to-Cost)
LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=your_anthropic_api_key_here
ANTHROPIC_MODEL=claude-3-5-haiku-20241022 # BUDGET WINNER: $4/month for 30K queries
# Alternatives:
# - claude-3-5-sonnet-20241022 # Production standard: $45/month for 30K queries
# - claude-3-opus-20240229 # Premium quality: $225/month for 30K queries
ANTHROPIC_EMBEDDING_MODEL=voyage-large-2 # Used automatically
Ollama (Best for Privacy/Self-Hosting)
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1:8b # CURRENT DEFAULT: 4.7GB download, 8GB RAM, excellent balance
# New alternatives:
# - deepseek-r1:7b # Breakthrough reasoning: 4.7GB download, O1-level performance
# - codeqwen:7b # Structured data expert: 4.2GB download, excellent for recipes
# - gemma3:4b # Resource-efficient: 3.3GB download, 6GB RAM
# - mistral-nemo:12b # Balanced performance: 7GB download, 12GB RAM
OLLAMA_EMBEDDING_MODEL=nomic-embed-text # Used automatically
HuggingFace (Downloadable Models Only - APIs Unreliable)
LLM_PROVIDER=ollama # Use Ollama to run HuggingFace models locally
OLLAMA_MODEL=codeqwen:7b # RECOMMENDED: Download HF models via Ollama for reliability
# Other downloadable options:
# - mistral-nemo:12b # Mistral's balanced model
# - nous-hermes2:10.7b # Fine-tuned for instruction following
# - openhermes2.5-mistral:7b # Community favorite
OLLAMA_EMBEDDING_MODEL=nomic-embed-text # Used automatically
Important Change: HuggingFace APIs have proven unreliable for production. We now recommend downloading HuggingFace models and running them locally via Ollama for consistent performance, so the HuggingFace dependencies are no longer required.
Local Model Setup: See docs/opensource-llm-configuration.md for GPU setup, model selection, and performance optimization with Ollama.
Unified Provider: The LLM_PROVIDER setting automatically configures both the LLM and embedding models, ensuring consistency and preventing mismatched configurations.
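The unified approach can be illustrated with a small sketch that derives both model names from the single LLM_PROVIDER value, following the `<PROVIDER>_MODEL` and `<PROVIDER>_EMBEDDING_MODEL` naming pattern used in the examples above. The `resolve_models` function is a hypothetical helper, not the project's actual code.

```python
from typing import Mapping, Tuple

# Providers that appear in the configuration examples above.
SUPPORTED_PROVIDERS = ("openai", "google", "anthropic", "ollama")


def resolve_models(env: Mapping[str, str]) -> Tuple[str, str]:
    """Return (llm_model, embedding_model) for the provider named in LLM_PROVIDER."""
    provider = env.get("LLM_PROVIDER", "").strip().lower()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"Unsupported LLM_PROVIDER: {provider!r}")
    prefix = provider.upper()
    try:
        # Both lookups share one prefix, so the LLM and embeddings cannot diverge.
        return env[f"{prefix}_MODEL"], env[f"{prefix}_EMBEDDING_MODEL"]
    except KeyError as missing:
        raise ValueError(
            f"{missing.args[0]} must be set when LLM_PROVIDER={provider}"
        ) from None
```

Because both names come from the same prefix, switching providers is a one-variable change and an embedding model from a different provider cannot be picked up by accident.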
Vector Store Configuration
Choose between ChromaDB (local) or MongoDB Atlas:
ChromaDB (Default)
VECTOR_STORE_PROVIDER=chromadb
DB_COLLECTION_NAME=recipes
DB_PERSIST_DIRECTORY=./data/chromadb_persist
# Set to true to delete and recreate DB on startup (useful for adding new recipes)
DB_REFRESH_ON_START=false
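As a rough illustration of what "delete and recreate on startup" can mean for a local persistence directory, the sketch below removes the directory when the flag is set and always leaves an empty or existing directory behind. The `refresh_persist_dir` function is hypothetical, and it is an assumption that the refresh works at the directory level; see docs/chromadb_refresh.md for the project's actual behavior.

```python
import shutil
from pathlib import Path


def refresh_persist_dir(persist_dir: str, refresh: bool) -> bool:
    """Delete persist_dir when refresh is True, then make sure it exists.

    Returns True if existing data was removed.
    """
    path = Path(persist_dir)
    removed = False
    if refresh and path.exists():
        shutil.rmtree(path)  # Drops all previously stored vectors.
        removed = True
    path.mkdir(parents=True, exist_ok=True)
    return removed
```

With `refresh=False` existing data is left untouched, which is why the flag should be switched back after a one-off rebuild.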
MongoDB Atlas
VECTOR_STORE_PROVIDER=mongodb
MONGODB_URI=mongodb+srv://username:password@cluster.mongodb.net/
MONGODB_DATABASE=recipe_bot
MONGODB_COLLECTION=recipes
Embedding Configuration
# Embedding provider automatically matches LLM_PROVIDER (unified approach)
# No separate configuration needed - handled automatically based on LLM_PROVIDER setting
Unified Provider: The LLM_PROVIDER setting automatically configures both the LLM and embedding models, ensuring consistency and preventing mismatched configurations. See docs/model-selection-guide.md for all available options.
API Endpoints
Core Endpoints
Health Check
GET /health
Returns service health and configuration status.
Chat with RAG
POST /chat
Content-Type: application/json
{
"message": "What chicken recipes do you have?"
}
Full conversational RAG pipeline with memory and vector retrieval.
Simple Demo
GET /demo?prompt=Tell me about Italian cuisine
Simple LLM completion without RAG for testing.
Clear Memory
POST /clear-memory
Clears conversation memory for fresh start.
Example Requests
Chat Request:
curl -X POST "http://localhost:8080/chat" \
  -H "Content-Type: application/json" \
  -d '{"message": "What are some quick breakfast recipes?"}'
Demo Request:
curl "http://localhost:8080/demo?prompt=What%20is%20your%20favorite%20pasta%20dish?"
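The same two requests can be prepared from Python with the standard library. This is a sketch under the assumption that the server runs at http://localhost:8080 as in the examples above; `build_chat_request` and `build_demo_url` are hypothetical helper names.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # Assumed local development address.


def build_chat_request(message: str, base_url: str = BASE_URL) -> Request:
    """Prepare a POST /chat request with a JSON body."""
    body = json.dumps({"message": message}).encode("utf-8")
    return Request(
        f"{base_url}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_demo_url(prompt: str, base_url: str = BASE_URL) -> str:
    """Build the GET /demo URL, percent-encoding the prompt."""
    query = urlencode({"prompt": prompt}, quote_via=quote)
    return f"{base_url}/demo?{query}"
```

With the server running, `urllib.request.urlopen(build_chat_request("What are some quick breakfast recipes?"))` sends the chat request, and `urllib.request.urlopen(build_demo_url("Tell me about Italian cuisine"))` calls the demo endpoint.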
Architecture
Core Components
LLM Service (services/llm_service.py)
- ConversationalRetrievalChain: Main RAG pipeline with memory
- Simple Chat Completion: Direct LLM responses without RAG
- Multi-provider Support: OpenAI, Google, HuggingFace
- Conversation Memory: Persistent chat history
Vector Store Service (services/vector_store.py)
- ChromaDB Integration: Local vector database
- MongoDB Atlas Support: Cloud vector search
- Document Loading: Automatic recipe data ingestion
- Embedding Management: Multi-provider embedding support
Configuration System (config/)
- Settings Management: Environment-based configuration
- Database Configuration: Vector store setup
- Logging Configuration: Structured logging with rotation
Data Flow
- User Query → FastAPI endpoint
- RAG Pipeline → Vector similarity search
- Context Retrieval → Top-k relevant recipes
- LLM Generation → Context-aware response
- Memory Storage → Conversation persistence
- Response → JSON formatted reply
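The flow above can be sketched end to end with toy stand-ins. This is not the project's implementation: retrieval here ranks by simple word overlap instead of vector similarity, the LLM is any callable passed in as `generate`, and the `response` key and all function names are assumptions for illustration.

```python
from typing import Callable, Dict, List, Tuple

Recipe = Dict[str, str]
History = List[Tuple[str, str]]


def retrieve(query: str, recipes: List[Recipe], k: int = 2) -> List[Recipe]:
    """Rank recipes by word overlap with the query (stand-in for vector search)."""
    query_words = set(query.lower().split())
    scored = [
        (len(query_words & set(recipe["text"].lower().split())), recipe)
        for recipe in recipes
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [recipe for score, recipe in scored[:k] if score > 0]


def build_prompt(query: str, context: List[Recipe], history: History) -> str:
    """Combine earlier turns, retrieved recipes, and the new question."""
    past = "\n".join(f"User: {q}\nBot: {a}" for q, a in history)
    found = "\n".join(f"- {recipe['title']}: {recipe['text']}" for recipe in context)
    return f"{past}\nContext:\n{found}\nQuestion: {query}"


def answer(
    query: str,
    recipes: List[Recipe],
    history: History,
    generate: Callable[[str], str],
) -> Dict[str, str]:
    """Run retrieval, generation, and memory storage for one turn."""
    context = retrieve(query, recipes)
    reply = generate(build_prompt(query, context, history))
    history.append((query, reply))  # Memory storage for follow-up questions.
    return {"response": reply}      # JSON-serializable reply.
```

Keeping the generator as a plain callable makes the flow testable without any provider credentials.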
Logging
Comprehensive logging system with:
- File Rotation: 10MB max size, 5 backups
- Structured Format: Timestamps, levels, source location
- Emoji Indicators: Visual status indicators
- Error Tracking: Full stack traces for debugging
Log Levels:
- INFO: Normal operations
- WARNING: Non-critical issues
- ERROR: Failures with stack traces
- DEBUG: Detailed operation steps
Log Location: ./logs/recipe_bot.log
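A rotation policy like the one described (10MB per file, 5 backups) can be expressed with the standard library's `RotatingFileHandler`. The sketch below is an assumption about how such a setup might look; the logger name, format string, and `build_logger` function are illustrative and may differ from `config/logging_config.py`.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path

MAX_BYTES = 10 * 1024 * 1024  # 10MB per log file
BACKUP_COUNT = 5              # Keep five rotated files
LOG_FORMAT = "%(asctime)s | %(levelname)s | %(name)s:%(funcName)s:%(lineno)d | %(message)s"


def build_logger(
    log_path: str = "./logs/recipe_bot.log", level: int = logging.INFO
) -> logging.Logger:
    """Return a logger that writes to a size-rotated file."""
    logger = logging.getLogger("recipe_bot")
    logger.setLevel(level)
    # Avoid attaching a second file handler if this is called twice.
    if any(isinstance(h, RotatingFileHandler) for h in logger.handlers):
        return logger
    Path(log_path).parent.mkdir(parents=True, exist_ok=True)
    handler = RotatingFileHandler(
        log_path, maxBytes=MAX_BYTES, backupCount=BACKUP_COUNT, encoding="utf-8"
    )
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger.addHandler(handler)
    return logger
```

Calling `logger.exception("...")` inside an `except` block writes the full stack trace to the same file.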
Data Management
Recipe Data
- Location: ./data/recipes/
- Format: JSON files with structured recipe data
- Schema: title, ingredients, directions, tags
- Auto-loading: Automatic chunking and vectorization
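To make the schema concrete, here is a small sketch that checks a recipe for the four listed fields and flattens it into one text string suitable for embedding. It assumes `ingredients`, `directions`, and `tags` are lists of strings, which the source does not confirm; see docs/optimal_recipes_structure.md for the actual structure. Both function names are hypothetical.

```python
from typing import Any, Dict, List

REQUIRED_FIELDS = ("title", "ingredients", "directions", "tags")


def validate_recipe(recipe: Dict[str, Any]) -> List[str]:
    """Return the names of required fields that are missing."""
    return [field for field in REQUIRED_FIELDS if field not in recipe]


def recipe_to_text(recipe: Dict[str, Any]) -> str:
    """Flatten a recipe into a single string for chunking and vectorization."""
    return "\n".join(
        [
            f"Title: {recipe['title']}",
            "Ingredients: " + ", ".join(recipe["ingredients"]),
            "Directions: " + " ".join(recipe["directions"]),
            "Tags: " + ", ".join(recipe["tags"]),
        ]
    )
```

Running `validate_recipe` before ingestion gives a clear error list instead of a `KeyError` deep inside the loading step.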
Vector Storage
- ChromaDB: Local persistence in ./data/chromadb_persist/
- MongoDB: Cloud-based vector search
- Embeddings: Configurable embedding models
- Retrieval: Top-k similarity search (k=25)
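Top-k similarity search itself is simple to state: score every stored vector against the query vector and keep the k best. The pure-Python sketch below uses cosine similarity as the scoring function, which is an assumption; the vector store in use may apply a different metric or an approximate index.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float], doc_vecs: List[Sequence[float]], k: int = 25
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs for the k most similar vectors."""
    scored = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

If fewer than k documents are stored, all of them are returned in ranked order.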
Development
Running in Development
# Install dependencies
pip install -r requirements.txt
# Set up environment
cp .env.example .env
# Configure your API keys
# Run with auto-reload
uvicorn app:app --reload --host 127.0.0.1 --port 8080
Testing Individual Components
# Test vector store
python -c "from services.vector_store import vector_store_service; print('Vector store initialized')"
# Test LLM service
python -c "from services.llm_service import llm_service; print('LLM service initialized')"
Adding New Recipes
- Add JSON files to ./data/recipes/
- Set DB_REFRESH_ON_START=true in the .env file
- Restart the application (ChromaDB will be recreated)
- Set DB_REFRESH_ON_START=false to prevent repeated deletion
- New recipes are now available for search
Quick refresh:
# Enable refresh, restart, then disable
echo "DB_REFRESH_ON_START=true" >> .env
uvicorn app:app --reload --host 127.0.0.1 --port 8080
# After startup completes:
sed -i 's/DB_REFRESH_ON_START=true/DB_REFRESH_ON_START=false/' .env
Production Deployment
Environment Setup
ENVIRONMENT=production
DEBUG=false
LOG_LEVEL=INFO
Docker Deployment
The backend is containerized and ready for deployment on platforms like Hugging Face Spaces.
Security Features
- Environment Variables: Secure API key management
- CORS Configuration: Frontend integration protection
- Input Sanitization: Context-appropriate validation for recipe queries
  - XSS protection through HTML encoding
  - Length validation (1-1000 characters)
  - Basic harmful pattern removal
  - Whitespace normalization
- Pydantic Validation: Type safety and automatic sanitization
- Structured Error Handling: Safe error responses without data leaks
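The input sanitization steps listed above can be sketched with the standard library. The specific patterns removed here (script blocks and `javascript:` prefixes) are assumptions chosen for illustration; docs/sanitization_guide.md describes the rules the project actually applies, and `sanitize_message` is a hypothetical name.

```python
import html
import re

MAX_LENGTH = 1000

# Illustrative "harmful pattern" rules; the real list may differ.
_SCRIPT_BLOCK = re.compile(r"<script\b.*?>.*?</script\s*>", re.IGNORECASE | re.DOTALL)
_JS_URI = re.compile(r"javascript\s*:", re.IGNORECASE)


def sanitize_message(raw: str) -> str:
    """Clean a user message and return an HTML-safe string.

    Raises ValueError if the cleaned text is empty or longer than MAX_LENGTH.
    """
    text = _SCRIPT_BLOCK.sub("", raw)           # Basic harmful pattern removal
    text = _JS_URI.sub("", text)
    text = re.sub(r"\s+", " ", text).strip()    # Whitespace normalization
    if not 1 <= len(text) <= MAX_LENGTH:        # Length validation
        raise ValueError(f"Message must be 1-{MAX_LENGTH} characters")
    return html.escape(text)                    # XSS protection via HTML encoding
```

Escaping happens last so that the length limit applies to what the user typed rather than to the longer encoded form.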
Troubleshooting
Common Issues
Vector store initialization fails
- Check API keys for embedding provider
- Verify data folder contains recipe files
- Check ChromaDB permissions
LLM service fails
- Verify API key configuration
- Check provider-specific requirements
- Review logs for detailed error messages
HuggingFace model import errors
- HuggingFace APIs have proven unreliable for production use
- Recommended: Use Ollama to run HuggingFace models locally instead:
# Install and run HuggingFace models via Ollama
ollama pull codeqwen:7b
ollama pull mistral-nemo:12b
# Set LLM_PROVIDER=ollama in .env
- For legacy HuggingFace API setup, uncomment dependencies in requirements.txt (not recommended)
- For detailed model comparisons, see docs/model-selection-guide.md
Memory issues
# Clear conversation memory
curl -X POST http://localhost:8080/clear-memory
Debug Mode
Set DEBUG=true in .env for detailed logging and error traces.
Log Analysis
Check ./logs/recipe_bot.log for detailed operation logs with emoji indicators for quick status identification.
Documentation
Troubleshooting Guides
- Embedding Troubleshooting - Quick fixes for common embedding dimension errors
- Embedding Compatibility Guide - Comprehensive guide to embedding models and dimensions
- Logging Guide - Understanding the logging system
Technical Guides
- Architecture Documentation - System architecture overview
- API Documentation - Detailed API reference
- Deployment Guide - Production deployment instructions
Common Issues
- Dimension mismatch errors: See Embedding Troubleshooting
- Model loading issues: Check provider configuration in .env
- Database connection problems: Verify MongoDB/ChromaDB settings
Dependencies
Core Dependencies
- FastAPI: Modern web framework
- uvicorn: ASGI server
- pydantic: Data validation
- python-dotenv: Environment management
AI/ML Dependencies
- langchain: LLM framework and chains
- langchain-openai: OpenAI integration
- langchain-google-genai: Google AI integration
- sentence-transformers: Embedding models
- chromadb: Vector database
- pymongo: MongoDB integration
Optional Dependencies
- langchain-huggingface: HuggingFace integration
- torch: PyTorch for local models
License
This project is part of the PLG4 Recipe Recommendation Chatbot system.
For more detailed documentation, check the docs/ folder or visit the API documentation at http://localhost:8080/docs when running the server.