Jesse Johnson
New commit for backend deployment: 2025-09-25_13-24-03
c59d808

Recipe Recommendation Chatbot - Backend API

Backend for AI-powered recipe recommendation system built with FastAPI, featuring RAG (Retrieval-Augmented Generation) capabilities, conversational memory, and multi-provider LLM support.

πŸš€ Quick Start

Prerequisites

  • Python 3.9+
  • pip or poetry
  • API keys for your chosen LLM provider (OpenAI, Google, or HuggingFace)

Installation

  1. Clone and navigate to backend

    git clone <repository-url>
    cd PLG4-Recipe-Recommendation-Chatbot/backend
    
  2. Install dependencies

    pip install -r requirements.txt
    

    πŸ’‘ Note: Some packages are commented out by default to keep the installation lightweight:

    • transformers and accelerate - Uncomment if running HuggingFace models locally
    • sentence-transformers (~800MB) - Uncomment if using HuggingFace embeddings
  3. Configure environment

    cp .env.example .env
    # Edit .env with your API keys and configuration
    
  4. Run the server

    # Development mode with auto-reload
    uvicorn app:app --reload --host 127.0.0.1 --port 8080
    
    # Or production mode
    uvicorn app:app --host 127.0.0.1 --port 8080
    
  5. Test the API

    curl http://localhost:8080/health
    
  6. HuggingFace Spaces deployment

    sh deploy-to-hf.sh <remote>
    

    where <remote> points to the HuggingFace Spaces repository

πŸ“ Project Structure

backend/
β”œβ”€β”€ app.py                 # FastAPI application entry point
β”œβ”€β”€ requirements.txt       # Python dependencies
β”œβ”€β”€ .env.example          # Environment configuration template
β”œβ”€β”€ .gitignore            # Git ignore rules
β”‚
β”œβ”€β”€ config/               # Configuration modules
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ settings.py       # Application settings
β”‚   β”œβ”€β”€ database.py       # Database configuration
β”‚   └── logging_config.py # Logging setup
β”‚
β”œβ”€β”€ services/             # Core business logic
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ llm_service.py    # LLM and RAG pipeline
β”‚   └── vector_store.py   # Vector database management
β”‚
β”œβ”€β”€ data/                 # Data storage
β”‚   β”œβ”€β”€ recipes/          # Recipe JSON files
β”‚   β”‚   └── recipe.json   # Sample recipe data
β”‚   └── chromadb_persist/ # ChromaDB persistence
β”‚
β”œβ”€β”€ logs/                 # Application logs
β”‚   └── recipe_bot.log    # Main log file
β”‚
β”œβ”€β”€ docs/                 # Documentation
β”‚   β”œβ”€β”€ model-selection-guide.md      # 🎯 Complete model selection & comparison guide
β”‚   β”œβ”€β”€ model-quick-reference.md      # ⚑ Quick model switching commands  
β”‚   β”œβ”€β”€ chromadb_refresh.md           # ChromaDB refresh guide
β”‚   β”œβ”€β”€ opensource-llm-configuration.md  # Open source LLM setup guide
β”‚   β”œβ”€β”€ logging_guide.md              # Logging documentation
β”‚   β”œβ”€β”€ optimal_recipes_structure.md  # Recipe data structure guide
β”‚   β”œβ”€β”€ sanitization_guide.md         # Input sanitization guide
β”‚   └── unified-provider-configuration.md  # Unified provider approach guide
β”‚
└── utils/                # Utility functions
    └── __init__.py

βš™οΈ Configuration

Environment Variables

Copy .env.example to .env and configure the following:

🎯 Unified Provider Approach: The LLM_PROVIDER setting controls both LLM and embedding models, preventing configuration mismatches. See docs/unified-provider-configuration.md for details.

Server Configuration

PORT=8000                 # Server port
HOST=0.0.0.0             # Server host
ENVIRONMENT=development   # Environment mode
DEBUG=true               # Debug mode

Provider Configuration

Choose one provider for both LLM and embeddings (unified approach):

🎯 NEW: Complete Model Selection Guide: For detailed comparisons of all models (OpenAI, Google, Anthropic, Ollama, HuggingFace) including latest 2025 models, performance metrics, costs, and scenario-based recommendations, see docs/model-selection-guide.md

⚑ Quick Reference: For one-command model switching, see docs/model-quick-reference.md

OpenAI (Best Value & Latest Models)

LLM_PROVIDER=openai
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_MODEL=gpt-5-nano             # 🎯 BEST VALUE: $1/month for 30K queries - Modern GPT-5 at nano price
# Alternatives:
# - gpt-4o-mini                     # Proven choice: $4/month for 30K queries
# - gpt-5                           # Premium: $20/month unlimited (Plus plan)
OPENAI_EMBEDDING_MODEL=text-embedding-3-small # Used automatically

Google Gemini (Best Free Tier)

LLM_PROVIDER=google
GOOGLE_API_KEY=your_google_api_key_here
GOOGLE_MODEL=gemini-2.5-flash       # 🎯 RECOMMENDED: Excellent free tier, then $2/month
# Alternatives:
# - gemini-2.0-flash-lite           # Ultra budget: $0.90/month for 30K queries
# - gemini-2.5-pro                  # Premium: $25/month for 30K queries
GOOGLE_EMBEDDING_MODEL=models/embedding-001 # Used automatically

Anthropic Claude (Best Quality-to-Cost)

LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=your_anthropic_api_key_here
ANTHROPIC_MODEL=claude-3-5-haiku-20241022  # 🎯 BUDGET WINNER: $4/month for 30K queries
# Alternatives:
# - claude-3-5-sonnet-20241022      # Production standard: $45/month for 30K queries
# - claude-3-opus-20240229          # Premium quality: $225/month for 30K queries
ANTHROPIC_EMBEDDING_MODEL=voyage-large-2 # Used automatically

Ollama (Best for Privacy/Self-Hosting)

LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1:8b            # 🎯 YOUR CURRENT: 4.7GB download, 8GB RAM, excellent balance
# New alternatives: 
# - deepseek-r1:7b                  # Breakthrough reasoning: 4.7GB download, O1-level performance
# - codeqwen:7b                     # Structured data expert: 4.2GB download, excellent for recipes
# - gemma3:4b                       # Resource-efficient: 3.3GB download, 6GB RAM
# - mistral-nemo:12b                # Balanced performance: 7GB download, 12GB RAM
OLLAMA_EMBEDDING_MODEL=nomic-embed-text # Used automatically

HuggingFace (Downloadable Models Only - APIs Unreliable)

LLM_PROVIDER=ollama  # Use Ollama to run HuggingFace models locally
OLLAMA_MODEL=codeqwen:7b             # 🎯 RECOMMENDED: Download HF models via Ollama for reliability
# Other downloadable options:
# - mistral-nemo:12b                # Mistral's balanced model
# - nous-hermes2:10.7b              # Fine-tuned for instruction following
# - openhermes2.5-mistral:7b        # Community favorite
OLLAMA_EMBEDDING_MODEL=nomic-embed-text # Used automatically

⚠️ Important Change: HuggingFace's hosted inference APIs have proven unreliable for production, so the HuggingFace dependencies are no longer required by default. We now recommend downloading HuggingFace models and running them locally via Ollama, which provides more consistent reliability and performance.

πŸ“– Local Model Setup: See docs/opensource-llm-configuration.md for GPU setup, model selection, and performance optimization with Ollama.

πŸ’‘ Unified Provider: The LLM_PROVIDER setting automatically configures both the LLM and embedding models, ensuring consistency and preventing mismatched configurations.

Vector Store Configuration

Choose between ChromaDB (local) or MongoDB Atlas:

ChromaDB (Default)

VECTOR_STORE_PROVIDER=chromadb
DB_COLLECTION_NAME=recipes
DB_PERSIST_DIRECTORY=./data/chromadb_persist
# Set to true to delete and recreate DB on startup (useful for adding new recipes)
DB_REFRESH_ON_START=false
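
Under the hood, honoring DB_REFRESH_ON_START amounts to deleting the persistence directory before the store is rebuilt on startup. A minimal sketch (the function name is illustrative, not the project's actual code):

```python
import os
import shutil

def maybe_refresh_store(persist_dir: str, refresh: bool) -> bool:
    """Delete the ChromaDB persistence directory when a refresh is requested.

    Returns True if the directory was removed; the vector store is then
    recreated from ./data/recipes/ on startup.
    """
    if refresh and os.path.isdir(persist_dir):
        shutil.rmtree(persist_dir)
        return True
    return False

# The flag arrives as a string from .env, so parse it explicitly
refresh = os.getenv("DB_REFRESH_ON_START", "false").lower() == "true"
```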

MongoDB Atlas

VECTOR_STORE_PROVIDER=mongodb
MONGODB_URI=mongodb+srv://username:password@cluster.mongodb.net/
MONGODB_DATABASE=recipe_bot
MONGODB_COLLECTION=recipes

Embedding Configuration

# Embedding provider automatically matches LLM_PROVIDER (unified approach)
# No separate configuration needed - handled automatically based on LLM_PROVIDER setting

πŸ’‘ Unified Provider: The LLM_PROVIDER setting automatically configures both the LLM and embedding models, ensuring consistency and preventing mismatched configurations. See docs/model-selection-guide.md for all available options.
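
Conceptually, the unified approach is a single lookup keyed on LLM_PROVIDER. A sketch using the embedding model names from the provider sections above (the function name is illustrative):

```python
# Embedding model is derived from the provider; there is no separate setting.
EMBEDDINGS_BY_PROVIDER = {
    "openai": "text-embedding-3-small",
    "google": "models/embedding-001",
    "anthropic": "voyage-large-2",
    "ollama": "nomic-embed-text",
}

def embedding_model_for(provider: str) -> str:
    """Resolve the embedding model that matches the configured LLM provider."""
    try:
        return EMBEDDINGS_BY_PROVIDER[provider.lower()]
    except KeyError:
        raise ValueError(f"Unsupported LLM_PROVIDER: {provider!r}")
```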

πŸ› οΈ API Endpoints

Core Endpoints

Health Check

GET /health

Returns service health and configuration status.

Chat with RAG

POST /chat
Content-Type: application/json

{
  "message": "What chicken recipes do you have?"
}

Full conversational RAG pipeline with memory and vector retrieval.

Simple Demo

GET /demo?prompt=Tell me about Italian cuisine

Simple LLM completion without RAG for testing.

Clear Memory

POST /clear-memory

Clears conversation memory for fresh start.

Example Requests

Chat Request:

curl -X POST "http://localhost:8080/chat" \
  -H "Content-Type: application/json" \
  -d '{"message": "What are some quick breakfast recipes?"}'

Demo Request:

curl "http://localhost:8080/demo?prompt=What%20is%20your%20favorite%20pasta%20dish?"

πŸ—οΈ Architecture

Core Components

LLM Service (services/llm_service.py)

  • ConversationalRetrievalChain: Main RAG pipeline with memory
  • Simple Chat Completion: Direct LLM responses without RAG
  • Multi-provider Support: OpenAI, Google, Anthropic, and Ollama
  • Conversation Memory: Persistent chat history

Vector Store Service (services/vector_store.py)

  • ChromaDB Integration: Local vector database
  • MongoDB Atlas Support: Cloud vector search
  • Document Loading: Automatic recipe data ingestion
  • Embedding Management: Multi-provider embedding support

Configuration System (config/)

  • Settings Management: Environment-based configuration
  • Database Configuration: Vector store setup
  • Logging Configuration: Structured logging with rotation

Data Flow

  1. User Query β†’ FastAPI endpoint
  2. RAG Pipeline β†’ Vector similarity search
  3. Context Retrieval β†’ Top-k relevant recipes
  4. LLM Generation β†’ Context-aware response
  5. Memory Storage β†’ Conversation persistence
  6. Response β†’ JSON formatted reply
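
The six steps can be sketched as one function with stubbed retrieval and generation (all names are illustrative; the actual pipeline uses LangChain's ConversationalRetrievalChain):

```python
from typing import Callable, Dict, List

def rag_turn(
    query: str,
    retrieve: Callable[[str, int], List[str]],  # steps 2-3: vector similarity search
    generate: Callable[[str], str],             # step 4: LLM completion
    memory: List[Dict[str, str]],               # step 5: conversation persistence
    k: int = 25,
) -> Dict[str, str]:
    context = retrieve(query, k)                         # top-k relevant recipes
    prompt = "\n".join(context) + f"\nUser: {query}"
    answer = generate(prompt)                            # context-aware response
    memory.append({"user": query, "assistant": answer})
    return {"response": answer}                          # step 6: JSON-shaped reply

# Stubbed usage:
memory: List[Dict[str, str]] = []
reply = rag_turn(
    "quick breakfast?",
    retrieve=lambda q, k: ["Pancakes: flour, eggs, milk"],
    generate=lambda p: "Try pancakes.",
    memory=memory,
)
print(reply)  # β†’ {'response': 'Try pancakes.'}
```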

πŸ“Š Logging

Comprehensive logging system with:

  • File Rotation: 10MB max size, 5 backups
  • Structured Format: Timestamps, levels, source location
  • Emoji Indicators: Visual status indicators
  • Error Tracking: Full stack traces for debugging

Log Levels:

  • πŸš€ INFO: Normal operations
  • ⚠️ WARNING: Non-critical issues
  • ❌ ERROR: Failures with stack traces
  • πŸ”§ DEBUG: Detailed operation steps

Log Location: ./logs/recipe_bot.log
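
The rotation policy maps directly onto Python's standard RotatingFileHandler; a minimal sketch (format string and logger name are illustrative):

```python
import logging
from logging.handlers import RotatingFileHandler

def setup_logger(log_path: str = "./logs/recipe_bot.log") -> logging.Logger:
    """Configure a rotating file logger: 10MB max size, 5 backups."""
    logger = logging.getLogger("recipe_bot")
    logger.setLevel(logging.INFO)
    handler = RotatingFileHandler(
        log_path,
        maxBytes=10 * 1024 * 1024,  # rotate after 10MB
        backupCount=5,              # keep 5 backup files
    )
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(levelname)s %(name)s:%(lineno)d %(message)s"
    ))
    logger.addHandler(handler)
    return logger
```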

πŸ“ Data Management

Recipe Data

  • Location: ./data/recipes/
  • Format: JSON files with structured recipe data
  • Schema: title, ingredients, directions, tags
  • Auto-loading: Automatic chunking and vectorization
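
A file under ./data/recipes/ following that schema might look like this (contents are illustrative, not taken from the sample data):

```python
import json

REQUIRED_KEYS = {"title", "ingredients", "directions", "tags"}

recipe = {
    "title": "Lemon Garlic Chicken",
    "ingredients": ["2 chicken breasts", "1 lemon", "3 cloves garlic"],
    "directions": ["Marinate the chicken.", "Pan-sear 6 minutes per side."],
    "tags": ["dinner", "chicken", "quick"],
}

assert REQUIRED_KEYS <= recipe.keys()  # schema check before vectorization
print(json.dumps(recipe, indent=2))    # shape of a recipe JSON file
```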

Vector Storage

  • ChromaDB: Local persistence in ./data/chromadb_persist/
  • MongoDB: Cloud-based vector search
  • Embeddings: Configurable embedding models
  • Retrieval: Top-k similarity search (k=25)
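
Top-k retrieval ranks stored embeddings by similarity to the query embedding. A pure-Python cosine-similarity sketch with toy 3-dimensional vectors (real embeddings have hundreds of dimensions, and the vector store handles this internally):

```python
import math
from typing import List, Tuple

def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query: List[float], docs: List[Tuple[str, List[float]]], k: int = 25) -> List[str]:
    """Return the names of the k documents most similar to the query vector."""
    ranked = sorted(docs, key=lambda d: cosine(query, d[1]), reverse=True)
    return [name for name, _ in ranked[:k]]

docs = [
    ("pancakes", [0.9, 0.1, 0.0]),
    ("beef stew", [0.0, 0.8, 0.6]),
    ("omelette", [0.8, 0.2, 0.1]),
]
print(top_k([1.0, 0.0, 0.0], docs, k=2))  # β†’ ['pancakes', 'omelette']
```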

πŸ”§ Development

Running in Development

# Install dependencies
pip install -r requirements.txt

# Set up environment
cp .env.example .env
# Configure your API keys

# Run with auto-reload
uvicorn app:app --reload --host 127.0.0.1 --port 8080

Testing Individual Components

# Test vector store
python -c "from services.vector_store import vector_store_service; print('Vector store initialized')"

# Test LLM service
python -c "from services.llm_service import llm_service; print('LLM service initialized')"

Adding New Recipes

  1. Add JSON files to ./data/recipes/
  2. Set DB_REFRESH_ON_START=true in .env file
  3. Restart the application (ChromaDB will be recreated)
  4. Set DB_REFRESH_ON_START=false to prevent repeated deletion
  5. New recipes are now available for search

Quick refresh:

# Enable refresh, restart, then disable
echo "DB_REFRESH_ON_START=true" >> .env
uvicorn app:app --reload --host 127.0.0.1 --port 8080
# After startup completes:
sed -i 's/DB_REFRESH_ON_START=true/DB_REFRESH_ON_START=false/' .env

πŸš€ Production Deployment

Environment Setup

ENVIRONMENT=production
DEBUG=false
LOG_LEVEL=INFO

Docker Deployment

The backend is containerized and ready for deployment on platforms like Hugging Face Spaces.

Security Features

  • Environment Variables: Secure API key management
  • CORS Configuration: Frontend integration protection
  • Input Sanitization: Context-appropriate validation for recipe queries
    • XSS protection through HTML encoding
    • Length validation (1-1000 characters)
    • Basic harmful pattern removal
    • Whitespace normalization
  • Pydantic Validation: Type safety and automatic sanitization
  • Structured Error Handling: Safe error responses without data leaks
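
The sanitization steps listed above can be approximated with the standard library alone; this sketch is illustrative, not the project's actual validator:

```python
import html
import re

MIN_LEN, MAX_LEN = 1, 1000

def sanitize_query(raw: str) -> str:
    """Normalize, length-check, strip obvious script tags, and HTML-encode."""
    text = re.sub(r"\s+", " ", raw).strip()        # whitespace normalization
    if not (MIN_LEN <= len(text) <= MAX_LEN):      # length validation (1-1000 chars)
        raise ValueError("query must be 1-1000 characters")
    text = re.sub(                                  # basic harmful pattern removal
        r"(?i)<\s*script[^>]*>.*?<\s*/\s*script\s*>", "", text
    )
    return html.escape(text)                        # XSS protection via HTML encoding
```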

πŸ› οΈ Troubleshooting

Common Issues

Vector store initialization fails

  • Check API keys for embedding provider
  • Verify data folder contains recipe files
  • Check ChromaDB permissions

LLM service fails

  • Verify API key configuration
  • Check provider-specific requirements
  • Review logs for detailed error messages

HuggingFace model import errors

  • HuggingFace APIs have proven unreliable for production use
  • Recommended: Use Ollama to run HuggingFace models locally instead:
    # Install and run HuggingFace models via Ollama
    ollama pull codeqwen:7b
    ollama pull mistral-nemo:12b
    # Set LLM_PROVIDER=ollama in .env
    
  • For legacy HuggingFace API setup, uncomment dependencies in requirements.txt (not recommended)
  • For detailed model comparisons, see docs/model-selection-guide.md

Memory issues

# Clear conversation memory
curl -X POST http://localhost:8080/clear-memory

Debug Mode

Set DEBUG=true in .env for detailed logging and error traces.

Log Analysis

Check ./logs/recipe_bot.log for detailed operation logs with emoji indicators for quick status identification.

πŸ“š Documentation

Common Issues

  • Dimension mismatch errors: See Embedding Troubleshooting
  • Model loading issues: Check provider configuration in .env
  • Database connection problems: Verify MongoDB/ChromaDB settings

πŸ“š Dependencies

Core Dependencies

  • FastAPI: Modern web framework
  • uvicorn: ASGI server
  • pydantic: Data validation
  • python-dotenv: Environment management

AI/ML Dependencies

  • langchain: LLM framework and chains
  • langchain-openai: OpenAI integration
  • langchain-google-genai: Google AI integration
  • sentence-transformers: Embedding models
  • chromadb: Vector database
  • pymongo: MongoDB integration

Optional Dependencies

  • langchain-huggingface: HuggingFace integration
  • torch: PyTorch for local models

πŸ“„ License

This project is part of the PLG4 Recipe Recommendation Chatbot system.


For more detailed documentation, check the docs/ folder or visit the API documentation at http://localhost:8080/docs when running the server.