Jesse Johnson

New commit for backend deployment: 2025-09-25_13-24-03

c59d808 5 months ago


Recipe Recommendation Chatbot - Backend API

Backend for an AI-powered recipe recommendation system built with FastAPI, featuring RAG (Retrieval-Augmented Generation) capabilities, conversational memory, and multi-provider LLM support.

🚀 Quick Start

Prerequisites

Python 3.9+
pip or poetry
API keys for your chosen LLM provider (OpenAI, Google, or Anthropic), or a local Ollama installation

Installation

Clone and navigate to backend

git clone <repository-url>
cd PLG4-Recipe-Recommendation-Chatbot/backend

Install dependencies
```
pip install -r requirements.txt
```
💡 Note: Some packages are commented out by default to keep the installation lightweight:
- HuggingFace dependencies (transformers, accelerate) - Uncomment only if using HuggingFace models directly
- sentence-transformers (~800MB) - Uncomment only if using HuggingFace embeddings

Configure environment

cp .env.example .env
# Edit .env with your API keys and configuration

Run the server

# Development mode with auto-reload
uvicorn app:app --reload --host 127.0.0.1 --port 8080

# Or production mode
uvicorn app:app --host 127.0.0.1 --port 8080

Test the API
```
curl http://localhost:8080/health
```
HuggingFace Spaces deployment
```
sh deploy-to-hf.sh <remote>
```
where `<remote>` points to the HuggingFace Spaces repository

📁 Project Structure

backend/
├── app.py                 # FastAPI application entry point
├── requirements.txt       # Python dependencies
├── .env.example          # Environment configuration template
├── .gitignore            # Git ignore rules
│
├── config/               # Configuration modules
│   ├── __init__.py
│   ├── settings.py       # Application settings
│   ├── database.py       # Database configuration
│   └── logging_config.py # Logging setup
│
├── services/             # Core business logic
│   ├── __init__.py
│   ├── llm_service.py    # LLM and RAG pipeline
│   └── vector_store.py   # Vector database management
│
├── data/                 # Data storage
│   ├── recipes/          # Recipe JSON files
│   │   └── recipe.json   # Sample recipe data
│   └── chromadb_persist/ # ChromaDB persistence
│
├── logs/                 # Application logs
│   └── recipe_bot.log    # Main log file
│
├── docs/                 # Documentation
│   ├── model-selection-guide.md      # 🎯 Complete model selection & comparison guide
│   ├── model-quick-reference.md      # ⚡ Quick model switching commands  
│   ├── chromadb_refresh.md           # ChromaDB refresh guide
│   ├── opensource-llm-configuration.md  # Open source LLM setup guide
│   ├── logging_guide.md              # Logging documentation
│   ├── optimal_recipes_structure.md  # Recipe data structure guide
│   ├── sanitization_guide.md         # Input sanitization guide
│   └── unified-provider-configuration.md  # Unified provider approach guide
│
└── utils/                # Utility functions
    └── __init__.py

⚙️ Configuration

Environment Variables

Copy .env.example to .env and configure the following:

🎯 Unified Provider Approach: The LLM_PROVIDER setting controls both LLM and embedding models, preventing configuration mismatches. See docs/unified-provider-configuration.md for details.

Server Configuration

PORT=8000                 # Server port (the uvicorn commands in this README pass --port 8080 explicitly)
HOST=0.0.0.0             # Server host
ENVIRONMENT=development   # Environment mode
DEBUG=true               # Debug mode
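
As a rough illustration of how these four variables could be read at startup, the sketch below parses them from a mapping with the defaults shown above. `ServerSettings` and `load_server_settings` are hypothetical names used only for this example; the real logic lives in config/settings.py and may differ.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSettings:
    """Hypothetical container for the server variables listed above."""
    port: int
    host: str
    environment: str
    debug: bool


def load_server_settings(env=None):
    """Read server settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return ServerSettings(
        port=int(env.get("PORT", "8000")),
        host=env.get("HOST", "0.0.0.0"),
        environment=env.get("ENVIRONMENT", "development"),
        # Treat common truthy spellings as True; anything else is False.
        debug=env.get("DEBUG", "false").strip().lower() in {"1", "true", "yes"},
    )
```

Passing a plain dict instead of `os.environ` keeps the loader easy to exercise in tests.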

Provider Configuration

Choose one provider for both LLM and embeddings (unified approach):

🎯 NEW: Complete Model Selection Guide: For detailed comparisons of all models (OpenAI, Google, Anthropic, Ollama, HuggingFace) including latest 2025 models, performance metrics, costs, and scenario-based recommendations, see docs/model-selection-guide.md

⚡ Quick Reference: For one-command model switching, see docs/model-quick-reference.md

OpenAI (Best Value & Latest Models)

LLM_PROVIDER=openai
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_MODEL=gpt-5-nano             # 🎯 BEST VALUE: $1/month for 30K queries - Modern GPT-5 at nano price
# Alternatives:
# - gpt-4o-mini                     # Proven choice: $4/month for 30K queries
# - gpt-5                           # Premium: billed per token via the API (a ChatGPT Plus subscription does not include API access)
OPENAI_EMBEDDING_MODEL=text-embedding-3-small # Used automatically

Google Gemini (Best Free Tier)

LLM_PROVIDER=google
GOOGLE_API_KEY=your_google_api_key_here
GOOGLE_MODEL=gemini-2.5-flash       # 🎯 RECOMMENDED: Excellent free tier, then $2/month
# Alternatives:
# - gemini-2.0-flash-lite           # Ultra budget: $0.90/month for 30K queries
# - gemini-2.5-pro                  # Premium: $25/month for 30K queries
GOOGLE_EMBEDDING_MODEL=models/embedding-001 # Used automatically

Anthropic Claude (Best Quality-to-Cost)

LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=your_anthropic_api_key_here
ANTHROPIC_MODEL=claude-3-5-haiku-20241022  # 🎯 BUDGET WINNER: $4/month for 30K queries
# Alternatives:
# - claude-3-5-sonnet-20241022      # Production standard: $45/month for 30K queries
# - claude-3-opus-20240229          # Premium quality: $225/month for 30K queries
ANTHROPIC_EMBEDDING_MODEL=voyage-large-2 # Used automatically (a Voyage AI model; Anthropic does not offer its own embeddings API)

Ollama (Best for Privacy/Self-Hosting)

LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1:8b            # 🎯 CURRENT DEFAULT: 4.7GB download, 8GB RAM, excellent balance
# Alternatives:
# - deepseek-r1:7b                  # Reasoning-focused distilled model: 4.7GB download
# - codeqwen:7b                     # Structured data expert: 4.2GB download, excellent for recipes
# - gemma3:4b                       # Resource-efficient: 3.3GB download, 6GB RAM
# - mistral-nemo:12b                # Balanced performance: 7GB download, 12GB RAM
OLLAMA_EMBEDDING_MODEL=nomic-embed-text # Used automatically

HuggingFace (Downloadable Models Only - APIs Unreliable)

LLM_PROVIDER=ollama  # Use Ollama to run HuggingFace models locally
OLLAMA_MODEL=codeqwen:7b             # 🎯 RECOMMENDED: Download HF models via Ollama for reliability
# Other downloadable options:
# - mistral-nemo:12b                # Mistral's balanced model
# - nous-hermes2:10.7b              # Fine-tuned for instruction following
# - openhermes2.5-mistral:7b        # Community favorite
OLLAMA_EMBEDDING_MODEL=nomic-embed-text # Used automatically

⚠️ Important Change: HuggingFace APIs have proven unreliable for production, so the HuggingFace dependencies are no longer required. To use HuggingFace models, download them and run them locally via Ollama, which provides better reliability and more consistent performance.

📖 Local Model Setup: See docs/opensource-llm-configuration.md for GPU setup, model selection, and performance optimization with Ollama.

Vector Store Configuration

Choose between ChromaDB (local) or MongoDB Atlas:

ChromaDB (Default)

VECTOR_STORE_PROVIDER=chromadb
DB_COLLECTION_NAME=recipes
DB_PERSIST_DIRECTORY=./data/chromadb_persist
# Set to true to delete and recreate DB on startup (useful for adding new recipes)
DB_REFRESH_ON_START=false
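
To make the effect of DB_REFRESH_ON_START concrete, here is a minimal sketch of a delete-and-recreate step on startup. `prepare_persist_directory` is a hypothetical helper; the actual refresh logic in services/vector_store.py is not shown in this README and may work differently.

```python
import shutil
from pathlib import Path


def prepare_persist_directory(persist_dir, refresh_on_start):
    """Return a ready-to-use persistence directory.

    When refresh_on_start is True, any existing directory is deleted first,
    so the vector store is rebuilt from the recipe files on the next load.
    """
    path = Path(persist_dir)
    if refresh_on_start and path.exists():
        shutil.rmtree(path)  # drops all previously stored vectors
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Because the refresh is destructive, leaving the flag at false for normal restarts avoids re-embedding every recipe each time.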

MongoDB Atlas

VECTOR_STORE_PROVIDER=mongodb
MONGODB_URI=mongodb+srv://username:password@cluster.mongodb.net/
MONGODB_DATABASE=recipe_bot
MONGODB_COLLECTION=recipes

Embedding Configuration

# Embedding provider automatically matches LLM_PROVIDER (unified approach)
# No separate configuration needed - handled automatically based on LLM_PROVIDER setting

💡 Unified Provider: The LLM_PROVIDER setting automatically configures both the LLM and embedding models, ensuring consistency and preventing mismatched configurations. See docs/model-selection-guide.md for all available options.
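
One way to picture the unified approach is a single lookup keyed by LLM_PROVIDER that yields both model names. The sketch below is illustrative only: `PROVIDER_DEFAULTS` and `resolve_models` are hypothetical names, and the default values simply mirror the examples in this README rather than the project's real settings code.

```python
# Hypothetical defaults, copied from the example values in this README.
PROVIDER_DEFAULTS = {
    "openai": {"llm": "gpt-5-nano", "embedding": "text-embedding-3-small"},
    "google": {"llm": "gemini-2.5-flash", "embedding": "models/embedding-001"},
    "anthropic": {"llm": "claude-3-5-haiku-20241022", "embedding": "voyage-large-2"},
    "ollama": {"llm": "llama3.1:8b", "embedding": "nomic-embed-text"},
}


def resolve_models(env):
    """Pick the LLM and embedding model from one LLM_PROVIDER value."""
    provider = env.get("LLM_PROVIDER", "").strip().lower()
    if provider not in PROVIDER_DEFAULTS:
        raise ValueError(f"unsupported LLM_PROVIDER: {provider!r}")
    defaults = PROVIDER_DEFAULTS[provider]
    prefix = provider.upper()  # e.g. OPENAI_MODEL, OPENAI_EMBEDDING_MODEL
    return {
        "provider": provider,
        "llm": env.get(f"{prefix}_MODEL", defaults["llm"]),
        "embedding": env.get(f"{prefix}_EMBEDDING_MODEL", defaults["embedding"]),
    }
```

Since both names come from the same provider entry, an OpenAI chat model can never be paired with, say, an Ollama embedding model by accident.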

🛠️ API Endpoints

Core Endpoints

Health Check

GET /health

Returns service health and configuration status.

Chat with RAG

POST /chat
Content-Type: application/json

{
  "message": "What chicken recipes do you have?"
}

Full conversational RAG pipeline with memory and vector retrieval.
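
For callers that prefer Python over curl, the request above can be sketched with the standard library alone. `build_chat_request` and `send_chat` are hypothetical helper names, and the base URL assumes the server was started on port 8080 as in the Quick Start. The response body is returned as parsed JSON without assuming any particular field names, since the response schema is not documented here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: matches the uvicorn examples


def build_chat_request(message, base_url=BASE_URL):
    """Build the POST /chat request without sending it."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(message, base_url=BASE_URL, timeout=30):
    """Send the request and return the decoded JSON reply."""
    request = build_chat_request(message, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `send_chat("What chicken recipes do you have?")` would return whatever JSON the endpoint produces.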

Simple Demo

GET /demo?prompt=Tell%20me%20about%20Italian%20cuisine

Simple LLM completion without RAG for testing.

Clear Memory

POST /clear-memory

Clears conversation memory for a fresh start.

Example Requests

Chat Request:

curl -X POST "http://localhost:8080/chat" \
  -H "Content-Type: application/json" \
  -d '{"message": "What are some quick breakfast recipes?"}'

Demo Request:

curl "http://localhost:8080/demo?prompt=What%20is%20your%20favorite%20pasta%20dish?"

🏗️ Architecture

Core Components

LLM Service (`services/llm_service.py`)

ConversationalRetrievalChain: Main RAG pipeline with memory
Simple Chat Completion: Direct LLM responses without RAG
Multi-provider Support: OpenAI, Google, Anthropic, Ollama (including HuggingFace models run locally)
Conversation Memory: Persistent chat history

Vector Store Service (`services/vector_store.py`)

ChromaDB Integration: Local vector database
MongoDB Atlas Support: Cloud vector search
Document Loading: Automatic recipe data ingestion
Embedding Management: Multi-provider embedding support

Configuration System (`config/`)

Settings Management: Environment-based configuration
Database Configuration: Vector store setup
Logging Configuration: Structured logging with rotation

Data Flow

1. User Query → FastAPI endpoint
2. RAG Pipeline → Vector similarity search
3. Context Retrieval → Top-k relevant recipes
4. LLM Generation → Context-aware response
5. Memory Storage → Conversation persistence
6. Response → JSON formatted reply
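
The six steps above can be sketched as a small class with pluggable retrieval and generation functions. This is a toy model of the flow, not the project's ConversationalRetrievalChain setup: `ChatPipeline`, the prompt layout, and the `"response"` key are all assumptions made for illustration.

```python
class ChatPipeline:
    """Toy conversational RAG flow: retrieve, prompt, generate, remember."""

    def __init__(self, retrieve, generate, k=25):
        self.retrieve = retrieve  # callable: (query, k) -> list of recipe texts
        self.generate = generate  # callable: (prompt) -> answer string
        self.k = k
        self.history = []         # (user message, assistant answer) pairs

    def _build_prompt(self, message, context):
        lines = ["Context:"] + [f"- {doc}" for doc in context]
        lines.append("History:")
        for user, assistant in self.history:
            lines.append(f"User: {user}")
            lines.append(f"Assistant: {assistant}")
        lines.append(f"User: {message}")
        return "\n".join(lines)

    def chat(self, message):
        context = self.retrieve(message, self.k)       # steps 2-3: top-k recipes
        prompt = self._build_prompt(message, context)
        answer = self.generate(prompt)                 # step 4: LLM generation
        self.history.append((message, answer))         # step 5: memory storage
        return {"response": answer}                    # step 6: JSON-style reply

    def clear_memory(self):
        self.history.clear()
```

Keeping the retriever and generator as plain callables means either one can be swapped for a stub when testing the flow without a model or vector store.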

📊 Logging

Comprehensive logging system with:

File Rotation: 10MB max size, 5 backups
Structured Format: Timestamps, levels, source location
Emoji Indicators: Visual status indicators
Error Tracking: Full stack traces for debugging

Log Levels:

🚀 INFO: Normal operations
⚠️ WARNING: Non-critical issues
❌ ERROR: Failures with stack traces
🔧 DEBUG: Detailed operation steps

Log Location: ./logs/recipe_bot.log
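
A rotation policy like the one described (10MB per file, 5 backups) can be expressed with Python's standard `RotatingFileHandler`. The sketch below is an assumption about how such a logger might be wired up; the logger name, format string, and `build_file_logger` helper are hypothetical, and the real setup lives in config/logging_config.py.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path

MAX_BYTES = 10 * 1024 * 1024  # 10MB per log file
BACKUP_COUNT = 5              # keep five rotated files
# Hypothetical format: timestamp, level, source location, message.
LOG_FORMAT = "%(asctime)s | %(levelname)s | %(name)s:%(lineno)d | %(message)s"


def build_file_logger(log_path="./logs/recipe_bot.log", level=logging.INFO):
    """Return a logger that writes to a size-rotated file."""
    logger = logging.getLogger("recipe_bot")  # hypothetical logger name
    logger.setLevel(level)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        path = Path(log_path)
        path.parent.mkdir(parents=True, exist_ok=True)
        handler = RotatingFileHandler(
            path, maxBytes=MAX_BYTES, backupCount=BACKUP_COUNT, encoding="utf-8"
        )
        handler.setFormatter(logging.Formatter(LOG_FORMAT))
        logger.addHandler(handler)
    return logger
```

Opening the file as UTF-8 matters here because the log messages include emoji indicators.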

📁 Data Management

Recipe Data

Location: ./data/recipes/
Format: JSON files with structured recipe data
Schema: title, ingredients, directions, tags
Auto-loading: Automatic chunking and vectorization
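
As an illustration of the schema, the sketch below checks a recipe for the four fields and flattens it into one text block suitable for embedding. It assumes `ingredients`, `directions`, and `tags` are lists of strings, which this README does not state; `recipe_to_text` is a hypothetical helper, not the project's loader.

```python
REQUIRED_FIELDS = ("title", "ingredients", "directions", "tags")


def recipe_to_text(recipe):
    """Validate a recipe dict and flatten it into a single text document."""
    missing = [field for field in REQUIRED_FIELDS if field not in recipe]
    if missing:
        raise ValueError(f"recipe is missing fields: {', '.join(missing)}")
    parts = [
        f"Title: {recipe['title']}",
        "Ingredients: " + ", ".join(recipe["ingredients"]),
        "Directions: " + " ".join(recipe["directions"]),
        "Tags: " + ", ".join(recipe["tags"]),
    ]
    return "\n".join(parts)
```

Rejecting incomplete recipes before vectorization keeps malformed files from silently producing low-quality search results.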

Vector Storage

ChromaDB: Local persistence in ./data/chromadb_persist/
MongoDB: Cloud-based vector search
Embeddings: Configurable embedding models
Retrieval: Top-k similarity search (k=25)
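
Top-k retrieval amounts to scoring every stored vector against the query vector and keeping the k best. The pure-Python sketch below uses cosine similarity as an assumed metric; the metric actually used depends on how the vector store is configured, and in practice ChromaDB or MongoDB Atlas performs this search internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=25):
    """Return the ids of the k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]
```

The explicit length check mirrors the dimension mismatch errors that appear when stored vectors were created with a different embedding model than the one used for queries.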

🔧 Development

Running in Development

# Install dependencies
pip install -r requirements.txt

# Set up environment
cp .env.example .env
# Configure your API keys

# Run with auto-reload
uvicorn app:app --reload --host 127.0.0.1 --port 8080

Testing Individual Components

# Test vector store
python -c "from services.vector_store import vector_store_service; print('Vector store initialized')"

# Test LLM service
python -c "from services.llm_service import llm_service; print('LLM service initialized')"

Adding New Recipes

1. Add JSON files to ./data/recipes/
2. Set DB_REFRESH_ON_START=true in .env file
3. Restart the application (ChromaDB will be recreated)
4. Set DB_REFRESH_ON_START=false to prevent repeated deletion
5. New recipes are now available for search

Quick refresh:

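# Enable refresh (assumes DB_REFRESH_ON_START is already present in .env)
sed -i 's/^DB_REFRESH_ON_START=.*/DB_REFRESH_ON_START=true/' .env
# Restart the server; ChromaDB is recreated during startup
uvicorn app:app --reload --host 127.0.0.1 --port 8080
# After startup completes, from a second terminal, disable refresh again:
sed -i 's/^DB_REFRESH_ON_START=.*/DB_REFRESH_ON_START=false/' .env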

🚀 Production Deployment

Environment Setup

ENVIRONMENT=production
DEBUG=false
LOG_LEVEL=INFO

Docker Deployment

The backend is containerized and ready for deployment on platforms like Hugging Face Spaces.

Security Features

Environment Variables: Secure API key management
CORS Configuration: Frontend integration protection
Input Sanitization: Context-appropriate validation for recipe queries
- XSS protection through HTML encoding
- Length validation (1-1000 characters)
- Basic harmful pattern removal
- Whitespace normalization
Pydantic Validation: Type safety and automatic sanitization
Structured Error Handling: Safe error responses without data leaks
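
The sanitization steps listed above could look roughly like the following. This is a sketch under assumptions, not the project's sanitizer: the pattern list is a hypothetical stand-in for "basic harmful pattern removal", and `sanitize_message` is an illustrative name. See docs/sanitization_guide.md for the actual rules.

```python
import html
import re

MAX_LENGTH = 1000
# Hypothetical examples of "harmful patterns": script blocks and javascript: URLs.
_HARMFUL = re.compile(
    r"<\s*script[^>]*>.*?<\s*/\s*script\s*>|javascript:",
    re.IGNORECASE | re.DOTALL,
)


def sanitize_message(raw):
    """Clean a user query: strip patterns, normalize whitespace, validate, encode."""
    text = _HARMFUL.sub("", raw)
    text = re.sub(r"\s+", " ", text).strip()  # whitespace normalization
    if not 1 <= len(text) <= MAX_LENGTH:      # length validation (1-1000 characters)
        raise ValueError("message must be 1-1000 characters after cleanup")
    return html.escape(text)                  # HTML-encode to neutralize markup
```

In this sketch the length limit is checked before encoding, so escaped entities such as `&lt;` do not count against the 1000-character limit.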

🛠️ Troubleshooting

Common Issues

Vector store initialization fails

Check API keys for embedding provider
Verify data folder contains recipe files
Check ChromaDB permissions

LLM service fails

Verify API key configuration
Check provider-specific requirements
Review logs for detailed error messages

HuggingFace model import errors

HuggingFace APIs have proven unreliable for production use

Recommended: Use Ollama to run HuggingFace models locally instead:

# Install and run HuggingFace models via Ollama
ollama pull codeqwen:7b
ollama pull mistral-nemo:12b
# Set LLM_PROVIDER=ollama in .env

For legacy HuggingFace API setup, uncomment dependencies in requirements.txt (not recommended)
For detailed model comparisons, see docs/model-selection-guide.md

Memory issues

# Clear conversation memory
curl -X POST http://localhost:8080/clear-memory

Debug Mode

Set DEBUG=true in .env for detailed logging and error traces.

Log Analysis

Check ./logs/recipe_bot.log for detailed operation logs with emoji indicators for quick status identification.

📚 Documentation

Troubleshooting Guides

Embedding Troubleshooting - Quick fixes for common embedding dimension errors
Embedding Compatibility Guide - Comprehensive guide to embedding models and dimensions
Logging Guide - Understanding the logging system

Technical Guides

Architecture Documentation - System architecture overview
API Documentation - Detailed API reference
Deployment Guide - Production deployment instructions

Common Issues

Dimension mismatch errors: See Embedding Troubleshooting
Model loading issues: Check provider configuration in .env
Database connection problems: Verify MongoDB/ChromaDB settings

📚 Dependencies

Core Dependencies

FastAPI: Modern web framework
uvicorn: ASGI server
pydantic: Data validation
python-dotenv: Environment management

AI/ML Dependencies

langchain: LLM framework and chains
langchain-openai: OpenAI integration
langchain-google-genai: Google AI integration
sentence-transformers: Embedding models (commented out in requirements.txt by default; see Installation)
chromadb: Vector database
pymongo: MongoDB integration

Optional Dependencies

langchain-huggingface: HuggingFace integration
torch: PyTorch for local models

📄 License

This project is part of the PLG4 Recipe Recommendation Chatbot system.

For more detailed documentation, check the docs/ folder or visit the API documentation at http://localhost:8080/docs when running the server.
