Jesse Johnson
New commit for backend deployment: 2025-09-25_13-24-03
c59d808
# Recipe Recommendation Chatbot - Backend API
Backend for an AI-powered recipe recommendation system built with FastAPI, featuring RAG (Retrieval-Augmented Generation), conversational memory, and multi-provider LLM support.
## πŸš€ Quick Start
### Prerequisites
- Python 3.9+
- pip or poetry
- API keys for your chosen LLM provider (OpenAI, Google, or Anthropic), or a local Ollama installation
### Installation
1. **Clone and navigate to backend**
```bash
git clone <repository-url>
cd PLG4-Recipe-Recommendation-Chatbot/backend
```
2. **Install dependencies**
```bash
pip install -r requirements.txt
```
> πŸ’‘ **Note**: Some packages are commented out by default to keep the installation lightweight:
> - **HuggingFace dependencies** (`transformers`, `accelerate`) - Uncomment if using HuggingFace models
> - **sentence-transformers** (~800MB) - Uncomment for HuggingFace embeddings
3. **Configure environment**
```bash
cp .env.example .env
# Edit .env with your API keys and configuration
```
4. **Run the server**
```bash
# Development mode with auto-reload
uvicorn app:app --reload --host 127.0.0.1 --port 8080
# Or production mode
uvicorn app:app --host 127.0.0.1 --port 8080
```
5. **Test the API**
```bash
curl http://localhost:8080/health
```
6. **HuggingFace Spaces deployment**
```bash
sh deploy-to-hf.sh <remote>
```
where `<remote>` points to the HuggingFace Spaces repository.
## πŸ“ Project Structure
```
backend/
β”œβ”€β”€ app.py # FastAPI application entry point
β”œβ”€β”€ requirements.txt # Python dependencies
β”œβ”€β”€ .env.example # Environment configuration template
β”œβ”€β”€ .gitignore # Git ignore rules
β”‚
β”œβ”€β”€ config/ # Configuration modules
β”‚ β”œβ”€β”€ __init__.py
β”‚ β”œβ”€β”€ settings.py # Application settings
β”‚ β”œβ”€β”€ database.py # Database configuration
β”‚ └── logging_config.py # Logging setup
β”‚
β”œβ”€β”€ services/ # Core business logic
β”‚ β”œβ”€β”€ __init__.py
β”‚ β”œβ”€β”€ llm_service.py # LLM and RAG pipeline
β”‚ └── vector_store.py # Vector database management
β”‚
β”œβ”€β”€ data/ # Data storage
β”‚ β”œβ”€β”€ recipes/ # Recipe JSON files
β”‚ β”‚ └── recipe.json # Sample recipe data
β”‚ └── chromadb_persist/ # ChromaDB persistence
β”‚
β”œβ”€β”€ logs/ # Application logs
β”‚ └── recipe_bot.log # Main log file
β”‚
β”œβ”€β”€ docs/ # Documentation
β”‚ β”œβ”€β”€ model-selection-guide.md # 🎯 Complete model selection & comparison guide
β”‚ β”œβ”€β”€ model-quick-reference.md # ⚑ Quick model switching commands
β”‚ β”œβ”€β”€ chromadb_refresh.md # ChromaDB refresh guide
β”‚ β”œβ”€β”€ opensource-llm-configuration.md # Open source LLM setup guide
β”‚ β”œβ”€β”€ logging_guide.md # Logging documentation
β”‚ β”œβ”€β”€ optimal_recipes_structure.md # Recipe data structure guide
β”‚ β”œβ”€β”€ sanitization_guide.md # Input sanitization guide
β”‚ └── unified-provider-configuration.md # Unified provider approach guide
β”‚
└── utils/ # Utility functions
└── __init__.py
```
## βš™οΈ Configuration
### Environment Variables
Copy `.env.example` to `.env` and configure the following:
> 🎯 **Unified Provider Approach**: The `LLM_PROVIDER` setting controls both LLM and embedding models, preventing configuration mismatches. See [`docs/unified-provider-configuration.md`](docs/unified-provider-configuration.md) for details.
#### **Server Configuration**
```bash
PORT=8000 # Server port
HOST=0.0.0.0 # Server host
ENVIRONMENT=development # Environment mode
DEBUG=true # Debug mode
```
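These variables are consumed by `config/settings.py` (presumably loaded from `.env` via `python-dotenv`). A minimal stdlib-only sketch of how such values might be parsed, with `env_bool` as an assumed helper name:

```python
import os

# Assumed helper: mirrors what config/settings.py might do after
# python-dotenv has loaded .env into the process environment.
def env_bool(name: str, default: str = "false") -> bool:
    return os.getenv(name, default).strip().lower() in ("true", "1", "yes")

PORT = int(os.getenv("PORT", "8000"))
HOST = os.getenv("HOST", "0.0.0.0")
ENVIRONMENT = os.getenv("ENVIRONMENT", "development")
DEBUG = env_bool("DEBUG", "true")
```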
#### **Provider Configuration**
Choose one provider for both LLM and embeddings (unified approach):
> 🎯 **NEW: Complete Model Selection Guide**: For detailed comparisons of all models (OpenAI, Google, Anthropic, Ollama, HuggingFace) including latest 2025 models, performance metrics, costs, and scenario-based recommendations, see [`docs/model-selection-guide.md`](docs/model-selection-guide.md)
> ⚑ **Quick Reference**: For one-command model switching, see [`docs/model-quick-reference.md`](docs/model-quick-reference.md)
**OpenAI (Best Value & Latest Models)**
```bash
LLM_PROVIDER=openai
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_MODEL=gpt-5-nano # 🎯 BEST VALUE: $1/month for 30K queries - Modern GPT-5 at nano price
# Alternatives:
# - gpt-4o-mini # Proven choice: $4/month for 30K queries
# - gpt-5 # Premium: $20/month unlimited (Plus plan)
OPENAI_EMBEDDING_MODEL=text-embedding-3-small # Used automatically
```
**Google Gemini (Best Free Tier)**
```bash
LLM_PROVIDER=google
GOOGLE_API_KEY=your_google_api_key_here
GOOGLE_MODEL=gemini-2.5-flash # 🎯 RECOMMENDED: Excellent free tier, then $2/month
# Alternatives:
# - gemini-2.0-flash-lite # Ultra budget: $0.90/month for 30K queries
# - gemini-2.5-pro # Premium: $25/month for 30K queries
GOOGLE_EMBEDDING_MODEL=models/embedding-001 # Used automatically
```
**Anthropic Claude (Best Quality-to-Cost)**
```bash
LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=your_anthropic_api_key_here
ANTHROPIC_MODEL=claude-3-5-haiku-20241022 # 🎯 BUDGET WINNER: $4/month for 30K queries
# Alternatives:
# - claude-3-5-sonnet-20241022 # Production standard: $45/month for 30K queries
# - claude-3-opus-20240229 # Premium quality: $225/month for 30K queries
ANTHROPIC_EMBEDDING_MODEL=voyage-large-2 # Used automatically
```
**Ollama (Best for Privacy/Self-Hosting)**
```bash
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1:8b # 🎯 RECOMMENDED: 4.7GB download, 8GB RAM, excellent balance
# New alternatives:
# - deepseek-r1:7b # Breakthrough reasoning: 4.7GB download, O1-level performance
# - codeqwen:7b # Structured data expert: 4.2GB download, excellent for recipes
# - gemma3:4b # Resource-efficient: 3.3GB download, 6GB RAM
# - mistral-nemo:12b # Balanced performance: 7GB download, 12GB RAM
OLLAMA_EMBEDDING_MODEL=nomic-embed-text # Used automatically
```
**HuggingFace (Downloadable Models Only - APIs Unreliable)**
```bash
LLM_PROVIDER=ollama # Use Ollama to run HuggingFace models locally
OLLAMA_MODEL=codeqwen:7b # 🎯 RECOMMENDED: Download HF models via Ollama for reliability
# Other downloadable options:
# - mistral-nemo:12b # Mistral's balanced model
# - nous-hermes2:10.7b # Fine-tuned for instruction following
# - openhermes2.5-mistral:7b # Community favorite
OLLAMA_EMBEDDING_MODEL=nomic-embed-text # Used automatically
```
> ⚠️ **Important Change**: HuggingFace's hosted inference APIs have proven unreliable for production, so the HuggingFace dependencies are no longer required. Instead, download HuggingFace models and run them locally via Ollama, which provides better reliability and performance.
> πŸ“– **Local Model Setup**: See [`docs/opensource-llm-configuration.md`](docs/opensource-llm-configuration.md) for GPU setup, model selection, and performance optimization with Ollama.
#### **Vector Store Configuration**
Choose between ChromaDB (local) or MongoDB Atlas:
**ChromaDB (Default)**
```bash
VECTOR_STORE_PROVIDER=chromadb
DB_COLLECTION_NAME=recipes
DB_PERSIST_DIRECTORY=./data/chromadb_persist
# Set to true to delete and recreate DB on startup (useful for adding new recipes)
DB_REFRESH_ON_START=false
```
**MongoDB Atlas**
```bash
VECTOR_STORE_PROVIDER=mongodb
MONGODB_URI=mongodb+srv://username:password@cluster.mongodb.net/
MONGODB_DATABASE=recipe_bot
MONGODB_COLLECTION=recipes
```
#### **Embedding Configuration**
```bash
# Embedding provider automatically matches LLM_PROVIDER (unified approach)
# No separate configuration needed - handled automatically based on LLM_PROVIDER setting
```
> πŸ’‘ **Unified Provider**: The `LLM_PROVIDER` setting automatically configures both the LLM and embedding models, ensuring consistency and preventing mismatched configurations. See [`docs/model-selection-guide.md`](docs/model-selection-guide.md) for all available options.
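To illustrate the idea, here is a hypothetical factory (not the project's actual code) that maps a single `LLM_PROVIDER` value to the matching pair of models from the tables above, so the two can never drift apart:

```python
import os
from typing import Optional, Tuple

# Hypothetical defaults taken from the provider tables above.
PROVIDER_DEFAULTS = {
    "openai": ("gpt-5-nano", "text-embedding-3-small"),
    "google": ("gemini-2.5-flash", "models/embedding-001"),
    "anthropic": ("claude-3-5-haiku-20241022", "voyage-large-2"),
    "ollama": ("llama3.1:8b", "nomic-embed-text"),
}

def resolve_models(provider: Optional[str] = None) -> Tuple[str, str]:
    """Return (llm_model, embedding_model) for the configured provider."""
    provider = (provider or os.getenv("LLM_PROVIDER", "openai")).lower()
    if provider not in PROVIDER_DEFAULTS:
        raise ValueError(f"Unsupported LLM_PROVIDER: {provider}")
    return PROVIDER_DEFAULTS[provider]
```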
## πŸ› οΈ API Endpoints
### Core Endpoints
#### **Health Check**
```bash
GET /health
```
Returns service health and configuration status.
#### **Chat with RAG**
```bash
POST /chat
Content-Type: application/json

{
  "message": "What chicken recipes do you have?"
}
```
Full conversational RAG pipeline with memory and vector retrieval.
#### **Simple Demo**
```bash
GET /demo?prompt=Tell me about Italian cuisine
```
Simple LLM completion without RAG for testing.
#### **Clear Memory**
```bash
POST /clear-memory
```
Clears the conversation memory for a fresh start.
### Example Requests
**Chat Request:**
```bash
curl -X POST "http://localhost:8080/chat" \
  -H "Content-Type: application/json" \
  -d '{"message": "What are some quick breakfast recipes?"}'
```
**Demo Request:**
```bash
curl "http://localhost:8080/demo?prompt=What%20is%20your%20favorite%20pasta%20dish?"
```
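The same calls can be made from Python without extra dependencies; a small stdlib-only client sketch (the shape of the JSON reply is assumed):

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"

def build_chat_request(message: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for the /chat endpoint."""
    payload = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def chat(message: str) -> dict:
    """Send a chat message and return the parsed JSON reply."""
    with urllib.request.urlopen(build_chat_request(message)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```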
## πŸ—οΈ Architecture
### Core Components
#### **LLM Service** (`services/llm_service.py`)
- **ConversationalRetrievalChain**: Main RAG pipeline with memory
- **Simple Chat Completion**: Direct LLM responses without RAG
- **Multi-provider Support**: OpenAI, Google, Anthropic, and Ollama
- **Conversation Memory**: Persistent chat history
#### **Vector Store Service** (`services/vector_store.py`)
- **ChromaDB Integration**: Local vector database
- **MongoDB Atlas Support**: Cloud vector search
- **Document Loading**: Automatic recipe data ingestion
- **Embedding Management**: Multi-provider embedding support
#### **Configuration System** (`config/`)
- **Settings Management**: Environment-based configuration
- **Database Configuration**: Vector store setup
- **Logging Configuration**: Structured logging with rotation
### Data Flow
1. **User Query** β†’ FastAPI endpoint
2. **RAG Pipeline** β†’ Vector similarity search
3. **Context Retrieval** β†’ Top-k relevant recipes
4. **LLM Generation** β†’ Context-aware response
5. **Memory Storage** β†’ Conversation persistence
6. **Response** β†’ JSON formatted reply
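The flow above can be sketched with stand-in functions (the real pipeline lives in `services/llm_service.py`; `search` and `generate` here are placeholders, not project APIs):

```python
# Illustrative sketch of the request flow; steps are numbered as above.
def handle_query(message: str, memory: list, search, generate, k: int = 25) -> str:
    # 2-3. Vector similarity search returns the top-k relevant recipes.
    context = search(message, k)
    # 4. The LLM produces a context-aware answer from the retrieved context.
    answer = generate(message, context, memory)
    # 5. Conversation memory persists the exchange for follow-up turns.
    memory.append({"user": message, "assistant": answer})
    # 6. The endpoint returns the reply (serialized to JSON by FastAPI).
    return answer
```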
## πŸ“Š Logging
Comprehensive logging system with:
- **File Rotation**: 10MB max size, 5 backups
- **Structured Format**: Timestamps, levels, source location
- **Emoji Indicators**: Visual status indicators
- **Error Tracking**: Full stack traces for debugging
**Log Levels:**
- πŸš€ **INFO**: Normal operations
- ⚠️ **WARNING**: Non-critical issues
- ❌ **ERROR**: Failures with stack traces
- πŸ”§ **DEBUG**: Detailed operation steps
**Log Location:** `./logs/recipe_bot.log`
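The rotation policy above corresponds to a standard `RotatingFileHandler`; a sketch of what `config/logging_config.py` might set up (format string is illustrative):

```python
import logging
from logging.handlers import RotatingFileHandler

# Sketch of the rotation policy described above: 10 MB max size, 5 backups.
def build_logger(path: str = "./logs/recipe_bot.log") -> logging.Logger:
    handler = RotatingFileHandler(path, maxBytes=10 * 1024 * 1024, backupCount=5)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s | %(levelname)s | %(name)s:%(lineno)d | %(message)s"
    ))
    logger = logging.getLogger("recipe_bot")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```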
## πŸ“ Data Management
### Recipe Data
- **Location**: `./data/recipes/`
- **Format**: JSON files with structured recipe data
- **Schema**: title, ingredients, directions, tags
- **Auto-loading**: Automatic chunking and vectorization
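For reference, a minimal recipe file matching that schema might look like this (field values are illustrative); files placed in `./data/recipes/` are picked up on startup or after a DB refresh:

```python
import json

# Illustrative recipe matching the schema above (values are made up).
sample_recipe = {
    "title": "Lemon Herb Chicken",
    "ingredients": ["2 chicken breasts", "1 lemon", "2 tbsp olive oil", "fresh thyme"],
    "directions": ["Marinate the chicken for 30 minutes.", "Grill 6-7 minutes per side."],
    "tags": ["chicken", "dinner", "quick"],
}

# Written to the working directory here; in the project it would go
# under ./data/recipes/ instead.
with open("sample_recipe.json", "w", encoding="utf-8") as f:
    json.dump(sample_recipe, f, indent=2)
```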
### Vector Storage
- **ChromaDB**: Local persistence in `./data/chromadb_persist/`
- **MongoDB**: Cloud-based vector search
- **Embeddings**: Configurable embedding models
- **Retrieval**: Top-k similarity search (k=25)
## πŸ”§ Development
### Running in Development
```bash
# Install dependencies
pip install -r requirements.txt
# Set up environment
cp .env.example .env
# Configure your API keys
# Run with auto-reload
uvicorn app:app --reload --host 127.0.0.1 --port 8080
```
### Testing Individual Components
```bash
# Test vector store
python -c "from services.vector_store import vector_store_service; print('Vector store initialized')"
# Test LLM service
python -c "from services.llm_service import llm_service; print('LLM service initialized')"
```
### Adding New Recipes
1. Add JSON files to `./data/recipes/`
2. Set `DB_REFRESH_ON_START=true` in `.env` file
3. Restart the application (ChromaDB will be recreated)
4. Set `DB_REFRESH_ON_START=false` to prevent repeated deletion
5. New recipes are now available for search
**Quick refresh:**
```bash
# Enable refresh, restart, then disable
echo "DB_REFRESH_ON_START=true" >> .env
uvicorn app:app --reload --host 127.0.0.1 --port 8080
# After startup completes:
sed -i 's/DB_REFRESH_ON_START=true/DB_REFRESH_ON_START=false/' .env
```
## πŸš€ Production Deployment
### Environment Setup
```bash
ENVIRONMENT=production
DEBUG=false
LOG_LEVEL=INFO
```
### Docker Deployment
The backend is containerized and ready for deployment on platforms like Hugging Face Spaces.
### Security Features
- **Environment Variables**: Secure API key management
- **CORS Configuration**: Frontend integration protection
- **Input Sanitization**: Context-appropriate validation for recipe queries
  - XSS protection through HTML encoding
  - Length validation (1-1000 characters)
  - Basic harmful pattern removal
  - Whitespace normalization
- **Pydantic Validation**: Type safety and automatic sanitization
- **Structured Error Handling**: Safe error responses without data leaks
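A simplified sketch of those sanitization steps (the function name and exact patterns are assumptions, not the project's actual implementation):

```python
import html
import re

def sanitize_query(text: str, max_len: int = 1000) -> str:
    # Whitespace normalization: collapse runs and trim the ends.
    text = re.sub(r"\s+", " ", text).strip()
    # Basic harmful-pattern removal (script tags, as one example).
    text = re.sub(r"<\s*script[^>]*>.*?<\s*/\s*script\s*>", "", text, flags=re.I | re.S)
    # XSS protection via HTML encoding of the remaining text.
    text = html.escape(text)
    # Length validation: 1-1000 characters.
    if not (1 <= len(text) <= max_len):
        raise ValueError("query must be between 1 and 1000 characters")
    return text
```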
## πŸ› οΈ Troubleshooting
### Common Issues
**Vector store initialization fails**
- Check API keys for embedding provider
- Verify data folder contains recipe files
- Check ChromaDB permissions
**LLM service fails**
- Verify API key configuration
- Check provider-specific requirements
- Review logs for detailed error messages
**HuggingFace model import errors**
- HuggingFace APIs have proven unreliable for production use
- **Recommended**: Use Ollama to run HuggingFace models locally instead:
```bash
# Install and run HuggingFace models via Ollama
ollama pull codeqwen:7b
ollama pull mistral-nemo:12b
# Set LLM_PROVIDER=ollama in .env
```
- For legacy HuggingFace API setup, uncomment dependencies in `requirements.txt` (not recommended)
- For detailed model comparisons, see [`docs/model-selection-guide.md`](docs/model-selection-guide.md)
**Memory issues**
```bash
# Clear conversation memory
curl -X POST http://localhost:8080/clear-memory
```
### Debug Mode
Set `DEBUG=true` in `.env` for detailed logging and error traces.
### Log Analysis
Check `./logs/recipe_bot.log` for detailed operation logs with emoji indicators for quick status identification.
## πŸ“š Documentation
### Troubleshooting Guides
- **[Embedding Troubleshooting](./docs/embedding-troubleshooting.md)** - Quick fixes for common embedding dimension errors
- **[Embedding Compatibility Guide](./docs/embedding-compatibility-guide.md)** - Comprehensive guide to embedding models and dimensions
- **[Logging Guide](./docs/logging_guide.md)** - Understanding the logging system
### Technical Guides
- **[Architecture Documentation](./docs/architecture.md)** - System architecture overview
- **[API Documentation](./docs/api-documentation.md)** - Detailed API reference
- **[Deployment Guide](./docs/deployment.md)** - Production deployment instructions
### Common Issues
- **Dimension mismatch errors**: See [Embedding Troubleshooting](./docs/embedding-troubleshooting.md)
- **Model loading issues**: Check provider configuration in `.env`
- **Database connection problems**: Verify MongoDB/ChromaDB settings
## πŸ“š Dependencies
### Core Dependencies
- **FastAPI**: Modern web framework
- **uvicorn**: ASGI server
- **pydantic**: Data validation
- **python-dotenv**: Environment management
### AI/ML Dependencies
- **langchain**: LLM framework and chains
- **langchain-openai**: OpenAI integration
- **langchain-google-genai**: Google AI integration
- **sentence-transformers**: Embedding models
- **chromadb**: Vector database
- **pymongo**: MongoDB integration
### Optional Dependencies
- **langchain-huggingface**: HuggingFace integration
- **torch**: PyTorch for local models
## πŸ“„ License
This project is part of the PLG4 Recipe Recommendation Chatbot system.
---
For more detailed documentation, check the `docs/` folder or visit the API documentation at `http://localhost:8080/docs` when running the server.