# Recipe Recommendation Chatbot - Backend API

Backend for an AI-powered recipe recommendation system, built with FastAPI and featuring RAG (Retrieval-Augmented Generation), conversational memory, and multi-provider LLM support.

## 🚀 Quick Start

### Prerequisites
- Python 3.9+
- pip or poetry
- An API key for your chosen LLM provider (OpenAI, Google, or Anthropic); local Ollama models need no key

### Installation

1. **Clone and navigate to backend**
   ```bash
   git clone <repository-url>
   cd PLG4-Recipe-Recommendation-Chatbot/backend
   ```

2. **Install dependencies**
   ```bash
   pip install -r requirements.txt
   ```
   > 💡 **Note**: Some packages are commented out by default to keep the installation lightweight:
   > - **HuggingFace dependencies** (`transformers`, `accelerate`, `sentence-transformers`) - Uncomment only if you use HuggingFace models
   > - **sentence-transformers** (~800MB) - Uncomment for HuggingFace embeddings

3. **Configure environment**
   ```bash
   cp .env.example .env
   # Edit .env with your API keys and configuration
   ```

4. **Run the server**
   ```bash
   # Development mode with auto-reload
   uvicorn app:app --reload --host 127.0.0.1 --port 8080
   
   # Or production mode
   uvicorn app:app --host 127.0.0.1 --port 8080
   ```

5. **Test the API**
   ```bash
   curl http://localhost:8080/health
   ```

6. **HuggingFace Spaces deployment**
   ```bash
   sh deploy-to-hf.sh <remote>
   ```
   where `<remote>` points to the HuggingFace Spaces repository.

## 📁 Project Structure

```
backend/
├── app.py                 # FastAPI application entry point
├── requirements.txt       # Python dependencies
├── .env.example           # Environment configuration template
├── .gitignore             # Git ignore rules
│
├── config/                # Configuration modules
│   ├── __init__.py
│   ├── settings.py        # Application settings
│   ├── database.py        # Database configuration
│   └── logging_config.py  # Logging setup
│
├── services/              # Core business logic
│   ├── __init__.py
│   ├── llm_service.py     # LLM and RAG pipeline
│   └── vector_store.py    # Vector database management
│
├── data/                  # Data storage
│   ├── recipes/           # Recipe JSON files
│   │   └── recipe.json    # Sample recipe data
│   └── chromadb_persist/  # ChromaDB persistence
│
├── logs/                  # Application logs
│   └── recipe_bot.log     # Main log file
│
├── docs/                  # Documentation
│   ├── model-selection-guide.md           # 🎯 Complete model selection & comparison guide
│   ├── model-quick-reference.md           # ⚡ Quick model switching commands
│   ├── chromadb_refresh.md                # ChromaDB refresh guide
│   ├── opensource-llm-configuration.md    # Open source LLM setup guide
│   ├── logging_guide.md                   # Logging documentation
│   ├── optimal_recipes_structure.md       # Recipe data structure guide
│   ├── sanitization_guide.md              # Input sanitization guide
│   └── unified-provider-configuration.md  # Unified provider approach guide
│
└── utils/                 # Utility functions
    └── __init__.py
```

## ⚙️ Configuration

### Environment Variables

Copy `.env.example` to `.env` and configure the following:

> 🎯 **Unified Provider Approach**: The `LLM_PROVIDER` setting controls both LLM and embedding models, preventing configuration mismatches. See [`docs/unified-provider-configuration.md`](docs/unified-provider-configuration.md) for details.

#### **Server Configuration**
```bash
PORT=8000                 # Server port
HOST=0.0.0.0             # Server host
ENVIRONMENT=development   # Environment mode
DEBUG=true               # Debug mode
```

#### **Provider Configuration**
Choose one provider for both LLM and embeddings (unified approach):

> 🎯 **NEW: Complete Model Selection Guide**: For detailed comparisons of all models (OpenAI, Google, Anthropic, Ollama, HuggingFace) including latest 2025 models, performance metrics, costs, and scenario-based recommendations, see [`docs/model-selection-guide.md`](docs/model-selection-guide.md)

> ⚑ **Quick Reference**: For one-command model switching, see [`docs/model-quick-reference.md`](docs/model-quick-reference.md)

**OpenAI (Best Value & Latest Models)**
```bash
LLM_PROVIDER=openai
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_MODEL=gpt-5-nano             # 🎯 BEST VALUE: $1/month for 30K queries - Modern GPT-5 at nano price
# Alternatives:
# - gpt-4o-mini                     # Proven choice: $4/month for 30K queries
# - gpt-5                           # Premium: $20/month unlimited (Plus plan)
OPENAI_EMBEDDING_MODEL=text-embedding-3-small # Used automatically
```

**Google Gemini (Best Free Tier)**
```bash
LLM_PROVIDER=google
GOOGLE_API_KEY=your_google_api_key_here
GOOGLE_MODEL=gemini-2.5-flash       # 🎯 RECOMMENDED: Excellent free tier, then $2/month
# Alternatives:
# - gemini-2.0-flash-lite           # Ultra budget: $0.90/month for 30K queries
# - gemini-2.5-pro                  # Premium: $25/month for 30K queries
GOOGLE_EMBEDDING_MODEL=models/embedding-001 # Used automatically
```

**Anthropic Claude (Best Quality-to-Cost)**
```bash
LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=your_anthropic_api_key_here
ANTHROPIC_MODEL=claude-3-5-haiku-20241022  # 🎯 BUDGET WINNER: $4/month for 30K queries
# Alternatives:
# - claude-3-5-sonnet-20241022      # Production standard: $45/month for 30K queries
# - claude-3-opus-20240229          # Premium quality: $225/month for 30K queries
ANTHROPIC_EMBEDDING_MODEL=voyage-large-2 # Used automatically
```

**Ollama (Best for Privacy/Self-Hosting)**
```bash
LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1:8b            # 🎯 CURRENT SETUP: 4.7GB download, 8GB RAM, excellent balance
# Newer alternatives:
# - deepseek-r1:7b                  # Breakthrough reasoning: 4.7GB download, O1-level performance
# - codeqwen:7b                     # Structured data expert: 4.2GB download, excellent for recipes
# - gemma3:4b                       # Resource-efficient: 3.3GB download, 6GB RAM
# - mistral-nemo:12b                # Balanced performance: 7GB download, 12GB RAM
OLLAMA_EMBEDDING_MODEL=nomic-embed-text # Used automatically
```

**HuggingFace (Downloadable Models Only - APIs Unreliable)**
```bash
LLM_PROVIDER=ollama  # Use Ollama to run HuggingFace models locally
OLLAMA_MODEL=codeqwen:7b             # 🎯 RECOMMENDED: Download HF models via Ollama for reliability
# Other downloadable options:
# - mistral-nemo:12b                # Mistral's balanced model
# - nous-hermes2:10.7b              # Fine-tuned for instruction following
# - openhermes2.5-mistral:7b        # Community favorite
OLLAMA_EMBEDDING_MODEL=nomic-embed-text # Used automatically
```
> ⚠️ **HuggingFace Update**: HuggingFace APIs have proven unreliable for production, so the HuggingFace dependencies are no longer required. We now recommend downloading HuggingFace models and running them locally via Ollama, which gives more consistent reliability and performance.

> 📖 **Local Model Setup**: See [`docs/opensource-llm-configuration.md`](docs/opensource-llm-configuration.md) for GPU setup, model selection, and performance optimization with Ollama.

#### **Vector Store Configuration**
Choose between ChromaDB (local) or MongoDB Atlas:

**ChromaDB (Default)**
```bash
VECTOR_STORE_PROVIDER=chromadb
DB_COLLECTION_NAME=recipes
DB_PERSIST_DIRECTORY=./data/chromadb_persist
# Set to true to delete and recreate DB on startup (useful for adding new recipes)
DB_REFRESH_ON_START=false
```

**MongoDB Atlas**
```bash
VECTOR_STORE_PROVIDER=mongodb
MONGODB_URI=mongodb+srv://username:password@cluster.mongodb.net/
MONGODB_DATABASE=recipe_bot
MONGODB_COLLECTION=recipes
```

#### **Embedding Configuration**
```bash
# Embedding provider automatically matches LLM_PROVIDER (unified approach)
# No separate configuration needed - handled automatically based on LLM_PROVIDER setting
```

> 💡 **Unified Provider**: The `LLM_PROVIDER` setting automatically configures both the LLM and embedding models, ensuring consistency and preventing mismatched configurations. See [`docs/model-selection-guide.md`](docs/model-selection-guide.md) for all available options.
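
The unified provider idea can be sketched in a few lines of Python. This is only an illustration, not the project's actual `config/settings.py`: the function name `resolve_models` and the lookup table are hypothetical, and treating the example values above as fallback defaults is an assumption. The environment variable names themselves come from the `.env` examples above.

```python
import os

# Hypothetical lookup table: provider -> (LLM env var, embedding env var,
# assumed default LLM, assumed default embedding model).
# Variable names and values mirror the .env examples in this README.
PROVIDER_ENV = {
    "openai": ("OPENAI_MODEL", "OPENAI_EMBEDDING_MODEL",
               "gpt-5-nano", "text-embedding-3-small"),
    "google": ("GOOGLE_MODEL", "GOOGLE_EMBEDDING_MODEL",
               "gemini-2.5-flash", "models/embedding-001"),
    "anthropic": ("ANTHROPIC_MODEL", "ANTHROPIC_EMBEDDING_MODEL",
                  "claude-3-5-haiku-20241022", "voyage-large-2"),
    "ollama": ("OLLAMA_MODEL", "OLLAMA_EMBEDDING_MODEL",
               "llama3.1:8b", "nomic-embed-text"),
}


def resolve_models(env=None):
    """Return (provider, llm_model, embedding_model) derived from LLM_PROVIDER alone."""
    env = os.environ if env is None else env
    provider = env.get("LLM_PROVIDER", "").strip().lower()
    if provider not in PROVIDER_ENV:
        raise ValueError(f"Unsupported LLM_PROVIDER: {provider!r}")
    llm_var, emb_var, llm_default, emb_default = PROVIDER_ENV[provider]
    # Both models are looked up under the same provider, so they cannot drift apart.
    return provider, env.get(llm_var, llm_default), env.get(emb_var, emb_default)
```

Because the embedding model is always chosen from the same table row as the LLM, switching `LLM_PROVIDER` cannot leave a stale embedding model from another provider in place.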

## 🛠️ API Endpoints

### Core Endpoints

#### **Health Check**
```bash
GET /health
```
Returns service health and configuration status.

#### **Chat with RAG**
```bash
POST /chat
Content-Type: application/json

{
  "message": "What chicken recipes do you have?"
}
```
Full conversational RAG pipeline with memory and vector retrieval.

#### **Simple Demo**
```bash
GET /demo?prompt=Tell me about Italian cuisine
```
Simple LLM completion without RAG for testing.

#### **Clear Memory**
```bash
POST /clear-memory
```
Clears conversation memory for fresh start.

### Example Requests

**Chat Request:**
```bash
curl -X POST "http://localhost:8080/chat" \
  -H "Content-Type: application/json" \
  -d '{"message": "What are some quick breakfast recipes?"}'
```

**Demo Request:**
```bash
curl "http://localhost:8080/demo?prompt=What%20is%20your%20favorite%20pasta%20dish?"
```
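
The same two requests can be built from Python with only the standard library. This is a minimal sketch: the helper names (`build_chat_request`, `build_demo_url`, `send`) are hypothetical, and the base URL assumes the host and port used in the Quick Start. The shape of the JSON reply is not documented here, so `send` simply returns whatever JSON the server sends back.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8080"  # assumed: host/port from the Quick Start


def build_chat_request(message, base_url=BASE_URL):
    """Build (but do not send) the POST /chat request with a JSON body."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_demo_url(prompt, base_url=BASE_URL):
    """Build the GET /demo URL with the prompt percent-encoded."""
    query = urllib.parse.urlencode({"prompt": prompt}, quote_via=urllib.parse.quote)
    return f"{base_url}/demo?{query}"


def send(request_or_url, timeout=30):
    """Send the request and return the decoded JSON reply (needs a running server)."""
    with urllib.request.urlopen(request_or_url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `send(build_chat_request("What are some quick breakfast recipes?"))` performs the chat call, and `send(build_demo_url("Tell me about Italian cuisine"))` performs the demo call.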

## 🏗️ Architecture

### Core Components

#### **LLM Service** (`services/llm_service.py`)
- **ConversationalRetrievalChain**: Main RAG pipeline with memory
- **Simple Chat Completion**: Direct LLM responses without RAG
- **Multi-provider Support**: OpenAI, Google, Anthropic, Ollama (including HuggingFace models run via Ollama)
- **Conversation Memory**: Persistent chat history

#### **Vector Store Service** (`services/vector_store.py`)
- **ChromaDB Integration**: Local vector database
- **MongoDB Atlas Support**: Cloud vector search
- **Document Loading**: Automatic recipe data ingestion
- **Embedding Management**: Multi-provider embedding support

#### **Configuration System** (`config/`)
- **Settings Management**: Environment-based configuration
- **Database Configuration**: Vector store setup
- **Logging Configuration**: Structured logging with rotation

### Data Flow

1. **User Query** → FastAPI endpoint
2. **RAG Pipeline** → Vector similarity search
3. **Context Retrieval** → Top-k relevant recipes
4. **LLM Generation** → Context-aware response
5. **Memory Storage** → Conversation persistence
6. **Response** → JSON formatted reply
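
The six steps above can be sketched as a toy pipeline in plain Python. This is not the project's implementation, which uses a `ConversationalRetrievalChain` and a real vector store. Here a word-count vector stands in for the embedding model, `fake_llm` stands in for the provider call, and the response keys (`response`, `sources`) are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a real model returns dense vectors)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, recipes, k=2):
    """Steps 2-3: rank recipes by similarity to the query and keep the top k."""
    q = embed(query)
    return sorted(recipes, key=lambda r: cosine(q, embed(r)), reverse=True)[:k]


def fake_llm(prompt):
    """Placeholder for step 4; a real provider call would go here."""
    return f"(LLM reply based on a {len(prompt)}-character prompt)"


def chat(message, recipes, memory, llm=fake_llm, k=2):
    """Steps 1-6: retrieve context, build a prompt, generate, store, respond."""
    context = retrieve(message, recipes, k)
    history = "\n".join(f"{role}: {text}" for role, text in memory)
    prompt = f"History:\n{history}\n\nRecipes:\n" + "\n".join(context) + f"\n\nUser: {message}"
    answer = llm(prompt)
    memory.append(("user", message))        # step 5: keep the turn for follow-ups
    memory.append(("assistant", answer))
    return {"response": answer, "sources": context}  # step 6: JSON-serializable reply
```

Passing the same `memory` list to successive `chat` calls is what lets a follow-up question see the earlier turns in its prompt.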

## 📊 Logging

Comprehensive logging system with:

- **File Rotation**: 10MB max size, 5 backups
- **Structured Format**: Timestamps, levels, source location
- **Emoji Indicators**: Visual status indicators
- **Error Tracking**: Full stack traces for debugging

**Log Levels:**
- 🚀 **INFO**: Normal operations
- ⚠️ **WARNING**: Non-critical issues
- ❌ **ERROR**: Failures with stack traces
- 🔧 **DEBUG**: Detailed operation steps

**Log Location:** `./logs/recipe_bot.log`
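
A rotation setup matching the numbers above (10MB per file, 5 backups) might look like the following. This is a sketch using Python's standard `logging` module, not a copy of `config/logging_config.py`; the logger name `recipe_bot`, the function name `setup_logging`, and the exact format string are assumptions.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path


def setup_logging(log_file="./logs/recipe_bot.log", level=logging.INFO):
    """Create a logger that writes to a size-rotated log file."""
    Path(log_file).parent.mkdir(parents=True, exist_ok=True)
    handler = RotatingFileHandler(
        log_file,
        maxBytes=10 * 1024 * 1024,  # rotate once the file reaches 10MB
        backupCount=5,              # keep recipe_bot.log.1 through .5
        encoding="utf-8",
    )
    # Timestamp, level, and source location, as listed above (format is assumed).
    handler.setFormatter(logging.Formatter(
        "%(asctime)s | %(levelname)s | %(filename)s:%(lineno)d | %(message)s"
    ))
    logger = logging.getLogger("recipe_bot")  # hypothetical logger name
    logger.setLevel(level)
    logger.handlers.clear()  # avoid duplicate handlers if called twice
    logger.addHandler(handler)
    return logger
```

Calling `logger.exception("...")` inside an `except` block appends the full stack trace to the ERROR entry.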

## 📝 Data Management

### Recipe Data
- **Location**: `./data/recipes/`
- **Format**: JSON files with structured recipe data
- **Schema**: title, ingredients, directions, tags
- **Auto-loading**: Automatic chunking and vectorization
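
As an illustration of the loading step, the sketch below reads recipe JSON files and flattens each recipe into one text chunk ready for embedding. It assumes that `ingredients`, `directions`, and `tags` are lists of strings and that a file holds either one recipe object or a list of them; the function names are hypothetical, and the real loader may chunk differently.

```python
import json
from pathlib import Path


def recipe_to_text(recipe):
    """Flatten one recipe dict (title, ingredients, directions, tags) into a text chunk."""
    parts = [
        f"Title: {recipe.get('title', '')}",
        "Ingredients: " + ", ".join(recipe.get("ingredients", [])),
        "Directions: " + " ".join(recipe.get("directions", [])),
        "Tags: " + ", ".join(recipe.get("tags", [])),
    ]
    return "\n".join(parts)


def load_recipes(folder="./data/recipes"):
    """Read every *.json file in the folder and return one text chunk per recipe."""
    texts = []
    for path in sorted(Path(folder).glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        # Assumed: a file contains a single recipe object or a list of recipes.
        for recipe in (data if isinstance(data, list) else [data]):
            texts.append(recipe_to_text(recipe))
    return texts
```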

### Vector Storage
- **ChromaDB**: Local persistence in `./data/chromadb_persist/`
- **MongoDB**: Cloud-based vector search
- **Embeddings**: Configurable embedding models
- **Retrieval**: Top-k similarity search (k=25)

## 🔧 Development

### Running in Development
```bash
# Install dependencies
pip install -r requirements.txt

# Set up environment
cp .env.example .env
# Configure your API keys

# Run with auto-reload
uvicorn app:app --reload --host 127.0.0.1 --port 8080
```

### Testing Individual Components
```bash
# Test vector store
python -c "from services.vector_store import vector_store_service; print('Vector store initialized')"

# Test LLM service
python -c "from services.llm_service import llm_service; print('LLM service initialized')"
```

### Adding New Recipes
1. Add JSON files to `./data/recipes/`
2. Set `DB_REFRESH_ON_START=true` in `.env` file
3. Restart the application (ChromaDB will be recreated)
4. Set `DB_REFRESH_ON_START=false` to prevent repeated deletion
5. New recipes are now available for search

**Quick refresh:**
```bash
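# Enable refresh by editing the existing DB_REFRESH_ON_START line (avoids appending a duplicate key)
sed -i 's/^DB_REFRESH_ON_START=.*/DB_REFRESH_ON_START=true/' .env
uvicorn app:app --reload --host 127.0.0.1 --port 8080
# After startup completes, stop the server (Ctrl+C) or use a second terminal, then disable refresh:
sed -i 's/^DB_REFRESH_ON_START=.*/DB_REFRESH_ON_START=false/' .env
# On macOS/BSD sed, write: sed -i '' 's/.../.../' .env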
```
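
On the application side, the refresh flag could be honored roughly as shown below. This is a sketch under stated assumptions, not the code in `services/vector_store.py`: the helper names are hypothetical, and only the `DB_REFRESH_ON_START` and `DB_PERSIST_DIRECTORY` variables come from the configuration section.

```python
import os
import shutil
from pathlib import Path


def env_flag(name, default=False, env=None):
    """Interpret an environment variable as a boolean flag."""
    env = os.environ if env is None else env
    return env.get(name, str(default)).strip().lower() in {"1", "true", "yes"}


def prepare_persist_dir(env=None):
    """Delete the persisted ChromaDB directory when a refresh is requested.

    Returns (persist_dir, refreshed) so the caller knows whether it must
    re-ingest the recipe files.
    """
    env = os.environ if env is None else env
    persist_dir = Path(env.get("DB_PERSIST_DIRECTORY", "./data/chromadb_persist"))
    refreshed = False
    if env_flag("DB_REFRESH_ON_START", env=env) and persist_dir.exists():
        shutil.rmtree(persist_dir)  # drop old vectors so new recipes get indexed
        refreshed = True
    persist_dir.mkdir(parents=True, exist_ok=True)
    return persist_dir, refreshed
```

This also shows why the flag should be reset afterwards: left at `true`, every restart would delete and rebuild the whole store.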

## 🚀 Production Deployment

### Environment Setup
```bash
ENVIRONMENT=production
DEBUG=false
LOG_LEVEL=INFO
```

### Docker Deployment
The backend is containerized and ready for deployment on platforms like Hugging Face Spaces.

### Security Features
- **Environment Variables**: Secure API key management
- **CORS Configuration**: Frontend integration protection  
- **Input Sanitization**: Context-appropriate validation for recipe queries
  - XSS protection through HTML encoding
  - Length validation (1-1000 characters)
  - Basic harmful pattern removal
  - Whitespace normalization
- **Pydantic Validation**: Type safety and automatic sanitization
- **Structured Error Handling**: Safe error responses without data leaks
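
The input sanitization steps listed above could be combined along these lines. This is a hedged sketch rather than the project's sanitizer (see `docs/sanitization_guide.md` for that): the function name and the two example patterns are hypothetical, and the order of the steps is an assumption.

```python
import html
import re

MAX_LENGTH = 1000  # upper bound from the list above

# Hypothetical examples of "harmful patterns"; the real list lives in the project code.
HARMFUL_PATTERNS = [
    re.compile(r"<script\b.*?>.*?</script>", re.IGNORECASE | re.DOTALL),
    re.compile(r"javascript:", re.IGNORECASE),
]


def sanitize_message(raw):
    """Remove harmful patterns, normalize whitespace, check length, HTML-encode."""
    text = raw
    for pattern in HARMFUL_PATTERNS:
        text = pattern.sub("", text)
    text = re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace
    if not 1 <= len(text) <= MAX_LENGTH:
        raise ValueError("message must be 1-1000 characters")
    return html.escape(text)  # encode <, >, &, and quotes
```

For example, `sanitize_message("  quick   <b>pasta</b>\n recipes ")` returns `quick &lt;b&gt;pasta&lt;/b&gt; recipes`, while an empty or whitespace-only message raises `ValueError`.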

## 🛠️ Troubleshooting

### Common Issues

**Vector store initialization fails**
- Check API keys for embedding provider
- Verify data folder contains recipe files
- Check ChromaDB permissions

**LLM service fails**
- Verify API key configuration
- Check provider-specific requirements
- Review logs for detailed error messages

**HuggingFace model import errors**
- HuggingFace APIs have proven unreliable for production use
- **Recommended**: Use Ollama to run HuggingFace models locally instead:
  ```bash
  # Install and run HuggingFace models via Ollama
  ollama pull codeqwen:7b
  ollama pull mistral-nemo:12b
  # Set LLM_PROVIDER=ollama in .env
  ```
- For legacy HuggingFace API setup, uncomment dependencies in `requirements.txt` (not recommended)
- For detailed model comparisons, see [`docs/model-selection-guide.md`](docs/model-selection-guide.md)

**Memory issues**
```bash
# Clear conversation memory
curl -X POST http://localhost:8080/clear-memory
```

### Debug Mode
Set `DEBUG=true` in `.env` for detailed logging and error traces.

### Log Analysis
Check `./logs/recipe_bot.log` for detailed operation logs with emoji indicators for quick status identification.

## 📚 Documentation

### Troubleshooting Guides
- **[Embedding Troubleshooting](./docs/embedding-troubleshooting.md)** - Quick fixes for common embedding dimension errors
- **[Embedding Compatibility Guide](./docs/embedding-compatibility-guide.md)** - Comprehensive guide to embedding models and dimensions
- **[Logging Guide](./docs/logging_guide.md)** - Understanding the logging system

### Technical Guides
- **[Architecture Documentation](./docs/architecture.md)** - System architecture overview
- **[API Documentation](./docs/api-documentation.md)** - Detailed API reference
- **[Deployment Guide](./docs/deployment.md)** - Production deployment instructions

### Common Issues
- **Dimension mismatch errors**: See [Embedding Troubleshooting](./docs/embedding-troubleshooting.md)
- **Model loading issues**: Check provider configuration in `.env`
- **Database connection problems**: Verify MongoDB/ChromaDB settings

## 📚 Dependencies

### Core Dependencies
- **FastAPI**: Modern web framework
- **uvicorn**: ASGI server
- **pydantic**: Data validation
- **python-dotenv**: Environment management

### AI/ML Dependencies
- **langchain**: LLM framework and chains
- **langchain-openai**: OpenAI integration
- **langchain-google-genai**: Google AI integration
- **sentence-transformers**: Embedding models
- **chromadb**: Vector database
- **pymongo**: MongoDB integration

### Optional Dependencies
- **langchain-huggingface**: HuggingFace integration
- **torch**: PyTorch for local models

## 📄 License

This project is part of the PLG4 Recipe Recommendation Chatbot system.

---

For more detailed documentation, check the `docs/` folder or visit the API documentation at `http://localhost:8080/docs` when running the server.