# Embedding Compatibility Guide
## Understanding Embedding Dimensions
When working with vector databases and embeddings, **dimension compatibility** is crucial for successful similarity searches. This guide helps you understand and troubleshoot embedding dimension issues.
## Common Embedding Models & Their Dimensions
| Provider | Model | Dimensions | Use Case |
|----------|-------|------------|----------|
| **HuggingFace** | `sentence-transformers/all-MiniLM-L6-v2` | **384** | Fast, lightweight, good for most tasks |
| **HuggingFace** | `sentence-transformers/all-mpnet-base-v2` | **768** | Higher quality, larger model |
| **Ollama** | `nomic-embed-text:v1.5` | **768** | Local inference, privacy-focused |
| **Ollama** | `mxbai-embed-large` | **1024** | High-quality local embeddings |
| **OpenAI** | `text-embedding-3-small` | **1536** | Commercial API, good performance |
| **OpenAI** | `text-embedding-3-large` | **3072** | Highest quality, expensive |
| **Google** | `models/embedding-001` | **768** | Google AI integration |
## Common Error: Dimension Mismatch
### Symptoms
```
WARNING - [custom_mongo_vector.py:103] - Error processing document: shapes (768,) and (384,) not aligned
```
### Root Cause
Your **query embeddings** and **stored embeddings** have different dimensions:
- Query: Generated with Model A (e.g., 768 dimensions)
- Stored: Created with Model B (e.g., 384 dimensions)
### Why This Happens
1. You changed embedding models after creating your database
2. Your database was created with a different embedding provider
3. Environment configuration doesn't match the original setup
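The error itself comes from multiplying vectors of different lengths. A minimal reproduction with NumPy (assuming, as the log message suggests, that similarity is computed with a dot product):

```python
import numpy as np

query = np.random.rand(768)   # query embedded with a 768-dimensional model
stored = np.random.rand(384)  # document embedded with a 384-dimensional model

try:
    np.dot(query, stored)  # similarity score computation
except ValueError as err:
    print(err)  # shapes (768,) and (384,) not aligned ...
```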
## Solution Strategies
### Strategy 1: Match Your Current Database (Recommended)
**Step 1: Identify stored embedding dimensions**
```bash
# Inspect one document in your MongoDB collection and count the length
# of its 'ingredients_emb' array; that length is the stored embedding
# dimension (see "Check Database Content" below for concrete commands)
```
**Step 2: Update .env to match**
```bash
# If stored embeddings are 384-dimensional (common with all-MiniLM-L6-v2)
EMBEDDING_PROVIDER=huggingface
HUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
# If stored embeddings are 768-dimensional
EMBEDDING_PROVIDER=ollama
OLLAMA_EMBEDDING_MODEL=nomic-embed-text:v1.5
```
### Strategy 2: Regenerate Database with New Model
**Step 1: Choose your preferred embedding model**
```bash
# Example: Use Ollama for local inference
EMBEDDING_PROVIDER=ollama
OLLAMA_EMBEDDING_MODEL=nomic-embed-text:v1.5
```
**Step 2: Enable database refresh**
```bash
DB_REFRESH_ON_START=true
```
**Step 3: Restart application**
```bash
uvicorn app:app --reload
```
**Step 4: Disable refresh (Important!)**
```bash
DB_REFRESH_ON_START=false
```
## Debugging Embedding Issues
### Check Current Configuration
```bash
# View your current embedding setup
grep -E "EMBEDDING_PROVIDER|_EMBEDDING_MODEL" .env
```
### Monitor Embedding Dimensions
The custom MongoDB vector store now logs dimension information:
```
Query embedding dimensions: 768
Dimension mismatch: query=768D, stored=384D
Consider changing EMBEDDING_PROVIDER to match stored embeddings
```
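A check behind log lines like these can be sketched as a small helper. The function name is hypothetical; only the vector lengths matter:

```python
def check_dimensions(query_emb: list[float], stored_emb: list[float]) -> bool:
    """Compare query and stored embedding lengths before a similarity search."""
    q, s = len(query_emb), len(stored_emb)
    print(f"Query embedding dimensions: {q}")
    if q != s:
        print(f"Dimension mismatch: query={q}D, stored={s}D")
        print("Consider changing EMBEDDING_PROVIDER to match stored embeddings")
        return False
    return True
```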
### Verify Database Content
```python
# Check stored embedding dimensions in MongoDB (assumes a pymongo `collection`)
doc = collection.find_one({"ingredients_emb": {"$exists": True}})
print(len(doc["ingredients_emb"]))  # the array length is the embedding dimension
```
## Environment Configuration Examples
### Example 1: HuggingFace (384D) - Most Common
```bash
# .env configuration for 384-dimensional embeddings
EMBEDDING_PROVIDER=huggingface
HUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
HUGGINGFACE_API_TOKEN=your_token_here
```
### Example 2: Ollama (768D) - Local Inference
```bash
# .env configuration for 768-dimensional embeddings
EMBEDDING_PROVIDER=ollama
OLLAMA_EMBEDDING_MODEL=nomic-embed-text:v1.5
OLLAMA_BASE_URL=http://localhost:11434
```
### Example 3: OpenAI (1536D) - Premium Quality
```bash
# .env configuration for 1536-dimensional embeddings
EMBEDDING_PROVIDER=openai
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
OPENAI_API_KEY=your_api_key_here
```
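To fail fast on a misconfiguration, the examples above can be validated at startup against the database's stored dimension. This is a sketch; `validate_config` and the `KNOWN` table are illustrative, not part of any library:

```python
import os

# Known dimensions for the models shown in the examples above.
KNOWN = {
    "sentence-transformers/all-MiniLM-L6-v2": 384,
    "nomic-embed-text:v1.5": 768,
    "text-embedding-3-small": 1536,
}

def validate_config(stored_dims: int) -> None:
    """Raise at startup if the configured model cannot match the database."""
    model = (os.getenv("HUGGINGFACE_EMBEDDING_MODEL")
             or os.getenv("OLLAMA_EMBEDDING_MODEL")
             or os.getenv("OPENAI_EMBEDDING_MODEL", ""))
    dims = KNOWN.get(model)
    if dims is not None and dims != stored_dims:
        raise RuntimeError(f"{model} produces {dims}D embeddings, "
                           f"but the database stores {stored_dims}D")
```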
## Common Pitfalls
### 1. Mixed Providers
❌ **Don't do this:**
```bash
# Database created with HuggingFace
EMBEDDING_PROVIDER=huggingface # Original
# Later changed to Ollama without refreshing DB
EMBEDDING_PROVIDER=ollama # New - causes dimension mismatch!
```
### 2. Forgetting to Disable Refresh
❌ **Don't forget:**
```bash
# After refreshing database, always disable refresh
DB_REFRESH_ON_START=false # SET THIS BACK TO FALSE!
```
### 3. Model Name Typos
❌ **Watch out for:**
```bash
# Typo in model name will cause failures
OLLAMA_EMBEDDING_MODEL=nomic-embed-text:v1.5   # correct
OLLAMA_EMBEDDING_MODEL=nomic-embed-text        # wrong: missing version tag
```
## Performance Comparison
| Model | Speed | Quality | Dimensions | Local/API | Cost |
|-------|-------|---------|------------|-----------|------|
| `all-MiniLM-L6-v2` | ★★★★★ | ★★★ | 384 | Both | Free |
| `nomic-embed-text:v1.5` | ★★★★ | ★★★★ | 768 | Local | Free |
| `text-embedding-3-small` | ★★★★★ | ★★★★★ | 1536 | API | $$$ |
## Troubleshooting Steps
### Step 1: Check Current Setup
```bash
# 1. Check your environment configuration
cat .env | grep EMBEDDING
# 2. Check vector store provider
cat .env | grep VECTOR_STORE_PROVIDER
```
### Step 2: Test Embedding Generation
```python
# Test script to check embedding dimensions
from services.vector_store import vector_store_service
# Generate a test embedding
test_embedding = vector_store_service.embeddings.embed_query("test")
print(f"Current embedding dimensions: {len(test_embedding)}")
```
### Step 3: Check Database Content
For MongoDB users:
```javascript
// MongoDB shell command to check stored embedding dimensions
db.your_collection.findOne({ ingredients_emb: { $exists: true } }).ingredients_emb.length
```
### Step 4: Apply Fix
Choose one of the strategies above based on your needs.
## Best Practices
### 1. Document Your Embedding Model
Keep a record of which embedding model you used:
```bash
# Add comments to your .env file
# Database created on 2025-08-27 with all-MiniLM-L6-v2 (384D)
EMBEDDING_PROVIDER=huggingface
HUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
```
### 2. Version Control Your Configuration
```bash
# Commit your .env changes with descriptive messages
git add .env
git commit -m "Update embedding model to match database (384D)"
```
### 3. Test After Changes
```bash
# After changing embedding configuration, test a query
curl -X POST "http://localhost:8080/chat" \
-H "Content-Type: application/json" \
-d '{"message": "test query"}'
```
## Quick Reference
### Error Pattern Recognition
```
shapes (768,) and (384,) not aligned → Query=768D, Stored=384D
shapes (384,) and (768,) not aligned → Query=384D, Stored=768D
shapes (1536,) and (384,) not aligned → Query=1536D, Stored=384D
```
### Quick Fixes
| Stored Dimensions | Set EMBEDDING_PROVIDER to |
|-------------------|-------------------------|
| 384 | `huggingface` with `all-MiniLM-L6-v2` |
| 768 | `ollama` with `nomic-embed-text:v1.5` |
| 1536 | `openai` with `text-embedding-3-small` |
---
## Need Help?
If you're still experiencing issues:
1. Check the application logs for detailed error messages
2. Verify your embedding model is properly installed/accessible
3. Ensure your database connection is working
4. Consider regenerating your vector database if switching models permanently
Remember: **Consistency is key** - your query embeddings and stored embeddings must use the same model and dimensions!