plg4-dev-server/backend/docs/embedding-compatibility-guide.md
Jesse Johnson
New commit for backend deployment: 2025-09-25_13-24-03
c59d808
# Embedding Compatibility Guide
## πŸ” Understanding Embedding Dimensions
When working with vector databases and embeddings, **dimension compatibility** is crucial for successful similarity searches. This guide helps you understand and troubleshoot embedding dimension issues.
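Similarity search ultimately compares vectors, typically via cosine similarity, which is only defined for vectors of equal length. A minimal sketch in plain Python (no vector database required) of why matching dimensions matter:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity; both vectors must have the same number of dimensions."""
    if len(a) != len(b):
        raise ValueError(f"Dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Identical vectors score 1.0; mismatched lengths fail immediately
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
```

A 768-dimensional query vector simply cannot be scored against a 384-dimensional stored vector, which is exactly the failure mode this guide addresses.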
## πŸ“Š Common Embedding Models & Their Dimensions
| Provider | Model | Dimensions | Use Case |
|----------|-------|------------|----------|
| **HuggingFace** | `sentence-transformers/all-MiniLM-L6-v2` | **384** | Fast, lightweight, good for most tasks |
| **HuggingFace** | `sentence-transformers/all-mpnet-base-v2` | **768** | Higher quality, larger model |
| **Ollama** | `nomic-embed-text:v1.5` | **768** | Local inference, privacy-focused |
| **Ollama** | `mxbai-embed-large` | **1024** | High-quality local embeddings |
| **OpenAI** | `text-embedding-3-small` | **1536** | Commercial API, good performance |
| **OpenAI** | `text-embedding-3-large` | **3072** | Highest quality, expensive |
| **Google** | `models/embedding-001` | **768** | Google AI integration |
## ⚠️ Common Error: Dimension Mismatch
### Symptoms
```
WARNING - [custom_mongo_vector.py:103] - ⚠️ Error processing document: shapes (768,) and (384,) not aligned
```
### Root Cause
Your **query embeddings** and **stored embeddings** have different dimensions:
- Query: Generated with Model A (e.g., 768 dimensions)
- Stored: Created with Model B (e.g., 384 dimensions)
### Why This Happens
1. You changed embedding models after creating your database
2. Your database was created with a different embedding provider
3. Environment configuration doesn't match the original setup
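The "shapes not aligned" warning comes straight from NumPy: the dot product behind the similarity score rejects mismatched vectors. A minimal reproduction (assuming NumPy is installed):

```python
import numpy as np

query = np.random.rand(768)   # e.g. generated with nomic-embed-text:v1.5
stored = np.random.rand(384)  # e.g. stored with all-MiniLM-L6-v2

try:
    np.dot(query, stored)
except ValueError as e:
    print(e)  # shapes (768,) and (384,) not aligned ...
```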
## πŸ”§ Solution Strategies
### Strategy 1: Match Your Current Database (Recommended)
**Step 1: Identify stored embedding dimensions**
```bash
# Check stored embedding dimensions in MongoDB (replace your_db/your_collection)
# The length of the 'ingredients_emb' array is the dimension count
mongosh your_db --quiet --eval 'db.your_collection.findOne({ingredients_emb: {$exists: true}}).ingredients_emb.length'
```
**Step 2: Update .env to match**
```bash
# If stored embeddings are 384-dimensional (common with all-MiniLM-L6-v2)
EMBEDDING_PROVIDER=huggingface
HUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
# If stored embeddings are 768-dimensional
EMBEDDING_PROVIDER=ollama
OLLAMA_EMBEDDING_MODEL=nomic-embed-text:v1.5
```
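If you are unsure which setting to pick, the model table above can be inverted into a small lookup from stored dimension to a candidate configuration. A sketch (the mapping only covers the models listed in this guide, not an exhaustive registry):

```python
# Map stored embedding dimensions to candidate .env settings (from the model table above)
DIMENSION_TO_CONFIG = {
    384: ("huggingface", "sentence-transformers/all-MiniLM-L6-v2"),
    768: ("ollama", "nomic-embed-text:v1.5"),
    1024: ("ollama", "mxbai-embed-large"),
    1536: ("openai", "text-embedding-3-small"),
    3072: ("openai", "text-embedding-3-large"),
}

def suggest_config(stored_dims: int) -> tuple[str, str]:
    """Return (EMBEDDING_PROVIDER, model name) matching the stored dimension."""
    try:
        return DIMENSION_TO_CONFIG[stored_dims]
    except KeyError:
        raise ValueError(f"No known model produces {stored_dims}-dimensional embeddings")

print(suggest_config(384))  # ('huggingface', 'sentence-transformers/all-MiniLM-L6-v2')
```

Note that 768 dimensions is ambiguous (`all-mpnet-base-v2`, `nomic-embed-text:v1.5`, and Google's `embedding-001` all produce it), so confirm which provider originally built the database before switching.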
### Strategy 2: Regenerate Database with New Model
**Step 1: Choose your preferred embedding model**
```bash
# Example: Use Ollama for local inference
EMBEDDING_PROVIDER=ollama
OLLAMA_EMBEDDING_MODEL=nomic-embed-text:v1.5
```
**Step 2: Enable database refresh**
```bash
DB_REFRESH_ON_START=true
```
**Step 3: Restart application**
```bash
uvicorn app:app --reload
```
**Step 4: Disable refresh (Important!)**
```bash
DB_REFRESH_ON_START=false
```
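The four steps above can be sketched as a startup guard that rebuilds only when the flag is explicitly enabled. Here `rebuild_vector_store` is a hypothetical stand-in for your actual ingestion routine, not a function this codebase is known to export:

```python
import os

def maybe_refresh_database(rebuild_vector_store) -> bool:
    """Rebuild embeddings only when DB_REFRESH_ON_START is explicitly 'true'."""
    if os.getenv("DB_REFRESH_ON_START", "false").lower() == "true":
        rebuild_vector_store()
        return True
    return False

# Usage: maybe_refresh_database(my_ingest_function)
```

Defaulting to `"false"` when the variable is unset mirrors the pitfall below: an accidental rebuild on every restart is the expensive failure, so refresh must be opt-in.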
## πŸ” Debugging Embedding Issues
### Check Current Configuration
```bash
# View your current embedding setup
grep -E "EMBEDDING_PROVIDER|_EMBEDDING_MODEL" .env
```
### Monitor Embedding Dimensions
The custom MongoDB vector store now logs dimension information:
```
πŸ”’ Query embedding dimensions: 768
⚠️ Dimension mismatch: query=768D, stored=384D
πŸ’‘ Consider changing EMBEDDING_PROVIDER to match stored embeddings
```
### Verify Database Content
```python
# Check stored embedding dimensions in MongoDB (adjust URI/db/collection names)
from pymongo import MongoClient

collection = MongoClient("mongodb://localhost:27017")["your_db"]["your_collection"]
doc = collection.find_one({"ingredients_emb": {"$exists": True}})
print(f"Stored embedding dimensions: {len(doc['ingredients_emb'])}")
```
## πŸ“‹ Environment Configuration Examples
### Example 1: HuggingFace (384D) - Most Common
```bash
# .env configuration for 384-dimensional embeddings
EMBEDDING_PROVIDER=huggingface
HUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
HUGGINGFACE_API_TOKEN=your_token_here
```
### Example 2: Ollama (768D) - Local Inference
```bash
# .env configuration for 768-dimensional embeddings
EMBEDDING_PROVIDER=ollama
OLLAMA_EMBEDDING_MODEL=nomic-embed-text:v1.5
OLLAMA_BASE_URL=http://localhost:11434
```
### Example 3: OpenAI (1536D) - Premium Quality
```bash
# .env configuration for 1536-dimensional embeddings
EMBEDDING_PROVIDER=openai
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
OPENAI_API_KEY=your_api_key_here
```
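Each provider layout above pairs `EMBEDDING_PROVIDER` with its own model variable, so a quick consistency check can catch a half-edited `.env` before the app starts. A sketch using the variable names from the examples (the validation logic itself is an assumption, not part of this codebase):

```python
# Required model variable per provider, following the .env examples above
REQUIRED_MODEL_VAR = {
    "huggingface": "HUGGINGFACE_EMBEDDING_MODEL",
    "ollama": "OLLAMA_EMBEDDING_MODEL",
    "openai": "OPENAI_EMBEDDING_MODEL",
}

def validate_env(env: dict) -> str:
    """Return the configured model name, or raise if the pairing is broken."""
    provider = env.get("EMBEDDING_PROVIDER", "")
    model_var = REQUIRED_MODEL_VAR.get(provider)
    if model_var is None:
        raise ValueError(f"Unknown EMBEDDING_PROVIDER: {provider!r}")
    if not env.get(model_var):
        raise ValueError(f"{model_var} must be set when EMBEDDING_PROVIDER={provider}")
    return env[model_var]

print(validate_env({"EMBEDDING_PROVIDER": "ollama",
                    "OLLAMA_EMBEDDING_MODEL": "nomic-embed-text:v1.5"}))
```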
## 🚨 Common Pitfalls
### 1. Mixed Providers
❌ **Don't do this:**
```bash
# Database created with HuggingFace
EMBEDDING_PROVIDER=huggingface # Original
# Later changed to Ollama without refreshing DB
EMBEDDING_PROVIDER=ollama # New - causes dimension mismatch!
```
### 2. Forgetting to Disable Refresh
❌ **Don't forget:**
```bash
# After refreshing database, always disable refresh
DB_REFRESH_ON_START=false # SET THIS BACK TO FALSE!
```
### 3. Model Name Typos
❌ **Watch out for:**
```bash
# A typo in the model name will cause failures
OLLAMA_EMBEDDING_MODEL=nomic-embed-text:v1.5   # βœ… correct
OLLAMA_EMBEDDING_MODEL=nomic-embed-text        # ❌ missing version tag
```
## πŸ“Š Performance Comparison
| Model | Speed | Quality | Dimensions | Local/API | Cost |
|-------|-------|---------|------------|-----------|------|
| `all-MiniLM-L6-v2` | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | 384 | Both | Free |
| `nomic-embed-text:v1.5` | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | 768 | Local | Free |
| `text-embedding-3-small` | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | 1536 | API | $$$ |
## πŸ”§ Troubleshooting Steps
### Step 1: Check Current Setup
```bash
# 1. Check your embedding configuration
grep EMBEDDING .env
# 2. Check the vector store provider
grep VECTOR_STORE_PROVIDER .env
```
### Step 2: Test Embedding Generation
```python
# Test script to check embedding dimensions
from services.vector_store import vector_store_service
# Generate a test embedding
test_embedding = vector_store_service.embeddings.embed_query("test")
print(f"Current embedding dimensions: {len(test_embedding)}")
```
### Step 3: Check Database Content
For MongoDB users:
```javascript
// MongoDB shell: the array length of a stored embedding is its dimension count
db.your_collection.findOne({"ingredients_emb": {"$exists": true}}).ingredients_emb.length
```
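Putting the query-side check (Step 2) and the stored-side check (Step 3) together makes the verdict explicit. A small sketch; plug in the two numbers you just measured:

```python
def diagnose(query_dims: int, stored_dims: int) -> str:
    """Compare measured query and stored embedding dimensions."""
    if query_dims == stored_dims:
        return f"OK: both embeddings are {query_dims}-dimensional"
    return (f"Mismatch: query={query_dims}D, stored={stored_dims}D; "
            f"change EMBEDDING_PROVIDER to the model that built the database")

print(diagnose(768, 384))
```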
### Step 4: Apply Fix
Choose one of the strategies above based on your needs.
## πŸ“ Best Practices
### 1. Document Your Embedding Model
Keep a record of which embedding model you used:
```bash
# Add comments to your .env file
# Database created on 2025-08-27 with all-MiniLM-L6-v2 (384D)
EMBEDDING_PROVIDER=huggingface
HUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
```
### 2. Version Control Your Configuration
```bash
# Track a sanitized .env.example rather than your real .env (it holds API keys)
git add .env.example
git commit -m "Update embedding model to match database (384D)"
```
### 3. Test After Changes
```bash
# After changing embedding configuration, test a query
curl -X POST "http://localhost:8080/chat" \
-H "Content-Type: application/json" \
-d '{"message": "test query"}'
```
## πŸ†˜ Quick Reference
### Error Pattern Recognition
```
shapes (768,) and (384,) not aligned β†’ Query=768D, Stored=384D
shapes (384,) and (768,) not aligned β†’ Query=384D, Stored=768D
shapes (1536,) and (384,) not aligned β†’ Query=1536D, Stored=384D
```
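To triage these automatically, the pattern can be extracted with a regex. A hedged sketch that pulls the query and stored dimensions out of NumPy's message:

```python
import re

# Extract (query_dims, stored_dims) from NumPy's "not aligned" message
PATTERN = re.compile(r"shapes \((\d+),\) and \((\d+),\) not aligned")

def parse_mismatch(message: str):
    """Return (query_dims, stored_dims), or None if the message doesn't match."""
    m = PATTERN.search(message)
    return (int(m.group(1)), int(m.group(2))) if m else None

print(parse_mismatch("shapes (768,) and (384,) not aligned"))  # (768, 384)
```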
### Quick Fixes
| Stored Dimensions | Set EMBEDDING_PROVIDER to |
|-------------------|-------------------------|
| 384 | `huggingface` with `all-MiniLM-L6-v2` |
| 768 | `ollama` with `nomic-embed-text:v1.5` |
| 1536 | `openai` with `text-embedding-3-small` |
---
## πŸ“ž Need Help?
If you're still experiencing issues:
1. Check the application logs for detailed error messages
2. Verify your embedding model is properly installed/accessible
3. Ensure your database connection is working
4. Consider regenerating your vector database if switching models permanently
Remember: **Consistency is key** - your query embeddings and stored embeddings must use the same model and dimensions!