
Embedding Compatibility Guide

🔍 Understanding Embedding Dimensions

When working with vector databases and embeddings, dimension compatibility is crucial for successful similarity searches. This guide helps you understand and troubleshoot embedding dimension issues.

📊 Common Embedding Models & Their Dimensions

| Provider | Model | Dimensions | Use Case |
|----------|-------|------------|----------|
| HuggingFace | sentence-transformers/all-MiniLM-L6-v2 | 384 | Fast, lightweight, good for most tasks |
| HuggingFace | sentence-transformers/all-mpnet-base-v2 | 768 | Higher quality, larger model |
| Ollama | nomic-embed-text:v1.5 | 768 | Local inference, privacy-focused |
| Ollama | mxbai-embed-large | 1024 | High-quality local embeddings |
| OpenAI | text-embedding-3-small | 1536 | Commercial API, good performance |
| OpenAI | text-embedding-3-large | 3072 | Highest quality, expensive |
| Google | models/embedding-001 | 768 | Google AI integration |

⚠️ Common Error: Dimension Mismatch

Symptoms

WARNING - [custom_mongo_vector.py:103] - ⚠️ Error processing document: shapes (768,) and (384,) not aligned

Root Cause

Your query embeddings and stored embeddings have different dimensions:

  • Query: Generated with Model A (e.g., 768 dimensions)
  • Stored: Created with Model B (e.g., 384 dimensions)
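The mismatch surfaces as an error because similarity scoring takes a dot product between the two vectors. A minimal sketch, assuming NumPy, that reproduces the message from the log above:

```python
import numpy as np

query_emb = np.random.rand(768)   # query embedded with a 768D model
stored_emb = np.random.rand(384)  # document embedded with a 384D model

try:
    # Cosine-similarity numerator: only defined when dimensions match
    score = np.dot(query_emb, stored_emb)
except ValueError as err:
    print(err)  # shapes (768,) and (384,) not aligned ...
```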

Why This Happens

  1. You changed embedding models after creating your database
  2. Your database was created with a different embedding provider
  3. Environment configuration doesn't match the original setup

🔧 Solution Strategies

Strategy 1: Match Your Current Database (Recommended)

Step 1: Identify stored embedding dimensions

# Check your MongoDB collection to see stored embedding dimensions
# Look at the 'ingredients_emb' field length
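A minimal sketch of that check in Python: the helper takes one document fetched from the collection (e.g. via `find_one`) and reports its embedding dimension. The `ingredients_emb` field name comes from this project's schema; the dimension-to-model mapping follows the table above.

```python
# Likely source models by dimension (from the table above)
EXPECTED_MODELS = {
    384: "all-MiniLM-L6-v2 (huggingface)",
    768: "nomic-embed-text:v1.5 (ollama)",
    1536: "text-embedding-3-small (openai)",
}

def stored_dims(doc: dict, field: str = "ingredients_emb") -> int:
    """Return the dimension of the embedding stored on a fetched document."""
    emb = doc.get(field)
    if not emb:
        raise ValueError(f"document has no '{field}' field")
    return len(emb)

# Stand-in for: collection.find_one({"ingredients_emb": {"$exists": True}})
doc = {"ingredients_emb": [0.0] * 384}
dims = stored_dims(doc)
print(f"{dims}D, likely {EXPECTED_MODELS.get(dims, 'unknown model')}")
```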

Step 2: Update .env to match

# If stored embeddings are 384-dimensional (common with all-MiniLM-L6-v2)
EMBEDDING_PROVIDER=huggingface
HUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2

# If stored embeddings are 768-dimensional
EMBEDDING_PROVIDER=ollama
OLLAMA_EMBEDDING_MODEL=nomic-embed-text:v1.5
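To double-check that the model you configure actually produces the stored dimension, a small lookup can help (a sketch; the pairs are taken from the dimensions table above):

```python
# Model name -> output dimension, per the table at the top of this guide
MODEL_DIMS = {
    "sentence-transformers/all-MiniLM-L6-v2": 384,
    "sentence-transformers/all-mpnet-base-v2": 768,
    "nomic-embed-text:v1.5": 768,
    "mxbai-embed-large": 1024,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "models/embedding-001": 768,
}

def config_matches(model: str, stored_dims: int) -> bool:
    """True if the configured model emits vectors of the stored dimension."""
    return MODEL_DIMS.get(model) == stored_dims

print(config_matches("sentence-transformers/all-MiniLM-L6-v2", 384))  # True
print(config_matches("nomic-embed-text:v1.5", 384))                   # False
```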

Strategy 2: Regenerate Database with New Model

Step 1: Choose your preferred embedding model

# Example: Use Ollama for local inference
EMBEDDING_PROVIDER=ollama
OLLAMA_EMBEDDING_MODEL=nomic-embed-text:v1.5

Step 2: Enable database refresh

DB_REFRESH_ON_START=true

Step 3: Restart application

uvicorn app:app --reload

Step 4: Disable refresh (Important!)

DB_REFRESH_ON_START=false

πŸ” Debugging Embedding Issues

Check Current Configuration

# View your current embedding setup
grep -E "EMBEDDING_PROVIDER|_EMBEDDING_MODEL" .env

Monitor Embedding Dimensions

The custom MongoDB vector store now logs dimension information:

🔢 Query embedding dimensions: 768
⚠️ Dimension mismatch: query=768D, stored=384D
💡 Consider changing EMBEDDING_PROVIDER to match stored embeddings
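The check that produces those log lines can be sketched as follows (the function name and structure here are illustrative, not the exact code in `custom_mongo_vector.py`):

```python
import logging

logger = logging.getLogger("custom_mongo_vector")

def dims_compatible(query_emb, stored_emb) -> bool:
    """Log embedding dimensions and warn on mismatch, as in the messages above."""
    q, s = len(query_emb), len(stored_emb)
    logger.info("Query embedding dimensions: %d", q)
    if q != s:
        logger.warning("Dimension mismatch: query=%dD, stored=%dD", q, s)
        logger.warning("Consider changing EMBEDDING_PROVIDER to match stored embeddings")
        return False
    return True
```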

Verify Database Content

# Check stored embedding dimensions in MongoDB
doc = collection.find_one({"ingredients_emb": {"$exists": True}})
print(len(doc["ingredients_emb"]))  # array length = embedding dimensions

📋 Environment Configuration Examples

Example 1: HuggingFace (384D) - Most Common

# .env configuration for 384-dimensional embeddings
EMBEDDING_PROVIDER=huggingface
HUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
HUGGINGFACE_API_TOKEN=your_token_here

Example 2: Ollama (768D) - Local Inference

# .env configuration for 768-dimensional embeddings
EMBEDDING_PROVIDER=ollama
OLLAMA_EMBEDDING_MODEL=nomic-embed-text:v1.5
OLLAMA_BASE_URL=http://localhost:11434

Example 3: OpenAI (1536D) - Premium Quality

# .env configuration for 1536-dimensional embeddings
EMBEDDING_PROVIDER=openai
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
OPENAI_API_KEY=your_api_key_here

🚨 Common Pitfalls

1. Mixed Providers

❌ Don't do this:

# Database created with HuggingFace
EMBEDDING_PROVIDER=huggingface  # Original

# Later changed to Ollama without refreshing DB
EMBEDDING_PROVIDER=ollama  # New - causes dimension mismatch!

2. Forgetting to Disable Refresh

❌ Don't forget:

# After refreshing database, always disable refresh
DB_REFRESH_ON_START=false  # SET THIS BACK TO FALSE!

3. Model Name Typos

❌ Watch out for:

# A typo or missing version tag in the model name will cause failures
OLLAMA_EMBEDDING_MODEL=nomic-embed-text:v1.5  ✅
OLLAMA_EMBEDDING_MODEL=nomic-embed-text       ❌ (missing version tag)

📊 Performance Comparison

| Model | Speed | Quality | Dimensions | Local/API | Cost |
|-------|-------|---------|------------|-----------|------|
| all-MiniLM-L6-v2 | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | 384 | Both | Free |
| nomic-embed-text:v1.5 | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | 768 | Local | Free |
| text-embedding-3-small | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | 1536 | API | $$$ |

🔧 Troubleshooting Steps

Step 1: Check Current Setup

# 1. Check your environment configuration
cat .env | grep EMBEDDING

# 2. Check vector store provider
cat .env | grep VECTOR_STORE_PROVIDER

Step 2: Test Embedding Generation

# Test script to check embedding dimensions
from services.vector_store import vector_store_service

# Generate a test embedding
test_embedding = vector_store_service.embeddings.embed_query("test")
print(f"Current embedding dimensions: {len(test_embedding)}")

Step 3: Check Database Content

For MongoDB users:

// MongoDB shell command to check stored embedding dimensions
db.your_collection.findOne({"ingredients_emb": {"$exists": true}})

Step 4: Apply Fix

Choose one of the strategies above based on your needs.

πŸ“ Best Practices

1. Document Your Embedding Model

Keep a record of which embedding model you used:

# Add comments to your .env file
# Database created on 2025-08-27 with all-MiniLM-L6-v2 (384D)
EMBEDDING_PROVIDER=huggingface
HUGGINGFACE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2

2. Version Control Your Configuration

# Commit configuration changes with descriptive messages.
# Never commit real API keys: track a sanitized template (e.g. .env.example) instead.
git add .env.example
git commit -m "Update embedding model to match database (384D)"

3. Test After Changes

# After changing embedding configuration, test a query
curl -X POST "http://localhost:8080/chat" \
  -H "Content-Type: application/json" \
  -d '{"message": "test query"}'

🆘 Quick Reference

Error Pattern Recognition

shapes (768,) and (384,) not aligned  β†’ Query=768D, Stored=384D
shapes (384,) and (768,) not aligned  β†’ Query=384D, Stored=768D
shapes (1536,) and (384,) not aligned β†’ Query=1536D, Stored=384D
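Since the shapes in the message always read query first, stored second, the pattern can be decoded mechanically (a minimal sketch):

```python
import re

def parse_mismatch(msg: str):
    """Extract (query_dims, stored_dims) from a 'shapes not aligned' error."""
    m = re.search(r"shapes \((\d+),\) and \((\d+),\) not aligned", msg)
    return (int(m.group(1)), int(m.group(2))) if m else None

print(parse_mismatch("shapes (768,) and (384,) not aligned"))   # (768, 384)
print(parse_mismatch("shapes (1536,) and (384,) not aligned"))  # (1536, 384)
```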

Quick Fixes

| Stored Dimensions | Set EMBEDDING_PROVIDER to |
|-------------------|---------------------------|
| 384 | huggingface with all-MiniLM-L6-v2 |
| 768 | ollama with nomic-embed-text:v1.5 |
| 1536 | openai with text-embedding-3-small |

📞 Need Help?

If you're still experiencing issues:

  1. Check the application logs for detailed error messages
  2. Verify your embedding model is properly installed/accessible
  3. Ensure your database connection is working
  4. Consider regenerating your vector database if switching models permanently

Remember: Consistency is key - your query embeddings and stored embeddings must use the same model and dimensions!