
LlamaIndex Integration Guide

Complete guide to the knowledge base indexing and retrieval system powered by LlamaIndex.

Overview

The LlamaIndex integration provides:

  • Knowledge Base Indexing: Foundation for indexing documents and products
  • Vector Similarity Search: Semantic search across indexed content
  • Document Retrieval: Easy retrieval of relevant documents

Components

1. Core Modules

KnowledgeBase (knowledge_base.py)

Low-level interface for index management.

from src.core import KnowledgeBase, IndexConfig

# Initialize with custom config
config = IndexConfig(
    embedding_model="text-embedding-3-small",
    chunk_size=1024,
    use_pinecone=False,
)

kb = KnowledgeBase(config)

# Index documents
kb.index_documents("./docs")

# Search
results = kb.search("your query", top_k=5)

# Query with QA
response = kb.query("What is the main feature?")

DocumentLoader (document_loader.py)

Load documents from various sources.

from src.core import DocumentLoader

# Load from directory
docs = DocumentLoader.load_markdown_documents("./docs")
docs += DocumentLoader.load_text_documents("./docs")

# Load products
products = [
    {
        "id": "prod_001",
        "name": "Product Name",
        "description": "Description",
        "price": "$99",
        "category": "Category",
        "features": ["Feature 1", "Feature 2"],
    }
]
product_docs = DocumentLoader.create_product_documents(products)

# Load from URLs
urls = ["https://example.com/page1", "https://example.com/page2"]
url_docs = DocumentLoader.load_documents_from_urls(urls)

# Load all at once
all_docs = DocumentLoader.load_all_documents(
    docs_dir="./docs",
    products=products,
    urls=urls,
)

VectorSearchEngine (vector_search.py)

High-level search interface with advanced features.

from src.core import VectorSearchEngine

search_engine = VectorSearchEngine(kb)

# Basic search
results = search_engine.search("query", top_k=5)

# Product search only
products = search_engine.search_products("laptop", top_k=10)

# Documentation search only
docs = search_engine.search_documentation("how to setup", top_k=5)

# Semantic search with threshold
results = search_engine.semantic_search(
    "installation guide",
    top_k=5,
    similarity_threshold=0.5,
)

# Hierarchical search across types
results = search_engine.hierarchical_search("e-commerce")
# Returns: {"products": [...], "documentation": [...]}

# Weighted combined search
results = search_engine.combined_search(
    "shopping platform",
    weights={"product": 0.6, "documentation": 0.4},
)

# Contextual search
results = search_engine.contextual_search(
    "laptop",
    context={"category": "electronics", "price_range": "$1000-2000"},
    top_k=5,
)

# Get recommendations
recs = search_engine.get_recommendations("laptop under $1000", limit=5)
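To illustrate what the `similarity_threshold` parameter of `semantic_search` does conceptually, the stand-alone sketch below filters results by score. The `SearchResult` dataclass here is an assumption for illustration, modeled on the `.score` and `.content` attributes used in the examples later in this guide; it is not the library's actual result type.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SearchResult:
    # Assumed shape: the usage examples in this guide access `.content` and `.score`.
    content: str
    score: float

def filter_by_threshold(results: List[SearchResult], threshold: float = 0.5) -> List[SearchResult]:
    """Keep only results whose similarity score meets the threshold."""
    return [r for r in results if r.score >= threshold]
```

A higher threshold trades recall for precision: fewer, more relevant hits.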

2. High-Level Integration

EcoMCPKnowledgeBase (llama_integration.py)

Complete integration layer for the EcoMCP application.

from src.core import EcoMCPKnowledgeBase, initialize_knowledge_base

# Initialize
kb = EcoMCPKnowledgeBase()

# Auto-initialize with documents
kb.initialize("./docs")

# Add products
kb.add_products(products)

# Add URLs
kb.add_urls(["https://example.com"])

# Search
results = kb.search("query", top_k=5)

# Search specific types
products = kb.search_products("laptop", top_k=10)
docs = kb.search_documentation("deploy", top_k=5)

# Get recommendations
recs = kb.get_recommendations("gaming laptop", limit=5)

# Natural language query
answer = kb.query("What is the platform about?")

# Save and load
kb.save("./kb_index")
kb.load("./kb_index")

# Get stats
stats = kb.get_stats()

3. Global Singleton Pattern

from src.core import initialize_knowledge_base, get_knowledge_base

# Initialize globally
kb = initialize_knowledge_base("./docs")

# Access from anywhere
kb = get_knowledge_base()
results = kb.search("query")

Configuration

IndexConfig Options

config = IndexConfig(
    # Embedding model (OpenAI)
    embedding_model="text-embedding-3-small",  # or "text-embedding-3-large"
    
    # Chunking settings
    chunk_size=1024,           # Size of text chunks
    chunk_overlap=20,          # Overlap between chunks
    
    # Vector store backend
    use_pinecone=False,        # True to use Pinecone
    pinecone_index_name="ecomcp-knowledge",
    pinecone_dimension=1536,
)

Installation

Add to requirements.txt:

llama-index>=0.9.0
llama-index-embeddings-openai>=0.1.0
llama-index-vector-stores-pinecone>=0.1.0

Environment variables:

OPENAI_API_KEY=sk-...
PINECONE_API_KEY=...  # Optional, only if using Pinecone
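A small startup check can surface missing keys before any indexing begins. `check_required_env` below is a hypothetical helper sketched for this guide, not part of the library:

```python
import os
from typing import List

def check_required_env(use_pinecone: bool = False) -> List[str]:
    """Return the names of required environment variables that are missing or empty."""
    required = ["OPENAI_API_KEY"]
    if use_pinecone:
        # PINECONE_API_KEY is only needed when the Pinecone backend is enabled.
        required.append("PINECONE_API_KEY")
    return [name for name in required if not os.environ.get(name)]
```

Call it before `initialize_knowledge_base()` and fail fast with a clear error if the returned list is non-empty.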

Usage Examples

Example 1: Basic Document Indexing

from src.core import EcoMCPKnowledgeBase

kb = EcoMCPKnowledgeBase()
kb.initialize("./docs")

# Search
results = kb.search("deployment guide", top_k=3)
for result in results:
    print(f"Score: {result.score:.2f}")
    print(f"Content: {result.content[:200]}")

Example 2: Product Recommendation

from src.core import EcoMCPKnowledgeBase

kb = EcoMCPKnowledgeBase()

products = [
    {
        "id": "1",
        "name": "Wireless Headphones",
        "description": "Noise-canceling",
        "price": "$299",
        "category": "Electronics",
        "features": ["ANC", "30h Battery"],
        "tags": ["audio", "wireless"]
    },
    # ... more products
]

kb.add_products(products)

# Get recommendations
recs = kb.get_recommendations("best headphones for music", limit=3)
for rec in recs:
    print(f"Rank: {rec['rank']}")
    print(f"Confidence: {rec['confidence']:.2f}")

Example 3: Semantic Search with Filtering

from src.core import VectorSearchEngine

search = VectorSearchEngine(kb)

# Search with context
results = search.contextual_search(
    "laptop computer",
    context={
        "category": "computers",
        "price_range": "$500-1000",
        "processor": "Intel"
    },
    top_k=5
)

Example 4: Knowledge Base Persistence

from src.core import EcoMCPKnowledgeBase

# Create and save
kb1 = EcoMCPKnowledgeBase()
kb1.initialize("./docs")
kb1.save("./kb_backup")

# Load later
kb2 = EcoMCPKnowledgeBase()
kb2.load("./kb_backup")

# Use immediately
results = kb2.search("something")
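Before calling `load()`, it can help to verify that the persist directory actually contains an index. `index_exists` below is a hypothetical guard sketched for this guide, not a library function:

```python
from pathlib import Path

def index_exists(persist_dir: str) -> bool:
    """Heuristic check: the persist directory exists and is non-empty."""
    p = Path(persist_dir)
    return p.is_dir() and any(p.iterdir())
```

Use it to decide between loading a saved index and re-indexing from scratch.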

Integration with Server

In Your Server/MCP Implementation

from src.core import initialize_knowledge_base, get_knowledge_base

# During startup
def initialize_app():
    kb = initialize_knowledge_base("./docs")
    kb.add_products(get_all_products())  # Your product source

# In your handlers
def search_handler(query: str):
    kb = get_knowledge_base()
    results = kb.search(query)
    return results

def recommend_handler(user_query: str):
    kb = get_knowledge_base()
    recommendations = kb.get_recommendations(user_query)
    return recommendations

Advanced Features

Custom Metadata

from llama_index.core.schema import Document

doc = Document(
    text="Content here",
    metadata={
        "source": "custom_source",
        "author": "John Doe",
        "date": "2024-01-01",
        "category": "tutorial",
    }
)
kb.kb.add_documents([doc])

Pinecone Integration

config = IndexConfig(use_pinecone=True)
kb = EcoMCPKnowledgeBase(config=config)

# Automatically creates/uses Pinecone index
kb.initialize("./docs")

Custom Query Engine

# Low-level query with custom settings
query_engine = kb.kb.index.as_query_engine(
    similarity_top_k=10,
    response_mode="compact"  # or "tree_summarize", "refine"
)
response = query_engine.query("Your question")

Performance Tips

  1. Chunk Size: Use larger chunks (e.g., 2048) for long, uniform documents and smaller chunks (e.g., 512) for short or varied content
  2. Vector Store: Use Pinecone for production deployments
  3. Batch Processing: Index documents in batches for large datasets
  4. Caching: Load from disk instead of re-indexing frequently
  5. Top-K: Start with top_k=5, adjust based on relevance
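Tip 3 (batch processing) can be sketched as follows. `batched` is a hypothetical helper written for this guide, intended to be combined with the `kb.kb.add_documents()` call shown earlier:

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

def batched(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive batches of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])

# Hypothetical usage with the API shown above:
# for batch in batched(all_docs, batch_size=100):
#     kb.kb.add_documents(batch)
```

Batching keeps peak memory bounded and lets you checkpoint (e.g., `kb.save(...)`) between batches.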

Troubleshooting

No OpenAI API Key

Error: OPENAI_API_KEY not set
Solution: Run export OPENAI_API_KEY=sk-... in your environment

Pinecone Connection Failed

Error: Pinecone connection failed
Solution: Verify PINECONE_API_KEY and network connectivity.
The system automatically falls back to in-memory indexing.

Out of Memory with Large Datasets

Solution: 
- Reduce chunk_size in IndexConfig
- Process documents in batches
- Use Pinecone backend (scales to millions of documents)

Testing

Run tests:

pytest tests/test_llama_integration.py -v

API Reference

See src/core/ for detailed API documentation in docstrings.

Files Structure

src/core/
β”œβ”€β”€ __init__.py                # Package exports
β”œβ”€β”€ knowledge_base.py          # Core KnowledgeBase class
β”œβ”€β”€ document_loader.py         # Document loading utilities
β”œβ”€β”€ vector_search.py           # VectorSearchEngine with advanced features
β”œβ”€β”€ llama_integration.py       # EcoMCP integration wrapper
└── examples.py                # Usage examples

Related Documentation