---
title: Openai Chatbot Mcp
emoji: 🔥
colorFrom: purple
colorTo: green
sdk: gradio
sdk_version: 5.35.0
app_file: app.py
pinned: false
short_description: doc specific rag + document reading + mcp for at will rag
working version: 942ca5c4fb6af9f69a11234d481dab1cbfa3083e
---
# OpenAI Chatbot MCP v2 - Direct Embedding Search

Advanced OpenAI chatbot with direct embedding search, version-specific document retrieval, MCP-style tools, and page-level document access.
## Features

- Direct Embedding Search: no AI assistants, just raw document chunks ranked by cosine similarity
- Multi-Version Support: separate embedding stores for each product version
- Automatic Dual Querying: queries both the version-specific store and the general FAQ store
- MCP-Style Tools:
  - Vector Search Tool: the AI can search any version at will
  - Multi-Version Search: compare features across versions
  - Document Reader Tool: read specific pages or the table of contents
- Transparent Results: shows similarity scores and the exact chunks being used
- Real-time Streaming: responses streamed with token counts
- Cost Tracking: token usage and cost calculated per API call
- Multiple Models: support for GPT-4.1, GPT-4.1 Mini, GPT-4.1 Nano, and O4 Mini
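The dual-querying behaviour can be sketched as a simple merge of per-store results. The `merge_results` helper and the hard-coded hit lists below are illustrative only, not the app's actual API; real scores would come from a similarity search over each store.

```python
# Sketch of "automatic dual querying": results from the version-specific
# store and the general FAQ store are merged and re-ranked by score.
# The (score, text) tuples below stand in for real per-store search results.
def merge_results(version_hits, faq_hits, top_k=5):
    """Combine two result lists and keep the top_k highest-scoring chunks."""
    combined = version_hits + faq_hits
    combined.sort(key=lambda hit: hit[0], reverse=True)
    return combined[:top_k]

version_hits = [(0.91, "harmony_1_8: SSL setup"), (0.55, "harmony_1_8: logging")]
faq_hits = [(0.78, "general_faq: SSL basics")]
merged = merge_results(version_hits, faq_hits, top_k=2)
```

Because both stores contribute to one ranked list, a strong FAQ chunk can outrank a weak version-specific one.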
## Architecture

This implementation uses direct embedding search instead of OpenAI's Assistants API:

1. Generate embeddings for document chunks using OpenAI's embedding API
2. Store embeddings locally in JSON files
3. Calculate cosine similarity for search
4. Return actual document chunks, not AI interpretations
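The search step can be sketched in a few lines. `cosine_similarity` and `search` here are illustrative stand-ins for the logic in `backend/embeddings.py`, and the toy two-dimensional store only mimics the shape of a `data/embeddings/{version}.json` file:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(query_embedding, store, top_k=5):
    """Rank stored chunks by similarity to the query embedding."""
    scored = [
        (cosine_similarity(query_embedding, entry["embedding"]), entry["text"])
        for entry in store
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Toy store; real entries carry 1536-dimensional vectors.
store = [
    {"text": "SSL setup", "embedding": [1.0, 0.0]},
    {"text": "Install guide", "embedding": [0.0, 1.0]},
    {"text": "SSL renewal", "embedding": [0.9, 0.1]},
]
results = search([1.0, 0.0], store, top_k=2)
```

The caller gets back `(score, chunk)` pairs, which is what lets the UI display similarity scores next to each retrieved chunk.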
## Quick Start

### 1. Environment Variables

Set your OpenAI API key:

```bash
export OPENAI_API_KEY=your_api_key_here
```

### 2. Generate Embeddings

Process your PDFs and generate embeddings:

```bash
python scripts/generate_embeddings.py
```
This will:

- Process all PDFs in `/Users/jsv/Work/ataya/concert-master/pdfs`
- Create chunks of ~1000 tokens with 200-token overlap
- Generate embeddings for each chunk
- Save results to `data/embeddings/{version}.json`
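A plausible shape for those JSON files is a list of chunk records, each pairing text with its vector. The schema below is a hypothetical illustration; the real layout is defined in `scripts/generate_embeddings.py` and may differ.

```python
import json

# Hypothetical record layout for a data/embeddings/{version}.json file.
store = [
    {
        "source": "harmony_1_8.pdf",
        "page": 12,
        "text": "To configure SSL certificates...",
        "embedding": [0.012, -0.034],  # truncated; real vectors have 1536 dims
    }
]

# Round-trip through JSON, as the app would when saving and loading a store.
serialized = json.dumps(store)
loaded = json.loads(serialized)
```

Keeping `source` and `page` on each record is what makes page-level attribution in the UI possible.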
### 3. Run the Application

```bash
python app.py
```

Access the UI at http://localhost:7860.
## Project Structure

```
├── backend/
│   ├── embeddings.py            # Core embedding search functionality
│   ├── chunk_processor.py       # PDF chunking with hierarchical approach
│   ├── vector_store_manager.py  # Manages embedding search
│   ├── chatbot_backend.py       # Main backend orchestrator
│   └── document_reader.py       # Page-level document access
├── scripts/
│   ├── generate_embeddings.py   # Generate embeddings from PDFs
│   └── test_direct_search.py    # Test the search functionality
├── data/
│   └── embeddings/              # JSON files with embeddings
│       ├── harmony_1_8.json
│       ├── harmony_1_6.json
│       └── general_faq.json
└── tools/
    ├── vector_search_tool.py
    └── document_reader_tool.py
```
## Configuration

- Chunk size: 1000 tokens (adjustable in `chunk_processor.py`)
- Overlap: 200 tokens for context continuity
- Embedding model: `text-embedding-3-small`
- Top-K results: 5 (adjustable per query)
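The chunk-size/overlap interaction can be sketched as follows. `chunk_tokens` is a simplified stand-in for the logic in `backend/chunk_processor.py`: each chunk starts `chunk_size - overlap` tokens after the previous one, so neighbouring chunks share 200 tokens of context.

```python
def chunk_tokens(tokens, chunk_size=1000, overlap=200):
    """Split a token list into overlapping chunks."""
    step = chunk_size - overlap  # each new chunk advances by 800 tokens
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end of the document
    return chunks

# A 2500-token document yields three overlapping chunks.
chunks = chunk_tokens(list(range(2500)), chunk_size=1000, overlap=200)
```

The overlap means a sentence split at a chunk boundary still appears whole in at least one chunk, at the cost of embedding some tokens twice.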
## Usage Examples

Ask about specific features:

- "How do I configure SSL certificates?"
- Shows the query and the retrieved chunks with their sources and similarity scores

Compare versions:

- "What's new in version 1.8?"
- Uses the multi-version search tool to compare releases

Read specific documentation:

- "Show me the installation guide"
- Uses the document reader tool for page-level access
## Benefits of Direct Search

- Transparency: see the exact chunks and similarity scores
- Control: tune chunk size, overlap, and similarity threshold
- Cost: ~$0.00002 per query (embedding the query) versus Assistants API costs
- Speed: no assistant creation/deletion overhead
- Flexibility: easy to switch embedding providers
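The ~$0.00002 figure checks out as back-of-envelope arithmetic, assuming `text-embedding-3-small` at $0.02 per 1M tokens (its launch price; verify current pricing before relying on this):

```python
# Assumed launch price for text-embedding-3-small, in dollars per 1M tokens.
PRICE_PER_MILLION_TOKENS = 0.02

def embedding_cost(num_tokens):
    """Dollar cost to embed num_tokens tokens at the assumed rate."""
    return num_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

# A fairly long 1000-token query costs about $0.00002.
cost = embedding_cost(1000)
```

Typical queries are far shorter than 1000 tokens, so per-query embedding cost is effectively negligible next to the chat-model completion cost.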
## Testing

Test the direct search functionality:

```bash
python scripts/test_direct_search.py
```
## Customization

### Adding New Documents

1. Place PDFs in the appropriate directory
2. Run `python scripts/generate_embeddings.py`
3. New embeddings will be created automatically
### Adjusting Search Parameters

Edit `backend/chunk_processor.py`:

- `chunk_size`: default 1000 tokens
- `overlap`: default 200 tokens

Edit `backend/embeddings.py`:

- `model`: default `"text-embedding-3-small"`
- Similarity calculation method
## Troubleshooting

- No results found: run `generate_embeddings.py` to create the embeddings
- Low similarity scores: adjust the chunk size or try different queries
- Missing versions: check the PDF naming convention in `chunk_processor.py`