---
title: Openai Chatbot Mcp
emoji: πŸ”₯
colorFrom: purple
colorTo: green
sdk: gradio
sdk_version: 5.35.0
app_file: app.py
pinned: false
short_description: doc specific rag + document reading + mcp for at will rag
---

working version: 942ca5c4fb6af9f69a11234d481dab1cbfa3083e

OpenAI Chatbot MCP v2 - Direct Embedding Search

Advanced OpenAI chatbot with direct embedding search, version-specific document retrieval, MCP-style tools, and page-level document access.

Features

  • Direct Embedding Search: No AI assistants - just raw document chunks with cosine similarity
  • Multi-Version Support: Separate embedding stores for each product version
  • Automatic Dual Querying: Queries both version-specific and general FAQ stores
  • MCP-Style Tools:
    • Vector Search Tool: AI can search any version at will
    • Multi-Version Search: Compare features across versions
    • Document Reader Tool: Read specific pages or table of contents
  • Transparent Results: Shows similarity scores and exact chunks being used
  • Real-time Streaming: Responses streamed with token counts
  • Cost Tracking: Token usage and cost calculation per API call
  • Multiple Models: Support for GPT-4.1, GPT-4.1 Mini, GPT-4.1 Nano, O4 Mini

Architecture

This implementation replaces OpenAI's Assistants API with direct embedding search, sketched below:

  • Generate embeddings for document chunks using OpenAI's embedding API
  • Store embeddings locally in JSON files
  • Calculate cosine similarity for search
  • Return actual document chunks, not AI interpretations

Quick Start

1. Environment Variables

Set your OpenAI API key:

export OPENAI_API_KEY=your_api_key_here
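
The openai Python client picks up OPENAI_API_KEY from the environment on its own, so no further wiring is needed. A fail-fast check at startup (a small sketch, not taken from app.py) turns a missing key into a clear error instead of a failure deep inside the first API call:

import os

if not os.environ.get("OPENAI_API_KEY"):
    raise SystemExit("OPENAI_API_KEY is not set; export it before running the app.")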

2. Generate Embeddings

Process your PDFs and generate embeddings:

python scripts/generate_embeddings.py

This will:

  • Process all PDFs in /Users/jsv/Work/ataya/concert-master/pdfs
  • Create chunks of ~1000 tokens with 200 token overlap
  • Generate embeddings for each chunk
  • Save to data/embeddings/{version}.json

3. Run the Application

python app.py

Access at: http://localhost:7860
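
For orientation, the Gradio entry point can be as small as the sketch below; the real app.py wires in the backend orchestrator, streaming, and cost tracking, and the answer handler here is only a hypothetical placeholder:

import gradio as gr

def answer(message, history):
    # Placeholder: the real handler calls into backend/chatbot_backend.py.
    return f"You asked: {message}"

gr.ChatInterface(fn=answer, title="OpenAI Chatbot MCP v2").launch(server_port=7860)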

Project Structure

β”œβ”€β”€ backend/
β”‚   β”œβ”€β”€ embeddings.py          # Core embedding search functionality
β”‚   β”œβ”€β”€ chunk_processor.py     # PDF chunking with hierarchical approach
β”‚   β”œβ”€β”€ vector_store_manager.py # Manages embedding search
β”‚   β”œβ”€β”€ chatbot_backend.py     # Main backend orchestrator
β”‚   └── document_reader.py     # Page-level document access
β”œβ”€β”€ scripts/
β”‚   β”œβ”€β”€ generate_embeddings.py # Generate embeddings from PDFs
β”‚   └── test_direct_search.py  # Test the search functionality
β”œβ”€β”€ data/
β”‚   └── embeddings/           # JSON files with embeddings
β”‚       β”œβ”€β”€ harmony_1_8.json
β”‚       β”œβ”€β”€ harmony_1_6.json
β”‚       └── general_faq.json
└── tools/
    β”œβ”€β”€ vector_search_tool.py
    └── document_reader_tool.py

Configuration

  • Chunk size: 1000 tokens (adjustable in chunk_processor.py)
  • Overlap: 200 tokens for context continuity
  • Embedding model: text-embedding-3-small
  • Top K results: 5 (adjustable per query)

Usage Examples

  1. Ask about specific features:
    β€’ "How do I configure SSL certificates?"
    β€’ Returns the retrieved chunks with their sources and similarity scores
  2. Compare versions:
    β€’ "What's new in version 1.8?"
    β€’ Uses the multi-version search tool to compare across versions
  3. Read specific documentation:
    β€’ "Show me the installation guide"
    β€’ Uses the document reader tool for page-level access (sketched below)

Benefits of Direct Search

  1. Transparency: See exact chunks and similarity scores
  2. Control: Tune chunk size, overlap, similarity threshold
  3. Cost: ~$0.00002 per query for embeddings (roughly a 1,000-token query with text-embedding-3-small at its $0.02-per-1M-token rate) vs Assistants API costs
  4. Speed: No assistant creation/deletion overhead
  5. Flexibility: Easy to switch embedding providers

Testing

Test the direct search functionality:

python scripts/test_direct_search.py

Customization

Adding New Documents

  1. Place PDFs in the appropriate directory
  2. Run python scripts/generate_embeddings.py
  3. New embeddings will be created automatically

Adjusting Search Parameters

Edit backend/chunk_processor.py:

  • chunk_size: Default 1000 tokens
  • overlap: Default 200 tokens

Edit backend/embeddings.py:

  • model: Default "text-embedding-3-small"
  • Similarity calculation method

Troubleshooting

  • No results found: Run generate_embeddings.py to create embeddings
  • Low similarity scores: Adjust chunk size or try different queries
  • Missing versions: Check PDF naming convention in chunk_processor.py