---
title: Openai Chatbot Mcp
emoji: 🔥
colorFrom: purple
colorTo: green
sdk: gradio
sdk_version: 5.35.0
app_file: app.py
pinned: false
short_description: doc specific rag + document reading + mcp for at will rag
working version: 942ca5c4fb6af9f69a11234d481dab1cbfa3083e
---
# OpenAI Chatbot MCP v2 - Direct Embedding Search

Advanced OpenAI chatbot with direct embedding search, version-specific document retrieval, MCP-style tools, and page-level document access.
## Features

- Direct Embedding Search: no AI assistants, just raw document chunks ranked by cosine similarity
- Multi-Version Support: separate embedding stores for each product version
- Automatic Dual Querying: queries both the version-specific store and the general FAQ store
- MCP-Style Tools:
  - Vector Search Tool: the AI can search any version at will
  - Multi-Version Search: compare features across versions
  - Document Reader Tool: read specific pages or the table of contents
- Transparent Results: shows similarity scores and the exact chunks being used
- Real-time Streaming: responses streamed with token counts
- Cost Tracking: token usage and cost calculated per API call
- Multiple Models: support for GPT-4.1, GPT-4.1 Mini, GPT-4.1 Nano, and O4 Mini
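The dual-querying behaviour can be sketched as a simple merge of per-store results. The `merge_results` helper and the hard-coded hit lists below are illustrative only, not the app's actual API; real scores would come from a similarity search over each store.

```python
# Sketch of "automatic dual querying": results from the version-specific
# store and the general FAQ store are merged and re-ranked by score.
# The (score, text) tuples below stand in for real per-store search results.
def merge_results(version_hits, faq_hits, top_k=5):
    """Combine two result lists and keep the top_k highest-scoring chunks."""
    combined = version_hits + faq_hits
    combined.sort(key=lambda hit: hit[0], reverse=True)
    return combined[:top_k]

version_hits = [(0.91, "harmony_1_8: SSL setup"), (0.55, "harmony_1_8: logging")]
faq_hits = [(0.78, "general_faq: SSL basics")]
merged = merge_results(version_hits, faq_hits, top_k=2)
```

Because both stores contribute to one ranked list, a strong FAQ chunk can outrank a weak version-specific one.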
## Architecture

This implementation uses direct embedding search instead of OpenAI's Assistants API:

1. Generate embeddings for document chunks using OpenAI's embedding API
2. Store embeddings locally in JSON files
3. Calculate cosine similarity for search
4. Return actual document chunks, not AI interpretations
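The search step can be sketched in a few lines. `cosine_similarity` and `search` here are illustrative stand-ins for the logic in `backend/embeddings.py`, and the toy two-dimensional store only mimics the shape of a `data/embeddings/{version}.json` file:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(query_embedding, store, top_k=5):
    """Rank stored chunks by similarity to the query embedding."""
    scored = [
        (cosine_similarity(query_embedding, entry["embedding"]), entry["text"])
        for entry in store
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Toy store; real entries carry 1536-dimensional vectors.
store = [
    {"text": "SSL setup", "embedding": [1.0, 0.0]},
    {"text": "Install guide", "embedding": [0.0, 1.0]},
    {"text": "SSL renewal", "embedding": [0.9, 0.1]},
]
results = search([1.0, 0.0], store, top_k=2)
```

The caller gets back `(score, chunk)` pairs, which is what lets the UI display similarity scores next to each retrieved chunk.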
## Quick Start

### 1. Environment Variables

Set your OpenAI API key:

```bash
export OPENAI_API_KEY=your_api_key_here
```

### 2. Generate Embeddings

Process your PDFs and generate embeddings:

```bash
python scripts/generate_embeddings.py
```
This will:

- Process all PDFs in `/Users/jsv/Work/ataya/concert-master/pdfs`
- Create chunks of ~1000 tokens with 200-token overlap
- Generate embeddings for each chunk
- Save results to `data/embeddings/{version}.json`
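A plausible shape for those JSON files is a list of chunk records, each pairing text with its vector. The schema below is a hypothetical illustration; the real layout is defined in `scripts/generate_embeddings.py` and may differ.

```python
import json

# Hypothetical record layout for a data/embeddings/{version}.json file.
store = [
    {
        "source": "harmony_1_8.pdf",
        "page": 12,
        "text": "To configure SSL certificates...",
        "embedding": [0.012, -0.034],  # truncated; real vectors have 1536 dims
    }
]

# Round-trip through JSON, as the app would when saving and loading a store.
serialized = json.dumps(store)
loaded = json.loads(serialized)
```

Keeping `source` and `page` on each record is what makes page-level attribution in the UI possible.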
### 3. Run the Application

```bash
python app.py
```

Access the UI at http://localhost:7860.
## Project Structure

```
├── backend/
│   ├── embeddings.py            # Core embedding search functionality
│   ├── chunk_processor.py       # PDF chunking with hierarchical approach
│   ├── vector_store_manager.py  # Manages embedding search
│   ├── chatbot_backend.py       # Main backend orchestrator
│   └── document_reader.py       # Page-level document access
├── scripts/
│   ├── generate_embeddings.py   # Generate embeddings from PDFs
│   └── test_direct_search.py    # Test the search functionality
├── data/
│   └── embeddings/              # JSON files with embeddings
│       ├── harmony_1_8.json
│       ├── harmony_1_6.json
│       └── general_faq.json
└── tools/
    ├── vector_search_tool.py
    └── document_reader_tool.py
```
## Configuration

- Chunk size: 1000 tokens (adjustable in `chunk_processor.py`)
- Overlap: 200 tokens for context continuity
- Embedding model: `text-embedding-3-small`
- Top-K results: 5 (adjustable per query)
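The chunk-size/overlap interaction can be sketched as follows. `chunk_tokens` is a simplified stand-in for the logic in `backend/chunk_processor.py`: each chunk starts `chunk_size - overlap` tokens after the previous one, so neighbouring chunks share 200 tokens of context.

```python
def chunk_tokens(tokens, chunk_size=1000, overlap=200):
    """Split a token list into overlapping chunks."""
    step = chunk_size - overlap  # each new chunk advances by 800 tokens
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end of the document
    return chunks

# A 2500-token document yields three overlapping chunks.
chunks = chunk_tokens(list(range(2500)), chunk_size=1000, overlap=200)
```

The overlap means a sentence split at a chunk boundary still appears whole in at least one chunk, at the cost of embedding some tokens twice.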
## Usage Examples

Ask about specific features:

- "How do I configure SSL certificates?"
- Shows the query and the retrieved chunks with their sources and similarity scores

Compare versions:

- "What's new in version 1.8?"
- Uses the multi-version search tool to compare releases

Read specific documentation:

- "Show me the installation guide"
- Uses the document reader tool for page-level access
## Benefits of Direct Search

- Transparency: see the exact chunks and similarity scores
- Control: tune chunk size, overlap, and similarity threshold
- Cost: ~$0.00002 per query (embedding the query) versus Assistants API costs
- Speed: no assistant creation/deletion overhead
- Flexibility: easy to switch embedding providers
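The ~$0.00002 figure checks out as back-of-envelope arithmetic, assuming `text-embedding-3-small` at $0.02 per 1M tokens (its launch price; verify current pricing before relying on this):

```python
# Assumed launch price for text-embedding-3-small, in dollars per 1M tokens.
PRICE_PER_MILLION_TOKENS = 0.02

def embedding_cost(num_tokens):
    """Dollar cost to embed num_tokens tokens at the assumed rate."""
    return num_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

# A fairly long 1000-token query costs about $0.00002.
cost = embedding_cost(1000)
```

Typical queries are far shorter than 1000 tokens, so per-query embedding cost is effectively negligible next to the chat-model completion cost.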
## Testing

Test the direct search functionality:

```bash
python scripts/test_direct_search.py
```
## Customization

### Adding New Documents

1. Place PDFs in the appropriate directory
2. Run `python scripts/generate_embeddings.py`
3. New embeddings will be created automatically
### Adjusting Search Parameters

Edit `backend/chunk_processor.py`:

- `chunk_size`: default 1000 tokens
- `overlap`: default 200 tokens

Edit `backend/embeddings.py`:

- `model`: default `"text-embedding-3-small"`
- Similarity calculation method
## Troubleshooting

- No results found: run `generate_embeddings.py` to create the embeddings
- Low similarity scores: adjust the chunk size or try different queries
- Missing versions: check the PDF naming convention in `chunk_processor.py`