| title: ChatKit Backend | |
| emoji: 🤖 | |
| colorFrom: blue | |
| colorTo: indigo | |
| sdk: docker | |
| pinned: false | |
| app_port: 7860 | |
| # ChatKit Python Backend | |
| This FastAPI service implements an advanced multi-agent support system using the **OpenAI Agents SDK** and **OpenAI ChatKit**. It provides specialized support agents for different platforms (Kimi, DeepSeek, Google) with integrated RAG and persistent memory. | |
| ## 🏗️ Architecture Overview | |
| The system is built on a modular, agentic architecture designed for high-performance customer support. | |
| ### Component Diagram | |
| ```mermaid | |
| graph TD | |
| Client[ChatKit Frontend] <--> API[FastAPI Orchestrator] | |
| subgraph "Agent Layer" | |
| API <--> KimiAgent[Kimi Agent] | |
| API <--> DeepSeekAgent[DeepSeek Agent] | |
| API <--> GoogleAgent[Google Agent] | |
| API <--> Summ[Summarizer Agent] | |
| end | |
| subgraph "Tools & Intelligence" | |
| KimiAgent & DeepSeekAgent & GoogleAgent --> RAG[LlamaIndex RAG Tool] | |
| KimiAgent & DeepSeekAgent & GoogleAgent --> Facts[Fact Recording Tool] | |
| RAG --> Chroma[ChromaDB Vector Store] | |
| RAG --> WebsiteData[Website Knowledge Base] | |
| end | |
| subgraph "Persistence" | |
| API --> SQLiteThreads[SQLite: Chat Threads] | |
| API --> SQLiteState[SQLite: User State] | |
| end | |
| subgraph "External Providers" | |
| KimiAgent --> Groq[Groq API] | |
| DeepSeekAgent --> OpenRouter[OpenRouter API] | |
| GoogleAgent --> OR_Gemini[OpenRouter/Gemini] | |
| end | |
| ``` | |
| ### Key Components | |
| 1. **FastAPI Orchestrator**: Handles request routing, SSE streaming, and handoffs between Agents and ChatKit. | |
| 2. **OpenAI Agents SDK**: Provides the logic for agent loops, tool calling, and handoffs. | |
| 3. **ChatKit Server**: Manages the ChatKit protocol, ensuring real-time UI updates (widgets, thoughts, tool results). | |
| 4. **Vector RAG Engine**: Uses LlamaIndex and ChromaDB to query school-specific services from scraped website data. | |
| 5. **Multi-Modal Input**: Uses Groq-hosted Whisper to transcribe audio messages. | |
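The SSE streaming handled by the orchestrator can be illustrated with a minimal, standard-library sketch of how a Server-Sent Events frame is built. The function names, the `token` and `end` event names, and the payload shape are assumptions made for illustration; they are not the ChatKit wire format, which the ChatKit server produces itself.

```python
import json
from typing import Any, Iterable, Iterator, Optional


def format_sse(data: Any, event: Optional[str] = None) -> str:
    """Serialize one payload as a Server-Sent Events frame."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps escapes newlines, so the payload fits in one "data:" field.
    lines.append(f"data: {json.dumps(data)}")
    # A blank line terminates the frame.
    return "\n".join(lines) + "\n\n"


def stream_frames(chunks: Iterable[str]) -> Iterator[str]:
    """Yield one frame per text chunk, then a closing frame (hypothetical events)."""
    for chunk in chunks:
        yield format_sse({"delta": chunk}, event="token")
    yield format_sse({"done": True}, event="end")
```

In a FastAPI app, a generator like `stream_frames` would typically be wrapped in a `StreamingResponse` with the `text/event-stream` media type.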
| --- | |
| ## 🤖 Model Configuration | |
| The system utilizes a specialized mix of state-of-the-art models to balance performance, cost, and reasoning capabilities. | |
| | Agent / Service | Model Name | Primary Provider | API Class | | |
| | :--- | :--- | :--- | :--- | | |
| | **Kimi Agent** | `moonshotai/kimi-k2-instruct-0905` | **Groq** | `OpenAIResponsesModel` | | |
| | **DeepSeek Agent** | `deepseek/deepseek-chat` | **OpenRouter** | `OpenAIChatCompletionsModel` | | |
| | **Google Agent** | `google/gemini-2.5-flash` | **OpenRouter** | `OpenAIChatCompletionsModel` | | |
| | **Summarizer Agent** | `meta-llama/llama-4-scout-17b-16e-instruct` | **Groq** | `OpenAIResponsesModel` | | |
| | **Audio Transcription** | `whisper-large-v3-turbo` | **Groq** | Native Whisper transcription endpoint | |
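One way to keep the table above in sync with the code is a small registry keyed by agent name. This is only a sketch: `ModelSpec`, `MODEL_REGISTRY`, `resolve_model`, and the short agent keys are hypothetical names, and the API class is stored as a plain string rather than imported from the Agents SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    model: str      # model identifier as listed in the table
    provider: str   # "groq" or "openrouter"
    api_class: str  # name of the Agents SDK model class to wrap it with


# Values copied from the model table; the dictionary keys are illustrative.
MODEL_REGISTRY = {
    "kimi": ModelSpec("moonshotai/kimi-k2-instruct-0905", "groq", "OpenAIResponsesModel"),
    "deepseek": ModelSpec("deepseek/deepseek-chat", "openrouter", "OpenAIChatCompletionsModel"),
    "google": ModelSpec("google/gemini-2.5-flash", "openrouter", "OpenAIChatCompletionsModel"),
    "summarizer": ModelSpec("meta-llama/llama-4-scout-17b-16e-instruct", "groq", "OpenAIResponsesModel"),
}


def resolve_model(agent: str) -> ModelSpec:
    """Look up the model settings for an agent, failing loudly on unknown names."""
    try:
        return MODEL_REGISTRY[agent]
    except KeyError:
        known = ", ".join(sorted(MODEL_REGISTRY))
        raise ValueError(f"Unknown agent {agent!r}; expected one of: {known}") from None
```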
| --- | |
| ## Assumptions & Limitations | |
| ### Assumptions | |
| - **API Availability**: The system assumes stable connections to Groq and OpenRouter. | |
| - **Static Knowledge**: The RAG system assumes the vector database is pre-built from the school's website data (accessible in the `website/` folder). | |
| - **Single Instance**: Currently architected for single-instance deployment (SQLite persistence). | |
| ### Limitations | |
| - **Responses API Compatibility**: Of the providers used here, only Groq natively supports the full `OpenAIResponsesModel` required for advanced thread state. OpenRouter models use `OpenAIChatCompletionsModel` with a custom persistence bridge. | |
| - **Concurrency**: SQLite is configured in WAL mode, but extremely high concurrent traffic would require a transition to PostgreSQL. | |
| - **Context Limits**: The Summarizer Agent mitigates context bloat, but very long multi-turn sessions are still bounded by the provider's context window (e.g., on the order of 128k tokens for DeepSeek; Gemini 2.5 Flash allows considerably more). | |
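The WAL configuration mentioned under Concurrency can be sketched with Python's built-in `sqlite3` module. The helper name `open_wal_connection` and the 5-second timeout are assumptions for illustration; the project's actual connection setup may differ.

```python
import os
import sqlite3
import tempfile


def open_wal_connection(path: str) -> sqlite3.Connection:
    """Open a SQLite database and switch it to write-ahead logging."""
    conn = sqlite3.connect(path, timeout=5.0)
    # The pragma returns the journal mode that is actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"Could not enable WAL mode (got {mode!r})")
    # Wait up to 5 seconds for a lock instead of failing immediately.
    conn.execute("PRAGMA busy_timeout=5000")
    return conn


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        conn = open_wal_connection(os.path.join(tmp, "example.db"))
        print(conn.execute("PRAGMA journal_mode").fetchone()[0])
        conn.close()
```

WAL lets readers proceed while a write is in progress, but SQLite still allows only one writer at a time, which is why sustained high write concurrency points toward PostgreSQL.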
| --- | |
| ## 💰 Cost Estimation (Rough Calculation) | |
| *Calculations based on 1,000 user queries with an average of 2,000 tokens per query (1,500 input / 500 output).* | |
| | Model (Provider) | Input Cost (per 1,000 queries) | Output Cost (per 1,000 queries) | **Estimated Total / 1,000 Queries** | |
| | :--- | :--- | :--- | :--- | | |
| | **DeepSeek Chat** (OpenRouter) | $0.021 | $0.035 | **$0.056** | | |
| | **Gemini 2.5 Flash** (OpenRouter) | $0.150 | $0.200 | **$0.350** | | |
| | **Kimi** (Groq) | Free Tier / $0 | Free Tier / $0 | **$0.000** | | |
| *Note: Costs are based on OpenRouter pricing as of Jan 2025 and on Groq's free tier. Among the paid options, DeepSeek is the most cost-effective for reasoning-intensive tasks.* | |
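The arithmetic behind such an estimate can be written as a small function. This is a sketch that assumes prices are quoted per million tokens, as providers commonly do; the function name and the prices in the usage note are placeholders, not real quotes from OpenRouter or Groq.

```python
def estimate_cost(
    queries: int,
    input_tokens_per_query: int,
    output_tokens_per_query: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Return the estimated total cost in dollars for a batch of queries."""
    # Total tokens across all queries, expressed in millions.
    input_millions = queries * input_tokens_per_query / 1_000_000
    output_millions = queries * output_tokens_per_query / 1_000_000
    return (
        input_millions * input_price_per_million
        + output_millions * output_price_per_million
    )
```

With the workload above (1,000 queries, 1,500 input and 500 output tokens each) and placeholder prices of $1.00 and $2.00 per million tokens, `estimate_cost(1000, 1500, 500, 1.0, 2.0)` gives 1.5 + 1.0 = 2.5 dollars. Substituting each provider's current published prices yields the per-model totals.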
| --- | |
| ## Getting Started | |
| To enable the real-time assistant, install both the ChatKit Python package and the OpenAI SDK, then set the `OPENAI_API_KEY` environment variable. | |
| ```bash | |
| uv sync | |
| export OPENAI_API_KEY=sk-proj-... | |
| uv run uvicorn app.main:app --reload | |
| ``` | |
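A missing key is easier to diagnose at startup than on the first request. The following is a minimal sketch of such a check; `require_env` is a hypothetical helper and is not part of ChatKit or the OpenAI SDK.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return a required environment variable or raise a clear error."""
    source = os.environ if env is None else env
    value = source.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the server")
    return value
```

Calling `require_env("OPENAI_API_KEY")` at import time in the application module would make `uvicorn` fail immediately with a readable message if the variable is missing.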