title: ChatKit Backend
emoji: 🚀
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
app_port: 7860
ChatKit Python Backend
This FastAPI service implements an advanced multi-agent support system using the OpenAI Agents SDK and OpenAI ChatKit. It provides specialized support agents for different platforms (Kimi, DeepSeek, Google) with integrated RAG and persistent memory.
Architecture Overview
The system is built on a modular, agentic architecture designed for high-performance customer support.
Component Diagram
graph TD
    Client[ChatKit Frontend] <--> API[FastAPI Orchestrator]
    subgraph "Agent Layer"
        API <--> KimiAgent[Kimi Agent]
        API <--> DeepSeekAgent[DeepSeek Agent]
        API <--> GoogleAgent[Google Agent]
        API <--> Summ[Summarizer Agent]
    end
    subgraph "Tools & Intelligence"
        KimiAgent & DeepSeekAgent & GoogleAgent --> RAG[LlamaIndex RAG Tool]
        KimiAgent & DeepSeekAgent & GoogleAgent --> Facts[Fact Recording Tool]
        RAG --> Chroma[ChromaDB Vector Store]
        RAG --> WebsiteData[Website Knowledge Base]
    end
    subgraph "Persistence"
        API --> SQLiteThreads[SQLite: Chat Threads]
        API --> SQLiteState[SQLite: User State]
    end
    subgraph "External Providers"
        KimiAgent --> Groq[Groq API]
        DeepSeekAgent --> OpenRouter[OpenRouter API]
        GoogleAgent --> OR_Gemini[OpenRouter/Gemini]
    end
Key Components
- FastAPI Orchestrator: Handles request routing, SSE streaming, and handoffs between Agents and ChatKit.
- OpenAI Agents SDK: Provides the logic for agent loops, tool calling, and handoffs.
- ChatKit Server: Manages the ChatKit protocol, ensuring real-time UI updates (widgets, thoughts, tool results).
- Vector RAG Engine: Uses LlamaIndex and ChromaDB to query school-specific services from scraped website data.
- Multi-Modal Input: Uses Groq-hosted Whisper for audio transcription.
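To make the SSE streaming responsibility of the orchestrator more concrete, here is a minimal sketch of how a payload could be framed on the wire as Server-Sent Events. It uses only the standard library. The event names `token` and `done` and the payload keys are hypothetical and are not ChatKit's actual event schema, which the ChatKit server manages itself.

```python
import json
from typing import Any, Iterable, Iterator, Optional


def format_sse(data: Any, event: Optional[str] = None) -> str:
    """Serialize one payload as a Server-Sent Events frame."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps escapes newlines inside strings, so one "data:" line is enough.
    lines.append(f"data: {json.dumps(data)}")
    # A blank line terminates the frame.
    return "\n".join(lines) + "\n\n"


def stream_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield one frame per token, then a final frame marking completion.

    The "token" and "done" event names are illustrative assumptions.
    """
    for token in tokens:
        yield format_sse({"delta": token}, event="token")
    yield format_sse({"done": True}, event="done")
```

In a FastAPI route, a generator like `stream_tokens` would typically be wrapped in a streaming response with the `text/event-stream` media type.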
Model Configuration
Each agent is assigned its own model to balance performance, cost, and reasoning capability.
| Agent / Service | Model Name | Primary Provider | API Class |
|---|---|---|---|
| Kimi Agent | moonshotai/kimi-k2-instruct-0905 | Groq | OpenAIResponsesModel |
| DeepSeek Agent | deepseek/deepseek-chat | OpenRouter | OpenAIChatCompletionsModel |
| Google Agent | google/gemini-2.5-flash | OpenRouter | OpenAIChatCompletionsModel |
| Summarizer Agent | meta-llama/llama-4-scout-17b-16e-instruct | Groq | OpenAIResponsesModel |
| Audio Transcription | whisper-large-v3-turbo | Groq | Native Deepgram/Whisper |
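One way to keep this mapping in code is a small, immutable registry keyed by agent name. The sketch below is only an illustration: `ModelSpec`, `MODEL_REGISTRY`, `spec_for`, and the short agent keys are hypothetical names, not identifiers from this repository. The model names, providers, and API classes are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """One row of the model configuration table."""
    model: str
    provider: str
    api_class: str


# Hypothetical registry; the values mirror the table above.
MODEL_REGISTRY = {
    "kimi": ModelSpec(
        "moonshotai/kimi-k2-instruct-0905", "Groq", "OpenAIResponsesModel"
    ),
    "deepseek": ModelSpec(
        "deepseek/deepseek-chat", "OpenRouter", "OpenAIChatCompletionsModel"
    ),
    "google": ModelSpec(
        "google/gemini-2.5-flash", "OpenRouter", "OpenAIChatCompletionsModel"
    ),
    "summarizer": ModelSpec(
        "meta-llama/llama-4-scout-17b-16e-instruct", "Groq", "OpenAIResponsesModel"
    ),
}


def spec_for(agent: str) -> ModelSpec:
    """Look up the model spec for an agent, failing loudly on unknown names."""
    try:
        return MODEL_REGISTRY[agent]
    except KeyError:
        known = ", ".join(sorted(MODEL_REGISTRY))
        raise ValueError(f"unknown agent {agent!r}; expected one of: {known}") from None
```

Keeping the provider and API class next to the model name makes it harder to pair an OpenRouter model with the Responses API class by accident.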
Assumptions & Limitations
Assumptions
- API Availability: The system assumes stable connections to Groq and OpenRouter.
- Static Knowledge: The RAG system assumes the vector database is pre-built from the school's website data (accessible in the `website/` folder).
- Single Instance: Currently architected for single-instance deployment (SQLite persistence).
Limitations
- Responses API Compatibility: Only Groq natively supports the full `OpenAIResponsesModel` required for advanced thread state. OpenRouter models use `OpenAIChatCompletionsModel` with a custom persistence bridge.
- Concurrency: SQLite is configured in WAL mode, but extremely high concurrent traffic would require a transition to PostgreSQL.
- Context Limits: While the Summarizer Agent mitigates context bloat, very long multi-turn sessions are still bounded by the provider's context window (e.g., roughly 128k tokens for DeepSeek; Gemini 2.5 Flash offers a larger window).
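For reference, the WAL setting mentioned under Concurrency can be sketched with Python's built-in `sqlite3` module. This is a minimal sketch, not the service's actual persistence code: the helper name `open_wal_connection` and the 5-second busy timeout are assumptions chosen for illustration.

```python
import sqlite3


def open_wal_connection(path: str, timeout_s: float = 5.0) -> sqlite3.Connection:
    """Open a SQLite database file and switch it to write-ahead logging.

    WAL lets readers proceed while a single writer is active, which is why
    it helps a single-instance deployment but does not remove the
    one-writer-at-a-time limit.
    """
    # timeout_s controls how long a connection waits on a locked database.
    conn = sqlite3.connect(path, timeout=timeout_s)
    # The pragma returns the journal mode actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    if mode.lower() != "wal":
        conn.close()
        raise RuntimeError(f"could not enable WAL mode (got {mode!r})")
    return conn
```

WAL mode is stored in the database file, so it only needs to be set once per file. In-memory databases cannot use it, so the sketch expects a path on disk.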
Cost Estimation (Rough Calculation)
Estimates assume 1,000 user queries averaging 2,000 tokens per turn (1,500 input / 500 output).
| Model (Provider) | Input Cost (per 1k queries) | Output Cost (per 1k queries) | Estimated Total / 1,000 Queries |
|---|---|---|---|
| DeepSeek Chat (OpenRouter) | $0.021 | $0.035 | $0.056 |
| Gemini 2.5 Flash (OpenRouter) | $0.150 | $0.200 | $0.350 |
| Kimi (Groq) | Free Tier / $0 | Free Tier / $0 | $0.000 |
Note: Costs are based on current OpenRouter pricing (Jan 2025) and Groq's high-speed free tier. Among the paid options listed, DeepSeek is the most cost-effective provider for reasoning-intensive tasks.
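The arithmetic behind an estimate like this can be reproduced with a few lines of Python. The sketch below assumes prices are quoted in dollars per million tokens, which is a common convention but an assumption here. The function name `estimate_cost` is hypothetical, and the prices in the usage example are placeholders, not real quotes from any provider.

```python
def estimate_cost(
    queries: int,
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> dict:
    """Estimate spend for a batch of queries.

    Token counts are per query; prices are in dollars per one million tokens.
    """
    input_cost = queries * input_tokens / 1_000_000 * input_price_per_mtok
    output_cost = queries * output_tokens / 1_000_000 * output_price_per_mtok
    return {
        "input": input_cost,
        "output": output_cost,
        "total": input_cost + output_cost,
    }


if __name__ == "__main__":
    # Placeholder prices ($1.00 in, $2.00 out per million tokens), not real quotes.
    print(estimate_cost(1_000, 1_500, 500, 1.00, 2.00))
```

With the workload above (1,000 queries at 1,500 input and 500 output tokens each), the batch consumes 1.5 million input tokens and 0.5 million output tokens, so the total scales linearly with whatever per-million prices are plugged in.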
Getting Started
To enable the realtime assistant, install both the ChatKit Python package and the OpenAI SDK, then set the OPENAI_API_KEY environment variable.
uv sync
export OPENAI_API_KEY=sk-proj-...
uv run uvicorn app.main:app --reload