--- title: Sema Chat API emoji: ๐Ÿ’ฌ colorFrom: blue colorTo: green sdk: docker pinned: false license: mit short_description: Chat with llms --- # Sema Chat API ๐Ÿ’ฌ Modern chatbot API with streaming capabilities, flexible model backends, and production-ready features. Built with FastAPI and designed for rapid GenAI advancements. ## ๐Ÿš€ Quick Start with Gemma ### Option 1: Automated HuggingFace Spaces Deployment ```bash cd backend/sema-chat ./setup_huggingface.sh ``` ### Option 2: Manual Local Setup ```bash cd backend/sema-chat pip install -r requirements.txt # Copy and configure environment cp .env.example .env # For Gemma via Google AI Studio (Recommended) # Edit .env: MODEL_TYPE=google MODEL_NAME=gemma-2-9b-it GOOGLE_API_KEY=your_google_api_key # Run the API uvicorn app.main:app --reload --host 0.0.0.0 --port 7860 ``` ### Option 3: Local Gemma (Free, No API Key) ```bash # Edit .env: MODEL_TYPE=local MODEL_NAME=google/gemma-2b-it DEVICE=cpu # Run (will download model on first run) uvicorn app.main:app --reload --host 0.0.0.0 --port 7860 ``` ## ๐ŸŒ Access Your API Once running, access: - **Swagger UI**: http://localhost:7860/ - **Health Check**: http://localhost:7860/api/v1/health - **Chat Endpoint**: http://localhost:7860/api/v1/chat ## ๐Ÿงช Quick Test ```bash # Test chat curl -X POST "http://localhost:7860/api/v1/chat" \ -H "Content-Type: application/json" \ -d '{ "message": "Hello! Can you introduce yourself?", "session_id": "test-session" }' # Test streaming curl -N -H "Accept: text/event-stream" \ "http://localhost:7860/api/v1/chat/stream?message=Tell%20me%20about%20AI&session_id=test" ``` ## ๐ŸŽฏ Features ### Core Capabilities - โœ… **Real-time Streaming**: Server-Sent Events and WebSocket support - โœ… **Multiple Model Backends**: Local, HuggingFace API, OpenAI, Anthropic, Google AI, MiniMax - โœ… **Session Management**: Persistent conversation contexts - โœ… **Rate Limiting**: Built-in protection with configurable limits - โœ… **Health Monitoring**: Comprehensive health checks and metrics ### Supported Models - **Local**: TinyLlama, DialoGPT, Gemma, Qwen - **Google AI**: Gemma-2-9b-it, Gemini-1.5-flash, Gemini-1.5-pro - **OpenAI**: GPT-3.5-turbo, GPT-4, GPT-4-turbo - **Anthropic**: Claude-3-haiku, Claude-3-sonnet, Claude-3-opus - **HuggingFace API**: Any model via Inference API - **MiniMax**: M1 model with reasoning capabilities ## ๐Ÿ”ง Configuration ### Environment Variables ```bash # Model Backend (local, google, openai, anthropic, hf_api, minimax) MODEL_TYPE=google MODEL_NAME=gemma-2-9b-it # API Keys (as needed) GOOGLE_API_KEY=your_key OPENAI_API_KEY=your_key ANTHROPIC_API_KEY=your_key HF_API_TOKEN=your_token MINIMAX_API_KEY=your_key # Generation Settings TEMPERATURE=0.7 MAX_NEW_TOKENS=512 TOP_P=0.9 # Server Settings HOST=0.0.0.0 PORT=7860 DEBUG=false ``` ## ๐Ÿ“š Documentation - **[Configuration Guide](CONFIGURATION_GUIDE.md)** - Detailed setup for all backends - **[HuggingFace Deployment](HUGGINGFACE_DEPLOYMENT.md)** - Step-by-step deployment guide - **[API Documentation](http://localhost:7860/)** - Interactive Swagger UI ## ๐Ÿงช Testing ```bash # Run comprehensive tests python tests/test_api.py # Test different backends python examples/test_backends.py # Test specific backend python examples/test_backends.py --backend google ``` ## ๐Ÿš€ Deployment ### HuggingFace Spaces (Recommended) 1. Run the setup script: `./setup_huggingface.sh` 2. Create your Space on HuggingFace 3. Push the generated code 4. Set environment variables in Space settings 5. Your API will be live at: `https://username-spacename.hf.space/` ### Docker ```bash docker build -t sema-chat-api . docker run -e MODEL_TYPE=google \ -e GOOGLE_API_KEY=your_key \ -p 7860:7860 \ sema-chat-api ``` ## ๐Ÿ”— API Endpoints ### Chat - **`POST /api/v1/chat`** - Send chat message - **`GET /api/v1/chat/stream`** - Streaming chat (SSE) - **`WebSocket /api/v1/chat/ws`** - Real-time WebSocket chat ### Sessions - **`GET /api/v1/sessions/{id}`** - Get conversation history - **`DELETE /api/v1/sessions/{id}`** - Clear conversation - **`GET /api/v1/sessions`** - List active sessions ### System - **`GET /api/v1/health`** - Comprehensive health check - **`GET /api/v1/model/info`** - Current model information - **`GET /api/v1/status`** - Basic status ## ๐Ÿ’ก Why This Architecture? 1. **Future-Proof**: Modular design adapts to rapid GenAI advancements 2. **Flexible**: Switch between local models and APIs with environment variables 3. **Production-Ready**: Rate limiting, monitoring, error handling built-in 4. **Cost-Effective**: Start free with local models, scale with APIs 5. **Developer-Friendly**: Comprehensive docs, tests, and examples ## ๐Ÿ› ๏ธ Development ### Project Structure ``` app/ โ”œโ”€โ”€ main.py # FastAPI application โ”œโ”€โ”€ api/v1/endpoints.py # API routes โ”œโ”€โ”€ core/ โ”‚ โ”œโ”€โ”€ config.py # Environment-based configuration โ”‚ โ””โ”€โ”€ logging.py # Structured logging โ”œโ”€โ”€ models/schemas.py # Pydantic request/response models โ”œโ”€โ”€ services/ โ”‚ โ”œโ”€โ”€ chat_manager.py # Chat orchestration โ”‚ โ”œโ”€โ”€ model_manager.py # Backend selection โ”‚ โ”œโ”€โ”€ session_manager.py # Conversation management โ”‚ โ””โ”€โ”€ model_backends/ # Model implementations โ””โ”€โ”€ utils/helpers.py # Utility functions ``` ### Adding New Backends 1. Create new backend in `app/services/model_backends/` 2. Inherit from `ModelBackend` base class 3. Implement required methods 4. Add to `ModelManager._create_backend()` 5. Update configuration and documentation ## ๐Ÿค Contributing 1. Fork the repository 2. Create a feature branch 3. Add tests for new functionality 4. Ensure all tests pass 5. Submit a pull request ## ๐Ÿ“„ License MIT License - see LICENSE file for details. ## ๐Ÿ™ Acknowledgments - **HuggingFace** for model hosting and Spaces platform - **Google** for Gemma models and AI Studio - **FastAPI** for the excellent web framework - **OpenAI, Anthropic, MiniMax** for their APIs --- **Ready to chat? Deploy your Sema Chat API today! ๐Ÿš€๐Ÿ’ฌ**