---
title: Ollama Compatible API
emoji: 🦙
colorFrom: green
colorTo: blue
sdk: docker
pinned: false
license: mit
---

# Ollama Compatible API

Full Ollama-compatible API proxy for the **deepseek-r1:1.5b** model. Works seamlessly with Open WebUI and other Ollama clients.

## 🔗 API Endpoint

```
https://your-space-name.hf.space
```

## 🎯 Open WebUI Configuration

### Step 1: Add Connection

1. Open **Open WebUI**
2. Go to **Settings** → **Connections**
3. Click **Add Connection**

### Step 2: Configure Ollama API

- **Type**: Ollama API
- **URL**: `https://your-space-name.hf.space`
- **API Key**: Leave empty (no authentication required)

### Step 3: Test Connection

Click **Test Connection**; it should show "Connected" along with the available models.

## 📡 Available Endpoints

### GET `/api/tags`

List all available models (Ollama-compatible).

**Example:**
```bash
curl https://your-space-name.hf.space/api/tags
```

**Response:**
```json
{
  "models": [
    {
      "name": "deepseek-r1:1.5b",
      "modified_at": "2024-01-01T00:00:00Z",
      "size": 1500000000
    }
  ]
}
```

### POST `/api/generate`

Generate a completion (Ollama-compatible).

**Example:**
```bash
curl -X POST https://your-space-name.hf.space/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1:1.5b",
    "prompt": "Why is the sky blue?",
    "stream": true
  }'
```

### POST `/api/chat`

Chat completion (Ollama-compatible).

**Example:**
```bash
curl -X POST https://your-space-name.hf.space/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1:1.5b",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "stream": true
  }'
```

## 🐍 Python Client Example

```python
import httpx
import json

API_URL = "https://your-space-name.hf.space"

# Stream a completion and print tokens as they arrive
with httpx.stream(
    "POST",
    f"{API_URL}/api/generate",
    json={
        "model": "deepseek-r1:1.5b",
        "prompt": "What is AI?",
        "stream": True,
    },
    timeout=300,
) as response:
    for line in response.iter_lines():
        if line:
            data = json.loads(line)
            print(data.get("response", ""), end="", flush=True)
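# Each streamed line is a standalone JSON object (NDJSON); the final chunk
# has "done": true and carries Ollama's timing stats. A minimal sketch of
# detecting the terminal chunk (the sample payload below is illustrative):
final_chunk = '{"model": "deepseek-r1:1.5b", "response": "", "done": true}'
data = json.loads(final_chunk)
if data.get("done"):
    print()  # terminate the output line once generation completes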
```

## 🔧 Ollama CLI Compatible

You can use the official Ollama CLI by pointing it at this Space's base URL:

```bash
export OLLAMA_HOST=https://your-space-name.hf.space
ollama list
ollama run deepseek-r1:1.5b "Hello!"
```

## ⚡ Features

- **Full Ollama API compatibility** - works with any Ollama client
- **Real-time streaming** - low-latency, token-by-token generation
- **No caching** - fresh responses every time
- **CORS enabled** - works from browser applications
- **Open WebUI ready** - plug-and-play integration
- **No authentication** - public access (add auth if needed)

## 🚀 Performance Optimizations

- Async I/O for non-blocking operations
- Connection pooling with httpx
- Flash attention enabled
- Optimized batch processing
- No access logs for reduced overhead

## 📊 Model Information

- **Model**: deepseek-r1:1.5b
- **Parameters**: ~1.5 billion
- **Optimized for**: fast inference, low latency
- **Context window**: 2048 tokens

## 🛠️ Technical Stack

- **Proxy**: FastAPI (Python)
- **Backend**: Ollama
- **Model**: deepseek-r1:1.5b
- **Server**: Uvicorn ASGI

## 🔒 Security Note

This Space has **no authentication** by default. If you need to restrict access:

1. Fork this Space
2. Add API-key middleware in `app.py`
3. Configure authentication in Open WebUI

## 📝 License

MIT License

---

**Endpoint**: https://your-space-name.hf.space
**Model**: deepseek-r1:1.5b
**Compatible with**: Open WebUI, Ollama CLI, and all Ollama clients
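The `/api/tags` response documented earlier can also be consumed programmatically, for example to populate a model picker. A minimal sketch that extracts model names and sizes; the payload here is the sample response from this README, not a live call:

```python
import json

# Sample /api/tags body, matching the response documented above
tags_body = """
{
  "models": [
    {
      "name": "deepseek-r1:1.5b",
      "modified_at": "2024-01-01T00:00:00Z",
      "size": 1500000000
    }
  ]
}
"""

models = json.loads(tags_body)["models"]
names = [m["name"] for m in models]
sizes_gb = {m["name"]: round(m["size"] / 1e9, 2) for m in models}
print(names)     # ['deepseek-r1:1.5b']
print(sizes_gb)  # {'deepseek-r1:1.5b': 1.5}
```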
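The security note above suggests adding API-key middleware when forking. A framework-agnostic sketch of the key check itself; the `X-API-Key` header name and the `require_api_key` helper are illustrative choices, not part of this Space:

```python
import hmac

def require_api_key(headers: dict, expected_key: str) -> bool:
    """Return True only when the request carries the expected API key.

    Uses a constant-time comparison so an attacker cannot probe the key
    byte-by-byte through response-timing differences.
    """
    provided = headers.get("x-api-key", "")
    return hmac.compare_digest(provided, expected_key)

# Requests with the right key pass; anything else is rejected.
print(require_api_key({"x-api-key": "secret"}, "secret"))  # True
print(require_api_key({}, "secret"))                       # False
```

Inside a FastAPI middleware or dependency you would normalize incoming header names to lowercase, call a helper like this, and return a 401 response on failure.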