---
title: Ollama Compatible API
emoji: 🦙
colorFrom: green
colorTo: blue
sdk: docker
pinned: false
license: mit
---

# Ollama Compatible API

Full Ollama-compatible API proxy for the **deepseek-r1:1.5b** model. Works seamlessly with Open WebUI and other Ollama clients.

## 🔗 API Endpoint

```
https://your-space-name.hf.space
```

## 🎯 Open WebUI Configuration

### Step 1: Add Connection

1. Open **Open WebUI**
2. Go to **Settings** → **Connections**
3. Click **Add Connection**

### Step 2: Configure Ollama API

- **Type**: Ollama API
- **URL**: `https://your-space-name.hf.space`
- **API Key**: Leave empty (no authentication required)

### Step 3: Test Connection

Click **Test Connection**; it should show "Connected" along with the available models.

## 📡 Available Endpoints

### GET `/api/tags`

List all available models (Ollama-compatible).

**Example:**
```bash
curl https://your-space-name.hf.space/api/tags
```

**Response:**
```json
{
  "models": [
    {
      "name": "deepseek-r1:1.5b",
      "modified_at": "2024-01-01T00:00:00Z",
      "size": 1500000000
    }
  ]
}
```

### POST `/api/generate`

Generate a completion (Ollama-compatible).

**Example:**
```bash
curl -X POST https://your-space-name.hf.space/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1:1.5b",
    "prompt": "Why is the sky blue?",
    "stream": true
  }'
```

### POST `/api/chat`

Chat completion (Ollama-compatible).

**Example:**
```bash
curl -X POST https://your-space-name.hf.space/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1:1.5b",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "stream": true
  }'
```

## 🐍 Python Client Example

```python
import httpx
import json

API_URL = "https://your-space-name.hf.space"

# Stream a completion and print tokens as they arrive
with httpx.stream(
    "POST",
    f"{API_URL}/api/generate",
    json={
        "model": "deepseek-r1:1.5b",
        "prompt": "What is AI?",
        "stream": True,
    },
    timeout=300,
) as response:
    for line in response.iter_lines():
        if line:
            data = json.loads(line)
            print(data.get("response", ""), end="", flush=True)
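# Each streamed line is a standalone JSON object (NDJSON); the final chunk
# has "done": true and carries Ollama's timing stats. A minimal sketch of
# detecting the terminal chunk (the sample payload below is illustrative):
final_chunk = '{"model": "deepseek-r1:1.5b", "response": "", "done": true}'
data = json.loads(final_chunk)
if data.get("done"):
    print()  # terminate the output line once generation completes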
```

## 🔧 Ollama CLI Compatible

You can use the official Ollama CLI by pointing it at this Space's base URL:

```bash
export OLLAMA_HOST=https://your-space-name.hf.space
ollama list
ollama run deepseek-r1:1.5b "Hello!"
```

## ⚡ Features

- **Full Ollama API compatibility** - works with any Ollama client
- **Real-time streaming** - low-latency, token-by-token generation
- **No caching** - fresh responses every time
- **CORS enabled** - works from browser applications
- **Open WebUI ready** - plug-and-play integration
- **No authentication** - public access (add auth if needed)

## 🚀 Performance Optimizations

- Async I/O for non-blocking operations
- Connection pooling with httpx
- Flash attention enabled
- Optimized batch processing
- No access logs for reduced overhead

## 📊 Model Information

- **Model**: deepseek-r1:1.5b
- **Parameters**: ~1.5 billion
- **Optimized for**: fast inference, low latency
- **Context window**: 2048 tokens

## 🛠️ Technical Stack

- **Proxy**: FastAPI (Python)
- **Backend**: Ollama
- **Model**: deepseek-r1:1.5b
- **Server**: Uvicorn ASGI

## 🔒 Security Note

This Space has **no authentication** by default. If you need to restrict access:

1. Fork this Space
2. Add API-key middleware in `app.py`
3. Configure authentication in Open WebUI

## 📝 License

MIT License

---

**Endpoint**: https://your-space-name.hf.space
**Model**: deepseek-r1:1.5b
**Compatible with**: Open WebUI, Ollama CLI, and all Ollama clients
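The `/api/tags` response documented earlier can also be consumed programmatically, for example to populate a model picker. A minimal sketch that extracts model names and sizes; the payload here is the sample response from this README, not a live call:

```python
import json

# Sample /api/tags body, matching the response documented above
tags_body = """
{
  "models": [
    {
      "name": "deepseek-r1:1.5b",
      "modified_at": "2024-01-01T00:00:00Z",
      "size": 1500000000
    }
  ]
}
"""

models = json.loads(tags_body)["models"]
names = [m["name"] for m in models]
sizes_gb = {m["name"]: round(m["size"] / 1e9, 2) for m in models}
print(names)     # ['deepseek-r1:1.5b']
print(sizes_gb)  # {'deepseek-r1:1.5b': 1.5}
```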
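The security note above suggests adding API-key middleware when forking. A framework-agnostic sketch of the key check itself; the `X-API-Key` header name and the `require_api_key` helper are illustrative choices, not part of this Space:

```python
import hmac

def require_api_key(headers: dict, expected_key: str) -> bool:
    """Return True only when the request carries the expected API key.

    Uses a constant-time comparison so an attacker cannot probe the key
    byte-by-byte through response-timing differences.
    """
    provided = headers.get("x-api-key", "")
    return hmac.compare_digest(provided, expected_key)

# Requests with the right key pass; anything else is rejected.
print(require_api_key({"x-api-key": "secret"}, "secret"))  # True
print(require_api_key({}, "secret"))                       # False
```

Inside a FastAPI middleware or dependency you would normalize incoming header names to lowercase, call a helper like this, and return a 401 response on failure.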