---
title: Ollama Compatible API
emoji: 🦙
colorFrom: green
colorTo: blue
sdk: docker
pinned: false
license: mit
---

# Ollama Compatible API

A full Ollama-compatible API proxy for the deepseek-r1:1.5b model. Works seamlessly with Open WebUI and other Ollama clients.

πŸ”— API Endpoint

https://your-space-name.hf.space

## 🎯 Open WebUI Configuration

### Step 1: Add Connection

1. Open Open WebUI
2. Go to **Settings → Connections**
3. Click **Add Connection**

### Step 2: Configure Ollama API

- **Type:** Ollama API
- **URL:** `https://your-space-name.hf.space`
- **API Key:** leave empty (no authentication required)

### Step 3: Test Connection

Click **Test Connection**; it should show "Connected" along with the available models.

πŸ“‘ Available Endpoints

GET /api/tags

List all available models (Ollama compatible)

Example:

curl https://your-space-name.hf.space/api/tags

Response:

```json
{
  "models": [
    {
      "name": "deepseek-r1:1.5b",
      "modified_at": "2024-01-01T00:00:00Z",
      "size": 1500000000
    }
  ]
}
```
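The model names can be pulled out of that response with a few lines of Python; this sketch assumes the JSON shape shown above:

```python
import json

def list_model_names(tags_json: str) -> list[str]:
    """Extract model names from an Ollama /api/tags response body."""
    payload = json.loads(tags_json)
    return [model["name"] for model in payload.get("models", [])]

# Using the response shown above:
sample = '{"models": [{"name": "deepseek-r1:1.5b", "size": 1500000000}]}'
print(list_model_names(sample))  # ['deepseek-r1:1.5b']
```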

### POST /api/generate

Generates a completion (Ollama-compatible).

Example:

```bash
curl -X POST https://your-space-name.hf.space/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1:1.5b",
    "prompt": "Why is the sky blue?",
    "stream": true
  }'
```

### POST /api/chat

Chat completion (Ollama-compatible).

Example:

```bash
curl -X POST https://your-space-name.hf.space/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1:1.5b",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "stream": true
  }'
```
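With `"stream": true`, the chat endpoint returns newline-delimited JSON objects, each carrying a `message.content` fragment in the standard Ollama chat stream format. A small helper to reassemble the reply (the sample chunks below are illustrative):

```python
import json

def accumulate_chat_stream(lines) -> str:
    """Join content fragments from an Ollama /api/chat NDJSON stream."""
    parts = []
    for line in lines:
        if not line:
            continue  # skip blank keep-alive lines
        chunk = json.loads(line)
        parts.append(chunk.get("message", {}).get("content", ""))
        if chunk.get("done"):
            break  # final chunk reached
    return "".join(parts)

# Example with two hypothetical stream chunks:
stream = [
    '{"message": {"role": "assistant", "content": "Hel"}, "done": false}',
    '{"message": {"role": "assistant", "content": "lo!"}, "done": true}',
]
print(accumulate_chat_stream(stream))  # Hello!
```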

## 🐍 Python Client Example

```python
import httpx
import json

API_URL = "https://your-space-name.hf.space"

# Stream a completion token by token
with httpx.stream(
    "POST",
    f"{API_URL}/api/generate",
    json={
        "model": "deepseek-r1:1.5b",
        "prompt": "What is AI?",
        "stream": True
    },
    timeout=300
) as response:
    for line in response.iter_lines():
        if line:
            data = json.loads(line)
            print(data.get("response", ""), end="", flush=True)
```

πŸ”§ Ollama CLI Compatible

You can use the official Ollama CLI by setting the base URL:

export OLLAMA_HOST=https://your-space-name.hf.space
ollama list
ollama run deepseek-r1:1.5b "Hello!"

## ⚡ Features

- **Full Ollama API compatibility** - works with any Ollama client
- **Real-time streaming** - low-latency, token-by-token generation
- **No caching** - fresh responses every time
- **CORS enabled** - works from browser applications
- **Open WebUI ready** - plug-and-play integration
- **No authentication** - public access (add auth if needed)

πŸš€ Performance Optimizations

  • Async I/O for non-blocking operations
  • Connection pooling with httpx
  • Flash attention enabled
  • Optimized batch processing
  • No access logs for reduced overhead

πŸ“Š Model Information

  • Model: deepseek-r1:1.5b
  • Parameters: ~1.5 billion
  • Optimized for: Fast inference, low latency
  • Context window: 2048 tokens
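Ollama requests accept an `options` object for runtime parameters, so the context window can also be pinned per request. A request body doing that might look like this (the `options.num_ctx` field is standard Ollama; the value simply mirrors the 2048-token window noted above):

```json
{
  "model": "deepseek-r1:1.5b",
  "prompt": "Why is the sky blue?",
  "stream": true,
  "options": {
    "num_ctx": 2048
  }
}
```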

πŸ› οΈ Technical Stack

  • Proxy: FastAPI (Python)
  • Backend: Ollama
  • Model: deepseek-r1:1.5b
  • Server: Uvicorn ASGI

πŸ”’ Security Note

This Space has no authentication by default. If you need to restrict access:

  1. Fork this Space
  2. Add API key middleware in app.py
  3. Configure authentication in Open WebUI
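A minimal sketch of the key check such middleware would perform, assuming the key arrives as an `Authorization: Bearer <key>` header. The `EXPECTED_KEY` name is illustrative (load it from an environment variable in practice), and the FastAPI wiring itself is left out:

```python
import hmac

EXPECTED_KEY = "change-me"  # hypothetical key; read from an env var in a real app

def is_authorized(authorization_header) -> bool:
    """Validate an 'Authorization: Bearer <key>' header value.

    In app.py this check would run inside a FastAPI middleware or
    dependency, returning a 401 response whenever it yields False.
    """
    if not authorization_header or not authorization_header.startswith("Bearer "):
        return False
    supplied = authorization_header[len("Bearer "):]
    # Constant-time comparison to avoid timing side channels
    return hmac.compare_digest(supplied, EXPECTED_KEY)

print(is_authorized("Bearer change-me"))  # True
print(is_authorized("Bearer wrong"))      # False
```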

πŸ“ License

MIT License


Endpoint: https://your-space-name.hf.space
Model: deepseek-r1:1.5b
Compatible with: Open WebUI, Ollama CLI, and all Ollama clients