Anthropic-Compatible API

Self-hosted Claude-compatible endpoint powered by Qwen2.5-Coder-7B


Available Models

Choose based on your needs: the 7B model for quality, the 1.5B model for speed (3x faster)

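The trade-off above can be wrapped in a small helper. Note the model IDs here are partly assumed: "qwen2.5-coder-7b" appears in the examples below, but the 1.5B model's ID is a guess; query GET /anthropic/v1/models for the authoritative list.

```python
# Map a priority to a model id. "qwen2.5-coder-7b" is taken from the
# Quick Start examples; the 1.5B id is an ASSUMPTION, not confirmed --
# check GET /anthropic/v1/models for the real list.
MODELS = {
    "quality": "qwen2.5-coder-7b",
    "speed": "qwen2.5-coder-1.5b",  # assumed id
}

def pick_model(priority: str) -> str:
    """Return the model id for a given priority ("quality" or "speed")."""
    try:
        return MODELS[priority]
    except KeyError:
        raise ValueError(f"priority must be one of {sorted(MODELS)}") from None
```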

Quick Start

Claude Code CLI

# Set environment variables
export ANTHROPIC_API_KEY="any-key"
export ANTHROPIC_BASE_URL="https://likhonsheikh-anthropic-compatible-api.hf.space/anthropic"

# Run Claude Code with custom model
claude --model qwen2.5-coder-7b "Write a hello world in Python"

Python SDK

import anthropic

# Point the official SDK at the self-hosted endpoint.
client = anthropic.Anthropic(
    api_key="any-key",  # any key value is accepted
    base_url="https://likhonsheikh-anthropic-compatible-api.hf.space/anthropic"
)

message = client.messages.create(
    model="qwen2.5-coder-7b",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
print(message.content[0].text)
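Responses follow the Anthropic Messages format, where `content` is a list of typed blocks rather than a single string. A minimal sketch of a helper that joins the text blocks of a response; the sample dict below is illustrative, not captured from the live endpoint:

```python
def extract_text(message: dict) -> str:
    """Join all text blocks in an Anthropic-style `content` list."""
    return "".join(
        block["text"] for block in message["content"]
        if block.get("type") == "text"
    )

# Illustrative response shape (not real server output):
sample = {
    "id": "msg_01",
    "role": "assistant",
    "content": [{"type": "text", "text": "Hello! How can I help?"}],
}
print(extract_text(sample))  # → Hello! How can I help?
```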

cURL

curl -X POST "https://likhonsheikh-anthropic-compatible-api.hf.space/anthropic/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-api-key: any-key" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "qwen2.5-coder-7b",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
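The same request can be assembled programmatically. A sketch that mirrors the cURL example above, returning the URL, headers, and JSON body; the actual HTTP call is left out so the snippet runs offline:

```python
import json

BASE_URL = "https://likhonsheikh-anthropic-compatible-api.hf.space/anthropic"

def build_messages_request(prompt: str, model: str = "qwen2.5-coder-7b",
                           max_tokens: int = 256):
    """Return (url, headers, body) for a Messages API call."""
    url = f"{BASE_URL}/v1/messages"
    headers = {
        "Content-Type": "application/json",
        "x-api-key": "any-key",          # any key value is accepted
        "anthropic-version": "2023-06-01",
    }
    body = json.dumps({
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body

url, headers, body = build_messages_request("Hello!")
```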


API Endpoints

Method  Endpoint                  Description
GET     /                         Health check with full status
GET     /health                   Simple health check
GET     /logs                     View API logs
GET     /queue/status             Request queue statistics
GET     /models/status            Loaded models info
POST    /anthropic/v1/messages    Anthropic Messages API
POST    /v1/chat/completions      OpenAI Chat API
GET     /anthropic/v1/models      List available models
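Since the server exposes both an Anthropic Messages endpoint and an OpenAI-style Chat Completions endpoint, the two request formats differ mainly in where the system prompt lives: Anthropic uses a top-level `system` field, while OpenAI expects a leading message with role "system". A hypothetical sketch of that translation (not the server's actual code):

```python
def anthropic_to_openai(payload: dict) -> dict:
    """Convert an Anthropic Messages request body to OpenAI chat format.

    Anthropic puts the system prompt in a top-level `system` field;
    OpenAI expects it as the first entry in `messages`.
    """
    messages = []
    if "system" in payload:
        messages.append({"role": "system", "content": payload["system"]})
    messages.extend(payload["messages"])
    return {
        "model": payload["model"],
        "max_tokens": payload.get("max_tokens", 256),
        "messages": messages,
    }

converted = anthropic_to_openai({
    "model": "qwen2.5-coder-7b",
    "max_tokens": 256,
    "system": "You are a coding assistant.",
    "messages": [{"role": "user", "content": "Hello!"}],
})
```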

Built with llama.cpp + FastAPI | Model: Qwen2.5-Coder-7B-Instruct (Q4_K_M)

Open source and self-hostable