---
title: Anthropic Compatible API
emoji: 🤖
colorFrom: purple
colorTo: blue
sdk: docker
pinned: false
license: apache-2.0
---

# Anthropic-Compatible API

A lightweight, CPU-based API endpoint that provides Anthropic Messages API compatibility using the SmolLM2-135M model.

## Features

  • βœ… Full Anthropic Messages API compatibility
  • βœ… Streaming support (SSE)
  • βœ… Token counting endpoint
  • βœ… Ultra-lightweight (135M parameters)
  • βœ… CPU-optimized
  • βœ… No GPU required

## API Endpoints

### Create Message

```
POST /v1/messages
```

#### Example Request

```bash
curl -X POST "https://YOUR_SPACE.hf.space/v1/messages" \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-api-key" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "smollm2-135m",
    "max_tokens": 256,
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'
```
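
The same request can be issued from Python using only the standard library. In this sketch the Space URL is a placeholder, and the request is built but not sent so the snippet stays self-contained; uncomment the last lines to call a live deployment:

```python
import json
import urllib.request

# Placeholder: replace with your own Space URL.
BASE_URL = "https://YOUR_SPACE.hf.space"

payload = {
    "model": "smollm2-135m",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello, how are you?"}],
}

req = urllib.request.Request(
    f"{BASE_URL}/v1/messages",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "x-api-key": "any-key",  # placeholder key, as in the SDK examples below
        "anthropic-version": "2023-06-01",
    },
    method="POST",
)

# Sending the request (requires a live deployment):
# with urllib.request.urlopen(req) as resp:
#     message = json.loads(resp.read())
#     print(message["content"][0]["text"])
```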

#### Streaming Example

```bash
curl -X POST "https://YOUR_SPACE.hf.space/v1/messages" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "smollm2-135m",
    "max_tokens": 256,
    "stream": true,
    "messages": [
      {"role": "user", "content": "Tell me a short story"}
    ]
  }'
```
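
With `"stream": true`, the response arrives as server-sent events (`event:`/`data:` line pairs, as in Anthropic's streaming format). A minimal parsing sketch, run against a hard-coded sample rather than a live connection; a real client should read and parse the stream incrementally:

```python
import json

def parse_sse(raw: str):
    """Parse Anthropic-style server-sent events into (event, data) pairs."""
    events = []
    for block in raw.strip().split("\n\n"):
        event, data = None, None
        for line in block.splitlines():
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data = json.loads(line[len("data:"):].strip())
        if event is not None:
            events.append((event, data))
    return events

# Example: reassemble generated text from content_block_delta events.
sample = (
    'event: content_block_delta\n'
    'data: {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Once"}}\n'
    '\n'
    'event: content_block_delta\n'
    'data: {"type": "content_block_delta", "delta": {"type": "text_delta", "text": " upon"}}\n'
)
text = "".join(
    data["delta"]["text"]
    for event, data in parse_sse(sample)
    if event == "content_block_delta"
)
print(text)  # -> Once upon
```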

## SDK Compatibility

### Python

```python
import anthropic

client = anthropic.Anthropic(
    api_key="any-key",
    base_url="https://YOUR_SPACE.hf.space"
)

message = client.messages.create(
    model="smollm2-135m",
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello!"}]
)
print(message.content[0].text)
```

### TypeScript/JavaScript

```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: 'any-key',
  baseURL: 'https://YOUR_SPACE.hf.space'
});

const message = await client.messages.create({
  model: 'smollm2-135m',
  max_tokens: 256,
  messages: [{ role: 'user', content: 'Hello!' }]
});
console.log(message.content[0].text);
```

## Model Info

- **Model**: HuggingFaceTB/SmolLM2-135M-Instruct
- **Parameters**: 135 million
- **Optimized for**: CPU inference
- **Context Length**: 2048 tokens
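
The 2048-token context limit means long conversations must be truncated client-side. A rough sketch that keeps only the most recent messages under a budget (illustrative only: this counts characters, while the real limit is measured in model tokens, and a production trimmer should also preserve a valid user/assistant alternation):

```python
def trim_to_budget(messages, max_chars=4096):
    """Keep the most recent messages whose combined content fits the budget.

    Character counts are a crude stand-in for token counts; a real client
    should measure actual tokens with the model's tokenizer.
    """
    kept, used = [], 0
    for msg in reversed(messages):  # walk newest to oldest
        cost = len(msg["content"])
        if used + cost > max_chars:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))  # restore chronological order

history = [
    {"role": "user", "content": "a" * 5000},   # too old/large to keep
    {"role": "assistant", "content": "Sure."},
    {"role": "user", "content": "What now?"},
]
trimmed = trim_to_budget(history)
# The oversized oldest message is dropped; the recent ones fit the budget.
```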

## Rate Limits

This is a free CPU-based endpoint. Please be mindful of usage.


Built with ❤️ by Matrix Agent