---
title: Darwin-35B-A3B-Opus API
emoji: 🧬
colorFrom: indigo
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: OpenAI-compatible FastAPI for Darwin-35B-A3B-Opus (INT4)
---
# Darwin-35B-A3B-Opus API
Self-hosted OpenAI-compatible FastAPI server for [FINAL-Bench/Darwin-35B-A3B-Opus](https://huggingface.co/FINAL-Bench/Darwin-35B-A3B-Opus).
- **35B MoE / 3B active**: based on Qwen3.5-MoE
- **INT4 quantized** (~18 GB): fits on L4/A10G/L40S
- **OpenAI-compatible** endpoints + SSE streaming
- **Bearer auth** (configurable via `API_KEYS` secret)
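
To illustrate the SSE streaming feature, here is a minimal sketch of how a client might reassemble streamed text. It assumes the server emits OpenAI-style chunks (`data: {...}` lines carrying `choices[].delta.content`, ended by a `data: [DONE]` sentinel); the README does not spell out the wire format, so treat those details as assumptions. The helper name `iter_stream_text` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines (assumed format)."""
    for raw in lines:
        line = raw.strip()
        # Skip blank separator lines and anything that is not a data field.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample lines, not captured from the real server.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))
```

In practice the official `openai` client handles this parsing when `stream=True` is passed, so a hand-rolled parser like this is mainly useful for debugging or for clients without that dependency.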
## Endpoints
- `GET /`: Landing page with examples
- `GET /health`: Health + load status
- `GET /v1/models`: List models
- `POST /v1/chat/completions`: Chat (OpenAI compat)
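
The endpoints above can also be reached without the `openai` package. The sketch below only builds the requests with the standard library and does not send them. The helper `build_request` is hypothetical, the base URL is the one used in the Example section, and whether `/health` or `/v1/models` require a key is not stated here, so the header is attached only when a key is given.

```python
import json
import urllib.request
from typing import Optional

# Base URL taken from the Example section; replace it with your own Space URL.
BASE_URL = "https://final-bench-darwin-35b-a3b-opus-api.hf.space"


def build_request(path: str, api_key: str = "",
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request for one of the listed endpoints."""
    headers = {"Accept": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    data = None
    if payload is not None:
        # A JSON body implies POST; requests without a body are sent as GET.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    method = "POST" if payload is not None else "GET"
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)


health = build_request("/health")
models = build_request("/v1/models", api_key="YOUR_KEY")
chat = build_request(
    "/v1/chat/completions",
    api_key="YOUR_KEY",
    payload={
        "model": "Darwin-35B-A3B-Opus",
        "messages": [{"role": "user", "content": "Explain GPQA"}],
        "max_tokens": 300,
    },
)

# To actually send one:
# body = urllib.request.urlopen(health, timeout=30).read()
```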
## Configuration (HF Space secrets)
| Secret | Required | Description |
|--------|----------|-------------|
| `HF_TOKEN` | optional | HF token for private/gated models |
| `API_KEYS` | optional | Comma-separated bearer keys (empty = public) |
| `QUANT_MODE` | optional | `int4` (default), `int8`, `bf16` |
| `MODEL_ID` | optional | HF model id (default: `FINAL-Bench/Darwin-35B-A3B-Opus`) |
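
As a rough illustration of how the `API_KEYS` secret could be interpreted, the sketch below parses a comma-separated value and checks a bearer header against it. This is not the server's actual code: `parse_api_keys` and `is_authorized` are hypothetical names, and the only behavior taken from the table is "comma-separated" and "empty = public".

```python
import hmac
from typing import Optional, Set


def parse_api_keys(raw: Optional[str]) -> Set[str]:
    """Split a comma-separated secret into a set of non-empty keys."""
    return {key.strip() for key in (raw or "").split(",") if key.strip()}


def is_authorized(auth_header: Optional[str], keys: Set[str]) -> bool:
    """Check an Authorization header value against the configured keys."""
    if not keys:
        return True  # empty API_KEYS means the API is public
    scheme, _, token = (auth_header or "").partition(" ")
    if scheme.lower() != "bearer":
        return False
    token = token.strip()
    # Constant-time comparison avoids leaking key contents through timing.
    return any(hmac.compare_digest(token.encode(), key.encode()) for key in keys)


keys = parse_api_keys(" key-a, key-b ,, ")
print(is_authorized("Bearer key-a", keys))
print(is_authorized("Bearer wrong", keys))
```

In a real deployment the raw value would typically come from `os.environ.get("API_KEYS")`, since Space secrets are exposed to the container as environment variables.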
## Hardware
Recommended:
- **L4 (24GB)**: INT4 ✅
- **A10G-small (24GB)**: INT4 ✅
- **L40S (48GB)**: INT4 ✅ or INT8 ✅
- **A100 (80GB)**: any mode including BF16
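
The list above can be turned into a small lookup for picking a `QUANT_MODE` value. The thresholds below are simply the smallest recommended card for each mode as listed here, not measured memory requirements, and `modes_for_vram` is a hypothetical helper.

```python
from typing import List

# (smallest recommended VRAM in GB, QUANT_MODE value), read off the list above.
# These are assumptions derived from the recommendations, not measured minimums.
_MODE_MIN_VRAM_GB = [(24, "int4"), (48, "int8"), (80, "bf16")]


def modes_for_vram(vram_gb: int) -> List[str]:
    """Return the QUANT_MODE values recommended for a GPU of this size."""
    return [mode for min_gb, mode in _MODE_MIN_VRAM_GB if vram_gb >= min_gb]


for gpu, vram in [("L4", 24), ("A10G-small", 24), ("L40S", 48), ("A100", 80)]:
    print(gpu, modes_for_vram(vram))
```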
## Example
```python
from openai import OpenAI

# Point the official OpenAI client at the Space's /v1 base URL.
client = OpenAI(
    api_key="YOUR_KEY",  # one of the keys configured in API_KEYS
    base_url="https://final-bench-darwin-35b-a3b-opus-api.hf.space/v1",
)

resp = client.chat.completions.create(
    model="Darwin-35B-A3B-Opus",
    messages=[{"role": "user", "content": "Explain GPQA"}],
    max_tokens=300,
)
print(resp.choices[0].message.content)
```