---
title: Darwin-35B-A3B-Opus API
emoji: 🧬
colorFrom: indigo
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: OpenAI-compatible FastAPI for Darwin-35B-A3B-Opus (INT4)
---

# Darwin-35B-A3B-Opus API

Self-hosted OpenAI-compatible FastAPI server for [FINAL-Bench/Darwin-35B-A3B-Opus](https://huggingface.co/FINAL-Bench/Darwin-35B-A3B-Opus).

- **35B MoE / 3B active**: Qwen3.5-MoE based
- **INT4 quantized** (~18 GB): fits on L4/A10G/L40S
- **OpenAI-compatible** endpoints + SSE streaming
- **Bearer auth** (configurable via `API_KEYS` secret)

## Endpoints

- `GET /`: Landing page with examples
- `GET /health`: Health and model load status
- `GET /v1/models`: List models
- `POST /v1/chat/completions`: Chat (OpenAI-compatible)
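
The endpoints above can be exercised with the standard library alone. The sketch below builds (but does not send) requests for two of them. `build_request` is a hypothetical helper, the base URL is the one from the Example section with the `/v1` suffix removed, and the JSON shape returned by `/health` is not documented here, so the code does not assume one.

```python
import json
import urllib.request

# Assumed base URL (from this README's Example section, minus the /v1 suffix).
BASE_URL = "https://final-bench-darwin-35b-a3b-opus-api.hf.space"


def build_request(path, api_key=None, payload=None):
    """Build a urllib Request for one of the documented endpoints.

    The request is a GET when payload is None, otherwise a POST with a JSON body.
    """
    headers = {}
    if api_key:
        # Bearer auth, as configured through the API_KEYS secret.
        headers["Authorization"] = f"Bearer {api_key}"
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers)


health_req = build_request("/health")
chat_req = build_request(
    "/v1/chat/completions",
    api_key="YOUR_KEY",
    payload={
        "model": "Darwin-35B-A3B-Opus",
        "messages": [{"role": "user", "content": "Explain GPQA"}],
        "max_tokens": 300,
    },
)

# To actually send a request (needs network access to a running Space):
# with urllib.request.urlopen(health_req, timeout=30) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to inspect the method, URL, and headers before anything goes over the wire.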

## Configuration (HF Space secrets)

| Secret | Required | Description |
|--------|----------|-------------|
| `HF_TOKEN` | optional | HF token for private/gated models |
| `API_KEYS` | optional | Comma-separated bearer keys (empty = public) |
| `QUANT_MODE` | optional | `int4` (default), `int8`, `bf16` |
| `MODEL_ID` | optional | HF model id (default: `FINAL-Bench/Darwin-35B-A3B-Opus`) |
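
The `API_KEYS` row implies a simple rule: a comma-separated list of accepted keys, with an empty value leaving the API public. This README does not show how the server implements that rule, so the following is only a minimal sketch of one way to do it. `parse_api_keys` and `is_authorized` are hypothetical helper names, not functions from the actual server.

```python
import hmac
import os


def parse_api_keys(raw):
    """Split a comma-separated API_KEYS value into a set of non-empty keys."""
    return {key.strip() for key in (raw or "").split(",") if key.strip()}


def is_authorized(authorization_header, allowed_keys):
    """Check an Authorization header value against the allowed keys.

    An empty key set means the API is public, matching the table above.
    """
    if not allowed_keys:
        return True
    scheme, _, token = (authorization_header or "").partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return False
    # compare_digest avoids leaking key contents through timing differences.
    return any(
        hmac.compare_digest(token.encode("utf-8"), key.encode("utf-8"))
        for key in allowed_keys
    )


# Space secrets are exposed to the container as environment variables.
ALLOWED_KEYS = parse_api_keys(os.environ.get("API_KEYS", ""))
```

For example, `is_authorized("Bearer k1", parse_api_keys("k1, k2"))` is `True`, while any header is accepted when the secret is empty.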

## Hardware

Recommended:
- **L4 (24GB)**: INT4 ✅
- **A10G-small (24GB)**: INT4 ✅
- **L40S (48GB)**: INT4 ✅ or INT8 ✅
- **A100 (80GB)**: any mode including BF16
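
These pairings follow from a back-of-the-envelope estimate of weight memory: parameter count times bits per weight, divided by eight. The sketch below assumes a round 35 billion parameters and counts weights only. KV cache, activations, and quantization metadata add to the real footprint, so treat the results as lower bounds.

```python
PARAMS = 35e9  # assumed round parameter count for a "35B" model

BITS_PER_WEIGHT = {"int4": 4, "int8": 8, "bf16": 16}


def weight_gb(params, bits):
    """Approximate weight memory in GB (10^9 bytes), weights only."""
    return params * bits / 8 / 1e9


for mode, bits in BITS_PER_WEIGHT.items():
    print(f"{mode}: ~{weight_gb(PARAMS, bits):.1f} GB")
# int4: ~17.5 GB
# int8: ~35.0 GB
# bf16: ~70.0 GB
```

The INT4 estimate lines up with the ~18 GB quoted earlier, which is why the 24 GB cards are listed for INT4 only, while INT8 needs the 48 GB L40S and BF16 needs the 80 GB A100.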

## Example

```python
from openai import OpenAI

# Point the official OpenAI client at the Space instead of api.openai.com.
client = OpenAI(
    api_key="YOUR_KEY",  # one of the bearer keys configured in API_KEYS
    base_url="https://final-bench-darwin-35b-a3b-opus-api.hf.space/v1",
)
resp = client.chat.completions.create(
    model="Darwin-35B-A3B-Opus",
    messages=[{"role": "user", "content": "Explain GPQA"}],
    max_tokens=300,
)
print(resp.choices[0].message.content)
```
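
For streaming, OpenAI-style servers send server-sent events whose `data:` lines carry JSON chunks, ending with a `data: [DONE]` sentinel. This README does not spell out the wire format, so the sketch below assumes this server follows that convention. `iter_stream_text` is a hypothetical helper, and the sample lines are made up for illustration.

```python
import json


def iter_stream_text(lines):
    """Yield text deltas from OpenAI-style SSE lines (str or bytes)."""
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Made-up sample of what a streamed response might look like.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": ", world"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_text(sample)))  # Hello, world
```

With the `openai` client, passing `stream=True` to `client.chat.completions.create` yields chunk objects directly, so a parser like this is only needed when talking to the endpoint with a raw HTTP client.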