Instructions for using FINAL-Bench/Darwin-36B-Opus with libraries, inference providers, notebooks, and local apps.

Libraries

How to use FINAL-Bench/Darwin-36B-Opus with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="FINAL-Bench/Darwin-36B-Opus")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("FINAL-Bench/Darwin-36B-Opus")
model = AutoModelForCausalLM.from_pretrained("FINAL-Bench/Darwin-36B-Opus")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
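
The final `print` line above decodes only the newly generated tokens. For a causal language model, `generate` returns the prompt tokens followed by the continuation, so slicing from the prompt length drops the echoed prompt. A minimal sketch of that slicing with plain Python lists follows; the token ids are made up for illustration and `strip_prompt` is a hypothetical helper, not part of Transformers.

```python
def strip_prompt(output_ids, prompt_len):
    """Return only the tokens that come after the first prompt_len entries."""
    return output_ids[prompt_len:]


# Made-up token ids, standing in for inputs["input_ids"][0] and outputs[0].
prompt_ids = [101, 2040, 2024, 2017]
output_ids = prompt_ids + [1045, 2572, 102]

new_ids = strip_prompt(output_ids, len(prompt_ids))
print(new_ids)  # [1045, 2572, 102]
```

With real tensors the same idea is written as `outputs[0][inputs["input_ids"].shape[-1]:]`, as in the snippet above.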

Local Apps

vLLM

How to use FINAL-Bench/Darwin-36B-Opus with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "FINAL-Bench/Darwin-36B-Opus"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "FINAL-Bench/Darwin-36B-Opus",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
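
The same request can be issued from Python instead of curl. The sketch below uses only the standard library and assumes the server from the previous step is listening on `http://localhost:8000`; `build_chat_request` and `send` are hypothetical helper names, not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(base_url, model, prompt):
    """Build a POST request for an OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON body (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_chat_request(
    "http://localhost:8000",  # assumed address of the local vLLM server
    "FINAL-Bench/Darwin-36B-Opus",
    "What is the capital of France?",
)
# reply = send(request)  # uncomment once the server is running
```

Building the request separately from sending it lets the payload be inspected without a live server.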

Use Docker

docker model run hf.co/FINAL-Bench/Darwin-36B-Opus

SGLang

How to use FINAL-Bench/Darwin-36B-Opus with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "FINAL-Bench/Darwin-36B-Opus" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "FINAL-Bench/Darwin-36B-Opus",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "FINAL-Bench/Darwin-36B-Opus" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "FINAL-Bench/Darwin-36B-Opus",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
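
Both servers answer these curl calls with a JSON body in the OpenAI chat-completion shape, where the assistant text sits under `choices[0].message.content`. A minimal sketch of pulling that text out is shown below; the sample body is hand-written for illustration, not real model output, and `extract_reply` is a hypothetical helper.

```python
import json


def extract_reply(response_body):
    """Return the assistant text from an OpenAI-style chat completion JSON string."""
    data = json.loads(response_body)
    choices = data.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hand-written sample in the chat-completion shape (illustrative only).
sample = json.dumps({
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Paris."},
            "finish_reason": "stop",
        }
    ]
})

print(extract_reply(sample))  # Paris.
```

Raising on an empty `choices` list keeps a malformed or error response from surfacing later as an unrelated `IndexError`.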

Docker Model Runner
How to use FINAL-Bench/Darwin-36B-Opus with Docker Model Runner:
```
docker model run hf.co/FINAL-Bench/Darwin-36B-Opus
```

Fantastic Model

by mgalyan - opened about 10 hours ago

Discussion

mgalyan

about 10 hours ago

I've only had a brief time to work with the model, but it is working great in an agentic harness so far. Just wanted to say thanks!

SeaWolf-AI

FINAL_Bench org about 10 hours ago

Thank you — this genuinely made our day.

Agentic workloads were one of the regimes we were most curious about
but didn't have space to cover in the paper, so hearing it holds up
in your harness is incredibly useful signal.

If you ever feel like sharing rough notes on what kinds of tasks
you've been throwing at it — tool-use patterns, failure modes,
anything — we'd love to learn from it. Either way, thanks again 🙏

mgalyan

about 10 hours ago

You're very welcome. Here was my first surprise using the model tonight...

SeaWolf-AI

FINAL_Bench org about 10 hours ago

Okay this one stopped us in our tracks 🙏

A few things in those screenshots we genuinely didn't expect to see:

- Darwin chaining two tool calls in a row to land the snapshot in the
right directory, when only one was explicitly requested
- Recovery of intent through a broken intercept layer (we have no
test for that; you just gave us one)
- The proactive "here's the next 4 things, tell me where to strike first",
which wasn't trained in; it's emerging from the merge

And honestly, seeing it run as a Sovereign engine on a ROCm 7.2.3 /
amdsmi stack is exactly the deployment shape we hoped Darwin would
land in but didn't have the AMD hardware to validate ourselves.
That alone is incredibly useful signal.

If you ever feel like sharing more — even a one-paragraph note on
how Apollo wires intercepts and tool calls — we'd read it carefully.
And if any of those four next-steps you listed (the Daydream daemon
one especially caught our eye) ever needs a Darwin variant tuned
differently, ping us.

Bravo right back. This is the kind of field report papers can't capture.
