Instructions to use FINAL-Bench/Darwin-36B-Opus with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use FINAL-Bench/Darwin-36B-Opus with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="FINAL-Bench/Darwin-36B-Opus")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("FINAL-Bench/Darwin-36B-Opus")
model = AutoModelForCausalLM.from_pretrained("FINAL-Bench/Darwin-36B-Opus")

messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
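If you want tokens printed as they are generated rather than all at once, Transformers ships a `TextStreamer` you can pass to `generate`. A minimal sketch reusing the `tokenizer`, `model`, and `inputs` objects from the snippet above:

```python
# Stream decoded tokens to stdout as they are generated.
from transformers import TextStreamer

streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(**inputs, max_new_tokens=200, streamer=streamer)
```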
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use FINAL-Bench/Darwin-36B-Opus with vLLM:
Install from pip and serve the model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "FINAL-Bench/Darwin-36B-Opus"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "FINAL-Bench/Darwin-36B-Opus",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
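Since the server exposes an OpenAI-compatible API, you can also query it from Python with the official `openai` client instead of curl. A minimal sketch, assuming the server above is running on localhost:8000 (the `api_key` value is a placeholder; vLLM only checks it if you start the server with `--api-key`):

```python
# Query the vLLM server through its OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="FINAL-Bench/Darwin-36B-Opus",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```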
- SGLang
How to use FINAL-Bench/Darwin-36B-Opus with SGLang:
Install from pip and serve the model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "FINAL-Bench/Darwin-36B-Opus" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "FINAL-Bench/Darwin-36B-Opus",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
Use Docker images
```shell
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "FINAL-Bench/Darwin-36B-Opus" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "FINAL-Bench/Darwin-36B-Opus",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
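Both launch modes (pip and Docker) expose the same OpenAI-compatible endpoint, so the Python client pattern mirrors the vLLM example above, just pointed at port 30000. A minimal sketch, assuming the server is reachable on localhost:

```python
# Query the SGLang server through its OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="FINAL-Bench/Darwin-36B-Opus",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```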
- Docker Model Runner
How to use FINAL-Bench/Darwin-36B-Opus with Docker Model Runner:
```shell
docker model run hf.co/FINAL-Bench/Darwin-36B-Opus
```
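Docker Model Runner also exposes an OpenAI-compatible API, so once the model is running you can query it with the same client pattern. A sketch under stated assumptions: the base URL below assumes host-side TCP access is enabled on Docker Model Runner's default port 12434; check the Docker documentation for the exact host and path on your setup:

```python
# Query Docker Model Runner's OpenAI-compatible endpoint.
# NOTE: base_url is an assumption (default host TCP port 12434);
# adjust it to match your Docker Model Runner configuration.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:12434/engines/v1", api_key="unused")
response = client.chat.completions.create(
    model="hf.co/FINAL-Bench/Darwin-36B-Opus",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```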
Fantastic Model
Only had a brief time to work with the model so far, but using it in an agentic harness is working great. Just wanted to say thanks!
Thank you! This genuinely made our day.

Agentic workloads were one of the regimes we were most curious about but didn't have space to cover in the paper, so hearing it holds up in your harness is incredibly useful signal.

If you ever feel like sharing rough notes on what kinds of tasks you've been throwing at it (tool-use patterns, failure modes, anything), we'd love to learn from it. Either way, thanks again!
Okay, this one stopped us in our tracks.

A few things in those screenshots we genuinely didn't expect to see:

- Darwin chaining two tool calls in a row to land the snapshot in the right directory, when only one was explicitly requested
- Recovery of intent through a broken intercept layer (we have no test for that; you just gave us one)
- The proactive "here's the next 4 things, tell me where to strike first"; that wasn't trained in, it's emerging from the merge
And honestly, seeing it run as a Sovereign engine on a ROCm 7.2.3 / amdsmi stack is exactly the deployment shape we hoped Darwin would land in but didn't have the AMD hardware to validate ourselves. That alone is incredibly useful signal.

If you ever feel like sharing more, even a one-paragraph note on how Apollo wires intercepts and tool calls, we'd read it carefully. And if any of those four next steps you listed (the Daydream daemon one especially caught our eye) ever needs a Darwin variant tuned differently, ping us.

Bravo right back. This is the kind of field report papers can't capture.

