Instructions for using FINAL-Bench/Darwin-28B-Opus with libraries and local apps.

Libraries

How to use FINAL-Bench/Darwin-28B-Opus with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

# The messages below mix an image and text, so use the image-text-to-text task
pipe = pipeline("image-text-to-text", model="FINAL-Bench/Darwin-28B-Opus")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("FINAL-Bench/Darwin-28B-Opus")
model = AutoModelForMultimodalLM.from_pretrained("FINAL-Bench/Darwin-28B-Opus")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use FINAL-Bench/Darwin-28B-Opus with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "FINAL-Bench/Darwin-28B-Opus"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "FINAL-Bench/Darwin-28B-Opus",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/FINAL-Bench/Darwin-28B-Opus

SGLang

How to use FINAL-Bench/Darwin-28B-Opus with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "FINAL-Bench/Darwin-28B-Opus" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "FINAL-Bench/Darwin-28B-Opus",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "FINAL-Bench/Darwin-28B-Opus" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "FINAL-Bench/Darwin-28B-Opus",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner

How to use FINAL-Bench/Darwin-28B-Opus with Docker Model Runner:

```
docker model run hf.co/FINAL-Bench/Darwin-28B-Opus
```

Missing preprocessor_config.json when deploying Darwin-28B models

by sothouth - opened 16 days ago

Discussion

sothouth

16 days ago

Hello, I encountered some issues while trying to deploy this model.

I can deploy and use FINAL-Bench/Darwin-27B-Opus smoothly. Based on that setup, I tried deploying both FINAL-Bench/Darwin-28B-REASON and FINAL-Bench/Darwin-28B-Opus with commands similar to the following:

sglang serve --served-model-name default \
--model-path FINAL-Bench/Darwin-28B-Opus \
--reasoning-parser qwen3 --tool-call-parser qwen3_coder \
--mem-fraction-static 0.90 --context-length 262144 \
--tokenizer-worker-num 8 --schedule-policy lpm \
--cuda-graph-max-bs 16 --max-running-requests 16 \
--chunked-prefill-size 16384 --enable-mixed-chunk \
--tp-size 4 --pp-size 1 --dp-size 1 --ep-size 1 \
--trust-remote-code --enable-multimodal \
--host 0.0.0.0 --port 38011

However, I then get the following error:

OSError: Can't load image processor for '$HOME/.cache/huggingface/hub/models--FINAL-Bench--Darwin-28B-Opus/snapshots/165e98249be214cbdc19015fa565ad1b571f2e76/'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure '$HOME/.cache/huggingface/hub/models--FINAL-Bench--Darwin-28B-Opus/snapshots/165e98249be214cbdc19015fa565ad1b571f2e76/' is the correct path to a directory containing a preprocessor_config.json file

As a quick workaround, I tried copying preprocessor_config.json from FINAL-Bench/Darwin-27B-Opus and Qwen/Qwen3.6-27B into the model directory. After doing so, the service was able to start normally.
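Before launching the server, a local snapshot directory can be checked for the file the error complains about. The sketch below is only an illustration: the helper name `missing_processor_files` and the `EXPECTED_FILES` tuple are hypothetical, and the only file name taken from the error above is preprocessor_config.json.

```python
from pathlib import Path

# Assumption: preprocessor_config.json is the only file named in the error above.
# Extend this tuple if the loader reports other missing files.
EXPECTED_FILES = ("preprocessor_config.json",)


def missing_processor_files(snapshot_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from snapshot_dir."""
    root = Path(snapshot_dir).expanduser()
    return [name for name in expected if not (root / name).is_file()]
```

Calling it with the snapshot path printed in the OSError should return an empty list once the file is in place.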

However, during agent-based calls — for example, using VS Code Copilot or Hermes — I consistently run into the same issue: the model outputs a brief reasoning sentence such as:

The user wants a summary of the entire DOC.md document, not just the selected section. Let me read the full document to provide a comprehensive summary.

and then immediately stops generating any further output.

I’m not sure what exactly is going wrong here. Specifically, I’m unclear about:

how this model is supposed to be launched when preprocessor_config.json is missing;
why the agent calls start behaving abnormally after this workaround.

If this is due to an issue in my deployment or configuration, I would really appreciate your guidance. On the other hand, if some files are missing from the model repository, I would be very grateful if you could help fix them.

SeaWolf-AI

FINAL_Bench org 14 days ago

Thank you for reporting this — we've fixed both issues.

1. Missing preprocessor_config.json
The file has been added to both repositories:

FINAL-Bench/Darwin-28B-Opus → commit 950d8ed
FINAL-Bench/Darwin-28B-REASON → commit 4669ba2

You no longer need the manual workaround.

2. Agent calls stopping after reasoning
This is caused by using --reasoning-parser qwen3 and --tool-call-parser qwen3_coder together — a known SGLang conflict that leaves content=null. Remove --reasoning-parser qwen3 and --enable-mixed-chunk from your launch command. The model will still emit <think>...</think> content normally; it just won't be split into a separate field.
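If a client still wants the reasoning separated once the server no longer splits it, the <think>...</think> block can be split off on the client side. This is a minimal sketch, not part of SGLang or this model's tooling: the function name `split_think` is hypothetical, and it assumes at most one reasoning block that comes before the answer. It also accepts output where only the closing tag is present, which is an assumption about how some chat templates behave.

```python
def split_think(content):
    """Split raw model output into (reasoning, answer).

    Assumes at most one reasoning block, placed before the answer.
    """
    if not content:
        return "", ""
    # Everything before the last closing tag is treated as reasoning.
    head, sep, tail = content.rpartition("</think>")
    if not sep:
        # No closing tag found: the whole output is the answer.
        return "", content.strip()
    reasoning = head.replace("<think>", "").strip()
    return reasoning, tail.strip()
```

For example, `split_think("<think>plan</think>Paris")` returns `("plan", "Paris")`, and output with no tags comes back unchanged as the answer.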

Let us know if anything else comes up!

sothouth

14 days ago

Thanks for the quick fix and clarification!

I’ll pull the latest version again, and retest the agent invocation after removing --reasoning-parser qwen3 and --enable-mixed-chunk as suggested.
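For launch commands kept in scripts, the two flags can be stripped programmatically rather than by hand. The sketch below is an illustration only: `drop_flags` and `FLAGS_TO_DROP` are hypothetical names, and the number of values each flag takes is inferred from the command earlier in this thread.

```python
import shlex

# Flags the reply above says to remove, mapped to how many values follow each one
# (inferred from the launch command in this thread).
FLAGS_TO_DROP = {"--reasoning-parser": 1, "--enable-mixed-chunk": 0}


def drop_flags(command, flags=FLAGS_TO_DROP):
    """Return the command line with the given flags and their values removed."""
    # Join backslash-continued lines so shlex sees a single command line.
    tokens = shlex.split(command.replace("\\\n", " "))
    kept = []
    skip = 0
    for tok in tokens:
        if skip:
            # This token is the value of a flag that was just dropped.
            skip -= 1
            continue
        name, has_eq, _ = tok.partition("=")
        if name in flags:
            # The --flag=value form carries its value inside the same token.
            skip = 0 if has_eq else flags[name]
            continue
        kept.append(tok)
    return shlex.join(kept)
```

The result is a single-line command; every other option, including --tool-call-parser qwen3_coder, is passed through untouched.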

If I encounter any further issues, I’ll provide the logs and reproduction steps. Thanks again!
