Darwin-36B-Opus β€” VKAE Accelerated

Ready-to-run, VKAE-accelerated serving of Darwin-36B-Opus, VIDRAFT's house 36B Mixture-of-Experts model. Model weights and an optimized serving runtime in a single self-contained container.

VKAE (VIDRAFT Kernel Acceleration Engine) is VIDRAFT's proprietary inference-serving optimization. The acceleration recipe is withheld; only the reproducible results are published here.

Measured performance

NVIDIA B200, single GPU, bf16, same-harness before/after.

Metric Baseline VKAE Gain
Single-stream throughput 25.0 tok/s 280.8 tok/s 11.2Γ—
Output quality reference preserved no degradation

Quick start

docker pull vidraft/darwin36-vkae:281
docker run --gpus all -p 8000:8000 vidraft/darwin36-vkae:281

The container serves an OpenAI-compatible API on port 8000 β€” point any OpenAI client at http://localhost:8000/v1. A Blackwell (B200) or Hopper (H100/H200) class GPU is recommended.

Links

About

Darwin-36B-Opus is a VIDRAFT house model (36B Mixture-of-Experts). This card documents VIDRAFT's accelerated serving of the model; the acceleration method is proprietary and not distributed in source form.

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for FINAL-Bench/Darwin-36B-Opus-VKAE

Finetuned
(2)
this model

Space using FINAL-Bench/Darwin-36B-Opus-VKAE 1

Collection including FINAL-Bench/Darwin-36B-Opus-VKAE