Paper: Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning (arXiv 2605.14386)
How to use sovthpaw/OmniSenter-Base-16B with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="sovthpaw/OmniSenter-Base-16B")
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe(messages)
# Load model directly
from transformers import AutoTokenizer, AutoModelForMultimodalLM
tokenizer = AutoTokenizer.from_pretrained("sovthpaw/OmniSenter-Base-16B")
model = AutoModelForMultimodalLM.from_pretrained("sovthpaw/OmniSenter-Base-16B")
messages = [
{"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
messages,
add_generation_prompt=True,
tokenize=True,
return_dict=True,
return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
How to use sovthpaw/OmniSenter-Base-16B with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "sovthpaw/OmniSenter-Base-16B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "sovthpaw/OmniSenter-Base-16B",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
How to use sovthpaw/OmniSenter-Base-16B with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "sovthpaw/OmniSenter-Base-16B" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "sovthpaw/OmniSenter-Base-16B",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Or start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "sovthpaw/OmniSenter-Base-16B" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "sovthpaw/OmniSenter-Base-16B",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
How to use sovthpaw/OmniSenter-Base-16B with Docker Model Runner:
docker model run hf.co/sovthpaw/OmniSenter-Base-16B
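The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library, assuming a server started as shown is already listening locally. `build_chat_request` and `chat` are hypothetical helper names, and the response layout (`choices[0].message.content`) is assumed to follow the usual OpenAI-compatible shape.

```python
import json
import urllib.request


def build_chat_request(base_url, model, prompt):
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url, model, prompt, timeout=60):
    """Send the request and return the assistant's reply text.

    Assumes the server answers with the usual `choices[0].message.content` layout.
    """
    request = build_chat_request(base_url, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With a server running, `chat("http://localhost:8000", "sovthpaw/OmniSenter-Base-16B", "What is the capital of France?")` would target the vLLM example; the SGLang examples use port 30000 instead.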
NOUS RESEARCH — EVOLUTIONARY MODEL MERGING
MODEL TYPE .............. MULTIMODAL LANGUAGE MODEL
ARCHITECTURE ............ QWEN3 (MODIFIED) + COSMOS3 MULTIMODAL HEADS
PARAMETERS .............. 16B
PRECISION ............... BFLOAT16
GENERATION .............. 0 (BASE)
STATUS .................. DARWIN MERGED — READY FOR SFT
OMNISENTER BASE 16B IS A DARWIN FAMILY EVOLVED MULTIMODAL MODEL AND THE FIRST GENERATION OF THE OMNISENTER LINEAGE. IT WAS PRODUCED BY FUSING THE REASONING CAPABILITIES OF QWEN3-8B INTO THE MULTIMODAL WORLD MODEL COSMOS3-NANO VIA PER-TENSOR MRI-TRUST FUSION.
THE MODEL PRESERVES ALL OF COSMOS3-NANO'S MODALITIES (VISION, AUDIO, VIDEO UNDERSTANDING AND GENERATION) WHILE BLENDING IN QWEN3-8B'S TEXT REASONING STRENGTHS.
| PARENT | ARCHITECTURE | PARAMETERS | ROLE |
|---|---|---|---|
| NVIDIA/COSMOS3-NANO | COSMOS3FORCONDITIONALGENERATION | ~16B | MULTIMODAL WORLD MODEL |
| QWEN/QWEN3-8B | QWEN3FORCAUSALLM | 8B | DENSE TEXT REASONING |
METHOD .................. DARWIN FAMILY MRI-TRUST FUSION
GENOME DENSITY (ρ_b) .... 0.5
MRI-TRUST COEFF (τ) ..... 0.4
TEXT TENSORS MERGED ..... 398
COSMOS EXTRAS PRESERVED . 399 (CROSS-ATTN, MOE TWINS, MODALITY)
TOTAL OUTPUT TENSORS .... 798
SHAPE MATCH RATE ........ 398/398 (100%)
MERGE TIME .............. 195S
MODEL SIZE .............. 29GB (BFLOAT16, 7 SHARDS)
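THE FIGURES ABOVE DESCRIBE A TWO-WAY SPLIT: TENSORS PRESENT IN BOTH PARENTS WITH MATCHING SHAPES ARE BLENDED, AND COSMOS-ONLY TENSORS ARE CARRIED OVER UNCHANGED. THE SKETCH BELOW ILLUSTRATES THAT SPLIT ON TOY DATA. IT IS NOT THE DARWIN FAMILY PROCEDURE: THE BLEND IS PLAIN LINEAR INTERPOLATION, USING τ AS THE WEIGHT ON THE QWEN SIDE IS AN ASSUMPTION, AND GENOME DENSITY IS NOT MODELED. `merge_state_dicts` AND THE TENSOR NAMES ARE HYPOTHETICAL.

```python
def merge_state_dicts(base, donor, weight):
    """Blend same-name, same-length tensors; copy every other base tensor.

    Tensors are flat lists of floats here, so "shape" is just the length.
    The blend is (1 - weight) * base + weight * donor, which is an assumed
    stand-in for the real per-tensor fusion rule.
    """
    output, merged, preserved = {}, [], []
    for name, base_values in base.items():
        donor_values = donor.get(name)
        if donor_values is not None and len(donor_values) == len(base_values):
            output[name] = [
                (1.0 - weight) * b + weight * d
                for b, d in zip(base_values, donor_values)
            ]
            merged.append(name)
        else:
            # No donor counterpart (or a shape mismatch): keep the base tensor.
            output[name] = list(base_values)
            preserved.append(name)
    return output, merged, preserved


# Toy parents: one shared text tensor, one Cosmos-only cross-attention tensor.
cosmos = {
    "layers.0.self_attn.q_proj": [1.0, 2.0],
    "layers.0.add_q_proj": [5.0],
}
qwen = {"layers.0.self_attn.q_proj": [3.0, 4.0]}

result, merged, preserved = merge_state_dicts(cosmos, qwen, weight=0.4)
print("merged:", merged)
print("preserved:", preserved)
```

IN THIS TOY RUN THE SHARED PROJECTION IS BLENDED AND THE CROSS-ATTENTION TENSOR IS COPIED THROUGH, MIRRORING THE MERGED AND PRESERVED COUNTS REPORTED ABOVE.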
OMNISENTER BASE 16B
├── TEXT BACKBONE (DARWIN-MERGED QWEN3)
│ ├── 36 TRANSFORMER LAYERS
│ ├── SELF-ATTN + MLP + NORMS PER LAYER
│ ├── EMBED_TOKENS (151,936 VOCAB)
│ └── LM_HEAD
├── CROSS-MODAL ATTENTION (FROM COSMOS3-NANO)
│ ├── ADD_Q/K/V_PROJ + TO_ADD_OUT PER LAYER
│ └── NORM_ADDED_Q/K PER LAYER
├── MOE GENERATION TWINS (FROM COSMOS3-NANO)
│ └── LAYERS.*.MLP_MOE_GEN.* + LAYERNORMS
├── VISION ENCODER
├── DIFFUSION TRANSFORMER (VIDEO/IMAGE GEN)
├── SOUND TOKENIZER
└── VAE
TEXT REASONING ......... YES — ENHANCED VIA QWEN3-8B FUSION
VISION ................ YES — PRESERVED FROM COSMOS3-NANO
AUDIO ................. YES — PRESERVED FROM COSMOS3-NANO
VIDEO UNDERSTANDING ... YES — PRESERVED FROM COSMOS3-NANO
VIDEO GENERATION ...... YES — PRESERVED FROM COSMOS3-NANO
TOOL CALLING .......... BASE CAPABILITY — IMPROVEMENT VIA SFT (PLANNED)
AGENTIC BEHAVIOR ...... BASE CAPABILITY — IMPROVEMENT VIA SFT (PLANNED)
MUSIC GENERATION ...... NOT YET — ACESTEP INTEGRATION PLANNED (LINE 2)
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the weights in bfloat16; device_map="auto" requires the accelerate package
model = AutoModelForCausalLM.from_pretrained(
    "sovthpaw/OmniSenter-Base-16B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("sovthpaw/OmniSenter-Base-16B")

# Render the conversation with the model's chat template, then tokenize it
messages = [{"role": "user", "content": "Hello, what can you do?"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate up to 512 new tokens; the decoded string includes the prompt
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
HERMES REASONING TOOL USE ............. 5,000 CONVERSATIONS
AURETH SFT CURRICULUM ................. 5,000 CONVERSATIONS
HERMES AGENT TRACES ................... 3,679 CONVERSATIONS
HERMES FUNCTION CALLING + THINKING .... 3,570 CONVERSATIONS
HERMES FUNCTION CALLING V1 ............ 1,893 CONVERSATIONS
─────────────────────────────────────────────────────────────
TOTAL ................................ 34,142 CONVERSATIONS
ADDITIONAL: NEMOTRON, ATROPOS, NOUS RESEARCH DATASETS
| FORMAT | VRAM | NOTES |
|---|---|---|
| BFLOAT16 (SAFETENSORS) | ~32GB | UNQUANTIZED, A100/2×3090 |
| 4-BIT QUANTIZED (QLORA) | ~8GB | FOR FINE-TUNING |
| Q4_K_M GGUF | ~10GB | INFERENCE ON SINGLE 3090 |
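THE VRAM FIGURES IN THE TABLE ARE CLOSE TO A WEIGHTS-ONLY ESTIMATE OF PARAMETERS × BITS PER WEIGHT. THE SKETCH BELOW REPRODUCES THAT ARITHMETIC. IT IGNORES ACTIVATIONS, KV CACHE, AND FRAMEWORK OVERHEAD, AND THE 4.85 BITS PER WEIGHT USED FOR Q4_K_M IS AN ASSUMED APPROXIMATE AVERAGE THAT VARIES BY MODEL. `weight_memory_gb` IS A HYPOTHETICAL HELPER NAME.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Weights-only footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS = 16e9  # parameter count stated in the model card

# Bits per weight: 16 and 4 are exact; 4.85 for Q4_K_M is an assumption.
FORMATS = [("bfloat16", 16), ("4-bit", 4), ("Q4_K_M (approx.)", 4.85)]

for label, bits in FORMATS:
    print(f"{label}: {weight_memory_gb(PARAMS, bits):.1f} GB")
```

THE ESTIMATES COME OUT AT ROUGHLY 32 GB, 8 GB, AND JUST UNDER 10 GB, IN LINE WITH THE TABLE. REAL USAGE WILL BE HIGHER ONCE CONTEXT LENGTH AND BATCH SIZE ARE TAKEN INTO ACCOUNT.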
COSMOS3-NANO ──┐
               ├── DARWIN MERGE ──► OMNISENTER BASE 16B (GEN-0)
QWEN3-8B ──────┘                        │
                                        ├──► GEN-1 (EVOLVED, CMA-ES)
                                        ├──► GEN-2 (EVOLVED + SFT)
                                        └──► ... CONTINUOUS EVOLUTION
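THE LINEAGE ABOVE NAMES CMA-ES FOR EVOLVING LATER GENERATIONS. AS A ROUGH ILLUSTRATION OF THAT KIND OF SEARCH, THE SKETCH BELOW RUNS A MUCH SIMPLER (1+λ) EVOLUTION STRATEGY, NOT CMA-ES, OVER THE TWO MERGE HYPERPARAMETERS (GENOME DENSITY AND MRI-TRUST COEFFICIENT). THE FITNESS FUNCTION IS A TOY PLACEHOLDER; A REAL RUN WOULD PRESUMABLY SCORE EACH MERGED MODEL ON A BENCHMARK. ALL NAMES ARE HYPOTHETICAL.

```python
import random


def clamp(value, low=0.0, high=1.0):
    """Keep a hyperparameter inside [low, high]."""
    return max(low, min(high, value))


def evolve(fitness, start, generations=30, offspring=8, sigma=0.1, seed=0):
    """(1+lambda) evolution strategy with Gaussian mutation.

    Each generation samples `offspring` mutated copies of the parent and
    replaces the parent only if the best child scores strictly higher.
    """
    rng = random.Random(seed)
    parent, parent_score = start, fitness(start)
    for _ in range(generations):
        children = [
            tuple(clamp(x + rng.gauss(0.0, sigma)) for x in parent)
            for _ in range(offspring)
        ]
        best_child = max(children, key=fitness)
        best_score = fitness(best_child)
        if best_score > parent_score:
            parent, parent_score = best_child, best_score
    return parent, parent_score


def toy_fitness(genome):
    """Placeholder objective peaking at density=0.6, trust=0.3 (arbitrary values)."""
    density, trust = genome
    return -((density - 0.6) ** 2 + (trust - 0.3) ** 2)


# Start from the GEN-0 settings listed in the merge configuration.
start = (0.5, 0.4)
best, score = evolve(toy_fitness, start)
print("best (density, trust):", best)
print("score:", score)
```

BECAUSE THE PARENT IS ONLY REPLACED BY A BETTER CHILD, THE SCORE NEVER DECREASES ACROSS GENERATIONS. CMA-ES ADDITIONALLY ADAPTS THE MUTATION COVARIANCE, WHICH THIS SKETCH LEAVES OUT.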
@article{darwin2026family,
title={Darwin Family: Training-Free Evolutionary Model Merging},
author={Darwin Team},
journal={arXiv preprint arXiv:2605.14386},
year={2026}
}
TOWARDS SELF-IMPROVEMENT
NOUS RESEARCH