---
base_model:
- FINAL-Bench/Darwin-4B-David
- Qwen/Qwen3.5-4B
language:
- ko
- en
- zh
- ja
- de
- fr
- es
license: apache-2.0
pipeline_tag: text-generation
library_name: transformers
tags:
- merge
- evolutionary-merge
- darwin
- darwin-v6
- model-mri
- cross-architecture
- ffn-crossbreed
- cma-es
- hybrid-vigor
- transformer-mamba
- reasoning
- gemma4
- qwen3.5
- gated-deltanet
- korean
- multilingual
- gpqa
- open-source
- world-first
model-index:
- name: Darwin-4B-Genesis
  results:
  - task:
      type: text-generation
      name: Korean Cultural Understanding
    dataset:
      name: CLIcK
      type: EunsuKim/CLIcK
    metrics:
    - type: accuracy
      value: 92.0
      name: Accuracy
      verified: false
  - task:
      type: text-generation
      name: Multi-Step Reasoning
    dataset:
      name: MuSR
      type: TAUR-Lab/MuSR
    metrics:
    - type: accuracy
      value: 70.0
      name: Accuracy
      verified: false
---
# Darwin-4B-Genesis
<p align="center">
<a href="https://huggingface.co/FINAL-Bench/Darwin-4B-Opus"><img src="https://img.shields.io/badge/🧬_Gen1-Darwin--4B--Opus-blue?style=for-the-badge" alt="Gen1"></a>
<a href="https://huggingface.co/FINAL-Bench/Darwin-4B-David"><img src="https://img.shields.io/badge/🧬_Gen2-Darwin--4B--David-blue?style=for-the-badge" alt="Gen2"></a>
<a href="https://huggingface.co/FINAL-Bench/Darwin-4B-Genesis"><img src="https://img.shields.io/badge/Gen3-Darwin--4B--Genesis-gold?style=for-the-badge" alt="Gen3"></a>
</p>
Darwin-4B-Genesis is presented in the paper [Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning](https://arxiv.org/abs/2605.14386).
<p align="center">
<a href="https://huggingface.co/FINAL-Bench/Darwin-9B-Opus"><img src="https://img.shields.io/badge/🧬_Model-Darwin--9B--Opus-blue?style=for-the-badge" alt="9B"></a>
<a href="https://huggingface.co/spaces/FINAL-Bench/Darwin-9B-Opus"><img src="https://img.shields.io/badge/Space-9B_Demo-purple?style=for-the-badge" alt="9B Space"></a>
<a href="https://huggingface.co/FINAL-Bench/Darwin-31B-Opus"><img src="https://img.shields.io/badge/🧬_Model-Darwin--31B--Opus-blue?style=for-the-badge" alt="31B"></a>
<a href="https://huggingface.co/spaces/FINAL-Bench/Darwin-31B-Opus"><img src="https://img.shields.io/badge/Space-31B_Demo-purple?style=for-the-badge" alt="31B Space"></a>
</p>
<p align="center">
<a href="https://huggingface.co/FINAL-Bench/Darwin-35B-A3B-Opus"><img src="https://img.shields.io/badge/🧬_Model-Darwin--35B--A3B--Opus-blue?style=for-the-badge" alt="35B"></a>
<a href="https://huggingface.co/spaces/FINAL-Bench/Darwin-35B-A3B-Opus"><img src="https://img.shields.io/badge/Space-35B_Demo-purple?style=for-the-badge" alt="35B Space"></a>
<a href="https://huggingface.co/FINAL-Bench/Darwin-35B-A3B-Opus-Q8-GGUF"><img src="https://img.shields.io/badge/GGUF-Q8--Official-yellow?style=for-the-badge" alt="Q8 GGUF"></a>
<a href="https://huggingface.co/bartowski/FINAL-Bench_Darwin-35B-A3B-Opus-GGUF"><img src="https://img.shields.io/badge/GGUF-bartowski-yellow?style=for-the-badge" alt="bartowski GGUF"></a>
</p>
<p align="center">
<a href="https://huggingface.co/spaces/FINAL-Bench/Leaderboard"><img src="https://img.shields.io/badge/FINAL_Bench-Leaderboard-green?style=for-the-badge" alt="FINAL Bench"></a>
<a href="https://huggingface.co/spaces/FINAL-Bench/all-bench-leaderboard"><img src="https://img.shields.io/badge/ALL_Bench-Leaderboard-orange?style=for-the-badge" alt="ALL Bench"></a>
</p>
> **World's first Transformer × Mamba evolutionary cross-architecture FFN breeding** | CLIcK 92% | MuSR 70% | A 4B model outperforming 27B | CMA-ES 42-dimensional genome search | Hybrid Vigor demonstrated | Apache 2.0
---
## What Is This?
Darwin-4B-Genesis is the 3rd generation Darwin model and the **world's first model to successfully crossbreed FFN layers across different architectures**, Transformer (Gemma4) and Mamba (Qwen3.5 GatedDeltaNet), using evolutionary optimization.
The father's Attention layers (Gemma4 Transformer) are preserved at 100%, while the mother's FFN knowledge (Qwen3.5 Mamba) is transplanted at layer-specific optimal ratios discovered automatically by CMA-ES across 42 dimensions.
The result: the child **outperforms both parents on CLIcK and MuSR**, a phenomenon known as **Hybrid Vigor**.
---
<p align="center">
<img src="tree.png" alt="Darwin-4B-Genesis" width="100%">
</p>
## Why This Matters
### 1. World First
Existing hybrid models (Jamba, Nemotron-H, Granite 4.0) are all **designed and trained from scratch**. Darwin-4B-Genesis takes **two already-trained models** from different architecture families and breeds them evolutionarily, with **zero additional training**.
### 2. Hybrid Vigor Demonstrated
| Benchmark | David (Father) | Qwen3.5-4B (Mother) | **Genesis (Child)** |
|---|---|---|---|
| CLIcK | 90% | ~50% (est.) | **92%** ✅ |
| MuSR | 65% | ~55% (est.) | **70%** ✅ |
The child surpasses **both** parents. This is the first demonstration of Hybrid Vigor in AI model breeding.
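The comparison in the table can be written as a small check. This is a minimal sketch: the scores are copied from the table above (the mother's values are the card's own estimates), and the helper names `shows_hybrid_vigor` and `margin_over_best_parent` are hypothetical, introduced only for illustration.

```python
# Scores copied from the table above, in percent.
# The mother's values are estimates, as marked "(est.)" in the table.
scores = {
    "CLIcK": {"father": 90.0, "mother": 50.0, "child": 92.0},
    "MuSR": {"father": 65.0, "mother": 55.0, "child": 70.0},
}


def shows_hybrid_vigor(row):
    """True when the child strictly beats the better of its two parents."""
    return row["child"] > max(row["father"], row["mother"])


def margin_over_best_parent(row):
    """Child score minus the better parent's score, in percentage points."""
    return row["child"] - max(row["father"], row["mother"])


for name, row in scores.items():
    print(name, shows_hybrid_vigor(row), margin_over_best_parent(row))
```

Under this definition, hybrid vigor is a per-benchmark property: the child has to beat the stronger parent, not just the average of the two.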
---
## Benchmarks
| Benchmark | Genesis | David (Gen2) | K-AI #1 (27B) |
|---|---|---|---|
| **CLIcK** (Korean culture) | **92%** | 90% | 79.4% |
| **MuSR** (multi-step reasoning) | **70%** | 65% | 60.4% |
| **GPQA** (deep reasoning) | ~60% | ~60% | n/a |
---
## How It Works
### Cross-Architecture FFN Breeding
```
Father: Darwin-4B-David (Gemma4 Transformer, hidden=2560, 42 layers)
Mother: Qwen/Qwen3.5-4B (GatedDeltaNet/Mamba, hidden=2560, 32 layers)
Key insight: hidden_size matches (2560) → direct FFN replacement possible
Method: Attention 100% from Father, FFN blended at per-layer optimal ratios
Optimizer: CMA-ES (Covariance Matrix Adaptation Evolution Strategy)
Genome: 42 dimensions (one ratio per layer)
Fitness: CLIcK 60% + MuSR 40% composite score
Frozen layers: L15, L16, L22, L23, L24, L25 (Korean language preservation)
```
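The per-layer blend described above can be sketched as a linear interpolation between the two parents' FFN weights. This is an illustration under stated assumptions, not the authors' implementation: it assumes the mother's FFN weights have already been mapped one-to-one onto the father's 42 layers with matching shapes (the card does not say how the mother's 32 layers map onto 42), and the function name `blend_ffn_layers` is hypothetical.

```python
def blend_ffn_layers(father_ffn, mother_ffn, ratios, frozen=()):
    """Blend per-layer FFN weights: child = (1 - r) * father + r * mother.

    father_ffn, mother_ffn: one entry per child layer. Entries can be floats,
        NumPy arrays, or torch tensors (anything supporting * and +).
    ratios: one blend ratio in [0, 1] per layer (the "genome").
    frozen: layer indices forced to ratio 0, so the father's weights pass through.
    """
    if not (len(father_ffn) == len(mother_ffn) == len(ratios)):
        raise ValueError("expected one father weight, mother weight, and ratio per layer")
    frozen = set(frozen)
    child = []
    for i, (f, m, r) in enumerate(zip(father_ffn, mother_ffn, ratios)):
        if i in frozen:
            r = 0.0
        if not 0.0 <= r <= 1.0:
            raise ValueError(f"ratio for layer {i} must be in [0, 1], got {r}")
        child.append((1.0 - r) * f + r * m)
    return child


# Toy usage with scalar "weights": 42 layers, a uniform 20% mother share,
# and the frozen layers listed in the card.
father = [1.0] * 42
mother = [0.0] * 42
child = blend_ffn_layers(father, mother, [0.2] * 42, frozen=(15, 16, 22, 23, 24, 25))
print(child[0], child[15])
```

Attention weights do not appear in the sketch because they are taken unchanged from the father; only the FFN tensors are interpolated.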
### Optimal Genome Discovered by CMA-ES
```
L00: 0.206  21% Qwen
L07: 0.000  Auto-protected by CMA-ES
L15: 0.000  Frozen (Korean)
L22: 0.000  Frozen (Korean)
L29: 0.291  29% Qwen (maximum)
L31: 0.244  24% Qwen
L32: 0.273  27% Qwen
```
Key finding: CMA-ES applied the **most aggressive Qwen blending to the final layers (L29-32)**, which govern output quality.
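To make the search loop concrete, here is a minimal sketch of an evolutionary search over a 42-dimensional genome with the frozen layers pinned to 0 and the 60/40 composite fitness. It is deliberately not CMA-ES: it is a simplified (mu, lambda) evolution strategy with isotropic Gaussian mutation and no covariance or step-size adaptation. The names `evolve`, `clamp_genome`, and `toy_evaluate`, the hyperparameters, and the toy objective are all hypothetical. A real run would replace `toy_evaluate` with building the blended model and scoring it on CLIcK and MuSR.

```python
import random

N_LAYERS = 42
FROZEN = {15, 16, 22, 23, 24, 25}  # frozen layers listed in the card


def composite_fitness(click_acc, musr_acc):
    """Composite score from the card: 60% CLIcK + 40% MuSR."""
    return 0.6 * click_acc + 0.4 * musr_acc


def clamp_genome(genome):
    """Keep ratios in [0, 1] and force frozen layers to 0."""
    return [0.0 if i in FROZEN else min(1.0, max(0.0, g)) for i, g in enumerate(genome)]


def evolve(evaluate, generations=20, population=16, elite=4, sigma=0.05, seed=0):
    """Simplified (mu, lambda) evolution strategy (NOT CMA-ES).

    evaluate(genome) must return (click_acc, musr_acc).
    Returns the best genome seen and its composite fitness.
    """
    rng = random.Random(seed)
    mean = [0.0] * N_LAYERS  # start from the pure father (no mother FFN)
    best_genome, best_score = mean, composite_fitness(*evaluate(mean))
    for _ in range(generations):
        # Sample candidates around the current mean, then clamp and freeze.
        candidates = [
            clamp_genome([m + rng.gauss(0.0, sigma) for m in mean])
            for _ in range(population)
        ]
        scored = sorted(
            ((composite_fitness(*evaluate(c)), c) for c in candidates),
            key=lambda pair: pair[0],
            reverse=True,
        )
        # Move the mean to the average of the elite candidates.
        top = [c for _, c in scored[:elite]]
        mean = [sum(col) / elite for col in zip(*top)]
        if scored[0][0] > best_score:
            best_score, best_genome = scored[0]
    return best_genome, best_score


def toy_evaluate(genome):
    """Hypothetical stand-in for benchmark runs; peaks when free ratios are 0.25."""
    err = sum((g - 0.25) ** 2 for i, g in enumerate(genome) if i not in FROZEN)
    return 100.0 - err, 100.0 - err


best, score = evolve(toy_evaluate, generations=5, population=8, elite=2)
print(len(best), score)
```

With the scores reported in this card, `composite_fitness(92.0, 70.0)` comes to about 83.2 for Genesis, against about 80.0 for David (`composite_fitness(90.0, 65.0)`).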
---
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained(
    "FINAL-Bench/Darwin-4B-Genesis",
    trust_remote_code=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "FINAL-Bench/Darwin-4B-Genesis",
    dtype="bfloat16",  # named torch_dtype in older transformers releases
    device_map="auto",
    trust_remote_code=True,
)

# Build a chat-formatted prompt and move the tokenized inputs to the model's device.
messages = [{"role": "user", "content": "Explain how hybrid vigor works in genetics."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Decode greedily and print only the newly generated tokens.
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=False)
print(tokenizer.decode(outputs[0][inputs['input_ids'].shape[-1]:], skip_special_tokens=True))
```
---
## Genealogy
```
google/gemma-4-E4B-it × TeichAI/Claude-Opus-Distill-E4B
  → Darwin-4B-Opus (Gen 1, DARE-TIES merge)
Darwin-4B-Opus × DavidAU/DECKARD-Expresso-Universe
  → Darwin-4B-David (Gen 2, MRI-guided merge, CLIcK 90%)
Darwin-4B-David × Qwen/Qwen3.5-4B
  → Darwin-4B-Genesis (Gen 3, Cross-Arch FFN Breeding, CLIcK 92%)
```
---
## Citation
```bibtex
@misc{vidraft_darwin_4b_genesis,
title = {Darwin-4B-Genesis: World's First Cross-Architecture FFN Breeding},
author = {VIDRAFT},
year = {2026},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/FINAL-Bench/Darwin-4B-Genesis}}
}
@article{kim2026darwin,
title={Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning},
author={Kim, Taebong and Hong, Youngsik and Kim, Minsik and Choi, Sunyoung and Jang, Jaewon and Shin, Junghoon and Kim, Minseo},
journal={arXiv preprint arXiv:2605.14386},
year={2026}
}
```