| base_model: | |
| - FINAL-Bench/Darwin-4B-David | |
| - Qwen/Qwen3.5-4B | |
| language: | |
| - ko | |
| - en | |
| - zh | |
| - ja | |
| - de | |
| - fr | |
| - es | |
| license: apache-2.0 | |
| pipeline_tag: text-generation | |
| library_name: transformers | |
| tags: | |
| - merge | |
| - evolutionary-merge | |
| - darwin | |
| - darwin-v6 | |
| - model-mri | |
| - cross-architecture | |
| - ffn-crossbreed | |
| - cma-es | |
| - hybrid-vigor | |
| - transformer-mamba | |
| - reasoning | |
| - gemma4 | |
| - qwen3.5 | |
| - gated-deltanet | |
| - korean | |
| - multilingual | |
| - gpqa | |
| - open-source | |
| - world-first | |
| model-index: | |
| - name: Darwin-4B-Genesis | |
|   results: | |
|   - task: | |
|       type: text-generation | |
|       name: Korean Cultural Understanding | |
|     dataset: | |
|       name: CLIcK | |
|       type: EunsuKim/CLIcK | |
|     metrics: | |
|     - type: accuracy | |
|       value: 92.0 | |
|       name: Accuracy | |
|       verified: false | |
|   - task: | |
|       type: text-generation | |
|       name: Multi-Step Reasoning | |
|     dataset: | |
|       name: MuSR | |
|       type: TAUR-Lab/MuSR | |
|     metrics: | |
|     - type: accuracy | |
|       value: 70.0 | |
|       name: Accuracy | |
|       verified: false | |
| # Darwin-4B-Genesis | |
| <p align="center"> | |
| <a href="https://huggingface.co/FINAL-Bench/Darwin-4B-Opus"><img src="https://img.shields.io/badge/🧬_Gen1-Darwin--4B--Opus-blue?style=for-the-badge" alt="Gen1"></a> | |
| <a href="https://huggingface.co/FINAL-Bench/Darwin-4B-David"><img src="https://img.shields.io/badge/🧬_Gen2-Darwin--4B--David-blue?style=for-the-badge" alt="Gen2"></a> | |
| <a href="https://huggingface.co/FINAL-Bench/Darwin-4B-Genesis"><img src="https://img.shields.io/badge/Gen3-Darwin--4B--Genesis-gold?style=for-the-badge" alt="Gen3"></a> | |
| </p> | |
| Darwin-4B-Genesis is presented in the paper [Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning](https://arxiv.org/abs/2605.14386). | |
| <p align="center"> | |
| <a href="https://huggingface.co/FINAL-Bench/Darwin-9B-Opus"><img src="https://img.shields.io/badge/🧬_Model-Darwin--9B--Opus-blue?style=for-the-badge" alt="9B"></a> | |
| <a href="https://huggingface.co/spaces/FINAL-Bench/Darwin-9B-Opus"><img src="https://img.shields.io/badge/Space-9B_Demo-purple?style=for-the-badge" alt="9B Space"></a> | |
| <a href="https://huggingface.co/FINAL-Bench/Darwin-31B-Opus"><img src="https://img.shields.io/badge/🧬_Model-Darwin--31B--Opus-blue?style=for-the-badge" alt="31B"></a> | |
| <a href="https://huggingface.co/spaces/FINAL-Bench/Darwin-31B-Opus"><img src="https://img.shields.io/badge/Space-31B_Demo-purple?style=for-the-badge" alt="31B Space"></a> | |
| </p> | |
| <p align="center"> | |
| <a href="https://huggingface.co/FINAL-Bench/Darwin-35B-A3B-Opus"><img src="https://img.shields.io/badge/🧬_Model-Darwin--35B--A3B--Opus-blue?style=for-the-badge" alt="35B"></a> | |
| <a href="https://huggingface.co/spaces/FINAL-Bench/Darwin-35B-A3B-Opus"><img src="https://img.shields.io/badge/Space-35B_Demo-purple?style=for-the-badge" alt="35B Space"></a> | |
| <a href="https://huggingface.co/FINAL-Bench/Darwin-35B-A3B-Opus-Q8-GGUF"><img src="https://img.shields.io/badge/GGUF-Q8--Official-yellow?style=for-the-badge" alt="Q8 GGUF"></a> | |
| <a href="https://huggingface.co/bartowski/FINAL-Bench_Darwin-35B-A3B-Opus-GGUF"><img src="https://img.shields.io/badge/GGUF-bartowski-yellow?style=for-the-badge" alt="bartowski GGUF"></a> | |
| </p> | |
| <p align="center"> | |
| <a href="https://huggingface.co/spaces/FINAL-Bench/Leaderboard"><img src="https://img.shields.io/badge/FINAL_Bench-Leaderboard-green?style=for-the-badge" alt="FINAL Bench"></a> | |
| <a href="https://huggingface.co/spaces/FINAL-Bench/all-bench-leaderboard"><img src="https://img.shields.io/badge/ALL_Bench-Leaderboard-orange?style=for-the-badge" alt="ALL Bench"></a> | |
| </p> | |
| > **World's first Transformer × Mamba evolutionary cross-architecture FFN breeding** | CLIcK 92% | MuSR 70% | A 4B model outperforming 27B | CMA-ES 42-dimensional genome search | Hybrid Vigor demonstrated | Apache 2.0 | |
| --- | |
| ## What Is This? | |
| Darwin-4B-Genesis is the third-generation Darwin model and the **world's first model to successfully crossbreed FFN layers across different architectures**, namely Transformer (Gemma4) and Mamba (Qwen3.5 GatedDeltaNet), using evolutionary optimization. | |
| The father's Attention layers (Gemma4 Transformer) are preserved at 100%, while the mother's FFN knowledge (Qwen3.5 Mamba) is transplanted at layer-specific optimal ratios discovered automatically by CMA-ES across 42 dimensions. | |
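| The result: the child **outperforms both parents on both reported benchmarks** (CLIcK and MuSR), a phenomenon known as **Hybrid Vigor**. The mother's scores are estimates, and GPQA is on par with the father at roughly 60%. | |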
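The transplant described above can be illustrated with a minimal sketch. It assumes that the father and mother FFN weight matrices for a given layer have identical shapes and that blending is a plain linear interpolation; the card does not state the exact formula or how father and mother layers are paired, so `blend_ffn`, `build_child_layer`, and the dictionary layout are hypothetical illustrations rather than the Darwin implementation.

```python
# Hypothetical sketch: attention kept from the father, FFN interpolated.
# Weights are plain nested lists so the example has no dependencies.

def blend_ffn(father_w, mother_w, ratio):
    """Linear interpolation: (1 - ratio) * father + ratio * mother."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0, 1]")
    if len(father_w) != len(mother_w):
        raise ValueError("row counts differ")
    blended = []
    for f_row, m_row in zip(father_w, mother_w):
        if len(f_row) != len(m_row):
            raise ValueError("column counts differ")
        blended.append(
            [(1.0 - ratio) * f + ratio * m for f, m in zip(f_row, m_row)]
        )
    return blended


def build_child_layer(father_layer, mother_layer, ratio):
    """Assumed layer layout: {'attention': ..., 'ffn': matrix}."""
    return {
        "attention": father_layer["attention"],  # 100% father, untouched
        "ffn": blend_ffn(father_layer["ffn"], mother_layer["ffn"], ratio),
    }


# Toy 2x2 "weights" standing in for real FFN matrices.
father = {"attention": "father-attn", "ffn": [[1.0, 2.0], [3.0, 4.0]]}
mother = {"attention": "mother-attn", "ffn": [[5.0, 6.0], [7.0, 8.0]]}

child = build_child_layer(father, mother, ratio=0.25)
print(child["attention"])  # father-attn
print(child["ffn"])        # [[2.0, 3.0], [4.0, 5.0]]
```

A ratio of 0 reproduces the father's FFN exactly, which is how a frozen or protected layer would behave under this assumption.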
| --- | |
| <p align="center"> | |
| <img src="tree.png" alt="Darwin-4B-Genesis" width="100%"> | |
| </p> | |
| ## Why This Matters | |
| ### 1. World First | |
| Existing hybrid models (Jamba, Nemotron-H, Granite 4.0) are all **designed and trained from scratch**. Darwin-4B-Genesis takes **two already-trained models** from different architecture families and breeds them evolutionarily, with **zero additional training**. | |
| ### 2. Hybrid Vigor Demonstrated | |
| | Benchmark | David (Father) | Qwen3.5-4B (Mother) | **Genesis (Child)** | | |
| |---|---|---|---| | |
| | CLIcK | 90% | ~50% (est.) | **92%** | |
| | MuSR | 65% | ~55% (est.) | **70%** | |
| The child surpasses **both** parents on CLIcK and MuSR, though the mother's scores are estimates rather than measurements. This is presented as the first demonstration of Hybrid Vigor in AI model breeding. | |
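As a small worked check of the table above, hybrid vigor can be read as the child exceeding the better of its two parents. The sketch below computes that margin per benchmark; `vigor_margin` is a name chosen for illustration, and the mother's values are the card's estimates, not measurements.

```python
# Scores copied from the table above (percent accuracy).
# The mother's values are the card's estimates.
scores = {
    "CLIcK": {"father": 90.0, "mother": 50.0, "child": 92.0},
    "MuSR": {"father": 65.0, "mother": 55.0, "child": 70.0},
}


def vigor_margin(row):
    """Points by which the child exceeds the better parent.

    A negative value would mean the child does not beat both parents.
    """
    return row["child"] - max(row["father"], row["mother"])


margins = {name: vigor_margin(row) for name, row in scores.items()}
print(margins)  # {'CLIcK': 2.0, 'MuSR': 5.0}
```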
| --- | |
| ## Benchmarks | |
| | Benchmark | Genesis | David (Gen2) | K-AI #1 (27B) | |
| |---|---|---|---| |
| | **CLIcK** (Korean culture) | **92%** | 90% | 79.4% | |
| | **MuSR** (multi-step reasoning) | **70%** | 65% | 60.4% | |
| | **GPQA** (deep reasoning) | ~60% | ~60% | n/a | |
| --- | |
| ## How It Works | |
| ### Cross-Architecture FFN Breeding | |
| ``` | |
| Father: Darwin-4B-David (Gemma4 Transformer, hidden=2560, 42 layers) | |
| Mother: Qwen/Qwen3.5-4B (GatedDeltaNet/Mamba, hidden=2560, 32 layers) | |
| Key insight: hidden_size matches (2560) → direct FFN replacement possible | |
| Method: Attention 100% from Father, FFN blended at per-layer optimal ratios | |
| Optimizer: CMA-ES (Covariance Matrix Adaptation Evolution Strategy) | |
| Genome: 42 dimensions (one ratio per layer) | |
| Fitness: CLIcK 60% + MuSR 40% composite score | |
| Frozen layers: L15, L16, L22, L23, L24, L25 (Korean language preservation) | |
| ``` | |
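The genome constraints and the composite fitness listed above can be sketched as follows. This is an illustration under stated assumptions: ratios are taken to lie in [0, 1], frozen layers are modelled by forcing their ratio to 0, and the function names `apply_constraints` and `composite_fitness` are hypothetical. Only the 42 dimensions, the frozen layer indices, and the 60/40 weighting come from the card.

```python
NUM_LAYERS = 42                          # one ratio per father layer
FROZEN_LAYERS = (15, 16, 22, 23, 24, 25)  # Korean language preservation


def apply_constraints(genome):
    """Clamp ratios to [0, 1] and zero out frozen layers (assumed behaviour)."""
    if len(genome) != NUM_LAYERS:
        raise ValueError(f"expected {NUM_LAYERS} ratios, got {len(genome)}")
    constrained = [min(1.0, max(0.0, r)) for r in genome]
    for index in FROZEN_LAYERS:
        constrained[index] = 0.0
    return constrained


def composite_fitness(click_pct, musr_pct):
    """CLIcK 60% + MuSR 40%, both given in percent."""
    return (60 * click_pct + 40 * musr_pct) / 100


genome = apply_constraints([0.3] * NUM_LAYERS)
print(sum(1 for r in genome if r == 0.0))  # 6 frozen layers

# Composite scores from the benchmark figures reported in this card.
print(composite_fitness(92, 70))  # Genesis: 83.2
print(composite_fitness(90, 65))  # David:   80.0
```

Under this weighting, the reported scores put Genesis 3.2 composite points ahead of David.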
| ### Optimal Genome Discovered by CMA-ES | |
| ``` | |
| L00: 0.206  21% Qwen | |
| L07: 0.000  Auto-protected by CMA-ES | |
| L15: 0.000  Frozen (Korean) | |
| L22: 0.000  Frozen (Korean) | |
| L29: 0.291  29% Qwen (maximum) | |
| L31: 0.244  24% Qwen | |
| L32: 0.273  27% Qwen | |
| ``` | |
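| Key finding: among the layers shown, CMA-ES applied the **most aggressive Qwen blending to the late layers (L29-32)**, which are associated here with output quality. With a 42-dimensional genome, these layers sit in the last third of the stack rather than at its very end. | |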
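To make the search over the 42-dimensional genome concrete, here is a deliberately simplified sketch. It is not CMA-ES: a plain elitist random-mutation search stands in for it, without the covariance and step-size adaptation that CMA-ES performs. The fitness is a toy stand-in (distance to a made-up target genome) because evaluating CLIcK and MuSR requires the real models; `toy_fitness`, `evolve`, and all hyperparameters are assumptions for illustration.

```python
import random

NUM_LAYERS = 42
FROZEN = {15, 16, 22, 23, 24, 25}  # frozen layers listed in this card


def constrain(genome):
    """Keep ratios in [0, 1] and pin frozen layers to 0."""
    return [
        0.0 if i in FROZEN else min(1.0, max(0.0, r))
        for i, r in enumerate(genome)
    ]


def toy_fitness(genome, target):
    """Stand-in for benchmark evaluation; higher is better."""
    return -sum((g - t) ** 2 for g, t in zip(genome, target))


def evolve(target, generations=60, population=16, sigma=0.05, seed=0):
    """Elitist random-mutation search (a simplified substitute for CMA-ES)."""
    rng = random.Random(seed)
    best = constrain([0.0] * NUM_LAYERS)
    best_fit = toy_fitness(best, target)
    for _ in range(generations):
        for _ in range(population):
            # Mutate every ratio with Gaussian noise, then re-apply constraints.
            candidate = constrain([g + rng.gauss(0.0, sigma) for g in best])
            fit = toy_fitness(candidate, target)
            if fit > best_fit:  # keep only improvements
                best, best_fit = candidate, fit
    return best, best_fit


# Made-up target: 20% blending everywhere except the frozen layers.
target = constrain([0.2] * NUM_LAYERS)
start_fit = toy_fitness(constrain([0.0] * NUM_LAYERS), target)
best, best_fit = evolve(target)
print(best_fit > start_fit)              # the search improved on the start
print(all(best[i] == 0.0 for i in FROZEN))  # frozen layers stay untouched
```

In a real run, each fitness call would build a merged model from the genome and score it on the benchmarks, which is why the number of evaluations matters far more than in this toy setting.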
| --- | |
| ## Usage | |
| ```python | |
| from transformers import AutoTokenizer, AutoModelForCausalLM | |
| # Load the tokenizer and model; trust_remote_code runs code shipped in the repository. | |
| tokenizer = AutoTokenizer.from_pretrained( | |
|     "FINAL-Bench/Darwin-4B-Genesis", | |
|     trust_remote_code=True, | |
| ) | |
| model = AutoModelForCausalLM.from_pretrained( | |
|     "FINAL-Bench/Darwin-4B-Genesis", | |
|     dtype="bfloat16",  # older transformers releases expect torch_dtype instead | |
|     device_map="auto",  # requires the accelerate package | |
|     trust_remote_code=True, | |
| ) | |
| # Build the prompt with the model's chat template. | |
| messages = [{"role": "user", "content": "Explain how hybrid vigor works in genetics."}] | |
| text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True) | |
| inputs = tokenizer(text, return_tensors="pt").to(model.device) | |
| # Greedy decoding; print only the newly generated tokens. | |
| outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=False) | |
| print(tokenizer.decode(outputs[0][inputs['input_ids'].shape[-1]:], skip_special_tokens=True)) | |
| ``` | |
| --- | |
| ## Genealogy | |
| ``` | |
| google/gemma-4-E4B-it × TeichAI/Claude-Opus-Distill-E4B | |
| → Darwin-4B-Opus (Gen 1, DARE-TIES merge) | |
| Darwin-4B-Opus × DavidAU/DECKARD-Expresso-Universe | |
| → Darwin-4B-David (Gen 2, MRI-guided merge, CLIcK 90%) | |
| Darwin-4B-David × Qwen/Qwen3.5-4B | |
| → Darwin-4B-Genesis (Gen 3, Cross-Arch FFN Breeding, CLIcK 92%) | |
| ``` | |
| --- | |
| ## Citation | |
| ```bibtex | |
| @misc{vidraft_darwin_4b_genesis, | |
| title = {Darwin-4B-Genesis: World's First Cross-Architecture FFN Breeding}, | |
| author = {VIDRAFT}, | |
| year = {2026}, | |
| publisher = {Hugging Face}, | |
| howpublished = {\url{https://huggingface.co/FINAL-Bench/Darwin-4B-Genesis}} | |
| } | |
| @article{kim2026darwin, | |
| title={Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning}, | |
| author={Kim, Taebong and Hong, Youngsik and Kim, Minsik and Choi, Sunyoung and Jang, Jaewon and Shin, Junghoon and Kim, Minseo}, | |
| journal={arXiv preprint arXiv:2605.14386}, | |
| year={2026} | |
| } | |
| ``` |