SeaWolf-AI
posted an update 2 days ago
🧬 Darwin V6: Diagnostic-Guided Evolutionary Model Merging

We are releasing Darwin-31B-Opus, a reasoning-enhanced model that merges Google's Gemma-4-31B-it and TeichAI's Claude Opus Distill using the Darwin V6 engine.

Model: FINAL-Bench/Darwin-31B-Opus
Demo: FINAL-Bench/Darwin-31B-Opus

🔬 What Darwin V6 Does

Conventional merging tools (mergekit, etc.) apply a single ratio to all tensors. Set ratio=0.5 and all 1,188 tensors blend identically, with no distinction between which tensors matter for reasoning versus coding.
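
For reference, uniform merging can be sketched roughly as below. This is only an illustration, not mergekit's actual API: the function name `uniform_merge`, the toy tensor names, and the convention that the ratio is the weight given to the Mother are all assumptions made for this example.

```python
def uniform_merge(father, mother, ratio=0.5):
    """Blend every tensor with the same ratio.

    Assumption for this sketch: `ratio` is the weight given to the mother,
    and each model is a dict of tensor name -> flat list of floats.
    """
    merged = {}
    for name, f_vals in father.items():
        m_vals = mother[name]
        # Same linear interpolation for every tensor, regardless of its role.
        merged[name] = [(1.0 - ratio) * f + ratio * m
                        for f, m in zip(f_vals, m_vals)]
    return merged


# Toy "models" with two tensors each (hypothetical names and values).
father = {"attn.q": [1.0, 2.0], "ffn.up": [0.0, 4.0]}
mother = {"attn.q": [3.0, 2.0], "ffn.up": [2.0, 0.0]}
merged = uniform_merge(father, mother, ratio=0.5)
```

Every tensor here gets the same 50/50 blend, which is the behavior Darwin V6 replaces with per-tensor ratios.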

Darwin V6 diagnoses both parents at the tensor level before merging. It measures Shannon entropy, standard deviation, and L2 norm for every tensor, then passes 5 diagnostic probes (REASONING, CODE, MATH, KNOWLEDGE, LANGUAGE) through the model to determine layer-wise functional importance. Each of the 1,188 tensors receives an independent optimal ratio.
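
The three static measurements could be computed along these lines. This is a minimal sketch and not the Darwin V6 code: the post does not say how entropy is estimated, so the histogram binning, the bin count, and the name `tensor_stats` are assumptions.

```python
import math


def tensor_stats(values, bins=16):
    """Return (entropy_bits, std, l2_norm) for a flat list of weights.

    Entropy is estimated from a fixed-width histogram of the values;
    the binning scheme is an assumption of this sketch.
    """
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    l2 = math.sqrt(sum(v * v for v in values))

    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant tensors
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # clamp the max value
        counts[idx] += 1

    # Shannon entropy (in bits) of the empirical bin distribution.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts if c)
    return entropy, std, l2


entropy, std, l2 = tensor_stats([0.0, 0.0, 1.0, 1.0], bins=2)
```

In a real pipeline these statistics would be computed per tensor for both parents and then combined with the probe results.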

combined = static(entropy/std/norm) x 0.4 + probe(cosine_distance) x 0.6
final_ratio = mri_ratio x mri_trust + genome_ratio x (1 - mri_trust)
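
As a quick illustration, the two formulas above translate directly into code. The function names and the example numbers are made up for this sketch, and it assumes all scores and ratios are already normalized to [0, 1]; how the combined score is turned into `mri_ratio` is not specified in this post and is not modeled here.

```python
def combined_importance(static_score, probe_score,
                        w_static=0.4, w_probe=0.6):
    """Weighted mix of the static score and the probe score (0.4 / 0.6)."""
    return w_static * static_score + w_probe * probe_score


def final_ratio(mri_ratio, genome_ratio, mri_trust):
    """Blend the diagnostic (MRI) ratio with the evolved genome ratio.

    mri_trust = 1.0 uses only the MRI ratio; 0.0 uses only the genome ratio.
    """
    return mri_ratio * mri_trust + genome_ratio * (1.0 - mri_trust)


# Hypothetical values for a single tensor.
importance = combined_importance(static_score=0.5, probe_score=1.0)
ratio = final_ratio(mri_ratio=0.9, genome_ratio=0.5, mri_trust=0.75)
print(round(importance, 3), round(ratio, 3))
```

With these example inputs the tensor ends up weighted at roughly 0.8, pulled most of the way toward the MRI ratio because `mri_trust` is high.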

When one parent is overwhelmingly superior for a tensor (ratio < 0.15 or > 0.85), Darwin transplants it directly without interpolation. The mri_trust parameter itself is optimized by CMA-ES evolutionary search, so optimal transplant intensity is determined automatically. After merging, a Health Check compares the child against both parents layer-by-layer to detect interference or function loss.
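
The transplant rule can be sketched as follows. This is an assumption-laden illustration rather than the actual engine: the name `merge_tensor` is hypothetical, tensors are flat lists of floats, and the ratio is taken to be the weight given to the Mother (consistent with ffn_ratio=0.93 favoring the Mother in the results).

```python
def merge_tensor(f_vals, m_vals, ratio, low=0.15, high=0.85):
    """Merge one tensor, copying a parent outright at extreme ratios.

    Assumption: `ratio` is the weight given to the mother, so values
    below `low` keep the father and values above `high` keep the mother.
    """
    if ratio < low:
        return list(f_vals), "transplant_father"
    if ratio > high:
        return list(m_vals), "transplant_mother"
    # Otherwise fall back to ordinary linear interpolation.
    blended = [(1.0 - ratio) * f + ratio * m
               for f, m in zip(f_vals, m_vals)]
    return blended, "interpolate"


father_t, mother_t = [1.0, 2.0], [3.0, 4.0]
print(merge_tensor(father_t, mother_t, 0.93)[1])
print(merge_tensor(father_t, mother_t, 0.50)[1])
print(merge_tensor(father_t, mother_t, 0.10)[1])
```

Copying a tensor whole at the extremes avoids diluting a clearly better parent with a small fraction of the weaker one.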

🧬 Parent Models
Father: google/gemma-4-31B-it
Mother: TeichAI/gemma-4-31B-it-Claude-Opus-Distill

🧬 Results
Compared under identical conditions (same 50 questions, same seed, greedy, thinking mode):
Father: 60.0% (30/50)
Darwin-31B-Opus: 66.0% (33/50), a +10% relative improvement
ARC-Challenge: 82.89% (loglikelihood, zero-shot, 200 questions)
Optimal genome found by evolution:
ffn_ratio=0.93: FFN layers strongly favor Mother (Claude Opus Distill)
block_5 (L50-L59)=0.86 and more...
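
For anyone checking the numbers, the +10% figure is the relative gain over the Father, which corresponds to 6 percentage points in absolute terms. A small sketch of the arithmetic, using only the counts reported above (the variable names are just for illustration):

```python
def accuracy(correct, total):
    """Fraction of questions answered correctly."""
    return correct / total


father_acc = accuracy(30, 50)  # Father: 30/50
child_acc = accuracy(33, 50)   # Darwin-31B-Opus: 33/50

absolute_gain = child_acc - father_acc      # difference in accuracy
relative_gain = absolute_gain / father_acc  # gain relative to the Father

print(f"absolute: {absolute_gain * 100:.1f} points")
print(f"relative: {relative_gain * 100:.1f}%")
```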

Are you thinking about making smaller Darwins?


We already have a smaller model here: https://huggingface.co/FINAL-Bench/Darwin-9B-Opus

Do you think an even smaller one is needed?

I like it!

Do you have a paper documenting what "MRI" means?


MRI (Model Region Imaging)
MRI stands for Model Region Imaging, a framework for scanning a black-box LLM to detect and visualize functional regions such as language, mathematics, reasoning, and coding.