Hybrid Intelligence 0.5B


This is the first public checkpoint of a hybrid intelligence system from Merlin Research.

Hybrid intelligence means the system is neither purely statistical (an LLM) nor purely symbolic: it couples a language model with a neuromorphic Biological Neural Network (BNN) that observes, evaluates, and selects the LLM's outputs in real time. The two components evolve together: the LLM generates, the BNN judges, and both improve from the same stream of experience.

Architecture: Two Systems, One Loop


The LLM (Falcon H1 0.5B) generates multiple candidate answers. The BNN encodes uncertainty signals as neuromorphic spike trains and selects the best candidate. The correctness of that selection feeds back as training signal for both the BNN and (via DPO) the LLM itself.
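One iteration of the loop can be sketched as follows. All function names here are illustrative stand-ins, not the actual Merlin API:

```python
def hybrid_step(prompt, generate, bnn_score, check_answer):
    """One hybrid-loop iteration: the LLM proposes, the BNN selects,
    and the outcome becomes a training signal for both components."""
    candidates = generate(prompt, n=4)                   # LLM: sample candidates
    scores = [bnn_score(prompt, c) for c in candidates]  # BNN: spike-based scoring
    chosen = candidates[scores.index(max(scores))]       # select the best candidate
    reward = check_answer(prompt, chosen)                # correctness feedback
    # In the real system, `reward` would feed BNN retraining and DPO pair collection.
    return chosen, reward

# Toy usage with stub components
chosen, reward = hybrid_step(
    "2+2?",
    generate=lambda p, n: ["3", "4", "5", "22"],
    bnn_score=lambda p, c: 1.0 if c == "4" else 0.0,
    check_answer=lambda p, a: 1.0 if a == "4" else 0.0,
)
```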

The BNN Component

The BNN is inspired by biological neural circuits. It uses Leaky Integrate-and-Fire (LIF) neurons with four time scales (decay constants: 0.70, 0.80, 0.85, 0.95) and generates spikes via Poisson statistics, the same model used to describe real neuron firing in cortex. This gives the selector a temporal memory of the generation process, not just a snapshot.


Runs entirely in pure NumPy: no GPU, no special hardware. Total weights: ~8 KB.
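A minimal NumPy sketch of the encoding idea: a scalar uncertainty signal drives a Poisson-style spike train, which LIF neurons integrate at the four quoted decay time scales. Function and parameter names are illustrative, not the Merlin implementation:

```python
import numpy as np

# Four LIF decay constants (time scales) quoted in the model card
DECAYS = np.array([0.70, 0.80, 0.85, 0.95])

def encode_uncertainty(signal, steps=20, seed=0):
    """Encode a scalar uncertainty signal in [0, 1] as a Poisson-style
    spike train, then integrate it with LIF neurons at four time scales."""
    rng = np.random.default_rng(seed)
    spikes = (rng.random(steps) < signal).astype(float)  # Bernoulli spikes per step
    v = np.zeros_like(DECAYS)                 # membrane potentials, one per time scale
    trace = np.empty((steps, len(DECAYS)))
    for t, s in enumerate(spikes):
        v = DECAYS * v + s                    # leaky integration of spike input
        trace[t] = v
    return trace

trace = encode_uncertainty(0.8)
```

The slower-decaying neurons (larger constants) retain a longer memory of past spikes, which is what gives the selector its temporal view of the generation process.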

Key Discovery: Calibration Inversion

A small LLM is systematically more confident on wrong answers than on right ones.

We measured first-token entropy across thousands of hybrid loop iterations. Correct answers show higher entropy and lower probability margin than wrong ones (t=2.28 and t=−3.41, respectively). The LLM "hesitates" more when it is actually correct.

This is the core insight the BNN learned to exploit. Rather than trusting the model's confidence, the hybrid system uses neuromorphic signals to see past the model's miscalibration and identify the genuinely better answer.
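The two signals can be computed directly from a first-token logit vector. A self-contained sketch (the hybrid system measures these on the LLM's actual vocabulary-sized distribution):

```python
import math

def first_token_stats(logits):
    """Entropy and top-2 probability margin of a first-token distribution."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]   # stable softmax
    z = sum(exps)
    probs = sorted((e / z for e in exps), reverse=True)
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    margin = probs[0] - probs[1]               # gap between top two candidates
    return entropy, margin

# A flatter distribution: higher entropy, lower margin ("hesitation")
h1, m1 = first_token_stats([2.0, 1.8, 1.5])
# A peaked distribution: lower entropy, higher margin (confidence)
h2, m2 = first_token_stats([5.0, 1.0, 0.5])
```

Under the calibration-inversion finding, the first pattern is the one that more often accompanies a correct answer.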

How the System Was Built: 30,000 Experiments

Merlin runs 6 autonomous researchers every night (01:00–07:00):

Process          Role
hybrid           Main hybrid loop: generates, encodes, selects, evaluates
bnn_trainer      Retrains the BNN every 5 min from accumulated experience
candidate_pool   Generates diverse candidates (4 sampling strategies)
neuro_coupling   BNN-guided token-by-token temperature adjustment
ml               Collects DPO preference pairs for LLM fine-tuning
meta_analyzer    Updates evolutionary mutation weights before each session

Encoder parameters (pulse width, burst count, frequency, entropy scale) are found by evolutionary search: propose a mutation, run 100 benchmark questions, and keep the mutation only if the improvement is ≥ 0.5 pp. This process ran for ~30,000 experiments and produced 38+ confirmed improvements before this checkpoint.
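The accept-if-improved rule amounts to a greedy hill climb. A toy sketch with a single stand-in parameter and a pretend benchmark (names and the toy objective are illustrative):

```python
import random

def evolutionary_search(params, benchmark, mutate, trials=200, min_gain=0.005):
    """Greedy evolutionary search: propose a mutation, evaluate it, and
    keep it only if accuracy improves by at least min_gain (0.5 pp)."""
    best = benchmark(params)
    for _ in range(trials):
        candidate = mutate(params)
        score = benchmark(candidate)      # e.g. accuracy on 100 questions
        if score - best >= min_gain:      # accept only clear improvements
            params, best = candidate, score
    return params, best

# Toy stand-in: one encoder parameter with a known optimum at 0.7
random.seed(42)
benchmark = lambda p: max(0.0, 1.0 - abs(p - 0.7))   # pretend accuracy in [0, 1]
mutate = lambda p: p + random.uniform(-0.05, 0.05)
params, best = evolutionary_search(0.2, benchmark, mutate)
```

The 0.5 pp acceptance threshold trades exploration speed for robustness: small noisy gains on a 100-question benchmark are discarded rather than accumulated.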

Results

System                             Accuracy
Raw Falcon H1 0.5B (baseline)      21.0%
Hybrid Intelligence (BNN + LLM)    ~26–28%

+5–7 percentage points improvement. The gap comes entirely from the hybrid loop; the BNN selector adds no user-perceivable latency (~1 ms overhead).

DPO Fine-Tuning

The LLM component was fine-tuned with DPO on 4,234 preference pairs collected autonomously by the ml researcher over multiple nights.

  • LoRA: r=16, α=32, target modules: q_proj + v_proj
  • β=0.1, 3 epochs, cosine schedule
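These hyperparameters correspond roughly to the following peft/trl configuration (a sketch; exact argument names assume recent peft and trl releases, and are not taken from the Merlin training code):

```python
from peft import LoraConfig
from trl import DPOConfig

# LoRA adapter on the attention query/value projections (r=16, alpha=32)
lora_config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])

# DPO settings matching the card: beta=0.1, 3 epochs, cosine LR schedule
dpo_config = DPOConfig(beta=0.1, num_train_epochs=3, lr_scheduler_type="cosine")
```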

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "MerlinSafety/falcon-h1-0.5b-dpo",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "MerlinSafety/falcon-h1-0.5b-dpo",
    trust_remote_code=True,
)

prompt = "Question: What is the capital of France?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=40, do_sample=False)
# Decode only the newly generated tokens, not the prompt
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))

Status & Roadmap

This is Checkpoint #1. The hybrid loop continues to run and improve. Planned next steps:

  • Stronger base model (Qwen2.5-Math-1.5B or any Qwen3.5)
  • Scale DPO dataset to 10,000+ pairs
  • Online BNN adaptation during inference
  • Multi-model candidate pool
  • We hope to collaborate with Cortical Labs on running the hybrid loop on biological neurons (CL1) as a true wetware selector

Merlin Research: building hybrid intelligence, one checkpoint at a time.
