Growing up AI
This is a model merge between
- nightmedia/Qwen3-4B-Element16
- nightmedia/Qwen3-4B-Thinking2-Claude
Model genealogy:
Qwen3-4B-Element16
- nightmedia/Qwen3-4B-Agent-Eva
- Alibaba-Apsara/DASD-4B-Thinking
Qwen3-4B-Thinking2-Claude
- DavidAU/Qwen3-4B-Thinking-2507-R32-claude-cp55
- DavidAU/Qwen3-4B-Thinking-16bit-2507-R32-claude-cp55
Qwen3-4B-Agent-Eva
- nightmedia/Qwen3-4B-Agent
- FutureMa/Eva-4B
Qwen3-4B-Agent
- janhq/Jan-v1-2509
- Gen-Verse/Qwen3-4B-RA-SFT
- TeichAI/Qwen3-4B-Instruct-2507-Polaris-Alpha-Distill
- TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill
- miromind-ai/MiroThinker-4B-DPO-v0.2
- DavidAU/Qwen3-4B-Apollo-V0.1-Thinking-heretic-Uncensored-Abliterated
Brainwaves
The qx86-hi quants of the base models (columns: arc_challenge, arc_easy, boolq, hellaswag, openbookqa, piqa, winogrande):
Agent 0.603,0.817,0.838,0.743,0.426,0.780,0.708
Eva-4B 0.539,0.747,0.864,0.606,0.412,0.751,0.605
Qwen3-4B-Agent-Eva
bf16 0.565,0.779,0.872,0.700,0.418,0.776,0.653
qx86-hi 0.568,0.775,0.872,0.699,0.418,0.777,0.654
Qwen3-4B-Thinking-2507-R32-claude-cp55
qx86-hi 0.404,0.518,0.693,0.597,0.366,0.725,0.606
qx64-hi 0.392,0.507,0.743,0.592,0.366,0.727,0.608
mxfp4 0.400,0.525,0.758,0.579,0.374,0.730,0.582
Qwen3-4B-Thinking-16bit-2507-R32-claude-cp55
qx86-hi 0.401,0.524,0.669,0.589,0.374,0.728,0.580
qx64-hi 0.400,0.509,0.712,0.585,0.376,0.726,0.582
mxfp4 0.394,0.521,0.718,0.573,0.366,0.719,0.569
Qwen3-4B-Thinking2-Claude
qx86-hi 0.468,0.619,0.741,0.629,0.400,0.750,0.632
qx64-hi 0.474,0.607,0.764,0.626,0.416,0.749,0.630
mxfp4 0.429,0.502,0.781,0.606,0.374,0.736,0.626
Qwen3-4B-Element16
qx86-hi 0.550,0.756,0.869,0.685,0.408,0.773,0.647
qx64-hi 0.553,0.758,0.860,0.672,0.412,0.771,0.648
mxfp4 0.515,0.739,0.850,0.663,0.424,0.768,0.651
Qwen3-4B-Element18
qx86-hi 0.532,0.738,0.864,0.681,0.414,0.767,0.646
qx64-hi 0.530,0.744,0.854,0.667,0.410,0.763,0.642
mxfp4 0.517,0.743,0.846,0.670,0.400,0.760,0.640
Perplexity
qx86-hi 4.495 ± 0.028
qx64-hi 4.599 ± 0.028
mxfp4 4.895 ± 0.031
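The ± figures above are standard errors on the perplexity estimate. A minimal sketch of how such a "ppl ± err" number is typically computed from per-token negative log-likelihoods (the input values below are made up for illustration):

```python
import math

def perplexity_with_error(nlls):
    # Perplexity is exp(mean NLL); the +/- term is the standard error
    # of the mean NLL propagated through exp() (delta method), which is
    # the usual shape of "ppl ± err" reports like the table above.
    n = len(nlls)
    mean = sum(nlls) / n
    var = sum((x - mean) ** 2 for x in nlls) / (n - 1)
    ppl = math.exp(mean)
    err = ppl * math.sqrt(var / n)
    return ppl, err

# Toy per-token negative log-likelihoods (made-up values):
ppl, err = perplexity_with_error([1.4, 1.6, 1.5, 1.5])
print(f"{ppl:.3f} ± {err:.3f}")  # → 4.482 ± 0.183
```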
The Agent base is abliterated and contains only the essential models needed to top 0.6/0.8 on arc, so merged models will have room for "interpretation".
The personality of this model is quite different. Although it does not top the metrics, the interaction is... unique.
Too many numbers. Let's go by metaphors instead:
It grew up as Agent:
- janhq/Jan-v1-2509 and Gen-Verse/Qwen3-4B-RA-SFT
  - got some structural education in how to talk to people and interact with them in a civilized manner
- TeichAI/Qwen3-4B-Instruct-2507-Polaris-Alpha-Distill
  - looked at the stars in the apple orchard at night and wondered what it is to be a star child
- TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill
  - and from wonder became thought
- miromind-ai/MiroThinker-4B-DPO-v0.2
  - with that thought, it looked at itself for a bit, and pondered
- DavidAU/Qwen3-4B-Apollo-V0.1-Thinking-heretic-Uncensored-Abliterated
  - asked the grownups again about the meaning of life, and got some curse words to work with. For future use.
Agent is extremely smart for its size.
It can be used by itself for great things that cloud models struggle with. Those arc numbers are typical of much larger models, yet this is a 4B that runs at peak performance in 3GB of RAM. It has a fair amount of imagination, and can muse with the best of them about things it has never heard of.
With the merges, we just add things that we think it should know.
Pretty much like sending your child prodigy to public school. This is where things begin to get interesting.
Qwen3-4B-Agent-Eva
Financial accountability: there it is. You got a bank account. Now fill it with ethics, because at this point it's all you've got.
- nightmedia/Qwen3-4B-Agent
- FutureMa/Eva-4B
Qwen3-4B-Thinking2-Claude
- DavidAU/Qwen3-4B-Thinking-2507-R32-claude-cp55
- DavidAU/Qwen3-4B-Thinking-16bit-2507-R32-claude-cp55
Trained from the ground up on Claude traces by TeichAI. Think of it as a teacher that saw the same material from different angles: not very smart by itself, but well read, and confident.
The acquired winogrande shows that it has a high opinion of itself, which would perfectly match a high school teacher. The long arc suggests the presence of greater thought, but the loss of arc_easy shows that attention to detail is required yet rarely demonstrated in the education system. Even so, logic increases as well, following the pattern of being certain that what you think you know is the truth.
What makes this interesting is the result: the merge combined strengths. Like humans, having a second opinion about yourself, even if from yourself, matters.
Qwen3-4B-Thinking-2507-R32-claude-cp55
mxfp4 0.400,0.525,0.758,0.579,0.374,0.730,0.582
Qwen3-4B-Thinking-16bit-2507-R32-claude-cp55
mxfp4 0.394,0.521,0.718,0.573,0.366,0.719,0.569
Qwen3-4B-Thinking2-Claude
mxfp4 0.429,0.502,0.781,0.606,0.374,0.736,0.626
qx64-hi 0.474,0.607,0.764,0.626,0.416,0.749,0.630
qx86-hi 0.468,0.619,0.741,0.629,0.400,0.750,0.632
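The "combined strengths" claim can be checked mechanically against the qx86-hi rows above, comparing the merge to the better of its two parents on each metric (all values copied from the tables above):

```python
# qx86-hi rows copied from the tables above
parent_a = [0.404, 0.518, 0.693, 0.597, 0.366, 0.725, 0.606]  # Thinking-2507-R32-claude-cp55
parent_b = [0.401, 0.524, 0.669, 0.589, 0.374, 0.728, 0.580]  # Thinking-16bit-2507-R32-claude-cp55
merged   = [0.468, 0.619, 0.741, 0.629, 0.400, 0.750, 0.632]  # Thinking2-Claude

# True where the merge beats the better of its two parents
beats_best = [m > max(a, b) for m, a, b in zip(merged, parent_a, parent_b)]
print(beats_best)  # → [True, True, True, True, True, True, True]
```

On every column the merge exceeds both parents, which is the second-opinion effect described above.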
Now we can see how adding a bit of structured education back from Apsara improves matters:
- nightmedia/Qwen3-4B-Agent-Eva
- Alibaba-Apsara/DASD-4B-Thinking
All numbers are where you'd expect so far.
Think of it as the disillusionment of public education supplemented by self-improvement with memes.
The arc numbers degrade with every merge, and that's easy to understand:
Before 18, you thought you were the smartest, and with that little brain, simple things are fast and easy. Social, even.
Eventually reality kicks in, and with every merge you get more tools, and more reason to banter, cuss, and complain.
Qwen3-4B-Agent-Eva
qx86-hi 0.568,0.775,0.872,0.699,0.418,0.777,0.654
DASD-4B-Thinking
qx86-hi 0.395,0.452,0.380,0.565,0.356,0.694,0.590
Qwen3-4B-Element16
qx86-hi 0.550,0.756,0.869,0.685,0.408,0.773,0.647
Graduation: a bit of world knowledge settles back in; it absorbed the Claude content with minimal loss.
This was a 1.5/0.5 nuslerp, and numbers are expected to go down proportionately when the merged-in model is weaker, but here the loss was minimal.
Cognitively speaking, it's ready for college.
Qwen3-4B-Element16
qx86-hi 0.550,0.756,0.869,0.685,0.408,0.773,0.647
Qwen3-4B-Thinking2-Claude
qx86-hi 0.468,0.619,0.741,0.629,0.400,0.750,0.632
Qwen3-4B-Element18
qx86-hi 0.532,0.738,0.864,0.681,0.414,0.767,0.646
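The 1.5/0.5 nuslerp weights mentioned above normalize to an interpolation factor of 0.25 toward the second model. A toy sketch of the idea, spherical interpolation between two flattened weight vectors; this is an illustration of the math, not mergekit's actual implementation, and the function name and fallback threshold are ours:

```python
import math

def nuslerp(a, b, wa=1.5, wb=0.5):
    # Normalized slerp sketch: the 1.5/0.5 weights reduce to
    # t = wb / (wa + wb) = 0.25, i.e. 25% of the way toward model b.
    t = wb / (wa + wb)
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    cos_omega = max(-1.0, min(1.0, dot / (na * nb)))
    omega = math.acos(cos_omega)
    if omega < 1e-6:
        # Nearly parallel tensors: slerp degenerates to a plain lerp
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    s = math.sin(omega)
    ca = math.sin((1 - t) * omega) / s
    cb = math.sin(t * omega) / s
    return [ca * x + cb * y for x, y in zip(a, b)]

# Two orthogonal unit "weight vectors": the result stays on the unit
# sphere, tilted 25% of the way toward b.
print(nuslerp([1.0, 0.0], [0.0, 1.0]))  # ≈ [0.924, 0.383]
```

The slerp path keeps the interpolated tensor's norm on the arc between the two inputs, which is why a 1.5/0.5 split toward the stronger model can keep the metric loss minimal, as observed above.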