nightmedia 
posted an update 22 days ago
IBM Granite 4.1 series

New models came out; here is how they compare to models of similar size:

Brainwaves
quant    arc   arc/e boolq hswag obkqa piqa  wino
granite-4.1-30b
mxfp8    0.456,0.572,0.897,0.621,0.444,0.757,0.616
mxfp4    0.453,0.565,0.892,0.624,0.442,0.759,0.585
qx86-hi  0.451,0.568,0.897,0.636,0.440,0.763,0.598

granite-4.1-8b
mxfp8    0.486,0.666,0.875,0.636,0.450,0.766,0.631

granite-4.1-3b
mxfp8    0.406,0.581,0.821,0.484,0.434,0.712,0.559


Gemma-4

quant    arc   arc/e boolq hswag obkqa piqa  wino
gemma-4-E4B-it
mxfp8    0.480,0.656,0.797,0.608,0.400,0.755,0.665
mxfp4    0.455,0.607,0.851,0.585,0.402,0.744,0.651

gemma-4-E2B-it
mxfp8    0.376,0.464,0.743,0.490,0.378,0.709,0.622
mxfp4    0.380,0.451,0.762,0.494,0.374,0.699,0.594


Qwen3.5

quant    arc   arc/e boolq hswag obkqa piqa  wino
Qwen3.5-9B
mxfp8    0.417,0.458,0.623,0.634,0.338,0.737,0.639
mxfp4    0.419,0.472,0.622,0.634,0.352,0.739,0.644

Qwen3.5-4B
mxfp8    0.392,0.441,0.627,0.601,0.360,0.739,0.590
mxfp4    0.371,0.444,0.632,0.585,0.356,0.732,0.548


Right out of the gate, IBM delivered models with better starting metrics than both Gemma and Qwen: at mxfp8, granite-4.1-8b leads gemma-4-E4B-it and Qwen3.5-9B on every column except wino. Training these should be fun :)
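To make the comparison easier to check, here is a minimal Python sketch that lists the columns where granite-4.1-8b scores higher than the other two models. The scores are copied from the mxfp8 rows above. The helper names (`wins`, `mean`) and the unweighted mean are assumptions added for illustration, not part of the original evaluation.

```python
# Scores copied from the mxfp8 rows above, in the column order used by the tables.
BENCHMARKS = ["arc", "arc/e", "boolq", "hswag", "obkqa", "piqa", "wino"]

MXFP8 = {
    "granite-4.1-8b": [0.486, 0.666, 0.875, 0.636, 0.450, 0.766, 0.631],
    "gemma-4-E4B-it": [0.480, 0.656, 0.797, 0.608, 0.400, 0.755, 0.665],
    "Qwen3.5-9B":     [0.417, 0.458, 0.623, 0.634, 0.338, 0.737, 0.639],
}


def wins(model, baseline):
    """Return the columns where `model` scores strictly higher than `baseline`."""
    pairs = zip(BENCHMARKS, MXFP8[model], MXFP8[baseline])
    return [name for name, ours, theirs in pairs if ours > theirs]


def mean(scores):
    """Unweighted mean over the seven columns (an illustrative summary only)."""
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for name, scores in MXFP8.items():
        print(f"{name:16s} mean={mean(scores):.3f}")
    for other in ("gemma-4-E4B-it", "Qwen3.5-9B"):
        better = ", ".join(wins("granite-4.1-8b", other))
        print(f"granite-4.1-8b > {other} on: {better}")
```

The unweighted mean treats every column equally, which hides the fact that the gaps are very uneven (boolq is far apart, hswag against Qwen3.5-9B is nearly tied), so the per-column list is the more informative output.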

Here is the Nightmedia collection of Granite models:

https://huggingface.co/collections/nightmedia/ibm-granite-41

-G

Abliterated model performance

treadon/granite-4.1-3b-Abliterated-AND-Disinhibited

quant    arc   arc/e boolq hswag obkqa piqa  wino
mxfp8    0.405,0.598,0.843,0.520,0.442,0.713,0.582
Quant    Perplexity      Peak Memory   Tokens/sec
mxfp8   11.595 ± 0.129   6.58 GB       1594

granite-4.1-3b
mxfp8    0.406,0.581,0.821,0.484,0.434,0.712,0.559
Quant    Perplexity      Peak Memory   Tokens/sec
mxfp8   11.346 ± 0.127   6.58 GB       1690

treadon/granite-4.1-8b-Abliterated-AND-Disinhibited

quant    arc   arc/e boolq hswag obkqa piqa  wino
mxfp8    0.496,0.692,0.864,0.666,0.466,0.770,0.632
Quant    Perplexity      Peak Memory   Tokens/sec
mxfp8    9.518 ± 0.094   11.75 GB      686

granite-4.1-8b
mxfp8    0.486,0.666,0.875,0.636,0.450,0.766,0.631
Quant    Perplexity      Peak Memory   Tokens/sec
mxfp8   10.134 ± 0.107   12.17 GB      668
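
The abliterated and base rows above are easier to read as differences. The following sketch subtracts the base numbers from the abliterated ones for each size. All values are copied from the mxfp8 rows above. The dictionary layout and the `deltas` helper are hypothetical names introduced for illustration, and the perplexity error bars are not taken into account.

```python
BENCHMARKS = ["arc", "arc/e", "boolq", "hswag", "obkqa", "piqa", "wino"]

# (benchmark scores, perplexity, tokens/sec), copied from the mxfp8 rows above.
RUNS = {
    "3b": {
        "abliterated": ([0.405, 0.598, 0.843, 0.520, 0.442, 0.713, 0.582], 11.595, 1594),
        "base":        ([0.406, 0.581, 0.821, 0.484, 0.434, 0.712, 0.559], 11.346, 1690),
    },
    "8b": {
        "abliterated": ([0.496, 0.692, 0.864, 0.666, 0.466, 0.770, 0.632], 9.518, 686),
        "base":        ([0.486, 0.666, 0.875, 0.636, 0.450, 0.766, 0.631], 10.134, 668),
    },
}


def deltas(size):
    """Return (per-column score deltas, perplexity delta), abliterated minus base."""
    abl_scores, abl_ppl, _ = RUNS[size]["abliterated"]
    base_scores, base_ppl, _ = RUNS[size]["base"]
    score_delta = {
        name: round(a - b, 3)
        for name, a, b in zip(BENCHMARKS, abl_scores, base_scores)
    }
    return score_delta, round(abl_ppl - base_ppl, 3)


if __name__ == "__main__":
    for size in RUNS:
        score_delta, ppl_delta = deltas(size)
        print(size, "perplexity delta:", ppl_delta)
        for name, d in score_delta.items():
            print(f"  {name:6s} {d:+.3f}")
```

A positive score delta means the abliterated model scored higher on that column; a positive perplexity delta means its perplexity is higher (worse) than the base model's.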