@nightmedia on Hugging Face: "IBM Granite 4.1 series New models came up, here is how they compare to models…"


nightmedia

posted an update 22 days ago


IBM Granite 4.1 series

New models are out. Here is how they compare to models of similar size:

Brainwaves

quant    arc   arc/e boolq hswag obkqa piqa  wino
granite-4.1-30b
mxfp8    0.456,0.572,0.897,0.621,0.444,0.757,0.616
mxfp4    0.453,0.565,0.892,0.624,0.442,0.759,0.585
qx86-hi  0.451,0.568,0.897,0.636,0.440,0.763,0.598

granite-4.1-8b
mxfp8    0.486,0.666,0.875,0.636,0.450,0.766,0.631

granite-4.1-3b
mxfp8    0.406,0.581,0.821,0.484,0.434,0.712,0.559

Gemma-4

quant    arc   arc/e boolq hswag obkqa piqa  wino
gemma-4-E4B-it
mxfp8    0.480,0.656,0.797,0.608,0.400,0.755,0.665
mxfp4    0.455,0.607,0.851,0.585,0.402,0.744,0.651

gemma-4-E2B-it
mxfp8    0.376,0.464,0.743,0.490,0.378,0.709,0.622
mxfp4    0.380,0.451,0.762,0.494,0.374,0.699,0.594

Qwen3.5

quant    arc   arc/e boolq hswag obkqa piqa  wino
Qwen3.5-9B
mxfp8    0.417,0.458,0.623,0.634,0.338,0.737,0.639
mxfp4    0.419,0.472,0.622,0.634,0.352,0.739,0.644

Qwen3.5-4B
mxfp8    0.392,0.441,0.627,0.601,0.360,0.739,0.590
mxfp4    0.371,0.444,0.632,0.585,0.356,0.732,0.548

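Right out of the gate, IBM delivered models with better starting metrics than both Gemma and Qwen on most of these benchmarks: at mxfp8, granite-4.1-8b beats gemma-4-E4B-it and Qwen3.5-9B on everything except wino. Training these should be fun :)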
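To make that comparison easier to check, here is a minimal Python sketch that parses the mxfp8 rows from the tables above and reports a per-model average plus a head-to-head count. The unweighted mean and the helper names (`parse_row`, `mean_score`, `wins`) are assumptions made for illustration, not part of the original evaluation setup.

```python
from statistics import mean

# Column order used in the tables above.
BENCHMARKS = ["arc", "arc/e", "boolq", "hswag", "obkqa", "piqa", "wino"]

# mxfp8 rows copied verbatim from the post.
MXFP8_ROWS = {
    "granite-4.1-30b": "0.456,0.572,0.897,0.621,0.444,0.757,0.616",
    "granite-4.1-8b": "0.486,0.666,0.875,0.636,0.450,0.766,0.631",
    "granite-4.1-3b": "0.406,0.581,0.821,0.484,0.434,0.712,0.559",
    "gemma-4-E4B-it": "0.480,0.656,0.797,0.608,0.400,0.755,0.665",
    "gemma-4-E2B-it": "0.376,0.464,0.743,0.490,0.378,0.709,0.622",
    "Qwen3.5-9B": "0.417,0.458,0.623,0.634,0.338,0.737,0.639",
    "Qwen3.5-4B": "0.392,0.441,0.627,0.601,0.360,0.739,0.590",
}


def parse_row(row: str) -> dict[str, float]:
    """Turn one comma-separated score row into a benchmark -> score mapping."""
    values = [float(x) for x in row.split(",")]
    if len(values) != len(BENCHMARKS):
        raise ValueError(f"expected {len(BENCHMARKS)} scores, got {len(values)}")
    return dict(zip(BENCHMARKS, values))


SCORES = {name: parse_row(row) for name, row in MXFP8_ROWS.items()}


def mean_score(model: str) -> float:
    """Unweighted mean over the seven benchmarks (an illustrative choice)."""
    return mean(SCORES[model].values())


def wins(a: str, b: str) -> list[str]:
    """Benchmarks on which model `a` scores strictly higher than model `b`."""
    return [bench for bench in BENCHMARKS if SCORES[a][bench] > SCORES[b][bench]]


if __name__ == "__main__":
    for model in sorted(SCORES, key=mean_score, reverse=True):
        print(f"{model:18s} {mean_score(model):.3f}")
    print("gemma-4-E4B-it beats granite-4.1-8b on:", wins("gemma-4-E4B-it", "granite-4.1-8b"))
```

An unweighted mean treats all seven tasks as equally important, which may not match how you would rank models for a given use case; the per-benchmark `wins` list is the less lossy view.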

Here is the Nightmedia collection of Granite models:

https://huggingface.co/collections/nightmedia/ibm-granite-41

-G

nightmedia

21 days ago

edited 21 days ago

Abliterated model performance

treadon/granite-4.1-3b-Abliterated-AND-Disinhibited

quant    arc   arc/e boolq hswag obkqa piqa  wino
mxfp8    0.405,0.598,0.843,0.520,0.442,0.713,0.582
Quant    Perplexity      Peak Memory   Tokens/sec
mxfp8   11.595 ± 0.129   6.58 GB       1594

granite-4.1-3b
mxfp8    0.406,0.581,0.821,0.484,0.434,0.712,0.559
Quant    Perplexity      Peak Memory   Tokens/sec
mxfp8   11.346 ± 0.127   6.58 GB       1690

treadon/granite-4.1-8b-Abliterated-AND-Disinhibited

quant    arc   arc/e boolq hswag obkqa piqa  wino
mxfp8    0.496,0.692,0.864,0.666,0.466,0.770,0.632
Quant    Perplexity      Peak Memory   Tokens/sec
mxfp8    9.518 ± 0.094   11.75 GB      686

granite-4.1-8b
mxfp8    0.486,0.666,0.875,0.636,0.450,0.766,0.631
Quant    Perplexity      Peak Memory   Tokens/sec
mxfp8   10.134 ± 0.107   12.17 GB      668
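
For a quick read of what abliteration changed, the sketch below subtracts each base model's mxfp8 scores from its abliterated counterpart's, using only the rows listed above. The helper names (`deltas`, `regressions`) are hypothetical, and the differences are small enough that they may fall within run-to-run noise, which the post does not report for these benchmarks.

```python
# Column order used in the tables above.
BENCHMARKS = ["arc", "arc/e", "boolq", "hswag", "obkqa", "piqa", "wino"]

# mxfp8 rows copied verbatim from the post: (base, abliterated) per model size.
PAIRS = {
    "granite-4.1-3b": (
        "0.406,0.581,0.821,0.484,0.434,0.712,0.559",
        "0.405,0.598,0.843,0.520,0.442,0.713,0.582",
    ),
    "granite-4.1-8b": (
        "0.486,0.666,0.875,0.636,0.450,0.766,0.631",
        "0.496,0.692,0.864,0.666,0.466,0.770,0.632",
    ),
}


def parse_row(row: str) -> list[float]:
    """Split one comma-separated score row into floats."""
    return [float(x) for x in row.split(",")]


def deltas(model: str) -> dict[str, float]:
    """Abliterated minus base score per benchmark, rounded to 3 decimals."""
    base, abliterated = (parse_row(r) for r in PAIRS[model])
    return {
        bench: round(a - b, 3)
        for bench, b, a in zip(BENCHMARKS, base, abliterated)
    }


def regressions(model: str) -> list[str]:
    """Benchmarks where the abliterated variant scores lower than the base."""
    return [bench for bench, d in deltas(model).items() if d < 0]


if __name__ == "__main__":
    for model in PAIRS:
        print(model, deltas(model), "regressed on:", regressions(model))
```

Positive values mean the abliterated variant scored higher; the perplexity, memory, and throughput columns are left out here and would need the same side-by-side treatment.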

nightmedia Gheorghe Chesler