GGUF models
Collection · 17 items
These GGUF models are quantized from ibm-granite/granite-4.0-tiny-base-preview.
Granite-4.0-Tiny-Base-Preview is a 7B-parameter hybrid mixture-of-experts (MoE) language model featuring a 128k token context window. The architecture leverages Mamba-2, superimposed with softmax attention for enhanced expressiveness, and uses no positional encoding for better length generalization.
3-bit
4-bit
5-bit
6-bit
8-bit
16-bit
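As a rough guide to picking a quantization level, the on-disk size of a quantized model scales with bits per weight. The sketch below estimates sizes for a 7B-parameter model; real GGUF files differ somewhat because K-quant schemes mix bit widths and store scales and metadata, so treat these as ballpark figures only.

```python
# Rough on-disk size estimate for quantized weights of a 7B model.
# Assumption: size ~= parameter_count * bits_per_weight / 8 bytes,
# ignoring GGUF metadata and mixed-precision quant blocks.
PARAMS = 7e9  # Granite-4.0-Tiny-Base-Preview has ~7B parameters

def approx_size_gb(bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (decimal GB)."""
    return PARAMS * bits_per_weight / 8 / 1e9

for bits in (3, 4, 5, 6, 8, 16):
    print(f"{bits:>2}-bit: ~{approx_size_gb(bits):.1f} GB")
```

For example, the 4-bit variant of a 7B model lands near 3.5 GB of weights, while the 16-bit variant needs about 14 GB, which is usually the deciding factor for what fits in local RAM or VRAM.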