Granite 4.0 Collection: Ampere's quantization formats (Q4_K_4 / Q8R16) require the Ampere-optimized llama.cpp, available here: https://hub.docker.com/r/amperecomputingai/llama.cpp • 2 items • Updated Jan 13
GPT-OSS Collection: For gpt-oss models we recommend using the native mxfp4 quantization. • 3 items • Updated Sep 26, 2025
Qwen 2.5 Collection: Ampere's quantization formats (Q4_K_4 / Q8R16) require the Ampere-optimized llama.cpp, available here: https://hub.docker.com/r/amperecomputingai/llama.cpp • 8 items • Updated Sep 16, 2025
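Several of the collections above note that Ampere's Q4_K_4 / Q8R16 quantization formats require the Ampere-optimized llama.cpp image from Docker Hub. A minimal sketch of pulling and running that image follows; the image tag, host mount path, model filename, and the assumption that the container exposes llama.cpp-style flags are all illustrative, not taken from the listing:

```shell
# Pull the Ampere-optimized llama.cpp image
# ("latest" tag is an assumption; check the repo's Tags page).
docker pull amperecomputingai/llama.cpp:latest

# Run it against a locally downloaded GGUF file in one of
# Ampere's formats. The host path and model filename below
# are placeholders for your own download.
docker run --rm -it \
  -v /path/to/models:/models \
  amperecomputingai/llama.cpp:latest \
  -m /models/model.Q8R16.gguf -p "Hello"
```

The Q4_K_4 / Q8R16 models will not load in a stock upstream llama.cpp build, which is why the Ampere image is called out on each collection card.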