Granite 4.0 Collection Ampere's quantization formats (Q4_K_4 / Q8R16) require the Ampere-optimized llama.cpp, available here: https://hub.docker.com/r/amperecomputingai/llama.cpp • 2 items • Updated 16 days ago
GPT-OSS Collection For the gpt-oss models, we recommend using the native mxfp4 quantization. • 3 items • Updated Sep 26, 2025
Qwen 2.5 Collection Ampere's quantization formats (Q4_K_4 / Q8R16) require the Ampere-optimized llama.cpp, available here: https://hub.docker.com/r/amperecomputingai/llama.cpp • 8 items • Updated Sep 16, 2025
Qwen 3 Collection Ampere's quantization formats (Q4_K_4 / Q8R16) require the Ampere-optimized llama.cpp, available here: https://hub.docker.com/r/amperecomputingai/llama.cpp • 14 items • Updated Sep 15, 2025