These are some of the Q4_K_4 quantized models I made myself. They work *ONLY* with Ampere-optimized llama.cpp / ollama and will not work with anything else.