CPU-only inference broken with latest llama.cpp?

#4
by dinerburger - opened

Tried this model today with llama.cpp and it appears to fail when doing CPU-only inference. Here's the upstream tracking issue, but I figured you might want it on your radar as well: https://github.com/ggml-org/llama.cpp/issues/19184
