Commit History

Re-deploy llama.cpp + GGUF CPU serving (default Q4_K_M); fast CPU inference
e44cdab
verified

Bhuvandesai committed on

Revert "Migrate CPU serving to llama.cpp + Q5_K_M GGUF (was bf16 transformers)"
d031aeb

Bhuvandesai committed on

Migrate CPU serving to llama.cpp + Q5_K_M GGUF (was bf16 transformers)
564ad28

Bhuvandesai committed on