Space: build-small-hackathon/Case-Lantern
lastmass committed 5 days ago

Commit 01edaff (1 parent: 61f49a8)

Upgrade llama-cpp-python CPU wheel

Files changed (2)

README.md CHANGED

@@ -52,11 +52,12 @@ launch and cached.
 If you deploy on **ZeroGPU**, keep the prebuilt CPU `llama-cpp-python` wheel.
 The `requirements.txt` file uses the CPU wheel index
-(`llama-cpp-python/whl/cpu`) plus `--only-binary=llama-cpp-python`, so the Space
-will fail fast instead of trying to compile llama.cpp from source. Do not use the
-CUDA wheel URL (`llama-cpp-python/whl/cu124`) unless the Space image also
-provides CUDA runtime libraries such as `libcudart.so.12`; otherwise model
-loading can fail when the first button click triggers inference.
+(`llama-cpp-python/whl/cpu`) plus `--only-binary=llama-cpp-python`, and pins to
+the latest available prebuilt wheel in that index. This keeps the Space from
+trying to compile llama.cpp from source. Do not use the CUDA wheel URL
+(`llama-cpp-python/whl/cu124`) unless the Space image also provides CUDA runtime
+libraries such as `libcudart.so.12`; otherwise model loading can fail when the
+first button click triggers inference.
 - Set `DEMO_MODE=auto` (default) to allow a graceful scripted fallback if the
   model cannot load.
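
The README hunk above describes two behaviors: model loading can fail when a CUDA runtime library such as `libcudart.so.12` is missing, and `DEMO_MODE=auto` allows a scripted fallback. A minimal Python sketch of how that could look follows. The helper names `shared_library_loads` and `load_backend`, the `("model", ...)` / `("scripted", None)` return labels, and the choice to re-raise for any value other than `auto` are all hypothetical; the Space's actual code is not shown in this commit.

```python
import ctypes
import os
from typing import Callable, Optional, Tuple


def shared_library_loads(name: str) -> bool:
    """Return True if the dynamic linker can load the named shared library."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        # Raised when the library is missing or cannot be loaded.
        return False


def load_backend(
    load_model: Callable[[], object],
    demo_mode: Optional[str] = None,
) -> Tuple[str, Optional[object]]:
    """Try to load the model; fall back to a scripted demo when DEMO_MODE=auto."""
    # The default mirrors the README: DEMO_MODE=auto unless overridden.
    mode = (demo_mode or os.environ.get("DEMO_MODE", "auto")).lower()
    try:
        return "model", load_model()
    except Exception:
        if mode == "auto":
            # Graceful fallback: the caller serves scripted responses instead.
            return "scripted", None
        # Assumption: any non-auto value surfaces the original error.
        raise
```

Calling `shared_library_loads("libcudart.so.12")` at startup would be one way to surface a missing CUDA runtime before the first button click rather than during it.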

requirements.txt CHANGED

@@ -2,4 +2,4 @@
 --only-binary=llama-cpp-python
 gradio==6.15.2
-llama-cpp-python==0.3.22
+llama-cpp-python==0.3.25
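
Because the pin moves from `0.3.22` to `0.3.25`, it may be useful to confirm that the running Space actually installed the pinned wheel. The sketch below parses `==` pins from requirements text and compares one against the installed distribution. `parse_pins` and `pin_satisfied` are hypothetical helper names, and the sample text contains only the lines visible in this hunk (the first line of the file is not shown in the diff and is not reproduced here).

```python
import re
from importlib import metadata
from typing import Dict


def parse_pins(requirements_text: str) -> Dict[str, str]:
    """Collect `name==version` pins, skipping comments and pip option lines."""
    pins: Dict[str, str] = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # blank line, comment, or option such as --only-binary
        match = re.fullmatch(r"([A-Za-z0-9._-]+)==([A-Za-z0-9.+!_-]+)", line)
        if match:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def pin_satisfied(name: str, pins: Dict[str, str]) -> bool:
    """Return True if the installed version of `name` equals its pin."""
    try:
        return metadata.version(name) == pins.get(name.lower())
    except metadata.PackageNotFoundError:
        return False


SAMPLE = """\
--only-binary=llama-cpp-python
gradio==6.15.2
llama-cpp-python==0.3.25
"""

pins = parse_pins(SAMPLE)
```

With this sample, `pins` maps `gradio` to `6.15.2` and `llama-cpp-python` to `0.3.25`; `pin_satisfied("llama-cpp-python", pins)` then depends on what is installed in the environment.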