Space: build-small-hackathon/tiny-dispatch-coach
umr2015 committed 3 days ago

Commit dcb0d04 (verified), 1 parent: 03b3034

Keep CPU Basic Space running with MiniCPM5 fallback

Files changed (3)

README.md CHANGED

@@ -41,19 +41,25 @@ Spaces, and models under 32B parameters.
 - Model repo: `openbmb/MiniCPM5-1B-GGUF`
 - File: `MiniCPM5-1B-Q4_K_M.gguf`
 - Parameter count: `1.08B`
-- Runtime target: local GGUF through `llama-cpp-python`
 - Cloud LLM APIs: none
-If the local model runtime is unavailable during a cold start, the app falls
-back to a deterministic parser and makes that visible in the parser trace. The
-route optimizer never depends on hidden model output: every route, time window,
-lateness minute, and baseline delta is computed deterministically.
 ## Current Scope
 Included now:
-- MiniCPM5 text constraint parsing.
 - Capacity-safe multi-trip route planning.
 - Manual baseline comparison.
 - Synthetic sample data only.

 - Model repo: `openbmb/MiniCPM5-1B-GGUF`
 - File: `MiniCPM5-1B-Q4_K_M.gguf`
 - Parameter count: `1.08B`
+- Runtime target: local GGUF through `llama-cpp-python` when the Space runtime
+  has a prebuilt llama.cpp wheel or enough memory for the dependency
 - Cloud LLM APIs: none
+The public CPU Basic Space keeps the app responsive by treating the MiniCPM5
+runtime as optional. If `llama-cpp-python` is unavailable during a cold start,
+the app falls back to a deterministic parser and makes that visible in the
+parser trace. On a larger Space or local machine with `llama-cpp-python`
+installed, the same code path downloads `openbmb/MiniCPM5-1B-GGUF` and uses
+MiniCPM5 for note parsing.
+The route optimizer never depends on hidden model output: every route, time
+window, lateness minute, and baseline delta is computed deterministically.
 ## Current Scope
 Included now:
+- MiniCPM5-ready text constraint parsing with deterministic CPU fallback.
 - Capacity-safe multi-trip route planning.
 - Manual baseline comparison.
 - Synthetic sample data only.
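
The fallback behavior described in the README change can be sketched as follows. This is a minimal illustration, not the code in `app.py`: the function names `parse_notes` and `parse_notes_fallback`, the `HH:MM-HH:MM` time-window regex, and the shape of the returned dictionary are all assumptions made for the example. The only point it demonstrates is that a missing or failing model parser yields the same constraint schema plus a visible trace entry.

```python
import re


def parse_notes_fallback(note: str) -> dict:
    """Hypothetical deterministic parser: extract one HH:MM-HH:MM window."""
    match = re.search(r"(\d{1,2}:\d{2})\s*-\s*(\d{1,2}:\d{2})", note)
    window = [match.group(1), match.group(2)] if match else None
    return {"time_window": window, "parser": "deterministic-fallback"}


def parse_notes(note: str, model_parser=None) -> dict:
    """Use the model parser when one is supplied and it succeeds.

    Otherwise fall back to the deterministic parser and record why in
    a "trace" field so the fallback is visible to the user.
    """
    if model_parser is not None:
        try:
            result = dict(model_parser(note))
            result["parser"] = "minicpm5"
            return result
        except Exception as exc:  # any model failure triggers the fallback
            fallback = parse_notes_fallback(note)
            fallback["trace"] = f"model parser failed: {exc}"
            return fallback
    fallback = parse_notes_fallback(note)
    fallback["trace"] = "model runtime unavailable"
    return fallback
```

Calling `parse_notes("Deliver 09:00-11:30 at dock B")` with no model parser returns the fallback result with a trace entry, which mirrors the "makes that visible in the parser trace" wording in the README.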

app.py CHANGED

@@ -732,7 +732,7 @@ with gr.Blocks(
   <strong>Small-model core:</strong><br>
   <code>{MINICPM_REPO}</code><br>
   <code>{MINICPM_FILE}</code><br>
-  The model parses human dispatch notes into JSON constraints. The route math is deterministic and auditable.
 </div>
 """
             )
@@ -747,7 +747,7 @@ Leave the file empty to run the included sample route.
     metrics = gr.Markdown()
     constraints = gr.Markdown(
-        "### OpenBMB MiniCPM5 Constraint Parse\nClick **Plan route** to parse notes with MiniCPM5-1B-GGUF and build the route plan."
     )
     table = gr.Dataframe(label="Optimized route", interactive=False)
     cards = gr.HTML(label="Driver cards")

   <strong>Small-model core:</strong><br>
   <code>{MINICPM_REPO}</code><br>
   <code>{MINICPM_FILE}</code><br>
+  MiniCPM5 parses human dispatch notes when llama.cpp is available. CPU Basic falls back to the same auditable constraint schema.
 </div>
 """
             )
     metrics = gr.Markdown()
     constraints = gr.Markdown(
+        "### OpenBMB MiniCPM5 Constraint Parse\nClick **Plan route** to parse notes with MiniCPM5-1B-GGUF when available, or the deterministic fallback on CPU Basic."
     )
     table = gr.Dataframe(label="Optimized route", interactive=False)
     cards = gr.HTML(label="Driver cards")

requirements.txt CHANGED

@@ -1,5 +1,3 @@
 gradio>=6.14.0
 pandas>=2.2.0
 huggingface_hub>=0.34.0
---extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu
-llama-cpp-python==0.3.9

 gradio>=6.14.0
 pandas>=2.2.0
 huggingface_hub>=0.34.0
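
With `llama-cpp-python` removed from `requirements.txt`, the model runtime becomes an optional dependency that the app has to detect at run time. One way this could look is sketched below. It is an assumption-laden illustration rather than the commit's actual loader: `load_minicpm` and its injectable `find_spec` parameter are hypothetical, and `n_ctx=2048` is an arbitrary choice. The repo and file names are the ones listed in the README, and `hf_hub_download` and `llama_cpp.Llama` are the public entry points of `huggingface_hub` and `llama-cpp-python`.

```python
import importlib.util

MINICPM_REPO = "openbmb/MiniCPM5-1B-GGUF"
MINICPM_FILE = "MiniCPM5-1B-Q4_K_M.gguf"


def load_minicpm(find_spec=importlib.util.find_spec):
    """Return a llama_cpp.Llama instance, or None if the runtime is missing.

    `find_spec` is injectable (hypothetical design choice) so the
    availability check can be exercised without installing anything.
    """
    if find_spec("llama_cpp") is None:
        # Optional dependency not installed: caller should use the fallback.
        return None

    # Imported lazily so the module loads on machines without these packages.
    from huggingface_hub import hf_hub_download
    from llama_cpp import Llama

    model_path = hf_hub_download(repo_id=MINICPM_REPO, filename=MINICPM_FILE)
    return Llama(model_path=model_path, n_ctx=2048, verbose=False)
```

On a Space without the wheel the function returns `None` immediately and never touches the network, so a cold start stays fast. On a larger Space or a local machine, installing `llama-cpp-python` separately is enough for the same call to download the GGUF file and load it.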