umr2015 committed
Commit dcb0d04 · verified · 1 Parent(s): 03b3034

Keep CPU Basic Space running with MiniCPM5 fallback

Files changed (3)
  1. README.md +12 -6
  2. app.py +2 -2
  3. requirements.txt +0 -2
README.md CHANGED
@@ -41,19 +41,25 @@ Spaces, and models under 32B parameters.
  - Model repo: `openbmb/MiniCPM5-1B-GGUF`
  - File: `MiniCPM5-1B-Q4_K_M.gguf`
  - Parameter count: `1.08B`
- - Runtime target: local GGUF through `llama-cpp-python`
+ - Runtime target: local GGUF through `llama-cpp-python` when the Space runtime
+ has a prebuilt llama.cpp wheel or enough memory for the dependency
  - Cloud LLM APIs: none

- If the local model runtime is unavailable during a cold start, the app falls
- back to a deterministic parser and makes that visible in the parser trace. The
- route optimizer never depends on hidden model output: every route, time window,
- lateness minute, and baseline delta is computed deterministically.
+ The public CPU Basic Space keeps the app responsive by treating the MiniCPM5
+ runtime as optional. If `llama-cpp-python` is unavailable during a cold start,
+ the app falls back to a deterministic parser and makes that visible in the
+ parser trace. On a larger Space or local machine with `llama-cpp-python`
+ installed, the same code path downloads `openbmb/MiniCPM5-1B-GGUF` and uses
+ MiniCPM5 for note parsing.
+
+ The route optimizer never depends on hidden model output: every route, time
+ window, lateness minute, and baseline delta is computed deterministically.

  ## Current Scope

  Included now:

- - MiniCPM5 text constraint parsing.
+ - MiniCPM5-ready text constraint parsing with deterministic CPU fallback.
  - Capacity-safe multi-trip route planning.
  - Manual baseline comparison.
  - Synthetic sample data only.
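
Reviewer sketch: the README change above describes a parser that prefers MiniCPM5 and falls back to a deterministic parser, with the choice recorded in the parser trace. The commit does not show the parser code in `app.py`, so everything below is an assumption made for illustration: the names `fallback_parse`, `parse_note`, and `model_parse`, the `trace` field, and the toy "before/after HH:MM" constraint schema are all hypothetical.

```python
import re

# Hypothetical pattern: the real app's constraint grammar is not shown in this commit.
TIME_WINDOW = re.compile(r"\b(before|after)\s+(\d{1,2}):(\d{2})\b", re.IGNORECASE)


def fallback_parse(note: str) -> dict:
    """Deterministic stand-in parser: extract 'before/after HH:MM' constraints."""
    constraints = []
    for kind, hour, minute in TIME_WINDOW.findall(note):
        constraints.append(
            {"type": kind.lower(), "minutes": int(hour) * 60 + int(minute)}
        )
    return {"constraints": constraints, "trace": "deterministic-fallback"}


def parse_note(note: str, model_parse=None) -> dict:
    """Use the model parser when one is supplied, otherwise fall back.

    `model_parse` is an assumed callable returning the same schema as
    `fallback_parse`. A failure in the model path is recorded in the trace
    rather than hidden, so the caller can always see which parser ran.
    """
    if model_parse is None:
        return fallback_parse(note)
    try:
        result = model_parse(note)
        result["trace"] = "minicpm5"
        return result
    except Exception as exc:  # keep the app responsive if the model path fails
        result = fallback_parse(note)
        result["trace"] = f"deterministic-fallback ({type(exc).__name__})"
        return result
```

Calling `parse_note("Deliver before 9:30")` with no model callable takes the deterministic path and labels the result accordingly. Keeping both parsers on one output schema is what lets the route optimizer stay independent of which one ran.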
app.py CHANGED
@@ -732,7 +732,7 @@ with gr.Blocks(
  <strong>Small-model core:</strong><br>
  <code>{MINICPM_REPO}</code><br>
  <code>{MINICPM_FILE}</code><br>
- The model parses human dispatch notes into JSON constraints. The route math is deterministic and auditable.
+ MiniCPM5 parses human dispatch notes when llama.cpp is available. CPU Basic falls back to the same auditable constraint schema.
  </div>
  """
  )
@@ -747,7 +747,7 @@ Leave the file empty to run the included sample route.

  metrics = gr.Markdown()
  constraints = gr.Markdown(
- "### OpenBMB MiniCPM5 Constraint Parse\nClick **Plan route** to parse notes with MiniCPM5-1B-GGUF and build the route plan."
+ "### OpenBMB MiniCPM5 Constraint Parse\nClick **Plan route** to parse notes with MiniCPM5-1B-GGUF when available, or the deterministic fallback on CPU Basic."
  )
  table = gr.Dataframe(label="Optimized route", interactive=False)
  cards = gr.HTML(label="Driver cards")
requirements.txt CHANGED
@@ -1,5 +1,3 @@
  gradio>=6.14.0
  pandas>=2.2.0
  huggingface_hub>=0.34.0
- --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu
- llama-cpp-python==0.3.9
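
Reviewer sketch: with `llama-cpp-python` dropped from `requirements.txt`, the model runtime is present only where someone installs it separately, so the loader has to tolerate its absence. The following shows one way such a guard might look. It is not the code from `app.py`: the function name `load_minicpm`, the `is_installed` override, and `n_ctx=2048` are assumptions, and the repo and file values are copied from the README.

```python
import importlib.util

MINICPM_REPO = "openbmb/MiniCPM5-1B-GGUF"
MINICPM_FILE = "MiniCPM5-1B-Q4_K_M.gguf"


def load_minicpm(is_installed=None):
    """Return a llama_cpp.Llama instance, or None when the runtime is missing.

    `is_installed` is a hypothetical override for tests; by default the
    function checks whether the optional `llama_cpp` package is importable.
    """
    if is_installed is None:
        is_installed = importlib.util.find_spec("llama_cpp") is not None
    if not is_installed:
        return None  # caller should switch to the deterministic parser

    # Imported lazily so the module still loads on CPU Basic without llama.cpp.
    from huggingface_hub import hf_hub_download
    from llama_cpp import Llama

    model_path = hf_hub_download(repo_id=MINICPM_REPO, filename=MINICPM_FILE)
    # n_ctx is an assumed context size, not a value taken from the commit.
    return Llama(model_path=model_path, n_ctx=2048, verbose=False)
```

On a machine where the dependency was installed manually, `load_minicpm()` would download the GGUF file on first use. On CPU Basic it returns `None` without importing anything heavy.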