Keep CPU Basic Space running with MiniCPM5 fallback
- README.md +12 -6
- app.py +2 -2
- requirements.txt +0 -2
README.md
CHANGED
@@ -41,19 +41,25 @@ Spaces, and models under 32B parameters.
 - Model repo: `openbmb/MiniCPM5-1B-GGUF`
 - File: `MiniCPM5-1B-Q4_K_M.gguf`
 - Parameter count: `1.08B`
-- Runtime target: local GGUF through `llama-cpp-python`
+- Runtime target: local GGUF through `llama-cpp-python` when the Space runtime
+has a prebuilt llama.cpp wheel or enough memory for the dependency
 - Cloud LLM APIs: none

-
-
-
-
+The public CPU Basic Space keeps the app responsive by treating the MiniCPM5
+runtime as optional. If `llama-cpp-python` is unavailable during a cold start,
+the app falls back to a deterministic parser and makes that visible in the
+parser trace. On a larger Space or local machine with `llama-cpp-python`
+installed, the same code path downloads `openbmb/MiniCPM5-1B-GGUF` and uses
+MiniCPM5 for note parsing.
+
+The route optimizer never depends on hidden model output: every route, time
+window, lateness minute, and baseline delta is computed deterministically.

 ## Current Scope

 Included now:

-- MiniCPM5 text constraint parsing.
+- MiniCPM5-ready text constraint parsing with deterministic CPU fallback.
 - Capacity-safe multi-trip route planning.
 - Manual baseline comparison.
 - Synthetic sample data only.
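The fallback behaviour described in the README hunk above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the names `deterministic_parse`, `parse_note`, and `WINDOW_RE`, the `HH:MM-HH:MM` note format, and the shape of the returned trace are all assumptions made for the example.

```python
import re
from typing import Callable, Optional

# Assumed note format: a delivery window written like "09:00-11:30".
WINDOW_RE = re.compile(r"(\d{1,2}):(\d{2})\s*-\s*(\d{1,2}):(\d{2})")


def deterministic_parse(note: str) -> dict:
    """Rule-based fallback: extract a time window in minutes after midnight."""
    match = WINDOW_RE.search(note)
    if match is None:
        return {"window": None}
    h1, m1, h2, m2 = (int(group) for group in match.groups())
    return {"window": (h1 * 60 + m1, h2 * 60 + m2)}


def parse_note(
    note: str, model_parse: Optional[Callable[[str], dict]] = None
) -> dict:
    """Use the model parser when one is supplied, else the fallback.

    The "parser" key plays the role of the parser trace: it always records
    which path produced the constraints.
    """
    if model_parse is not None:
        return {"parser": "minicpm5", **model_parse(note)}
    return {"parser": "deterministic", **deterministic_parse(note)}


print(parse_note("Deliver 09:00-11:30, ring the bell"))
```

Passing the model parser in as an optional callable keeps the routing code independent of whether `llama-cpp-python` could be imported, which is the property the README text relies on.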
app.py
CHANGED
@@ -732,7 +732,7 @@ with gr.Blocks(
 <strong>Small-model core:</strong><br>
 <code>{MINICPM_REPO}</code><br>
 <code>{MINICPM_FILE}</code><br>
-
+MiniCPM5 parses human dispatch notes when llama.cpp is available. CPU Basic falls back to the same auditable constraint schema.
 </div>
 """
 )
@@ -747,7 +747,7 @@ Leave the file empty to run the included sample route.

 metrics = gr.Markdown()
 constraints = gr.Markdown(
-"### OpenBMB MiniCPM5 Constraint Parse\nClick **Plan route** to parse notes with MiniCPM5-1B-GGUF
+"### OpenBMB MiniCPM5 Constraint Parse\nClick **Plan route** to parse notes with MiniCPM5-1B-GGUF when available, or the deterministic fallback on CPU Basic."
 )
 table = gr.Dataframe(label="Optimized route", interactive=False)
 cards = gr.HTML(label="Driver cards")
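The UI text in the `app.py` hunk says both parsers produce "the same auditable constraint schema". The diff does not show that schema, so the sketch below is purely hypothetical: the class name `StopConstraint` and every field in it are assumptions, meant only to illustrate how a shared, validated record lets the optimizer ignore which parser ran.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class StopConstraint:
    """Hypothetical record that either parser could emit for one stop."""

    stop_id: str
    earliest_min: Optional[int] = None  # window start, minutes after midnight
    latest_min: Optional[int] = None  # window end, minutes after midnight
    parser: str = "deterministic"  # "minicpm5" or "deterministic"

    def __post_init__(self) -> None:
        # Reject records the optimizer could not interpret unambiguously.
        if self.parser not in ("minicpm5", "deterministic"):
            raise ValueError(f"unknown parser: {self.parser!r}")
        if (
            self.earliest_min is not None
            and self.latest_min is not None
            and self.earliest_min > self.latest_min
        ):
            raise ValueError("time window ends before it starts")


record = StopConstraint("stop-7", earliest_min=540, latest_min=690)
print(asdict(record))
```

Validating at construction time means a malformed model output fails loudly before it reaches route planning, rather than silently shifting a time window.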
requirements.txt
CHANGED
@@ -1,5 +1,3 @@
 gradio>=6.14.0
 pandas>=2.2.0
 huggingface_hub>=0.34.0
---extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu
-llama-cpp-python==0.3.9
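With `llama-cpp-python` dropped from `requirements.txt`, the app has to tolerate the package being absent at start-up. One way to do that, sketched here under assumptions, is to probe for the module and import it lazily. The constant values are taken from the README hunk; `llama_cpp_installed`, `load_minicpm`, and the `n_ctx=2048` setting are illustrative choices, not confirmed details of `app.py`.

```python
import importlib.util

# Values as listed in the README; the constant names mirror those used in app.py.
MINICPM_REPO = "openbmb/MiniCPM5-1B-GGUF"
MINICPM_FILE = "MiniCPM5-1B-Q4_K_M.gguf"


def llama_cpp_installed() -> bool:
    """Check for the optional package without importing it."""
    return importlib.util.find_spec("llama_cpp") is not None


def load_minicpm():
    """Return a llama.cpp model handle, or None when the dependency is missing."""
    if not llama_cpp_installed():
        return None
    # Imported lazily so a missing wheel never breaks app start-up.
    from huggingface_hub import hf_hub_download
    from llama_cpp import Llama

    model_path = hf_hub_download(repo_id=MINICPM_REPO, filename=MINICPM_FILE)
    # n_ctx is an assumed context size for short dispatch notes.
    return Llama(model_path=model_path, n_ctx=2048, verbose=False)


print("llama-cpp-python installed:", llama_cpp_installed())
```

On a larger Space or a local machine, installing `llama-cpp-python` separately makes `load_minicpm()` return a model instead of `None`, with no change to the rest of the code.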