Space: build-small-hackathon/tiny-dispatch-coach


umr2015 committed on Jun 10

Commit e28a9bc (verified) · 1 parent: 109a37f

Default to fast parser with optional MiniCPM5 mode


Files changed (4)

FIELD_NOTES.md +2 -1
README.md +5 -2
SUBMISSION.md +5 -4
app.py +19 -6

FIELD_NOTES.md CHANGED

@@ -30,7 +30,8 @@ This directly targets the hackathon signals:
 - Backyard AI: practical helper for a local delivery operator.
 - Off the Grid: no cloud LLM API.
-- Llama Champion: MiniCPM5 GGUF is loaded through llama.cpp when available.
 - Sharing is Caring: the planner trace is included as `agent_trace.json`.
 ## What the model does

 - Backyard AI: practical helper for a local delivery operator.
 - Off the Grid: no cloud LLM API.
+- Llama Champion: MiniCPM5 GGUF is available through llama.cpp, behind an
+  explicit checkbox so the public CPU Basic demo remains responsive.
 - Sharing is Caring: the planner trace is included as `agent_trace.json`.
 ## What the model does

README.md CHANGED

@@ -50,8 +50,11 @@ Spaces, and models under 32B parameters.
 - Cloud LLM APIs: none
 The Space preloads the Q4 MiniCPM5 GGUF file and installs the CPU llama.cpp
-wheel. If a runtime cold start still cannot load the model, the app falls back
-to a deterministic parser and makes that visible in the parser trace.
 The route optimizer never depends on hidden model output: every route, time
 window, lateness minute, and baseline delta is computed deterministically.

 - Cloud LLM APIs: none
 The Space preloads the Q4 MiniCPM5 GGUF file and installs the CPU llama.cpp
+wheel. The public CPU Basic demo defaults to fast deterministic parsing so
+judges can inspect the route planner immediately. The MiniCPM5 parser can be
+enabled from the checkbox in the UI; if a runtime cold start cannot load the
+model or the model returns invalid JSON, the app falls back to the deterministic
+parser and makes that visible in the parser trace.
 The route optimizer never depends on hidden model output: every route, time
 window, lateness minute, and baseline delta is computed deterministically.
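
The README text above says the app falls back to the deterministic parser when the model returns invalid JSON. The commit does not show that branch, so the following is only a minimal sketch of how such a check might look. The function name `constraints_or_fallback`, the `REQUIRED_KEYS` set, and the `"minicpm"` source label are hypothetical; only `depot_start` and the `"rule-fallback"` label appear in the diff.

```python
import json
from typing import Dict, Tuple

# Hypothetical: the minimum a parsed model reply must contain to be trusted.
REQUIRED_KEYS = {"depot_start"}


def constraints_or_fallback(
    raw_output: str, fallback: Dict[str, object]
) -> Tuple[Dict[str, object], str]:
    """Return (constraints, trace); use the rule-based result if the JSON is unusable."""
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        result = dict(fallback)  # copy so the caller's dict is not mutated
        result["source"] = "rule-fallback"
        return result, f"Model returned invalid JSON ({exc.msg}); used deterministic parser."

    if not isinstance(parsed, dict) or not REQUIRED_KEYS.issubset(parsed):
        result = dict(fallback)
        result["source"] = "rule-fallback"
        return result, "Model JSON lacked required fields; used deterministic parser."

    # Model values override the rule-based defaults; the source label is hypothetical.
    merged = {**fallback, **parsed, "source": "minicpm"}
    return merged, "Model output parsed as JSON."
```

Returning the trace string alongside the constraints keeps the fallback visible to the user, which matches the README's claim that the parser trace shows which path ran.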

SUBMISSION.md CHANGED

@@ -8,7 +8,7 @@
 - Model: `openbmb/MiniCPM5-1B-GGUF`
 - File: `MiniCPM5-1B-Q4_K_M.gguf`
 - Parameters: 1.08B
-- Runtime path: local GGUF through `llama-cpp-python`
 ## Why It Fits Build Small
@@ -55,9 +55,10 @@ This makes the small model useful because the task is bounded:
 I built Tiny Dispatch Coach for the Build Small Hackathon:
 Small delivery teams often have messy notes, tight windows, and a van capacity
-constraint. This Gradio Space uses OpenBMB MiniCPM5-1B-GGUF to parse dispatch
-notes into route constraints, then a deterministic planner creates auditable
-driver routes.
 No cloud LLM API. Synthetic demo data only. 1.08B params.

 - Model: `openbmb/MiniCPM5-1B-GGUF`
 - File: `MiniCPM5-1B-Q4_K_M.gguf`
 - Parameters: 1.08B
+- Runtime path: local GGUF through `llama-cpp-python`, enabled by checkbox
 ## Why It Fits Build Small
 I built Tiny Dispatch Coach for the Build Small Hackathon:
 Small delivery teams often have messy notes, tight windows, and a van capacity
+constraint. This Gradio Space uses a MiniCPM5-ready constraint parser plus a
+deterministic planner to create auditable driver routes. The OpenBMB
+MiniCPM5-1B-GGUF path runs locally through llama.cpp when enabled; default fast
+mode keeps the public CPU Basic demo responsive.
 No cloud LLM API. Synthetic demo data only. 1.08B params.

app.py CHANGED

@@ -224,8 +224,9 @@ def get_minicpm_llm():
         model_path = hf_hub_download(repo_id=MINICPM_REPO, filename=MINICPM_FILE)
         return Llama(
             model_path=model_path,
-            n_ctx=2048,
             n_threads=max(1, min(4, os.cpu_count() or 2)),
             n_gpu_layers=0,
             verbose=False,
         )
@@ -233,8 +234,15 @@ def get_minicpm_llm():
         return None
-def minicpm_parse_dispatch_notes(notes: str) -> Tuple[Dict[str, object], str]:
     fallback = normalize_constraints(parse_dispatch_notes(notes))
     llm = get_minicpm_llm()
     if llm is None:
         fallback["source"] = "rule-fallback"
@@ -262,7 +270,7 @@ Dispatcher notes: {notes}
     try:
         result = llm(
             prompt,
-            max_tokens=180,
             temperature=0.0,
             top_p=1.0,
             stop=["<|im_end|>", "\n\n\n"],
@@ -580,9 +588,9 @@ def route_map(plan: List[PlanStop]) -> str:
 """
-def analyze(file_obj, notes: str):
     stops = parse_orders(file_obj)
-    constraints, model_trace = minicpm_parse_dispatch_notes(notes)
     auto_routes = build_capacity_routes(stops, constraints)
     manual = manual_route(stops)
     auto_plan, auto_metrics = simulate_routes(auto_routes, int(constraints["depot_start"]))
@@ -748,6 +756,11 @@ with gr.Blocks(
                 value=DEFAULT_NOTES,
                 lines=5,
             )
             run = gr.Button("Plan route", variant="primary")
         with gr.Column(scale=1):
             gr.HTML(
@@ -785,7 +798,7 @@ OpenBMB MiniCPM5, 1.08B parameters, local GGUF path, no cloud LLM API, synthetic
     run.click(
         analyze,
-        inputs=[order_file, notes],
         outputs=[metrics, constraints, table, cards, map_html],
     )

         model_path = hf_hub_download(repo_id=MINICPM_REPO, filename=MINICPM_FILE)
         return Llama(
             model_path=model_path,
+            n_ctx=768,
             n_threads=max(1, min(4, os.cpu_count() or 2)),
+            n_batch=32,
             n_gpu_layers=0,
             verbose=False,
         )
         return None
+def minicpm_parse_dispatch_notes(notes: str, use_minicpm: bool = False) -> Tuple[Dict[str, object], str]:
     fallback = normalize_constraints(parse_dispatch_notes(notes))
+    if not use_minicpm:
+        fallback["source"] = "rule-fallback"
+        return (
+            fallback,
+            "Fast CPU Basic mode used the deterministic parser. Enable MiniCPM5 parser to run the local GGUF model path.",
+        )
     llm = get_minicpm_llm()
     if llm is None:
         fallback["source"] = "rule-fallback"
     try:
         result = llm(
             prompt,
+            max_tokens=96,
             temperature=0.0,
             top_p=1.0,
             stop=["<|im_end|>", "\n\n\n"],
 """
+def analyze(file_obj, notes: str, use_minicpm: bool):
     stops = parse_orders(file_obj)
+    constraints, model_trace = minicpm_parse_dispatch_notes(notes, use_minicpm)
     auto_routes = build_capacity_routes(stops, constraints)
     manual = manual_route(stops)
     auto_plan, auto_metrics = simulate_routes(auto_routes, int(constraints["depot_start"]))
                 value=DEFAULT_NOTES,
                 lines=5,
             )
+            use_minicpm = gr.Checkbox(
+                label="Use MiniCPM5 parser",
+                value=False,
+                info="Optional on CPU Basic. Default fast mode keeps the demo responsive.",
+            )
             run = gr.Button("Plan route", variant="primary")
         with gr.Column(scale=1):
             gr.HTML(
     run.click(
         analyze,
+        inputs=[order_file, notes, use_minicpm],
         outputs=[metrics, constraints, table, cards, map_html],
     )
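
The net effect of the `app.py` change is that the model loader is never reached unless the checkbox is ticked. The sketch below isolates that gating pattern with stand-in functions so it runs without `llama-cpp-python` or Gradio. `get_model`, `rule_parse`, `parse_notes`, and the `load_calls` counter are hypothetical stand-ins, and caching the loader with `lru_cache` is an assumption, since the diff does not show how `get_minicpm_llm` caches its result.

```python
from functools import lru_cache
from typing import Callable, Dict, Optional, Tuple

load_calls = 0  # counts how often the (stand-in) expensive load actually runs


@lru_cache(maxsize=1)
def get_model() -> Optional[Callable[[str], str]]:
    """Stand-in for an expensive model load; cached so it runs at most once."""
    global load_calls
    load_calls += 1
    return lambda prompt: '{"depot_start": 480}'


def rule_parse(notes: str) -> Dict[str, object]:
    """Stand-in for the deterministic parser."""
    return {"depot_start": 480, "source": "rule-fallback"}


def parse_notes(notes: str, use_model: bool = False) -> Tuple[Dict[str, object], str]:
    fallback = rule_parse(notes)
    if not use_model:
        # Fast path: return before the loader is touched.
        return fallback, "Fast mode used the deterministic parser."
    model = get_model()
    if model is None:
        return fallback, "Model unavailable; used the deterministic parser."
    return {**fallback, "source": "model"}, "Model path used."


constraints, trace = parse_notes("two vans, start 8am")        # no load
constraints, trace = parse_notes("two vans, start 8am", True)  # first load
```

Placing the early return ahead of the loader call is what keeps the default request cheap: under these assumptions the cost of loading the GGUF file is paid only by users who opt in, and only once per process.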