kingabzpro

Separate Space and local model runtimes

cc3c1a2 19 days ago

Testing the local model

The Space inference path is:

Custom frontend
  -> queued Gradio backend
  -> Nemotron OCR v2 for screenshot text
  -> app/model_endpoint.py
  -> MiniCPM5-1B through Transformers on ZeroGPU

Local endpoint tests use MiniCPM5-1B GGUF through llama-cpp-python.
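One way to picture the split between the two runtimes is as a pipeline whose stages are passed in as callables, so the Space can supply a Transformers-backed generator and local tests can supply a GGUF-backed one. The sketch below is an illustration only: `run_pipeline`, `OcrFn`, `GenerateFn`, and the fake stages are hypothetical names, not functions from `app/model_endpoint.py`.

```python
from typing import Callable

# Hypothetical stage signatures; the real function names in the app are not shown in this document.
OcrFn = Callable[[bytes], str]      # screenshot bytes -> extracted text
GenerateFn = Callable[[str], str]   # extracted text -> raw model output


def run_pipeline(screenshot: bytes, ocr: OcrFn, generate: GenerateFn) -> str:
    """Run OCR first, then text generation, in the order the diagram lists them."""
    text = ocr(screenshot)
    return generate(text)


# Stand-in stages so the sketch runs without loading any model.
def fake_ocr(screenshot: bytes) -> str:
    return screenshot.decode("utf-8")


def fake_generate(prompt: str) -> str:
    return f"analysis of: {prompt}"


if __name__ == "__main__":
    print(run_pipeline(b"Your parcel is held", fake_ocr, fake_generate))
```

Because the generator is injected, swapping the ZeroGPU runtime for the llama-cpp-python runtime would only change which callable is passed in, assuming both expose the same text-in, text-out shape.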

Fast checks

Run tests that do not load the model:

python app.py --self-test
python -m unittest

Install the local dependencies, then download the configured GGUF:

python -m pip install -r requirements-local.txt
python app.py --download-model

Run a real text-generation contract test:

python app.py --test-endpoint

The command fails unless the model returns all required fields:

risk_label
simple_explanation
red_flags
safe_next_steps
reply_draft
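A check of this kind could look like the following minimal sketch. It assumes the model output is a JSON object whose top-level keys are the field names above; the document does not state the output format, and `missing_fields` is a hypothetical helper, not the check that `--test-endpoint` actually runs.

```python
import json

# Field names taken from the list above.
REQUIRED_FIELDS = (
    "risk_label",
    "simple_explanation",
    "red_flags",
    "safe_next_steps",
    "reply_draft",
)


def missing_fields(raw_output: str) -> list[str]:
    """Return the required fields absent from the model output.

    Assumption: the output is a JSON object. Anything that does not parse
    as a JSON object is treated as missing every field.
    """
    try:
        payload = json.loads(raw_output)
    except json.JSONDecodeError:
        return list(REQUIRED_FIELDS)
    if not isinstance(payload, dict):
        return list(REQUIRED_FIELDS)
    return [name for name in REQUIRED_FIELDS if name not in payload]


if __name__ == "__main__":
    sample = json.dumps({"risk_label": "high", "red_flags": []})
    print(missing_fields(sample))
```

A contract test built on this helper would fail whenever the returned list is non-empty. Note that the sketch only checks for key presence; it does not validate the types or contents of the values.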

The old Modal deployments and request scripts are intentionally preserved under experiments/ for comparison and reproducibility. They are not imported by the application.