noticecheck / docs /model_endpoint_testing.md
kingabzpro's picture
Separate Space and local model runtimes
cc3c1a2
|
Raw
History Blame Contribute Delete
968 Bytes

A newer version of the Gradio SDK is available: 6.19.0

Upgrade

Testing the local model

The Space inference path is:

Custom frontend
  -> queued Gradio backend
  -> Nemotron OCR v2 for screenshot text
  -> app/model_endpoint.py
  -> MiniCPM5-1B through Transformers on ZeroGPU

Local endpoint tests use MiniCPM5-1B GGUF through llama-cpp-python.

Fast checks

Run tests that do not load the model:

python app.py --self-test
python -m unittest

Download the configured GGUF:

python -m pip install -r requirements-local.txt
python app.py --download-model

Run a real text-generation contract test:

python app.py --test-endpoint

The command fails unless the model returns all required fields:

  • risk_label
  • simple_explanation
  • red_flags
  • safe_next_steps
  • reply_draft

The old Modal deployments and request scripts are intentionally preserved under experiments/ for comparison and reproducibility. They are not imported by the application.