"""Models: load the fine-tuned MiniCPM (GGUF) via llama.cpp and hold the prompts.
Status: next milestone. Local-first by default (llama-cpp-python loads a quantized
GGUF), with an optional Modal endpoint as a hosted fallback for the public Space.
Exposes a single chat/tool-call interface so the agent loop is runtime-agnostic.
"""