Runtime Configuration
Current Runtime
Local development defaults to deterministic mock paths:
OBJECTVERSE_VISION_BACKEND=mock
OBJECTVERSE_TEXT_BACKEND=mock
For local runs, this means:
- object understanding is generated by src/models/vision_runner.py
- persona, diary, and chat are generated by src/models/llama_cpp_runner.py
- traces mark mock-runtime in the fallbacks field
No commercial cloud AI APIs are used.
The public Hugging Face Space is configured differently for the live demo:
OBJECTVERSE_VISION_BACKEND=minicpm-v
VISION_MODEL_ID=openbmb/MiniCPM-V-2_6
OBJECTVERSE_TEXT_BACKEND=mock
The Space should run on zero-a10g so @spaces.GPU can allocate GPU time for MiniCPM-V requests. The required HF_TOKEN for gated openbmb/MiniCPM-V-2_6 access is stored as a Space Secret and must not be committed.
MiniCPM-V 2.6 vision can be enabled without changing the UI:
OBJECTVERSE_VISION_BACKEND=minicpm-v \
VISION_MODEL_ID=openbmb/MiniCPM-V-2_6 \
OBJECTVERSE_TEXT_BACKEND=mock \
.venv/bin/python app.py
This only replaces object understanding. Persona generation, diary generation, and chat can remain mock or use the optional llama.cpp text path below.
Optional llama.cpp text generation can be enabled without changing the UI:
OBJECTVERSE_TEXT_BACKEND=llama-cpp \
TEXT_MODEL_PATH=/absolute/path/to/text-model.gguf \
.venv/bin/python app.py
For a hosted Space where the GGUF is stored on Hugging Face Hub instead of the local filesystem, configure the Hub source instead of TEXT_MODEL_PATH:
OBJECTVERSE_TEXT_BACKEND=llama-cpp
TEXT_MODEL_REPO_ID=qqyule/objectverse-diary-qwen15b-lora
TEXT_MODEL_FILENAME=objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
TEXT_MODEL_REVISION is optional and defaults to the Hub repo default branch. If TEXT_MODEL_PATH is set, it takes precedence over Hub download variables.
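The precedence rule above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names resolve_text_model_source and download_from_hub and the returned dict layout are hypothetical, and only the environment variable names come from this document. hf_hub_download is the standard download helper from huggingface_hub.

```python
from typing import Mapping, Optional


def resolve_text_model_source(env: Mapping[str, str]) -> Optional[dict]:
    """Decide where the GGUF should come from (hypothetical helper)."""
    local_path = env.get("TEXT_MODEL_PATH", "").strip()
    if local_path:
        # A local path takes precedence over any Hub variables.
        return {"kind": "local", "path": local_path}

    repo_id = env.get("TEXT_MODEL_REPO_ID", "").strip()
    filename = env.get("TEXT_MODEL_FILENAME", "").strip()
    if repo_id and filename:
        # An empty TEXT_MODEL_REVISION means "use the repo default branch".
        revision = env.get("TEXT_MODEL_REVISION", "").strip() or None
        return {
            "kind": "hub",
            "repo_id": repo_id,
            "filename": filename,
            "revision": revision,
        }

    # Nothing configured: the caller is expected to fall back to mock text.
    return None


def download_from_hub(source: dict) -> str:
    """Fetch the GGUF from the Hub and return its local cache path."""
    # Imported lazily so the mock path works without the optional package.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=source["repo_id"],
        filename=source["filename"],
        revision=source["revision"],
    )
```

Keeping the resolution step free of network calls makes the precedence easy to check in isolation; only download_from_hub touches the Hub.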
llama-cpp-python and huggingface_hub are installed as Space runtime dependencies. A missing package, a missing model path, download errors, model loading errors, invalid JSON, and schema validation errors all fall back to deterministic mock text generation.
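One way to picture this catch-all fallback is a thin wrapper around the configured generator. The sketch below is an assumption about the shape of such a wrapper, not the code in src/models/llama_cpp_runner.py; generate_text_with_fallback and its callable parameters are hypothetical names.

```python
from typing import Callable, List, Tuple


def generate_text_with_fallback(
    real_generate: Callable[[str], dict],
    mock_generate: Callable[[str], dict],
    prompt: str,
) -> Tuple[dict, List[str]]:
    """Try the configured backend; on any failure use the mock and record a marker."""
    try:
        return real_generate(prompt), []
    except Exception:
        # ImportError (missing package), FileNotFoundError (missing model),
        # download, load, JSON, and schema errors all land here.
        return mock_generate(prompt), ["text-fallback-to-mock"]
```

Returning the marker list alongside the payload lets the caller copy it into the trace without inspecting the exception.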
The runtime trace intentionally records only whether an external GGUF path was configured, not the literal TEXT_MODEL_PATH, so local private paths do not leak into public traces.
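As a rough sketch of that rule, trace metadata can carry a boolean instead of the path itself. The function name and the field names text_backend and external_gguf_configured are illustrative assumptions, not the project's trace schema.

```python
from typing import Mapping


def text_backend_trace_fields(env: Mapping[str, str]) -> dict:
    """Return trace-safe metadata about the text backend (hypothetical field names)."""
    return {
        "text_backend": env.get("OBJECTVERSE_TEXT_BACKEND", "mock"),
        # Record only a boolean; the literal path never enters the trace.
        "external_gguf_configured": bool(env.get("TEXT_MODEL_PATH", "").strip()),
    }
```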
Local LoRA v2 GGUF status:
- Base model: Qwen/Qwen2.5-1.5B-Instruct
- Adapter / GGUF repo: qqyule/objectverse-diary-qwen15b-lora
- Published GGUF: objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
- Local smoke: passed on 2026-06-08 with llama-cpp text generation and no text-fallback-to-mock
- Space runtime: live MiniCPM-V vision with mock text; not switched to llama.cpp text until a separate Space validation passes
Runtime Diagnostics
The Gradio app exposes two hidden diagnostic APIs:
- /zero_gpu_probe: checks Torch import and CUDA visibility.
- /vision_runtime_probe: checks configured vision backend, Torch/Transformers import, CUDA/MPS visibility, and MiniCPM-V load success or sanitized failure summaries.
These APIs are for validation scripts and are not visible in the main UI. They must not return tokens, .env paths, Hugging Face token markers, or private local filesystem paths.
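A sanitized failure summary could be produced along these lines. This is only a sketch under stated assumptions: the function name, the list of private path roots, and the redaction placeholders are all hypothetical, and the token pattern relies on Hugging Face user access tokens starting with hf_.

```python
import re

# Hugging Face user access tokens start with "hf_"; the length bound is a loose assumption.
_TOKEN_RE = re.compile(r"hf_[A-Za-z0-9]{8,}")
# Absolute POSIX paths under common private roots (illustrative list only).
_PATH_RE = re.compile(r"/(?:home|Users|root|tmp|data)/\S+")
# Any remaining reference to a .env file, with or without a leading directory.
_ENV_FILE_RE = re.compile(r"\S*\.env\b")


def sanitize_failure_summary(message: str, max_len: int = 200) -> str:
    """Redact token-like strings and private paths from an error message."""
    cleaned = _TOKEN_RE.sub("[redacted-token]", message)
    cleaned = _PATH_RE.sub("[redacted-path]", cleaned)
    cleaned = _ENV_FILE_RE.sub("[redacted-env-file]", cleaned)
    # Truncate so long tracebacks cannot smuggle extra detail into the probe output.
    return cleaned[:max_len]
```

Pattern-based redaction is best effort; a stricter design would return only a fixed set of error codes rather than any free-form message.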
scripts/check_space_vlm.py calls /vision_runtime_probe before the mug/keyboard/shoe validation run and writes the probe output into docs/SPACE_VLM_REPORT.md and docs/SPACE_VLM_REPORT.json.
Optional GGUF Smoke Test
Recommended LoRA v2 smoke model:
repo: qqyule/objectverse-diary-qwen15b-lora
file: objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
local path: models/objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
The models/ directory and *.gguf files are ignored by Git. After downloading the file externally and installing the optional llama-cpp-python package, run:
.venv/bin/python -B scripts/check_llama_cpp_smoke.py \
--model-path models/objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
A passing smoke test must show llama-cpp text generation and must not include text-fallback-to-mock in either generation or chat fallback markers.
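The pass criterion can be expressed as a small predicate. The sketch below assumes the smoke script exposes a backend label plus two lists of fallback markers; that report shape and the name smoke_passed are hypothetical and may not match scripts/check_llama_cpp_smoke.py.

```python
from typing import Iterable


def smoke_passed(
    backend_label: str,
    generation_fallbacks: Iterable[str],
    chat_fallbacks: Iterable[str],
) -> bool:
    """True only if llama.cpp produced the text and neither stage fell back to mock."""
    marker = "text-fallback-to-mock"
    return (
        backend_label == "llama-cpp text generation"
        and marker not in set(generation_fallbacks)
        and marker not in set(chat_fallbacks)
    )
```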
Environment Variables
OBJECTVERSE_VISION_BACKEND=mock
OBJECTVERSE_TEXT_BACKEND=mock
VISION_MODEL_ID=
TEXT_MODEL_PATH=
TEXT_MODEL_REPO_ID=
TEXT_MODEL_FILENAME=
TEXT_MODEL_REVISION=
TRACE_OUTPUT_DIR=data/traces
For the live hosted Space, set these Variables:
OBJECTVERSE_VISION_BACKEND=minicpm-v
VISION_MODEL_ID=openbmb/MiniCPM-V-2_6
OBJECTVERSE_TEXT_BACKEND=mock
Recommended Space hardware for this path is ZeroGPU zero-a10g. If live validation fails, use the rollback command in docs/DEVELOPMENT_STATUS.md to switch OBJECTVERSE_VISION_BACKEND back to mock and request cpu-basic.
For a Space or local runtime with a separately provided GGUF text model, set:
OBJECTVERSE_TEXT_BACKEND=llama-cpp
TEXT_MODEL_PATH=/absolute/path/to/text-model.gguf
For a Space runtime that should download the published LoRA v2 GGUF from Hub, set:
OBJECTVERSE_VISION_BACKEND=mock
OBJECTVERSE_TEXT_BACKEND=llama-cpp
TEXT_MODEL_REPO_ID=qqyule/objectverse-diary-qwen15b-lora
TEXT_MODEL_FILENAME=objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
Do not commit GGUF files or private model paths.
Future Runtime Boundary
The next implementation phase should keep the same pipeline boundary:
- UI calls src/pipeline.py.
- src/pipeline.py calls the configured vision and text runners.
- runners return validated Pydantic schemas.
- trace logging records backend metadata and fallback markers.
Do not move model calls into src/ui/layout.py.
Fallback Rules
- VLM unavailable: use manual description and mock/example gallery path.
- llama.cpp unavailable: use mock text generation path and record text-fallback-to-mock.
- invalid model JSON: repair and validate before rendering, then fall back to mock if validation fails.
- private input: anonymize trace text before saving public traces.
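The invalid-JSON rule above might look roughly like this in code. It is a simplified sketch: the repair step only trims text around the outermost braces, the required-key check stands in for the Pydantic schema validation the project uses, and the function names are hypothetical.

```python
import json
from typing import Callable, List, Optional, Sequence, Tuple


def extract_json_object(raw: str) -> Optional[dict]:
    """Light repair: keep the text between the first '{' and the last '}'."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        parsed = json.loads(raw[start : end + 1])
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


def parse_or_mock(
    raw: str,
    required_keys: Sequence[str],
    mock: Callable[[], dict],
) -> Tuple[dict, List[str]]:
    """Return (payload, fallbacks); use the mock when repair or validation fails."""
    parsed = extract_json_object(raw)
    # Stand-in for schema validation: every required key must be present.
    if parsed is not None and all(key in parsed for key in required_keys):
        return parsed, []
    return mock(), ["text-fallback-to-mock"]
```

Validation runs before anything is rendered, so the UI only ever sees either a schema-valid payload or the deterministic mock.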
Trace fallback markers:
- mock-runtime: default mock vision and mock text runtime.
- mock-text-runtime: real or configured vision path with mock text generation.
- mock-vision-runtime: mock vision with a configured non-mock text backend.
- vision-fallback-to-mock: MiniCPM-V failed or returned invalid JSON, so mock object understanding was used.
- text-fallback-to-mock: llama.cpp was configured but unavailable, invalid, or unable to return schema-valid JSON.
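The three configuration-level markers follow directly from the two backend variables, which can be sketched as below. The function name is hypothetical, and the sketch covers only the markers derived from configuration; vision-fallback-to-mock and text-fallback-to-mock would be appended at runtime when a configured backend fails.

```python
from typing import List


def baseline_fallback_markers(vision_backend: str, text_backend: str) -> List[str]:
    """Map the configured backends to the configuration-level trace marker."""
    vision_is_mock = vision_backend == "mock"
    text_is_mock = text_backend == "mock"
    if vision_is_mock and text_is_mock:
        return ["mock-runtime"]
    if text_is_mock:
        return ["mock-text-runtime"]
    if vision_is_mock:
        return ["mock-vision-runtime"]
    # Both backends are non-mock, so no configuration-level marker applies.
    return []
```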