| # Runtime Configuration | |
| ## Current Runtime | |
| Local development defaults to deterministic mock paths: | |
| - `OBJECTVERSE_VISION_BACKEND=mock` | |
| - `OBJECTVERSE_TEXT_BACKEND=mock` | |
| For local runs, this means: | |
| - object understanding is generated by `src/models/vision_runner.py` | |
| - persona, diary, and chat are generated by `src/models/llama_cpp_runner.py` | |
| - traces record `mock-runtime` in the `fallbacks` field | |
| No commercial cloud AI APIs are used. | |
| The public Hugging Face Space is configured differently for the live demo: | |
| ```bash | |
| OBJECTVERSE_VISION_BACKEND=minicpm-v | |
| VISION_MODEL_ID=openbmb/MiniCPM-V-2_6 | |
| OBJECTVERSE_TEXT_BACKEND=mock | |
| ``` | |
| The Space should run on `zero-a10g` so `@spaces.GPU` can allocate GPU time for MiniCPM-V requests. The required `HF_TOKEN` for gated `openbmb/MiniCPM-V-2_6` access is stored as a Space Secret and must not be committed. | |
| MiniCPM-V 2.6 vision can be enabled without changing the UI: | |
| ```bash | |
| OBJECTVERSE_VISION_BACKEND=minicpm-v \ | |
| VISION_MODEL_ID=openbmb/MiniCPM-V-2_6 \ | |
| OBJECTVERSE_TEXT_BACKEND=mock \ | |
| .venv/bin/python app.py | |
| ``` | |
| This only replaces object understanding. Persona generation, diary generation, and chat can remain mock or use the optional llama.cpp text path below. | |
| Optional llama.cpp text generation can be enabled without changing the UI: | |
| ```bash | |
| OBJECTVERSE_TEXT_BACKEND=llama-cpp \ | |
| TEXT_MODEL_PATH=/absolute/path/to/text-model.gguf \ | |
| .venv/bin/python app.py | |
| ``` | |
| For a hosted Space where the GGUF is stored on Hugging Face Hub instead of the local filesystem, configure the Hub source instead of `TEXT_MODEL_PATH`: | |
| ```bash | |
| OBJECTVERSE_TEXT_BACKEND=llama-cpp | |
| TEXT_MODEL_REPO_ID=qqyule/objectverse-diary-qwen15b-lora | |
| TEXT_MODEL_FILENAME=objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf | |
| ``` | |
| `TEXT_MODEL_REVISION` is optional and defaults to the repository's default branch on the Hub. If `TEXT_MODEL_PATH` is set, it takes precedence over the Hub download variables. | |
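The precedence rule above can be sketched as a small resolver. This is a minimal illustration, not the project's actual code: the function name `resolve_text_model_source` and the returned tuple shape are hypothetical, and only the environment variable names come from this document.

```python
from typing import Mapping, Optional, Tuple


def resolve_text_model_source(env: Mapping[str, str]) -> Optional[Tuple[str, dict]]:
    """Return ("path", {...}) or ("hub", {...}), or None when nothing is configured.

    Hypothetical helper: it only mirrors the documented precedence rule.
    """
    local_path = env.get("TEXT_MODEL_PATH", "").strip()
    if local_path:
        # A local path wins over any Hub download variables.
        return "path", {"path": local_path}

    repo_id = env.get("TEXT_MODEL_REPO_ID", "").strip()
    filename = env.get("TEXT_MODEL_FILENAME", "").strip()
    if repo_id and filename:
        # An empty revision becomes None, meaning the repo's default branch.
        revision = env.get("TEXT_MODEL_REVISION", "").strip() or None
        return "hub", {"repo_id": repo_id, "filename": filename, "revision": revision}

    # Nothing usable is configured; the caller would fall back to mock text.
    return None
```

The keys of the `"hub"` dictionary were chosen to match the `repo_id`, `filename`, and `revision` keyword arguments of `huggingface_hub.hf_hub_download`, so a caller could pass them through with `**kwargs`.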
| `llama-cpp-python` and `huggingface_hub` are installed as Space runtime dependencies. A missing package, a missing model path, a download error, a model loading error, invalid JSON, or a schema validation error each triggers a fallback to deterministic mock text generation. | |
| The runtime trace intentionally records only whether an external GGUF path was configured, not the literal `TEXT_MODEL_PATH`, so local private paths do not leak into public traces. | |
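One way to keep the literal path out of traces is to record a boolean only. The sketch below assumes a hypothetical helper name and a hypothetical trace field, `external_gguf_configured`; the real trace schema is not shown in this document.

```python
from typing import Mapping


def text_model_trace_metadata(env: Mapping[str, str]) -> dict:
    """Build trace metadata that never contains the literal model path."""
    # Only the fact that a path was set is recorded, not its value.
    configured = bool(env.get("TEXT_MODEL_PATH", "").strip())
    return {"external_gguf_configured": configured}  # hypothetical field name
```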
| Local LoRA v2 GGUF status: | |
| - Base model: `Qwen/Qwen2.5-1.5B-Instruct` | |
| - Adapter / GGUF repo: `qqyule/objectverse-diary-qwen15b-lora` | |
| - Published GGUF: `objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf` | |
| - Local smoke: passed on 2026-06-08 with `llama-cpp text generation` and no `text-fallback-to-mock` | |
| - Space runtime: live MiniCPM-V vision with mock text; not switched to llama.cpp text until a separate Space validation passes | |
| ## Runtime Diagnostics | |
| The Gradio app exposes two hidden diagnostic APIs: | |
| - `/zero_gpu_probe`: checks Torch import and CUDA visibility. | |
| - `/vision_runtime_probe`: checks configured vision backend, Torch/Transformers import, CUDA/MPS visibility, and MiniCPM-V load success or sanitized failure summaries. | |
| These APIs are for validation scripts and are not visible in the main UI. They must not return tokens, `.env` paths, Hugging Face token markers, or private local filesystem paths. | |
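A sanitizer for probe failure summaries might look like the sketch below. It is an assumption-laden illustration rather than the app's implementation: the function name, the regular expressions, and the placeholder strings are all hypothetical, and the patterns are deliberately simple (tokens with the `hf_` prefix, absolute paths, and `.env` file references).

```python
import re

# Hugging Face user tokens start with "hf_"; the minimum length here is a guess.
_HF_TOKEN = re.compile(r"hf_[A-Za-z0-9]{8,}")
# Absolute POSIX, home-relative, or Windows drive paths. The lookbehind keeps
# repo ids such as "openbmb/MiniCPM-V-2_6" and URLs from matching.
_ABS_PATH = re.compile(r"(?<![\w.:/])(?:/|~/|[A-Za-z]:\\)[^\s'\"]+")
# Any remaining relative reference to a .env file.
_ENV_FILE = re.compile(r"[^\s'\"]*\.env\b")


def sanitize_probe_message(message: str) -> str:
    """Redact tokens and private paths from a diagnostic message."""
    message = _HF_TOKEN.sub("[redacted-token]", message)
    message = _ABS_PATH.sub("[redacted-path]", message)
    message = _ENV_FILE.sub("[redacted-path]", message)
    return message
```

Pattern-based redaction is best treated as a last line of defense; building summaries from known-safe fields avoids depending on the patterns catching everything.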
| `scripts/check_space_vlm.py` calls `/vision_runtime_probe` before the mug/keyboard/shoe validation run and writes the probe output into `docs/SPACE_VLM_REPORT.md` and `docs/SPACE_VLM_REPORT.json`. | |
| ## Optional GGUF Smoke Test | |
| Recommended LoRA v2 smoke model: | |
| ```text | |
| repo: qqyule/objectverse-diary-qwen15b-lora | |
| file: objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf | |
| local path: models/objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf | |
| ``` | |
| The `models/` directory and `*.gguf` files are ignored by Git. After downloading the file externally and installing the optional `llama-cpp-python` package, run: | |
| ```bash | |
| .venv/bin/python -B scripts/check_llama_cpp_smoke.py \ | |
| --model-path models/objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf | |
| ``` | |
| A passing smoke test must show `llama-cpp text generation` and must not include `text-fallback-to-mock` in either generation or chat fallback markers. | |
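The pass criterion can be expressed as a small predicate. This sketch assumes that the smoke script's output is available as text and that the generation and chat fallback markers are available as lists of strings; the function name and parameters are hypothetical.

```python
from typing import Iterable

REQUIRED_MARKER = "llama-cpp text generation"
FORBIDDEN_MARKER = "text-fallback-to-mock"


def smoke_passed(
    output_text: str,
    generation_fallbacks: Iterable[str],
    chat_fallbacks: Iterable[str],
) -> bool:
    """Return True only if llama.cpp generated text and nothing fell back to mock."""
    # Check both marker lists, since either path may have fallen back.
    all_markers = set(generation_fallbacks) | set(chat_fallbacks)
    return REQUIRED_MARKER in output_text and FORBIDDEN_MARKER not in all_markers
```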
| ## Environment Variables | |
| ```bash | |
| OBJECTVERSE_VISION_BACKEND=mock | |
| OBJECTVERSE_TEXT_BACKEND=mock | |
| VISION_MODEL_ID= | |
| TEXT_MODEL_PATH= | |
| TEXT_MODEL_REPO_ID= | |
| TEXT_MODEL_FILENAME= | |
| TEXT_MODEL_REVISION= | |
| TRACE_OUTPUT_DIR=data/traces | |
| ``` | |
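As an illustration of how these variables and their defaults could be read in one place, here is a minimal sketch. The `RuntimeConfig` dataclass and `load_runtime_config` function are hypothetical names; only the variable names and the defaults (`mock`, `mock`, `data/traces`) come from the block above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RuntimeConfig:
    vision_backend: str
    text_backend: str
    vision_model_id: Optional[str]
    text_model_path: Optional[str]
    text_model_repo_id: Optional[str]
    text_model_filename: Optional[str]
    text_model_revision: Optional[str]
    trace_output_dir: str


def load_runtime_config(env: Mapping[str, str] = os.environ) -> RuntimeConfig:
    """Read runtime settings, treating unset and empty variables the same way."""

    def opt(name: str) -> Optional[str]:
        # Empty strings (as in the template above) count as "not set".
        return env.get(name, "").strip() or None

    return RuntimeConfig(
        vision_backend=opt("OBJECTVERSE_VISION_BACKEND") or "mock",
        text_backend=opt("OBJECTVERSE_TEXT_BACKEND") or "mock",
        vision_model_id=opt("VISION_MODEL_ID"),
        text_model_path=opt("TEXT_MODEL_PATH"),
        text_model_repo_id=opt("TEXT_MODEL_REPO_ID"),
        text_model_filename=opt("TEXT_MODEL_FILENAME"),
        text_model_revision=opt("TEXT_MODEL_REVISION"),
        trace_output_dir=opt("TRACE_OUTPUT_DIR") or "data/traces",
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests without touching the process environment.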
| For the live hosted Space, set these Variables: | |
| ```bash | |
| OBJECTVERSE_VISION_BACKEND=minicpm-v | |
| VISION_MODEL_ID=openbmb/MiniCPM-V-2_6 | |
| OBJECTVERSE_TEXT_BACKEND=mock | |
| ``` | |
| The recommended Space hardware for this path is ZeroGPU `zero-a10g`. If live validation fails, use the rollback command in `docs/DEVELOPMENT_STATUS.md` to switch `OBJECTVERSE_VISION_BACKEND` back to `mock` and request `cpu-basic`. | |
| For a Space or local runtime with a separately provided GGUF text model, set: | |
| ```bash | |
| OBJECTVERSE_TEXT_BACKEND=llama-cpp | |
| TEXT_MODEL_PATH=/absolute/path/to/text-model.gguf | |
| ``` | |
| For a Space runtime that should download the published LoRA v2 GGUF from Hub, set: | |
| ```bash | |
| OBJECTVERSE_VISION_BACKEND=mock | |
| OBJECTVERSE_TEXT_BACKEND=llama-cpp | |
| TEXT_MODEL_REPO_ID=qqyule/objectverse-diary-qwen15b-lora | |
| TEXT_MODEL_FILENAME=objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf | |
| ``` | |
| Do not commit GGUF files or private model paths. | |
| ## Future Runtime Boundary | |
| The next implementation phase should keep the same pipeline boundary: | |
| 1. UI calls `src/pipeline.py`. | |
| 2. `src/pipeline.py` calls the configured vision and text runners. | |
| 3. Runners return validated Pydantic schemas. | |
| 4. Trace logging records backend metadata and fallback markers. | |
| Do not move model calls into `src/ui/layout.py`. | |
| ## Fallback Rules | |
| - VLM unavailable: use manual description and mock/example gallery path. | |
| - llama.cpp unavailable: use mock text generation path and record `text-fallback-to-mock`. | |
| - invalid model JSON: repair and validate before rendering, then fall back to mock if validation fails. | |
| - private input: anonymize trace text before saving public traces. | |
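The invalid-JSON rule (repair, validate, then fall back) can be sketched as follows. Everything here is an assumption made for illustration: the repair step simply extracts the outermost `{...}` span, the required keys `title` and `body` are hypothetical stand-ins for the real Pydantic schema, and the function names are invented.

```python
import json
from typing import Callable, List, Optional, Tuple

REQUIRED_KEYS = ("title", "body")  # hypothetical stand-in for the real schema


def _parse_or_repair(raw: str) -> Optional[dict]:
    """Parse JSON, retrying once on the outermost {...} span if parsing fails."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        start, end = raw.find("{"), raw.rfind("}")
        if start == -1 or end <= start:
            return None
        try:
            parsed = json.loads(raw[start : end + 1])
        except json.JSONDecodeError:
            return None
    return parsed if isinstance(parsed, dict) else None


def text_with_fallback(raw: str, mock: Callable[[], dict]) -> Tuple[dict, List[str]]:
    """Return (payload, fallback_markers) for one model response."""
    parsed = _parse_or_repair(raw)
    if parsed is not None and all(isinstance(parsed.get(k), str) for k in REQUIRED_KEYS):
        return parsed, []
    # Repair or validation failed: use the deterministic mock and record why.
    return mock(), ["text-fallback-to-mock"]
```

Returning the marker list alongside the payload lets the caller append it to the trace without the runner needing to know how traces are written.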
| Trace fallback markers: | |
| - `mock-runtime`: default mock vision and mock text runtime. | |
| - `mock-text-runtime`: real or configured vision path with mock text generation. | |
| - `mock-vision-runtime`: mock vision with a configured non-mock text backend. | |
| - `vision-fallback-to-mock`: MiniCPM-V failed or returned invalid JSON, so mock object understanding was used. | |
| - `text-fallback-to-mock`: llama.cpp was configured but unavailable, invalid, or unable to return schema-valid JSON. | |
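The marker definitions above can be read as a small decision table. The sketch below is one possible reading, not the project's code: the function name and the `vision_failed` / `text_failed` flags are hypothetical, and it assumes a configuration marker and a fallback marker may appear together in the same trace.

```python
from typing import List


def runtime_markers(
    vision_backend: str,
    text_backend: str,
    vision_failed: bool = False,
    text_failed: bool = False,
) -> List[str]:
    """Derive trace fallback markers from the configured backends and failures."""
    markers: List[str] = []
    vision_mock = vision_backend == "mock"
    text_mock = text_backend == "mock"

    # Configuration markers: which side, if any, is running on mock by design.
    if vision_mock and text_mock:
        markers.append("mock-runtime")
    elif text_mock:
        markers.append("mock-text-runtime")
    elif vision_mock:
        markers.append("mock-vision-runtime")

    # Failure markers: a configured real backend fell back at run time.
    if not vision_mock and vision_failed:
        markers.append("vision-fallback-to-mock")
    if not text_mock and text_failed:
        markers.append("text-fallback-to-mock")
    return markers
```

Under this reading, the live Space configuration (`minicpm-v` vision with `mock` text) yields `mock-text-runtime`, plus `vision-fallback-to-mock` when MiniCPM-V fails.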