# Runtime Configuration

## Current Runtime

Local development defaults to deterministic mock paths:

- `OBJECTVERSE_VISION_BACKEND=mock`
- `OBJECTVERSE_TEXT_BACKEND=mock`

For local runs, this means:

- object understanding is generated by `src/models/vision_runner.py`
- persona, diary, and chat are generated by `src/models/llama_cpp_runner.py`
- traces record `mock-runtime` in the `fallbacks` field

No commercial cloud AI APIs are used.

The public Hugging Face Space is configured differently for the live demo:

```bash
OBJECTVERSE_VISION_BACKEND=minicpm-v
VISION_MODEL_ID=openbmb/MiniCPM-V-2_6
OBJECTVERSE_TEXT_BACKEND=mock
```

The Space should run on `zero-a10g` so `@spaces.GPU` can allocate GPU time for MiniCPM-V requests. The required `HF_TOKEN` for gated `openbmb/MiniCPM-V-2_6` access is stored as a Space Secret and must not be committed.

For a local run, MiniCPM-V 2.6 vision can be enabled without changing the UI:

```bash
OBJECTVERSE_VISION_BACKEND=minicpm-v \
VISION_MODEL_ID=openbmb/MiniCPM-V-2_6 \
OBJECTVERSE_TEXT_BACKEND=mock \
.venv/bin/python app.py
```

This only replaces object understanding. Persona generation, diary generation, and chat can remain mock or use the optional llama.cpp text path below.

Optional llama.cpp text generation can be enabled without changing the UI:

```bash
OBJECTVERSE_TEXT_BACKEND=llama-cpp \
TEXT_MODEL_PATH=/absolute/path/to/text-model.gguf \
.venv/bin/python app.py
```

For a hosted Space where the GGUF is stored on Hugging Face Hub instead of the local filesystem, configure the Hub source instead of `TEXT_MODEL_PATH`:

```bash
OBJECTVERSE_TEXT_BACKEND=llama-cpp
TEXT_MODEL_REPO_ID=qqyule/objectverse-diary-qwen15b-lora
TEXT_MODEL_FILENAME=objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
```

`TEXT_MODEL_REVISION` is optional and defaults to the repo's default branch on the Hub. If `TEXT_MODEL_PATH` is set, it takes precedence over the Hub download variables.
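The precedence rule above can be sketched as a small resolver. This is a minimal illustration, not the project's actual code: the function name `resolve_text_model_source` and the shape of its return value are assumptions; only the environment variable names come from this document.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_text_model_source(env: Mapping[str, str]) -> Optional[Tuple[str, dict]]:
    """Return ("path", {...}), ("hub", {...}), or None when nothing is configured.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    path = env.get("TEXT_MODEL_PATH", "").strip()
    if path:
        # A local path wins over any Hub variables.
        return ("path", {"model_path": path})

    repo_id = env.get("TEXT_MODEL_REPO_ID", "").strip()
    filename = env.get("TEXT_MODEL_FILENAME", "").strip()
    if repo_id and filename:
        # An empty revision means "use the repo's default branch".
        revision = env.get("TEXT_MODEL_REVISION", "").strip() or None
        return ("hub", {"repo_id": repo_id, "filename": filename, "revision": revision})

    # Neither source is configured; the caller would fall back to mock text.
    return None


if __name__ == "__main__":
    print(resolve_text_model_source(os.environ))
```

The keys in the `"hub"` dictionary match the `repo_id`, `filename`, and `revision` keyword arguments of `huggingface_hub.hf_hub_download`, so a caller could pass them through unchanged.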

`llama-cpp-python` and `huggingface_hub` are installed by the Space runtime dependencies. A missing package, a missing model path, download errors, model loading errors, invalid JSON, and schema validation errors all trigger a fallback to deterministic mock text generation.
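One way to picture this fallback behavior is a wrapper that treats every failure the same way. The sketch below is an assumption about the shape of such a wrapper, not the code in `src/models/llama_cpp_runner.py`; `mock_text`, `generate_with_fallback`, and the `diary` key are hypothetical names chosen for the example.

```python
import json
from typing import Callable, List, Tuple


def mock_text(prompt: str) -> dict:
    """Deterministic stand-in for the mock text path (hypothetical)."""
    return {"diary": f"mock entry for {prompt}"}


def generate_with_fallback(
    prompt: str,
    real_generate: Callable[[str], str],
    required_keys: Tuple[str, ...] = ("diary",),
) -> Tuple[dict, List[str]]:
    """Try the real backend; on any failure return mock output plus a marker."""
    try:
        # May raise ImportError, OSError, or a backend-specific error.
        raw = real_generate(prompt)
        # Invalid JSON raises json.JSONDecodeError, a ValueError subclass.
        data = json.loads(raw)
        # A very small stand-in for schema validation.
        if not isinstance(data, dict) or any(k not in data for k in required_keys):
            raise ValueError("schema validation failed")
        return data, []
    except Exception:
        # Every failure mode collapses to the same deterministic result.
        return mock_text(prompt), ["text-fallback-to-mock"]
```

Catching `Exception` broadly is deliberate in this sketch: the text above lists several unrelated failure causes that should all end in the same mock output.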

The runtime trace intentionally records only whether an external GGUF path was configured, not the literal `TEXT_MODEL_PATH`, so local private paths do not leak into public traces.
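As an illustration of that rule, trace metadata can carry a boolean instead of the path. The function and field names below (`text_backend_trace`, `external_gguf_configured`) are assumptions made for this sketch; the real trace schema may differ.

```python
from typing import Mapping


def text_backend_trace(env: Mapping[str, str]) -> dict:
    """Build trace metadata that never contains the literal model path.

    Field names are hypothetical and chosen only for this example.
    """
    return {
        "text_backend": env.get("OBJECTVERSE_TEXT_BACKEND", "mock"),
        # Record only whether a path was set, not the path itself.
        "external_gguf_configured": bool(env.get("TEXT_MODEL_PATH", "").strip()),
    }
```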

Local LoRA v2 GGUF status:

- Base model: `Qwen/Qwen2.5-1.5B-Instruct`
- Adapter / GGUF repo: `qqyule/objectverse-diary-qwen15b-lora`
- Published GGUF: `objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf`
- Local smoke: passed on 2026-06-08 with `llama-cpp text generation` and no `text-fallback-to-mock`
- Space runtime: live MiniCPM-V vision with mock text; not switched to llama.cpp text until a separate Space validation passes

## Runtime Diagnostics

The Gradio app exposes two hidden diagnostic APIs:

- `/zero_gpu_probe`: checks Torch import and CUDA visibility.
- `/vision_runtime_probe`: checks configured vision backend, Torch/Transformers import, CUDA/MPS visibility, and MiniCPM-V load success or sanitized failure summaries.

These APIs are for validation scripts and are not visible in the main UI. They must not return tokens, `.env` paths, Hugging Face token markers, or private local filesystem paths.
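A sanitizer for failure summaries could look roughly like the following. This is a hedged sketch, not the probes' actual implementation: `sanitize_summary` and both regular expressions are assumptions, and the patterns only cover `hf_`-prefixed tokens and POSIX-style absolute paths.

```python
import re

# Hugging Face user access tokens start with "hf_"; the length bound is a guess.
_TOKEN_RE = re.compile(r"hf_[A-Za-z0-9]{8,}")
# POSIX-style absolute paths with at least two segments, e.g. /home/user/x.
# The lookbehind keeps repo ids such as "org/model" and URLs untouched.
_PATH_RE = re.compile(r"(?<![\w.:/])/(?:[\w.\-]+/)+[\w.\-]+")


def sanitize_summary(message: str) -> str:
    """Redact token-like strings and absolute paths from a failure summary."""
    message = _TOKEN_RE.sub("[redacted-token]", message)
    return _PATH_RE.sub("[redacted-path]", message)
```

Pattern-based redaction like this is best treated as a second line of defense; not putting secrets into error messages in the first place is more reliable.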

`scripts/check_space_vlm.py` calls `/vision_runtime_probe` before the mug/keyboard/shoe validation run and writes the probe output into `docs/SPACE_VLM_REPORT.md` and `docs/SPACE_VLM_REPORT.json`.

## Optional GGUF Smoke Test

Recommended LoRA v2 smoke model:

```text
repo: qqyule/objectverse-diary-qwen15b-lora
file: objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
local path: models/objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
```

The `models/` directory and `*.gguf` files are ignored by Git. After downloading the file externally and installing the optional `llama-cpp-python` package, run:

```bash
.venv/bin/python -B scripts/check_llama_cpp_smoke.py \
  --model-path models/objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
```

A passing smoke test must show `llama-cpp text generation` and must not include `text-fallback-to-mock` in either generation or chat fallback markers.

## Environment Variables

```bash
OBJECTVERSE_VISION_BACKEND=mock
OBJECTVERSE_TEXT_BACKEND=mock
VISION_MODEL_ID=
TEXT_MODEL_PATH=
TEXT_MODEL_REPO_ID=
TEXT_MODEL_FILENAME=
TEXT_MODEL_REVISION=
TRACE_OUTPUT_DIR=data/traces
```
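For orientation, the defaults listed above could be read into a typed config object as follows. This is a minimal sketch under stated assumptions: `RuntimeConfig` and `load_config` are hypothetical names, and treating an empty value the same as an unset one is a choice made for the example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RuntimeConfig:
    vision_backend: str
    text_backend: str
    vision_model_id: Optional[str]
    text_model_path: Optional[str]
    text_model_repo_id: Optional[str]
    text_model_filename: Optional[str]
    text_model_revision: Optional[str]
    trace_output_dir: str


def load_config(env: Optional[Mapping[str, str]] = None) -> RuntimeConfig:
    """Read the documented variables, applying the documented defaults."""
    env = os.environ if env is None else env

    def opt(name: str) -> Optional[str]:
        # Treat unset and empty values the same way.
        return env.get(name, "").strip() or None

    return RuntimeConfig(
        vision_backend=opt("OBJECTVERSE_VISION_BACKEND") or "mock",
        text_backend=opt("OBJECTVERSE_TEXT_BACKEND") or "mock",
        vision_model_id=opt("VISION_MODEL_ID"),
        text_model_path=opt("TEXT_MODEL_PATH"),
        text_model_repo_id=opt("TEXT_MODEL_REPO_ID"),
        text_model_filename=opt("TEXT_MODEL_FILENAME"),
        text_model_revision=opt("TEXT_MODEL_REVISION"),
        trace_output_dir=opt("TRACE_OUTPUT_DIR") or "data/traces",
    )
```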

For the live hosted Space, set these Variables:

```bash
OBJECTVERSE_VISION_BACKEND=minicpm-v
VISION_MODEL_ID=openbmb/MiniCPM-V-2_6
OBJECTVERSE_TEXT_BACKEND=mock
```

The recommended Space hardware for this path is ZeroGPU `zero-a10g`. If live validation fails, use the rollback command in `docs/DEVELOPMENT_STATUS.md` to switch `OBJECTVERSE_VISION_BACKEND` back to `mock` and request `cpu-basic`.

For a Space or local runtime with a separately provided GGUF text model, set:

```bash
OBJECTVERSE_TEXT_BACKEND=llama-cpp
TEXT_MODEL_PATH=/absolute/path/to/text-model.gguf
```

For a Space runtime that should download the published LoRA v2 GGUF from Hub, set:

```bash
OBJECTVERSE_VISION_BACKEND=mock
OBJECTVERSE_TEXT_BACKEND=llama-cpp
TEXT_MODEL_REPO_ID=qqyule/objectverse-diary-qwen15b-lora
TEXT_MODEL_FILENAME=objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
```

Do not commit GGUF files or private model paths.

## Future Runtime Boundary

The next implementation phase should keep the same pipeline boundary:

1. UI calls `src/pipeline.py`.
2. `src/pipeline.py` calls the configured vision and text runners.
3. Runners return validated Pydantic schemas.
4. Trace logging records backend metadata and fallback markers.

Do not move model calls into `src/ui/layout.py`.
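The boundary above can be sketched in a few lines. To stay dependency-free, this example uses plain dataclasses in place of the Pydantic schemas, and every name in it (`ObjectUnderstanding`, `DiaryEntry`, `run_pipeline`, the runner signatures) is hypothetical rather than taken from `src/pipeline.py`.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ObjectUnderstanding:
    """Stand-in for the validated vision schema."""
    label: str


@dataclass
class DiaryEntry:
    """Stand-in for the validated text schema."""
    text: str


# Each runner returns its schema plus any fallback markers it produced.
VisionRunner = Callable[[str], Tuple[ObjectUnderstanding, List[str]]]
TextRunner = Callable[[ObjectUnderstanding], Tuple[DiaryEntry, List[str]]]


def run_pipeline(
    image_ref: str,
    vision: VisionRunner,
    text: TextRunner,
    trace_log: List[dict],
) -> DiaryEntry:
    """Single entry point the UI would call; model calls stay behind the runners."""
    understanding, vision_fallbacks = vision(image_ref)
    entry, text_fallbacks = text(understanding)
    # Trace logging sees only metadata and markers, never the models themselves.
    trace_log.append({"fallbacks": vision_fallbacks + text_fallbacks})
    return entry
```

Because the UI layer only ever sees `run_pipeline`, swapping a mock runner for a real one does not touch layout code.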

## Fallback Rules

- VLM unavailable: use manual description and mock/example gallery path.
- llama.cpp unavailable: use mock text generation path and record `text-fallback-to-mock`.
- invalid model JSON: repair and validate before rendering, then fall back to mock if validation fails.
- private input: anonymize trace text before saving public traces.

Trace fallback markers:

- `mock-runtime`: default mock vision and mock text runtime.
- `mock-text-runtime`: real or configured vision path with mock text generation.
- `mock-vision-runtime`: mock vision with a configured non-mock text backend.
- `vision-fallback-to-mock`: MiniCPM-V failed or returned invalid JSON, so mock object understanding was used.
- `text-fallback-to-mock`: llama.cpp was configured but unavailable, invalid, or unable to return schema-valid JSON.
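The marker definitions above can be read as a small decision function. The sketch below is one interpretation of those definitions, not the project's trace code: `runtime_marker`, `fallback_markers`, and the `vision_failed` / `text_failed` flags are hypothetical, and combining a runtime marker with a failure marker in one list is an assumption.

```python
from typing import List, Optional


def runtime_marker(vision_backend: str, text_backend: str) -> Optional[str]:
    """Pick the marker that describes which configured backends are mock."""
    vision_mock = vision_backend == "mock"
    text_mock = text_backend == "mock"
    if vision_mock and text_mock:
        return "mock-runtime"
    if text_mock:
        return "mock-text-runtime"
    if vision_mock:
        return "mock-vision-runtime"
    # Both backends are real, so no mock marker applies.
    return None


def fallback_markers(
    vision_backend: str,
    text_backend: str,
    vision_failed: bool = False,
    text_failed: bool = False,
) -> List[str]:
    """Combine the configuration marker with any runtime failure markers."""
    markers: List[str] = []
    configured = runtime_marker(vision_backend, text_backend)
    if configured:
        markers.append(configured)
    # Failure markers only make sense when a non-mock backend was configured.
    if vision_failed and vision_backend != "mock":
        markers.append("vision-fallback-to-mock")
    if text_failed and text_backend != "mock":
        markers.append("text-fallback-to-mock")
    return markers
```

With the live Space configuration described earlier (`minicpm-v` vision, `mock` text), this sketch yields `mock-text-runtime`, plus `vision-fallback-to-mock` if MiniCPM-V fails.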