qqyule committed
Commit 1e2c036 · verified · 1 Parent(s): cb80875

Deploy latest Objectverse Diary version

.codex/config.toml ADDED
@@ -0,0 +1,2 @@
1
+ [sandbox_workspace_write]
2
+ network_access = false
README.md CHANGED
@@ -23,9 +23,13 @@ Upload a photo of any everyday object. The app wakes it up, gives it a secret pe
23
 
24
  ## Current Status
25
 
26
- Initial mock MVP, MiniCPM-V vision backend wiring, and optional llama.cpp text runtime wiring are available.
27
 
28
- By default, the app still uses deterministic mock outputs for object understanding, persona generation, diary writing, chat replies, share card rendering, and trace saving. `OBJECTVERSE_VISION_BACKEND=minicpm-v` enables the real MiniCPM-V 2.6 vision path. `OBJECTVERSE_TEXT_BACKEND=llama-cpp` can use a local GGUF model through optional `llama-cpp-python` when `TEXT_MODEL_PATH` is configured.
 
 
 
 
29
 
30
  Hugging Face Space:
31
 
@@ -51,21 +55,34 @@ The interface is English-first and Chinese-second.
51
 
52
  ## Badge Targets
53
 
54
- - [ ] Off the Grid
55
- - [ ] Well-Tuned
56
- - [ ] Off-Brand
57
- - [ ] Llama Champion
58
- - [ ] Sharing is Caring
59
- - [ ] Field Notes
60
- - [ ] OpenBMB Special
61
 
62
  ## Planned Model Stack
63
 
64
- - Vision: MiniCPM-V or lightweight VLM fallback
65
- - Text: fine-tuned small LLM
66
  - Runtime: llama.cpp / llama-cpp-python
67
  - UI: Gradio Blocks
68
 
69
  ## Run Locally
70
 
71
  ```bash
@@ -90,7 +107,7 @@ If `llama-cpp-python` is missing, `TEXT_MODEL_PATH` is empty, the model cannot l
90
 
91
  ## Initial MVP Flow
92
 
93
- The current implementation supports:
94
 
95
  - image upload
96
  - optional object description
@@ -104,6 +121,19 @@ The current implementation supports:
104
  - anonymized trace JSON saved under `data/traces/`
105
  - six public mock sample traces under `data/traces/samples/`
106
 
107
  ## Generate Sample Traces
108
 
109
  ```bash
 
23
 
24
  ## Current Status
25
 
26
+ Stable mock-safe submission baseline, MiniCPM-V vision backend wiring, optional llama.cpp text runtime wiring, public mock traces, and Space validation evidence are available.
27
 
28
+ By default, the app uses deterministic mock outputs for object understanding, persona generation, diary writing, chat replies, share card rendering, and trace saving. This keeps the public demo reproducible and avoids commercial AI APIs.
29
+
30
+ `OBJECTVERSE_VISION_BACKEND=minicpm-v` enables the optional MiniCPM-V 2.6 vision path. The hosted ZeroGPU validation on June 8, 2026 reached the Space but fell back to mock vision for all three public test images; this is documented in `docs/SPACE_VLM_REPORT.md` and `docs/FAILURES.md`.
31
+
32
+ `OBJECTVERSE_TEXT_BACKEND=llama-cpp` can use a local GGUF model through optional `llama-cpp-python` when `TEXT_MODEL_PATH` is configured. No GGUF file or fine-tuned model is committed in this stable submission baseline.
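
The backend selection described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `pick_backend` and `resolve_backends` are hypothetical, and the only details taken from the README are the environment variable names, the `mock`, `minicpm-v`, and `llama-cpp` values, and the rule that the llama.cpp path needs `TEXT_MODEL_PATH`.

```python
import os
from typing import Mapping

# Backend names taken from the README; anything else is treated as unknown.
VISION_BACKENDS = {"mock", "minicpm-v"}
TEXT_BACKENDS = {"mock", "llama-cpp"}


def pick_backend(env: Mapping[str, str], key: str, allowed: set) -> str:
    """Return the configured backend, or "mock" when unset or unrecognized."""
    value = env.get(key, "").strip().lower()
    return value if value in allowed else "mock"


def resolve_backends(env: Mapping[str, str] = os.environ) -> dict:
    """Hypothetical helper: resolve both backends with mock-safe defaults."""
    vision = pick_backend(env, "OBJECTVERSE_VISION_BACKEND", VISION_BACKENDS)
    text = pick_backend(env, "OBJECTVERSE_TEXT_BACKEND", TEXT_BACKENDS)
    # Assumption: llama-cpp is only usable when a GGUF path is configured.
    if text == "llama-cpp" and not env.get("TEXT_MODEL_PATH", "").strip():
        text = "mock"
    return {"vision": vision, "text": text}
```

With an empty environment this resolves to mock for both backends, which matches the documented default for the public demo.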
33
 
34
  Hugging Face Space:
35
 
 
55
 
56
  ## Badge Targets
57
 
58
+ - [x] Off-Brand archive-style Gradio UI, English-first with Chinese helper text.
59
+ - [x] Sharing is Caring — public mock traces, JSONL export, prompt templates, and failure notes.
60
+ - [x] Field Notes — article draft in `docs/FIELD_NOTES.md`.
61
+ - [ ] OpenBMB Special — MiniCPM-V wiring exists, but hosted validation currently falls back to mock vision.
62
+ - [ ] Llama Champion — llama.cpp wiring exists, but real GGUF smoke test is not complete.
63
+ - [ ] Well-Tuned — dataset preview exists, but LoRA training/model publishing is not complete.
64
+ - [ ] Off the Grid — no commercial AI APIs are used; final badge eligibility depends on hackathon review.
65
 
66
  ## Planned Model Stack
67
 
68
+ - Vision: MiniCPM-V 2.6 or deterministic mock fallback
69
+ - Text: deterministic mock text now; optional GGUF later
70
  - Runtime: llama.cpp / llama-cpp-python
71
  - UI: Gradio Blocks
72
 
73
+ ## Parameter Budget
74
+
75
+ The hackathon budget is <= 32B total model parameters.
76
+
77
+ Stable baseline:
78
+
79
+ - default vision backend: deterministic mock, 0 active model parameters
80
+ - default text backend: deterministic mock, 0 active model parameters
81
+ - optional wired vision model: MiniCPM-V 2.6, about 8B parameters when enabled
82
+ - optional text GGUF: not selected or committed yet
83
+
84
+ The stable public demo therefore stays within the 32B budget. Future GGUF or LoRA work must update `docs/MODEL_CARD.md` before being claimed in submission materials.
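
The budget arithmetic above is simple enough to check mechanically. The sketch below is illustrative only: the helper names are hypothetical, and the 8B figure is the README's approximate value for MiniCPM-V 2.6, not a measured count.

```python
# Hackathon limit from the README, in billions of parameters.
BUDGET_B = 32.0

# Active parameter counts (in billions) for the configurations the README lists.
STABLE_BASELINE = {"vision: deterministic mock": 0.0, "text: deterministic mock": 0.0}
WITH_MINICPM_V = {"vision: MiniCPM-V 2.6 (approx.)": 8.0, "text: deterministic mock": 0.0}


def total_params_b(models: dict) -> float:
    """Sum the active parameter counts of every model in a configuration."""
    return sum(models.values())


def within_budget(models: dict, budget_b: float = BUDGET_B) -> bool:
    """True when the configuration stays at or under the parameter budget."""
    return total_params_b(models) <= budget_b
```

Under these numbers, enabling MiniCPM-V leaves roughly 24B for a future text GGUF, which would need to be recorded in `docs/MODEL_CARD.md` once chosen.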
85
+
86
  ## Run Locally
87
 
88
  ```bash
 
107
 
108
  ## Initial MVP Flow
109
 
110
+ The stable submission baseline supports:
111
 
112
  - image upload
113
  - optional object description
 
121
  - anonymized trace JSON saved under `data/traces/`
122
  - six public mock sample traces under `data/traces/samples/`
123
 
124
+ ## Stable Submission Evidence
125
+
126
+ - Mock-safe Space: https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
127
+ - Initial acceptance report: `docs/INITIAL_STAGE_REPORT.md`
128
+ - Runtime notes: `docs/RUNTIME.md`
129
+ - Dataset preview notes: `docs/DATASET.md`
130
+ - Public mock traces: `data/traces/samples/`
131
+ - Trace JSONL export: `data/traces/samples/objectverse_public_mock_traces.jsonl`
132
+ - Hosted VLM failure evidence: `docs/SPACE_VLM_REPORT.md`, `docs/SPACE_VLM_REPORT.json`, `data/traces/space-vlm/`
133
+ - Field Notes draft: `docs/FIELD_NOTES.md`
134
+ - Demo video script: `docs/DEMO_VIDEO_SCRIPT.md`
135
+ - Social post draft: `docs/SOCIAL_POST.md`
136
+
137
  ## Generate Sample Traces
138
 
139
  ```bash
data/examples/README.md CHANGED
@@ -3,3 +3,7 @@
3
  Reserved for public example objects and sample outputs.
4
 
5
  Target: at least 6 example objects for the demo gallery.
3
  Reserved for public example objects and sample outputs.
4
 
5
  Target: at least 6 example objects for the demo gallery.
6
+
7
+ The Gradio example buttons first try to replay committed sample traces from
8
+ `data/traces/samples/`. If a sample trace is missing, the UI falls back to the
9
+ current generation pipeline.
data/traces/README.md CHANGED
@@ -14,6 +14,8 @@ Target: at least 6 public traces for the Sharing is Caring badge.
14
 
15
  These traces use the current mock runtime and are safe placeholders until real VLM and llama.cpp traces are available.
16
 
 
 
17
  Export the validated sample traces as JSONL:
18
 
19
  ```bash
 
14
 
15
  These traces use the current mock runtime and are safe placeholders until real VLM and llama.cpp traces are available.
16
 
17
+ `../space-vlm/` contains hosted Space validation evidence from the June 8, 2026 ZeroGPU run. Those traces are intentionally kept separate because they include `vision-fallback-to-mock` and should not replace the six stable demo samples.
18
+
19
  Export the validated sample traces as JSONL:
20
 
21
  ```bash
data/traces/space-vlm/keyboard.json ADDED
@@ -0,0 +1,50 @@
+ {
+   "created_at": "2026-06-08T02:16:51.496281Z",
+   "diary": {
+     "chinese": "今天他们又理所当然地碰了我,好像一个 keyboard 不会有边界感。我保持沉默,因为这大概是我和重力签下的合同。我的情绪是 curious and needlessly profound,秘密恐惧是 discovering that usefulness is not meaning。至少,我已经熬过了好几个所谓紧急计划。",
+     "english": "They touched me again today with the confidence of someone who has never asked a keyboard for consent. I remained still, because that is my contract with gravity. My mood is curious and needlessly profound, my secret fear is discovering that usefulness is not meaning, and my only comfort is knowing I have outlived at least three urgent plans.",
+     "title": "Secret Diary - Day 425"
+   },
+   "fallbacks": [
+     "vision-fallback-to-mock",
+     "mock-text-runtime"
+   ],
+   "input": {
+     "description": "A public Wikimedia Commons photo of a computer keyboard.",
+     "has_image": true,
+     "image_filename": "keyboard.jpg"
+   },
+   "mode": "Philosopher",
+   "model_runtime": {
+     "runtime": "vision model id: openbmb/MiniCPM-V-2_6; no llama.cpp model connected yet",
+     "text": "mock persona and diary generation",
+     "vision": "minicpm-v object understanding"
+   },
+   "object_understanding": {
+     "object": {
+       "confidence": 0.42,
+       "likely_context": "everyday human environment",
+       "name": "keyboard",
+       "visible_features": [
+         "uploaded photo provided",
+         "user-supplied description"
+       ]
+     }
+   },
+   "persona": {
+     "persona": {
+       "character_name": "Keyboard the Questioning",
+       "complaint": "I am not just a keyboard. I am an unpaid witness with excellent recall.",
+       "core_memory": "survived many quiet hours as a keyboard while humans called it normal life",
+       "mood": "curious and needlessly profound",
+       "object_name": "keyboard",
+       "secret_fear": "discovering that usefulness is not meaning",
+       "tags": [
+         "tiny ontology",
+         "useful doubt",
+         "meaning crisis"
+       ]
+     }
+   },
+   "trace_id": "c7172b10d11048008b7a9dda1159d0df"
+ }
data/traces/space-vlm/mug.json ADDED
@@ -0,0 +1,50 @@
+ {
+   "created_at": "2026-06-08T02:16:42.380173Z",
+   "diary": {
+     "chinese": "今天他们又理所当然地碰了我,好像一个 coffee mug 不会有边界感。我保持沉默,因为这大概是我和重力签下的合同。我的情绪是 tired but sarcastic,秘密恐惧是 being replaced by a newer object with worse opinions。至少,我已经熬过了好几个所谓紧急计划。",
+     "english": "They touched me again today with the confidence of someone who has never asked a coffee mug for consent. I remained still, because that is my contract with gravity. My mood is tired but sarcastic, my secret fear is being replaced by a newer object with worse opinions, and my only comfort is knowing I have outlived at least three urgent plans.",
+     "title": "Secret Diary - Day 427"
+   },
+   "fallbacks": [
+     "vision-fallback-to-mock",
+     "mock-text-runtime"
+   ],
+   "input": {
+     "description": "A public Wikimedia Commons photo of a striped coffee mug.",
+     "has_image": true,
+     "image_filename": "mug.jpg"
+   },
+   "mode": "Cynical",
+   "model_runtime": {
+     "runtime": "vision model id: openbmb/MiniCPM-V-2_6; no llama.cpp model connected yet",
+     "text": "mock persona and diary generation",
+     "vision": "minicpm-v object understanding"
+   },
+   "object_understanding": {
+     "object": {
+       "confidence": 0.42,
+       "likely_context": "everyday human environment",
+       "name": "coffee mug",
+       "visible_features": [
+         "uploaded photo provided",
+         "user-supplied description"
+       ]
+     }
+   },
+   "persona": {
+     "persona": {
+       "character_name": "CoffeeMug worth",
+       "complaint": "I am not just a coffee mug. I am an unpaid witness with excellent recall.",
+       "core_memory": "survived many quiet hours as a coffee mug while humans called it normal life",
+       "mood": "tired but sarcastic",
+       "object_name": "coffee mug",
+       "secret_fear": "being replaced by a newer object with worse opinions",
+       "tags": [
+         "desk survivor",
+         "burnt optimism",
+         "quiet judgment"
+       ]
+     }
+   },
+   "trace_id": "31462c2c83a54dc79b38cb16faaed783"
+ }
data/traces/space-vlm/shoe.json ADDED
@@ -0,0 +1,50 @@
+ {
+   "created_at": "2026-06-08T02:16:56.248616Z",
+   "diary": {
+     "chinese": "今天他们又理所当然地碰了我,好像一个 shoe 不会有边界感。我保持沉默,因为这大概是我和重力签下的合同。我的情绪是 theatrical and wounded,秘密恐惧是 being forgotten before the final act。至少,我已经熬过了好几个所谓紧急计划。",
+     "english": "They touched me again today with the confidence of someone who has never asked a shoe for consent. I remained still, because that is my contract with gravity. My mood is theatrical and wounded, my secret fear is being forgotten before the final act, and my only comfort is knowing I have outlived at least three urgent plans.",
+     "title": "Secret Diary - Day 421"
+   },
+   "fallbacks": [
+     "vision-fallback-to-mock",
+     "mock-text-runtime"
+   ],
+   "input": {
+     "description": "A public Wikimedia Commons photo of running shoes.",
+     "has_image": true,
+     "image_filename": "shoe.jpg"
+   },
+   "mode": "Dramatic",
+   "model_runtime": {
+     "runtime": "vision model id: openbmb/MiniCPM-V-2_6; no llama.cpp model connected yet",
+     "text": "mock persona and diary generation",
+     "vision": "minicpm-v object understanding"
+   },
+   "object_understanding": {
+     "object": {
+       "confidence": 0.42,
+       "likely_context": "everyday human environment",
+       "name": "shoe",
+       "visible_features": [
+         "uploaded photo provided",
+         "user-supplied description"
+       ]
+     }
+   },
+   "persona": {
+     "persona": {
+       "character_name": "Shoe von Sigh",
+       "complaint": "I am not just a shoe. I am an unpaid witness with excellent recall.",
+       "core_memory": "survived many quiet hours as a shoe while humans called it normal life",
+       "mood": "theatrical and wounded",
+       "object_name": "shoe",
+       "secret_fear": "being forgotten before the final act",
+       "tags": [
+         "tragic prop",
+         "grand entrance",
+         "minor catastrophe"
+       ]
+     }
+   },
+   "trace_id": "21ad1c2abe3b406a9e359f9f1b190552"
+ }
docs/03-dev-schedule.md CHANGED
@@ -53,12 +53,13 @@
53
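  - [x] Add example gallery
54
  - [x] Add Space VLM validation script
55
  - [x] Add ZeroGPU compatibility decorator
 
56
  - [ ] Cache example outputs
57
- - [ ] Validate the Space with real images (L4 blocked by HF `402 Payment Required`; ZeroGPU reached `RUNNING` but the validation request did not return for a long time; rolled back to mock-safe)
58
 
59
  Acceptance: upload a mug/keyboard/shoe; the model identifies the object and extracts its visible features.
60
 
61
- Completion record: MiniCPM-V 2.6 is wired in as a configurable vision backend; the default is still mock vision. `scripts/check_space_vlm.py` can validate mug/keyboard/shoe on the Space using three temporary public images. On 2026-06-06 a switch to L4 was attempted, but Hugging Face returned `402 Payment Required`, which requires organization billing/pre-paid credits; ZeroGPU was then tried, and the Space reached `RUNNING`, but the validation request did not return for a long time. A mock-safe rollback was executed after both attempts. Text generation has optional llama.cpp runtime wiring, but the final GGUF model has not been selected or downloaded.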
62
 
63
  ---
64
 
@@ -215,7 +216,7 @@ Bottom: Share Card + Trace
215
  ## Day 11: Submission Check
216
 
217
  - [ ] Space under official org
218
- - [ ] Space MiniCPM-V validation passes for mug, keyboard, and shoe
219
  - [ ] Demo video ready
220
  - [ ] Social post ready
221
  - [ ] README complete
 
53
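  - [x] Add example gallery
54
  - [x] Add Space VLM validation script
55
  - [x] Add ZeroGPU compatibility decorator
56
+ - [x] ZeroGPU CUDA probe
57
  - [ ] Cache example outputs
58
+ - [ ] Validate the Space with real images (L4 blocked by HF `402 Payment Required`; ZeroGPU CUDA probe succeeded; on 2026-06-08 the full validation reached the app but fell back to mock vision for mug/keyboard/shoe)
59
 
60
  Acceptance: upload a mug/keyboard/shoe; the model identifies the object and extracts its visible features.
61
 
62
+ Completion record: MiniCPM-V 2.6 is wired in as a configurable vision backend; the default is still mock vision. `scripts/check_space_vlm.py` can validate mug/keyboard/shoe on the Space using three temporary public images. On 2026-06-06 a switch to L4 was attempted, but Hugging Face returned `402 Payment Required`; the ZeroGPU CUDA probe subsequently succeeded. On 2026-06-08 the full validation reached the app through the direct `hf.space` path, but all three objects included `vision-fallback-to-mock`. Text generation has optional llama.cpp runtime wiring, but the final GGUF model has not been selected or downloaded.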
63
 
64
  ---
65
 
 
216
  ## Day 11: Submission Check
217
 
218
  - [ ] Space under official org
219
+ - [ ] Space MiniCPM-V validation passes for mug, keyboard, and shoe (currently wired, but hosted validation falls back to mock)
220
  - [ ] Demo video ready
221
  - [ ] Social post ready
222
  - [ ] README complete
docs/07-development-plan.md CHANGED
@@ -39,7 +39,7 @@ As of 2026-06-06, the project has:
39
  Not yet done:
40
 
41
  - GitHub repo sync / public submission confirmation
42
- - hosted Space L4 MiniCPM-V validation with real public images
43
  - real GGUF selection and local `TEXT_MODEL_PATH` smoke test
44
  - real curated dataset
45
  - LoRA fine-tuning
@@ -115,7 +115,7 @@ Verification:
115
 
116
  Goal: replace mock object recognition with a real VLM path while preserving fallback behavior.
117
 
118
- Status: local wiring complete; hosted GPU validation pending.
119
 
120
  Scope:
121
 
@@ -136,7 +136,8 @@ Verification:
136
  - Run local sample image checks.
137
  - Confirm schema validation.
138
  - Confirm fallback trace markers.
139
- - Run `scripts/check_space_vlm.py --configure-space` after external-state confirmation.
 
140
 
141
  ## Phase 4 — Text Runtime With llama.cpp
142
 
 
39
  Not yet done:
40
 
41
  - GitHub repo sync / public submission confirmation
42
+ - hosted Space MiniCPM-V validation with real public images
43
  - real GGUF selection and local `TEXT_MODEL_PATH` smoke test
44
  - real curated dataset
45
  - LoRA fine-tuning
 
115
 
116
  Goal: replace mock object recognition with a real VLM path while preserving fallback behavior.
117
 
118
+ Status: local wiring complete; hosted ZeroGPU validation reaches the app but falls back to mock vision.
119
 
120
  Scope:
121
 
 
136
  - Run local sample image checks.
137
  - Confirm schema validation.
138
  - Confirm fallback trace markers.
139
+ - Run `scripts/check_space_vlm.py --configure-space --hardware zero-a10g --rollback-to-mock` after external-state confirmation.
140
+ - Inspect Space runtime logs or add non-secret diagnostics before rerunning, because the 2026-06-08 hosted validation returned `vision-fallback-to-mock` for mug, keyboard, and shoe.
141
 
142
  ## Phase 4 — Text Runtime With llama.cpp
143
 
docs/DATASET.md CHANGED
@@ -18,6 +18,8 @@ data/train/objectverse_sft_preview.jsonl
18
 
19
  This preview is mock-generated. It is not a final training dataset and should not be described as real model output.
20
 
 
 
21
  ## Target Dataset
22
 
23
  Final target before fine-tuning:
@@ -66,6 +68,8 @@ Full candidate pool later:
66
 
67
  Manual curation should happen after generation. Do not publish the full candidate file until it has been reviewed.
68
 
 
 
69
  ## Curation Checklist
70
 
71
  - Persona stays consistent with the object.
 
18
 
19
  This preview is mock-generated. It is not a final training dataset and should not be described as real model output.
20
 
21
+ The stable submission baseline does not publish a final Hugging Face Dataset. The current JSONL file is evidence for schema and workflow readiness only.
22
+
23
  ## Target Dataset
24
 
25
  Final target before fine-tuning:
 
68
 
69
  Manual curation should happen after generation. Do not publish the full candidate file until it has been reviewed.
70
 
71
+ Space VLM validation traces under `data/traces/space-vlm/` are failure evidence because they include `vision-fallback-to-mock`. Do not mix them into curated training data or describe them as successful real VLM outputs.
72
+
73
  ## Curation Checklist
74
 
75
  - Persona stays consistent with the object.
docs/DEMO_VIDEO_SCRIPT.md ADDED
@@ -0,0 +1,108 @@
1
+ # Demo Video Script
2
+
3
+ ## Goal
4
+
5
+ Record a 90-second stable demo for Objectverse Diary using the mock-safe Hugging Face Space or local Gradio app.
6
+
7
+ Do not claim that hosted MiniCPM-V, GGUF text generation, LoRA training, or model publishing are complete. The stable demo should emphasize the product loop, Gradio Off-Brand UI, public traces, and no commercial AI APIs.
8
+
9
+ ## Recording Setup
10
+
11
+ - Use the Hugging Face Space if it is responsive: https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
12
+ - If the Space is slow, use local Gradio with default mock settings:
13
+
14
+ ```bash
15
+ .venv/bin/python app.py
16
+ ```
17
+
18
+ - Keep environment defaults:
19
+
20
+ ```bash
21
+ OBJECTVERSE_VISION_BACKEND=mock
22
+ OBJECTVERSE_TEXT_BACKEND=mock
23
+ ```
24
+
25
+ ## 90-Second Script
26
+
27
+ ### 0-8s
28
+
29
+ Voiceover:
30
+
31
+ > What if every object around you had a secret life?
32
+
33
+ Screen:
34
+
35
+ - Show the Objectverse Diary title and archive-style interface.
36
+ - Briefly show the English-first / Chinese-second subtitle.
37
+
38
+ ### 8-20s
39
+
40
+ Voiceover:
41
+
42
+ > This is Objectverse Diary, a small-model AI toy built with Gradio. It turns ordinary object photos into secret personas, diary entries, chats, and shareable cards.
43
+
44
+ Screen:
45
+
46
+ - Show the object intake panel.
47
+ - Hover or point to personality mode selection.
48
+
49
+ ### 20-35s
50
+
51
+ Voiceover:
52
+
53
+ > For the stable demo, I use deterministic mock generation so the public Space stays reproducible without commercial AI APIs.
54
+
55
+ Screen:
56
+
57
+ - Click a stable example, preferably Coffee mug or Mechanical keyboard.
58
+ - Generate or replay the cached example output.
59
+
60
+ ### 35-52s
61
+
62
+ Voiceover:
63
+
64
+ > The app creates a structured object file, then gives the object a hidden personality.
65
+
66
+ Screen:
67
+
68
+ - Show object JSON.
69
+ - Show persona JSON or object file panel.
70
+
71
+ ### 52-68s
72
+
73
+ Voiceover:
74
+
75
+ > The object writes a short English-first secret diary, with Chinese helper text underneath.
76
+
77
+ Screen:
78
+
79
+ - Scroll or focus on diary output.
80
+ - Keep the diary readable.
81
+
82
+ ### 68-82s
83
+
84
+ Voiceover:
85
+
86
+ > You can chat with the object, generate a share card, and inspect the anonymized trace.
87
+
88
+ Screen:
89
+
90
+ - Send one chat message.
91
+ - Show the share card.
92
+ - Show trace panel or trace path.
93
+
94
+ ### 82-90s
95
+
96
+ Voiceover:
97
+
98
+ > MiniCPM-V and llama.cpp paths are wired behind fallbacks, but this stable submission keeps the demo mock-safe and reproducible. Every object has a secret life.
99
+
100
+ Screen:
101
+
102
+ - End on the share card or app title.
103
+
104
+ ## Notes For Submission
105
+
106
+ - Mention MiniCPM-V as wired but not hosted-validated yet.
107
+ - Mention public traces and failure notes if the submission form asks for reproducibility.
108
+ - Keep the final video under 2 minutes.
docs/DEVELOPMENT_STATUS.md CHANGED
@@ -1,12 +1,16 @@
1
  # Development Status
2
 
3
- Last updated: 2026-06-06
4
 
5
  ## Completed
6
 
7
  - Project skeleton, README, AGENTS instructions, and Gradio app entrypoint.
8
  - Mock MVP flow: upload/description, personality mode, object JSON, persona JSON, diary, object chat, share card, and trace saving.
9
  - Archive-style Gradio UI with English-first / Chinese-second copy and six stable examples.
 
 
 
 
10
  - Trace and dataset tooling:
11
  - six public mock sample traces
12
  - public trace JSONL export
@@ -18,22 +22,29 @@ Last updated: 2026-06-06
18
  - Space VLM validation tooling:
19
  - `scripts/check_space_vlm.py`
20
  - failed L4 validation report at `docs/SPACE_VLM_REPORT.md`
 
21
  - ZeroGPU compatibility:
22
  - optional `src/utils/zero_gpu.py`
23
  - Gradio generation callback wrapped with `@zero_gpu(duration=180)`
 
 
 
 
 
 
24
  - Local tests and initial acceptance currently pass.
25
 
26
  ## Not Completed
27
 
28
- - Hosted Space MiniCPM-V validation with real public mug/keyboard/shoe images. Paid L4 was blocked by Hugging Face `402 Payment Required`; ZeroGPU reached `RUNNING` but the validation request did not return within the practical waiting window; mock-safe rollback was applied.
29
- - Stable example output caching for real VLM demos.
30
  - Real GGUF model selection, download/configuration outside Git, and `TEXT_MODEL_PATH` smoke test.
31
  - Final text model parameter count documentation.
32
  - Real model traces and curated object-persona dataset.
33
  - LoRA training, adapter/model export, GGUF conversion, and Hugging Face model publishing.
34
  - Hugging Face dataset publishing.
35
  - GitHub sync / final public repository confirmation.
36
- - Field Notes article, demo video, social post, and final submission package.
37
 
38
  ## Current Safe Defaults
39
 
@@ -44,16 +55,25 @@ Last updated: 2026-06-06
44
 
45
  ## Next Recommended Gate
46
 
47
- Unblock Hugging Face paid hardware access, or debug the ZeroGPU queue/request path with a smaller probe model, then rerun the hosted Space VLM validation:
 
 
 
 
48
 
49
  ```bash
50
  .venv/bin/python -B scripts/check_space_vlm.py \
51
  --configure-space \
 
 
52
  --space-url https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary \
53
- --output docs/SPACE_VLM_REPORT.md
 
 
 
54
  ```
55
 
56
- If Space validation fails or GPU is unavailable, roll back to mock-safe settings:
57
 
58
  ```bash
59
  .venv/bin/python -B scripts/check_space_vlm.py \
 
1
  # Development Status
2
 
3
+ Last updated: 2026-06-08
4
 
5
  ## Completed
6
 
7
  - Project skeleton, README, AGENTS instructions, and Gradio app entrypoint.
8
  - Mock MVP flow: upload/description, personality mode, object JSON, persona JSON, diary, object chat, share card, and trace saving.
9
  - Archive-style Gradio UI with English-first / Chinese-second copy and six stable examples.
10
+ - Stable demo baseline:
11
+ - example buttons replay committed sample traces before falling back to live generation
12
+ - cached and live generation share the same UI output formatter
13
+ - manual upload/description path still saves new runtime traces
14
  - Trace and dataset tooling:
15
  - six public mock sample traces
16
  - public trace JSONL export
 
22
  - Space VLM validation tooling:
23
  - `scripts/check_space_vlm.py`
24
  - failed L4 validation report at `docs/SPACE_VLM_REPORT.md`
25
+ - optional `--trace-output-dir` evidence export for validation traces
26
  - ZeroGPU compatibility:
27
  - optional `src/utils/zero_gpu.py`
28
  - Gradio generation callback wrapped with `@zero_gpu(duration=180)`
29
+ - hidden `/zero_gpu_probe` API confirms ZeroGPU CUDA availability when run through direct `hf.space` URL
30
+ - Stable submission materials:
31
+ - Field Notes draft
32
+ - demo video script
33
+ - social post draft
34
+ - stable submission guide
35
  - Local tests and initial acceptance currently pass.
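
The optional ZeroGPU decorator mentioned in the list above could be implemented along these lines. This is a sketch, not the contents of `src/utils/zero_gpu.py`, which the diff does not show. It assumes the Hugging Face `spaces` package, whose `spaces.GPU(duration=...)` decorator requests a GPU slot on ZeroGPU hardware, and it degrades to a pass-through when that package is not installed. The `describe_object` callback is a hypothetical stand-in.

```python
from typing import Callable


def zero_gpu(duration: int = 60) -> Callable:
    """Return a decorator that uses spaces.GPU when available, else a no-op."""
    try:
        import spaces  # Only installed on Hugging Face Spaces builds.
    except ImportError:
        def passthrough(fn: Callable) -> Callable:
            # Local runs: leave the function untouched.
            return fn
        return passthrough
    return spaces.GPU(duration=duration)


@zero_gpu(duration=180)
def describe_object(name: str) -> str:
    """Hypothetical generation callback standing in for the real pipeline."""
    return f"object: {name}"
```

Keeping the import inside the factory means local development and tests do not need `spaces` installed, which fits the mock-safe default.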
36
 
37
  ## Not Completed
38
 
39
+ - Hosted Space MiniCPM-V validation with real public mug/keyboard/shoe images. Paid L4 was blocked by Hugging Face `402 Payment Required`; ZeroGPU CUDA probe passed; the 2026-06-08 full ZeroGPU validation reached the app but all three objects fell back to mock vision.
40
+ - Passing real VLM demo trace capture. Failed Space VLM traces are kept as fallback evidence and do not replace mock sample traces.
41
  - Real GGUF model selection, download/configuration outside Git, and `TEXT_MODEL_PATH` smoke test.
42
  - Final text model parameter count documentation.
43
  - Real model traces and curated object-persona dataset.
44
  - LoRA training, adapter/model export, GGUF conversion, and Hugging Face model publishing.
45
  - Hugging Face dataset publishing.
46
  - GitHub sync / final public repository confirmation.
47
+ - Published Field Notes URL, recorded demo video URL, social post URL, and final public submission.
48
 
49
  ## Current Safe Defaults
50
 
 
55
 
56
  ## Next Recommended Gate
57
 
58
+ For a stable public baseline, keep the mock-safe Space as the demo path and only sync GitHub/Hugging Face after explicit confirmation.
59
+
60
+ Next model gate:
61
+
62
+ After the stable submission, optionally inspect the cause of the hosted Space MiniCPM-V failure without exposing secrets, then rerun the hosted Space VLM validation on ZeroGPU:
63
 
64
  ```bash
65
  .venv/bin/python -B scripts/check_space_vlm.py \
66
  --configure-space \
67
+ --hardware zero-a10g \
68
+ --rollback-to-mock \
69
  --space-url https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary \
70
+ --output docs/SPACE_VLM_REPORT.md \
71
+ --json-output docs/SPACE_VLM_REPORT.json \
72
+ --trace-output-dir data/traces/space-vlm \
73
+ --timeout-seconds 1200
74
  ```
75
 
76
+ If only rollback is needed, use:
77
 
78
  ```bash
79
  .venv/bin/python -B scripts/check_space_vlm.py \
docs/EXTERNAL_SETUP.md CHANGED
@@ -67,19 +67,25 @@ pinned: false
67
 
68
  Recommended runtime setup:
69
 
70
- - set `OBJECTVERSE_VISION_BACKEND=minicpm-v`
71
- - set `VISION_MODEL_ID=openbmb/MiniCPM-V-2_6`
72
- - set `OBJECTVERSE_TEXT_BACKEND=mock`
73
- - use 1x Nvidia L4 for MiniCPM-V 2.6
74
- - switch vision backend back to `mock` if GPU is unavailable
 
75
 
76
  Automated validation command after confirmation:
77
 
78
  ```bash
79
  .venv/bin/python -B scripts/check_space_vlm.py \
80
  --configure-space \
 
 
81
  --space-url https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary \
82
- --output docs/SPACE_VLM_REPORT.md
 
 
 
83
  ```
84
 
85
  Optional rollback to mock-safe settings:
@@ -100,8 +106,16 @@ The validation script must not print Hugging Face tokens. It uses three temporar
100
  - Mock-safe rollback was run afterward.
101
  - ZeroGPU compatibility was added and uploaded to the Space.
102
  - `--configure-space --hardware zero-a10g` reached `RUNNING`, and `/config` was reachable, but the validation request did not return within the practical waiting window.
103
- - Mock-safe rollback was run afterward and confirmed at `cpu-basic`.
104
- - Next unblock step: enable billing/pre-paid credits, or debug ZeroGPU with a smaller probe before retrying full MiniCPM-V validation.
105
 
106
  ## Safety Notes
107
 
 
67
 
68
  Recommended runtime setup:
69
 
70
+ - stable public demo: keep `OBJECTVERSE_VISION_BACKEND=mock` and `OBJECTVERSE_TEXT_BACKEND=mock`
71
+ - optional MiniCPM-V validation: temporarily set `OBJECTVERSE_VISION_BACKEND=minicpm-v`
72
+ - optional MiniCPM-V validation: set `VISION_MODEL_ID=openbmb/MiniCPM-V-2_6`
73
+ - optional MiniCPM-V validation: keep `OBJECTVERSE_TEXT_BACKEND=mock`
74
+ - optional MiniCPM-V validation: use ZeroGPU `zero-a10g` first; paid L4 previously returned `402 Payment Required`
75
+ - always roll back to mock-safe settings after validation unless the hosted VLM path passes reliably
76
 
77
  Automated validation command after confirmation:
78
 
79
  ```bash
80
  .venv/bin/python -B scripts/check_space_vlm.py \
81
  --configure-space \
82
+ --hardware zero-a10g \
83
+ --rollback-to-mock \
84
  --space-url https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary \
85
+ --output docs/SPACE_VLM_REPORT.md \
86
+ --json-output docs/SPACE_VLM_REPORT.json \
87
+ --trace-output-dir data/traces/space-vlm \
88
+ --timeout-seconds 1200
89
  ```
90
 
91
  Optional rollback to mock-safe settings:
 
106
  - Mock-safe rollback was run afterward.
107
  - ZeroGPU compatibility was added and uploaded to the Space.
108
  - `--configure-space --hardware zero-a10g` reached `RUNNING`, and `/config` was reachable, but the validation request did not return within the practical waiting window.
109
+ - `spaces>=0.30` and a hidden `/zero_gpu_probe` endpoint were added.
110
+ - The ZeroGPU probe succeeded through the direct `hf.space` URL with CUDA available on an NVIDIA RTX PRO 6000 Blackwell MIG device.
111
+
112
+ 2026-06-08 validation attempt:
113
+
114
+ - `--configure-space --hardware zero-a10g --rollback-to-mock` reached the app through the direct `hf.space` client path.
115
+ - Mug, keyboard, and shoe checks all returned schema-valid traces, but every trace included `vision-fallback-to-mock`.
116
+ - Evidence is saved in `docs/SPACE_VLM_REPORT.md`, `docs/SPACE_VLM_REPORT.json`, and `data/traces/space-vlm/`.
117
+ - The report records rollback to `cpu-basic` with `OBJECTVERSE_VISION_BACKEND=mock` and `OBJECTVERSE_TEXT_BACKEND=mock`.
118
+ - Next model unblock step: inspect Space runtime logs or add non-secret MiniCPM-V diagnostics before rerunning validation.
119
 
120
  ## Safety Notes
121
 
docs/FAILURES.md CHANGED
@@ -8,7 +8,7 @@ Use it for model/runtime/deployment/data issues, not for UI polish notes.
8
 
9
  ## Current Status
10
 
11
- MiniCPM-V 2.6 is wired as an optional vision backend. No hosted Space GPU failures have been observed yet because Space GPU validation is still pending.
12
 
13
  Known non-blocking warning:
14
 
@@ -29,6 +29,17 @@ Known non-blocking warning:
29
  - Evidence:
30
  ```
31
 
32
  ## Anticipated Failure Areas
33
 
34
  ### Vision Runtime
 
8
 
9
  ## Current Status
10
 
11
+ MiniCPM-V 2.6 is wired as an optional vision backend. Hosted Space ZeroGPU validation ran on 2026-06-08, but all three public object checks fell back to mock vision, so full hosted MiniCPM-V validation is still unresolved.
12
 
13
  Known non-blocking warning:
14
 
 
29
  - Evidence:
30
  ```
31
 
32
+ ## 2026-06-08 - Hosted ZeroGPU MiniCPM-V Falls Back To Mock
33
+
34
+ - Area: Hugging Face Space vision runtime.
35
+ - Reproduction: Run `scripts/check_space_vlm.py` with `--configure-space --hardware zero-a10g --rollback-to-mock` against `build-small-hackathon/ObjectverseDiary`.
36
+ - Expected: mug, keyboard, and shoe validations use `minicpm-v object understanding` without `vision-fallback-to-mock`.
37
+ - Actual: all three validations returned schema-valid traces, but every trace included `vision-fallback-to-mock`.
38
+ - Impact: hosted Space MiniCPM-V evidence is not ready for submission; stable mock demo remains usable.
39
+ - Fallback used: mock object understanding plus mock text runtime.
40
+ - Resolution: unresolved; inspect Space runtime logs or add non-secret fallback diagnostics for the MiniCPM-V load/chat exception.
41
+ - Evidence: `docs/SPACE_VLM_REPORT.md`, `docs/SPACE_VLM_REPORT.json`, and `data/traces/space-vlm/`.
42
+
43
  ## Anticipated Failure Areas
44
 
45
  ### Vision Runtime
docs/FIELD_NOTES.md CHANGED
@@ -1,79 +1,155 @@
1
- # Field Notes
2
 
3
- Working title:
4
 
5
- `Building Objectverse Diary: A Small-Model AI Toy Where Everyday Objects Come Alive`
6
 
7
- ## Status
8
 
9
- Draft source. The final article should be written after real runtime and deployment evidence are available.
 
 
 
 
10
 
11
- ## Draft Outline
12
 
13
- 1. Why I built it
14
- 2. Why Track 2
15
- 3. Why small models are enough
16
- 4. Product design
17
- 5. Model architecture
18
- 6. Gradio Off-Brand UI
19
- 7. llama.cpp runtime
20
- 8. Fine-tuning dataset
21
- 9. Traces and reproducibility
22
- 10. What failed
23
- 11. What I would improve next
24
 
25
- ## Draft Notes
26
 
27
- ### 1. Why I built it
28
 
29
- Objectverse Diary starts from a simple joke: what if the objects around us were quietly keeping emotional records of our lives? The project turns an everyday photo into a hidden object persona, a secret diary entry, a short chat, and a shareable card.
30
 
31
- ### 2. Why Track 2
 
 
 
 
32
 
33
- The experience is purely digital and depends on AI for its main interaction loop: visual understanding, persona invention, voice consistency, and diary generation. It fits the "An Adventure in Thousand Token Wood" track because the product is a strange AI-native toy rather than a conventional productivity workflow.
34
 
35
- ### 3. Why small models are enough
36
 
37
- The task does not need a giant frontier model. It needs compact object recognition, strong style constraints, schema-following, and a reliable fallback path. The project is designed around a <= 32B total parameter budget.
38
 
39
- ### 4. Product design
40
 
41
- The app is English-first and Chinese-second. The intended feeling is a mysterious archive of ordinary objects: museum labels, typewriter diary entries, and slightly uncanny object personalities.
 
 
42
 
43
- ### 5. Model architecture
44
 
45
- The planned architecture keeps model calls behind `src/pipeline.py`: vision understanding, persona generation, diary generation, chat, share card rendering, and trace logging. The current mock runtime preserves this boundary before real model integration.
46
 
47
- ### 6. Gradio Off-Brand UI
48
 
49
- The UI must remain Gradio, but should not feel like a default demo. This section should be completed after UI polish.
50
 
51
- ### 7. llama.cpp runtime
52
 
53
- The text path is planned for GGUF plus llama.cpp or llama-cpp-python. This section should include the chosen model, parameter count, local command, Space behavior, and fallback notes.
 
 
54
 
55
- ### 8. Fine-tuning dataset
56
 
57
- The dataset plan starts with deterministic preview JSONL, then moves to 200-500 candidate rows and 50 curated high-quality samples. Document provenance, curation, privacy checks, and any rejected examples.
58
 
59
- ### 9. Traces and reproducibility
60
 
61
- The app saves anonymized traces. Current public samples are mock traces; real submission traces should include model runtime metadata and fallback markers.
62
 
63
- ### 10. What failed
64
 
65
- Reserve this section for real integration failures: VLM loading, JSON validity, Space resource limits, and any dataset quality issues.
66
 
67
- ### 11. What I would improve next
 
 
 
 
68
 
69
- Likely future work: better object memory, richer conversations, stronger card rendering, more curated styles, and a larger public evaluation set.
70
 
71
- ## Evidence To Add Later
72
 
73
- - Hugging Face Space URL
74
- - GitHub repository URL
75
- - model repo URL
76
- - dataset URL
77
- - trace dataset URL
78
- - demo video URL
79
- - screenshots after UI polish
 
1
+ # Building Objectverse Diary: A Small-Model AI Toy Where Everyday Objects Come Alive
2
 
3
+ ## Status
4
 
5
+ Stable submission draft. This document is ready to adapt into the final Field Notes post after the public GitHub, demo video, and social post URLs are confirmed.
6
 
7
+ ## 1. Why I Built It
8
+
9
+ Objectverse Diary began with a small, silly question: what if the objects around us were quietly keeping emotional records of our lives?
10
+
11
+ The product loop is intentionally simple. A user uploads an everyday object photo, chooses a personality mode, and the app turns the object into a hidden character. The object gets a structured file, a secret diary entry, a short chat voice, and a shareable card.
12
+
13
+ The joke only works if the app treats ordinary objects with strange seriousness. A coffee mug is not just a mug; it is a tired witness. A keyboard is not just a keyboard; it is a percussion instrument for anxious deadlines. The app is a tiny archive for that kind of imagined life.
14
+
15
+ ## 2. Why This Fits The Track
16
+
17
+ Objectverse Diary was built for the Build Small Hackathon track "An Adventure in Thousand Token Wood." The core experience is AI-native:
18
+
19
+ - vision understanding turns a photo into structured object facts
20
+ - persona generation invents the object's hidden self
21
+ - diary generation writes in a consistent first-person voice
22
+ - chat lets the object keep that voice across replies
23
+ - trace logging makes each generation inspectable and reproducible
24
+
25
+ It is not a productivity wrapper. It is a compact AI toy with a specific emotional shape.
26
+
27
+ ## 3. Why Small Models Are Enough
28
+
29
+ This project does not need a frontier model to be interesting. It needs:
30
+
31
+ - useful object recognition
32
+ - compact structured JSON output
33
+ - a distinctive writing style
34
+ - consistent persona fields
35
+ - reliable fallback behavior
36
+ - a UI that makes the output feel intentional
37
+
38
+ The architecture is designed around a <= 32B total parameter budget. MiniCPM-V 2.6 is wired as the optional vision path, and llama.cpp is wired as the optional local text runtime. The stable public baseline still defaults to deterministic mock generation so the demo stays reproducible without commercial model APIs.
39
+
40
+ ## 4. Product Design
41
+
42
+ The interface is English-first and Chinese-second. The visual direction is a strange object archive: warm dark paper, amber highlights, museum-label copy, and typewriter-like diary output.
43
+
44
+ The product avoids a generic chatbot layout. The main flow is closer to opening an object file:
45
+
46
+ 1. intake the object
47
+ 2. generate an object record
48
+ 3. reveal the persona
49
+ 4. read the diary
50
+ 5. chat with the object
51
+ 6. export or inspect the trace
52
+
53
+ Six stable examples are included so the demo can run even when hosted model resources are unavailable.
54
+
55
+ ## 5. Architecture
56
+
57
+ The app keeps the Gradio UI separate from model execution:
58
+
59
+ - `src/ui/layout.py` builds the Gradio Blocks interface
60
+ - `src/pipeline.py` coordinates generation
61
+ - `src/models/vision_runner.py` handles mock or MiniCPM-V object understanding
62
+ - `src/models/llama_cpp_runner.py` handles mock text or optional llama.cpp text generation
63
+ - `src/traces/logger.py` writes anonymized trace records
64
+ - `src/renderer/share_card.py` renders the shareable card preview
65
+
66
+ This boundary matters. It lets the mock MVP, hosted Space validation, and future local GGUF experiments share the same data shapes and fallback markers.
67
+
68
+ ## 6. Runtime And Fallbacks
69
+
70
+ The stable baseline uses:
71
+
72
+ ```bash
73
+ OBJECTVERSE_VISION_BACKEND=mock
74
+ OBJECTVERSE_TEXT_BACKEND=mock
75
+ ```
76
+
77
+ Optional MiniCPM-V vision can be enabled with:
78
 
79
+ ```bash
80
+ OBJECTVERSE_VISION_BACKEND=minicpm-v
81
+ VISION_MODEL_ID=openbmb/MiniCPM-V-2_6
82
+ OBJECTVERSE_TEXT_BACKEND=mock
83
+ ```
84
 
85
+ Optional llama.cpp text generation can be enabled with:
86
 
87
+ ```bash
88
+ OBJECTVERSE_TEXT_BACKEND=llama-cpp
89
+ TEXT_MODEL_PATH=/absolute/path/to/text-model.gguf
90
+ ```
91
 
92
+ The fallback behavior is explicit. If MiniCPM-V fails or returns invalid JSON, the trace records `vision-fallback-to-mock`. If llama.cpp is unavailable, missing a model path, or returns invalid JSON, the trace records `text-fallback-to-mock`.
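
The fallback rule above can be sketched roughly as follows. This is a minimal illustration, not the code in `src/models/vision_runner.py`: the `understand_object` helper, the `real_runner` callable, and the shape of the mock output are all assumptions. Only the `OBJECTVERSE_VISION_BACKEND` variable, its `mock` / `minicpm-v` values, and the `vision-fallback-to-mock` marker come from this document.

```python
import json
import os

VISION_FALLBACK = "vision-fallback-to-mock"  # marker name taken from the docs


def mock_vision(description):
    # Hypothetical deterministic stand-in for the mock vision backend.
    return {"object_name": description or "unknown object", "confidence": 0.42}


def understand_object(description, real_runner=None, env=None):
    """Return (facts, fallbacks).

    `real_runner` is a hypothetical callable that returns a JSON string.
    Any exception, invalid JSON, or missing `object_name` triggers the
    mock fallback and records the marker.
    """
    env = os.environ if env is None else env
    fallbacks = []
    backend = env.get("OBJECTVERSE_VISION_BACKEND", "mock")
    if backend == "minicpm-v":
        try:
            if real_runner is None:
                raise RuntimeError("vision runner unavailable")
            facts = json.loads(real_runner(description))
            if not isinstance(facts, dict) or "object_name" not in facts:
                raise ValueError("schema mismatch")
            return facts, fallbacks
        except Exception:
            # Record the fallback instead of failing the whole generation.
            fallbacks.append(VISION_FALLBACK)
    return mock_vision(description), fallbacks
```

Returning the marker list alongside the facts keeps the fallback visible in the trace rather than hiding it behind a successful-looking result, which is the property the hosted validation relies on.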
93
 
94
+ ## 7. What Worked
95
 
96
+ The stable loop works locally and in the mock-safe Space:
97
 
98
+ - upload or choose an example object
99
+ - generate object facts, persona, diary, chat state, share card, and trace JSON
100
+ - replay six committed sample traces
101
+ - export public mock traces to JSONL
102
+ - run local unittest and initial-stage checks
103
 
104
+ The Gradio UI also moves away from the default demo feel. It is still Gradio, but the experience reads like a small archive interface.
105
 
106
+ ## 8. What Failed
107
 
108
+ The important failure is hosted MiniCPM-V validation.
109
 
110
+ Requesting paid L4 hardware under the hackathon organization returned `402 Payment Required`. ZeroGPU CUDA probing later succeeded, and the full validation command reached the hosted Space on June 8, 2026. However, the mug, keyboard, and shoe validations all fell back to mock vision. The evidence is saved in:
111
 
112
+ - `docs/SPACE_VLM_REPORT.md`
113
+ - `docs/SPACE_VLM_REPORT.json`
114
+ - `data/traces/space-vlm/`
115
 
116
+ This is not hidden in the submission. The stable baseline treats MiniCPM-V as wired but not yet validated in the hosted environment.
117
 
118
+ ## 9. Traces And Reproducibility
119
 
120
+ The project includes public mock traces for the six stable examples under `data/traces/samples/`. They are deterministic and intended for demo replay, schema validation, and public inspection.
121
 
122
+ The Space VLM traces under `data/traces/space-vlm/` are different: they are failure evidence. They show that the hosted Space reached the generation endpoint but used the mock fallback. These traces should not replace the stable mock examples.
123
 
124
+ The export command is:
125
 
126
+ ```bash
127
+ .venv/bin/python -B scripts/export_traces.py
128
+ ```
129
 
130
+ ## 10. Privacy And Safety
131
 
132
+ The project does not use OpenAI, Anthropic, Gemini, Cohere, or other commercial model APIs. It does not commit GGUF files, private images, tokens, credit codes, or `.env` files.
133
 
134
+ Trace logging anonymizes text inputs before public export. The current public traces are synthetic mock examples rather than private user photos.
135
 
136
+ ## 11. What I Would Improve Next
137
 
138
+ The next model-focused step is to inspect Space runtime logs or add non-secret MiniCPM-V diagnostics so the hosted fallback can be diagnosed without leaking credentials.
139
 
140
+ After that:
141
 
142
+ - rerun ZeroGPU MiniCPM-V validation
143
+ - choose and smoke-test a real GGUF text model
144
+ - generate and curate real training candidates
145
+ - publish a dataset and fine-tuned adapter if time allows
146
+ - record a final demo video from the stable Space
147
 
148
+ The current version is intentionally honest: it is a stable, reproducible small-model toy baseline with clear boundaries, visible failures, and a path to stronger model evidence.
149
 
150
+ ## Evidence Links To Fill Before Final Submission
151
 
152
+ - Hugging Face Space: https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
153
+ - GitHub repository: pending push confirmation
154
+ - Demo video: pending recording
155
+ - Social post: pending publishing
 
 
 
docs/FINAL_VERIFICATION_REPORT.md ADDED
@@ -0,0 +1,94 @@
1
+ # Final Verification Report
2
+
3
+ - Generated at: 2026-06-08 11:19:49 CST
4
+ - Verified source commit: `b7cb470`
5
+ - Branch: `main`
6
+ - Verification target: stable mock-safe submission baseline
7
+ - Local app URL: `http://127.0.0.1:7860/`
8
+
9
+ ## Summary
10
+
11
+ Objectverse Diary's stable mock-safe baseline is locally verifiable. The app starts with default mock backends, renders the archive-style Gradio interface, runs all six committed example objects, supports object chat, renders share cards, and exposes trace evidence.
12
+
13
+ This report does not claim that hosted MiniCPM-V validation, GGUF text generation, LoRA training, model publishing, dataset publishing, or final public submission URLs are complete.
14
+
15
+ ## Command Verification
16
+
17
+ | Check | Result | Notes |
18
+ | --- | --- | --- |
19
+ | `git status --short --untracked-files=all` | PASS | Clean before report generation. |
20
+ | `.venv/bin/python -B -m unittest discover -s tests` | PASS | 30 tests passed. Gradio 6.0 deprecation warnings are non-blocking. |
21
+ | `.venv/bin/python -B scripts/check_initial_stage.py` | PASS | Required files, runtime defaults, trace generation, sample traces, dataset preview, trace export, and Gradio build all passed. |
22
+ | `.venv/bin/python -B scripts/export_traces.py` | PASS | Exported 6 traces to `data/traces/samples/objectverse_public_mock_traces.jsonl`. |
23
+ | `git diff --check` | PASS | No whitespace errors. |
24
+
25
+ ## Browser Verification
26
+
27
+ The local app was started with:
28
+
29
+ ```bash
30
+ GRADIO_SERVER_NAME=127.0.0.1 GRADIO_SERVER_PORT=7860 .venv/bin/python app.py
31
+ ```
32
+
33
+ Browser checks:
34
+
35
+ | Scenario | Result | Evidence |
36
+ | --- | --- | --- |
37
+ | App loads at `http://127.0.0.1:7860/` | PASS | Page rendered after Gradio load state. |
38
+ | English-first / Chinese-second UI appears | PASS | Title, subtitle, section headings, and helper text visible. |
39
+ | Six example buttons visible | PASS | OVD-001 through OVD-006 visible in the Example Objects section. |
40
+ | Coffee mug example | PASS | Object file, Secret Diary, Share Card, and trace content appeared. |
41
+ | Mechanical keyboard example | PASS | Object file, Secret Diary, Share Card, and trace content appeared. |
42
+ | Running shoe example | PASS | Object file, Secret Diary, Share Card, trace content, and saved sample path appeared. |
43
+ | Desk lamp example | PASS | Expected object term, Secret Diary, Share Card, and trace saved state appeared. |
44
+ | Water bottle example | PASS | Expected object term, Secret Diary, Share Card, and trace saved state appeared. |
45
+ | Notebook example | PASS | Expected object term, Secret Diary, Share Card, and trace saved state appeared. |
46
+ | Object chat | PASS | Message `What did you see today?` returned a persona-consistent `Shoe Afterlight` reply. |
47
+ | Browser console | PASS | No warning or error logs observed during local verification. |
48
+
49
+ ## Trace Verification
50
+
51
+ - Six stable public mock sample traces remain under `data/traces/samples/`.
52
+ - The trace export JSONL was regenerated successfully.
53
+ - Hosted Space VLM traces under `data/traces/space-vlm/` remain failure evidence because they include `vision-fallback-to-mock`; they are intentionally not used as successful real VLM traces.
54
+
55
+ ## Security Scan
56
+
57
+ Scanned project docs, source, scripts, tests, and trace directories for:
58
+
59
+ - `hf_`
60
+ - `HF_TOKEN`
61
+ - `HUGGINGFACE_TOKEN`
62
+ - `BEGIN PRIVATE KEY`
63
+ - `SUPABASE_SERVICE_ROLE_KEY`
64
+ - test email pattern
65
+ - private local path markers
66
+ - `.env`
67
+
68
+ Result: PASS with known safe hits only.
69
+
70
+ Known safe hits:
71
+
72
+ - test fixtures intentionally containing `user@example.com`
73
+ - tests asserting that token markers are absent
74
+ - `scripts/check_space_vlm.py` sensitive marker constants and auth helper names
75
+ - documentation warning not to commit `.env`
76
+ - `.env.example` path shown in architecture docs
77
+
78
+ No real token, private key, credential, private image path, GGUF file, or `.env` file was found in the scanned project content.
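
A scan like the one described in this section could look roughly like the sketch below. It is an illustration under assumptions, not the tool used for this report: the `scan_text` and `scan_tree` helpers, the file-suffix filter, and the email regex (standing in for the "test email pattern") are hypothetical. The marker strings are the ones listed above.

```python
import re
from pathlib import Path

# Marker strings copied from the list in this report.
MARKERS = [
    "hf_",
    "HF_TOKEN",
    "HUGGINGFACE_TOKEN",
    "BEGIN PRIVATE KEY",
    "SUPABASE_SERVICE_ROLE_KEY",
    ".env",
]
# Assumed stand-in for the report's "test email pattern".
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def scan_text(text):
    """Return the markers (plus "email") found in one string."""
    hits = [marker for marker in MARKERS if marker in text]
    if EMAIL.search(text):
        hits.append("email")
    return hits


def scan_tree(root, suffixes=(".py", ".md", ".json", ".jsonl")):
    """Map each matching file under `root` to its list of hits."""
    results = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            hits = scan_text(path.read_text(encoding="utf-8", errors="ignore"))
            if hits:
                results[str(path)] = hits
    return results
```

Plain substring matching is deliberately noisy (for example, `.env` also matches `.env.example`), so every hit still needs the manual "known safe hit" review recorded above.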
79
+
80
+ ## Remaining External Items
81
+
82
+ - GitHub push was not performed in this verification run.
83
+ - Hugging Face Space hardware and environment variables were not changed in this verification run.
84
+ - Demo video URL is still pending recording/publication.
85
+ - Field Notes URL is still pending publication.
86
+ - Social post URL is still pending publication.
87
+ - Hosted MiniCPM-V validation still falls back to mock vision.
88
+ - Real GGUF smoke test, LoRA training, HF model publishing, and HF dataset publishing remain future work.
89
+
90
+ ## Verdict
91
+
92
+ PASS for the stable mock-safe local submission baseline.
93
+
94
+ The project is ready for explicit-confirmation external steps: push `main`, record/publish the demo video, publish Field Notes/social post, and fill final submission URLs.
docs/MODEL_CARD.md CHANGED
@@ -2,10 +2,12 @@
2
 
3
  ## Status
4
 
5
- Draft only. No text model has been fine-tuned, converted, or published yet.
6
 
7
  The app defaults to deterministic mock backends. MiniCPM-V 2.6 vision is wired as an optional runtime backend for GPU environments. Text generation has optional llama.cpp wiring for an externally configured GGUF model via `TEXT_MODEL_PATH`.
8
 
 
 
9
  ## Planned Components
10
 
11
  - Vision understanding: MiniCPM-V or lightweight fallback VLM.
@@ -16,8 +18,8 @@ The app defaults to deterministic mock backends. MiniCPM-V 2.6 vision is wired a
16
 
17
  | Component | Candidate | Notes |
18
  | --- | --- | --- |
19
- | Vision | `openbmb/MiniCPM-V-2_6` or mock fallback | Must run without commercial API calls. |
20
- | Text | externally configured GGUF, later small instruct model plus LoRA adapter | Final base model still pending. |
21
  | Runtime | optional GGUF through llama.cpp / llama-cpp-python | Wired with mock fallback; real-model smoke test still pending. |
22
  | UI | Gradio Blocks | Required by the hackathon and project rules. |
23
 
@@ -29,10 +31,11 @@ Record final numbers here before submission:
29
 
30
  | Component | Model | Parameters | Counted Toward Total |
31
  | --- | --- | ---: | --- |
32
- | Vision | MiniCPM-V 2.6 | ~8B | yes |
33
- | Text base | Externally configured GGUF, final model TBD | TBD | yes |
34
- | LoRA adapter | TBD | TBD | yes |
35
- | Total | TBD | TBD | must be <= 32B |
 
36
 
37
  ## Intended Inputs And Outputs
38
 
@@ -69,6 +72,7 @@ Current preview data is deterministic and mock-generated. It should only be used
69
  - If VLM loading fails, use manual description and stable example flow.
70
  - If llama.cpp is not installed, `TEXT_MODEL_PATH` is missing, model loading fails, or output JSON is invalid, keep deterministic mock text fallback for demo safety.
71
  - If model JSON is invalid, repair and validate before rendering.
 
72
 
73
  ## Required Notes
74
 
 
2
 
3
  ## Status
4
 
5
+ Stable submission baseline. No text model has been fine-tuned, converted, or published yet.
6
 
7
  The app defaults to deterministic mock backends. MiniCPM-V 2.6 vision is wired as an optional runtime backend for GPU environments. Text generation has optional llama.cpp wiring for an externally configured GGUF model via `TEXT_MODEL_PATH`.
8
 
9
+ Hosted MiniCPM-V validation is not passing yet. The June 8, 2026 ZeroGPU validation reached the Space, but all three public object checks fell back to mock vision. See `docs/SPACE_VLM_REPORT.md` and `docs/FAILURES.md`.
10
+
11
  ## Planned Components
12
 
13
  - Vision understanding: MiniCPM-V or lightweight fallback VLM.
 
18
 
19
  | Component | Candidate | Notes |
20
  | --- | --- | --- |
21
+ | Vision | `openbmb/MiniCPM-V-2_6` or mock fallback | Wired as optional backend; hosted validation currently falls back to mock. |
22
+ | Text | deterministic mock text; optional externally configured GGUF later | Final base model still pending. |
23
  | Runtime | optional GGUF through llama.cpp / llama-cpp-python | Wired with mock fallback; real-model smoke test still pending. |
24
  | UI | Gradio Blocks | Required by the hackathon and project rules. |
25
 
 
31
 
32
  | Component | Model | Parameters | Counted Toward Total |
33
  | --- | --- | ---: | --- |
34
+ | Vision | MiniCPM-V 2.6 optional path | ~8B | yes, when enabled |
35
+ | Text base | Stable baseline mock text | 0 | no model parameters |
36
+ | Future text base | Externally configured GGUF, final model TBD | TBD | yes, when enabled |
37
+ | Future LoRA adapter | TBD | TBD | yes, when enabled |
38
+ | Stable baseline total | Mock text + optional wired vision not active by default | 0 active model parameters by default | <= 32B |
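
The budget arithmetic in the table can be checked with a small sketch such as the one below. The `active_parameters` and `within_budget` helpers and the tuple layout are hypothetical. The roughly 8B figure for MiniCPM-V 2.6 and the 32B limit come from the table, and the future text model is left out because its size is still TBD.

```python
BUDGET = 32e9  # <= 32B total parameters, as stated in the table


def active_parameters(components):
    """Sum parameters over components that are enabled at runtime."""
    return sum(params for _, params, enabled in components if enabled)


def within_budget(components, budget=BUDGET):
    return active_parameters(components) <= budget


# (component, approximate parameter count, enabled)
stable_baseline = [
    ("vision: MiniCPM-V 2.6 (optional)", 8e9, False),
    ("text: deterministic mock", 0, True),
]
vision_enabled = [
    ("vision: MiniCPM-V 2.6 (optional)", 8e9, True),
    ("text: deterministic mock", 0, True),
]
```

With the optional vision path enabled and mock text, the total is about 8B, which leaves roughly 24B of headroom for a future text base and adapter.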
39
 
40
  ## Intended Inputs And Outputs
41
 
 
72
  - If VLM loading fails, use manual description and stable example flow.
73
  - If llama.cpp is not installed, `TEXT_MODEL_PATH` is missing, model loading fails, or output JSON is invalid, keep deterministic mock text fallback for demo safety.
74
  - If model JSON is invalid, repair and validate before rendering.
75
+ - Hosted VLM fallback evidence is preserved in `data/traces/space-vlm/` and should not be described as successful real VLM output.
76
 
77
  ## Required Notes
78
 
docs/SOCIAL_POST.md ADDED
@@ -0,0 +1,38 @@
1
+ # Social Post Draft
2
+
3
+ ## Short Version
4
+
5
+ I built Objectverse Diary for Build Small Hackathon: a Gradio app where everyday objects wake up, get secret personas, write diaries, chat with you, and generate share cards.
6
+
7
+ Stable demo: mock-safe, reproducible, no commercial AI APIs.
8
+ MiniCPM-V and llama.cpp paths are wired behind fallbacks; hosted VLM validation is documented honestly.
9
+
10
+ Space: https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
11
+
12
+ ## Longer Version
13
+
14
+ What if your coffee mug had been quietly keeping a diary?
15
+
16
+ Objectverse Diary is my Build Small Hackathon project: a strange little object archive built with Gradio. Upload an everyday object photo, choose a personality mode, and the app creates:
17
+
18
+ - a structured object file
19
+ - a hidden object persona
20
+ - an English-first secret diary with Chinese helper text
21
+ - an object chat voice
22
+ - a shareable personality card
23
+ - an anonymized trace record
24
+
25
+ The stable submission baseline is mock-safe and reproducible, with no commercial AI APIs. MiniCPM-V vision and llama.cpp text paths are wired as optional backends, and the current hosted MiniCPM-V fallback is documented instead of hidden.
26
+
27
+ Space:
28
+ https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
29
+
30
+ ## Hashtag Ideas
31
+
32
+ #BuildSmallHackathon #Gradio #SmallModels #HuggingFace #ObjectverseDiary
33
+
34
+ ## Notes Before Posting
35
+
36
+ - Add GitHub URL after push is confirmed.
37
+ - Add demo video URL after recording.
38
+ - Do not claim LoRA, GGUF smoke test, or hosted MiniCPM-V validation are complete.
docs/SPACE_VLM_REPORT.json ADDED
@@ -0,0 +1,65 @@
1
+ [
2
+ {
3
+ "key": "mug",
4
+ "label": "Coffee mug",
5
+ "source_page": "https://commons.wikimedia.org/wiki/File:Striped_coffee_mug.jpg",
6
+ "image_path": ".tmp/space-vlm-assets/mug.jpg",
7
+ "passed": false,
8
+ "object_name": "coffee mug",
9
+ "visible_features": [
10
+ "uploaded photo provided",
11
+ "user-supplied description"
12
+ ],
13
+ "likely_context": "everyday human environment",
14
+ "confidence": 0.42,
15
+ "runtime_vision": "minicpm-v object understanding",
16
+ "runtime_text": "mock persona and diary generation",
17
+ "fallbacks": [
18
+ "vision-fallback-to-mock",
19
+ "mock-text-runtime"
20
+ ],
21
+ "error": "vision fallback marker was present"
22
+ },
23
+ {
24
+ "key": "keyboard",
25
+ "label": "Computer keyboard",
26
+ "source_page": "https://commons.wikimedia.org/wiki/File:Computer_keyboard.jpg",
27
+ "image_path": ".tmp/space-vlm-assets/keyboard.jpg",
28
+ "passed": false,
29
+ "object_name": "keyboard",
30
+ "visible_features": [
31
+ "uploaded photo provided",
32
+ "user-supplied description"
33
+ ],
34
+ "likely_context": "everyday human environment",
35
+ "confidence": 0.42,
36
+ "runtime_vision": "minicpm-v object understanding",
37
+ "runtime_text": "mock persona and diary generation",
38
+ "fallbacks": [
39
+ "vision-fallback-to-mock",
40
+ "mock-text-runtime"
41
+ ],
42
+ "error": "vision fallback marker was present"
43
+ },
44
+ {
45
+ "key": "shoe",
46
+ "label": "Running shoe",
47
+ "source_page": "https://commons.wikimedia.org/wiki/File:Running_shoes.jpg",
48
+ "image_path": ".tmp/space-vlm-assets/shoe.jpg",
49
+ "passed": false,
50
+ "object_name": "shoe",
51
+ "visible_features": [
52
+ "uploaded photo provided",
53
+ "user-supplied description"
54
+ ],
55
+ "likely_context": "everyday human environment",
56
+ "confidence": 0.42,
57
+ "runtime_vision": "minicpm-v object understanding",
58
+ "runtime_text": "mock persona and diary generation",
59
+ "fallbacks": [
60
+ "vision-fallback-to-mock",
61
+ "mock-text-runtime"
62
+ ],
63
+ "error": "vision fallback marker was present"
64
+ }
65
+ ]
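
A report in this shape can be summarized with a few lines of Python. The sketch below is an assumption about how one might consume the file, not the logic in `scripts/check_space_vlm.py`; the `summarize_report` helper is hypothetical, while the `key`, `passed`, and `fallbacks` fields and the `vision-fallback-to-mock` marker are taken from the JSON above.

```python
VISION_FALLBACK = "vision-fallback-to-mock"


def summarize_report(entries):
    """Return (overall_status, failing_keys) for a list of report entries.

    An entry fails if `passed` is false or if the vision fallback marker
    appears in its `fallbacks` list.
    """
    failing = [
        entry["key"]
        for entry in entries
        if not entry.get("passed") or VISION_FALLBACK in entry.get("fallbacks", [])
    ]
    return ("FAIL" if failing else "PASS"), failing
```

To apply it to the committed file, load `docs/SPACE_VLM_REPORT.json` with `json.load` and pass the resulting list to `summarize_report`. For the three entries shown here, every entry has `passed` set to false and carries the vision fallback marker.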
docs/SPACE_VLM_REPORT.md CHANGED
@@ -1,15 +1,20 @@
1
  # Space VLM Validation Report
2
 
3
- - Generated at: 2026-06-06 05:19:42 UTC
4
  - Space URL: https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
5
  - Space repo: `build-small-hackathon/ObjectverseDiary`
6
- - Overall status: NOT RUN
7
  - Vision backend expected: `minicpm-v`
8
  - Text backend expected: `mock`
9
 
10
  ## Space Configuration
11
 
12
- - Applied configuration: not changed by this run.
 
 
 
 
 
13
 
14
  - Rollback configuration:
15
  - `repo_id`: `build-small-hackathon/ObjectverseDiary`
@@ -19,6 +24,48 @@
19
 
20
  ## Results
21
 
22
  ## Notes
23
 
24
  - Test images are temporary public Wikimedia Commons assets and are not committed.
 
1
  # Space VLM Validation Report
2
 
3
+ - Generated at: 2026-06-08 02:16:59 UTC
4
  - Space URL: https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
5
  - Space repo: `build-small-hackathon/ObjectverseDiary`
6
+ - Overall status: FAIL
7
  - Vision backend expected: `minicpm-v`
8
  - Text backend expected: `mock`
9
 
10
  ## Space Configuration
11
 
12
+ - Applied configuration:
13
+ - `repo_id`: `build-small-hackathon/ObjectverseDiary`
14
+ - `hardware`: `zero-a10g`
15
+ - `OBJECTVERSE_VISION_BACKEND`: `minicpm-v`
16
+ - `VISION_MODEL_ID`: `openbmb/MiniCPM-V-2_6`
17
+ - `OBJECTVERSE_TEXT_BACKEND`: `mock`
18
 
19
  - Rollback configuration:
20
  - `repo_id`: `build-small-hackathon/ObjectverseDiary`
 
24
 
25
  ## Results
26
 
27
+ ### Coffee mug
28
+
29
+ - Status: FAIL
30
+ - Source: https://commons.wikimedia.org/wiki/File:Striped_coffee_mug.jpg
31
+ - Local temporary image: `.tmp/space-vlm-assets/mug.jpg`
32
+ - Object name: `coffee mug`
33
+ - Visible features: uploaded photo provided, user-supplied description
34
+ - Likely context: `everyday human environment`
35
+ - Confidence: 0.42
36
+ - Runtime vision: `minicpm-v object understanding`
37
+ - Runtime text: `mock persona and diary generation`
38
+ - Fallbacks: vision-fallback-to-mock, mock-text-runtime
39
+ - Error: `vision fallback marker was present`
40
+
41
+ ### Computer keyboard
42
+
43
+ - Status: FAIL
44
+ - Source: https://commons.wikimedia.org/wiki/File:Computer_keyboard.jpg
45
+ - Local temporary image: `.tmp/space-vlm-assets/keyboard.jpg`
46
+ - Object name: `keyboard`
47
+ - Visible features: uploaded photo provided, user-supplied description
48
+ - Likely context: `everyday human environment`
49
+ - Confidence: 0.42
50
+ - Runtime vision: `minicpm-v object understanding`
51
+ - Runtime text: `mock persona and diary generation`
52
+ - Fallbacks: vision-fallback-to-mock, mock-text-runtime
53
+ - Error: `vision fallback marker was present`
54
+
55
+ ### Running shoe
56
+
57
+ - Status: FAIL
58
+ - Source: https://commons.wikimedia.org/wiki/File:Running_shoes.jpg
59
+ - Local temporary image: `.tmp/space-vlm-assets/shoe.jpg`
60
+ - Object name: `shoe`
61
+ - Visible features: uploaded photo provided, user-supplied description
62
+ - Likely context: `everyday human environment`
63
+ - Confidence: 0.42
64
+ - Runtime vision: `minicpm-v object understanding`
65
+ - Runtime text: `mock persona and diary generation`
66
+ - Fallbacks: vision-fallback-to-mock, mock-text-runtime
67
+ - Error: `vision fallback marker was present`
68
+
69
  ## Notes
70
 
71
  - Test images are temporary public Wikimedia Commons assets and are not committed.
docs/SUBMISSION_GUIDE.md CHANGED
@@ -3,13 +3,13 @@
3
  ## Required Package
4
 
5
  - [x] Hugging Face Space URL: https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
6
- - [ ] GitHub Repository URL: local `origin` configured, sync/submission confirmation pending
7
- - [ ] Demo Video URL: pending recording
8
- - [ ] Social Media Post URL: pending final copy
9
- - [ ] Fine-tuned Model URL: pending model training
10
- - [ ] Dataset URL: pending curation and publishing
11
- - [ ] Trace Dataset URL: local mock JSONL export ready, publishing pending
12
- - [ ] Field Notes Blog URL: draft source in `docs/FIELD_NOTES.md`
13
  - [x] Short project description: available in README
14
 
15
  ## Local Evidence Ready
@@ -18,31 +18,36 @@
18
  - Runtime boundary: `docs/RUNTIME.md`
19
  - Dataset plan and preview workflow: `docs/DATASET.md`
20
  - External setup checklist: `docs/EXTERNAL_SETUP.md`
21
- - Space VLM validation report: `docs/SPACE_VLM_REPORT.md` currently failed because `l4x1` hardware returned `402 Payment Required`; ZeroGPU reached `RUNNING` but the validation request did not return.
 
22
  - Public mock traces: `data/traces/samples/`
 
23
  - Optional llama.cpp runtime wiring: `src/models/llama_cpp_runner.py`
24
 
25
  ## Completed Locally
26
 
27
  - Mock MVP flow, archive-style UI, share card, trace logging, sample traces, dataset preview, and initial acceptance tooling.
 
28
  - MiniCPM-V 2.6 backend wiring with fallback markers.
29
  - Optional llama.cpp text runtime wiring through `TEXT_MODEL_PATH`.
30
- - Hosted Space VLM validation script and pending report template.
 
31
 
32
  ## Not Completed Yet
33
 
34
- - Hosted Space MiniCPM-V validation for mug, keyboard, and shoe; L4 is blocked by Hugging Face paid hardware billing, and ZeroGPU needs further debugging.
35
  - Real GGUF `TEXT_MODEL_PATH` smoke test and final text model parameter count.
36
  - Real model traces, curated dataset, LoRA training, model/dataset publishing.
37
- - Field Notes article, demo video, social post, final submission package.
38
 
39
  ## Final Checks
40
 
41
  - [ ] Space is under the official organization.
42
- - [ ] Space MiniCPM-V validation passes for mug, keyboard, and shoe. Current status: L4 blocked by paid hardware billing; ZeroGPU request path unresolved.
43
- - [ ] Demo video is under 2 minutes.
44
- - [ ] README includes model parameter counts.
45
  - [ ] No commercial cloud AI APIs are used.
 
46
  - [ ] Fine-tuned model is linked.
47
  - [ ] Dataset is linked.
48
  - [ ] Traces are linked.
 
3
  ## Required Package
4
 
5
  - [x] Hugging Face Space URL: https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
6
+ - [x] GitHub Repository URL: local `origin` configured as `https://github.com/qqyule/Objectverse-Diary.git`; push still requires explicit confirmation
7
+ - [x] Demo Video Script: `docs/DEMO_VIDEO_SCRIPT.md`
8
+ - [x] Social Media Post Draft: `docs/SOCIAL_POST.md`
9
+ - [ ] Fine-tuned Model URL: not included in stable baseline; LoRA/model publishing remains future work
10
+ - [ ] Dataset URL: not included in stable baseline; local mock preview exists
11
+ - [x] Trace Dataset: local public mock JSONL export at `data/traces/samples/objectverse_public_mock_traces.jsonl`
12
+ - [x] Field Notes Draft: `docs/FIELD_NOTES.md`
13
  - [x] Short project description: available in README
14
 
15
  ## Local Evidence Ready
 
18
  - Runtime boundary: `docs/RUNTIME.md`
19
  - Dataset plan and preview workflow: `docs/DATASET.md`
20
  - External setup checklist: `docs/EXTERNAL_SETUP.md`
21
+ - Space VLM validation report: `docs/SPACE_VLM_REPORT.md` currently failed. Paid L4 returned `402 Payment Required`; later ZeroGPU validation reached the app on 2026-06-08, but mug/keyboard/shoe all fell back to mock vision.
22
+ - Space VLM trace evidence: `data/traces/space-vlm/`
23
  - Public mock traces: `data/traces/samples/`
24
+ - Stable demo baseline: Gradio example buttons replay committed sample traces first, then fall back to the live generation pipeline if a cached trace is missing.
25
  - Optional llama.cpp runtime wiring: `src/models/llama_cpp_runner.py`
26
 
27
  ## Completed Locally
28
 
29
  - Mock MVP flow, archive-style UI, share card, trace logging, sample traces, dataset preview, and initial acceptance tooling.
30
+ - Stable local demo baseline with six replayable example outputs, shared cached/live UI formatting, chat wake state, share card, and trace panel output.
31
  - MiniCPM-V 2.6 backend wiring with fallback markers.
32
  - Optional llama.cpp text runtime wiring through `TEXT_MODEL_PATH`.
33
+ - Hosted Space VLM validation script, report, JSON summary, and trace evidence export.
34
+ - Field Notes draft, demo video script, and social post draft for the stable submission package.
35
 
36
  ## Not Completed Yet
37
 
38
+ - Hosted Space MiniCPM-V validation for mug, keyboard, and shoe; ZeroGPU validation reached the app but currently falls back to mock vision.
39
  - Real GGUF `TEXT_MODEL_PATH` smoke test and final text model parameter count.
40
  - Real model traces, curated dataset, LoRA training, model/dataset publishing.
41
+ - Field Notes publication URL, recorded demo video URL, social post URL, and final public push/submission.
42
 
43
  ## Final Checks
44
 
45
  - [ ] Space is under the official organization.
46
+ - [ ] Space MiniCPM-V validation passes for mug, keyboard, and shoe. Current status: wired but hosted validation falls back to mock.
47
+ - [x] Demo video script targets under 2 minutes.
48
+ - [x] README includes stable-baseline parameter budget and links to the model card.
49
  - [ ] No commercial cloud AI APIs are used.
50
+ - [x] Mock-safe local demo baseline is reproducible from committed sample traces.
51
  - [ ] Fine-tuned model is linked.
52
  - [ ] Dataset is linked.
53
  - [ ] Traces are linked.
scripts/check_space_vlm.py CHANGED
@@ -42,6 +42,8 @@ MOCK_SAFE_VARIABLES = {
42
  "OBJECTVERSE_TEXT_BACKEND": "mock",
43
  }
44
 
 
 
45
 
46
  @dataclass(frozen=True)
47
  class ValidationAsset:
@@ -182,12 +184,14 @@ def run_space_validation(
182
  asset_dir: Path = DEFAULT_ASSET_DIR,
183
  timeout_seconds: int = 900,
184
  assets: list[ValidationAsset] | None = None,
 
185
  ) -> list[ValidationResult]:
186
  from gradio_client import handle_file
187
 
188
  selected_assets = assets or TEST_ASSETS
189
  paths = download_validation_assets(asset_dir, selected_assets)
190
- client = _build_gradio_client(space_url, timeout_seconds=timeout_seconds)
 
191
  results: list[ValidationResult] = []
192
  started = time.monotonic()
193
  for asset in selected_assets:
@@ -202,6 +206,9 @@ def run_space_validation(
202
  asset.mode,
203
  timeout_seconds=min(PREDICTION_TIMEOUT_SECONDS, remaining),
204
  )
 
 
 
205
  results.append(validate_prediction(asset, paths[asset.key], response))
206
  except Exception as exc:
207
  results.append(
@@ -265,6 +272,15 @@ def _build_gradio_client(space_url: str, *, timeout_seconds: int) -> Any:
265
  raise TimeoutError(f"Could not fetch Gradio config for {space_url}: {type(last_error).__name__}: {last_error}")
266
 
267
 
268
  def validate_prediction(
269
  asset: ValidationAsset,
270
  image_path: Path,
@@ -387,6 +403,14 @@ def write_json_results(results: list[ValidationResult], output_path: Path) -> Pa
387
  return output_path
388
 
389
 
390
  def _download_url(url: str, output_path: Path) -> None:
391
  request = urllib.request.Request(
392
  url,
@@ -410,6 +434,16 @@ def _extract_trace_payload(response: Any) -> dict[str, Any]:
410
  return trace_payload
411
 
412
 
413
  def _failure_reason(
414
  expected_match: bool,
415
  vision_runtime_ok: bool,
@@ -464,6 +498,7 @@ def _parse_args() -> argparse.Namespace:
464
  parser.add_argument("--rollback-to-mock", action="store_true")
465
  parser.add_argument("--hardware", default=DEFAULT_HARDWARE)
466
  parser.add_argument("--skip-validation", action="store_true")
 
467
  return parser.parse_args()
468
 
469
 
@@ -499,6 +534,7 @@ def main() -> None:
499
  space_url=args.space_url,
500
  asset_dir=args.asset_dir,
501
  timeout_seconds=args.timeout_seconds,
 
502
  )
503
  except Exception as exc:
504
  configuration_error = f"{type(exc).__name__}: {exc}"
 
42
  "OBJECTVERSE_TEXT_BACKEND": "mock",
43
  }
44
 
45
+ SENSITIVE_TRACE_MARKERS = ("HUGGINGFACE_TOKEN", "HF_TOKEN", "hf_")
46
+
47
 
48
  @dataclass(frozen=True)
49
  class ValidationAsset:
 
184
  asset_dir: Path = DEFAULT_ASSET_DIR,
185
  timeout_seconds: int = 900,
186
  assets: list[ValidationAsset] | None = None,
187
+ trace_output_dir: Path | None = None,
188
  ) -> list[ValidationResult]:
189
  from gradio_client import handle_file
190
 
191
  selected_assets = assets or TEST_ASSETS
192
  paths = download_validation_assets(asset_dir, selected_assets)
193
+ client_url = space_client_url(space_url)
194
+ client = _build_gradio_client(client_url, timeout_seconds=timeout_seconds)
195
  results: list[ValidationResult] = []
196
  started = time.monotonic()
197
  for asset in selected_assets:
 
206
  asset.mode,
207
  timeout_seconds=min(PREDICTION_TIMEOUT_SECONDS, remaining),
208
  )
209
+ if trace_output_dir is not None:
210
+ trace = extract_trace_record(response)
211
+ write_trace_record(trace, trace_output_dir / f"{asset.key}.json")
212
  results.append(validate_prediction(asset, paths[asset.key], response))
213
  except Exception as exc:
214
  results.append(
 
272
  raise TimeoutError(f"Could not fetch Gradio config for {space_url}: {type(last_error).__name__}: {last_error}")
273
 
274
 
275
+ def space_client_url(space_url: str) -> str:
276
+ parsed = urlparse(space_url)
277
+ if parsed.netloc.endswith(".hf.space"):
278
+ return space_url.rstrip("/")
279
+ repo_id = parse_space_repo_id(space_url)
280
+ owner, space_name = repo_id.split("/", 1)
281
+ return f"https://{owner}-{space_name}.hf.space".lower()
282
+
283
+
284
  def validate_prediction(
285
  asset: ValidationAsset,
286
  image_path: Path,
 
403
  return output_path
404
 
405
 
406
+ def write_trace_record(trace: TraceRecord, output_path: Path) -> Path:
407
+ output_path.parent.mkdir(parents=True, exist_ok=True)
408
+ serialized = json.dumps(trace.model_dump(mode="json"), ensure_ascii=False, indent=2, sort_keys=True)
409
+ _assert_trace_is_public_safe(serialized)
410
+ output_path.write_text(serialized + "\n", encoding="utf-8")
411
+ return output_path
412
+
413
+
414
  def _download_url(url: str, output_path: Path) -> None:
415
  request = urllib.request.Request(
416
  url,
 
434
  return trace_payload
435
 
436
 
437
+ def extract_trace_record(response: Any) -> TraceRecord:
438
+ return TraceRecord.model_validate(_extract_trace_payload(response))
439
+
440
+
441
+ def _assert_trace_is_public_safe(serialized_trace: str) -> None:
442
+ for marker in SENSITIVE_TRACE_MARKERS:
443
+ if marker in serialized_trace:
444
+ raise ValueError("Trace output may contain a sensitive token marker.")
445
+
446
+
447
  def _failure_reason(
448
  expected_match: bool,
449
  vision_runtime_ok: bool,
 
498
  parser.add_argument("--rollback-to-mock", action="store_true")
499
  parser.add_argument("--hardware", default=DEFAULT_HARDWARE)
500
  parser.add_argument("--skip-validation", action="store_true")
501
+ parser.add_argument("--trace-output-dir", type=Path)
502
  return parser.parse_args()
503
 
504
 
 
534
  space_url=args.space_url,
535
  asset_dir=args.asset_dir,
536
  timeout_seconds=args.timeout_seconds,
537
+ trace_output_dir=args.trace_output_dir,
538
  )
539
  except Exception as exc:
540
  configuration_error = f"{type(exc).__name__}: {exc}"
src/example_cache.py ADDED
@@ -0,0 +1,33 @@
+ """Stable example output loading for demo playback."""
+
+ from __future__ import annotations
+
+ import json
+ from pathlib import Path
+
+ from src.models.schema import GenerationResult, TraceRecord
+
+
+ DEFAULT_SAMPLE_TRACE_DIR = Path("data/traces/samples")
+
+
+ def sample_trace_path(index: int, sample_dir: Path = DEFAULT_SAMPLE_TRACE_DIR) -> Path | None:
+     """Return the committed sample trace path for a 0-based example index."""
+     trace_id = f"sample-{index + 1:02d}"
+     matches = sorted(sample_dir.glob(f"{trace_id}-*.json"))
+     return matches[0] if matches else None
+
+
+ def load_sample_generation(index: int, sample_dir: Path = DEFAULT_SAMPLE_TRACE_DIR) -> GenerationResult | None:
+     path = sample_trace_path(index, sample_dir)
+     if path is None:
+         return None
+
+     trace = TraceRecord.model_validate(json.loads(path.read_text(encoding="utf-8")))
+     return GenerationResult(
+         object_understanding=trace.object_understanding,
+         persona=trace.persona,
+         diary=trace.diary,
+         trace=trace,
+         trace_path=str(path),
+     )
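Reviewer sketch (not part of the commit): the lookup in `sample_trace_path` maps a 0-based example index to a 1-based, zero-padded `sample-NN-*.json` file name. The snippet below reproduces that lookup against a temporary directory. The file name `sample-01-coffee-mug.json` is a hypothetical example; the real file names under `data/traces/samples/` are not shown in this diff.

```python
import tempfile
from pathlib import Path
from typing import Optional


def sample_trace_path(index: int, sample_dir: Path) -> Optional[Path]:
    """Return the first trace file matching a 0-based example index, if any."""
    trace_id = f"sample-{index + 1:02d}"  # index 0 -> "sample-01"
    matches = sorted(sample_dir.glob(f"{trace_id}-*.json"))
    return matches[0] if matches else None


with tempfile.TemporaryDirectory() as tmp_dir:
    sample_dir = Path(tmp_dir)
    # Hypothetical trace file standing in for a committed sample.
    (sample_dir / "sample-01-coffee-mug.json").write_text("{}", encoding="utf-8")

    found = sample_trace_path(0, sample_dir)
    missing = sample_trace_path(1, sample_dir)
    found_name = found.name if found else None

print(found_name, missing)
```

Sorting the glob results keeps the choice deterministic if more than one file shares the same `sample-NN-` prefix.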
src/ui/layout.py CHANGED
@@ -9,6 +9,7 @@ from typing import Any
9
  import gradio as gr
10
 
11
  from src.config import APP_TITLE, DEFAULT_MODE, PERSONALITY_MODES
 
12
  from src.examples import EXAMPLE_OBJECTS, example_button_label
13
  from src.models.llama_cpp_runner import reply_as_object
14
  from src.models.schema import GenerationResult
@@ -237,6 +238,10 @@ def _panel_header(index: str, title: str, chinese: str, note: str) -> str:
237
  def _example_handler(index: int):
238
  def load_example() -> tuple[Any, ...]:
239
  item = EXAMPLE_OBJECTS[index]
 
 
 
 
240
  result = generate_object_file(None, item["description"], item["mode"])
241
  return item["description"], item["mode"], *result
242
 
@@ -254,6 +259,10 @@ def generate_object_file(
254
  except Exception as exc: # pragma: no cover - exercised manually by UI failure paths.
255
  return _generation_error(exc, description, mode)
256
 
 
 
 
 
257
  object_payload = result.object_understanding.model_dump(mode="json")
258
  persona_payload = result.persona.model_dump(mode="json")
259
  return (
 
9
  import gradio as gr
10
 
11
  from src.config import APP_TITLE, DEFAULT_MODE, PERSONALITY_MODES
12
+ from src.example_cache import load_sample_generation
13
  from src.examples import EXAMPLE_OBJECTS, example_button_label
14
  from src.models.llama_cpp_runner import reply_as_object
15
  from src.models.schema import GenerationResult
 
238
  def _example_handler(index: int):
239
  def load_example() -> tuple[Any, ...]:
240
  item = EXAMPLE_OBJECTS[index]
241
+ cached_result = load_sample_generation(index)
242
+ if cached_result is not None:
243
+ return item["description"], item["mode"], *_format_generation_result(cached_result)
244
+
245
  result = generate_object_file(None, item["description"], item["mode"])
246
  return item["description"], item["mode"], *result
247
 
 
259
  except Exception as exc: # pragma: no cover - exercised manually by UI failure paths.
260
  return _generation_error(exc, description, mode)
261
 
262
+ return _format_generation_result(result)
263
+
264
+
265
+ def _format_generation_result(result: GenerationResult) -> GenerationUiResult:
266
  object_payload = result.object_understanding.model_dump(mode="json")
267
  persona_payload = result.persona.model_dump(mode="json")
268
  return (
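Reviewer sketch (not part of the commit): the `_example_handler` change is a cache-first pattern, where a committed trace is replayed when present and the live pipeline runs only on a cache miss. The sketch below shows that control flow with plain strings. `make_example_handler`, `CACHE`, and `fake_live` are hypothetical names standing in for `load_sample_generation`, the committed traces, and `generate_object_file`.

```python
from typing import Callable, List, Optional


def make_example_handler(
    index: int,
    load_cached: Callable[[int], Optional[str]],
    generate_live: Callable[[int], str],
) -> Callable[[], str]:
    """Build a button handler that prefers a cached result over live generation."""

    def load_example() -> str:
        cached = load_cached(index)
        if cached is not None:
            return cached  # replay the committed output
        return generate_live(index)  # cache miss: run the live path

    return load_example


# Hypothetical stand-ins for the trace cache and the generation pipeline.
CACHE = {0: "cached: coffee mug diary"}
live_calls: List[int] = []


def fake_live(index: int) -> str:
    live_calls.append(index)
    return f"live: example {index}"


print(make_example_handler(0, CACHE.get, fake_live)())
print(make_example_handler(1, CACHE.get, fake_live)())
```

Routing both branches through one formatter, as the diff does with `_format_generation_result`, keeps the cached and live outputs in the same shape for the UI.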
tests/test_mock_mvp.py CHANGED
@@ -6,10 +6,17 @@ import json
6
  import tempfile
7
  import unittest
8
  from pathlib import Path
 
9
 
 
10
  from src.examples import EXAMPLE_OBJECTS, gradio_examples
11
- from src.models.llama_cpp_runner import generate_diary, generate_persona, reply_as_object
12
- from src.models.vision_runner import understand_object
 
 
 
 
 
13
  from src.pipeline import generate_object_diary
14
  from src.renderer.share_card import render_share_card
15
  from src.traces.anonymizer import anonymize_text
@@ -19,7 +26,27 @@ from scripts.check_initial_stage import run_checks
19
  from src.config import get_runtime_settings, runtime_status
20
 
21
 
22
  class MockMvpTest(unittest.TestCase):
 
 
 
23
  def test_runtime_defaults_to_mock(self) -> None:
24
  settings = get_runtime_settings({})
25
  status = runtime_status(settings)
@@ -34,6 +61,17 @@ class MockMvpTest(unittest.TestCase):
34
  self.assertEqual(len(gradio_examples()), 6)
35
  self.assertTrue(all(len(example) == 1 for example in gradio_examples()))
36
 
37
  def test_mock_generation_flow(self) -> None:
38
  object_understanding = understand_object(
39
  None,
@@ -49,6 +87,120 @@ class MockMvpTest(unittest.TestCase):
49
  self.assertIn("今天", diary.chinese)
50
  self.assertIn("objectverse-card", share_card)
51
 
52
  def test_pipeline_saves_generation_result(self) -> None:
53
  with tempfile.TemporaryDirectory() as tmp_dir:
54
  result = generate_object_diary(
@@ -63,6 +215,28 @@ class MockMvpTest(unittest.TestCase):
63
  self.assertEqual(result.object_understanding.object.name, "coffee mug")
64
  self.assertEqual(saved_path.stem, result.trace.trace_id)
65
 
66
  def test_chat_uses_current_persona(self) -> None:
67
  object_understanding = understand_object(None, "dusty black mechanical keyboard")
68
  persona = generate_persona(object_understanding, "Philosopher")
 
6
  import tempfile
7
  import unittest
8
  from pathlib import Path
9
+ from unittest.mock import patch
10
 
11
+ from src.example_cache import load_sample_generation, sample_trace_path
12
  from src.examples import EXAMPLE_OBJECTS, gradio_examples
13
+ from src.models.llama_cpp_runner import (
14
+ generate_diary,
15
+ generate_persona,
16
+ reply_as_object,
17
+ reset_text_runtime_fallbacks,
18
+ )
19
+ from src.models.vision_runner import understand_object, understand_object_with_metadata
20
  from src.pipeline import generate_object_diary
21
  from src.renderer.share_card import render_share_card
22
  from src.traces.anonymizer import anonymize_text
 
26
  from src.config import get_runtime_settings, runtime_status
27
 
28
 
29
+ class FakeMiniCpmModel:
30
+ def __init__(self, response: str) -> None:
31
+ self.response = response
32
+
33
+ def chat(self, **_: object) -> str:
34
+ return self.response
35
+
36
+
37
+ class FakeLlamaModel:
38
+ def __init__(self, responses: list[str]) -> None:
39
+ self.responses = responses
40
+
41
+ def create_chat_completion(self, **_: object) -> dict:
42
+ response = self.responses.pop(0)
43
+ return {"choices": [{"message": {"content": response}}]}
44
+
45
+
46
  class MockMvpTest(unittest.TestCase):
47
+ def tearDown(self) -> None:
48
+ reset_text_runtime_fallbacks()
49
+
50
  def test_runtime_defaults_to_mock(self) -> None:
51
  settings = get_runtime_settings({})
52
  status = runtime_status(settings)
 
61
  self.assertEqual(len(gradio_examples()), 6)
62
  self.assertTrue(all(len(example) == 1 for example in gradio_examples()))
63
 
64
+ def test_sample_generation_cache_loads_committed_example_trace(self) -> None:
65
+ path = sample_trace_path(0)
66
+ result = load_sample_generation(0)
67
+
68
+ self.assertIsNotNone(path)
69
+ self.assertIsNotNone(result)
70
+ assert result is not None
71
+ self.assertEqual(result.trace.trace_id, "sample-01")
72
+ self.assertEqual(result.object_understanding.object.name, "coffee mug")
73
+ self.assertEqual(result.trace_path, str(path))
74
+
75
  def test_mock_generation_flow(self) -> None:
76
  object_understanding = understand_object(
77
  None,
 
87
  self.assertIn("今天", diary.chinese)
88
  self.assertIn("objectverse-card", share_card)
89
 
90
+ def test_llama_cpp_persona_diary_and_chat_accept_valid_json(self) -> None:
91
+ env = {
92
+ "OBJECTVERSE_TEXT_BACKEND": "llama-cpp",
93
+ "TEXT_MODEL_PATH": "/tmp/objectverse-text-model.gguf",
94
+ }
95
+ fake_llama = FakeLlamaModel(
96
+ [
97
+ """
98
+ {"persona":{"object_name":"coffee mug","character_name":"Mugworth","mood":"dry and suspicious","secret_fear":"being left empty forever","core_memory":"It remembers every late-night refill.","complaint":"I am treated like a ceramic fuel tank.","tags":["desk witness","warm archive","quiet judgment"]}}
99
+ """,
100
+ """
101
+ {"title":"Secret Diary - Day 418","english":"Today I held another bitter storm and called it service.","chinese":"今天我又装下一场苦涩风暴,并被称为有用。"}
102
+ """,
103
+ """
104
+ {"reply":"Mugworth: I have seen your deadlines dissolve into coffee rings."}
105
+ """,
106
+ ]
107
+ )
108
+
109
+ with (
110
+ patch.dict("os.environ", env, clear=False),
111
+ patch("src.models.llama_cpp_runner._load_llama_model", return_value=fake_llama),
112
+ ):
113
+ object_understanding = understand_object(None, "white coffee mug")
114
+ persona = generate_persona(object_understanding, "Cynical")
115
+ diary = generate_diary(persona, "Cynical")
116
+ reply = reply_as_object(persona.model_dump(mode="json"), "What did you see?")
117
+
118
+ self.assertEqual(persona.persona.character_name, "Mugworth")
119
+ self.assertEqual(diary.title, "Secret Diary - Day 418")
120
+ self.assertIn("Mugworth", reply)
121
+
122
+ def test_llama_cpp_missing_model_path_falls_back_to_mock(self) -> None:
123
+ env = {"OBJECTVERSE_TEXT_BACKEND": "llama-cpp", "TEXT_MODEL_PATH": ""}
124
+
125
+ with patch.dict("os.environ", env, clear=False):
126
+ result = generate_object_diary(None, "dusty black keyboard", "Philosopher", save=False)
127
+
128
+ self.assertEqual(result.persona.persona.object_name, "keyboard")
129
+ self.assertIn("text-fallback-to-mock", result.trace.fallbacks)
130
+ self.assertIn("mock-vision-runtime", result.trace.fallbacks)
131
+ self.assertNotIn("mock-text-runtime", result.trace.fallbacks)
132
+
133
+ def test_llama_cpp_import_failure_falls_back_to_mock(self) -> None:
134
+ env = {
135
+ "OBJECTVERSE_TEXT_BACKEND": "llama_cpp",
136
+ "TEXT_MODEL_PATH": "/tmp/objectverse-text-model.gguf",
137
+ }
138
+
139
+ with (
140
+ patch.dict("os.environ", env, clear=False),
141
+ patch("src.models.llama_cpp_runner._load_llama_model", side_effect=ImportError("no llama_cpp")),
142
+ ):
143
+ result = generate_object_diary(None, "old white coffee mug", "Cynical", save=False)
144
+
145
+ self.assertEqual(result.persona.persona.object_name, "coffee mug")
146
+ self.assertIn("text-fallback-to-mock", result.trace.fallbacks)
147
+
148
+ def test_llama_cpp_invalid_json_falls_back_to_mock(self) -> None:
149
+ env = {
150
+ "OBJECTVERSE_TEXT_BACKEND": "llama-cpp",
151
+ "TEXT_MODEL_PATH": "/tmp/objectverse-text-model.gguf",
152
+ }
153
+
154
+ with (
155
+ patch.dict("os.environ", env, clear=False),
156
+ patch("src.models.llama_cpp_runner._load_llama_model", return_value=FakeLlamaModel(["not json"])),
157
+ ):
158
+ result = generate_object_diary(None, "old white coffee mug", "Cynical", save=False)
159
+
160
+ self.assertEqual(result.persona.persona.object_name, "coffee mug")
161
+ self.assertIn("text-fallback-to-mock", result.trace.fallbacks)
162
+ self.assertEqual(result.trace.model_runtime["text"], "llama-cpp text generation")
163
+
164
+ def test_minicpm_vision_backend_accepts_valid_json(self) -> None:
165
+ response = """
166
+ {"object":{"name":"coffee mug","visible_features":["white ceramic","round handle","desk shadow"],"likely_context":"work desk","confidence":0.88}}
167
+ """
168
+ settings = get_runtime_settings(
169
+ {
170
+ "OBJECTVERSE_VISION_BACKEND": "minicpm-v",
171
+ "VISION_MODEL_ID": "openbmb/MiniCPM-V-2_6",
172
+ "OBJECTVERSE_TEXT_BACKEND": "mock",
173
+ }
174
+ )
175
+
176
+ with (
177
+ patch("src.models.vision_runner._load_rgb_image", return_value=object()),
178
+ patch("src.models.vision_runner._load_minicpm_components", return_value=(FakeMiniCpmModel(response), object())),
179
+ ):
180
+ result = understand_object_with_metadata("/tmp/mug.png", "white mug", settings=settings)
181
+
182
+ self.assertEqual(result.object_understanding.object.name, "coffee mug")
183
+ self.assertEqual(result.object_understanding.object.confidence, 0.88)
184
+ self.assertEqual(result.fallbacks, [])
185
+
186
+ def test_minicpm_vision_backend_falls_back_on_invalid_json(self) -> None:
187
+ settings = get_runtime_settings(
188
+ {
189
+ "OBJECTVERSE_VISION_BACKEND": "minicpm-v",
190
+ "VISION_MODEL_ID": "openbmb/MiniCPM-V-2_6",
191
+ "OBJECTVERSE_TEXT_BACKEND": "mock",
192
+ }
193
+ )
194
+
195
+ with (
196
+ patch("src.models.vision_runner._load_rgb_image", return_value=object()),
197
+ patch("src.models.vision_runner._load_minicpm_components", return_value=(FakeMiniCpmModel("not json"), object())),
198
+ ):
199
+ result = understand_object_with_metadata("/tmp/keyboard.png", "dusty black keyboard", settings=settings)
200
+
201
+ self.assertEqual(result.object_understanding.object.name, "keyboard")
202
+ self.assertEqual(result.fallbacks, ["vision-fallback-to-mock"])
203
+
204
  def test_pipeline_saves_generation_result(self) -> None:
205
  with tempfile.TemporaryDirectory() as tmp_dir:
206
  result = generate_object_diary(
 
215
  self.assertEqual(result.object_understanding.object.name, "coffee mug")
216
  self.assertEqual(saved_path.stem, result.trace.trace_id)
217
 
218
+ def test_pipeline_records_minicpm_vision_runtime(self) -> None:
219
+ response = """
220
+ {"object":{"name":"desk lamp","visible_features":["metal shade","thin neck","warm light"],"likely_context":"desk","confidence":0.91}}
221
+ """
222
+ env = {
223
+ "OBJECTVERSE_VISION_BACKEND": "minicpm-v",
224
+ "VISION_MODEL_ID": "openbmb/MiniCPM-V-2_6",
225
+ "OBJECTVERSE_TEXT_BACKEND": "mock",
226
+ }
227
+
228
+ with (
229
+ patch.dict("os.environ", env, clear=False),
230
+ patch("src.models.vision_runner._load_rgb_image", return_value=object()),
231
+ patch("src.models.vision_runner._load_minicpm_components", return_value=(FakeMiniCpmModel(response), object())),
232
+ ):
233
+ result = generate_object_diary("/tmp/lamp.png", "desk lamp", "Dramatic", save=False)
234
+
235
+ self.assertEqual(result.object_understanding.object.name, "desk lamp")
236
+ self.assertEqual(result.trace.model_runtime["vision"], "minicpm-v object understanding")
237
+ self.assertIn("mock-text-runtime", result.trace.fallbacks)
238
+ self.assertNotIn("mock-runtime", result.trace.fallbacks)
239
+
240
  def test_chat_uses_current_persona(self) -> None:
241
  object_understanding = understand_object(None, "dusty black mechanical keyboard")
242
  persona = generate_persona(object_understanding, "Philosopher")
tests/test_space_vlm_tooling.py ADDED
@@ -0,0 +1,220 @@
1
+ """Tests for hosted Space VLM validation tooling."""
2
+
3
+ from __future__ import annotations
4
+
5
+ import tempfile
6
+ import unittest
7
+ from datetime import datetime, timezone
8
+ from pathlib import Path
9
+
10
+ from scripts.check_space_vlm import (
11
+ TEST_ASSETS,
12
+ ValidationResult,
13
+ extract_trace_record,
14
+ parse_space_repo_id,
15
+ render_report,
16
+ space_client_url,
17
+ validate_prediction,
18
+ write_trace_record,
19
+ )
20
+ from src.models.schema import DiaryEntry, ObjectInfo, ObjectUnderstanding, Persona, PersonaEnvelope, TraceRecord
21
+ from src.utils.zero_gpu import zero_gpu
22
+
23
+
24
+ class SpaceVlmToolingTest(unittest.TestCase):
25
+ def test_asset_manifest_covers_three_validation_objects(self) -> None:
26
+ keys = {asset.key for asset in TEST_ASSETS}
27
+
28
+ self.assertEqual(keys, {"mug", "keyboard", "shoe"})
29
+ self.assertTrue(all(asset.source_page.startswith("https://commons.wikimedia.org/") for asset in TEST_ASSETS))
30
+ self.assertTrue(all(asset.download_url.startswith("https://commons.wikimedia.org/") for asset in TEST_ASSETS))
31
+
32
+ def test_parse_space_repo_id_from_space_url(self) -> None:
33
+ repo_id = parse_space_repo_id("https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary")
34
+
35
+ self.assertEqual(repo_id, "build-small-hackathon/ObjectverseDiary")
36
+
37
+ def test_space_client_url_uses_direct_hf_space_host(self) -> None:
38
+ client_url = space_client_url("https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary")
39
+
40
+ self.assertEqual(client_url, "https://build-small-hackathon-objectversediary.hf.space")
41
+
42
+ def test_zero_gpu_decorator_is_noop_without_spaces_package(self) -> None:
43
+ def sample(value: int) -> int:
44
+ return value + 1
45
+
46
+ decorated = zero_gpu(duration=10)(sample)
47
+
48
+ self.assertEqual(decorated(2), 3)
49
+
50
+ def test_validate_prediction_accepts_minicpm_runtime(self) -> None:
51
+ asset = TEST_ASSETS[0]
52
+ trace = _trace_record(
53
+ object_name="striped coffee mug",
54
+ visible_features=["ceramic cup", "handle", "striped surface"],
55
+ runtime_vision="minicpm-v object understanding",
56
+ fallbacks=["mock-text-runtime"],
57
+ )
58
+ response = [None, {}, {}, "", "", "", trace.model_dump(mode="json")]
59
+
60
+ result = validate_prediction(asset, Path("/tmp/mug.jpg"), response)
61
+
62
+ self.assertTrue(result.passed)
63
+ self.assertEqual(result.object_name, "striped coffee mug")
64
+ self.assertEqual(result.runtime_text, "mock persona and diary generation")
65
+
66
+ def test_validate_prediction_rejects_vision_fallback(self) -> None:
67
+ asset = TEST_ASSETS[1]
68
+ trace = _trace_record(
69
+ object_name="computer keyboard",
70
+ visible_features=["black keys"],
71
+ runtime_vision="minicpm-v object understanding",
72
+ fallbacks=["vision-fallback-to-mock", "mock-text-runtime"],
73
+ )
74
+ response = [None, {}, {}, "", "", "", trace.model_dump(mode="json")]
75
+
76
+ result = validate_prediction(asset, Path("/tmp/keyboard.jpg"), response)
77
+
78
+ self.assertFalse(result.passed)
79
+ self.assertIn("vision fallback marker", result.error)
80
+
81
+ def test_extract_trace_record_accepts_gradio_response(self) -> None:
82
+ trace = _trace_record(
83
+ object_name="running shoe",
84
+ visible_features=["laces", "rubber sole"],
85
+ runtime_vision="minicpm-v object understanding",
86
+ fallbacks=["mock-text-runtime"],
87
+ )
88
+ response = [None, {}, {}, "", "", "", trace.model_dump(mode="json")]
89
+
90
+ extracted = extract_trace_record(response)
91
+
92
+ self.assertEqual(extracted.object_understanding.object.name, "running shoe")
93
+ self.assertEqual(extracted.model_runtime["vision"], "minicpm-v object understanding")
94
+
95
+ def test_write_trace_record_writes_valid_public_json(self) -> None:
96
+ trace = _trace_record(
97
+ object_name="striped coffee mug",
98
+ visible_features=["ceramic cup", "handle"],
99
+ runtime_vision="minicpm-v object understanding",
100
+ fallbacks=["mock-text-runtime"],
101
+ )
102
+
103
+ with tempfile.TemporaryDirectory() as tmp_dir:
104
+ output_path = write_trace_record(trace, Path(tmp_dir) / "mug.json")
105
+ payload = output_path.read_text(encoding="utf-8")
106
+ parsed = TraceRecord.model_validate_json(payload)
107
+
108
+ self.assertEqual(parsed.trace_id, trace.trace_id)
109
+ self.assertNotIn("HUGGINGFACE_TOKEN", payload)
110
+ self.assertNotIn("HF_TOKEN", payload)
111
+ self.assertNotIn("hf_", payload)
112
+
113
+ def test_write_trace_record_rejects_sensitive_token_markers(self) -> None:
114
+ trace = _trace_record(
115
+ object_name="computer keyboard",
116
+ visible_features=["black keys"],
117
+ runtime_vision="minicpm-v object understanding",
118
+ fallbacks=["mock-text-runtime"],
119
+ )
120
+ trace.model_runtime["runtime"] = "vision model id: openbmb/MiniCPM-V-2_6; token hf_forbidden"
121
+
122
+ with tempfile.TemporaryDirectory() as tmp_dir:
123
+ output_path = Path(tmp_dir) / "keyboard.json"
124
+ with self.assertRaises(ValueError):
125
+ write_trace_record(trace, output_path)
126
+
127
+ self.assertFalse(output_path.exists())
128
+
129
+ def test_render_report_includes_results_and_safe_config(self) -> None:
130
+ result = ValidationResult(
131
+ key="shoe",
132
+ label="Running shoe",
133
+ source_page="https://commons.wikimedia.org/wiki/File:Running_shoes.jpg",
134
+ image_path="/tmp/shoe.jpg",
135
+ passed=True,
136
+ object_name="running shoe",
137
+ visible_features=["laces", "athletic sole"],
138
+ likely_context="sports gear",
139
+ confidence=0.86,
140
+ runtime_vision="minicpm-v object understanding",
141
+ runtime_text="mock persona and diary generation",
142
+ fallbacks=["mock-text-runtime"],
143
+ )
144
+
145
+ report = render_report(
146
+ space_url="https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary",
147
+ repo_id="build-small-hackathon/ObjectverseDiary",
148
+ results=[result],
149
+ configured={"hardware": "l4x1", "OBJECTVERSE_VISION_BACKEND": "minicpm-v"},
150
+ rollback={"hardware": "cpu-basic", "OBJECTVERSE_VISION_BACKEND": "mock"},
151
+ )
152
+
153
+ self.assertIn("Overall status: PASS", report)
154
+ self.assertIn("Running shoe", report)
155
+ self.assertIn("OBJECTVERSE_VISION_BACKEND", report)
156
+ self.assertNotIn("hf_", report.lower())
157
+ self.assertNotIn("HUGGINGFACE_TOKEN", report)
158
+
159
+ def test_render_report_includes_configuration_error(self) -> None:
160
+ report = render_report(
161
+ space_url="https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary",
162
+ repo_id="build-small-hackathon/ObjectverseDiary",
163
+ results=[],
164
+ rollback={"hardware": "cpu-basic", "OBJECTVERSE_VISION_BACKEND": "mock"},
165
+ configuration_error="HfHubHTTPError: 402 Payment Required",
166
+ )
167
+
168
+ self.assertIn("Overall status: FAIL", report)
169
+ self.assertIn("Configuration Error", report)
170
+ self.assertIn("402 Payment Required", report)
171
+
172
+
173
+ def _trace_record(
174
+ *,
175
+ object_name: str,
176
+ visible_features: list[str],
177
+ runtime_vision: str,
178
+ fallbacks: list[str],
179
+ ) -> TraceRecord:
180
+ persona = PersonaEnvelope(
181
+ persona=Persona(
182
+ object_name=object_name,
183
+ character_name="Test Object",
184
+ mood="watchful",
185
+ secret_fear="being ignored",
186
+ core_memory="It remembers the test bench.",
187
+ complaint="I am more than a fixture.",
188
+ tags=["test", "object", "archive"],
189
+ )
190
+ )
191
+ return TraceRecord(
192
+ trace_id="space-vlm-test",
193
+ created_at=datetime.now(timezone.utc),
194
+ mode="Cynical",
195
+ input={"has_image": True, "image_filename": "asset.jpg", "description": "public test asset"},
196
+ object_understanding=ObjectUnderstanding(
197
+ object=ObjectInfo(
198
+ name=object_name,
199
+ visible_features=visible_features,
200
+ likely_context="test environment",
201
+ confidence=0.9,
202
+ )
203
+ ),
204
+ persona=persona,
205
+ diary=DiaryEntry(
206
+ title="Secret Diary - Day 1",
207
+ english="I was tested today.",
208
+ chinese="今天我被测试了。",
209
+ ),
210
+ model_runtime={
211
+ "vision": runtime_vision,
212
+ "text": "mock persona and diary generation",
213
+ "runtime": "vision model id: openbmb/MiniCPM-V-2_6; no llama.cpp model connected yet",
214
+ },
215
+ fallbacks=fallbacks,
216
+ )
217
+
218
+
219
+ if __name__ == "__main__":
220
+ unittest.main()