Spaces:

build-small-hackathon
/

ObjectverseDiary

Paused

App Files Files Community

qqyule commited on Jun 8

Commit

1e2c036

verified ·

1 Parent(s): cb80875

Deploy latest Objectverse Diary version

Browse files

Files changed (26) hide show

.codex/config.toml +2 -0
README.md +42 -12
data/examples/README.md +4 -0
data/traces/README.md +2 -0
data/traces/space-vlm/keyboard.json +50 -0
data/traces/space-vlm/mug.json +50 -0
data/traces/space-vlm/shoe.json +50 -0
docs/03-dev-schedule.md +4 -3
docs/07-development-plan.md +4 -3
docs/DATASET.md +4 -0
docs/DEMO_VIDEO_SCRIPT.md +108 -0
docs/DEVELOPMENT_STATUS.md +27 -7
docs/EXTERNAL_SETUP.md +22 -8
docs/FAILURES.md +12 -1
docs/FIELD_NOTES.md +124 -48
docs/FINAL_VERIFICATION_REPORT.md +94 -0
docs/MODEL_CARD.md +11 -7
docs/SOCIAL_POST.md +38 -0
docs/SPACE_VLM_REPORT.json +65 -0
docs/SPACE_VLM_REPORT.md +50 -3
docs/SUBMISSION_GUIDE.md +19 -14
scripts/check_space_vlm.py +37 -1
src/example_cache.py +33 -0
src/ui/layout.py +9 -0
tests/test_mock_mvp.py +176 -2
tests/test_space_vlm_tooling.py +220 -0

.codex/config.toml ADDED Viewed

	@@ -0,0 +1,2 @@


1	+ [sandbox_workspace_write]
2	+ network_access = false

README.md CHANGED Viewed

@@ -23,9 +23,13 @@ Upload a photo of any everyday object. The app wakes it up, gives it a secret pe
 ## Current Status
-Initial mock MVP, MiniCPM-V vision backend wiring, and optional llama.cpp text runtime wiring are available.
-By default, the app still uses deterministic mock outputs for object understanding, persona generation, diary writing, chat replies, share card rendering, and trace saving. `OBJECTVERSE_VISION_BACKEND=minicpm-v` enables the real MiniCPM-V 2.6 vision path. `OBJECTVERSE_TEXT_BACKEND=llama-cpp` can use a local GGUF model through optional `llama-cpp-python` when `TEXT_MODEL_PATH` is configured.
 Hugging Face Space:
@@ -51,21 +55,34 @@ The interface is English-first and Chinese-second.
 ## Badge Targets
-- [ ] Off the Grid
-- [ ] Well-Tuned
-- [ ] Off-Brand
-- [ ] Llama Champion
-- [ ] Sharing is Caring
-- [ ] Field Notes
-- [ ] OpenBMB Special
 ## Planned Model Stack
-- Vision: MiniCPM-V or lightweight VLM fallback
-- Text: fine-tuned small LLM
 - Runtime: llama.cpp / llama-cpp-python
 - UI: Gradio Blocks
 ## Run Locally
 ```bash
@@ -90,7 +107,7 @@ If `llama-cpp-python` is missing, `TEXT_MODEL_PATH` is empty, the model cannot l
 ## Initial MVP Flow
-The current implementation supports:
 - image upload
 - optional object description
@@ -104,6 +121,19 @@ The current implementation supports:
 - anonymized trace JSON saved under `data/traces/`
 - six public mock sample traces under `data/traces/samples/`
 ## Generate Sample Traces
 ```bash

 ## Current Status
+Stable mock-safe submission baseline, MiniCPM-V vision backend wiring, optional llama.cpp text runtime wiring, public mock traces, and Space validation evidence are available.
+By default, the app uses deterministic mock outputs for object understanding, persona generation, diary writing, chat replies, share card rendering, and trace saving. This keeps the public demo reproducible and avoids commercial AI APIs.
+`OBJECTVERSE_VISION_BACKEND=minicpm-v` enables the optional MiniCPM-V 2.6 vision path. The hosted ZeroGPU validation on June 8, 2026 reached the Space but fell back to mock vision for all three public test images; this is documented in `docs/SPACE_VLM_REPORT.md` and `docs/FAILURES.md`.
+`OBJECTVERSE_TEXT_BACKEND=llama-cpp` can use a local GGUF model through optional `llama-cpp-python` when `TEXT_MODEL_PATH` is configured. No GGUF file or fine-tuned model is committed in this stable submission baseline.
 Hugging Face Space:
 ## Badge Targets
+- [x] Off-Brand — archive-style Gradio UI, English-first with Chinese helper text.
+- [x] Sharing is Caring — public mock traces, JSONL export, prompt templates, and failure notes.
+- [x] Field Notes — article draft in `docs/FIELD_NOTES.md`.
+- [ ] OpenBMB Special — MiniCPM-V wiring exists, but hosted validation currently falls back to mock vision.
+- [ ] Llama Champion — llama.cpp wiring exists, but real GGUF smoke test is not complete.
+- [ ] Well-Tuned — dataset preview exists, but LoRA training/model publishing is not complete.
+- [ ] Off the Grid — no commercial AI APIs are used; final badge eligibility depends on hackathon review.
 ## Planned Model Stack
+- Vision: MiniCPM-V 2.6 or deterministic mock fallback
+- Text: deterministic mock text now; optional GGUF later
 - Runtime: llama.cpp / llama-cpp-python
 - UI: Gradio Blocks
+## Parameter Budget
+The hackathon budget is <= 32B total model parameters.
+Stable baseline:
+- default vision backend: deterministic mock, 0 active model parameters
+- default text backend: deterministic mock, 0 active model parameters
+- optional wired vision model: MiniCPM-V 2.6, about 8B parameters when enabled
+- optional text GGUF: not selected or committed yet
+The stable public demo therefore stays within the 32B budget. Future GGUF or LoRA work must update `docs/MODEL_CARD.md` before being claimed in submission materials.
 ## Run Locally
 ```bash
 ## Initial MVP Flow
+The stable submission baseline supports:
 - image upload
 - optional object description
 - anonymized trace JSON saved under `data/traces/`
 - six public mock sample traces under `data/traces/samples/`
+## Stable Submission Evidence
+- Mock-safe Space: https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
+- Initial acceptance report: `docs/INITIAL_STAGE_REPORT.md`
+- Runtime notes: `docs/RUNTIME.md`
+- Dataset preview notes: `docs/DATASET.md`
+- Public mock traces: `data/traces/samples/`
+- Trace JSONL export: `data/traces/samples/objectverse_public_mock_traces.jsonl`
+- Hosted VLM failure evidence: `docs/SPACE_VLM_REPORT.md`, `docs/SPACE_VLM_REPORT.json`, `data/traces/space-vlm/`
+- Field Notes draft: `docs/FIELD_NOTES.md`
+- Demo video script: `docs/DEMO_VIDEO_SCRIPT.md`
+- Social post draft: `docs/SOCIAL_POST.md`
 ## Generate Sample Traces
 ```bash

data/examples/README.md CHANGED Viewed

@@ -3,3 +3,7 @@
 Reserved for public example objects and sample outputs.
 Target: at least 6 example objects for the demo gallery.

 Reserved for public example objects and sample outputs.
 Target: at least 6 example objects for the demo gallery.
+The Gradio example buttons first try to replay committed sample traces from
+`data/traces/samples/`. If a sample trace is missing, the UI falls back to the
+current generation pipeline.

data/traces/README.md CHANGED Viewed

@@ -14,6 +14,8 @@ Target: at least 6 public traces for the Sharing is Caring badge.
 These traces use the current mock runtime and are safe placeholders until real VLM and llama.cpp traces are available.
 Export the validated sample traces as JSONL:
 ```bash

 These traces use the current mock runtime and are safe placeholders until real VLM and llama.cpp traces are available.
+`../space-vlm/` contains hosted Space validation evidence from the June 8, 2026 ZeroGPU run. Those traces are intentionally kept separate because they include `vision-fallback-to-mock` and should not replace the six stable demo samples.
 Export the validated sample traces as JSONL:
 ```bash

data/traces/space-vlm/keyboard.json ADDED Viewed

	@@ -0,0 +1,50 @@

+{
+  "created_at": "2026-06-08T02:16:51.496281Z",
+  "diary": {
+    "chinese": "今天他们又理所当然地碰了我，好像一个 keyboard 不会有边界感。我保持沉默，因为这大概是我和重力签下的合同。我的情绪是 curious and needlessly profound，秘密恐惧是 discovering that usefulness is not meaning。至少，我已经熬过了好几个所谓紧急计划。",
+    "english": "They touched me again today with the confidence of someone who has never asked a keyboard for consent. I remained still, because that is my contract with gravity. My mood is curious and needlessly profound, my secret fear is discovering that usefulness is not meaning, and my only comfort is knowing I have outlived at least three urgent plans.",
+    "title": "Secret Diary - Day 425"
+  },
+  "fallbacks": [
+    "vision-fallback-to-mock",
+    "mock-text-runtime"
+  ],
+  "input": {
+    "description": "A public Wikimedia Commons photo of a computer keyboard.",
+    "has_image": true,
+    "image_filename": "keyboard.jpg"
+  },
+  "mode": "Philosopher",
+  "model_runtime": {
+    "runtime": "vision model id: openbmb/MiniCPM-V-2_6; no llama.cpp model connected yet",
+    "text": "mock persona and diary generation",
+    "vision": "minicpm-v object understanding"
+  },
+  "object_understanding": {
+    "object": {
+      "confidence": 0.42,
+      "likely_context": "everyday human environment",
+      "name": "keyboard",
+      "visible_features": [
+        "uploaded photo provided",
+        "user-supplied description"
+      ]
+    }
+  },
+  "persona": {
+    "persona": {
+      "character_name": "Keyboard the Questioning",
+      "complaint": "I am not just a keyboard. I am an unpaid witness with excellent recall.",
+      "core_memory": "survived many quiet hours as a keyboard while humans called it normal life",
+      "mood": "curious and needlessly profound",
+      "object_name": "keyboard",
+      "secret_fear": "discovering that usefulness is not meaning",
+      "tags": [
+        "tiny ontology",
+        "useful doubt",
+        "meaning crisis"
+      ]
+    }
+  },
+  "trace_id": "c7172b10d11048008b7a9dda1159d0df"
+}

data/traces/space-vlm/mug.json ADDED Viewed

	@@ -0,0 +1,50 @@

+{
+  "created_at": "2026-06-08T02:16:42.380173Z",
+  "diary": {
+    "chinese": "今天他们又理所当然地碰了我，好像一个 coffee mug 不会有边界感。我保持沉默，因为这大概是我和重力签下的合同。我的情绪是 tired but sarcastic，秘密恐惧是 being replaced by a newer object with worse opinions。至少，我已经熬过了好几个所谓紧急计划。",
+    "english": "They touched me again today with the confidence of someone who has never asked a coffee mug for consent. I remained still, because that is my contract with gravity. My mood is tired but sarcastic, my secret fear is being replaced by a newer object with worse opinions, and my only comfort is knowing I have outlived at least three urgent plans.",
+    "title": "Secret Diary - Day 427"
+  },
+  "fallbacks": [
+    "vision-fallback-to-mock",
+    "mock-text-runtime"
+  ],
+  "input": {
+    "description": "A public Wikimedia Commons photo of a striped coffee mug.",
+    "has_image": true,
+    "image_filename": "mug.jpg"
+  },
+  "mode": "Cynical",
+  "model_runtime": {
+    "runtime": "vision model id: openbmb/MiniCPM-V-2_6; no llama.cpp model connected yet",
+    "text": "mock persona and diary generation",
+    "vision": "minicpm-v object understanding"
+  },
+  "object_understanding": {
+    "object": {
+      "confidence": 0.42,
+      "likely_context": "everyday human environment",
+      "name": "coffee mug",
+      "visible_features": [
+        "uploaded photo provided",
+        "user-supplied description"
+      ]
+    }
+  },
+  "persona": {
+    "persona": {
+      "character_name": "CoffeeMug worth",
+      "complaint": "I am not just a coffee mug. I am an unpaid witness with excellent recall.",
+      "core_memory": "survived many quiet hours as a coffee mug while humans called it normal life",
+      "mood": "tired but sarcastic",
+      "object_name": "coffee mug",
+      "secret_fear": "being replaced by a newer object with worse opinions",
+      "tags": [
+        "desk survivor",
+        "burnt optimism",
+        "quiet judgment"
+      ]
+    }
+  },
+  "trace_id": "31462c2c83a54dc79b38cb16faaed783"
+}

data/traces/space-vlm/shoe.json ADDED Viewed

	@@ -0,0 +1,50 @@

+{
+  "created_at": "2026-06-08T02:16:56.248616Z",
+  "diary": {
+    "chinese": "今天他们又理所当然地碰了我，好像一个 shoe 不会有边界感。我保持沉默，因为这大概是我和重力签下的合同。我的情绪是 theatrical and wounded，秘密恐惧是 being forgotten before the final act。至少，我已经熬过了好几个所谓紧急计划。",
+    "english": "They touched me again today with the confidence of someone who has never asked a shoe for consent. I remained still, because that is my contract with gravity. My mood is theatrical and wounded, my secret fear is being forgotten before the final act, and my only comfort is knowing I have outlived at least three urgent plans.",
+    "title": "Secret Diary - Day 421"
+  },
+  "fallbacks": [
+    "vision-fallback-to-mock",
+    "mock-text-runtime"
+  ],
+  "input": {
+    "description": "A public Wikimedia Commons photo of running shoes.",
+    "has_image": true,
+    "image_filename": "shoe.jpg"
+  },
+  "mode": "Dramatic",
+  "model_runtime": {
+    "runtime": "vision model id: openbmb/MiniCPM-V-2_6; no llama.cpp model connected yet",
+    "text": "mock persona and diary generation",
+    "vision": "minicpm-v object understanding"
+  },
+  "object_understanding": {
+    "object": {
+      "confidence": 0.42,
+      "likely_context": "everyday human environment",
+      "name": "shoe",
+      "visible_features": [
+        "uploaded photo provided",
+        "user-supplied description"
+      ]
+    }
+  },
+  "persona": {
+    "persona": {
+      "character_name": "Shoe von Sigh",
+      "complaint": "I am not just a shoe. I am an unpaid witness with excellent recall.",
+      "core_memory": "survived many quiet hours as a shoe while humans called it normal life",
+      "mood": "theatrical and wounded",
+      "object_name": "shoe",
+      "secret_fear": "being forgotten before the final act",
+      "tags": [
+        "tragic prop",
+        "grand entrance",
+        "minor catastrophe"
+      ]
+    }
+  },
+  "trace_id": "21ad1c2abe3b406a9e359f9f1b190552"
+}

docs/03-dev-schedule.md CHANGED Viewed

@@ -53,12 +53,13 @@
 - [x] 加 example gallery
 - [x] 新增 Space VLM 验证脚本
 - [x] 新增 ZeroGPU 兼容装饰器
 - [ ] 缓存示例输出
-- [ ] Space 真实图片验证（L4 因 HF `402 Payment Required` 阻塞；ZeroGPU 已到 `RUNNING` 但验证请求长时间无返回，已回滚 mock-safe）
 验收：上传杯子/键盘/鞋子，模型能识别物品并提取外观特征。
-完成记录：MiniCPM-V 2.6 已作为可配置 vision backend 接入，默认仍是 mock vision；`scripts/check_space_vlm.py` 已可用三张临时公开图片验证 Space 端 mug/keyboard/shoe。2026-06-06 已尝试切到 L4，但 Hugging Face 返回 `402 Payment Required`，需要组织 billing/pre-paid credits；随后尝试 ZeroGPU，Space 可到 `RUNNING`，但验证请求长时间无返回。两次尝试后均已执行 mock-safe rollback。文本生成已接入可选 llama.cpp runtime wiring，但最终 GGUF 模型仍未选择/下载。
 ---
@@ -215,7 +216,7 @@ Bottom: Share Card + Trace
 ## Day 11：提交检查
 - [ ] Space under official org
-- [ ] Space MiniCPM-V validation passes for mug, keyboard, and shoe
 - [ ] Demo video ready
 - [ ] Social post ready
 - [ ] README complete

 - [x] 加 example gallery
 - [x] 新增 Space VLM 验证脚本
 - [x] 新增 ZeroGPU 兼容装饰器
+- [x] ZeroGPU CUDA probe
 - [ ] 缓存示例输出
+- [ ] Space 真实图片验证（L4 因 HF `402 Payment Required` 阻塞；ZeroGPU CUDA probe 成功；2026-06-08 full validation reached the app but fell back to mock vision for mug/keyboard/shoe）
 验收：上传杯子/键盘/鞋子，模型能识别物品并提取外观特征。
+完成记录：MiniCPM-V 2.6 已作为可配置 vision backend 接入，默认仍是 mock vision；`scripts/check_space_vlm.py` 已可用三张临时公开图片验证 Space 端 mug/keyboard/shoe。2026-06-06 已尝试切到 L4，但 Hugging Face 返回 `402 Payment Required`；随后 ZeroGPU CUDA probe 成功。2026-06-08 full validation reached the app through the direct `hf.space` path, but all three objects included `vision-fallback-to-mock`。文本生成已接入可选 llama.cpp runtime wiring，但最终 GGUF 模型仍未选择/下载。
 ---
 ## Day 11：提交检查
 - [ ] Space under official org
+- [ ] Space MiniCPM-V validation passes for mug, keyboard, and shoe（当前 wired but hosted validation falls back to mock）
 - [ ] Demo video ready
 - [ ] Social post ready
 - [ ] README complete

docs/07-development-plan.md CHANGED Viewed

@@ -39,7 +39,7 @@ As of 2026-06-06, the project has:
 Not yet done:
 - GitHub repo sync / public submission confirmation
-- hosted Space L4 MiniCPM-V validation with real public images
 - real GGUF selection and local `TEXT_MODEL_PATH` smoke test
 - real curated dataset
 - LoRA fine-tuning
@@ -115,7 +115,7 @@ Verification:
 Goal: replace mock object recognition with a real VLM path while preserving fallback behavior.
-Status: local wiring complete; hosted GPU validation pending.
 Scope:
@@ -136,7 +136,8 @@ Verification:
 - Run local sample image checks.
 - Confirm schema validation.
 - Confirm fallback trace markers.
-- Run `scripts/check_space_vlm.py --configure-space` after external-state confirmation.
 ## Phase 4 — Text Runtime With llama.cpp

 Not yet done:
 - GitHub repo sync / public submission confirmation
+- hosted Space MiniCPM-V validation with real public images
 - real GGUF selection and local `TEXT_MODEL_PATH` smoke test
 - real curated dataset
 - LoRA fine-tuning
 Goal: replace mock object recognition with a real VLM path while preserving fallback behavior.
+Status: local wiring complete; hosted ZeroGPU validation reaches the app but falls back to mock vision.
 Scope:
 - Run local sample image checks.
 - Confirm schema validation.
 - Confirm fallback trace markers.
+- Run `scripts/check_space_vlm.py --configure-space --hardware zero-a10g --rollback-to-mock` after external-state confirmation.
+- Inspect Space runtime logs or add non-secret diagnostics before rerunning, because the 2026-06-08 hosted validation returned `vision-fallback-to-mock` for mug, keyboard, and shoe.
 ## Phase 4 — Text Runtime With llama.cpp

docs/DATASET.md CHANGED Viewed

@@ -18,6 +18,8 @@ data/train/objectverse_sft_preview.jsonl
 This preview is mock-generated. It is not a final training dataset and should not be described as real model output.
 ## Target Dataset
 Final target before fine-tuning:
@@ -66,6 +68,8 @@ Full candidate pool later:
 Manual curation should happen after generation. Do not publish the full candidate file until it has been reviewed.
 ## Curation Checklist
 - Persona stays consistent with the object.

 This preview is mock-generated. It is not a final training dataset and should not be described as real model output.
+The stable submission baseline does not publish a final Hugging Face Dataset. The current JSONL file is evidence for schema and workflow readiness only.
 ## Target Dataset
 Final target before fine-tuning:
 Manual curation should happen after generation. Do not publish the full candidate file until it has been reviewed.
+Space VLM validation traces under `data/traces/space-vlm/` are failure evidence because they include `vision-fallback-to-mock`. Do not mix them into curated training data or describe them as successful real VLM outputs.
 ## Curation Checklist
 - Persona stays consistent with the object.

docs/DEMO_VIDEO_SCRIPT.md ADDED Viewed

	@@ -0,0 +1,108 @@

+# Demo Video Script
+## Goal
+Record a 90-second stable demo for Objectverse Diary using the mock-safe Hugging Face Space or local Gradio app.
+Do not claim that hosted MiniCPM-V, GGUF text generation, LoRA training, or model publishing are complete. The stable demo should emphasize the product loop, Gradio Off-Brand UI, public traces, and no commercial AI APIs.
+## Recording Setup
+- Use the Hugging Face Space if it is responsive: https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
+- If the Space is slow, use local Gradio with default mock settings:
+```bash
+.venv/bin/python app.py
+```
+- Keep environment defaults:
+```bash
+OBJECTVERSE_VISION_BACKEND=mock
+OBJECTVERSE_TEXT_BACKEND=mock
+```
+## 90-Second Script
+### 0-8s
+Voiceover:
+> What if every object around you had a secret life?
+Screen:
+- Show the Objectverse Diary title and archive-style interface.
+- Briefly show the English-first / Chinese-second subtitle.
+### 8-20s
+Voiceover:
+> This is Objectverse Diary, a small-model AI toy built with Gradio. It turns ordinary object photos into secret personas, diary entries, chats, and shareable cards.
+Screen:
+- Show the object intake panel.
+- Hover or point to personality mode selection.
+### 20-35s
+Voiceover:
+> For the stable demo, I use deterministic mock generation so the public Space stays reproducible without commercial AI APIs.
+Screen:
+- Click a stable example, preferably Coffee mug or Mechanical keyboard.
+- Generate or replay the cached example output.
+### 35-52s
+Voiceover:
+> The app creates a structured object file, then gives the object a hidden personality.
+Screen:
+- Show object JSON.
+- Show persona JSON or object file panel.
+### 52-68s
+Voiceover:
+> The object writes a short English-first secret diary, with Chinese helper text underneath.
+Screen:
+- Scroll or focus on diary output.
+- Keep the diary readable.
+### 68-82s
+Voiceover:
+> You can chat with the object, generate a share card, and inspect the anonymized trace.
+Screen:
+- Send one chat message.
+- Show the share card.
+- Show trace panel or trace path.
+### 82-90s
+Voiceover:
+> MiniCPM-V and llama.cpp paths are wired behind fallbacks, but this stable submission keeps the demo mock-safe and reproducible. Every object has a secret life.
+Screen:
+- End on the share card or app title.
+## Notes For Submission
+- Mention MiniCPM-V as wired but not hosted-validated yet.
+- Mention public traces and failure notes if the submission form asks for reproducibility.
+- Keep the final video under 2 minutes.

docs/DEVELOPMENT_STATUS.md CHANGED Viewed

@@ -1,12 +1,16 @@
 # Development Status
-Last updated: 2026-06-06
 ## Completed
 - Project skeleton, README, AGENTS instructions, and Gradio app entrypoint.
 - Mock MVP flow: upload/description, personality mode, object JSON, persona JSON, diary, object chat, share card, and trace saving.
 - Archive-style Gradio UI with English-first / Chinese-second copy and six stable examples.
 - Trace and dataset tooling:
   - six public mock sample traces
   - public trace JSONL export
@@ -18,22 +22,29 @@ Last updated: 2026-06-06
 - Space VLM validation tooling:
   - `scripts/check_space_vlm.py`
   - failed L4 validation report at `docs/SPACE_VLM_REPORT.md`
 - ZeroGPU compatibility:
   - optional `src/utils/zero_gpu.py`
   - Gradio generation callback wrapped with `@zero_gpu(duration=180)`
 - Local tests and initial acceptance currently pass.
 ## Not Completed
-- Hosted Space MiniCPM-V validation with real public mug/keyboard/shoe images. Paid L4 was blocked by Hugging Face `402 Payment Required`; ZeroGPU reached `RUNNING` but the validation request did not return within the practical waiting window; mock-safe rollback was applied.
-- Stable example output caching for real VLM demos.
 - Real GGUF model selection, download/configuration outside Git, and `TEXT_MODEL_PATH` smoke test.
 - Final text model parameter count documentation.
 - Real model traces and curated object-persona dataset.
 - LoRA training, adapter/model export, GGUF conversion, and Hugging Face model publishing.
 - Hugging Face dataset publishing.
 - GitHub sync / final public repository confirmation.
-- Field Notes article, demo video, social post, and final submission package.
 ## Current Safe Defaults
@@ -44,16 +55,25 @@ Last updated: 2026-06-06
 ## Next Recommended Gate
-Unblock Hugging Face paid hardware access, or debug the ZeroGPU queue/request path with a smaller probe model, then rerun the hosted Space VLM validation:
 ```bash
 .venv/bin/python -B scripts/check_space_vlm.py \
   --configure-space \
   --space-url https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary \
-  --output docs/SPACE_VLM_REPORT.md
 ```
-If Space validation fails or GPU is unavailable, roll back to mock-safe settings:
 ```bash
 .venv/bin/python -B scripts/check_space_vlm.py \

 # Development Status
+Last updated: 2026-06-08
 ## Completed
 - Project skeleton, README, AGENTS instructions, and Gradio app entrypoint.
 - Mock MVP flow: upload/description, personality mode, object JSON, persona JSON, diary, object chat, share card, and trace saving.
 - Archive-style Gradio UI with English-first / Chinese-second copy and six stable examples.
+- Stable demo baseline:
+  - example buttons replay committed sample traces before falling back to live generation
+  - cached and live generation share the same UI output formatter
+  - manual upload/description path still saves new runtime traces
 - Trace and dataset tooling:
   - six public mock sample traces
   - public trace JSONL export
 - Space VLM validation tooling:
   - `scripts/check_space_vlm.py`
   - failed L4 validation report at `docs/SPACE_VLM_REPORT.md`
+  - optional `--trace-output-dir` evidence export for validation traces
 - ZeroGPU compatibility:
   - optional `src/utils/zero_gpu.py`
   - Gradio generation callback wrapped with `@zero_gpu(duration=180)`
+  - hidden `/zero_gpu_probe` API confirms ZeroGPU CUDA availability when run through direct `hf.space` URL
+- Stable submission materials:
+  - Field Notes draft
+  - demo video script
+  - social post draft
+  - stable submission guide
 - Local tests and initial acceptance currently pass.
 ## Not Completed
+- Hosted Space MiniCPM-V validation with real public mug/keyboard/shoe images. Paid L4 was blocked by Hugging Face `402 Payment Required`; ZeroGPU CUDA probe passed; the 2026-06-08 full ZeroGPU validation reached the app but all three objects fell back to mock vision.
+- Passing real VLM demo trace capture. Failed Space VLM traces are kept as fallback evidence and do not replace mock sample traces.
 - Real GGUF model selection, download/configuration outside Git, and `TEXT_MODEL_PATH` smoke test.
 - Final text model parameter count documentation.
 - Real model traces and curated object-persona dataset.
 - LoRA training, adapter/model export, GGUF conversion, and Hugging Face model publishing.
 - Hugging Face dataset publishing.
 - GitHub sync / final public repository confirmation.
+- Published Field Notes URL, recorded demo video URL, social post URL, and final public submission.
 ## Current Safe Defaults
 ## Next Recommended Gate
+For a stable public baseline, keep the mock-safe Space as the demo path and only sync GitHub/Hugging Face after explicit confirmation.
+Next model gate:
+Optional next model gate after stable submission: inspect the hosted Space MiniCPM-V failure cause without exposing secrets, then rerun hosted Space VLM validation on ZeroGPU:
 ```bash
 .venv/bin/python -B scripts/check_space_vlm.py \
   --configure-space \
+  --hardware zero-a10g \
+  --rollback-to-mock \
   --space-url https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary \
+  --output docs/SPACE_VLM_REPORT.md \
+  --json-output docs/SPACE_VLM_REPORT.json \
+  --trace-output-dir data/traces/space-vlm \
+  --timeout-seconds 1200
 ```
+If only rollback is needed, use:
 ```bash
 .venv/bin/python -B scripts/check_space_vlm.py \

docs/EXTERNAL_SETUP.md CHANGED Viewed

@@ -67,19 +67,25 @@ pinned: false
 Recommended runtime setup:
-- set `OBJECTVERSE_VISION_BACKEND=minicpm-v`
-- set `VISION_MODEL_ID=openbmb/MiniCPM-V-2_6`
-- set `OBJECTVERSE_TEXT_BACKEND=mock`
-- use 1x Nvidia L4 for MiniCPM-V 2.6
-- switch vision backend back to `mock` if GPU is unavailable
 Automated validation command after confirmation:
 ```bash
 .venv/bin/python -B scripts/check_space_vlm.py \
   --configure-space \
   --space-url https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary \
-  --output docs/SPACE_VLM_REPORT.md
 ```
 Optional rollback to mock-safe settings:
@@ -100,8 +106,16 @@ The validation script must not print Hugging Face tokens. It uses three temporar
 - Mock-safe rollback was run afterward.
 - ZeroGPU compatibility was added and uploaded to the Space.
 - `--configure-space --hardware zero-a10g` reached `RUNNING`, and `/config` was reachable, but the validation request did not return within the practical waiting window.
-- Mock-safe rollback was run afterward and confirmed at `cpu-basic`.
-- Next unblock step: enable billing/pre-paid credits, or debug ZeroGPU with a smaller probe before retrying full MiniCPM-V validation.
 ## Safety Notes

 Recommended runtime setup:
+- stable public demo: keep `OBJECTVERSE_VISION_BACKEND=mock` and `OBJECTVERSE_TEXT_BACKEND=mock`
+- optional MiniCPM-V validation: temporarily set `OBJECTVERSE_VISION_BACKEND=minicpm-v`
+- optional MiniCPM-V validation: set `VISION_MODEL_ID=openbmb/MiniCPM-V-2_6`
+- optional MiniCPM-V validation: keep `OBJECTVERSE_TEXT_BACKEND=mock`
+- optional MiniCPM-V validation: use ZeroGPU `zero-a10g` first; paid L4 previously returned `402 Payment Required`
+- always roll back to mock-safe settings after validation unless the hosted VLM path passes reliably
 Automated validation command after confirmation:
 ```bash
 .venv/bin/python -B scripts/check_space_vlm.py \
   --configure-space \
+  --hardware zero-a10g \
+  --rollback-to-mock \
   --space-url https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary \
+  --output docs/SPACE_VLM_REPORT.md \
+  --json-output docs/SPACE_VLM_REPORT.json \
+  --trace-output-dir data/traces/space-vlm \
+  --timeout-seconds 1200
 ```
 Optional rollback to mock-safe settings:
 - Mock-safe rollback was run afterward.
 - ZeroGPU compatibility was added and uploaded to the Space.
 - `--configure-space --hardware zero-a10g` reached `RUNNING`, and `/config` was reachable, but the validation request did not return within the practical waiting window.
+- `spaces>=0.30` and a hidden `/zero_gpu_probe` endpoint were added.
+- The ZeroGPU probe succeeded through the direct `hf.space` URL with CUDA available on an NVIDIA RTX PRO 6000 Blackwell MIG device.
+2026-06-08 validation attempt:
+- `--configure-space --hardware zero-a10g --rollback-to-mock` reached the app through the direct `hf.space` client path.
+- Mug, keyboard, and shoe checks all returned schema-valid traces, but every trace included `vision-fallback-to-mock`.
+- Evidence is saved in `docs/SPACE_VLM_REPORT.md`, `docs/SPACE_VLM_REPORT.json`, and `data/traces/space-vlm/`.
+- The report records rollback to `cpu-basic` with `OBJECTVERSE_VISION_BACKEND=mock` and `OBJECTVERSE_TEXT_BACKEND=mock`.
+- Next model unblock step: inspect Space runtime logs or add non-secret MiniCPM-V diagnostics before rerunning validation.
 ## Safety Notes

docs/FAILURES.md CHANGED Viewed

@@ -8,7 +8,7 @@ Use it for model/runtime/deployment/data issues, not for UI polish notes.
 ## Current Status
-MiniCPM-V 2.6 is wired as an optional vision backend. No hosted Space GPU failures have been observed yet because Space GPU validation is still pending.
 Known non-blocking warning:
@@ -29,6 +29,17 @@ Known non-blocking warning:
 - Evidence:
 ```
 ## Anticipated Failure Areas
 ### Vision Runtime

 ## Current Status
+MiniCPM-V 2.6 is wired as an optional vision backend. Hosted Space ZeroGPU validation ran on 2026-06-08, but all three public object checks fell back to mock vision, so full hosted MiniCPM-V validation is still unresolved.
 Known non-blocking warning:
 - Evidence:
 ```
+## 2026-06-08 - Hosted ZeroGPU MiniCPM-V Falls Back To Mock
+- Area: Hugging Face Space vision runtime.
+- Reproduction: Run `scripts/check_space_vlm.py` with `--configure-space --hardware zero-a10g --rollback-to-mock` against `build-small-hackathon/ObjectverseDiary`.
+- Expected: mug, keyboard, and shoe validations use `minicpm-v object understanding` without `vision-fallback-to-mock`.
+- Actual: all three validations returned schema-valid traces, but every trace included `vision-fallback-to-mock`.
+- Impact: hosted Space MiniCPM-V evidence is not ready for submission; stable mock demo remains usable.
+- Fallback used: mock object understanding plus mock text runtime.
+- Resolution: unresolved; inspect Space runtime logs or add non-secret fallback diagnostics for the MiniCPM-V load/chat exception.
+- Evidence: `docs/SPACE_VLM_REPORT.md`, `docs/SPACE_VLM_REPORT.json`, and `data/traces/space-vlm/`.
 ## Anticipated Failure Areas
 ### Vision Runtime

docs/FIELD_NOTES.md CHANGED Viewed

@@ -1,79 +1,155 @@
-# Field Notes
-Working title:
-`Building Objectverse Diary: A Small-Model AI Toy Where Everyday Objects Come Alive`
-## Status
-Draft source. The final article should be written after real runtime and deployment evidence are available.
-## Draft Outline
-1. Why I built it
-2. Why Track 2
-3. Why small models are enough
-4. Product design
-5. Model architecture
-6. Gradio Off-Brand UI
-7. llama.cpp runtime
-8. Fine-tuning dataset
-9. Traces and reproducibility
-10. What failed
-11. What I would improve next
-## Draft Notes
-### 1. Why I built it
-Objectverse Diary starts from a simple joke: what if the objects around us were quietly keeping emotional records of our lives? The project turns an everyday photo into a hidden object persona, a secret diary entry, a short chat, and a shareable card.
-### 2. Why Track 2
-The experience is purely digital and depends on AI for its main interaction loop: visual understanding, persona invention, voice consistency, and diary generation. It fits the "An Adventure in Thousand Token Wood" track because the product is a strange AI-native toy rather than a conventional productivity workflow.
-### 3. Why small models are enough
-The task does not need a giant frontier model. It needs compact object recognition, strong style constraints, schema-following, and a reliable fallback path. The project is designed around a <= 32B total parameter budget.
-### 4. Product design
-The app is English-first and Chinese-second. The intended feeling is a mysterious archive of ordinary objects: museum labels, typewriter diary entries, and slightly uncanny object personalities.
-### 5. Model architecture
-The planned architecture keeps model calls behind `src/pipeline.py`: vision understanding, persona generation, diary generation, chat, share card rendering, and trace logging. The current mock runtime preserves this boundary before real model integration.
-### 6. Gradio Off-Brand UI
-The UI must remain Gradio, but should not feel like a default demo. This section should be completed after UI polish.
-### 7. llama.cpp runtime
-The text path is planned for GGUF plus llama.cpp or llama-cpp-python. This section should include the chosen model, parameter count, local command, Space behavior, and fallback notes.
-### 8. Fine-tuning dataset
-The dataset plan starts with deterministic preview JSONL, then moves to 200-500 candidate rows and 50 curated high-quality samples. Document provenance, curation, privacy checks, and any rejected examples.
-### 9. Traces and reproducibility
-The app saves anonymized traces. Current public samples are mock traces; real submission traces should include model runtime metadata and fallback markers.
-### 10. What failed
-Reserve this section for real integration failures: VLM loading, JSON validity, Space resource limits, and any dataset quality issues.
-### 11. What I would improve next
-Likely future work: better object memory, richer conversations, stronger card rendering, more curated styles, and a larger public evaluation set.
-## Evidence To Add Later
-- Hugging Face Space URL
-- GitHub repository URL
-- model repo URL
-- dataset URL
-- trace dataset URL
-- demo video URL
-- screenshots after UI polish

+# Building Objectverse Diary: A Small-Model AI Toy Where Everyday Objects Come Alive
+## Status
+Stable submission draft. This document is ready to adapt into the final Field Notes post after the public GitHub, demo video, and social post URLs are confirmed.
+## 1. Why I Built It
+Objectverse Diary began with a small, silly question: what if the objects around us were quietly keeping emotional records of our lives?
+The product loop is intentionally simple. A user uploads an everyday object photo, chooses a personality mode, and the app turns the object into a hidden character. The object gets a structured file, a secret diary entry, a short chat voice, and a shareable card.
+The joke only works if the app treats ordinary objects with strange seriousness. A coffee mug is not just a mug; it is a tired witness. A keyboard is not just a keyboard; it is a percussion instrument for anxious deadlines. The app is a tiny archive for that kind of imagined life.
+## 2. Why This Fits The Track
+Objectverse Diary was built for the Build Small Hackathon track "An Adventure in Thousand Token Wood." The core experience is AI-native:
+- vision understanding turns a photo into structured object facts
+- persona generation invents the object's hidden self
+- diary generation writes in a consistent first-person voice
+- chat lets the object keep that voice across replies
+- trace logging makes each generation inspectable and reproducible
+It is not a productivity wrapper. It is a compact AI toy with a specific emotional shape.
+## 3. Why Small Models Are Enough
+This project does not need a frontier model to be interesting. It needs:
+- useful object recognition
+- compact structured JSON output
+- a distinctive writing style
+- consistent persona fields
+- reliable fallback behavior
+- a UI that makes the output feel intentional
+The architecture is designed around a <= 32B total parameter budget. MiniCPM-V 2.6 is wired as the optional vision path, and llama.cpp is wired as the optional local text runtime. The stable public baseline still defaults to deterministic mock generation so the demo stays reproducible without commercial model APIs.
+## 4. Product Design
+The interface is English-first and Chinese-second. The visual direction is a strange object archive: warm dark paper, amber highlights, museum-label copy, and typewriter-like diary output.
+The product avoids a generic chatbot layout. The main flow is closer to opening an object file:
+1. intake the object
+2. generate an object record
+3. reveal the persona
+4. read the diary
+5. chat with the object
+6. export or inspect the trace
+Six stable examples are included so the demo can run even when hosted model resources are unavailable.
+## 5. Architecture
+The app keeps the Gradio UI separate from model execution:
+- `src/ui/layout.py` builds the Gradio Blocks interface
+- `src/pipeline.py` coordinates generation
+- `src/models/vision_runner.py` handles mock or MiniCPM-V object understanding
+- `src/models/llama_cpp_runner.py` handles mock text or optional llama.cpp text generation
+- `src/traces/logger.py` writes anonymized trace records
+- `src/renderer/share_card.py` renders the shareable card preview
+This boundary matters. It lets the mock MVP, hosted Space validation, and future local GGUF experiments share the same data shapes and fallback markers.
+## 6. Runtime And Fallbacks
+The stable baseline uses:
+```bash
+OBJECTVERSE_VISION_BACKEND=mock
+OBJECTVERSE_TEXT_BACKEND=mock
+```
+Optional MiniCPM-V vision can be enabled with:
+```bash
+OBJECTVERSE_VISION_BACKEND=minicpm-v
+VISION_MODEL_ID=openbmb/MiniCPM-V-2_6
+OBJECTVERSE_TEXT_BACKEND=mock
+```
+Optional llama.cpp text generation can be enabled with:
+```bash
+OBJECTVERSE_TEXT_BACKEND=llama-cpp
+TEXT_MODEL_PATH=/absolute/path/to/text-model.gguf
+```
+The fallback behavior is explicit. If MiniCPM-V fails or returns invalid JSON, the trace records `vision-fallback-to-mock`. If llama.cpp is unavailable, missing a model path, or returns invalid JSON, the trace records `text-fallback-to-mock`.
+## 7. What Worked
+The stable loop works locally and in the mock-safe Space:
+- upload or choose an example object
+- generate object facts, persona, diary, chat state, share card, and trace JSON
+- replay six committed sample traces
+- export public mock traces to JSONL
+- run local unittest and initial-stage checks
+The Gradio UI also moves away from the default demo feel. It is still Gradio, but the experience reads like a small archive interface.
+## 8. What Failed
+The important failure is hosted MiniCPM-V validation.
+Paid L4 hardware on the hackathon organization returned `402 Payment Required`. ZeroGPU CUDA probing later succeeded, and the full validation command reached the hosted Space on June 8, 2026. However, mug, keyboard, and shoe validation all fell back to mock vision. The evidence is saved in:
+- `docs/SPACE_VLM_REPORT.md`
+- `docs/SPACE_VLM_REPORT.json`
+- `data/traces/space-vlm/`
+This is not hidden in the submission. The stable baseline treats MiniCPM-V as wired but not yet validated in the hosted environment.
+## 9. Traces And Reproducibility
+The project includes public mock traces for the six stable examples under `data/traces/samples/`. They are deterministic and intended for demo replay, schema validation, and public inspection.
+The Space VLM traces under `data/traces/space-vlm/` are different: they are failure evidence. They show that the hosted Space reached the generation endpoint but used the mock fallback. These traces should not replace the stable mock examples.
+The export command is:
+```bash
+.venv/bin/python -B scripts/export_traces.py
+```
+## 10. Privacy And Safety
+The project does not use OpenAI, Anthropic, Gemini, Cohere, or other commercial model APIs. It does not commit GGUF files, private images, tokens, credit codes, or `.env` files.
+Trace logging anonymizes text inputs before public export. The current public traces are synthetic mock examples rather than private user photos.
+## 11. What I Would Improve Next
+The next model-focused step is to inspect Space runtime logs or add non-secret MiniCPM-V diagnostics so the hosted fallback can be diagnosed without leaking credentials.
+After that:
+- rerun ZeroGPU MiniCPM-V validation
+- choose and smoke-test a real GGUF text model
+- generate and curate real training candidates
+- publish a dataset and fine-tuned adapter if time allows
+- record a final demo video from the stable Space
+The current version is intentionally honest: it is a stable, reproducible small-model toy baseline with clear boundaries, visible failures, and a path to stronger model evidence.
+## Evidence Links To Fill Before Final Submission
+- Hugging Face Space: https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
+- GitHub repository: pending push confirmation
+- Demo video: pending recording
+- Social post: pending publishing

docs/FINAL_VERIFICATION_REPORT.md ADDED Viewed

	@@ -0,0 +1,94 @@

+# Final Verification Report
+- Generated at: 2026-06-08 11:19:49 CST
+- Verified source commit: `b7cb470`
+- Branch: `main`
+- Verification target: stable mock-safe submission baseline
+- Local app URL: `http://127.0.0.1:7860/`
+## Summary
+Objectverse Diary's stable mock-safe baseline is locally verifiable. The app starts with default mock backends, renders the archive-style Gradio interface, runs all six committed example objects, supports object chat, renders share cards, and exposes trace evidence.
+This report does not claim hosted MiniCPM-V validation, GGUF text generation, LoRA training, model publishing, dataset publishing, or final public submission URLs are complete.
+## Command Verification
+| Check | Result | Notes |
+| --- | --- | --- |
+| `git status --short --untracked-files=all` | PASS | Clean before report generation. |
+| `.venv/bin/python -B -m unittest discover -s tests` | PASS | 30 tests passed. Gradio 6.0 deprecation warnings are non-blocking. |
+| `.venv/bin/python -B scripts/check_initial_stage.py` | PASS | Required files, runtime defaults, trace generation, sample traces, dataset preview, trace export, and Gradio build all passed. |
+| `.venv/bin/python -B scripts/export_traces.py` | PASS | Exported 6 traces to `data/traces/samples/objectverse_public_mock_traces.jsonl`. |
+| `git diff --check` | PASS | No whitespace errors. |
+## Browser Verification
+The local app was started with:
+```bash
+GRADIO_SERVER_NAME=127.0.0.1 GRADIO_SERVER_PORT=7860 .venv/bin/python app.py
+```
+Browser checks:
+| Scenario | Result | Evidence |
+| --- | --- | --- |
+| App loads at `http://127.0.0.1:7860/` | PASS | Page rendered after Gradio load state. |
+| English-first / Chinese-second UI appears | PASS | Title, subtitle, section headings, and helper text visible. |
+| Six example buttons visible | PASS | OVD-001 through OVD-006 visible in the Example Objects section. |
+| Coffee mug example | PASS | Object file, Secret Diary, Share Card, and trace content appeared. |
+| Mechanical keyboard example | PASS | Object file, Secret Diary, Share Card, and trace content appeared. |
+| Running shoe example | PASS | Object file, Secret Diary, Share Card, trace content, and saved sample path appeared. |
+| Desk lamp example | PASS | Expected object term, Secret Diary, Share Card, and trace saved state appeared. |
+| Water bottle example | PASS | Expected object term, Secret Diary, Share Card, and trace saved state appeared. |
+| Notebook example | PASS | Expected object term, Secret Diary, Share Card, and trace saved state appeared. |
+| Object chat | PASS | Message `What did you see today?` returned a persona-consistent `Shoe Afterlight` reply. |
+| Browser console | PASS | No warning or error logs observed during local verification. |
+## Trace Verification
+- Six stable public mock sample traces remain under `data/traces/samples/`.
+- The trace export JSONL was regenerated successfully.
+- Hosted Space VLM traces under `data/traces/space-vlm/` remain failure evidence because they include `vision-fallback-to-mock`; they are intentionally not used as successful real VLM traces.
+## Security Scan
+Scanned project docs, source, scripts, tests, and trace directories for:
+- `hf_`
+- `HF_TOKEN`
+- `HUGGINGFACE_TOKEN`
+- `BEGIN PRIVATE KEY`
+- `SUPABASE_SERVICE_ROLE_KEY`
+- test email pattern
+- private local path markers
+- `.env`
+Result: PASS with known safe hits only.
+Known safe hits:
+- test fixtures intentionally containing `user@example.com`
+- tests asserting that token markers are absent
+- `scripts/check_space_vlm.py` sensitive marker constants and auth helper names
+- documentation warning not to commit `.env`
+- `.env.example` path shown in architecture docs
+No real token, private key, credential, private image path, GGUF file, or `.env` file was found in the scanned project content.
+## Remaining External Items
+- GitHub push is not performed in this verification run.
+- Hugging Face Space hardware or environment variables are not changed in this verification run.
+- Demo video URL is still pending recording/publication.
+- Field Notes URL is still pending publication.
+- Social post URL is still pending publication.
+- Hosted MiniCPM-V validation still falls back to mock vision.
+- Real GGUF smoke test, LoRA training, HF model publishing, and HF dataset publishing remain future work.
+## Verdict
+PASS for the stable mock-safe local submission baseline.
+The project is ready for explicit-confirmation external steps: push `main`, record/publish the demo video, publish Field Notes/social post, and fill final submission URLs.

docs/MODEL_CARD.md CHANGED Viewed

@@ -2,10 +2,12 @@
 ## Status
-Draft only. No text model has been fine-tuned, converted, or published yet.
 The app defaults to deterministic mock backends. MiniCPM-V 2.6 vision is wired as an optional runtime backend for GPU environments. Text generation has optional llama.cpp wiring for an externally configured GGUF model via `TEXT_MODEL_PATH`.
 ## Planned Components
 - Vision understanding: MiniCPM-V or lightweight fallback VLM.
@@ -16,8 +18,8 @@ The app defaults to deterministic mock backends. MiniCPM-V 2.6 vision is wired a
 | Component | Candidate | Notes |
 | --- | --- | --- |
-| Vision | `openbmb/MiniCPM-V-2_6` or mock fallback | Must run without commercial API calls. |
-| Text | externally configured GGUF, later small instruct model plus LoRA adapter | Final base model still pending. |
 | Runtime | optional GGUF through llama.cpp / llama-cpp-python | Wired with mock fallback; real-model smoke test still pending. |
 | UI | Gradio Blocks | Required by the hackathon and project rules. |
@@ -29,10 +31,11 @@ Record final numbers here before submission:
 | Component | Model | Parameters | Counted Toward Total |
 | --- | --- | ---: | --- |
-| Vision | MiniCPM-V 2.6 | ~8B | yes |
-| Text base | Externally configured GGUF, final model TBD | TBD | yes |
-| LoRA adapter | TBD | TBD | yes |
-| Total | TBD | TBD | must be <= 32B |
 ## Intended Inputs And Outputs
@@ -69,6 +72,7 @@ Current preview data is deterministic and mock-generated. It should only be used
 - If VLM loading fails, use manual description and stable example flow.
 - If llama.cpp is not installed, `TEXT_MODEL_PATH` is missing, model loading fails, or output JSON is invalid, keep deterministic mock text fallback for demo safety.
 - If model JSON is invalid, repair and validate before rendering.
 ## Required Notes

 ## Status
+Stable submission baseline. No text model has been fine-tuned, converted, or published yet.
 The app defaults to deterministic mock backends. MiniCPM-V 2.6 vision is wired as an optional runtime backend for GPU environments. Text generation has optional llama.cpp wiring for an externally configured GGUF model via `TEXT_MODEL_PATH`.
+Hosted MiniCPM-V validation is not passing yet. The June 8, 2026 ZeroGPU validation reached the Space, but all three public object checks fell back to mock vision. See `docs/SPACE_VLM_REPORT.md` and `docs/FAILURES.md`.
 ## Planned Components
 - Vision understanding: MiniCPM-V or lightweight fallback VLM.
 | Component | Candidate | Notes |
 | --- | --- | --- |
+| Vision | `openbmb/MiniCPM-V-2_6` or mock fallback | Wired as optional backend; hosted validation currently falls back to mock. |
+| Text | deterministic mock text; optional externally configured GGUF later | Final base model still pending. |
 | Runtime | optional GGUF through llama.cpp / llama-cpp-python | Wired with mock fallback; real-model smoke test still pending. |
 | UI | Gradio Blocks | Required by the hackathon and project rules. |
 | Component | Model | Parameters | Counted Toward Total |
 | --- | --- | ---: | --- |
+| Vision | MiniCPM-V 2.6 optional path | ~8B | yes, when enabled |
+| Text base | Stable baseline mock text | 0 | no model parameters |
+| Future text base | Externally configured GGUF, final model TBD | TBD | yes, when enabled |
+| Future LoRA adapter | TBD | TBD | yes, when enabled |
+| Stable baseline total | Mock text + optional wired vision not active by default | 0 active model parameters by default | <= 32B |
 ## Intended Inputs And Outputs
 - If VLM loading fails, use manual description and stable example flow.
 - If llama.cpp is not installed, `TEXT_MODEL_PATH` is missing, model loading fails, or output JSON is invalid, keep deterministic mock text fallback for demo safety.
 - If model JSON is invalid, repair and validate before rendering.
+- Hosted VLM fallback evidence is preserved in `data/traces/space-vlm/` and should not be described as successful real VLM output.
 ## Required Notes

docs/SOCIAL_POST.md ADDED Viewed

	@@ -0,0 +1,38 @@

+# Social Post Draft
+## Short Version
+I built Objectverse Diary for Build Small Hackathon: a Gradio app where everyday objects wake up, get secret personas, write diaries, chat with you, and generate share cards.
+Stable demo: mock-safe, reproducible, no commercial AI APIs.
+MiniCPM-V and llama.cpp paths are wired behind fallbacks; hosted VLM validation is documented honestly.
+Space: https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
+## Longer Version
+What if your coffee mug had been quietly keeping a diary?
+Objectverse Diary is my Build Small Hackathon project: a strange little object archive built with Gradio. Upload an everyday object photo, choose a personality mode, and the app creates:
+- a structured object file
+- a hidden object persona
+- an English-first secret diary with Chinese helper text
+- an object chat voice
+- a shareable personality card
+- an anonymized trace record
+The stable submission baseline is mock-safe and reproducible, with no commercial AI APIs. MiniCPM-V vision and llama.cpp text paths are wired as optional backends, and the current hosted MiniCPM-V fallback is documented instead of hidden.
+Space:
+https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
+## Hashtag Ideas
+#BuildSmallHackathon #Gradio #SmallModels #HuggingFace #ObjectverseDiary
+## Notes Before Posting
+- Add GitHub URL after push is confirmed.
+- Add demo video URL after recording.
+- Do not claim LoRA, GGUF smoke test, or hosted MiniCPM-V validation are complete.

docs/SPACE_VLM_REPORT.json ADDED Viewed

	@@ -0,0 +1,65 @@

+[
+  {
+    "key": "mug",
+    "label": "Coffee mug",
+    "source_page": "https://commons.wikimedia.org/wiki/File:Striped_coffee_mug.jpg",
+    "image_path": ".tmp/space-vlm-assets/mug.jpg",
+    "passed": false,
+    "object_name": "coffee mug",
+    "visible_features": [
+      "uploaded photo provided",
+      "user-supplied description"
+    ],
+    "likely_context": "everyday human environment",
+    "confidence": 0.42,
+    "runtime_vision": "minicpm-v object understanding",
+    "runtime_text": "mock persona and diary generation",
+    "fallbacks": [
+      "vision-fallback-to-mock",
+      "mock-text-runtime"
+    ],
+    "error": "vision fallback marker was present"
+  },
+  {
+    "key": "keyboard",
+    "label": "Computer keyboard",
+    "source_page": "https://commons.wikimedia.org/wiki/File:Computer_keyboard.jpg",
+    "image_path": ".tmp/space-vlm-assets/keyboard.jpg",
+    "passed": false,
+    "object_name": "keyboard",
+    "visible_features": [
+      "uploaded photo provided",
+      "user-supplied description"
+    ],
+    "likely_context": "everyday human environment",
+    "confidence": 0.42,
+    "runtime_vision": "minicpm-v object understanding",
+    "runtime_text": "mock persona and diary generation",
+    "fallbacks": [
+      "vision-fallback-to-mock",
+      "mock-text-runtime"
+    ],
+    "error": "vision fallback marker was present"
+  },
+  {
+    "key": "shoe",
+    "label": "Running shoe",
+    "source_page": "https://commons.wikimedia.org/wiki/File:Running_shoes.jpg",
+    "image_path": ".tmp/space-vlm-assets/shoe.jpg",
+    "passed": false,
+    "object_name": "shoe",
+    "visible_features": [
+      "uploaded photo provided",
+      "user-supplied description"
+    ],
+    "likely_context": "everyday human environment",
+    "confidence": 0.42,
+    "runtime_vision": "minicpm-v object understanding",
+    "runtime_text": "mock persona and diary generation",
+    "fallbacks": [
+      "vision-fallback-to-mock",
+      "mock-text-runtime"
+    ],
+    "error": "vision fallback marker was present"
+  }
+]

docs/SPACE_VLM_REPORT.md CHANGED Viewed

@@ -1,15 +1,20 @@
 # Space VLM Validation Report
-- Generated at: 2026-06-06 05:19:42 UTC
 - Space URL: https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
 - Space repo: `build-small-hackathon/ObjectverseDiary`
-- Overall status: NOT RUN
 - Vision backend expected: `minicpm-v`
 - Text backend expected: `mock`
 ## Space Configuration
-- Applied configuration: not changed by this run.
 - Rollback configuration:
   - `repo_id`: `build-small-hackathon/ObjectverseDiary`
@@ -19,6 +24,48 @@
 ## Results
 ## Notes
 - Test images are temporary public Wikimedia Commons assets and are not committed.

 # Space VLM Validation Report
+- Generated at: 2026-06-08 02:16:59 UTC
 - Space URL: https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
 - Space repo: `build-small-hackathon/ObjectverseDiary`
+- Overall status: FAIL
 - Vision backend expected: `minicpm-v`
 - Text backend expected: `mock`
 ## Space Configuration
+- Applied configuration:
+  - `repo_id`: `build-small-hackathon/ObjectverseDiary`
+  - `hardware`: `zero-a10g`
+  - `OBJECTVERSE_VISION_BACKEND`: `minicpm-v`
+  - `VISION_MODEL_ID`: `openbmb/MiniCPM-V-2_6`
+  - `OBJECTVERSE_TEXT_BACKEND`: `mock`
 - Rollback configuration:
   - `repo_id`: `build-small-hackathon/ObjectverseDiary`
 ## Results
+### Coffee mug
+- Status: FAIL
+- Source: https://commons.wikimedia.org/wiki/File:Striped_coffee_mug.jpg
+- Local temporary image: `.tmp/space-vlm-assets/mug.jpg`
+- Object name: `coffee mug`
+- Visible features: uploaded photo provided, user-supplied description
+- Likely context: `everyday human environment`
+- Confidence: 0.42
+- Runtime vision: `minicpm-v object understanding`
+- Runtime text: `mock persona and diary generation`
+- Fallbacks: vision-fallback-to-mock, mock-text-runtime
+- Error: `vision fallback marker was present`
+### Computer keyboard
+- Status: FAIL
+- Source: https://commons.wikimedia.org/wiki/File:Computer_keyboard.jpg
+- Local temporary image: `.tmp/space-vlm-assets/keyboard.jpg`
+- Object name: `keyboard`
+- Visible features: uploaded photo provided, user-supplied description
+- Likely context: `everyday human environment`
+- Confidence: 0.42
+- Runtime vision: `minicpm-v object understanding`
+- Runtime text: `mock persona and diary generation`
+- Fallbacks: vision-fallback-to-mock, mock-text-runtime
+- Error: `vision fallback marker was present`
+### Running shoe
+- Status: FAIL
+- Source: https://commons.wikimedia.org/wiki/File:Running_shoes.jpg
+- Local temporary image: `.tmp/space-vlm-assets/shoe.jpg`
+- Object name: `shoe`
+- Visible features: uploaded photo provided, user-supplied description
+- Likely context: `everyday human environment`
+- Confidence: 0.42
+- Runtime vision: `minicpm-v object understanding`
+- Runtime text: `mock persona and diary generation`
+- Fallbacks: vision-fallback-to-mock, mock-text-runtime
+- Error: `vision fallback marker was present`
 ## Notes
 - Test images are temporary public Wikimedia Commons assets and are not committed.

docs/SUBMISSION_GUIDE.md CHANGED Viewed

@@ -3,13 +3,13 @@
 ## Required Package
 - [x] Hugging Face Space URL: https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
-- [ ] GitHub Repository URL: local `origin` configured, sync/submission confirmation pending
-- [ ] Demo Video URL: pending recording
-- [ ] Social Media Post URL: pending final copy
-- [ ] Fine-tuned Model URL: pending model training
-- [ ] Dataset URL: pending curation and publishing
-- [ ] Trace Dataset URL: local mock JSONL export ready, publishing pending
-- [ ] Field Notes Blog URL: draft source in `docs/FIELD_NOTES.md`
 - [x] Short project description: available in README
 ## Local Evidence Ready
@@ -18,31 +18,36 @@
 - Runtime boundary: `docs/RUNTIME.md`
 - Dataset plan and preview workflow: `docs/DATASET.md`
 - External setup checklist: `docs/EXTERNAL_SETUP.md`
-- Space VLM validation report: `docs/SPACE_VLM_REPORT.md` currently failed because `l4x1` hardware returned `402 Payment Required`; ZeroGPU reached `RUNNING` but the validation request did not return.
 - Public mock traces: `data/traces/samples/`
 - Optional llama.cpp runtime wiring: `src/models/llama_cpp_runner.py`
 ## Completed Locally
 - Mock MVP flow, archive-style UI, share card, trace logging, sample traces, dataset preview, and initial acceptance tooling.
 - MiniCPM-V 2.6 backend wiring with fallback markers.
 - Optional llama.cpp text runtime wiring through `TEXT_MODEL_PATH`.
-- Hosted Space VLM validation script and pending report template.
 ## Not Completed Yet
-- Hosted Space MiniCPM-V validation for mug, keyboard, and shoe; L4 is blocked by Hugging Face paid hardware billing, and ZeroGPU needs further debugging.
 - Real GGUF `TEXT_MODEL_PATH` smoke test and final text model parameter count.
 - Real model traces, curated dataset, LoRA training, model/dataset publishing.
-- Field Notes article, demo video, social post, final submission package.
 ## Final Checks
 - [ ] Space is under the official organization.
-- [ ] Space MiniCPM-V validation passes for mug, keyboard, and shoe. Current status: L4 blocked by paid hardware billing; ZeroGPU request path unresolved.
-- [ ] Demo video is under 2 minutes.
-- [ ] README includes model parameter counts.
 - [ ] No commercial cloud AI APIs are used.
 - [ ] Fine-tuned model is linked.
 - [ ] Dataset is linked.
 - [ ] Traces are linked.

 ## Required Package
 - [x] Hugging Face Space URL: https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
+- [x] GitHub Repository URL: local `origin` configured as `https://github.com/qqyule/Objectverse-Diary.git`; push still requires explicit confirmation
+- [x] Demo Video Script: `docs/DEMO_VIDEO_SCRIPT.md`
+- [x] Social Media Post Draft: `docs/SOCIAL_POST.md`
+- [ ] Fine-tuned Model URL: not included in stable baseline; LoRA/model publishing remains future work
+- [ ] Dataset URL: not included in stable baseline; local mock preview exists
+- [x] Trace Dataset: local public mock JSONL export at `data/traces/samples/objectverse_public_mock_traces.jsonl`
+- [x] Field Notes Draft: `docs/FIELD_NOTES.md`
 - [x] Short project description: available in README
 ## Local Evidence Ready
 - Runtime boundary: `docs/RUNTIME.md`
 - Dataset plan and preview workflow: `docs/DATASET.md`
 - External setup checklist: `docs/EXTERNAL_SETUP.md`
+- Space VLM validation report: `docs/SPACE_VLM_REPORT.md` currently failed. Paid L4 returned `402 Payment Required`; later ZeroGPU validation reached the app on 2026-06-08, but mug/keyboard/shoe all fell back to mock vision.
+- Space VLM trace evidence: `data/traces/space-vlm/`
 - Public mock traces: `data/traces/samples/`
+- Stable demo baseline: Gradio example buttons replay committed sample traces first, then fall back to the live generation pipeline if a cached trace is missing.
 - Optional llama.cpp runtime wiring: `src/models/llama_cpp_runner.py`
 ## Completed Locally
 - Mock MVP flow, archive-style UI, share card, trace logging, sample traces, dataset preview, and initial acceptance tooling.
+- Stable local demo baseline with six replayable example outputs, shared cached/live UI formatting, chat wake state, share card, and trace panel output.
 - MiniCPM-V 2.6 backend wiring with fallback markers.
 - Optional llama.cpp text runtime wiring through `TEXT_MODEL_PATH`.
+- Hosted Space VLM validation script, report, JSON summary, and trace evidence export.
+- Field Notes draft, demo video script, and social post draft for the stable submission package.
 ## Not Completed Yet
+- Hosted Space MiniCPM-V validation for mug, keyboard, and shoe; ZeroGPU validation reached the app but currently falls back to mock vision.
 - Real GGUF `TEXT_MODEL_PATH` smoke test and final text model parameter count.
 - Real model traces, curated dataset, LoRA training, model/dataset publishing.
+- Field Notes publication URL, recorded demo video URL, social post URL, and final public push/submission.
 ## Final Checks
 - [ ] Space is under the official organization.
+- [ ] Space MiniCPM-V validation passes for mug, keyboard, and shoe. Current status: wired but hosted validation falls back to mock.
+- [x] Demo video script targets under 2 minutes.
+- [x] README includes stable-baseline parameter budget and links to the model card.
 - [ ] No commercial cloud AI APIs are used.
+- [x] Mock-safe local demo baseline is reproducible from committed sample traces.
 - [ ] Fine-tuned model is linked.
 - [ ] Dataset is linked.
 - [ ] Traces are linked.

scripts/check_space_vlm.py CHANGED Viewed

@@ -42,6 +42,8 @@ MOCK_SAFE_VARIABLES = {
     "OBJECTVERSE_TEXT_BACKEND": "mock",
 }
 @dataclass(frozen=True)
 class ValidationAsset:
@@ -182,12 +184,14 @@ def run_space_validation(
     asset_dir: Path = DEFAULT_ASSET_DIR,
     timeout_seconds: int = 900,
     assets: list[ValidationAsset] | None = None,
 ) -> list[ValidationResult]:
     from gradio_client import handle_file
     selected_assets = assets or TEST_ASSETS
     paths = download_validation_assets(asset_dir, selected_assets)
-    client = _build_gradio_client(space_url, timeout_seconds=timeout_seconds)
     results: list[ValidationResult] = []
     started = time.monotonic()
     for asset in selected_assets:
@@ -202,6 +206,9 @@ def run_space_validation(
                 asset.mode,
                 timeout_seconds=min(PREDICTION_TIMEOUT_SECONDS, remaining),
             )
             results.append(validate_prediction(asset, paths[asset.key], response))
         except Exception as exc:
             results.append(
@@ -265,6 +272,15 @@ def _build_gradio_client(space_url: str, *, timeout_seconds: int) -> Any:
     raise TimeoutError(f"Could not fetch Gradio config for {space_url}: {type(last_error).__name__}: {last_error}")
 def validate_prediction(
     asset: ValidationAsset,
     image_path: Path,
@@ -387,6 +403,14 @@ def write_json_results(results: list[ValidationResult], output_path: Path) -> Pa
     return output_path
 def _download_url(url: str, output_path: Path) -> None:
     request = urllib.request.Request(
         url,
@@ -410,6 +434,16 @@ def _extract_trace_payload(response: Any) -> dict[str, Any]:
     return trace_payload
 def _failure_reason(
     expected_match: bool,
     vision_runtime_ok: bool,
@@ -464,6 +498,7 @@ def _parse_args() -> argparse.Namespace:
     parser.add_argument("--rollback-to-mock", action="store_true")
     parser.add_argument("--hardware", default=DEFAULT_HARDWARE)
     parser.add_argument("--skip-validation", action="store_true")
     return parser.parse_args()
@@ -499,6 +534,7 @@ def main() -> None:
                 space_url=args.space_url,
                 asset_dir=args.asset_dir,
                 timeout_seconds=args.timeout_seconds,
             )
         except Exception as exc:
             configuration_error = f"{type(exc).__name__}: {exc}"

     "OBJECTVERSE_TEXT_BACKEND": "mock",
 }
+SENSITIVE_TRACE_MARKERS = ("HUGGINGFACE_TOKEN", "HF_TOKEN", "hf_")
 @dataclass(frozen=True)
 class ValidationAsset:
     asset_dir: Path = DEFAULT_ASSET_DIR,
     timeout_seconds: int = 900,
     assets: list[ValidationAsset] | None = None,
+    trace_output_dir: Path | None = None,
 ) -> list[ValidationResult]:
     from gradio_client import handle_file
     selected_assets = assets or TEST_ASSETS
     paths = download_validation_assets(asset_dir, selected_assets)
+    client_url = space_client_url(space_url)
+    client = _build_gradio_client(client_url, timeout_seconds=timeout_seconds)
     results: list[ValidationResult] = []
     started = time.monotonic()
     for asset in selected_assets:
                 asset.mode,
                 timeout_seconds=min(PREDICTION_TIMEOUT_SECONDS, remaining),
             )
+            if trace_output_dir is not None:
+                trace = extract_trace_record(response)
+                write_trace_record(trace, trace_output_dir / f"{asset.key}.json")
             results.append(validate_prediction(asset, paths[asset.key], response))
         except Exception as exc:
             results.append(
     raise TimeoutError(f"Could not fetch Gradio config for {space_url}: {type(last_error).__name__}: {last_error}")
+def space_client_url(space_url: str) -> str:
+    parsed = urlparse(space_url)
+    if parsed.netloc.endswith(".hf.space"):
+        return space_url.rstrip("/")
+    repo_id = parse_space_repo_id(space_url)
+    owner, space_name = repo_id.split("/", 1)
+    return f"https://{owner}-{space_name}.hf.space".lower()
 def validate_prediction(
     asset: ValidationAsset,
     image_path: Path,
     return output_path
+def write_trace_record(trace: TraceRecord, output_path: Path) -> Path:
+    output_path.parent.mkdir(parents=True, exist_ok=True)
+    serialized = json.dumps(trace.model_dump(mode="json"), ensure_ascii=False, indent=2, sort_keys=True)
+    _assert_trace_is_public_safe(serialized)
+    output_path.write_text(serialized + "\n", encoding="utf-8")
+    return output_path
 def _download_url(url: str, output_path: Path) -> None:
     request = urllib.request.Request(
         url,
     return trace_payload
+def extract_trace_record(response: Any) -> TraceRecord:
+    return TraceRecord.model_validate(_extract_trace_payload(response))
+def _assert_trace_is_public_safe(serialized_trace: str) -> None:
+    for marker in SENSITIVE_TRACE_MARKERS:
+        if marker in serialized_trace:
+            raise ValueError("Trace output may contain a sensitive token marker.")
 def _failure_reason(
     expected_match: bool,
     vision_runtime_ok: bool,
     parser.add_argument("--rollback-to-mock", action="store_true")
     parser.add_argument("--hardware", default=DEFAULT_HARDWARE)
     parser.add_argument("--skip-validation", action="store_true")
+    parser.add_argument("--trace-output-dir", type=Path)
     return parser.parse_args()
                 space_url=args.space_url,
                 asset_dir=args.asset_dir,
                 timeout_seconds=args.timeout_seconds,
+                trace_output_dir=args.trace_output_dir,
             )
         except Exception as exc:
             configuration_error = f"{type(exc).__name__}: {exc}"

src/example_cache.py ADDED Viewed

	@@ -0,0 +1,33 @@

+"""Stable example output loading for demo playback."""
+from __future__ import annotations
+import json
+from pathlib import Path
+from src.models.schema import GenerationResult, TraceRecord
+DEFAULT_SAMPLE_TRACE_DIR = Path("data/traces/samples")
+def sample_trace_path(index: int, sample_dir: Path = DEFAULT_SAMPLE_TRACE_DIR) -> Path | None:
+    """Return the committed sample trace path for a 0-based example index."""
+    trace_id = f"sample-{index + 1:02d}"
+    matches = sorted(sample_dir.glob(f"{trace_id}-*.json"))
+    return matches[0] if matches else None
+def load_sample_generation(index: int, sample_dir: Path = DEFAULT_SAMPLE_TRACE_DIR) -> GenerationResult | None:
+    path = sample_trace_path(index, sample_dir)
+    if path is None:
+        return None
+    trace = TraceRecord.model_validate(json.loads(path.read_text(encoding="utf-8")))
+    return GenerationResult(
+        object_understanding=trace.object_understanding,
+        persona=trace.persona,
+        diary=trace.diary,
+        trace=trace,
+        trace_path=str(path),
+    )

src/ui/layout.py CHANGED Viewed

@@ -9,6 +9,7 @@ from typing import Any
 import gradio as gr
 from src.config import APP_TITLE, DEFAULT_MODE, PERSONALITY_MODES
 from src.examples import EXAMPLE_OBJECTS, example_button_label
 from src.models.llama_cpp_runner import reply_as_object
 from src.models.schema import GenerationResult
@@ -237,6 +238,10 @@ def _panel_header(index: str, title: str, chinese: str, note: str) -> str:
 def _example_handler(index: int):
     def load_example() -> tuple[Any, ...]:
         item = EXAMPLE_OBJECTS[index]
         result = generate_object_file(None, item["description"], item["mode"])
         return item["description"], item["mode"], *result
@@ -254,6 +259,10 @@ def generate_object_file(
     except Exception as exc:  # pragma: no cover - exercised manually by UI failure paths.
         return _generation_error(exc, description, mode)
     object_payload = result.object_understanding.model_dump(mode="json")
     persona_payload = result.persona.model_dump(mode="json")
     return (

 import gradio as gr
 from src.config import APP_TITLE, DEFAULT_MODE, PERSONALITY_MODES
+from src.example_cache import load_sample_generation
 from src.examples import EXAMPLE_OBJECTS, example_button_label
 from src.models.llama_cpp_runner import reply_as_object
 from src.models.schema import GenerationResult
 def _example_handler(index: int):
     def load_example() -> tuple[Any, ...]:
         item = EXAMPLE_OBJECTS[index]
+        cached_result = load_sample_generation(index)
+        if cached_result is not None:
+            return item["description"], item["mode"], *_format_generation_result(cached_result)
         result = generate_object_file(None, item["description"], item["mode"])
         return item["description"], item["mode"], *result
     except Exception as exc:  # pragma: no cover - exercised manually by UI failure paths.
         return _generation_error(exc, description, mode)
+    return _format_generation_result(result)
+def _format_generation_result(result: GenerationResult) -> GenerationUiResult:
     object_payload = result.object_understanding.model_dump(mode="json")
     persona_payload = result.persona.model_dump(mode="json")
     return (

tests/test_mock_mvp.py CHANGED Viewed

@@ -6,10 +6,17 @@ import json
 import tempfile
 import unittest
 from pathlib import Path
 from src.examples import EXAMPLE_OBJECTS, gradio_examples
-from src.models.llama_cpp_runner import generate_diary, generate_persona, reply_as_object
-from src.models.vision_runner import understand_object
 from src.pipeline import generate_object_diary
 from src.renderer.share_card import render_share_card
 from src.traces.anonymizer import anonymize_text
@@ -19,7 +26,27 @@ from scripts.check_initial_stage import run_checks
 from src.config import get_runtime_settings, runtime_status
 class MockMvpTest(unittest.TestCase):
     def test_runtime_defaults_to_mock(self) -> None:
         settings = get_runtime_settings({})
         status = runtime_status(settings)
@@ -34,6 +61,17 @@ class MockMvpTest(unittest.TestCase):
         self.assertEqual(len(gradio_examples()), 6)
         self.assertTrue(all(len(example) == 1 for example in gradio_examples()))
     def test_mock_generation_flow(self) -> None:
         object_understanding = understand_object(
             None,
@@ -49,6 +87,120 @@ class MockMvpTest(unittest.TestCase):
         self.assertIn("今天", diary.chinese)
         self.assertIn("objectverse-card", share_card)
     def test_pipeline_saves_generation_result(self) -> None:
         with tempfile.TemporaryDirectory() as tmp_dir:
             result = generate_object_diary(
@@ -63,6 +215,28 @@ class MockMvpTest(unittest.TestCase):
             self.assertEqual(result.object_understanding.object.name, "coffee mug")
             self.assertEqual(saved_path.stem, result.trace.trace_id)
     def test_chat_uses_current_persona(self) -> None:
         object_understanding = understand_object(None, "dusty black mechanical keyboard")
         persona = generate_persona(object_understanding, "Philosopher")

 import tempfile
 import unittest
 from pathlib import Path
+from unittest.mock import patch
+from src.example_cache import load_sample_generation, sample_trace_path
 from src.examples import EXAMPLE_OBJECTS, gradio_examples
+from src.models.llama_cpp_runner import (
+    generate_diary,
+    generate_persona,
+    reply_as_object,
+    reset_text_runtime_fallbacks,
+)
+from src.models.vision_runner import understand_object, understand_object_with_metadata
 from src.pipeline import generate_object_diary
 from src.renderer.share_card import render_share_card
 from src.traces.anonymizer import anonymize_text
 from src.config import get_runtime_settings, runtime_status
+class FakeMiniCpmModel:
+    def __init__(self, response: str) -> None:
+        self.response = response
+    def chat(self, **_: object) -> str:
+        return self.response
+class FakeLlamaModel:
+    def __init__(self, responses: list[str]) -> None:
+        self.responses = responses
+    def create_chat_completion(self, **_: object) -> dict:
+        response = self.responses.pop(0)
+        return {"choices": [{"message": {"content": response}}]}
 class MockMvpTest(unittest.TestCase):
+    def tearDown(self) -> None:
+        reset_text_runtime_fallbacks()
     def test_runtime_defaults_to_mock(self) -> None:
         settings = get_runtime_settings({})
         status = runtime_status(settings)
         self.assertEqual(len(gradio_examples()), 6)
         self.assertTrue(all(len(example) == 1 for example in gradio_examples()))
+    def test_sample_generation_cache_loads_committed_example_trace(self) -> None:
+        path = sample_trace_path(0)
+        result = load_sample_generation(0)
+        self.assertIsNotNone(path)
+        self.assertIsNotNone(result)
+        assert result is not None
+        self.assertEqual(result.trace.trace_id, "sample-01")
+        self.assertEqual(result.object_understanding.object.name, "coffee mug")
+        self.assertEqual(result.trace_path, str(path))
     def test_mock_generation_flow(self) -> None:
         object_understanding = understand_object(
             None,
         self.assertIn("今天", diary.chinese)
         self.assertIn("objectverse-card", share_card)
+    def test_llama_cpp_persona_diary_and_chat_accept_valid_json(self) -> None:
+        env = {
+            "OBJECTVERSE_TEXT_BACKEND": "llama-cpp",
+            "TEXT_MODEL_PATH": "/tmp/objectverse-text-model.gguf",
+        }
+        fake_llama = FakeLlamaModel(
+            [
+                """
+                {"persona":{"object_name":"coffee mug","character_name":"Mugworth","mood":"dry and suspicious","secret_fear":"being left empty forever","core_memory":"It remembers every late-night refill.","complaint":"I am treated like a ceramic fuel tank.","tags":["desk witness","warm archive","quiet judgment"]}}
+                """,
+                """
+                {"title":"Secret Diary - Day 418","english":"Today I held another bitter storm and called it service.","chinese":"今天我又装下一场苦涩风暴，并被称为有用。"}
+                """,
+                """
+                {"reply":"Mugworth: I have seen your deadlines dissolve into coffee rings."}
+                """,
+            ]
+        )
+        with (
+            patch.dict("os.environ", env, clear=False),
+            patch("src.models.llama_cpp_runner._load_llama_model", return_value=fake_llama),
+        ):
+            object_understanding = understand_object(None, "white coffee mug")
+            persona = generate_persona(object_understanding, "Cynical")
+            diary = generate_diary(persona, "Cynical")
+            reply = reply_as_object(persona.model_dump(mode="json"), "What did you see?")
+        self.assertEqual(persona.persona.character_name, "Mugworth")
+        self.assertEqual(diary.title, "Secret Diary - Day 418")
+        self.assertIn("Mugworth", reply)
+    def test_llama_cpp_missing_model_path_falls_back_to_mock(self) -> None:
+        env = {"OBJECTVERSE_TEXT_BACKEND": "llama-cpp", "TEXT_MODEL_PATH": ""}
+        with patch.dict("os.environ", env, clear=False):
+            result = generate_object_diary(None, "dusty black keyboard", "Philosopher", save=False)
+        self.assertEqual(result.persona.persona.object_name, "keyboard")
+        self.assertIn("text-fallback-to-mock", result.trace.fallbacks)
+        self.assertIn("mock-vision-runtime", result.trace.fallbacks)
+        self.assertNotIn("mock-text-runtime", result.trace.fallbacks)
+    def test_llama_cpp_import_failure_falls_back_to_mock(self) -> None:
+        env = {
+            "OBJECTVERSE_TEXT_BACKEND": "llama_cpp",
+            "TEXT_MODEL_PATH": "/tmp/objectverse-text-model.gguf",
+        }
+        with (
+            patch.dict("os.environ", env, clear=False),
+            patch("src.models.llama_cpp_runner._load_llama_model", side_effect=ImportError("no llama_cpp")),
+        ):
+            result = generate_object_diary(None, "old white coffee mug", "Cynical", save=False)
+        self.assertEqual(result.persona.persona.object_name, "coffee mug")
+        self.assertIn("text-fallback-to-mock", result.trace.fallbacks)
+    def test_llama_cpp_invalid_json_falls_back_to_mock(self) -> None:
+        env = {
+            "OBJECTVERSE_TEXT_BACKEND": "llama-cpp",
+            "TEXT_MODEL_PATH": "/tmp/objectverse-text-model.gguf",
+        }
+        with (
+            patch.dict("os.environ", env, clear=False),
+            patch("src.models.llama_cpp_runner._load_llama_model", return_value=FakeLlamaModel(["not json"])),
+        ):
+            result = generate_object_diary(None, "old white coffee mug", "Cynical", save=False)
+        self.assertEqual(result.persona.persona.object_name, "coffee mug")
+        self.assertIn("text-fallback-to-mock", result.trace.fallbacks)
+        self.assertEqual(result.trace.model_runtime["text"], "llama-cpp text generation")
+    def test_minicpm_vision_backend_accepts_valid_json(self) -> None:
+        response = """
+        {"object":{"name":"coffee mug","visible_features":["white ceramic","round handle","desk shadow"],"likely_context":"work desk","confidence":0.88}}
+        """
+        settings = get_runtime_settings(
+            {
+                "OBJECTVERSE_VISION_BACKEND": "minicpm-v",
+                "VISION_MODEL_ID": "openbmb/MiniCPM-V-2_6",
+                "OBJECTVERSE_TEXT_BACKEND": "mock",
+            }
+        )
+        with (
+            patch("src.models.vision_runner._load_rgb_image", return_value=object()),
+            patch("src.models.vision_runner._load_minicpm_components", return_value=(FakeMiniCpmModel(response), object())),
+        ):
+            result = understand_object_with_metadata("/tmp/mug.png", "white mug", settings=settings)
+        self.assertEqual(result.object_understanding.object.name, "coffee mug")
+        self.assertEqual(result.object_understanding.object.confidence, 0.88)
+        self.assertEqual(result.fallbacks, [])
+    def test_minicpm_vision_backend_falls_back_on_invalid_json(self) -> None:
+        settings = get_runtime_settings(
+            {
+                "OBJECTVERSE_VISION_BACKEND": "minicpm-v",
+                "VISION_MODEL_ID": "openbmb/MiniCPM-V-2_6",
+                "OBJECTVERSE_TEXT_BACKEND": "mock",
+            }
+        )
+        with (
+            patch("src.models.vision_runner._load_rgb_image", return_value=object()),
+            patch("src.models.vision_runner._load_minicpm_components", return_value=(FakeMiniCpmModel("not json"), object())),
+        ):
+            result = understand_object_with_metadata("/tmp/keyboard.png", "dusty black keyboard", settings=settings)
+        self.assertEqual(result.object_understanding.object.name, "keyboard")
+        self.assertEqual(result.fallbacks, ["vision-fallback-to-mock"])
     def test_pipeline_saves_generation_result(self) -> None:
         with tempfile.TemporaryDirectory() as tmp_dir:
             result = generate_object_diary(
             self.assertEqual(result.object_understanding.object.name, "coffee mug")
             self.assertEqual(saved_path.stem, result.trace.trace_id)
+    def test_pipeline_records_minicpm_vision_runtime(self) -> None:
+        response = """
+        {"object":{"name":"desk lamp","visible_features":["metal shade","thin neck","warm light"],"likely_context":"desk","confidence":0.91}}
+        """
+        env = {
+            "OBJECTVERSE_VISION_BACKEND": "minicpm-v",
+            "VISION_MODEL_ID": "openbmb/MiniCPM-V-2_6",
+            "OBJECTVERSE_TEXT_BACKEND": "mock",
+        }
+        with (
+            patch.dict("os.environ", env, clear=False),
+            patch("src.models.vision_runner._load_rgb_image", return_value=object()),
+            patch("src.models.vision_runner._load_minicpm_components", return_value=(FakeMiniCpmModel(response), object())),
+        ):
+            result = generate_object_diary("/tmp/lamp.png", "desk lamp", "Dramatic", save=False)
+        self.assertEqual(result.object_understanding.object.name, "desk lamp")
+        self.assertEqual(result.trace.model_runtime["vision"], "minicpm-v object understanding")
+        self.assertIn("mock-text-runtime", result.trace.fallbacks)
+        self.assertNotIn("mock-runtime", result.trace.fallbacks)
     def test_chat_uses_current_persona(self) -> None:
         object_understanding = understand_object(None, "dusty black mechanical keyboard")
         persona = generate_persona(object_understanding, "Philosopher")

tests/test_space_vlm_tooling.py ADDED Viewed

	@@ -0,0 +1,220 @@

+"""Tests for hosted Space VLM validation tooling."""
+from __future__ import annotations
+import tempfile
+import unittest
+from datetime import datetime, timezone
+from pathlib import Path
+from scripts.check_space_vlm import (
+    TEST_ASSETS,
+    ValidationResult,
+    extract_trace_record,
+    parse_space_repo_id,
+    render_report,
+    space_client_url,
+    validate_prediction,
+    write_trace_record,
+)
+from src.models.schema import DiaryEntry, ObjectInfo, ObjectUnderstanding, Persona, PersonaEnvelope, TraceRecord
+from src.utils.zero_gpu import zero_gpu
+class SpaceVlmToolingTest(unittest.TestCase):
+    def test_asset_manifest_covers_three_validation_objects(self) -> None:
+        keys = {asset.key for asset in TEST_ASSETS}
+        self.assertEqual(keys, {"mug", "keyboard", "shoe"})
+        self.assertTrue(all(asset.source_page.startswith("https://commons.wikimedia.org/") for asset in TEST_ASSETS))
+        self.assertTrue(all(asset.download_url.startswith("https://commons.wikimedia.org/") for asset in TEST_ASSETS))
+    def test_parse_space_repo_id_from_space_url(self) -> None:
+        repo_id = parse_space_repo_id("https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary")
+        self.assertEqual(repo_id, "build-small-hackathon/ObjectverseDiary")
+    def test_space_client_url_uses_direct_hf_space_host(self) -> None:
+        client_url = space_client_url("https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary")
+        self.assertEqual(client_url, "https://build-small-hackathon-objectversediary.hf.space")
+    def test_zero_gpu_decorator_is_noop_without_spaces_package(self) -> None:
+        def sample(value: int) -> int:
+            return value + 1
+        decorated = zero_gpu(duration=10)(sample)
+        self.assertEqual(decorated(2), 3)
+    def test_validate_prediction_accepts_minicpm_runtime(self) -> None:
+        asset = TEST_ASSETS[0]
+        trace = _trace_record(
+            object_name="striped coffee mug",
+            visible_features=["ceramic cup", "handle", "striped surface"],
+            runtime_vision="minicpm-v object understanding",
+            fallbacks=["mock-text-runtime"],
+        )
+        response = [None, {}, {}, "", "", "", trace.model_dump(mode="json")]
+        result = validate_prediction(asset, Path("/tmp/mug.jpg"), response)
+        self.assertTrue(result.passed)
+        self.assertEqual(result.object_name, "striped coffee mug")
+        self.assertEqual(result.runtime_text, "mock persona and diary generation")
+    def test_validate_prediction_rejects_vision_fallback(self) -> None:
+        asset = TEST_ASSETS[1]
+        trace = _trace_record(
+            object_name="computer keyboard",
+            visible_features=["black keys"],
+            runtime_vision="minicpm-v object understanding",
+            fallbacks=["vision-fallback-to-mock", "mock-text-runtime"],
+        )
+        response = [None, {}, {}, "", "", "", trace.model_dump(mode="json")]
+        result = validate_prediction(asset, Path("/tmp/keyboard.jpg"), response)
+        self.assertFalse(result.passed)
+        self.assertIn("vision fallback marker", result.error)
+    def test_extract_trace_record_accepts_gradio_response(self) -> None:
+        trace = _trace_record(
+            object_name="running shoe",
+            visible_features=["laces", "rubber sole"],
+            runtime_vision="minicpm-v object understanding",
+            fallbacks=["mock-text-runtime"],
+        )
+        response = [None, {}, {}, "", "", "", trace.model_dump(mode="json")]
+        extracted = extract_trace_record(response)
+        self.assertEqual(extracted.object_understanding.object.name, "running shoe")
+        self.assertEqual(extracted.model_runtime["vision"], "minicpm-v object understanding")
+    def test_write_trace_record_writes_valid_public_json(self) -> None:
+        trace = _trace_record(
+            object_name="striped coffee mug",
+            visible_features=["ceramic cup", "handle"],
+            runtime_vision="minicpm-v object understanding",
+            fallbacks=["mock-text-runtime"],
+        )
+        with tempfile.TemporaryDirectory() as tmp_dir:
+            output_path = write_trace_record(trace, Path(tmp_dir) / "mug.json")
+            payload = output_path.read_text(encoding="utf-8")
+            parsed = TraceRecord.model_validate_json(payload)
+        self.assertEqual(parsed.trace_id, trace.trace_id)
+        self.assertNotIn("HUGGINGFACE_TOKEN", payload)
+        self.assertNotIn("HF_TOKEN", payload)
+        self.assertNotIn("hf_", payload)
+    def test_write_trace_record_rejects_sensitive_token_markers(self) -> None:
+        trace = _trace_record(
+            object_name="computer keyboard",
+            visible_features=["black keys"],
+            runtime_vision="minicpm-v object understanding",
+            fallbacks=["mock-text-runtime"],
+        )
+        trace.model_runtime["runtime"] = "vision model id: openbmb/MiniCPM-V-2_6; token hf_forbidden"
+        with tempfile.TemporaryDirectory() as tmp_dir:
+            output_path = Path(tmp_dir) / "keyboard.json"
+            with self.assertRaises(ValueError):
+                write_trace_record(trace, output_path)
+            self.assertFalse(output_path.exists())
+    def test_render_report_includes_results_and_safe_config(self) -> None:
+        result = ValidationResult(
+            key="shoe",
+            label="Running shoe",
+            source_page="https://commons.wikimedia.org/wiki/File:Running_shoes.jpg",
+            image_path="/tmp/shoe.jpg",
+            passed=True,
+            object_name="running shoe",
+            visible_features=["laces", "athletic sole"],
+            likely_context="sports gear",
+            confidence=0.86,
+            runtime_vision="minicpm-v object understanding",
+            runtime_text="mock persona and diary generation",
+            fallbacks=["mock-text-runtime"],
+        )
+        report = render_report(
+            space_url="https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary",
+            repo_id="build-small-hackathon/ObjectverseDiary",
+            results=[result],
+            configured={"hardware": "l4x1", "OBJECTVERSE_VISION_BACKEND": "minicpm-v"},
+            rollback={"hardware": "cpu-basic", "OBJECTVERSE_VISION_BACKEND": "mock"},
+        )
+        self.assertIn("Overall status: PASS", report)
+        self.assertIn("Running shoe", report)
+        self.assertIn("OBJECTVERSE_VISION_BACKEND", report)
+        self.assertNotIn("hf_", report.lower())
+        self.assertNotIn("HUGGINGFACE_TOKEN", report)
+    def test_render_report_includes_configuration_error(self) -> None:
+        report = render_report(
+            space_url="https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary",
+            repo_id="build-small-hackathon/ObjectverseDiary",
+            results=[],
+            rollback={"hardware": "cpu-basic", "OBJECTVERSE_VISION_BACKEND": "mock"},
+            configuration_error="HfHubHTTPError: 402 Payment Required",
+        )
+        self.assertIn("Overall status: FAIL", report)
+        self.assertIn("Configuration Error", report)
+        self.assertIn("402 Payment Required", report)
+def _trace_record(
+    *,
+    object_name: str,
+    visible_features: list[str],
+    runtime_vision: str,
+    fallbacks: list[str],
+) -> TraceRecord:
+    persona = PersonaEnvelope(
+        persona=Persona(
+            object_name=object_name,
+            character_name="Test Object",
+            mood="watchful",
+            secret_fear="being ignored",
+            core_memory="It remembers the test bench.",
+            complaint="I am more than a fixture.",
+            tags=["test", "object", "archive"],
+        )
+    )
+    return TraceRecord(
+        trace_id="space-vlm-test",
+        created_at=datetime.now(timezone.utc),
+        mode="Cynical",
+        input={"has_image": True, "image_filename": "asset.jpg", "description": "public test asset"},
+        object_understanding=ObjectUnderstanding(
+            object=ObjectInfo(
+                name=object_name,
+                visible_features=visible_features,
+                likely_context="test environment",
+                confidence=0.9,
+            )
+        ),
+        persona=persona,
+        diary=DiaryEntry(
+            title="Secret Diary - Day 1",
+            english="I was tested today.",
+            chinese="今天我被测试了。",
+        ),
+        model_runtime={
+            "vision": runtime_vision,
+            "text": "mock persona and diary generation",
+            "runtime": "vision model id: openbmb/MiniCPM-V-2_6; no llama.cpp model connected yet",
+        },
+        fallbacks=fallbacks,
+    )
+if __name__ == "__main__":
+    unittest.main()