title: Objectverse Diary
emoji: 🗝️
colorFrom: yellow
colorTo: gray
sdk: gradio
sdk_version: 5.50.0
python_version: '3.10'
app_file: app.py
pinned: false
license: mit
short_description: Every object has a secret life.
Objectverse Diary
Every object has a secret life. 万物日记:每个物品都有秘密人生。
Objectverse Diary is a small-model AI toy built for the Build Small Hackathon.
Upload a photo of any everyday object. The app wakes it up, gives it a secret personality, writes its diary, and lets you chat with it.
Hackathon Submission Links
Required public submission package for Build Small Hackathon:
- Hugging Face Space: https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
- Demo video: https://youtu.be/5HbhP21hooA
- Social media post: https://x.com/GeekCrafter/status/2064250293001576556?s=20
These links cover the required Space link, short demo video, and social post for the official submission flow.
Current Status
The following are available:
- stable mock-safe local baseline
- live MiniCPM-V Space vision
- non-secret hosted vision diagnostics
- optional llama.cpp text runtime wiring
- a passing local LoRA v2 GGUF smoke test
- public mock traces
- Space validation evidence
- a published curated v2 SFT dataset
- a published Qwen 1.5B LoRA v2 adapter
- a published Q4_K_M GGUF
By default, local development uses deterministic mock outputs for object understanding, persona generation, diary writing, chat replies, share card rendering, and trace saving. This keeps the local demo reproducible and avoids commercial AI APIs.
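As a rough illustration of what a deterministic mock output can mean, the sketch below derives a persona from a hash of the inputs, so the same description and mode always give the same result. The function name `mock_persona`, the trait pool, and the output fields are hypothetical and are not taken from the project's code.

```python
import hashlib

# Hypothetical trait pool; the project's real mock data may look different.
TRAITS = ["dramatic", "stoic", "nostalgic", "mischievous"]


def mock_persona(description: str, mode: str) -> dict:
    """Return a persona that depends only on the inputs (no randomness, no model)."""
    key = f"{mode}|{description.strip().lower()}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first byte of the hash to pick a trait reproducibly.
    trait = TRAITS[digest[0] % len(TRAITS)]
    return {"object": description.strip(), "mode": mode, "trait": trait}
```

Because nothing here depends on time, random seeds, or a network call, repeated runs can be compared byte for byte, which is the property the mock baseline relies on.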
The public Hugging Face Space runs on ZeroGPU with OBJECTVERSE_VISION_BACKEND=minicpm-v, VISION_MODEL_ID=openbmb/MiniCPM-V-2_6, and OBJECTVERSE_TEXT_BACKEND=mock. The Space uses an HF_TOKEN secret with access to the gated openbmb/MiniCPM-V-2_6 model; no token is committed to the repository.
OBJECTVERSE_TEXT_BACKEND=llama-cpp can use a local GGUF model through the optional llama-cpp-python package when TEXT_MODEL_PATH is configured. The Modal-trained LoRA v2 adapter has been merged with Qwen/Qwen2.5-1.5B-Instruct, quantized to Q4_K_M, uploaded to the adapter's model repo (qqyule/objectverse-diary-qwen15b-lora), and smoke-tested locally through llama.cpp. No GGUF file is committed to Git, and the public Space intentionally keeps text generation on the mock runtime for this live MiniCPM-V release.
scripts/check_llama_cpp_smoke.py passed locally on June 8, 2026 with models/objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf. The published GGUF is available in qqyule/objectverse-diary-qwen15b-lora as objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf.
Hugging Face Space:
https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
Track
An Adventure in Thousand Token Wood
Why This Fits the Track
This is a purely digital experience that could not exist without AI. It relies on:
- vision understanding
- object persona generation
- first-person diary writing
- consistent character chat
- shareable personality cards
Language
The interface is English-first and Chinese-second.
Badge Targets
- Off-Brand — archive-style Gradio UI, English-first with Chinese helper text.
- Sharing is Caring — public mock traces, JSONL export, prompt templates, and failure notes.
- Field Notes — article draft in docs/FIELD_NOTES.md.
- OpenBMB Special — MiniCPM-V 2.6 wiring exists, hosted ZeroGPU validation passed, and the public Space is configured for live MiniCPM-V vision with mock text.
- Llama Champion — local llama.cpp GGUF runtime passed with the published LoRA v2 Q4_K_M model; Space text runtime remains mock for the live vision release.
- Well-Tuned — synthetic curated v2 SFT dataset and Qwen 1.5B LoRA v2 adapter are published.
- Off the Grid — no commercial AI APIs are used; final badge eligibility depends on hackathon review.
Planned Model Stack
- Vision: MiniCPM-V 2.6 or deterministic mock fallback
- Text: deterministic mock text by default; optional published Qwen 1.5B LoRA v2 Q4_K_M GGUF for local llama.cpp runtime
- Runtime: llama.cpp / llama-cpp-python
- UI: Gradio Blocks
Parameter Budget
The hackathon budget is <= 32B total model parameters.
Current live Space budget:
- live vision backend: MiniCPM-V 2.6, about 8B parameters
- live text backend: deterministic mock, 0 active model parameters
- optional text base for published LoRA v2 adapter: Qwen/Qwen2.5-1.5B-Instruct, about 1.5B parameters
- optional text GGUF: published objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf, about 1.5B base parameters plus a small merged LoRA delta; not committed to Git
The live public Space therefore stays within the 32B budget. Adding the optional Qwen 1.5B text path to MiniCPM-V would bring the total to about 9.5B parameters plus a small LoRA delta, still well under the 32B budget.
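The budget arithmetic above can be checked with a few lines of Python. This is only a sketch using the approximate figures quoted in this README; the names `BUDGET_B`, `MODELS_B`, and `total_params_b` are illustrative, and the small LoRA delta is left out of the sum.

```python
# Approximate parameter counts in billions, as stated in this README.
BUDGET_B = 32.0
MODELS_B = {
    "MiniCPM-V 2.6 (vision)": 8.0,
    "Qwen2.5-1.5B-Instruct (optional text)": 1.5,
}


def total_params_b(models: dict) -> float:
    """Sum the approximate parameter counts, in billions."""
    return sum(models.values())


total = total_params_b(MODELS_B)
within_budget = total <= BUDGET_B
print(f"total ~{total}B, budget {BUDGET_B}B, within budget: {within_budget}")
```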
Run Locally
pip install -r requirements.txt
python app.py
Then open the local Gradio URL printed in the terminal.
Optional llama.cpp Text Runtime
The project does not commit GGUF files. The Space dependencies include llama-cpp-python, but the model is only used when OBJECTVERSE_TEXT_BACKEND=llama-cpp. To try a local GGUF text model:
OBJECTVERSE_TEXT_BACKEND=llama-cpp \
TEXT_MODEL_PATH=/absolute/path/to/text-model.gguf \
python app.py
For Hugging Face Space runtime, use Hub download variables instead of committing the GGUF:
OBJECTVERSE_TEXT_BACKEND=llama-cpp
TEXT_MODEL_REPO_ID=qqyule/objectverse-diary-qwen15b-lora
TEXT_MODEL_FILENAME=objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
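One way to read these variables is sketched below. It is not the project's actual loader: the function name `resolve_text_model_source` and the rule that a local TEXT_MODEL_PATH takes precedence over the Hub settings are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_text_model_source(
    env: Mapping[str, str] = os.environ,
) -> Optional[Tuple[str, ...]]:
    """Decide where a GGUF text model would come from, or None for mock text."""
    if env.get("OBJECTVERSE_TEXT_BACKEND", "mock") != "llama-cpp":
        return None
    path = env.get("TEXT_MODEL_PATH")
    if path:
        # Assumption: an explicit local path wins over Hub settings.
        return ("local", path)
    repo_id = env.get("TEXT_MODEL_REPO_ID")
    filename = env.get("TEXT_MODEL_FILENAME")
    if repo_id and filename:
        return ("hub", repo_id, filename)
    # llama-cpp was requested but no model source is configured.
    return None
```

A `("hub", repo_id, filename)` result could then be passed to `huggingface_hub.hf_hub_download(repo_id=..., filename=...)`, which returns the local path of the cached file.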
If llama-cpp-python is missing, no local or Hub model source is configured, the model cannot download/load, or the model returns invalid JSON, the app falls back to deterministic mock text generation and records text-fallback-to-mock in traces.
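The fallback behavior described above can be sketched as a small wrapper. The function name `generate_with_fallback`, its parameters, and the use of a plain list for trace events are hypothetical; only the `text-fallback-to-mock` marker comes from this README.

```python
import json
from typing import Callable, List


def generate_with_fallback(
    generate: Callable[[str], str],
    mock: Callable[[str], dict],
    prompt: str,
    trace_events: List[str],
) -> dict:
    """Try the model-backed generator; fall back to the mock on any failure."""
    try:
        raw = generate(prompt)  # may raise if the runtime or model is unavailable
        result = json.loads(raw)  # raises ValueError on invalid JSON
        if not isinstance(result, dict):
            raise ValueError("expected a JSON object")
        return result
    except Exception:
        # Record the fallback so it is visible in the saved trace.
        trace_events.append("text-fallback-to-mock")
        return mock(prompt)
```

Catching every exception in one place keeps the demo running whether the failure is a missing package, a failed download, or malformed model output, at the cost of hiding the specific cause unless it is logged separately.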
Recommended explicit-confirmation smoke path:
.venv/bin/python -B scripts/check_llama_cpp_smoke.py \
--model-path models/objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
Published GGUF source:
repo: qqyule/objectverse-diary-qwen15b-lora
file: objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
Initial MVP Flow
The stable submission baseline supports:
- image upload
- optional object description
- personality mode selection
- six stable example objects
- mock object understanding JSON
- mock persona JSON
- English-first secret diary with Chinese helper translation
- object chat with persona consistency
- share card HTML preview
- anonymized trace JSON saved under data/traces/
- six public mock sample traces under data/traces/samples/
Stable Submission Evidence
- Live MiniCPM-V Space: https://huggingface.co/spaces/build-small-hackathon/ObjectverseDiary
- Published demo video: https://youtu.be/5HbhP21hooA
- Published social media post: https://x.com/GeekCrafter/status/2064250293001576556?s=20
- Initial acceptance report: docs/INITIAL_STAGE_REPORT.md
- Runtime notes: docs/RUNTIME.md
- Dataset preview notes: docs/DATASET.md
- Synthetic curated v2 dataset: https://huggingface.co/datasets/qqyule/objectverse-diary-sft-curated
- Fine-tuned LoRA v2 adapter: https://huggingface.co/qqyule/objectverse-diary-qwen15b-lora
- LoRA v2 Q4_K_M GGUF: https://huggingface.co/qqyule/objectverse-diary-qwen15b-lora/blob/main/objectverse-diary-qwen15b-lora-v2-q4_k_m.gguf
- Public mock traces: data/traces/samples/
- Trace JSONL export: data/traces/samples/objectverse_public_mock_traces.jsonl
- Hosted VLM validation evidence: docs/SPACE_VLM_REPORT.md, docs/SPACE_VLM_REPORT.json, data/traces/space-vlm/
- Hosted VLM diagnostic support: hidden /vision_runtime_probe API and probe-aware scripts/check_space_vlm.py
- Field Notes draft: docs/FIELD_NOTES.md
- Demo video script: docs/DEMO_VIDEO_SCRIPT.md
- Social post draft: docs/SOCIAL_POST.md
Generate Sample Traces
.venv/bin/python -B scripts/generate_sample_traces.py
Generate Dataset Preview
.venv/bin/python -B scripts/generate_dataset.py
This creates deterministic mock SFT preview data for schema and curation planning. See docs/DATASET.md.
Export Public Trace JSONL
.venv/bin/python -B scripts/export_traces.py
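For readers unfamiliar with the format, a JSONL export writes one JSON object per line. The sketch below shows the general idea under the assumption that each trace is a standalone `.json` file in one directory; the function name `export_traces_to_jsonl` is hypothetical, and the real scripts/export_traces.py may work differently.

```python
import json
from pathlib import Path


def export_traces_to_jsonl(trace_dir: str, out_path: str) -> int:
    """Write every *.json trace in trace_dir as one line of out_path; return the count."""
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        # Sort for a stable, reproducible line order.
        for path in sorted(Path(trace_dir).glob("*.json")):
            record = json.loads(path.read_text(encoding="utf-8"))
            # ensure_ascii=False keeps Chinese helper text readable in the file.
            out.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count
```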
Test
.venv/bin/python -B -m unittest discover -s tests
Initial Stage Acceptance
.venv/bin/python -B scripts/check_initial_stage.py
See docs/INITIAL_STAGE_REPORT.md for the local initial-stage evidence.
See docs/EXTERNAL_SETUP.md before changing remote GitHub or Hugging Face resources.
Project Structure
See docs/02-tech-architecture.md, AGENTS.md, and .codex/skills/ for the intended structure and development rules.
Runtime Notes
The local default runtime is mock-only. The public Space is configured for live MiniCPM-V 2.6 vision with mock text generation, and optional llama.cpp text generation can be enabled separately with environment variables while preserving mock fallbacks. See docs/RUNTIME.md.
HF Space README YAML Header
---
title: Objectverse Diary
emoji: 🗝️
colorFrom: yellow
colorTo: gray
sdk: gradio
python_version: '3.10'
app_file: app.py
pinned: false
---