ObjectverseDiary / docs /03-dev-schedule.md
qqyule's picture
Deploy latest Objectverse Diary version
1e2c036 verified

A newer version of the Gradio SDK is available: 6.17.3

Upgrade

Objectverse Diary — 开发计划(Day-by-Day)

周期:June 5 - June 15, 2026(共 11 天)
目标:完成 MVP、打磨 UI、冲全部徽章、提交视频与社交文案

Day 1:立项 + 项目骨架

目标:确定项目不可变范围。

  • 配置 GitHub origin
  • 确认并同步 GitHub repo
  • 创建 Hugging Face Space
  • 创建基础 Gradio app
  • 写 README 草稿
  • 确定英文主界面文案
  • 建立 AGENTS.md
  • 建立 .codex/skills/

Day 2:MVP 交互闭环

目标:先不管模型,跑通产品流程。

  • 图片上传
  • 文本描述输入
  • personality mode 选择
  • mock object JSON
  • mock diary 输出
  • trace JSON 保存
  • share card HTML 预览
  • mock example gallery
  • MVP smoke tests
  • six public mock sample traces
  • local initial-stage acceptance script
  • local initial-stage completion report

交付:Upload → Generate → Diary → Share Card → Trace


Day 3:接入 VLM

目标:让 AI 真正看图。

  • 接入 MiniCPM-V 或轻量 VLM
  • 输出 object understanding JSON
  • 做 JSON repair
  • 加 example gallery
  • 新增 Space VLM 验证脚本
  • 新增 ZeroGPU 兼容装饰器
  • ZeroGPU CUDA probe
  • 缓存示例输出
  • Space 真实图片验证(L4 因 HF 402 Payment Required 阻塞;ZeroGPU CUDA probe 成功;2026-06-08 full validation reached the app but fell back to mock vision for mug/keyboard/shoe)

验收:上传杯子/键盘/鞋子,模型能识别物品并提取外观特征。

完成记录:MiniCPM-V 2.6 已作为可配置 vision backend 接入,默认仍是 mock vision;scripts/check_space_vlm.py 已可用三张临时公开图片验证 Space 端 mug/keyboard/shoe。2026-06-06 已尝试切到 L4,但 Hugging Face 返回 402 Payment Required;随后 ZeroGPU CUDA probe 成功。2026-06-08 full validation reached the app through the direct hf.space path, but all three objects included vision-fallback-to-mock。文本生成已接入可选 llama.cpp runtime wiring,但最终 GGUF 模型仍未选择/下载。


Day 4:文本模型 + llama.cpp

目标:让核心人格生成走小模型本地推理。

  • 下载小模型 GGUF
  • 接入可选 llama.cpp / llama-cpp-python runtime wiring
  • 封装 generate_persona()
  • 封装 generate_diary()
  • README 说明运行方式
  • 用真实 GGUF 做本地 smoke test
  • README 说明最终模型参数量

交付:src/models/llama_cpp_runner.py 已支持 TEXT_MODEL_PATH;不提交 models/text_model.gguf。后续仍需确定真实 GGUF、参数量和训练/发布路径。


Day 5:训练数据 + 微调准备

目标:冲 Well-Tuned 勋章。

  • 设计 SFT schema
  • 生成 mock SFT preview 数据
  • 生成 200-500 条 real/candidate object-persona 样本
  • 手工精选 50 条高质量样本
  • 上传 dataset 到 HF
  • 准备 LoRA 训练脚本

数据格式示例:

{
  "instruction": "Create a secret diary persona for this object.",
  "input": {
    "object": "old keyboard",
    "features": ["dusty", "mechanical keys", "developer desk"],
    "mode": "cynical"
  },
  "output": {
    "character_name": "Clackwell",
    "diary": "He calls it productivity. I call it percussion with anxiety.",
    "tags": ["burnout instrument", "debug witness", "plastic philosopher"]
  }
}

Day 6:LoRA 微调 + Hub 发布

目标:拿到可展示的自微调模型。

  • 用 Modal credits 进行训练
  • 导出 LoRA adapter
  • 发布 HF model repo
  • app 中加入模型说明
  • README 加 Well-Tuned section

交付:HF model repo、HF dataset repo、train log、model card

⚠️ Modal credits 兑换码不应公开分享,项目文档里只写"used Modal credits"。


Day 7:UI 魔改

目标:冲 Off-Brand 勋章。

视觉方向:

A strange archive room for everyday objects.
Dark paper texture, amber highlights, typewriter output, museum labels.

界面布局:

Left:   Object Intake
Middle: Object File
Right:  Secret Diary
Bottom: Share Card + Trace
  • 自定义 CSS
  • 自定义 hero section
  • 隐藏 Gradio 默认风格
  • 加 typewriter / archive reveal 视觉感
  • 做英文主文案 + 中文辅助
  • 做 6 个示例卡片

完成记录:Phase 2 UI 已完成为 archive dashboard。MiniCPM-V 2.6 vision backend 和可选 llama.cpp runtime wiring 已接入但默认仍 mock;LoRA 未接入;UI 参考/ 仅作为本地视觉参考,不入库。


Day 8:Trace + Sharing is Caring

目标:公开可复现材料。

  • trace logger
  • sample traces
  • prompt templates
  • dataset preview
  • trace JSONL export
  • 失败案例记录
  • Space VLM validation report 模板
  • 真实模型 traces
  • GitHub repo 同步整理

Day 9:Field Notes

目标:完成技术博客。

英文标题:Building Objectverse Diary: A Small-Model AI Toy Where Everyday Objects Come Alive

博客结构:

  1. Why I built it
  2. Why Track 2
  3. Why small models are enough
  4. Product design
  5. Model architecture
  6. Gradio Off-Brand UI
  7. llama.cpp runtime
  8. Fine-tuning dataset
  9. Traces and reproducibility
  10. What failed
  11. What I would improve next

Day 10:Demo 视频

目标:视频必须比代码更能打。

建议长度:90 秒

 0- 8s  What if every object around you had a secret life?
 8-20s  This is Objectverse Diary, a small-model AI toy built with Gradio.
20-35s  Upload a photo of any everyday object.
35-50s  A vision model reads the object, then a small fine-tuned model creates its hidden personality.
50-70s  Now this coffee mug writes its secret diary and complains about its owner.
70-82s  You can chat with the object and generate a shareable personality card.
82-90s  Built with small models, Gradio, llama.cpp, public traces, and no commercial cloud APIs.

Day 11:提交检查

  • Space under official org
  • Space MiniCPM-V validation passes for mug, keyboard, and shoe(当前 wired but hosted validation falls back to mock)
  • Demo video ready
  • Social post ready
  • README complete
  • Model parameter count documented
  • No commercial API
  • Fine-tuned model linked
  • Dataset linked
  • Traces linked
  • Field Notes linked
  • UI English-first, Chinese-second
  • Submit before June 15, 2026