ObjectverseDiary / docs /03-dev-schedule.md
qqyule's picture
Deploy latest Objectverse Diary version
1e2c036 verified
# Objectverse Diary — 开发计划(Day-by-Day)
```text
周期:June 5 - June 15, 2026(共 11 天)
目标:完成 MVP、打磨 UI、冲全部徽章、提交视频与社交文案
```
---
## Day 1:立项 + 项目骨架
**目标:确定项目不可变范围。**
- [x] 配置 GitHub origin
- [ ] 确认并同步 GitHub repo
- [x] 创建 Hugging Face Space
- [x] 创建基础 Gradio app
- [x] 写 README 草稿
- [x] 确定英文主界面文案
- [x] 建立 `AGENTS.md`
- [x] 建立 `.codex/skills/`
---
## Day 2:MVP 交互闭环
**目标:先不管模型,跑通产品流程。**
- [x] 图片上传
- [x] 文本描述输入
- [x] personality mode 选择
- [x] mock object JSON
- [x] mock diary 输出
- [x] trace JSON 保存
- [x] share card HTML 预览
- [x] mock example gallery
- [x] MVP smoke tests
- [x] six public mock sample traces
- [x] local initial-stage acceptance script
- [x] local initial-stage completion report
交付:`Upload → Generate → Diary → Share Card → Trace`
---
## Day 3:接入 VLM
**目标:让 AI 真正看图。**
- [x] 接入 MiniCPM-V 或轻量 VLM
- [x] 输出 object understanding JSON
- [x] 做 JSON repair
- [x] 加 example gallery
- [x] 新增 Space VLM 验证脚本
- [x] 新增 ZeroGPU 兼容装饰器
- [x] ZeroGPU CUDA probe
- [ ] 缓存示例输出
- [ ] Space 真实图片验证(L4 因 HF `402 Payment Required` 阻塞;ZeroGPU CUDA probe 成功;2026-06-08 full validation reached the app but fell back to mock vision for mug/keyboard/shoe)
验收:上传杯子/键盘/鞋子,模型能识别物品并提取外观特征。
完成记录:MiniCPM-V 2.6 已作为可配置 vision backend 接入,默认仍是 mock vision;`scripts/check_space_vlm.py` 已可用三张临时公开图片验证 Space 端 mug/keyboard/shoe。2026-06-06 已尝试切到 L4,但 Hugging Face 返回 `402 Payment Required`;随后 ZeroGPU CUDA probe 成功。2026-06-08 full validation reached the app through the direct `hf.space` path, but all three objects included `vision-fallback-to-mock`。文本生成已接入可选 llama.cpp runtime wiring,但最终 GGUF 模型仍未选择/下载。
---
## Day 4:文本模型 + llama.cpp
**目标:让核心人格生成走小模型本地推理。**
- [ ] 下载小模型 GGUF
- [x] 接入可选 llama.cpp / llama-cpp-python runtime wiring
- [x] 封装 `generate_persona()`
- [x] 封装 `generate_diary()`
- [x] README 说明运行方式
- [ ] 用真实 GGUF 做本地 smoke test
- [ ] README 说明最终模型参数量
交付:`src/models/llama_cpp_runner.py` 已支持 `TEXT_MODEL_PATH`;不提交 `models/text_model.gguf`。后续仍需确定真实 GGUF、参数量和训练/发布路径。
---
## Day 5:训练数据 + 微调准备
**目标:冲 Well-Tuned 勋章。**
- [x] 设计 SFT schema
- [x] 生成 mock SFT preview 数据
- [ ] 生成 200-500 条 real/candidate object-persona 样本
- [ ] 手工精选 50 条高质量样本
- [ ] 上传 dataset 到 HF
- [ ] 准备 LoRA 训练脚本
数据格式示例:
```json
{
"instruction": "Create a secret diary persona for this object.",
"input": {
"object": "old keyboard",
"features": ["dusty", "mechanical keys", "developer desk"],
"mode": "cynical"
},
"output": {
"character_name": "Clackwell",
"diary": "He calls it productivity. I call it percussion with anxiety.",
"tags": ["burnout instrument", "debug witness", "plastic philosopher"]
}
}
```
---
## Day 6:LoRA 微调 + Hub 发布
**目标:拿到可展示的自微调模型。**
- [ ] 用 Modal credits 进行训练
- [ ] 导出 LoRA adapter
- [ ] 发布 HF model repo
- [ ] app 中加入模型说明
- [ ] README 加 `Well-Tuned` section
交付:HF model repo、HF dataset repo、train log、model card
> ⚠️ Modal credits 兑换码不应公开分享,项目文档里只写"used Modal credits"。
---
## Day 7:UI 魔改
**目标:冲 Off-Brand 勋章。**
视觉方向:
```text
A strange archive room for everyday objects.
Dark paper texture, amber highlights, typewriter output, museum labels.
```
界面布局:
```text
Left: Object Intake
Middle: Object File
Right: Secret Diary
Bottom: Share Card + Trace
```
- [x] 自定义 CSS
- [x] 自定义 hero section
- [x] 隐藏 Gradio 默认风格
- [x] 加 typewriter / archive reveal 视觉感
- [x] 做英文主文案 + 中文辅助
- [x] 做 6 个示例卡片
完成记录:Phase 2 UI 已完成为 archive dashboard。MiniCPM-V 2.6 vision backend 和可选 llama.cpp runtime wiring 已接入但默认仍 mock;LoRA 未接入;`UI 参考/` 仅作为本地视觉参考,不入库。
---
## Day 8:Trace + Sharing is Caring
**目标:公开可复现材料。**
- [x] trace logger
- [x] sample traces
- [x] prompt templates
- [x] dataset preview
- [x] trace JSONL export
- [x] 失败案例记录
- [x] Space VLM validation report 模板
- [ ] 真实模型 traces
- [ ] GitHub repo 同步整理
---
## Day 9:Field Notes
**目标:完成技术博客。**
英文标题:`Building Objectverse Diary: A Small-Model AI Toy Where Everyday Objects Come Alive`
博客结构:
1. Why I built it
2. Why Track 2
3. Why small models are enough
4. Product design
5. Model architecture
6. Gradio Off-Brand UI
7. llama.cpp runtime
8. Fine-tuning dataset
9. Traces and reproducibility
10. What failed
11. What I would improve next
---
## Day 10:Demo 视频
**目标:视频必须比代码更能打。**
建议长度:90 秒
```text
0- 8s What if every object around you had a secret life?
8-20s This is Objectverse Diary, a small-model AI toy built with Gradio.
20-35s Upload a photo of any everyday object.
35-50s A vision model reads the object, then a small fine-tuned model creates its hidden personality.
50-70s Now this coffee mug writes its secret diary and complains about its owner.
70-82s You can chat with the object and generate a shareable personality card.
82-90s Built with small models, Gradio, llama.cpp, public traces, and no commercial cloud APIs.
```
---
## Day 11:提交检查
- [ ] Space under official org
- [ ] Space MiniCPM-V validation passes for mug, keyboard, and shoe(当前 wired but hosted validation falls back to mock)
- [ ] Demo video ready
- [ ] Social post ready
- [ ] README complete
- [ ] Model parameter count documented
- [ ] No commercial API
- [ ] Fine-tuned model linked
- [ ] Dataset linked
- [ ] Traces linked
- [ ] Field Notes linked
- [ ] UI English-first, Chinese-second
- [ ] Submit before June 15, 2026