Spaces:
Running on Zero
Running on Zero
File size: 6,548 Bytes
6f8d8d9 e20e3d9 bc02199 6f8d8d9 bc02199 6f8d8d9 e20e3d9 535bb9d 1e2c036 6f8d8d9 1e2c036 6f8d8d9 1e2c036 e20e3d9 6f8d8d9 e20e3d9 6f8d8d9 e20e3d9 6f8d8d9 bc02199 6f8d8d9 783b7b3 e20e3d9 6f8d8d9 bc02199 e20e3d9 6f8d8d9 1e2c036 6f8d8d9 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 | # Objectverse Diary — 开发计划(Day-by-Day)
```text
周期:June 5 - June 15, 2026(共 11 天)
目标:完成 MVP、打磨 UI、冲全部徽章、提交视频与社交文案
```
---
## Day 1:立项 + 项目骨架
**目标:确定项目不可变范围。**
- [x] 配置 GitHub origin
- [ ] 确认并同步 GitHub repo
- [x] 创建 Hugging Face Space
- [x] 创建基础 Gradio app
- [x] 写 README 草稿
- [x] 确定英文主界面文案
- [x] 建立 `AGENTS.md`
- [x] 建立 `.codex/skills/`
---
## Day 2:MVP 交互闭环
**目标:先不管模型,跑通产品流程。**
- [x] 图片上传
- [x] 文本描述输入
- [x] personality mode 选择
- [x] mock object JSON
- [x] mock diary 输出
- [x] trace JSON 保存
- [x] share card HTML 预览
- [x] mock example gallery
- [x] MVP smoke tests
- [x] six public mock sample traces
- [x] local initial-stage acceptance script
- [x] local initial-stage completion report
交付:`Upload → Generate → Diary → Share Card → Trace`
---
## Day 3:接入 VLM
**目标:让 AI 真正看图。**
- [x] 接入 MiniCPM-V 或轻量 VLM
- [x] 输出 object understanding JSON
- [x] 做 JSON repair
- [x] 加 example gallery
- [x] 新增 Space VLM 验证脚本
- [x] 新增 ZeroGPU 兼容装饰器
- [x] ZeroGPU CUDA probe
- [ ] 缓存示例输出
- [ ] Space 真实图片验证(L4 因 HF `402 Payment Required` 阻塞;ZeroGPU CUDA probe 成功;2026-06-08 full validation reached the app but fell back to mock vision for mug/keyboard/shoe)
验收:上传杯子/键盘/鞋子,模型能识别物品并提取外观特征。
完成记录:MiniCPM-V 2.6 已作为可配置 vision backend 接入,默认仍是 mock vision;`scripts/check_space_vlm.py` 已可用三张临时公开图片验证 Space 端 mug/keyboard/shoe。2026-06-06 已尝试切到 L4,但 Hugging Face 返回 `402 Payment Required`;随后 ZeroGPU CUDA probe 成功。2026-06-08 full validation reached the app through the direct `hf.space` path, but all three objects included `vision-fallback-to-mock`。文本生成已接入可选 llama.cpp runtime wiring,但最终 GGUF 模型仍未选择/下载。
---
## Day 4:文本模型 + llama.cpp
**目标:让核心人格生成走小模型本地推理。**
- [ ] 下载小模型 GGUF
- [x] 接入可选 llama.cpp / llama-cpp-python runtime wiring
- [x] 封装 `generate_persona()`
- [x] 封装 `generate_diary()`
- [x] README 说明运行方式
- [ ] 用真实 GGUF 做本地 smoke test
- [ ] README 说明最终模型参数量
交付:`src/models/llama_cpp_runner.py` 已支持 `TEXT_MODEL_PATH`;不提交 `models/text_model.gguf`。后续仍需确定真实 GGUF、参数量和训练/发布路径。
---
## Day 5:训练数据 + 微调准备
**目标:冲 Well-Tuned 勋章。**
- [x] 设计 SFT schema
- [x] 生成 mock SFT preview 数据
- [ ] 生成 200-500 条 real/candidate object-persona 样本
- [ ] 手工精选 50 条高质量样本
- [ ] 上传 dataset 到 HF
- [ ] 准备 LoRA 训练脚本
数据格式示例:
```json
{
"instruction": "Create a secret diary persona for this object.",
"input": {
"object": "old keyboard",
"features": ["dusty", "mechanical keys", "developer desk"],
"mode": "cynical"
},
"output": {
"character_name": "Clackwell",
"diary": "He calls it productivity. I call it percussion with anxiety.",
"tags": ["burnout instrument", "debug witness", "plastic philosopher"]
}
}
```
---
## Day 6:LoRA 微调 + Hub 发布
**目标:拿到可展示的自微调模型。**
- [ ] 用 Modal credits 进行训练
- [ ] 导出 LoRA adapter
- [ ] 发布 HF model repo
- [ ] app 中加入模型说明
- [ ] README 加 `Well-Tuned` section
交付:HF model repo、HF dataset repo、train log、model card
> ⚠️ Modal credits 兑换码不应公开分享,项目文档里只写"used Modal credits"。
---
## Day 7:UI 魔改
**目标:冲 Off-Brand 勋章。**
视觉方向:
```text
A strange archive room for everyday objects.
Dark paper texture, amber highlights, typewriter output, museum labels.
```
界面布局:
```text
Left: Object Intake
Middle: Object File
Right: Secret Diary
Bottom: Share Card + Trace
```
- [x] 自定义 CSS
- [x] 自定义 hero section
- [x] 隐藏 Gradio 默认风格
- [x] 加 typewriter / archive reveal 视觉感
- [x] 做英文主文案 + 中文辅助
- [x] 做 6 个示例卡片
完成记录:Phase 2 UI 已完成为 archive dashboard。MiniCPM-V 2.6 vision backend 和可选 llama.cpp runtime wiring 已接入但默认仍 mock;LoRA 未接入;`UI 参考/` 仅作为本地视觉参考,不入库。
---
## Day 8:Trace + Sharing is Caring
**目标:公开可复现材料。**
- [x] trace logger
- [x] sample traces
- [x] prompt templates
- [x] dataset preview
- [x] trace JSONL export
- [x] 失败案例记录
- [x] Space VLM validation report 模板
- [ ] 真实模型 traces
- [ ] GitHub repo 同步整理
---
## Day 9:Field Notes
**目标:完成技术博客。**
英文标题:`Building Objectverse Diary: A Small-Model AI Toy Where Everyday Objects Come Alive`
博客结构:
1. Why I built it
2. Why Track 2
3. Why small models are enough
4. Product design
5. Model architecture
6. Gradio Off-Brand UI
7. llama.cpp runtime
8. Fine-tuning dataset
9. Traces and reproducibility
10. What failed
11. What I would improve next
---
## Day 10:Demo 视频
**目标:视频必须比代码更能打。**
建议长度:90 秒
```text
0- 8s What if every object around you had a secret life?
8-20s This is Objectverse Diary, a small-model AI toy built with Gradio.
20-35s Upload a photo of any everyday object.
35-50s A vision model reads the object, then a small fine-tuned model creates its hidden personality.
50-70s Now this coffee mug writes its secret diary and complains about its owner.
70-82s You can chat with the object and generate a shareable personality card.
82-90s Built with small models, Gradio, llama.cpp, public traces, and no commercial cloud APIs.
```
---
## Day 11:提交检查
- [ ] Space under official org
- [ ] Space MiniCPM-V validation passes for mug, keyboard, and shoe(当前 wired but hosted validation falls back to mock)
- [ ] Demo video ready
- [ ] Social post ready
- [ ] README complete
- [ ] Model parameter count documented
- [ ] No commercial API
- [ ] Fine-tuned model linked
- [ ] Dataset linked
- [ ] Traces linked
- [ ] Field Notes linked
- [ ] UI English-first, Chinese-second
- [ ] Submit before June 15, 2026
|