File size: 6,548 Bytes
6f8d8d9
 
 
 
 
 
 
 
 
 
 
 
 
e20e3d9
 
 
bc02199
 
 
 
 
6f8d8d9
 
 
 
 
 
 
bc02199
 
 
 
 
 
 
 
 
 
 
 
6f8d8d9
 
 
 
 
 
 
 
 
e20e3d9
 
 
 
 
535bb9d
1e2c036
6f8d8d9
1e2c036
6f8d8d9
 
 
1e2c036
e20e3d9
6f8d8d9
 
 
 
 
 
 
e20e3d9
 
 
 
 
 
6f8d8d9
e20e3d9
6f8d8d9
 
 
 
 
 
 
bc02199
 
 
6f8d8d9
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
783b7b3
 
 
 
 
 
 
e20e3d9
6f8d8d9
 
 
 
 
 
 
bc02199
 
 
 
 
 
e20e3d9
 
 
6f8d8d9
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1e2c036
6f8d8d9
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
# Objectverse Diary — 开发计划(Day-by-Day)

```text
周期:June 5 - June 15, 2026(共 11 天)
目标:完成 MVP、打磨 UI、冲全部徽章、提交视频与社交文案
```

---

## Day 1:立项 + 项目骨架

**目标:确定项目不可变范围。**

- [x] 配置 GitHub origin
- [ ] 确认并同步 GitHub repo
- [x] 创建 Hugging Face Space
- [x] 创建基础 Gradio app
- [x] 写 README 草稿
- [x] 确定英文主界面文案
- [x] 建立 `AGENTS.md`
- [x] 建立 `.codex/skills/`

---

## Day 2:MVP 交互闭环

**目标:先不管模型,跑通产品流程。**

- [x] 图片上传
- [x] 文本描述输入
- [x] personality mode 选择
- [x] mock object JSON
- [x] mock diary 输出
- [x] trace JSON 保存
- [x] share card HTML 预览
- [x] mock example gallery
- [x] MVP smoke tests
- [x] six public mock sample traces
- [x] local initial-stage acceptance script
- [x] local initial-stage completion report

交付:`Upload → Generate → Diary → Share Card → Trace`

---

## Day 3:接入 VLM

**目标:让 AI 真正看图。**

- [x] 接入 MiniCPM-V 或轻量 VLM
- [x] 输出 object understanding JSON
- [x] 做 JSON repair
- [x] 加 example gallery
- [x] 新增 Space VLM 验证脚本
- [x] 新增 ZeroGPU 兼容装饰器
- [x] ZeroGPU CUDA probe
- [ ] 缓存示例输出
- [ ] Space 真实图片验证(L4 因 HF `402 Payment Required` 阻塞;ZeroGPU CUDA probe 成功;2026-06-08 full validation reached the app but fell back to mock vision for mug/keyboard/shoe)

验收:上传杯子/键盘/鞋子,模型能识别物品并提取外观特征。

完成记录:MiniCPM-V 2.6 已作为可配置 vision backend 接入,默认仍是 mock vision;`scripts/check_space_vlm.py` 已可用三张临时公开图片验证 Space 端 mug/keyboard/shoe。2026-06-06 已尝试切到 L4,但 Hugging Face 返回 `402 Payment Required`;随后 ZeroGPU CUDA probe 成功。2026-06-08 full validation reached the app through the direct `hf.space` path, but all three objects included `vision-fallback-to-mock`。文本生成已接入可选 llama.cpp runtime wiring,但最终 GGUF 模型仍未选择/下载。

---

## Day 4:文本模型 + llama.cpp

**目标:让核心人格生成走小模型本地推理。**

- [ ] 下载小模型 GGUF
- [x] 接入可选 llama.cpp / llama-cpp-python runtime wiring
- [x] 封装 `generate_persona()`
- [x] 封装 `generate_diary()`
- [x] README 说明运行方式
- [ ] 用真实 GGUF 做本地 smoke test
- [ ] README 说明最终模型参数量

交付:`src/models/llama_cpp_runner.py` 已支持 `TEXT_MODEL_PATH`;不提交 `models/text_model.gguf`。后续仍需确定真实 GGUF、参数量和训练/发布路径。

---

## Day 5:训练数据 + 微调准备

**目标:冲 Well-Tuned 勋章。**

- [x] 设计 SFT schema
- [x] 生成 mock SFT preview 数据
- [ ] 生成 200-500 条 real/candidate object-persona 样本
- [ ] 手工精选 50 条高质量样本
- [ ] 上传 dataset 到 HF
- [ ] 准备 LoRA 训练脚本

数据格式示例:

```json
{
  "instruction": "Create a secret diary persona for this object.",
  "input": {
    "object": "old keyboard",
    "features": ["dusty", "mechanical keys", "developer desk"],
    "mode": "cynical"
  },
  "output": {
    "character_name": "Clackwell",
    "diary": "He calls it productivity. I call it percussion with anxiety.",
    "tags": ["burnout instrument", "debug witness", "plastic philosopher"]
  }
}
```

---

## Day 6:LoRA 微调 + Hub 发布

**目标:拿到可展示的自微调模型。**

- [ ] 用 Modal credits 进行训练
- [ ] 导出 LoRA adapter
- [ ] 发布 HF model repo
- [ ] app 中加入模型说明
- [ ] README 加 `Well-Tuned` section

交付:HF model repo、HF dataset repo、train log、model card

> ⚠️ Modal credits 兑换码不应公开分享,项目文档里只写"used Modal credits"。

---

## Day 7:UI 魔改

**目标:冲 Off-Brand 勋章。**

视觉方向:

```text
A strange archive room for everyday objects.
Dark paper texture, amber highlights, typewriter output, museum labels.
```

界面布局:

```text
Left:   Object Intake
Middle: Object File
Right:  Secret Diary
Bottom: Share Card + Trace
```

- [x] 自定义 CSS
- [x] 自定义 hero section
- [x] 隐藏 Gradio 默认风格
- [x] 加 typewriter / archive reveal 视觉感
- [x] 做英文主文案 + 中文辅助
- [x] 做 6 个示例卡片

完成记录:Phase 2 UI 已完成为 archive dashboard。MiniCPM-V 2.6 vision backend 和可选 llama.cpp runtime wiring 已接入但默认仍 mock;LoRA 未接入;`UI 参考/` 仅作为本地视觉参考,不入库。

---

## Day 8:Trace + Sharing is Caring

**目标:公开可复现材料。**

- [x] trace logger
- [x] sample traces
- [x] prompt templates
- [x] dataset preview
- [x] trace JSONL export
- [x] 失败案例记录
- [x] Space VLM validation report 模板
- [ ] 真实模型 traces
- [ ] GitHub repo 同步整理

---

## Day 9:Field Notes

**目标:完成技术博客。**

英文标题:`Building Objectverse Diary: A Small-Model AI Toy Where Everyday Objects Come Alive`

博客结构:

1. Why I built it
2. Why Track 2
3. Why small models are enough
4. Product design
5. Model architecture
6. Gradio Off-Brand UI
7. llama.cpp runtime
8. Fine-tuning dataset
9. Traces and reproducibility
10. What failed
11. What I would improve next

---

## Day 10:Demo 视频

**目标:视频必须比代码更能打。**

建议长度:90 秒

```text
 0- 8s  What if every object around you had a secret life?
 8-20s  This is Objectverse Diary, a small-model AI toy built with Gradio.
20-35s  Upload a photo of any everyday object.
35-50s  A vision model reads the object, then a small fine-tuned model creates its hidden personality.
50-70s  Now this coffee mug writes its secret diary and complains about its owner.
70-82s  You can chat with the object and generate a shareable personality card.
82-90s  Built with small models, Gradio, llama.cpp, public traces, and no commercial cloud APIs.
```

---

## Day 11:提交检查

- [ ] Space under official org
- [ ] Space MiniCPM-V validation passes for mug, keyboard, and shoe(当前 wired but hosted validation falls back to mock)
- [ ] Demo video ready
- [ ] Social post ready
- [ ] README complete
- [ ] Model parameter count documented
- [ ] No commercial API
- [ ] Fine-tuned model linked
- [ ] Dataset linked
- [ ] Traces linked
- [ ] Field Notes linked
- [ ] UI English-first, Chinese-second
- [ ] Submit before June 15, 2026