Spaces:

build-small-hackathon
/

ObjectverseDiary

Running on Zero

App Files Files Community

ObjectverseDiary / docs /02-tech-architecture.md

qqyule

feat: initialize project structure for Objectverse Diary

6f8d8d9 7 days ago

preview code

raw

history blame contribute delete

4.08 kB

A newer version of the Gradio SDK is available: 6.18.0

Upgrade

Objectverse Diary — 技术架构

系统架构

Gradio UI
  ↓
Image Input
  ↓
MiniCPM-V / lightweight VLM
  ↓
Object Understanding JSON
  ↓
Fine-tuned small LLM via llama.cpp
  ↓
Persona + Diary JSON
  ↓
Renderer
  ↓
Diary View + Share Card + Trace Export

模型方案

模块	模型 / 工具	目的
Vision Understanding	MiniCPM-V	识别物品、外观、场景
Persona Writer	Fine-tuned small LLM GGUF	生成人格、日记、对话
Runtime	llama.cpp / llama-cpp-python	冲 Llama Champion 勋章
UI	Gradio Blocks	官方硬性要求
Hosting	Hugging Face Space	官方硬性要求
Training / Batch	Modal	$250 Modal credits 做训练/批处理
Demo GPU	HF ZeroGPU / upgraded Space	按需分配 GPU

注意：MiniCPM-V 4.6 面向 edge deployment，基于轻量 LLM 架构，适合本项目需求。 ZeroGPU 是面向 Spaces 的动态 GPU 基础设施，hackathon org 成员有每日免费额度。

降级方案

多模态 + llama.cpp 是高风险点，必须准备降级。

风险	主方案	降级方案
MiniCPM-V 部署慢	MiniCPM-V Space 推理	预置 example gallery + 手动 object description
VLM llama.cpp 不稳定	VLM 用 transformers，文本用 llama.cpp	仍然保证核心文本人格生成走 llama.cpp
微调来不及	LoRA 微调	用 100 条高质量 SFT 数据 + prompt-tuned style
Space 资源不足	HF upgraded Space / ZeroGPU	CPU 模式 + 小模型 + 示例缓存
视频效果不够	实时生成	使用 3 个稳定示例录制 Demo

技术栈

Language:        Python
UI:              Gradio Blocks
Model Runtime:   llama.cpp / llama-cpp-python
VLM:             MiniCPM-V or fallback lightweight VLM
Training:        LoRA / PEFT / TRL
Hosting:         Hugging Face Spaces
Batch/Fine-tune: Modal
Docs:            Markdown
Package Manager: uv or pip

项目目录结构

objectverse-diary/
├─ app.py
├─ README.md
├─ AGENTS.md
├─ requirements.txt
├─ pyproject.toml
├─ .env.example
├─ .gitignore
├─ src/
│  ├─ config.py
│  ├─ ui/
│  │  ├─ layout.py
│  │  ├─ styles.css
│  │  └─ copy.py
│  ├─ models/
│  │  ├─ vision_runner.py
│  │  ├─ llama_cpp_runner.py
│  │  └─ schema.py
│  ├─ prompts/
│  │  ├─ object_understanding.py
│  │  ├─ persona_generation.py
│  │  └─ diary_generation.py
│  ├─ renderer/
│  │  ├─ share_card.py
│  │  └─ html_templates.py
│  ├─ traces/
│  │  ├─ logger.py
│  │  └─ anonymizer.py
│  └─ utils/
│     ├─ json_repair.py
│     └─ image_utils.py
├─ data/
│  ├─ examples/
│  ├─ train/
│  ├─ eval/
│  └─ traces/
├─ scripts/
│  ├─ generate_dataset.py
│  ├─ finetune_lora.py
│  ├─ convert_to_gguf.sh
│  ├─ run_llama_cpp.sh
│  └─ export_traces.py
├─ docs/
│  ├─ PRD.md
│  ├─ FIELD_NOTES.md
│  ├─ SUBMISSION_GUIDE.md
│  └─ MODEL_CARD.md
└─ .codex/
   ├─ project.md
   └─ skills/
      ├─ gradio-ui/SKILL.md
      ├─ model-runtime/SKILL.md
      ├─ dataset-trace/SKILL.md
      ├─ hf-space/SKILL.md
      └─ submission/SKILL.md