# Template How-To: Build A New Domain App
This repository is a local-first Gradio AI app template. The base workbench provides shared
patterns for model configuration, field notes, tracking, export planning, tests, docs, and
deployment. A domain app is a focused product built around those patterns.
Use `plant/` as the first reference domain app.
## Core Principle
Do not start by training a model. Start by shipping a useful zero-shot or demo-mode workflow:
```text
domain idea
-> user story
-> schema
-> model choice
-> focused UI
-> correction loop
-> export data
-> optional fine-tune
-> deploy and document
```
Training is a later optimization after you have corrected examples and a reason to tune.
## Recommended Branch Flow
1. Keep `main` as the reusable template.
2. Create a branch for each app:
```powershell
git checkout -b plant-discovery-app
```
3. Build the app under a domain folder such as `plant/`, `invoice/`, `recipe/`, or `field_notes/`.
4. Keep domain-specific heavy requirements in `<domain>/requirements.txt`.
5. Merge reusable improvements back into `main` only after they are generic.
## Domain App File Contract
Each generated app should have these files:
```text
<domain>/
  __init__.py
  app.py                 # standalone Gradio entrypoint
  models.yaml            # domain config, model IDs, data sources, training defaults
  <domain>_service.py    # optional real model adapter plus demo/no-model fallback
  <domain>_loader.py     # data loading, schema normalization, export rows
  <domain>_tab.py        # focused Gradio UI
  <domain>_tools.py      # optional MCP/local tools with no hard optional imports
  requirements.txt       # optional heavy dependencies for this app only
```
Add tests under:
```text
tests/unit/test_<domain>_reference_app.py
```
Add docs under:
```text
docs/<DOMAIN>_APP_PLAN.md
```
## Step-By-Step Build Process
### 1. Define The Product
- [ ] Pick one user.
- [ ] Pick one job they need done.
- [ ] Write one sentence: "This app helps X do Y without Z."
- [ ] Choose one golden path that works in under two minutes.
- [ ] Decide whether the app is a standalone product or a tab inside the workbench.
- [ ] Decide whether it must run on a public Hugging Face Space.
Example:
> Plant Discovery helps gardeners identify a plant from a photo, correct mistakes, and export
> local training examples without sending private field notes to a cloud API.
### 2. Define The Domain Schema
- [ ] Create a dataclass for the structured output.
- [ ] Include confidence and model metadata.
- [ ] Include a `to_dict()` method for Gradio JSON.
- [ ] Add a robust parser for model responses.
- [ ] Add tests for valid JSON, fenced JSON, trailing commas, and unparseable text.
Plant example: `PlantID` in `plant/plant_service.py`.
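The checklist above can be sketched roughly as follows. This is a minimal illustration, not the actual `PlantID` implementation: the class name `DomainResult`, its fields, and `parse_model_response` are assumptions made for the example.

```python
import json
import re
from dataclasses import asdict, dataclass

FENCE = "`" * 3  # built at runtime so this example can live inside a fenced block
_FENCED = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)
_TRAILING_COMMA = re.compile(r",\s*([}\]])")


@dataclass
class DomainResult:
    """Hypothetical structured output; rename the fields for your domain."""

    label: str
    confidence: float
    model_id: str
    parse_ok: bool = True

    def to_dict(self) -> dict:
        # Plain dict so the result can be handed to a Gradio JSON component.
        return asdict(self)


def parse_model_response(text: str, model_id: str) -> DomainResult:
    """Parse plain JSON, fenced JSON, or JSON with trailing commas."""
    fenced = _FENCED.search(text)
    candidate = fenced.group(1) if fenced else text
    # Keep only the outermost {...} span so surrounding chatter is ignored.
    start, end = candidate.find("{"), candidate.rfind("}")
    if start == -1 or end <= start:
        return DomainResult("unknown", 0.0, model_id, parse_ok=False)
    # Naive trailing-comma repair; it may also touch commas inside string values.
    cleaned = _TRAILING_COMMA.sub(r"\1", candidate[start : end + 1])
    try:
        data = json.loads(cleaned)
    except json.JSONDecodeError:
        return DomainResult("unknown", 0.0, model_id, parse_ok=False)
    try:
        confidence = float(data.get("confidence", 0.0))
    except (TypeError, ValueError):
        confidence = 0.0
    confidence = min(max(confidence, 0.0), 1.0)  # clamp to [0, 1]
    return DomainResult(str(data.get("label", "unknown")), confidence, model_id)
```

Returning a result with `parse_ok=False` instead of raising keeps the UI responsive when the model produces unparseable text, and gives the tests a single value to assert on.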
### 3. Pick The Model
- [ ] Pick a small model at or below 32B parameters.
- [ ] Document the exact model ID.
- [ ] Add model metadata to `<domain>/models.yaml`.
- [ ] Avoid loading weights on startup.
- [ ] Add a deterministic demo/no-model service for screenshots and tests.
- [ ] Add an unavailable-path response when optional packages are missing.
- [ ] Add explicit runtime modes such as `demo`, `base-model`, and `finetuned`.
- [ ] Do not claim a fine-tuned model until a real adapter/checkpoint is configured and verified.
For vision apps, start with a VLM such as MiniCPM-V. For text apps, start with a small instruct
model through LM Studio, Ollama, llama.cpp, or Transformers.
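One possible shape for the runtime modes, the deterministic demo service, and the unavailable path is sketched below. All names here (`RuntimeMode`, `DemoService`, `LazyModelService`, the `loader` callable, the placeholder labels) are assumptions for illustration; the real model loading is deliberately left to a function you inject.

```python
import hashlib
import importlib.util
from enum import Enum
from typing import Callable, Optional


class RuntimeMode(str, Enum):
    DEMO = "demo"
    BASE_MODEL = "base-model"
    FINETUNED = "finetuned"


DEMO_LABELS = ["sample-a", "sample-b", "sample-c"]  # illustrative placeholders


class DemoService:
    """Deterministic stand-in: the same input bytes always give the same answer."""

    mode = RuntimeMode.DEMO

    def predict(self, payload: bytes) -> dict:
        digest = hashlib.sha256(payload).digest()
        return {
            "label": DEMO_LABELS[digest[0] % len(DEMO_LABELS)],
            "confidence": round(0.5 + digest[1] / 512, 3),
            "mode": self.mode.value,
        }


class LazyModelService:
    """Loads the model on first use, never at construction or app startup."""

    def __init__(
        self,
        model_id: str,
        loader: Callable[[str], Callable[[bytes], dict]],  # hypothetical hook
        required_package: str,
        mode: RuntimeMode = RuntimeMode.BASE_MODEL,
    ):
        self.model_id = model_id
        self.loader = loader
        self.required_package = required_package
        self.mode = mode
        self._model: Optional[Callable[[bytes], dict]] = None

    def available(self) -> bool:
        # Checks for the optional package without importing it.
        return importlib.util.find_spec(self.required_package) is not None

    def predict(self, payload: bytes) -> dict:
        if not self.available():
            return {
                "status": "unavailable",
                "reason": f"optional package '{self.required_package}' is not installed",
                "mode": self.mode.value,
            }
        if self._model is None:
            self._model = self.loader(self.model_id)  # first call pays the load cost
        return {**self._model(payload), "mode": self.mode.value, "model_id": self.model_id}
```

Only construct a service with `RuntimeMode.FINETUNED` once a real adapter or checkpoint is configured and verified, so the reported mode never overstates what is running.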
### 4. Build The Focused UI
- [ ] Make the first screen the golden path, not a generic dashboard.
- [ ] Add only the controls needed for the user story.
- [ ] Keep advanced setup behind a secondary tab or accordion.
- [ ] Add visible status messages.
- [ ] Add structured JSON output for debugging and reproducibility.
- [ ] Add correction capture if model output can be wrong.
- [ ] Add screenshots through Playwright after the UI is stable.
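As a rough sketch of a golden-path screen, the example below keeps the callback pure and imports Gradio lazily, so the callback can be tested without Gradio installed. The function names, labels, and the demo result are assumptions for illustration, not the template's actual `<domain>_tab.py`.

```python
def run_golden_path(text: str) -> tuple[str, dict]:
    """Pure callback: returns a visible status message and structured JSON."""
    cleaned = text.strip()
    if not cleaned:
        return "Enter some input first.", {}
    # Placeholder result; a real app would call its demo or model service here.
    result = {"input": cleaned, "label": "demo-label", "confidence": 0.5, "mode": "demo"}
    return "Done (demo mode).", result


def build_demo_ui():
    import gradio as gr  # imported lazily so the callback stays testable on its own

    with gr.Blocks(title="Domain App") as app:
        gr.Markdown("## Golden path")
        user_input = gr.Textbox(label="Input")
        run = gr.Button("Run")
        status = gr.Markdown()
        output = gr.JSON(label="Structured output")
        # Advanced setup stays out of the way of the first screen.
        with gr.Accordion("Advanced setup", open=False):
            gr.Textbox(label="Model ID", value="demo")
        run.click(run_golden_path, inputs=user_input, outputs=[status, output])
    return app
```

Calling `build_demo_ui().launch()` would start the app locally, assuming Gradio is installed.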
### 5. Add The Correction Loop
- [ ] Save user corrections locally.
- [ ] Reuse `datasets.field_notes.FieldNoteStore` where possible.
- [ ] Mark training-ready rows explicitly.
- [ ] Export JSONL without starting training.
- [ ] Add tests for save, filter, and export.
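If `FieldNoteStore` does not fit a domain, a small local store along these lines may be enough. This sketch does not reproduce the `FieldNoteStore` API; `CorrectionStore`, its methods, and the row fields are hypothetical.

```python
import json
from pathlib import Path


class CorrectionStore:
    """Append-only local JSONL store for user corrections (illustrative)."""

    def __init__(self, path: Path):
        self.path = Path(path)

    def save(self, prediction: dict, corrected_label: str, training_ready: bool = False) -> dict:
        row = {
            "prediction": prediction,
            "corrected_label": corrected_label,
            "training_ready": training_ready,  # explicit flag, never inferred
        }
        self.path.parent.mkdir(parents=True, exist_ok=True)
        with self.path.open("a", encoding="utf-8") as handle:
            handle.write(json.dumps(row, ensure_ascii=False) + "\n")
        return row

    def rows(self, training_ready_only: bool = False) -> list[dict]:
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as handle:
            rows = [json.loads(line) for line in handle if line.strip()]
        if training_ready_only:
            rows = [row for row in rows if row.get("training_ready")]
        return rows

    def export_jsonl(self, destination: Path) -> int:
        """Write training-ready rows to a JSONL file; does not start training."""
        ready = self.rows(training_ready_only=True)
        destination = Path(destination)
        destination.parent.mkdir(parents=True, exist_ok=True)
        with destination.open("w", encoding="utf-8") as handle:
            for row in ready:
                handle.write(json.dumps(row, ensure_ascii=False) + "\n")
        return len(ready)
```

Because export only writes a file and returns a count, it is safe to call from a Gradio callback and straightforward to cover with save, filter, and export tests.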
### 6. Add Data Loaders
- [ ] Support a small local demo dataset.
- [ ] Support domain data from local folders or CSV/JSONL.
- [ ] Keep Hugging Face dataset loading optional and explicit.
- [ ] Do not download large datasets on startup.
- [ ] Normalize every source into one training row schema.
- [ ] Add loader tests with temporary local files.
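A loader that normalizes local CSV and JSONL files into one row schema might look like the sketch below. The target schema (`input`, `label`, `source`) and the accepted source column names are assumptions; adjust them to your domain.

```python
import csv
import json
from pathlib import Path


def normalize_row(raw: dict, source: str) -> dict:
    """Map differently named source columns onto one training row schema."""
    text = raw.get("input") or raw.get("text") or raw.get("image_path") or ""
    label = raw.get("label") or raw.get("corrected_label") or ""
    return {"input": str(text).strip(), "label": str(label).strip(), "source": source}


def load_rows(path: Path) -> list[dict]:
    """Load a local CSV or JSONL file; nothing is downloaded."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as handle:
            raw_rows = list(csv.DictReader(handle))
    elif suffix == ".jsonl":
        with path.open(encoding="utf-8") as handle:
            raw_rows = [json.loads(line) for line in handle if line.strip()]
    else:
        raise ValueError(f"unsupported file type: {suffix or path.name}")
    rows = [normalize_row(raw, path.name) for raw in raw_rows]
    # Drop rows that are missing either side of the training pair.
    return [row for row in rows if row["input"] and row["label"]]
```

Keeping Hugging Face dataset loading in a separate, explicitly called function preserves the rule that nothing large is fetched on startup.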
### 7. Add Optional Tools
- [ ] Keep MCP/tool imports optional.
- [ ] Tool functions should work locally without starting a server.
- [ ] Add `build_mcp_server()` only if `mcp` is installed.
- [ ] Avoid direct shell execution from tools.
- [ ] Return command plans rather than running commands.
- [ ] Add tests for pure tool functions.
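The tool guidelines above could be sketched as follows. The tool function returns a command plan and never runs it; the `<domain>.export` module in the plan is hypothetical. The server builder assumes the installed `mcp` package is the MCP Python SDK, which exposes `FastMCP`, and returns `None` otherwise.

```python
import importlib.util


def plan_export_command(domain: str, output_path: str) -> dict:
    """Return a command plan for a human or approved runner; nothing is executed."""
    if not domain.isidentifier():
        raise ValueError(f"invalid domain name: {domain!r}")
    return {
        "action": "export",
        # Hypothetical module path, shown only as a preview.
        "argv": ["python", "-m", f"{domain}.export", "--out", output_path],
        "executed": False,
    }


def mcp_available() -> bool:
    # Detects the optional package without importing it.
    return importlib.util.find_spec("mcp") is not None


def build_mcp_server():
    """Build a server only when the optional dependency is present."""
    if not mcp_available():
        return None
    try:
        from mcp.server.fastmcp import FastMCP  # assumes the MCP Python SDK
    except ImportError:
        return None
    server = FastMCP("domain-tools")
    server.tool()(plan_export_command)  # register the pure function as a tool
    return server
```

Because `plan_export_command` is a plain function, it can be unit-tested and called locally without starting any server.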
### 8. Add Training Plans
- [ ] Start with a non-executing training plan.
- [ ] Include required dependencies, hardware notes, and command preview.
- [ ] Require enough corrected examples before recommending training.
- [ ] Keep real training as a separate local command or approved action.
- [ ] Add evaluation before/after tuning.
- [ ] Add a small script that prints the training plan as JSON.
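A non-executing training plan can be as simple as a function that returns a dictionary and a script that prints it. In this sketch the threshold, dependency list, hardware note, and previewed command are all illustrative assumptions, not values taken from the template.

```python
import json

MIN_EXAMPLES = 50  # illustrative threshold; pick one that suits your domain


def build_training_plan(model_id: str, corrected_examples: int,
                        min_examples: int = MIN_EXAMPLES) -> dict:
    """Describe a possible training run without starting it."""
    return {
        "model_id": model_id,
        "corrected_examples": corrected_examples,
        "min_examples": min_examples,
        "recommended": corrected_examples >= min_examples,
        "dependencies": ["<training packages from <domain>/requirements.txt>"],
        "hardware_notes": "Fill in GPU/VRAM expectations for the chosen model.",
        # Hypothetical command, shown as a preview only.
        "command_preview": ["python", "-m", "<domain>.train", "--data", "exports/train.jsonl"],
        "evaluation": ["evaluate base model", "tune", "evaluate tuned model"],
        "executes": False,
    }


if __name__ == "__main__":
    print(json.dumps(build_training_plan("example/model-id", 12), indent=2))
```

Real training then stays a separate local command or approved action that a person runs deliberately after reading the plan.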
### 9. Add Security Guardrails
- [ ] Escape model text rendered as HTML.
- [ ] Restrict file paths in public Space mode.
- [ ] Disable arbitrary backend URL checks in public Space mode.
- [ ] Do not execute subprocesses from Gradio callbacks.
- [ ] Keep tokens, private data, model weights, and exports out of git.
- [ ] Add tests for path traversal and malformed inputs when public deployment is planned.
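Two of these guardrails, HTML escaping and path restriction, can be sketched with the standard library alone. The function names are assumptions for illustration, and a public Space would likely need further checks beyond these.

```python
import html
from pathlib import Path


def render_model_text(text: str) -> str:
    """Escape model output before embedding it in HTML."""
    return f"<p>{html.escape(text)}</p>"


def resolve_inside(base_dir: Path, user_path: str) -> Path:
    """Resolve a user-supplied path and reject anything outside base_dir."""
    base = Path(base_dir).resolve()
    # An absolute user_path replaces base here, so it fails the check below.
    candidate = (base / user_path).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError("path escapes the allowed directory")
    return candidate
```

A path-traversal test can then assert that inputs such as `../secret.txt` or an absolute path raise `ValueError`, while a relative path inside the base directory resolves normally.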
### 10. Verify The App
Minimum local verification:
```powershell
.venv\Scripts\python.exe -m pytest tests/unit/test_<domain>_reference_app.py -q
.venv\Scripts\ruff.exe check <domain> tests/unit/test_<domain>_reference_app.py --no-cache
.venv\Scripts\python.exe -m mypy <domain> tests/unit/test_<domain>_reference_app.py --cache-dir "$env:TEMP\openbmb-workbench-mypy-cache"
.venv\Scripts\python.exe -c "from <domain>.app import build_app; app=build_app(no_model=True); print(type(app).__name__)"
```
Before claiming it works:
- [ ] Run the standalone app.
- [ ] Generate screenshots.
- [ ] Add screenshot links to README/docs.
- [ ] Run full quality checks.
- [ ] Commit and push.
## When To Integrate Into The Main Workbench
Keep the domain app standalone if:
- it has its own brand/story,
- it needs a focused judging experience,
- it has domain-specific dependencies,
- it should become a Hugging Face Space.
Add it to the main workbench only if:
- it is a generic reusable tab,
- it does not add heavy dependencies,
- it strengthens the template for all future apps.
For the hackathon, standalone `plant/` is the better route because judges need one clear product.
## What "Done" Means For A Domain App
- [ ] Standalone no-model app builds.
- [ ] Optional real model adapter is documented and lazy-loaded.
- [ ] Golden path has tests.
- [ ] Corrections export to training data.
- [ ] Training is planned, not accidentally executed.
- [ ] Screenshots are generated.
- [ ] README explains setup, model choice, demo flow, and limitations.
- [ ] Space deployment is verified, or the blocker is documented.