Spaces:

build-small-hackathon
/

workbench

Running on Zero

App Files Files Community

workbench / docs /PLANT_DISCOVERY_APP_PLAN.md

GitHub Actions

Initial ZeroGPU deployment with spaces shim

7f9dfed 25 days ago

preview code

Raw

History Blame Contribute Delete

5.5 kB

A newer version of the Gradio SDK is available: 6.19.0

Upgrade

Plant Discovery Reference App Plan

This checklist turns the rough plant/ sample into the first concrete app built around the template. It is intentionally detailed so future apps can copy the pattern.

Product Story

Plant Discovery helps a gardener, teacher, or field-note collector:

Upload one or more plant images.
Get a structured plant identification.
Correct mistakes.
Browse a small field guide.
Export corrections as local training data.
Later fine-tune or evaluate a small vision model.

This is stronger than a generic workbench for hackathon judging because it has a real user outcome.

Current Implementation Status

Add plant/ package.
Add standalone plant/app.py.
Add clean plant/models.yaml.
Add deterministic no-model demo service.
Add optional MiniCPM-V service adapter.
Add explicit demo, openbmb, and finetuned model modes.
Make OpenBMB MiniCPM-V the default real model mode.
Add fine-tuned adapter loading path through PEFT when configured.
Avoid hard torch/transformers imports at module import time.
Add plant schema/parser through PlantID.
Add local species index builder.
Add local folder loader.
Add field-note export to plant training JSONL.
Add focused Gradio UI with Identify, Field Guide, Corrections, and Stats.
Replace direct training execution with non-executing training plan.
Add optional pure tool functions and lazy MCP server builder.
Add non-executing plant training planner.
Add Plant model/training how-to.
Add plant unit tests.
Add no-model app build verification.

Remaining Work

P0 - Make The Demo Judgeable

Run python -m plant.app --no-model --port 7861.
Install Node.js/npm.
Generate Playwright screenshots for the plant app.
Add Plant Discovery screenshots to README.
Add a 60-90 second demo script focused on plant correction.
Decide whether the hackathon Space should launch app.py or plant/app.py.
If using plant/app.py for Space, add a Space-specific entrypoint or README note.

P1 - Real Model Path

Install optional plant dependencies from plant/requirements.txt.
Verify PlantVisionService.dependency_report() is fully available.
Run one MiniCPM-V identification on a local public-safe plant image.
Capture model ID, hardware, latency, and output JSON.
Add a real-backend integration test profile that is skipped unless dependencies are present.
Document that model weights are not downloaded on startup.

P2 - Data And Correction Loop

Add a tiny public-safe sample plant image folder under ignored local data.
Add a sample plant/data/plantnet_labels.json cache or document how to create it.
Add path allowlist before public Space deployment.
Add JSONL schema documentation for exported corrections.
Add validation for minimum correction count before recommending training.

P3 - Template Extraction

Identify reusable parts that should move from plant/ to generic modules.
Keep domain-specific prompt/schema/UI inside plant/.
Consider domains/<name>/ convention if more than one reference app is added.
Add a cookiecutter-style checklist or scaffold script only after the second app proves the pattern.

P4 - Security And Public Mode

Add public/local mode config.
In public mode, disable arbitrary local file paths.
In public mode, disable arbitrary backend URLs.
Add max upload size and allowed image extension checks.
Add tests for path traversal and malformed image/data inputs.
Add a plant-specific disclaimer: identification can be wrong; do not use for medical, edible, or toxicity-critical decisions without expert verification.

P5 - Training And Evaluation

Collect at least 30 corrected examples before any LoRA experiment.
Split exported corrections into train/eval.
Add exact species-name evaluation.
Add before/after model comparison.
Run SWIFT or LLaMA-Factory locally only after dependencies and hardware are approved.
Publish adapter only if it improves the evaluation set.
Update plant/models.yaml with the real adapter repo.
Verify python -m plant.app --model-mode finetuned --port 7861.

Recommended Next Command Sequence

.venv\Scripts\python.exe -m pytest tests/unit/test_plant_reference_app.py -q
.venv\Scripts\ruff.exe check plant tests/unit/test_plant_reference_app.py --no-cache
.venv\Scripts\python.exe -m mypy plant tests/unit/test_plant_reference_app.py --cache-dir "$env:TEMP\openbmb-workbench-mypy-cache"
.venv\Scripts\python.exe -m plant.app --no-model --port 7861
.venv\Scripts\python.exe scripts\plan_plant_training.py --corrected-examples 30

For the deployed Plant Identification Tool Space, use plant_space_app.py. It launches OpenBMB MiniCPM-V mode and does not enable --no-model/demo mode.

Template Lessons From Plant Discovery

Domain app first screen should be the product, not the infrastructure.
Demo/no-model mode is essential for tests and screenshots.
Heavy model dependencies should be optional.
Correction loops are more valuable than promising immediate fine-tuning.
Tool/MCP functions should be pure and importable without a running server.
Training should be a plan until data, hardware, and dependencies are verified.