Spaces:
Sleeping
Sleeping
A newer version of the Gradio SDK is available: 6.19.0
Full Task Checklist
This is the shared task list for you and Codex. It covers the hackathon MVP, the main PRD, and
the extension PRD. A task is complete only when the matching acceptance criteria are met and
docs/IMPLEMENTATION_STATUS.md is updated.
Legend
[x]done and documented[~]partially implemented or placeholder exists[ ]not started[blocked]blocked by missing local setup, credentials, hardware, or external decision
Phase 0 - Project Memory And Setup
- Add root
README.md. - Add root
AGENTS.md. - Add
.gitignore. - Add
requirements.txt. - Add
docs/folder. - Add docs index.
- Add full task checklist.
- Add implementation status doc.
- Add usage guide.
- Add architecture guide.
- Add extension guide.
- Add acceptance criteria.
- Add roadmap.
- Add critical judge-oriented improvement roadmap.
- Add template how-to for building new domain apps.
- Add Plant Discovery reference app checklist.
- Add Plant Discovery model and training how-to.
- Add PRD implementation matrix.
- Add test folder.
- Add user-story test folder.
- Add dev requirements.
- Add Python quality config.
- Add test runner script.
- Add quality runner script.
- Add CI workflow.
- Add coverage gate.
- Add performance test script.
- Install Python 3.11+.
- Verify
python --version. - Create
.venv. - Install dependencies.
- Run
python app.py. - Capture screenshot or note local URL.
Phase 1 - Hackathon Definition
- Choose track: Backyard AI or Thousand Token Wood.
- Write one-sentence project story.
- Define target user.
- Define measurable user benefit.
- Decide final model family and model IDs.
- Confirm every model is <= 32B parameters.
- Decide local-first badge target.
- Decide llama.cpp badge target.
- Decide open trace badge target.
- Decide field notes/report badge target.
- Write final demo flow.
- Write demo video script.
- Write social post draft.
- Add final submission checklist with exact URLs.
Phase 2 - MVP Gradio App
- Add
app.py. - Add Gradio
Blocksshell. - Add model config loader.
- Add model metadata display.
- Add Chat tab.
- Add Vision tab.
- Add Dataset tab placeholder.
- Add Train tab placeholder.
- Add Export tab placeholder.
- Add Field Notes tab.
- Add placeholder text service.
- Add placeholder vision service.
- Add Traces tab placeholder.
- Add Agent tab placeholder.
- Add Status tab placeholder.
- Add PowerShell structure verification script.
- Run structure verification script.
- Run app locally.
- Fix local launch errors found so far.
- Add screenshot capture path to docs or README.
- Add first demo GIF/video plan.
Phase 3 - Config-Driven Model Registry
- Add
config/models.yaml. - Add text model entry for MiniCPM5-1B.
- Add vision model entry for MiniCPM-V-4.6.
- Add omnimodal model entry for MiniCPM-o-4.5.
- Add typed
ModelInfo. - Add
load_model_catalog(). - Add
model_choices(). - Add
model_summary(). - Add MiniCPM5-1B-Thinking config.
- Add MiniCPM4.1-8B config.
- Add MiniCPM-V-4.6-Thinking config.
- Add GGUF metadata in config.
- Add backend capability metadata.
- Add lightweight catalog validation helper.
- Show warnings for models over 32B parameters.
Phase 4 - Core Architecture
- Add
core/events.py. - Add
EventType. - Add
Event. - Add
EventBus. - Add
core/registry.py. - Add generic
Registry. - Add global app state.
- Register model services in a service registry.
- Emit inference events from UI.
- Emit field note events.
- Add lightweight logging.
- Add unit tests for config and registry.
Phase 5 - Testing And Quality
- Add
tests/unit/. - Add
tests/user_stories/. - Add model catalog unit tests.
- Add field notes unit tests.
- Add new-user user-story test.
- Add
requirements-dev.txt. - Add
pyproject.toml. - Add
scripts/run_tests.ps1. - Add
scripts/run_quality.ps1. - Run unit and user-story tests.
- Install dev quality tools.
- Run
ruff. - Run
mypy. - Run
pylint. - Run
bandit. - Run
pip-audit. - Add rule: failing bug/check requires a new or updated test.
- Add coverage report.
- Add lightweight performance tests.
- Add CI pipeline.
- Add Playwright or equivalent browser e2e test after Gradio runs.
- Add tests for each real backend as it is implemented.
- Add tests for backend service selection.
- Add tests for Ollama unavailable path.
- Add tests for llama.cpp unavailable path and command building.
- Add tests for llama-cpp-python unavailable path.
- Add tests for OpenAI-compatible/LM Studio unavailable and request paths.
Phase 6 - Local Inference Backends
- Choose first real backend.
- Add backend selector in UI.
- Add model status panel.
- Add explicit model load button.
- Ensure no model weights download on startup.
Ollama Backend
- Confirm Ollama is installed.
- Add
models/ollama_service.py. - Add local model list.
- Add pull model command with explicit user action.
- Add text chat through Ollama.
- Add vision chat through Ollama when supported.
- Document Ollama setup.
llama.cpp Backend
- Confirm llama.cpp tools are installed.
- Add
models/llama_cpp_service.py. - Add
models/llama_cpp_python_service.py. - Add GGUF file picker.
- Add
llama-serverlaunch command builder. - Add health check.
- Add text generation through server.
- Add vision
mmprojsupport metadata. - Document llama.cpp setup.
llama-cpp-python Backend
- Add optional Python binding service.
- Add backend selector support.
- Install
llama-cpp-pythonlocally. - Configure local GGUF path.
- Verify real text generation through Python binding.
- Decide whether to keep Python binding as fallback or primary local path.
Transformers Backend
- Add
models/transformers_text.py. - Add
AutoModelForCausalLMloading for text models. - Add tokenizer loading.
- Add explicit trust-remote-code handling.
- Add device/dtype settings.
- Add streaming generation.
- Document hardware expectations.
OpenAI-Compatible / LM Studio Backend
- Add
models/openai_compatible_service.py. - Add backend selector support.
- Add local base URL and served-model-name config.
- Add Status tab setup and reachability check.
- Add text chat through OpenAI-compatible
/v1/chat/completions. - Document LM Studio setup.
- Verify real text generation through LM Studio.
MiniCPM Vision Backend
- Add
models/minicpm_vision.py. - Use
AutoModelForImageTextToText. - Use
AutoProcessor. - Add image prompt formatting.
- Add thinking-mode toggle mapping.
- Add video support plan.
SGLang Backend
- Add
models/sglang_runner.py. - Add server start/stop.
- Add MiniCPM5 tool parser config.
- Add health check.
- Add chat endpoint client.
- Install
sglanglocally.
Phase 7 - UI Tabs From Main PRD
- Chat tab placeholder.
- Vision tab placeholder.
- Dataset tab placeholder.
- Train tab placeholder.
- Export tab placeholder.
- Field Notes tab minimal save.
- Add Traces tab with local event preview.
- Add Agent tab placeholder.
- Add model/backend status tab or panel.
- Add settings panel.
- Add tab-level error messages.
- Add loading/progress states.
- Add compact responsive layout review.
Phase 8 - Dataset Layer
- Add
datasets/package. - Add local CSV loader.
- Add local JSONL loader.
- Add Hugging Face dataset loader.
- Add dataset schema preview.
- Add split selector.
- Add row count and sample preview.
- Add dataset statistics tool.
- Emit
DATASET_LOADEDevent. - Document dataset formats.
Phase 9 - Field Notes And Correction Loop
- Save field notes to CSV.
- Move field note logic out of UI into
datasets/field_notes.py. - Add
FieldNotedataclass. - Add SQLite-backed store.
- Add JSONL export.
- Add local HF Dataset export.
- Add corrected-only filter.
- Add tags filter.
- Add image path support.
- Add video path support.
- Add use-for-training flag.
- Add docs for correction loop.
Phase 10 - Training Pipeline
- Add training config placeholder.
- Add training UI placeholder.
- Add
training/package. - Add LoRA text trainer.
- Add LoRA config parser.
- Add PEFT/TRL dependencies when ready.
- Add training dry-run validation.
- Add local checkpoint output.
- Add Trackio integration.
- Add evaluation after training.
- Add LoRA vs base comparison.
- Add vision fine-tuning plan using SWIFT or LLaMA-Factory.
- Document training hardware requirements.
Phase 11 - Evaluation
- Add
training/evaluation.py. - Add simple prompt test set.
- Add exact-match metric.
- Add qualitative eval table.
- Add perplexity metric where appropriate.
- Add base vs tuned comparison.
- Log eval results.
- Document evaluation method.
Phase 12 - Export And Quantization
- Add export UI placeholder.
- Add
training/export.py. - Add official GGUF download path.
- Add local HF-to-GGUF conversion path.
- Add quantization selector.
- Add llama.cpp tool detection.
- Add exported file listing.
- Add download link in UI.
- Document GGUF export.
Phase 13 - Trackio Tracing
- Add
tracking/package. - Add Trackio config.
- Add
trackio.init(). - Add
trackio.log(). - Add
trackio.finish(). - Log inference events locally.
- Log dataset events locally.
- Log training metrics.
- Add Traces tab.
- Add HF Space sync docs.
Phase 13 - MCP Layer
- Decide MCP path: Gradio native,
gradio.Server - Add MCP tools module.
- Add dataset stats tool.
- Add HF search tool.
- Add safe calculator tool.
- Add model inference tool.
- Expose tools through selected MCP path.
- Document MCP endpoint.
- Verify endpoint locally.
Phase 14 - Agent Mode
- Add
agent/package. - Add agent system prompt.
- Add research-plan-implement loop placeholder.
- Add tool registry integration.
- Add session trace logging.
- Add Agent tab.
- Add trace export to JSONL.
- Add local HF Dataset export for traces.
- Document limitations.
Phase 15 - Hugging Face Space Deployment
- Install/verify
huggingface_hub. - Login with
hf auth login. - Create Space.
- Add Space README metadata if needed.
- Add Space remote.
- Push to Space.
- Verify Space builds.
- Add Space URL to README.
- Document hardware choice.
- Document model download behavior.
Phase 16 - GitHub
- Create GitHub repo.
- Add GitHub remote.
- Commit initial project.
- Push to GitHub.
- Add GitHub URL to README.
- Add issue checklist or project board if desired.
Phase 17 - Hackathon Submission Package
- Finalize app name.
- Finalize track.
- Verify Gradio app polish.
- Verify model-size compliance.
- Verify Space URLs.
- Verify GitHub URL.
- Record demo video.
- Publish social post.
- Add field notes/report link.
- Submit before June 15, 2026.
Extension PRD Backlog
vLLM Serving Tab
- Add vLLM runner.
- Add vLLM start/stop UI.
- Add OpenAI-compatible client.
- Add metrics parsing.
- Add Trackio benchmark logging.
Ollama Quick-Start
- Add Ollama pull/list UI.
- Add Ollama chat service.
- Add Ollama vision service.
- Add setup docs.
Llama.cpp Champion Path
- Add llama.cpp backend selection.
- Add llama.cpp service.
- Add llama-cpp-python service.
- Add llama.cpp status check.
- Install llama.cpp locally.
- Download/pick GGUF model.
- Verify real text generation.
- Verify MiniCPM-V mmproj flow.
Reward Model Eval
- Add reward evaluator.
- Add best-of-N generation.
- Add DPO pair generation.
- Add LoRA vs base reward report.
Synthetic Data Generation
- Add synthetic generator.
- Add JSON validation.
- Add quality filters.
- Add augmentation flow.
- Add dataset save/export.
Paper-To-Code Agent
- Add paper input UI.
- Add research phase.
- Add plan phase.
- Add implementation trace.
- Add safety gates.
HF Spaces Deploy Tool
- Add deployment helper script.
- Add Space creation docs.
- Add remote validation.
- Add build status checks.
VINDEX Integration
- Define integration boundary.
- Add tool stub.
- Add verification report.
- Document dependency.
OCR Pipeline Hook
- Add OCR loader.
- Add confidence threshold.
- Add uncertain prediction import.
- Add correction UI.
- Add corrected export.
MiniCPM Desk-Pet
- Add persona data schema.
- Add persona training plan.
- Add Desk-Pet export plan.
- Add docs.
MiniCPM-o Audio Tab
- Add audio tab.
- Add microphone input.
- Add omnimodal service.
- Add TTS plan.
- Add streaming plan.
Cross-Extension Wiring
- Document OCR -> Field Notes -> Training.
- Document Synthetic Gen -> Reward Eval -> DPO.
- Document Agent -> Desk-Pet Persona.
- Document HF Spaces -> Trackio.
Phase 18 - Template And Reference Apps
Template How-To
- Document branch strategy for new domain apps.
- Document required domain app file contract.
- Document schema, service, loader, UI, tools, tests, and docs pattern.
- Document no-model/demo-mode requirement.
- Document correction-loop-first workflow.
- Document optional training and real-model verification steps.
- Document security requirements for public Space mode.
Plant Discovery Reference App
- Add
plant/package. - Add standalone Plant Discovery Gradio entrypoint.
- Add clean plant model/domain config.
- Add deterministic no-model plant service.
- Add optional MiniCPM-V plant service adapter.
- Make OpenBMB MiniCPM-V the default real model mode.
- Add explicit demo/openbmb/finetuned runtime modes.
- Add optional fine-tuned adapter loading path.
- Keep optional model dependencies lazy.
- Add plant structured result schema and parser.
- Add species index builder.
- Add local image-folder loader.
- Add field-note correction export to plant training JSONL.
- Add focused Identify, Field Guide, Corrections, and Stats UI.
- Replace direct training execution with non-executing training plan.
- Add optional plant tool functions with lazy MCP server construction.
- Add non-executing plant training planner.
- Add
scripts/plan_plant_training.py. - Add Plant Discovery unit tests.
- Verify no-model app shell builds.
- Run Plant Discovery as a long-running local app.
- Generate Plant Discovery screenshots.
- Add Plant Discovery screenshots to README/docs.
- Decide whether hackathon Space launches root workbench or Plant Discovery app.
- Verify real MiniCPM-V plant identification with optional dependencies.
- Train or configure a real Plant Discovery adapter.
- Verify
--model-mode finetunedwith the real adapter. - Add public-mode file/path/url hardening before Space deployment.
Ongoing Maintenance
- Update docs after every implemented feature.
- Keep
IMPLEMENTATION_STATUS.mdcurrent. - Keep unchecked tasks visible.
- Keep secrets and model weights out of git.
- Re-run local app after code changes.