workbench / docs /TASKS.md
GitHub Actions
Initial ZeroGPU deployment with spaces shim
7f9dfed
|
Raw
History Blame Contribute Delete
16.1 kB

A newer version of the Gradio SDK is available: 6.19.0

Upgrade

Full Task Checklist

This is the shared task list for you and Codex. It covers the hackathon MVP, the main PRD, and the extension PRD. A task is complete only when the matching acceptance criteria are met and docs/IMPLEMENTATION_STATUS.md is updated.

Legend

  • [x] done and documented
  • [~] partially implemented or placeholder exists
  • [ ] not started
  • [blocked] blocked by missing local setup, credentials, hardware, or external decision

Phase 0 - Project Memory And Setup

  • Add root README.md.
  • Add root AGENTS.md.
  • Add .gitignore.
  • Add requirements.txt.
  • Add docs/ folder.
  • Add docs index.
  • Add full task checklist.
  • Add implementation status doc.
  • Add usage guide.
  • Add architecture guide.
  • Add extension guide.
  • Add acceptance criteria.
  • Add roadmap.
  • Add critical judge-oriented improvement roadmap.
  • Add template how-to for building new domain apps.
  • Add Plant Discovery reference app checklist.
  • Add Plant Discovery model and training how-to.
  • Add PRD implementation matrix.
  • Add test folder.
  • Add user-story test folder.
  • Add dev requirements.
  • Add Python quality config.
  • Add test runner script.
  • Add quality runner script.
  • Add CI workflow.
  • Add coverage gate.
  • Add performance test script.
  • Install Python 3.11+.
  • Verify python --version.
  • Create .venv.
  • Install dependencies.
  • Run python app.py.
  • Capture screenshot or note local URL.

Phase 1 - Hackathon Definition

  • Choose track: Backyard AI or Thousand Token Wood.
  • Write one-sentence project story.
  • Define target user.
  • Define measurable user benefit.
  • Decide final model family and model IDs.
  • Confirm every model is <= 32B parameters.
  • Decide local-first badge target.
  • Decide llama.cpp badge target.
  • Decide open trace badge target.
  • Decide field notes/report badge target.
  • Write final demo flow.
  • Write demo video script.
  • Write social post draft.
  • Add final submission checklist with exact URLs.

Phase 2 - MVP Gradio App

  • Add app.py.
  • Add Gradio Blocks shell.
  • Add model config loader.
  • Add model metadata display.
  • Add Chat tab.
  • Add Vision tab.
  • Add Dataset tab placeholder.
  • Add Train tab placeholder.
  • Add Export tab placeholder.
  • Add Field Notes tab.
  • Add placeholder text service.
  • Add placeholder vision service.
  • Add Traces tab placeholder.
  • Add Agent tab placeholder.
  • Add Status tab placeholder.
  • Add PowerShell structure verification script.
  • Run structure verification script.
  • Run app locally.
  • Fix local launch errors found so far.
  • Add screenshot capture path to docs or README.
  • Add first demo GIF/video plan.

Phase 3 - Config-Driven Model Registry

  • Add config/models.yaml.
  • Add text model entry for MiniCPM5-1B.
  • Add vision model entry for MiniCPM-V-4.6.
  • Add omnimodal model entry for MiniCPM-o-4.5.
  • Add typed ModelInfo.
  • Add load_model_catalog().
  • Add model_choices().
  • Add model_summary().
  • Add MiniCPM5-1B-Thinking config.
  • Add MiniCPM4.1-8B config.
  • Add MiniCPM-V-4.6-Thinking config.
  • Add GGUF metadata in config.
  • Add backend capability metadata.
  • Add lightweight catalog validation helper.
  • Show warnings for models over 32B parameters.

Phase 4 - Core Architecture

  • Add core/events.py.
  • Add EventType.
  • Add Event.
  • Add EventBus.
  • Add core/registry.py.
  • Add generic Registry.
  • Add global app state.
  • Register model services in a service registry.
  • Emit inference events from UI.
  • Emit field note events.
  • Add lightweight logging.
  • Add unit tests for config and registry.

Phase 5 - Testing And Quality

  • Add tests/unit/.
  • Add tests/user_stories/.
  • Add model catalog unit tests.
  • Add field notes unit tests.
  • Add new-user user-story test.
  • Add requirements-dev.txt.
  • Add pyproject.toml.
  • Add scripts/run_tests.ps1.
  • Add scripts/run_quality.ps1.
  • Run unit and user-story tests.
  • Install dev quality tools.
  • Run ruff.
  • Run mypy.
  • Run pylint.
  • Run bandit.
  • Run pip-audit.
  • Add rule: failing bug/check requires a new or updated test.
  • Add coverage report.
  • Add lightweight performance tests.
  • Add CI pipeline.
  • Add Playwright or equivalent browser e2e test after Gradio runs.
  • Add tests for each real backend as it is implemented.
  • Add tests for backend service selection.
  • Add tests for Ollama unavailable path.
  • Add tests for llama.cpp unavailable path and command building.
  • Add tests for llama-cpp-python unavailable path.
  • Add tests for OpenAI-compatible/LM Studio unavailable and request paths.

Phase 6 - Local Inference Backends

  • Choose first real backend.
  • Add backend selector in UI.
  • Add model status panel.
  • Add explicit model load button.
  • Ensure no model weights download on startup.

Ollama Backend

  • Confirm Ollama is installed.
  • Add models/ollama_service.py.
  • Add local model list.
  • Add pull model command with explicit user action.
  • Add text chat through Ollama.
  • Add vision chat through Ollama when supported.
  • Document Ollama setup.

llama.cpp Backend

  • Confirm llama.cpp tools are installed.
  • Add models/llama_cpp_service.py.
  • Add models/llama_cpp_python_service.py.
  • Add GGUF file picker.
  • Add llama-server launch command builder.
  • Add health check.
  • Add text generation through server.
  • Add vision mmproj support metadata.
  • Document llama.cpp setup.

llama-cpp-python Backend

  • Add optional Python binding service.
  • Add backend selector support.
  • Install llama-cpp-python locally.
  • Configure local GGUF path.
  • Verify real text generation through Python binding.
  • Decide whether to keep Python binding as fallback or primary local path.

Transformers Backend

  • Add models/transformers_text.py.
  • Add AutoModelForCausalLM loading for text models.
  • Add tokenizer loading.
  • Add explicit trust-remote-code handling.
  • Add device/dtype settings.
  • Add streaming generation.
  • Document hardware expectations.

OpenAI-Compatible / LM Studio Backend

  • Add models/openai_compatible_service.py.
  • Add backend selector support.
  • Add local base URL and served-model-name config.
  • Add Status tab setup and reachability check.
  • Add text chat through OpenAI-compatible /v1/chat/completions.
  • Document LM Studio setup.
  • Verify real text generation through LM Studio.

MiniCPM Vision Backend

  • Add models/minicpm_vision.py.
  • Use AutoModelForImageTextToText.
  • Use AutoProcessor.
  • Add image prompt formatting.
  • Add thinking-mode toggle mapping.
  • Add video support plan.

SGLang Backend

  • Add models/sglang_runner.py.
  • Add server start/stop.
  • Add MiniCPM5 tool parser config.
  • Add health check.
  • Add chat endpoint client.
  • Install sglang locally.

Phase 7 - UI Tabs From Main PRD

  • Chat tab placeholder.
  • Vision tab placeholder.
  • Dataset tab placeholder.
  • Train tab placeholder.
  • Export tab placeholder.
  • Field Notes tab minimal save.
  • Add Traces tab with local event preview.
  • Add Agent tab placeholder.
  • Add model/backend status tab or panel.
  • Add settings panel.
  • Add tab-level error messages.
  • Add loading/progress states.
  • Add compact responsive layout review.

Phase 8 - Dataset Layer

  • Add datasets/ package.
  • Add local CSV loader.
  • Add local JSONL loader.
  • Add Hugging Face dataset loader.
  • Add dataset schema preview.
  • Add split selector.
  • Add row count and sample preview.
  • Add dataset statistics tool.
  • Emit DATASET_LOADED event.
  • Document dataset formats.

Phase 9 - Field Notes And Correction Loop

  • Save field notes to CSV.
  • Move field note logic out of UI into datasets/field_notes.py.
  • Add FieldNote dataclass.
  • Add SQLite-backed store.
  • Add JSONL export.
  • Add local HF Dataset export.
  • Add corrected-only filter.
  • Add tags filter.
  • Add image path support.
  • Add video path support.
  • Add use-for-training flag.
  • Add docs for correction loop.

Phase 10 - Training Pipeline

  • Add training config placeholder.
  • Add training UI placeholder.
  • Add training/ package.
  • Add LoRA text trainer.
  • Add LoRA config parser.
  • Add PEFT/TRL dependencies when ready.
  • Add training dry-run validation.
  • Add local checkpoint output.
  • Add Trackio integration.
  • Add evaluation after training.
  • Add LoRA vs base comparison.
  • Add vision fine-tuning plan using SWIFT or LLaMA-Factory.
  • Document training hardware requirements.

Phase 11 - Evaluation

  • Add training/evaluation.py.
  • Add simple prompt test set.
  • Add exact-match metric.
  • Add qualitative eval table.
  • Add perplexity metric where appropriate.
  • Add base vs tuned comparison.
  • Log eval results.
  • Document evaluation method.

Phase 12 - Export And Quantization

  • Add export UI placeholder.
  • Add training/export.py.
  • Add official GGUF download path.
  • Add local HF-to-GGUF conversion path.
  • Add quantization selector.
  • Add llama.cpp tool detection.
  • Add exported file listing.
  • Add download link in UI.
  • Document GGUF export.

Phase 13 - Trackio Tracing

  • Add tracking/ package.
  • Add Trackio config.
  • Add trackio.init().
  • Add trackio.log().
  • Add trackio.finish().
  • Log inference events locally.
  • Log dataset events locally.
  • Log training metrics.
  • Add Traces tab.
  • Add HF Space sync docs.

Phase 13 - MCP Layer

  • Decide MCP path: Gradio native, gradio.Server
  • Add MCP tools module.
  • Add dataset stats tool.
  • Add HF search tool.
  • Add safe calculator tool.
  • Add model inference tool.
  • Expose tools through selected MCP path.
  • Document MCP endpoint.
  • Verify endpoint locally.

Phase 14 - Agent Mode

  • Add agent/ package.
  • Add agent system prompt.
  • Add research-plan-implement loop placeholder.
  • Add tool registry integration.
  • Add session trace logging.
  • Add Agent tab.
  • Add trace export to JSONL.
  • Add local HF Dataset export for traces.
  • Document limitations.

Phase 15 - Hugging Face Space Deployment

  • Install/verify huggingface_hub.
  • Login with hf auth login.
  • Create Space.
  • Add Space README metadata if needed.
  • Add Space remote.
  • Push to Space.
  • Verify Space builds.
  • Add Space URL to README.
  • Document hardware choice.
  • Document model download behavior.

Phase 16 - GitHub

  • Create GitHub repo.
  • Add GitHub remote.
  • Commit initial project.
  • Push to GitHub.
  • Add GitHub URL to README.
  • Add issue checklist or project board if desired.

Phase 17 - Hackathon Submission Package

  • Finalize app name.
  • Finalize track.
  • Verify Gradio app polish.
  • Verify model-size compliance.
  • Verify Space URLs.
  • Verify GitHub URL.
  • Record demo video.
  • Publish social post.
  • Add field notes/report link.
  • Submit before June 15, 2026.

Extension PRD Backlog

vLLM Serving Tab

  • Add vLLM runner.
  • Add vLLM start/stop UI.
  • Add OpenAI-compatible client.
  • Add metrics parsing.
  • Add Trackio benchmark logging.

Ollama Quick-Start

  • Add Ollama pull/list UI.
  • Add Ollama chat service.
  • Add Ollama vision service.
  • Add setup docs.

Llama.cpp Champion Path

  • Add llama.cpp backend selection.
  • Add llama.cpp service.
  • Add llama-cpp-python service.
  • Add llama.cpp status check.
  • Install llama.cpp locally.
  • Download/pick GGUF model.
  • Verify real text generation.
  • Verify MiniCPM-V mmproj flow.

Reward Model Eval

  • Add reward evaluator.
  • Add best-of-N generation.
  • Add DPO pair generation.
  • Add LoRA vs base reward report.

Synthetic Data Generation

  • Add synthetic generator.
  • Add JSON validation.
  • Add quality filters.
  • Add augmentation flow.
  • Add dataset save/export.

Paper-To-Code Agent

  • Add paper input UI.
  • Add research phase.
  • Add plan phase.
  • Add implementation trace.
  • Add safety gates.

HF Spaces Deploy Tool

  • Add deployment helper script.
  • Add Space creation docs.
  • Add remote validation.
  • Add build status checks.

VINDEX Integration

  • Define integration boundary.
  • Add tool stub.
  • Add verification report.
  • Document dependency.

OCR Pipeline Hook

  • Add OCR loader.
  • Add confidence threshold.
  • Add uncertain prediction import.
  • Add correction UI.
  • Add corrected export.

MiniCPM Desk-Pet

  • Add persona data schema.
  • Add persona training plan.
  • Add Desk-Pet export plan.
  • Add docs.

MiniCPM-o Audio Tab

  • Add audio tab.
  • Add microphone input.
  • Add omnimodal service.
  • Add TTS plan.
  • Add streaming plan.

Cross-Extension Wiring

  • Document OCR -> Field Notes -> Training.
  • Document Synthetic Gen -> Reward Eval -> DPO.
  • Document Agent -> Desk-Pet Persona.
  • Document HF Spaces -> Trackio.

Phase 18 - Template And Reference Apps

Template How-To

  • Document branch strategy for new domain apps.
  • Document required domain app file contract.
  • Document schema, service, loader, UI, tools, tests, and docs pattern.
  • Document no-model/demo-mode requirement.
  • Document correction-loop-first workflow.
  • Document optional training and real-model verification steps.
  • Document security requirements for public Space mode.

Plant Discovery Reference App

  • Add plant/ package.
  • Add standalone Plant Discovery Gradio entrypoint.
  • Add clean plant model/domain config.
  • Add deterministic no-model plant service.
  • Add optional MiniCPM-V plant service adapter.
  • Make OpenBMB MiniCPM-V the default real model mode.
  • Add explicit demo/openbmb/finetuned runtime modes.
  • Add optional fine-tuned adapter loading path.
  • Keep optional model dependencies lazy.
  • Add plant structured result schema and parser.
  • Add species index builder.
  • Add local image-folder loader.
  • Add field-note correction export to plant training JSONL.
  • Add focused Identify, Field Guide, Corrections, and Stats UI.
  • Replace direct training execution with non-executing training plan.
  • Add optional plant tool functions with lazy MCP server construction.
  • Add non-executing plant training planner.
  • Add scripts/plan_plant_training.py.
  • Add Plant Discovery unit tests.
  • Verify no-model app shell builds.
  • Run Plant Discovery as a long-running local app.
  • Generate Plant Discovery screenshots.
  • Add Plant Discovery screenshots to README/docs.
  • Decide whether hackathon Space launches root workbench or Plant Discovery app.
  • Verify real MiniCPM-V plant identification with optional dependencies.
  • Train or configure a real Plant Discovery adapter.
  • Verify --model-mode finetuned with the real adapter.
  • Add public-mode file/path/url hardening before Space deployment.

Ongoing Maintenance

  • Update docs after every implemented feature.
  • Keep IMPLEMENTATION_STATUS.md current.
  • Keep unchecked tasks visible.
  • Keep secrets and model weights out of git.
  • Re-run local app after code changes.