Spaces:
Sleeping
Sleeping
| # Architecture | |
| The project is intentionally small at first. The PRD describes a large workbench; this repo starts | |
| with the smallest version that can grow into it. | |
| ## High-Level Flow | |
| ```text | |
| app.py | |
| loads config/models.yaml | |
| configures lightweight logging | |
| builds Gradio tabs | |
| passes model catalog to UI modules | |
| ui/* | |
| defines each Gradio tab | |
| calls service classes | |
| emits local app events for inference, datasets, and field notes | |
| uses shared progress settings for callback loading indicators | |
| agent/* | |
| holds deterministic local agent planning and trace export helpers | |
| models/* | |
| holds model catalog, local backend config, and inference services | |
| datasets/* | |
| stores dataset, synthetic data, and correction-loop helpers | |
| mcp_tools/* | |
| holds local tool functions, VINDEX call planning, and Gradio-native MCP bridge metadata | |
| config/* | |
| holds model and training settings | |
| training/* | |
| holds non-executing training, LoRA request, evaluation, and export planning helpers | |
| tracking/* | |
| holds local JSONL tracing and optional Trackio integration | |
| deployment/* | |
| holds Hugging Face Space deployment planning and validation helpers | |
| plant/* | |
| holds the first reference domain app built from the template | |
| can run standalone with python -m plant.app --no-model | |
| keeps heavy model dependencies optional | |
| core/* | |
| shared app state, event, logging, and registry helpers | |
| ``` | |
| ## Files And Classes | |
| ### `app.py` | |
| Builds and launches the Gradio app. | |
| - `build_app()` creates the Gradio `Blocks` app. | |
| - Loads the model catalog from `config/models.yaml`. | |
| - Registers the current UI tabs. | |
| - `APP_CSS` defines compact responsive layout rules for app width, mobile padding, scrollable tabs, | |
| and button touch targets. | |
| ### `plant/app.py` | |
| Standalone Plant Discovery reference app built from the template. | |
| - `build_app(no_model=True)` creates a Gradio app without loading model weights. | |
| - Loads `plant/models.yaml`. | |
| - Builds a local species index. | |
| - Reuses `datasets.field_notes.FieldNoteStore` for corrections. | |
| - Uses `DemoPlantVisionService` for screenshots/tests or `PlantVisionService` for OpenBMB | |
| MiniCPM-V zero-shot and fine-tuned adapter inference. | |
| ### `plant/plant_service.py` | |
| Domain service and schema for Plant Discovery. | |
| - `PlantID` is the structured output schema. | |
| - `DemoPlantVisionService` provides deterministic no-model results. | |
| - `PlantVisionService` lazy-loads optional MiniCPM-V dependencies only during identification. | |
| - `PlantVisionService.from_config(..., "plant_vlm_finetuned")` can load a PEFT adapter after a real | |
| adapter repo is configured. | |
| - `extract_json_object()` and `parse_plant_response()` make model JSON output testable. | |
| ### `plant/training.py` | |
| Non-executing training planner for Plant Discovery. | |
| - `build_plant_training_plan()` returns SWIFT and LLaMA-Factory command previews. | |
| - `plant_training_dependency_report()` reports optional training dependency availability. | |
| - `write_llamafactory_dataset_info()` writes a dataset-info preview for LLaMA-Factory workflows. | |
| - Training is never started by the Gradio UI or script. | |
| ### `plant/plant_loader.py` | |
| Domain data and export helpers for Plant Discovery. | |
| - `PlantRecord` normalizes plant examples into training rows. | |
| - `LocalFolderLoader` maps species folders to image metadata. | |
| - `SpeciesIndexBuilder` builds a no-network species index with demo fallback. | |
| - `FieldNotesPlantExporter` exports corrected field notes to plant training JSONL. | |
| ### `plant/plant_tab.py` | |
| Focused Gradio UI for Plant Discovery. | |
| - Identify tab uploads images and renders a safe escaped result card. | |
| - Field Guide tab searches the species index. | |
| - Corrections tab saves and exports training-ready corrections. | |
| - Stats tab summarizes species and correction counts. | |
| - Training is represented as a non-executing plan, not a subprocess. | |
| ### `plant/plant_tools.py` | |
| Optional local/MCP tools for Plant Discovery. | |
| - Pure functions can be tested without an MCP server. | |
| - `build_mcp_server()` imports `mcp` only when explicitly requested. | |
| - Tools expose identify, species search, correction save/export, stats, and training plan. | |
| ### `models/model_catalog.py` | |
| Reads model configuration and turns it into typed Python objects. | |
| - `ModelInfo` describes one configured model. | |
| - `load_model_catalog(path)` reads YAML and returns all configured models. | |
| - `model_choices(catalog, model_type)` filters models for a UI dropdown. | |
| - `model_summary(model)` returns display metadata for the Gradio JSON panel. | |
| - `backend_capabilities` maps each model to supported local backend capabilities. | |
| ### `models/placeholder_service.py` | |
| Deterministic placeholder model service used before real inference is wired. | |
| - `PlaceholderModelService.chat()` returns a deterministic text response. | |
| - `PlaceholderModelService.vision_chat()` returns a deterministic image/prompt response. | |
| This file should be replaced or complemented by real services such as: | |
| - `ollama_service.py` | |
| - `llama_cpp_service.py` | |
| - `openai_compatible_service.py` | |
| - `sglang_runner.py` | |
| - `minicpm_vision.py` | |
| - `transformers_text.py` | |
| - `sglang_service.py` | |
| ### `models/base.py` | |
| Defines service contracts and backend status records. | |
| - `BackendStatus` describes whether a backend is available. | |
| - `TextModelService` is the text chat protocol. | |
| - `VisionModelService` is the vision chat protocol. | |
| ### `models/ollama_service.py` | |
| Ollama-backed local inference client. | |
| - Checks whether `ollama` is installed and reachable. | |
| - Sends text and vision chat requests to `http://127.0.0.1:11434/api/chat`. | |
| - Lists locally available Ollama models through `/api/tags`. | |
| - Builds explicit `ollama pull <model>` commands for the Status tab. | |
| - Does not pull or download models automatically. | |
| ### `models/llama_cpp_service.py` | |
| llama.cpp HTTP client for local GGUF inference. | |
| - Checks whether `llama-server` is installed and reachable. | |
| - Builds explicit `llama-server -m <model.gguf>` commands. | |
| - Supports `--mmproj <mmproj.gguf>` command metadata for multimodal models. | |
| - Sends text chat requests to `/v1/chat/completions`. | |
| - Does not download GGUF files or start background servers automatically. | |
| ### `models/local_backend_config.py` | |
| User-local backend settings stored under ignored `data/local_backends.yaml`. | |
| - `LocalBackendConfig` stores llama.cpp server URL, OpenAI-compatible base URL, optional served | |
| model name, GGUF path, mmproj path, context length, and GPU layers. | |
| - `save_local_backend_config()` writes local-only settings without touching tracked model config. | |
| - `build_llama_server_command()` returns the explicit command the user can run. | |
| - `local_backend_summary()` reports file status and confirms no startup downloads or automatic model loads. | |
| ### `models/openai_compatible_service.py` | |
| Local OpenAI-compatible chat client for LM Studio, vLLM-style servers, or similar local endpoints. | |
| - Checks `/v1/models` for reachability. | |
| - Sends text chat requests to `/v1/chat/completions`. | |
| - Supports an optional served-model-name override for tools such as LM Studio. | |
| - Returns visible unavailable/request-failed messages instead of crashing the Gradio callback. | |
| - Does not call cloud APIs or download model weights. | |
| ### `models/llama_cpp_python_service.py` | |
| Optional direct Python binding backend for GGUF inference. | |
| - Checks whether `llama_cpp` is importable. | |
| - Requires an explicit local GGUF path. | |
| - Does not download model files. | |
| - Provides text chat through `Llama.create_chat_completion()`. | |
| - Vision support remains routed through llama-server until mmproj/image serialization is wired. | |
| ### `models/minicpm_vision.py` | |
| Optional MiniCPM vision backend. | |
| - Checks whether the `transformers` package is available. | |
| - Lazy-loads `AutoProcessor` and `AutoModelForImageTextToText` only when selected. | |
| - Formats image/text messages for image-text-to-text generation. | |
| - Maps thinking mode into the prompt template. | |
| - Provides a video support plan for future local frame sampling. | |
| ### `models/sglang_runner.py` | |
| SGLang local server planner and OpenAI-compatible chat client. | |
| - Builds an explicit `python -m sglang.launch_server` command. | |
| - Includes MiniCPM tool parser configuration. | |
| - Checks `/health`, sends chat requests to `/v1/chat/completions`, and can request `/shutdown`. | |
| - Does not install SGLang, start a process, download model weights, or load a model on app startup. | |
| ### `models/vllm_runner.py` | |
| vLLM local server planner and OpenAI-compatible chat client. | |
| - Builds explicit `vllm serve <model>` command plans. | |
| - Checks `/health`, parses Prometheus-style `/metrics`, and sends chat requests to | |
| `/v1/chat/completions`. | |
| - Logs parsed benchmark metrics through `TrackingClient`. | |
| - Does not install vLLM, start a process, download model weights, or load a model on app startup. | |
| ### `models/transformers_text.py` | |
| Optional Transformers text backend. | |
| - Checks whether the `transformers` package is installed. | |
| - Lazy-loads `AutoTokenizer` and `AutoModelForCausalLM` only when the backend is selected. | |
| - Reads `trust_remote_code`, device map, dtype, max token, and temperature settings from explicit config. | |
| - Provides a simple token-list streaming helper for future Gradio streaming wiring. | |
| - Does not download model weights on startup. | |
| ### `models/service_factory.py` | |
| Creates the selected backend service for the UI. | |
| - `TEXT_SERVICE_REGISTRY` registers available text backend factories. | |
| - `VISION_SERVICE_REGISTRY` registers available vision backend factories. | |
| - `create_text_service()` chooses placeholder, llama.cpp, llama-cpp-python, Ollama, | |
| OpenAI-compatible, SGLang, or Transformers text service. | |
| - `create_vision_service()` chooses placeholder, llama.cpp, llama-cpp-python, Ollama, or | |
| Transformers MiniCPM vision service. | |
| - `backend_statuses()` reports current backend availability. | |
| - llama.cpp, llama-cpp-python, and OpenAI-compatible services read ignored local backend settings | |
| when selected. | |
| ### `ui/chat_tab.py` | |
| Builds the text chat tab. | |
| - Shows text models from the catalog. | |
| - Displays selected model metadata. | |
| - Calls the selected backend service. | |
| - Emits inference request and response events. | |
| ### `ui/vision_tab.py` | |
| Builds the vision tab. | |
| - Shows vision models from the catalog. | |
| - Accepts an image and prompt. | |
| - Calls the selected backend service. | |
| - Emits inference request and response events. | |
| ### `ui/dataset_tab.py` | |
| Local dataset preview surface. | |
| - Previews local CSV, JSONL, and NDJSON files. | |
| - Previews Hugging Face datasets when the optional external `datasets` package is installed. | |
| - Shows source, row count, columns, and sample rows. | |
| - Calculates basic local dataset statistics. | |
| - Emits dataset loaded events. | |
| Future behavior: | |
| - Serve dataset tools through the selected MCP path. | |
| ### `ui/train_tab.py` | |
| Training planning and local evaluation surface. | |
| - Builds a LoRA dry-run training plan without launching training. | |
| - Builds a non-executing LoRA trainer request with dependency status. | |
| - Shows SWIFT/LLaMA-Factory vision fine-tuning plan. | |
| - Shows checkpoint output path, validation status, and hardware notes. | |
| - Runs local base-vs-tuned evaluation from newline-separated response text. | |
| - Shows exact-match summary and a qualitative eval table. | |
| - Logs tuned evaluation reports to `data/eval_results.jsonl`. | |
| Future behavior: | |
| - Start LoRA training. | |
| - Show loss and metrics. | |
| - Write Trackio traces. | |
| ### `ui/vllm_tab.py` | |
| vLLM local serving planner. | |
| - Builds explicit `vllm serve` command plans. | |
| - Checks local vLLM `/health`. | |
| - Fetches and parses `/metrics`. | |
| - Logs vLLM benchmark metrics through local JSONL/Trackio fallback tracking. | |
| - Does not install vLLM, start a process, download models, or load weights on startup. | |
| ### `ui/export_tab.py` | |
| GGUF export planning surface. | |
| - Selects a configured model and quantization. | |
| - Shows official GGUF download command plans when the model has GGUF metadata. | |
| - Shows local HF-to-GGUF conversion and llama.cpp quantization command plans. | |
| - Lists files already present under the selected export directory. | |
| - Exposes existing exported files through a Gradio download output. | |
| - Does not execute downloads, conversion, or quantization. | |
| Future behavior: | |
| - Execute downloads and conversions after explicit user action. | |
| ### `ui/notes_tab.py` | |
| Field notes implementation. | |
| - Saves prompt, model response, correction, and tags to `data/field_notes.csv`. | |
| - Captures optional image path, video path, and a use-for-training flag. | |
| - Exports corrected notes to JSONL. | |
| - Exports local Hugging Face Dataset-style files under `data/hf_field_notes`. | |
| - Imports uncertain OCR predictions for human correction. | |
| - Exports corrected OCR rows to JSONL. | |
| - Emits field note saved events. | |
| Future behavior: | |
| - Push corrected notes to a remote Hugging Face Dataset after login. | |
| - Feed notes into fine-tuning. | |
| ### `ui/traces_tab.py` | |
| Local trace and tracking preview. | |
| - Shows manual trace event previews. | |
| - Shows recent local app events. | |
| - Shows JSONL trace rows and tracking status. | |
| - Exports local traces to `exports/traces.jsonl`. | |
| - Calls Trackio only when the optional package is installed and enabled. | |
| ### `ui/agent_tab.py` | |
| Local non-autonomous agent mode. | |
| - Drafts a research-plan-implement-verify trace. | |
| - Saves agent traces to `data/agent_traces.jsonl`. | |
| - Exports trace JSONL and local HF Dataset-style trace files. | |
| - Does not execute shell commands, commit, push, deploy, download models, or call external services. | |
| ### `ui/status_tab.py` | |
| Shows configured models and backend metadata. | |
| - Helps verify model-size compliance and backend status. | |
| - Provides local llama.cpp settings, GGUF/mmproj file pickers, and command generation. | |
| - Provides LM Studio/OpenAI-compatible base URL, optional model-name storage, and reachability check. | |
| - Provides SGLang command planning, health check, and shutdown request controls. | |
| ### `datasets/field_notes.py` | |
| Field note data model and CSV store. | |
| - `FieldNote` captures prompt, response, correction, tags, and timestamp. | |
| - `FieldNote` also captures optional image/video paths and a training inclusion flag. | |
| - `FieldNoteStore.save()` persists notes to `data/field_notes.csv`. | |
| - `FieldNoteStore.list_notes()` filters by correction, tag, and training inclusion. | |
| - `FieldNoteStore.export_jsonl()` writes training-ready JSONL. | |
| - `FieldNoteStore.export_hf_dataset()` writes local HF Dataset-style files. | |
| - `SQLiteFieldNoteStore` stores and lists notes in SQLite for larger correction loops. | |
| ### `datasets/loader.py` | |
| Dataset preview and statistics helpers. | |
| - `preview_local_dataset()` previews CSV, JSONL, and NDJSON files. | |
| - `dataset_statistics()` reports row count, column count, names, and non-empty counts. | |
| - `preview_huggingface_dataset()` optionally uses the external Hugging Face `datasets` package. | |
| ### `datasets/synthetic.py` | |
| Deterministic local synthetic data helpers. | |
| - `generate_synthetic_examples()` creates local prompt/response/correction examples. | |
| - `validate_synthetic_example()` checks schema requirements. | |
| - `quality_filter_examples()` removes incomplete or low-value examples. | |
| - `augment_examples()` creates deterministic variants for workflow testing. | |
| - `export_synthetic_jsonl()` writes JSONL without external services. | |
| ### `datasets/ocr.py` | |
| Local OCR correction helpers. | |
| - `OCRPrediction` stores source path, predicted text, confidence, and optional page. | |
| - `load_ocr_predictions()` loads local `.csv`, `.jsonl`, and `.ndjson` prediction files. | |
| - `uncertain_predictions()` filters rows at or below a confidence threshold or with empty text. | |
| - `import_uncertain_predictions()` creates Field Notes correction tasks for uncertain rows. | |
| - `export_corrected_ocr_notes()` writes corrected OCR examples to JSONL for evaluation or training. | |
| - `ocr_import_summary()` previews uncertain rows for the Field Notes tab. | |
| ### `mcp_tools/tools.py` | |
| Local MCP-style tools. | |
| - `dataset_stats_tool()` returns local dataset statistics. | |
| - `hf_dataset_preview_tool()` previews Hugging Face datasets when optional dependencies exist. | |
| - `safe_calculator_tool()` evaluates numeric arithmetic only. | |
| - `model_inference_tool()` routes text prompts through the selected model service. | |
| - `tool_registry()` returns the local tool map for a future MCP endpoint. | |
| ### `mcp_tools/vindex_tool.py` | |
| Non-executing VINDEX integration boundary. | |
| - Defines the eight VINDEX PRD methods and their local FastAPI paths. | |
| - `build_vindex_call_plan()` validates method names and builds endpoint/payload plans. | |
| - Caps `star_spread.n_neighbors` at 5 and `calibrated_edit.causal_window` at 3 based on the PRD | |
| safety notes. | |
| - `vindex_dependency_report()` checks whether the optional `vindex` package or local health | |
| endpoint is available. | |
| - `vindex_verification_report()` combines dependency status with a safe call plan and keeps | |
| execution disabled until the local VINDEX install is verified. | |
| ### `mcp_tools/bridge.py` | |
| Gradio-native MCP bridge metadata and local invocation helper. | |
| - `MCP_PATH` documents `/gradio_api/mcp/sse`. | |
| - `mcp_manifest()` returns the selected mode, path, and tool definitions. | |
| - `invoke_mcp_tool()` verifies local tool invocation by name. | |
| ### `agent/runner.py` | |
| Deterministic local agent trace runner. | |
| - `AGENT_SYSTEM_PROMPT` defines the agent behavior contract. | |
| - `run_agent_loop()` produces research, plan, implement, and verify trace steps. | |
| - `run_paper_to_code_loop()` produces paper-to-code research, plan, implement, and verify trace steps. | |
| - `default_safety_gates()` lists the non-autonomous safety requirements. | |
| - `save_agent_trace()` appends traces to JSONL. | |
| - `export_agent_traces()` exports trace JSONL. | |
| - `export_agent_traces_hf_dataset()` writes local HF Dataset-style trace files. | |
| - The runner can call safe local tools, but it is not autonomous. | |
| ### `core/file_exports.py` | |
| Shared export helper. | |
| - `copy_text_file_or_empty()` copies a text artifact to an export path or creates an empty one. | |
| ### `training/export.py` | |
| Non-executing GGUF export planning. | |
| - `detect_llama_cpp_tools()` checks `llama-server`, `llama-cli`, and `llama-quantize`. | |
| - `build_export_plan()` creates explicit download, conversion, and quantization command plans. | |
| - `list_exported_files()` lists generated/local export files. | |
| - `ExportPlan.as_dict()` marks that commands are not executed and no startup downloads occur. | |
| ### `training/evaluation.py` | |
| Local deterministic evaluation helpers. | |
| - `default_prompt_cases()` returns a small built-in prompt test set. | |
| - `load_prompt_cases()` loads prompt/expected pairs from JSONL. | |
| - `evaluate_responses()` computes exact-match rows and a qualitative table. | |
| - `perplexity_from_losses()` computes perplexity from explicit negative log likelihood values. | |
| - `compare_base_vs_tuned()` reports exact-match delta. | |
| - `log_eval_report()` appends JSONL evaluation results. | |
| ### `training/lora_trainer.py` | |
| Non-executing LoRA trainer request builder. | |
| - `lora_dependency_report()` reports PEFT, TRL, Transformers, and Torch availability. | |
| - `build_lora_training_request()` combines the training plan with dependency status and a command | |
| preview. | |
| - `vision_finetuning_plan()` documents SWIFT/LLaMA-Factory as the future MiniCPM-V fine-tuning path. | |
| - Keeps `execute_training` false until dependencies, hardware, and dataset schema are approved. | |
| ### `training/reward_eval.py` | |
| Deterministic local reward-style evaluation helpers. | |
| - `RewardEvaluator.evaluate()` scores supplied responses with transparent lexical heuristics. | |
| - `best_of_n()` selects the highest-scoring candidate without model calls. | |
| - `create_dpo_pairs()` creates chosen/rejected pairs for DPO-style datasets. | |
| - `eval_lora_vs_base()` compares base and LoRA response rewards. | |
| ### `training/planner.py` | |
| Non-executing LoRA training planner. | |
| - `load_training_config()` reads LoRA and training settings from `config/training.yaml`. | |
| - `build_training_plan()` creates a dry-run plan with checkpoint output path. | |
| - `validate_training_plan()` checks dataset existence and numeric training settings. | |
| - `training_hardware_notes()` documents practical local hardware expectations. | |
| ### `tracking/trackio_client.py` | |
| Tracking client with JSONL fallback. | |
| - `load_tracking_config()` reads Trackio settings from `config/training.yaml`. | |
| - `TrackingClient.init()` starts Trackio only when enabled and installed. | |
| - `TrackingClient.log()` always writes local JSONL and optionally forwards to Trackio. | |
| - `TrackingClient.finish()` closes optional Trackio state. | |
| - `export_traces()` copies local traces to `exports/traces.jsonl`. | |
| - `read_trace_rows()` returns recent local trace rows for the UI. | |
| ### `core/events.py` | |
| Small event bus reserved for future cross-module events. | |
| - `EventType` names app events. | |
| - `UI_ERROR` records visible tab-level failures. | |
| - `Event` carries event data. | |
| - `EventBus` registers handlers and emits events. | |
| ### `core/app_state.py` | |
| Shared local app state. | |
| - `AppState.emit()` records events, logs them, and dispatches them through `EventBus`. | |
| - `AppState.emit()` also writes trace events through `TrackingClient`. | |
| - `AppState.recent_events()` returns local trace previews for the Traces tab. | |
| - `emit_inference_response()` records shared response metadata. | |
| ### `core/tab_feedback.py` | |
| Formats tab status text and emits `ui_error` events for visible tab-level failures. | |
| ### `ui/progress.py` | |
| Defines the shared Gradio progress mode used by tab button callbacks. | |
| ### `core/app_logging.py` | |
| Lightweight logging setup. | |
| - `configure_app_logging()` configures compact process logging once. | |
| ### `core/registry.py` | |
| Generic registry helper. | |
| - `Registry.register(name, item)` stores a service. | |
| - `Registry.get(name)` retrieves a service. | |
| - `Registry.list()` lists registered services. | |
| ## Current Design Rule | |
| The app must not download model weights on startup. Model loading should happen only after the | |
| user chooses a backend/model and clicks an explicit action. | |