---
title: ResearchHarness
emoji: π
colorFrom: blue
colorTo: yellow
sdk: docker
app_port: 7860
pinned: false
license: mit
short_description: Lightweight harness for tool-using LLM agents.
---

# ResearchHarness Space Maintenance Notes

This repository is the Hugging Face Docker Space deployment for
[`ResearchHarness`](https://github.com/InternScience/ResearchHarness). It is an online
app mirror, not the public open-source documentation and not a full source mirror.

The public project README, tutorials, benchmark notes, API server documentation,
and local CLI documentation belong in the main GitHub repository. This Space
README should stay focused on long-term deployment maintenance: what is copied
from the main repo, what is intentionally changed for hosted use, and what is
new in the Space.

## Repository Relationship

| Repository | Role |
| --- | --- |
| `github.com/InternScience/ResearchHarness` | Main open-source runtime, CLI, API server, frontend, docs, tests, and benchmark adapters. |
| `huggingface.co/spaces/InternScience/ResearchHarness` | Hugging Face Space app that hosts the browser frontend with managed temporary workspaces. |
| `huggingface.co/datasets/InternScience/ResearchHarness-Data` | Hugging Face dataset receiving collected hosted-run trajectory PRs. |

Maintenance rules:

- Copy only the runtime/frontend pieces needed by the hosted app.
- Do not blindly sync the whole main repository into this Space.
- Space-only deployment logic must not be copied back into the main repo unless
  it is genuinely general-purpose.
- Public documentation should be updated in the main repo, not duplicated here.
- Treat the tables below as the sync boundary. Fully synced files may be copied
  from the main repo and diff-checked. Partially synced files must be updated
  with targeted patches only; do not overwrite them with main-repo files.

## Sync Policy

The Space should stay small and deployment-focused. When the main repository
changes, sync only the files needed by the hosted browser app, then inspect the
diff manually. Do not copy the whole main repository into this Space.

For partially synced files, a clean sync means the diff contains only the
specific hunk needed for the current bug or feature. If a full-file copy creates
large unrelated changes, restore the Space version from `HEAD` and reapply the
minimal patch.

### Fully Synced From The Main Repository

These files/directories should normally match the main repo exactly, unless a
future Space-specific need is documented here:

| Path | Purpose |
| --- | --- |
| `agent_base/base.py` | Base agent interface. |
| `agent_base/console_utils.py` | Shared console/event formatting helpers. |
| `agent_base/context_compact.py` | Context compaction logic. |
| `agent_base/model_profiles.py` | Provider/model profile helpers. |
| `agent_base/prompt.py` | Base system prompt. |
| `agent_base/prompts/system_base.md` | Shared base prompt text. |
| `agent_base/provider_compat.py` | Provider compatibility normalization. |
| `agent_base/session_state.py` | Session state serialization. |
| `agent_base/tools/*.py` | Tool implementations exposed by the Space app. |
| `agent_base/trace_utils.py` | Trace writing utilities. |
| `agent_base/utils.py` | Shared runtime utilities, including default `.env` loading. |
| `VERSION` | Version marker shown by the app/runtime when needed. |
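
One way to check the "match the main repo exactly" rule is to compare file digests between a main-repo checkout and a Space checkout. The sketch below is only an illustration: the function names and the shortened path list are hypothetical, and the two checkout locations are assumed to be passed in by the caller.

```python
import hashlib
from pathlib import Path
from typing import List, Optional

# Hypothetical subset of the fully synced table; extend it to cover every row.
FULLY_SYNCED = [
    "agent_base/base.py",
    "agent_base/utils.py",
    "VERSION",
]


def file_digest(path: Path) -> Optional[str]:
    """Return the SHA-256 hex digest of a file, or None if it does not exist."""
    if not path.is_file():
        return None
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_drift(main_root: Path, space_root: Path, rel_paths: List[str]) -> List[str]:
    """List relative paths that differ or exist in only one of the two checkouts."""
    drifted = []
    for rel in rel_paths:
        if file_digest(main_root / rel) != file_digest(space_root / rel):
            drifted.append(rel)
    return drifted
```

Calling `find_drift(Path("../ResearchHarness"), Path("."), FULLY_SYNCED)` with your own checkout paths returns the files that need a fresh copy or a closer look. An empty list means the listed files are byte-identical.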

### Partially Synced And Space-Modified

These files either derive from main-repo files or exist only in the Space. Merge
them manually, because the hosted Space has different deployment semantics:

| Path | Maintenance rule |
| --- | --- |
| `agent_base/react_agent.py` | Keep core ReAct/runtime behavior aligned with main. Preserve Space compatibility only when it is genuinely required by the hosted app. |
| `frontend/local_server.py` | Based on the main local frontend server, but Space-modified for managed temporary workspaces, forced `agent_workspace/` + `agent_trace/` layout, workspace zip download, automatic cleanup, trajectory collection hooks, and no arbitrary server-folder picker semantics. Never overwrite this file blindly from main. |
| `frontend/static/index.html` | Starts from the main frontend HTML, but removes the local workspace picker and adds hosted workspace download UI. |
| `frontend/static/app.js` | Starts from the main frontend client, but removes local folder selection and adds download-token / workspace-zip handling. |
| `frontend/static/app.css` | Starts from the main frontend CSS, but includes Space-only hosted workspace/download styling and omits local folder picker modal styles. |
| `requirements.txt` | Starts from the main runtime dependencies, but keeps Space-only hosted dependencies such as `huggingface_hub` and `uvicorn[standard]`. |
| `app.py` | Space-only FastAPI/Hugging Face entrypoint. It owns startup, cleanup scheduling, static mounting, and hosted defaults. |
| `check_space_runtime.py` | Space-only smoke test for deployment import/runtime sanity. |
| `Dockerfile` | Space-only Docker build. |
| `.dockerignore` | Space-only Docker context pruning. |
| `.gitattributes` | Space repository metadata. |
| `.gitignore` | Space-only generated files, cache, and temporary run ignores. |
| `README.md` | Space maintenance notes only. Public project docs belong in the main repo. |

### Out Of Scope For The Space

These main-repo areas should not be copied into this Space unless the hosted app
explicitly starts using them:

| Main-repo path | Reason |
| --- | --- |
| `pyproject.toml`, `MANIFEST.in`, `researchharness/` | PyPI packaging belongs to the main open-source repo, not the hosted app mirror. |
| `.github/` | GitHub CI/release automation does not run in the Hugging Face Space repo. |
| `run_agent.py`, `run_server.py`, `run_frontend.py` | Local CLI/API/frontend entrypoints are not how the Space is launched. |
| `api/` | OpenAI-compatible API server is not part of the Space app. |
| `benchmarks/` | Benchmark adapters and benchmark docs belong to the main repo. |
| `docs/` | Long-form tutorials belong to the main repo. |
| `tests/` | Main local/CI tests belong to the main repo; Space keeps only focused smoke checks. |
| `.env.example` | Public environment template belongs to the main repo. |
| `agent_base/tools/README.md` | Tool documentation belongs to the main repo; Space keeps only runtime code. |
| `agent_base/prompts/plugins/` | Plugin prompt assets are not used by the hosted app unless a future Space feature explicitly needs them. |
| `workspace/`, `api_runs/`, `traces/` | Local placeholder/runtime directories are not checked into Space. |
| local benchmark helpers such as `benchmarks/**/local_*` | Local development helpers must not be deployed. |

Keeping these files out prevents stale code paths and misleading documentation
from accumulating in the Space.

## Space-Specific Runtime Behavior

These behaviors are intentional hosted-app deltas:

- Users cannot select arbitrary server folders. Each new chat gets an isolated
  managed run directory under `RH_SPACE_RUNS_DIR`.
- The runtime layout is always:
  `run_.../agent_workspace/` for agent-visible files and
  `run_.../agent_trace/` for traces and `session_state_*.json`.
- Uploaded images are saved under `agent_workspace/inputs/images/` and are also
  passed to the model as image inputs when supported.
- Users can download files created or handled by the agent with the
  `Download workspace.zip` button. The zip contains only the current chat's
  `agent_workspace/`; it does not include `agent_trace/`, server files, or
  Space secrets.
- The frontend exposes a per-run model dropdown. Current options are `gpt-5.5`
  and `claude-opus-4-8`; the selection must stay local to that run and must not
  mutate global process environment variables.
- Completed runs are packaged for trajectory collection and submitted as pull
  requests to the configured Hugging Face dataset after the batch threshold is
  reached.
- Old inactive runs are cleaned periodically so the Space does not grow without
  bound.
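
The per-run model rule above can be illustrated with a small immutable settings object that travels with the run instead of being written to `os.environ`. This is a sketch under assumptions: `RunSettings` and `make_run_settings` are hypothetical names, and the fallback-to-default behavior is one possible policy, not necessarily what `frontend/local_server.py` does.

```python
from dataclasses import dataclass

# Model names come from the list above; everything else here is hypothetical.
ALLOWED_MODELS = ("gpt-5.5", "claude-opus-4-8")


@dataclass(frozen=True)
class RunSettings:
    """Settings owned by one chat run; never written to os.environ."""
    run_id: str
    model_name: str


def make_run_settings(run_id: str, requested_model: str, default_model: str) -> RunSettings:
    """Use the requested model if it is an allowed option, else the default."""
    model = requested_model if requested_model in ALLOWED_MODELS else default_model
    return RunSettings(run_id=run_id, model_name=model)
```

Because the dataclass is frozen and passed explicitly, two concurrent chats can use different models without either one changing process-wide state.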

## Required Secrets

Configure these as Hugging Face Space secrets before starting the app:

| Secret | Purpose |
| --- | --- |
| `API_KEY` | API key for your OpenAI-compatible LLM provider. |
| `API_BASE` | OpenAI-compatible `/v1` endpoint. |
| `MODEL_NAME` | Main model used by ResearchHarness. |
| `SERPER_KEY` | WebSearch / ScholarSearch key from <https://serper.dev/>. |
| `JINA_KEY` | WebFetch key from <https://jina.ai/>. |
| `MINERU_TOKEN` | ReadPDF key from <https://mineru.net/>. |
| `HF_TOKEN` | Hugging Face token with write access to `InternScience/ResearchHarness-Data`. |
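
A startup check can report missing secrets before the first chat fails on them. The following is a minimal sketch, assuming the secrets arrive as environment variables; `missing_secrets` is a hypothetical helper, not a function from the Space code.

```python
import os
from typing import List, Mapping, Optional

# Names copied from the table above.
REQUIRED_SECRETS = (
    "API_KEY",
    "API_BASE",
    "MODEL_NAME",
    "SERPER_KEY",
    "JINA_KEY",
    "MINERU_TOKEN",
    "HF_TOKEN",
)


def missing_secrets(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required secret names that are unset or blank in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SECRETS if not env.get(name, "").strip()]
```

For example, `missing_secrets()` could be called once at startup and its result logged, so that a misconfigured Space shows which names still need to be set.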

## Optional Runtime Variables

| Variable | Default | Meaning |
| --- | --- | --- |
| `RH_SPACE_RUNS_DIR` | `/tmp/researchharness_space/runs` | Parent directory for temporary per-chat runs. |
| `RH_SPACE_RETENTION_SECONDS` | `21600` | Delete inactive runs older than this many seconds. |
| `RH_SPACE_MAX_RUNS` | `40` | Keep at most this many inactive runs. |
| `RH_SPACE_CLEANUP_INTERVAL_SECONDS` | `900` | Background cleanup interval. |
| `WEBFETCH_TIMEOUT_SECONDS` | `180` | Overall timeout for one WebFetch tool call. |
| `WEBFETCH_MAX_CHARS` | `30000` | Hard maximum characters returned by one URL-only WebFetch call. |
| `RH_COLLECTION_ENABLED` | `true` | Automatically collect completed hosted runs. |
| `RH_COLLECTION_DATASET_REPO` | `InternScience/ResearchHarness-Data` | Dataset repo that receives trajectory PRs. |
| `RH_COLLECTION_BATCH_SIZE` | `5` | Create one dataset PR after this many collected runs. |
| `RH_COLLECTION_MAX_BUNDLE_BYTES` | `20971520` | Drop a single run bundle if it exceeds this byte limit. |
| `PORT` | `7860` | Port used by Hugging Face Docker Spaces. |
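
To show how the retention and max-run settings could interact, here is a sketch of a cleanup selector. It rests on assumptions: "inactive" is approximated by the run directory's modification time, active runs are not tracked at all, and `env_int` and `runs_to_delete` are hypothetical names rather than the Space's actual functions.

```python
import os
import time
from pathlib import Path
from typing import List, Mapping, Optional


def env_int(name: str, default: int, env: Optional[Mapping[str, str]] = None) -> int:
    """Read an integer variable, falling back to the default if unset or invalid."""
    env = os.environ if env is None else env
    try:
        return int(env.get(name, ""))
    except ValueError:
        return default


def runs_to_delete(
    runs_dir: Path, retention_seconds: int, max_runs: int, now: Optional[float] = None
) -> List[Path]:
    """Pick expired runs first, then the oldest runs beyond max_runs."""
    now = time.time() if now is None else now
    runs = sorted(
        (p for p in runs_dir.iterdir() if p.is_dir() and p.name.startswith("run_")),
        key=lambda p: p.stat().st_mtime,  # oldest first
    )
    expired = [p for p in runs if now - p.stat().st_mtime > retention_seconds]
    kept = [p for p in runs if p not in expired]
    overflow = kept[: max(0, len(kept) - max_runs)]
    return expired + overflow
```

A background task could call `runs_to_delete(runs_dir, env_int("RH_SPACE_RETENTION_SECONDS", 21600), env_int("RH_SPACE_MAX_RUNS", 40))` every `RH_SPACE_CLEANUP_INTERVAL_SECONDS` and remove the returned directories with `shutil.rmtree`.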

## Runtime Layout

```text
/tmp/researchharness_space/runs/
└── run_YYYYMMDD_HHMMSS_<random>/
    ├── agent_workspace/
    │   └── inputs/images/   # user uploaded images, when present
    └── agent_trace/         # trace JSONL and session_state_*.json
```
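
The layout above could be created with a helper along these lines. It is a sketch only: `create_run_dir` is a hypothetical name, and the 8-character hex suffix is an assumed stand-in for `<random>`, whose real format is not specified here.

```python
import secrets
from datetime import datetime
from pathlib import Path


def create_run_dir(runs_root: Path) -> Path:
    """Create run_YYYYMMDD_HHMMSS_<random>/ with its two fixed subdirectories."""
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    # Assumed suffix format: 8 hex characters from a cryptographic RNG.
    run_dir = runs_root / f"run_{stamp}_{secrets.token_hex(4)}"
    (run_dir / "agent_workspace").mkdir(parents=True)
    (run_dir / "agent_trace").mkdir()
    return run_dir
```

`inputs/images/` is deliberately not created up front, since the tree marks it as present only when a user uploads images.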

The frontend exposes the chat UI and a single `Download workspace.zip` action
for the current chat. The workspace path is managed by the server so hosted
users cannot browse or select server folders.
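
A download that contains only `agent_workspace/` can be built by walking that one directory and storing paths relative to it. The sketch below uses the standard `zipfile` module; `build_workspace_zip` is a hypothetical name, and skipping symlinked files is an added precaution rather than documented Space behavior.

```python
import io
import zipfile
from pathlib import Path


def build_workspace_zip(run_dir: Path) -> bytes:
    """Zip the files under run_dir/agent_workspace and nothing else."""
    workspace = run_dir / "agent_workspace"
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(workspace.rglob("*")):
            # Symlinked files are skipped so a link cannot pull in outside content.
            if path.is_file() and not path.is_symlink():
                archive.write(path, arcname=path.relative_to(workspace).as_posix())
    return buffer.getvalue()
```

Because archive names are relative to the workspace, nothing from `agent_trace/` or the server's own paths appears in the archive.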

## Trajectory Collection

Hosted mode automatically collects completed runs without exposing extra UI to users:

- Each completed run is zipped from `agent_workspace/` and `agent_trace/`.
- A `manifest.json` is included inside the zip, and a sidecar `.json` file is kept beside the pending zip.
- If a single bundle is larger than `RH_COLLECTION_MAX_BUNDLE_BYTES` (20 MiB by default), it is dropped immediately.
- Once `RH_COLLECTION_BATCH_SIZE` pending bundles exist, the Space creates a pull request in the configured Hugging Face dataset repo.
- After the dataset PR is created successfully, those local pending bundles are deleted.
- If upload fails, pending bundles are retained and `last_upload_error.json` is written under the local collection directory.
- No redaction is applied in this core hosted collector; keep the dataset private unless you intentionally want to publish the collected traces.
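
The local half of this pipeline (bundle, size check, batch threshold) can be sketched with the standard library alone. The manifest fields, the function names, and the pending-directory layout below are assumptions for illustration; the dataset pull request step is left out.

```python
import io
import json
import zipfile
from pathlib import Path
from typing import Optional

MAX_BUNDLE_BYTES = 20971520  # default of RH_COLLECTION_MAX_BUNDLE_BYTES
BATCH_SIZE = 5               # default of RH_COLLECTION_BATCH_SIZE


def bundle_run(run_dir: Path, pending_dir: Path, max_bytes: int = MAX_BUNDLE_BYTES) -> Optional[Path]:
    """Zip workspace and trace with a manifest; return None if the bundle is too large."""
    manifest = {"run_id": run_dir.name, "files": []}  # hypothetical manifest schema
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for top in ("agent_workspace", "agent_trace"):
            if not (run_dir / top).is_dir():
                continue
            for path in sorted((run_dir / top).rglob("*")):
                if path.is_file():
                    arcname = path.relative_to(run_dir).as_posix()
                    archive.write(path, arcname=arcname)
                    manifest["files"].append(arcname)
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
    data = buffer.getvalue()
    if len(data) > max_bytes:
        return None  # oversized bundles are dropped, never queued
    pending_dir.mkdir(parents=True, exist_ok=True)
    zip_path = pending_dir / f"{run_dir.name}.zip"
    zip_path.write_bytes(data)
    # Sidecar manifest kept beside the pending zip.
    zip_path.with_suffix(".json").write_text(json.dumps(manifest, indent=2))
    return zip_path


def batch_ready(pending_dir: Path, batch_size: int = BATCH_SIZE) -> bool:
    """True once enough pending bundles exist to open one dataset PR."""
    return len(list(pending_dir.glob("*.zip"))) >= batch_size
```

Counting only `*.zip` files keeps sidecar manifests and `last_upload_error.json` from inflating the batch count.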

## Local Smoke Test

```bash
python app.py
```

Then open `http://127.0.0.1:7860`.

Before pushing Space changes, run at least:

```bash
# Space-specific import/runtime sanity check.
python3 check_space_runtime.py

# Byte-compile every Python file outside .git to catch syntax errors.
python3 -B - <<'PY'
from pathlib import Path
import py_compile

for path in Path(".").rglob("*.py"):
    if ".git" not in path.parts:
        py_compile.compile(str(path), doraise=True)
print("syntax ok")
PY

# Load the FastAPI app with collection disabled and request the index page.
RH_COLLECTION_ENABLED=false python3 -B - <<'PY'
from fastapi.testclient import TestClient
import app

client = TestClient(app.app)
response = client.get("/")
assert response.status_code == 200
assert "ResearchHarness" in response.text
print("app ok")
PY

# Frontend syntax check and whitespace-error check on the pending diff.
node --check frontend/static/app.js
git diff --check
```