---
title: ResearchHarness
emoji: π
colorFrom: blue
colorTo: yellow
sdk: docker
app_port: 7860
pinned: false
license: mit
short_description: Lightweight harness for tool-using LLM agents.
---
# ResearchHarness Space Maintenance Notes
This repository is the Hugging Face Docker Space deployment for
[`ResearchHarness`](https://github.com/InternScience/ResearchHarness). It is an online
app mirror, not the public open-source documentation and not a full source mirror.
The public project README, tutorials, benchmark notes, API server documentation,
and local CLI documentation belong in the main GitHub repository. This Space
README should stay focused on long-term deployment maintenance: what is copied
from the main repo, what is intentionally changed for hosted use, and what is
new in the Space.
## Repository Relationship
| Repository | Role |
| --- | --- |
| `github.com/InternScience/ResearchHarness` | Main open-source runtime, CLI, API server, frontend, docs, tests, and benchmark adapters. |
| `huggingface.co/spaces/InternScience/ResearchHarness` | Hugging Face Space app that hosts the browser frontend with managed temporary workspaces. |
| `huggingface.co/datasets/InternScience/ResearchHarness-Data` | Hugging Face dataset receiving collected hosted-run trajectory PRs. |
Maintenance rules:
- Copy only the runtime/frontend pieces needed by the hosted app.
- Do not blindly sync the whole main repository into this Space.
- Space-only deployment logic must not be copied back into the main repo unless
it is genuinely general-purpose.
- Public documentation should be updated in the main repo, not duplicated here.
- Treat the tables below as the sync boundary. Fully synced files may be copied
from the main repo and diff-checked. Partially synced files must be updated
with targeted patches only; do not overwrite them with main-repo files.
## Sync Policy
The Space should stay small and deployment-focused. When the main repository
changes, sync only the files needed by the hosted browser app, then inspect the
diff manually. Do not copy the whole main repository into this Space.
For partially synced files, a clean sync means the diff contains only the
specific hunk needed for the current bug or feature. If a full-file copy creates
large unrelated changes, restore the Space version from `HEAD` and reapply the
minimal patch.
### Fully Synced From The Main Repository
These files/directories should normally match the main repo exactly, unless a
future Space-specific need is documented here:
| Path | Purpose |
| --- | --- |
| `agent_base/base.py` | Base agent interface. |
| `agent_base/console_utils.py` | Shared console/event formatting helpers. |
| `agent_base/context_compact.py` | Context compaction logic. |
| `agent_base/model_profiles.py` | Provider/model profile helpers. |
| `agent_base/prompt.py` | Base system prompt. |
| `agent_base/prompts/system_base.md` | Shared base prompt text. |
| `agent_base/provider_compat.py` | Provider compatibility normalization. |
| `agent_base/session_state.py` | Session state serialization. |
| `agent_base/tools/*.py` | Tool implementations exposed by the Space app. |
| `agent_base/trace_utils.py` | Trace writing utilities. |
| `agent_base/utils.py` | Shared runtime utilities, including default `.env` loading. |
| `VERSION` | Version marker shown by the app/runtime when needed. |
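
The diff-check for fully synced files can be automated. The following is a minimal sketch, not a script that exists in either repository: `find_drift` and the `FULLY_SYNCED` subset are hypothetical names, and it assumes both repositories are checked out locally.

```python
import filecmp
from pathlib import Path

# Hypothetical subset of the table above; extend it to match the full list.
FULLY_SYNCED = [
    "agent_base/base.py",
    "agent_base/utils.py",
    "VERSION",
]


def find_drift(main_repo: Path, space_repo: Path, paths=FULLY_SYNCED) -> list[str]:
    """Return relative paths that are missing on either side or differ in content."""
    drifted = []
    for rel in paths:
        main_file, space_file = main_repo / rel, space_repo / rel
        if not (main_file.is_file() and space_file.is_file()):
            drifted.append(rel)
        # shallow=False compares file contents, not just os.stat() signatures.
        elif not filecmp.cmp(main_file, space_file, shallow=False):
            drifted.append(rel)
    return drifted
```

An empty result means the listed files match the main repo byte for byte; any returned path needs either a re-copy or a documented Space-specific reason.
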
### Partially Synced And Space-Modified
These files either derive from main-repo files and must be merged manually
because the hosted Space has different deployment semantics, or exist only in
the Space (marked "Space-only" below):
| Path | Maintenance rule |
| --- | --- |
| `agent_base/react_agent.py` | Keep core ReAct/runtime behavior aligned with main. Preserve Space compatibility only when it is genuinely required by the hosted app. |
| `frontend/local_server.py` | Based on the main local frontend server, but Space-modified for managed temporary workspaces, forced `agent_workspace/` + `agent_trace/` layout, workspace zip download, automatic cleanup, trajectory collection hooks, and no arbitrary server-folder picker semantics. Never overwrite this file blindly from main. |
| `frontend/static/index.html` | Starts from the main frontend HTML, but removes the local workspace picker and adds hosted workspace download UI. |
| `frontend/static/app.js` | Starts from the main frontend client, but removes local folder selection and adds download-token / workspace-zip handling. |
| `frontend/static/app.css` | Starts from the main frontend CSS, but includes Space-only hosted workspace/download styling and omits local folder picker modal styles. |
| `requirements.txt` | Starts from the main runtime dependencies, but keeps Space-only hosted dependencies such as `huggingface_hub` and `uvicorn[standard]`. |
| `app.py` | Space-only FastAPI/Hugging Face entrypoint. It owns startup, cleanup scheduling, static mounting, and hosted defaults. |
| `check_space_runtime.py` | Space-only smoke test for deployment import/runtime sanity. |
| `Dockerfile` | Space-only Docker build. |
| `.dockerignore` | Space-only Docker context pruning. |
| `.gitattributes` | Space repository metadata. |
| `.gitignore` | Space-only generated files, cache, and temporary run ignores. |
| `README.md` | Space maintenance notes only. Public project docs belong in the main repo. |
### Out Of Scope For The Space
These main-repo areas should not be copied into this Space unless the hosted app
explicitly starts using them:
| Main-repo path | Reason |
| --- | --- |
| `pyproject.toml`, `MANIFEST.in`, `researchharness/` | PyPI packaging belongs to the main open-source repo, not the hosted app mirror. |
| `.github/` | GitHub CI/release automation does not run in the Hugging Face Space repo. |
| `run_agent.py`, `run_server.py`, `run_frontend.py` | Local CLI/API/frontend entrypoints are not how the Space is launched. |
| `api/` | OpenAI-compatible API server is not part of the Space app. |
| `benchmarks/` | Benchmark adapters and benchmark docs belong to the main repo. |
| `docs/` | Long-form tutorials belong to the main repo. |
| `tests/` | Main local/CI tests belong to the main repo; Space keeps only focused smoke checks. |
| `.env.example` | Public environment template belongs to the main repo. |
| `agent_base/tools/README.md` | Tool documentation belongs to the main repo; Space keeps only runtime code. |
| `agent_base/prompts/plugins/` | Plugin prompt assets are not used by the hosted app unless a future Space feature explicitly needs them. |
| `workspace/`, `api_runs/`, `traces/` | Local placeholder/runtime directories are not checked into Space. |
| local benchmark helpers such as `benchmarks/**/local_*` | Local development helpers must not be deployed. |
Keeping these files out prevents stale code paths and misleading documentation
from accumulating in the Space.
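
The three tables above amount to a lookup from path to sync action. As an illustration only, the sketch below encodes a few of the listed paths as glob patterns; the `classify` helper, the category labels, and the shortened pattern lists are assumptions, not part of the Space.

```python
from fnmatch import fnmatchcase

# Illustrative patterns taken from the tables above; not an exhaustive list.
FULLY_SYNCED = ["agent_base/base.py", "agent_base/tools/*.py", "VERSION"]
PARTIALLY_SYNCED = ["agent_base/react_agent.py", "frontend/local_server.py", "requirements.txt"]
OUT_OF_SCOPE = ["api/*", "benchmarks/*", "docs/*", "tests/*", ".github/*", "agent_base/tools/README.md"]


def classify(path: str) -> str:
    """Map a main-repo relative path to the sync action the tables prescribe."""
    # Out-of-scope is checked first so a broad synced glob never pulls in excluded files.
    if any(fnmatchcase(path, pattern) for pattern in OUT_OF_SCOPE):
        return "out_of_scope"
    if any(fnmatchcase(path, pattern) for pattern in PARTIALLY_SYNCED):
        return "patch_only"
    if any(fnmatchcase(path, pattern) for pattern in FULLY_SYNCED):
        return "copy_and_diff"
    return "unlisted"
```

Note that `*` in `fnmatchcase` also matches `/`, so `api/*` covers nested files. Paths that come back as `unlisted` should be reviewed by hand before anything is copied.
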
## Space-Specific Runtime Behavior
These behaviors are intentional hosted-app deltas:
- Users cannot select arbitrary server folders. Each new chat gets an isolated
managed run directory under `RH_SPACE_RUNS_DIR`.
- The runtime layout is always:
`run_.../agent_workspace/` for agent-visible files and
`run_.../agent_trace/` for traces and `session_state_*.json`.
- Uploaded images are saved under `agent_workspace/inputs/images/` and are also
passed to the model as image inputs when supported.
- Users can download files created or handled by the agent with the
`Download workspace.zip` button. The zip contains only the current chat's
`agent_workspace/`; it does not include `agent_trace/`, server files, or
Space secrets.
- The frontend exposes a per-run model dropdown. Current options are `gpt-5.5`
and `claude-opus-4-8`; the selection must stay local to that run and must not
mutate global process environment variables.
- Completed runs are packaged for trajectory collection and submitted as pull
requests to the configured Hugging Face dataset after the batch threshold is
reached.
- Old inactive runs are cleaned periodically so the Space does not grow without
bound.
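
The cleanup rule in the last bullet combines an age limit with a count cap. A minimal sketch of that selection logic follows, assuming the directory modification time stands in for "last active" and that the caller tracks which runs are active. Both function names are hypothetical, and the real logic in `app.py` and `frontend/local_server.py` may differ.

```python
import shutil
import time
from pathlib import Path


def select_runs_to_delete(runs, now, retention_seconds=21600, max_runs=40):
    """Pick inactive runs to delete.

    `runs` holds (name, last_active_timestamp) pairs for inactive runs only.
    """
    newest_first = sorted(runs, key=lambda item: item[1], reverse=True)
    doomed = []
    for index, (name, last_active) in enumerate(newest_first):
        too_old = now - last_active > retention_seconds
        over_cap = index >= max_runs  # beyond the newest `max_runs` entries
        if too_old or over_cap:
            doomed.append(name)
    return doomed


def cleanup(runs_dir: Path, active: set[str], **limits) -> list[str]:
    """Delete expired run directories under runs_dir, skipping active ones."""
    candidates = [
        (p.name, p.stat().st_mtime)  # assumption: mtime approximates last activity
        for p in runs_dir.glob("run_*")
        if p.is_dir() and p.name not in active
    ]
    doomed = select_runs_to_delete(candidates, time.time(), **limits)
    for name in doomed:
        shutil.rmtree(runs_dir / name, ignore_errors=True)
    return doomed
```

Keeping the selection pure (no filesystem access) makes the policy easy to test separately from the deletion step.
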
## Required Secrets
Configure these as Hugging Face Space secrets before starting the app:
| Secret | Purpose |
| --- | --- |
| `API_KEY` | API key for your OpenAI-compatible LLM provider. |
| `API_BASE` | OpenAI-compatible `/v1` endpoint. |
| `MODEL_NAME` | Main model used by ResearchHarness. |
| `SERPER_KEY` | WebSearch / ScholarSearch key from <https://serper.dev/>. |
| `JINA_KEY` | WebFetch key from <https://jina.ai/>. |
| `MINERU_TOKEN` | ReadPDF key from <https://mineru.net/>. |
| `HF_TOKEN` | Hugging Face token with write access to `InternScience/ResearchHarness-Data`. |
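
A startup check can report unset secrets before the first request fails. The sketch below is an assumption about how such a check might look; `missing_secrets` is a hypothetical helper, and only the secret names come from the table above.

```python
import os

# Names taken from the table above.
REQUIRED_SECRETS = (
    "API_KEY",
    "API_BASE",
    "MODEL_NAME",
    "SERPER_KEY",
    "JINA_KEY",
    "MINERU_TOKEN",
    "HF_TOKEN",
)


def missing_secrets(env=None) -> list[str]:
    """Return the required secret names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SECRETS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_secrets()
    if missing:
        print("Missing Space secrets:", ", ".join(missing))
```

The helper reports names only, so secret values never reach the logs.
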
## Optional Runtime Variables
| Variable | Default | Meaning |
| --- | --- | --- |
| `RH_SPACE_RUNS_DIR` | `/tmp/researchharness_space/runs` | Parent directory for temporary per-chat runs. |
| `RH_SPACE_RETENTION_SECONDS` | `21600` | Delete inactive runs older than this many seconds (default: 6 hours). |
| `RH_SPACE_MAX_RUNS` | `40` | Keep at most this many inactive runs. |
| `RH_SPACE_CLEANUP_INTERVAL_SECONDS` | `900` | Background cleanup interval in seconds (default: 15 minutes). |
| `WEBFETCH_TIMEOUT_SECONDS` | `180` | Overall timeout for one WebFetch tool call. |
| `WEBFETCH_MAX_CHARS` | `30000` | Hard maximum characters returned by one URL-only WebFetch call. |
| `RH_COLLECTION_ENABLED` | `true` | Automatically collect completed hosted runs. |
| `RH_COLLECTION_DATASET_REPO` | `InternScience/ResearchHarness-Data` | Dataset repo that receives trajectory PRs. |
| `RH_COLLECTION_BATCH_SIZE` | `5` | Create one dataset PR after this many collected runs. |
| `RH_COLLECTION_MAX_BUNDLE_BYTES` | `20971520` | Drop a single run bundle if it exceeds this byte limit (default: 20 MiB). |
| `PORT` | `7860` | Port used by Hugging Face Docker Spaces. |
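
For reference, the defaults above could be read into one immutable settings object as sketched below. `SpaceSettings` and `_as_bool` are hypothetical names, the accepted boolean spellings are an assumption, and the WebFetch variables are omitted for brevity; the Space's actual parsing may differ.

```python
import os
from dataclasses import dataclass


def _as_bool(raw: str) -> bool:
    # Assumption: these spellings count as true; the real parser may differ.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class SpaceSettings:
    runs_dir: str
    retention_seconds: int
    max_runs: int
    cleanup_interval_seconds: int
    collection_enabled: bool
    collection_batch_size: int
    collection_max_bundle_bytes: int
    port: int

    @classmethod
    def from_env(cls, env=None) -> "SpaceSettings":
        """Build settings from a mapping, falling back to the documented defaults."""
        env = os.environ if env is None else env
        return cls(
            runs_dir=env.get("RH_SPACE_RUNS_DIR", "/tmp/researchharness_space/runs"),
            retention_seconds=int(env.get("RH_SPACE_RETENTION_SECONDS", "21600")),
            max_runs=int(env.get("RH_SPACE_MAX_RUNS", "40")),
            cleanup_interval_seconds=int(env.get("RH_SPACE_CLEANUP_INTERVAL_SECONDS", "900")),
            collection_enabled=_as_bool(env.get("RH_COLLECTION_ENABLED", "true")),
            collection_batch_size=int(env.get("RH_COLLECTION_BATCH_SIZE", "5")),
            collection_max_bundle_bytes=int(env.get("RH_COLLECTION_MAX_BUNDLE_BYTES", "20971520")),
            port=int(env.get("PORT", "7860")),
        )
```

Passing a plain dict as `env` keeps the parsing testable without touching the process environment.
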
## Runtime Layout
```text
/tmp/researchharness_space/runs/
└── run_YYYYMMDD_HHMMSS_<random>/
    ├── agent_workspace/
    │   └── inputs/images/        # user uploaded images, when present
    └── agent_trace/              # trace JSONL and session_state_*.json
```
The frontend exposes the chat UI and a single `Download workspace.zip` action
for the current chat. The workspace path is managed by the server so hosted
users cannot browse or select server folders.
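
The layout and the download action can be sketched together. This is an illustration, not the code in `frontend/local_server.py`: the function names, the UTC timestamp, and the 8-character random suffix are assumptions.

```python
import io
import secrets
import zipfile
from datetime import datetime, timezone
from pathlib import Path


def create_run_dir(runs_dir: Path) -> Path:
    """Create run_YYYYMMDD_HHMMSS_<random>/ with the two fixed subdirectories."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")  # assumption: UTC
    run_dir = runs_dir / f"run_{stamp}_{secrets.token_hex(4)}"
    (run_dir / "agent_workspace").mkdir(parents=True)
    (run_dir / "agent_trace").mkdir()
    return run_dir


def zip_workspace(run_dir: Path) -> bytes:
    """Zip only agent_workspace/; traces and anything outside it are never added."""
    workspace = run_dir / "agent_workspace"
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(workspace.rglob("*")):
            # Skip symlinks so a link cannot smuggle server files into the archive.
            if path.is_file() and not path.is_symlink():
                archive.write(path, path.relative_to(workspace).as_posix())
    return buffer.getvalue()
```

Archive names are relative to `agent_workspace/`, so the zip reveals nothing about the server-side run path.
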
## Trajectory Collection
Hosted mode automatically collects completed runs without exposing extra UI to users:
- Each completed run is zipped from `agent_workspace/` and `agent_trace/`.
- A `manifest.json` is included inside the zip, and a sidecar `.json` file is kept beside the pending zip.
- If a single bundle is larger than `RH_COLLECTION_MAX_BUNDLE_BYTES` (20 MiB by default), it is dropped immediately.
- Once `RH_COLLECTION_BATCH_SIZE` pending bundles exist, the Space creates a pull request in the configured Hugging Face dataset repo.
- After the dataset PR is created successfully, those local pending bundles are deleted.
- If upload fails, pending bundles are retained and `last_upload_error.json` is written under the local collection directory.
- No redaction is applied in this core hosted collector; keep the dataset private unless you intentionally want to publish the collected traces.
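
The local half of this flow (bundle, size check, batch threshold) can be sketched as follows. The function names, manifest fields, and file naming are hypothetical, and the dataset upload itself is left out; the hosted collector may organize this differently.

```python
import json
import zipfile
from pathlib import Path


def bundle_run(run_dir: Path, pending_dir: Path, max_bytes: int = 20 * 1024 * 1024):
    """Zip a completed run into pending_dir; return the zip path, or None if dropped."""
    pending_dir.mkdir(parents=True, exist_ok=True)
    files = [
        p
        for sub in ("agent_workspace", "agent_trace")
        for p in sorted((run_dir / sub).rglob("*"))
        if p.is_file()
    ]
    # Hypothetical manifest shape: run name plus the bundled relative paths.
    manifest = {"run": run_dir.name, "files": [p.relative_to(run_dir).as_posix() for p in files]}
    zip_path = pending_dir / f"{run_dir.name}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
        for p in files:
            archive.write(p, p.relative_to(run_dir).as_posix())
    if zip_path.stat().st_size > max_bytes:
        zip_path.unlink()  # oversized bundles are dropped immediately
        return None
    # Sidecar JSON kept beside the pending zip.
    zip_path.with_suffix(".json").write_text(json.dumps(manifest, indent=2))
    return zip_path


def ready_batch(pending_dir: Path, batch_size: int = 5) -> list[Path]:
    """Return one batch of pending zips once the threshold is reached, else an empty list."""
    pending = sorted(pending_dir.glob("*.zip"))
    return pending[:batch_size] if len(pending) >= batch_size else []
```

A caller would upload the files returned by `ready_batch` as one dataset pull request and delete them only after the upload succeeds, matching the retention rule above.
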
## Local Smoke Test
```bash
python app.py
```
Then open `http://127.0.0.1:7860`.
Before pushing Space changes, run at least:
```bash
python3 check_space_runtime.py
python3 -B - <<'PY'
from pathlib import Path
import py_compile
for path in Path(".").rglob("*.py"):
    if ".git" not in path.parts:
        py_compile.compile(str(path), doraise=True)
print("syntax ok")
PY
RH_COLLECTION_ENABLED=false python3 -B - <<'PY'
from fastapi.testclient import TestClient
import app
client = TestClient(app.app)
response = client.get("/")
assert response.status_code == 200
assert "ResearchHarness" in response.text
print("app ok")
PY
node --check frontend/static/app.js
git diff --check
```