---
title: ResearchHarness
emoji: π
colorFrom: blue
colorTo: yellow
sdk: docker
app_port: 7860
pinned: false
license: mit
short_description: Lightweight harness for tool-using LLM agents.
---
ResearchHarness Space Maintenance Notes
This repository is the Hugging Face Docker Space deployment for ResearchHarness. It is an online app mirror, not the public open-source documentation and not a full source mirror.
The public project README, tutorials, benchmark notes, API server documentation, and local CLI documentation belong in the main GitHub repository. This Space README should stay focused on long-term deployment maintenance: what is copied from the main repo, what is intentionally changed for hosted use, and what is new in the Space.
Repository Relationship
| Repository | Role |
|---|---|
| github.com/InternScience/ResearchHarness | Main open-source runtime, CLI, API server, frontend, docs, tests, and benchmark adapters. |
| huggingface.co/spaces/InternScience/ResearchHarness | Hugging Face Space app that hosts the browser frontend with managed temporary workspaces. |
| huggingface.co/datasets/InternScience/ResearchHarness-Data | Hugging Face dataset receiving collected hosted-run trajectory PRs. |
Maintenance rules:
- Copy only the runtime/frontend pieces needed by the hosted app.
- Do not blindly sync the whole main repository into this Space.
- Space-only deployment logic must not be copied back into the main repo unless it is genuinely general-purpose.
- Public documentation should be updated in the main repo, not duplicated here.
- Treat the tables below as the sync boundary. Fully synced files may be copied from the main repo and diff-checked. Partially synced files must be updated with targeted patches only; do not overwrite them with main-repo files.
Sync Policy
The Space should stay small and deployment-focused. When the main repository changes, sync only the files needed by the hosted browser app, then inspect the diff manually. Do not copy the whole main repository into this Space.
For partially synced files, a clean sync means the diff contains only the
specific hunk needed for the current bug or feature. If a full-file copy creates
large unrelated changes, restore the Space version from HEAD and reapply the
minimal patch.
Fully Synced From The Main Repository
These files/directories should normally match the main repo exactly, unless a future Space-specific need is documented here:
| Path | Purpose |
|---|---|
| `agent_base/base.py` | Base agent interface. |
| `agent_base/console_utils.py` | Shared console/event formatting helpers. |
| `agent_base/context_compact.py` | Context compaction logic. |
| `agent_base/model_profiles.py` | Provider/model profile helpers. |
| `agent_base/prompt.py` | Base system prompt. |
| `agent_base/prompts/system_base.md` | Shared base prompt text. |
| `agent_base/provider_compat.py` | Provider compatibility normalization. |
| `agent_base/session_state.py` | Session state serialization. |
| `agent_base/tools/*.py` | Tool implementations exposed by the Space app. |
| `agent_base/trace_utils.py` | Trace writing utilities. |
| `agent_base/utils.py` | Shared runtime utilities, including default `.env` loading. |
| `VERSION` | Version marker shown by the app/runtime when needed. |
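A minimal sketch of how the "should match the main repo exactly" rule could be checked after a sync, assuming both repositories are checked out locally. The function name `find_sync_drift` and the shortened `FULLY_SYNCED` list are hypothetical and are not part of the Space; extend the list to match the table above.

```python
import filecmp
from pathlib import Path

# Hypothetical subset of the fully synced paths; extend it to match the table.
FULLY_SYNCED = (
    "agent_base/base.py",
    "agent_base/utils.py",
    "VERSION",
)


def find_sync_drift(main_repo: Path, space_repo: Path, paths=FULLY_SYNCED) -> list[str]:
    """Return relative paths that are missing or differ between the two checkouts."""
    drifted = []
    for rel in paths:
        src, dst = main_repo / rel, space_repo / rel
        if not src.is_file() or not dst.is_file():
            # A file missing on either side counts as drift.
            drifted.append(rel)
        elif not filecmp.cmp(src, dst, shallow=False):
            # shallow=False forces a byte-by-byte comparison.
            drifted.append(rel)
    return drifted
```

An empty result means the listed files are byte-identical in both checkouts. Any reported path still needs a manual diff review before it is copied.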
Partially Synced And Space-Modified
These files are related to main-repo files, but must be merged manually because the hosted Space has different deployment semantics:
| Path | Maintenance rule |
|---|---|
| `agent_base/react_agent.py` | Keep core ReAct/runtime behavior aligned with main. Preserve Space compatibility only when it is genuinely required by the hosted app. |
| `frontend/local_server.py` | Based on the main local frontend server, but Space-modified for managed temporary workspaces, forced `agent_workspace/` + `agent_trace/` layout, workspace zip download, automatic cleanup, trajectory collection hooks, and no arbitrary server-folder picker semantics. Never overwrite this file blindly from main. |
| `frontend/static/index.html` | Starts from the main frontend HTML, but removes the local workspace picker and adds hosted workspace download UI. |
| `frontend/static/app.js` | Starts from the main frontend client, but removes local folder selection and adds download-token / workspace-zip handling. |
| `frontend/static/app.css` | Starts from the main frontend CSS, but includes Space-only hosted workspace/download styling and omits local folder picker modal styles. |
| `requirements.txt` | Starts from the main runtime dependencies, but keeps Space-only hosted dependencies such as `huggingface_hub` and `uvicorn[standard]`. |
| `app.py` | Space-only FastAPI/Hugging Face entrypoint. It owns startup, cleanup scheduling, static mounting, and hosted defaults. |
| `check_space_runtime.py` | Space-only smoke test for deployment import/runtime sanity. |
| `Dockerfile` | Space-only Docker build. |
| `.dockerignore` | Space-only Docker context pruning. |
| `.gitattributes` | Space repository metadata. |
| `.gitignore` | Space-only generated files, cache, and temporary run ignores. |
| `README.md` | Space maintenance notes only. Public project docs belong in the main repo. |
Out Of Scope For The Space
These main-repo areas should not be copied into this Space unless the hosted app explicitly starts using them:
| Main-repo path | Reason |
|---|---|
| `pyproject.toml`, `MANIFEST.in`, `researchharness/` | PyPI packaging belongs to the main open-source repo, not the hosted app mirror. |
| `.github/` | GitHub CI/release automation does not run in the Hugging Face Space repo. |
| `run_agent.py`, `run_server.py`, `run_frontend.py` | Local CLI/API/frontend entrypoints are not how the Space is launched. |
| `api/` | OpenAI-compatible API server is not part of the Space app. |
| `benchmarks/` | Benchmark adapters and benchmark docs belong to the main repo. |
| `docs/` | Long-form tutorials belong to the main repo. |
| `tests/` | Main local/CI tests belong to the main repo; Space keeps only focused smoke checks. |
| `.env.example` | Public environment template belongs to the main repo. |
| `agent_base/tools/README.md` | Tool documentation belongs to the main repo; Space keeps only runtime code. |
| `agent_base/prompts/plugins/` | Plugin prompt assets are not used by the hosted app unless a future Space feature explicitly needs them. |
| `workspace/`, `api_runs/`, `traces/` | Local placeholder/runtime directories are not checked into Space. |
| Local benchmark helpers such as `benchmarks/**/local_*` | Local development helpers must not be deployed. |
Keeping these files out prevents stale code paths and misleading documentation from accumulating in the Space.
Space-Specific Runtime Behavior
These behaviors are intentional hosted-app deltas:
- Users cannot select arbitrary server folders. Each new chat gets an isolated managed run directory under `RH_SPACE_RUNS_DIR`.
- The runtime layout is always: `run_.../agent_workspace/` for agent-visible files and `run_.../agent_trace/` for traces and `session_state_*.json`.
- Uploaded images are saved under `agent_workspace/inputs/images/` and are also passed to the model as image inputs when supported.
- Users can download files created or handled by the agent with the `Download workspace.zip` button. The zip contains only the current chat's `agent_workspace/`; it does not include `agent_trace/`, server files, or Space secrets.
- The frontend exposes a per-run model dropdown. Current options are `gpt-5.5` and `claude-opus-4-8`; the selection must stay local to that run and must not mutate global process environment variables.
- Completed runs are packaged for trajectory collection and submitted as pull requests to the configured Hugging Face dataset after the batch threshold is reached.
- Old inactive runs are cleaned periodically so the Space does not grow without bound.
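The managed layout and the workspace-only download described above can be sketched as follows. This is an illustration under assumptions, not the code in `frontend/local_server.py`: the helper names `create_run_dir` and `zip_workspace`, the UTC timestamp, and the 8-character hex suffix are all hypothetical.

```python
import secrets
import zipfile
from datetime import datetime, timezone
from pathlib import Path


def create_run_dir(runs_root: Path) -> Path:
    """Create run_YYYYMMDD_HHMMSS_<random>/ with the two fixed subdirectories."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")  # UTC is an assumption
    run_dir = runs_root / f"run_{stamp}_{secrets.token_hex(4)}"
    (run_dir / "agent_workspace").mkdir(parents=True)
    (run_dir / "agent_trace").mkdir()
    return run_dir


def zip_workspace(run_dir: Path, zip_path: Path) -> Path:
    """Zip only agent_workspace/; traces and other server files are never added."""
    workspace = run_dir / "agent_workspace"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(workspace.rglob("*")):
            if path.is_file():
                # Archive names are relative to the workspace root.
                zf.write(path, path.relative_to(workspace).as_posix())
    return zip_path
```

Because the archive is built by walking `agent_workspace/` only, `agent_trace/` cannot end up in the download by accident. In this sketch `zip_path` should point outside the workspace so the archive does not include itself.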
Required Secrets
Configure these as Hugging Face Space secrets before starting the app:
| Secret | Purpose |
|---|---|
| `API_KEY` | API key for your OpenAI-compatible LLM provider. |
| `API_BASE` | OpenAI-compatible `/v1` endpoint. |
| `MODEL_NAME` | Main model used by ResearchHarness. |
| `SERPER_KEY` | WebSearch / ScholarSearch key from https://serper.dev/. |
| `JINA_KEY` | WebFetch key from https://jina.ai/. |
| `MINERU_TOKEN` | ReadPDF key from https://mineru.net/. |
| `HF_TOKEN` | Hugging Face token with write access to `InternScience/ResearchHarness-Data`. |
Optional Runtime Variables
| Variable | Default | Meaning |
|---|---|---|
| `RH_SPACE_RUNS_DIR` | `/tmp/researchharness_space/runs` | Parent directory for temporary per-chat runs. |
| `RH_SPACE_RETENTION_SECONDS` | `21600` | Delete inactive runs older than this many seconds. |
| `RH_SPACE_MAX_RUNS` | `40` | Keep at most this many inactive runs. |
| `RH_SPACE_CLEANUP_INTERVAL_SECONDS` | `900` | Background cleanup interval. |
| `WEBFETCH_TIMEOUT_SECONDS` | `180` | Overall timeout for one WebFetch tool call. |
| `WEBFETCH_MAX_CHARS` | `30000` | Hard maximum characters returned by one URL-only WebFetch call. |
| `RH_COLLECTION_ENABLED` | `true` | Automatically collect completed hosted runs. |
| `RH_COLLECTION_DATASET_REPO` | `InternScience/ResearchHarness-Data` | Dataset repo that receives trajectory PRs. |
| `RH_COLLECTION_BATCH_SIZE` | `5` | Create one dataset PR after this many collected runs. |
| `RH_COLLECTION_MAX_BUNDLE_BYTES` | `20971520` | Drop a single run bundle if it exceeds this byte limit. |
| `PORT` | `7860` | Port used by Hugging Face Docker Spaces. |
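As a rough sketch of how the two cleanup limits might combine, the snippet below reads an integer variable with a default and then picks inactive runs to delete. The helper names `env_int` and `select_runs_to_delete`, the use of directory modification time as the "age" of a run, and the `active` set of run names are assumptions for illustration; the Space's own cleanup may track activity differently.

```python
import os
import time
from pathlib import Path


def env_int(name: str, default: int) -> int:
    """Read an integer environment variable, falling back to the default."""
    raw = os.environ.get(name, "").strip()
    try:
        return int(raw) if raw else default
    except ValueError:
        return default


def select_runs_to_delete(runs_root: Path, retention_seconds: int, max_runs: int,
                          active=frozenset(), now=None) -> list[Path]:
    """Pick inactive runs that are too old or exceed the max_runs budget."""
    now = time.time() if now is None else now
    inactive = sorted(
        (p for p in runs_root.iterdir() if p.is_dir() and p.name not in active),
        key=lambda p: p.stat().st_mtime,
        reverse=True,  # newest first, so the newest runs are the ones kept
    )
    doomed, kept = [], 0
    for run in inactive:
        age = now - run.stat().st_mtime
        if age > retention_seconds or kept >= max_runs:
            doomed.append(run)
        else:
            kept += 1
    return doomed
```

A caller could pass `env_int("RH_SPACE_RETENTION_SECONDS", 21600)` and `env_int("RH_SPACE_MAX_RUNS", 40)` and remove the returned directories with `shutil.rmtree` every `RH_SPACE_CLEANUP_INTERVAL_SECONDS`.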
Runtime Layout
/tmp/researchharness_space/runs/
└── run_YYYYMMDD_HHMMSS_<random>/
    ├── agent_workspace/
    │   └── inputs/images/   # user-uploaded images, when present
    └── agent_trace/         # trace JSONL and session_state_*.json
The frontend exposes the chat UI and a single Download workspace.zip action
for the current chat. The workspace path is managed by the server so hosted
users cannot browse or select server folders.
Trajectory Collection
Hosted mode automatically collects completed runs without exposing extra UI to users:
- Each completed run is zipped from `agent_workspace/` and `agent_trace/`.
- A `manifest.json` is included inside the zip, and a sidecar `.json` file is kept beside the pending zip.
- If a single bundle is larger than `RH_COLLECTION_MAX_BUNDLE_BYTES` (20 MiB by default), it is dropped immediately.
- Once `RH_COLLECTION_BATCH_SIZE` pending bundles exist, the Space creates a pull request in the configured Hugging Face dataset repo.
- After the dataset PR is created successfully, those local pending bundles are deleted.
- If upload fails, pending bundles are retained and `last_upload_error.json` is written under the local collection directory.
- No redaction is applied in this core hosted collector; keep the dataset private unless you intentionally want to publish the collected traces.
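The drop, batch, and retry rules above can be modeled in a few lines. This is a simplified in-memory sketch, not the hosted collector: the `PendingQueue` class and its method names are hypothetical, the `upload` callable stands in for whatever creates the dataset PR, and uploading all pending bundles at once is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class PendingQueue:
    batch_size: int = 5                         # RH_COLLECTION_BATCH_SIZE default
    max_bundle_bytes: int = 20 * 1024 * 1024    # RH_COLLECTION_MAX_BUNDLE_BYTES default
    pending: list = field(default_factory=list)

    def offer(self, name: str, size_bytes: int) -> bool:
        """Queue a bundle, or drop it immediately if it exceeds the byte limit."""
        if size_bytes > self.max_bundle_bytes:
            return False
        self.pending.append(name)
        return True

    def flush_if_ready(self, upload) -> bool:
        """Upload once the batch threshold is reached; keep bundles on failure."""
        if len(self.pending) < self.batch_size:
            return False
        batch = list(self.pending)
        try:
            upload(batch)
        except Exception:
            # Bundles stay pending so a later attempt can retry them.
            return False
        # Delete local pending bundles only after a successful upload.
        del self.pending[: len(batch)]
        return True
```

Keeping the delete step after the upload call is what makes a failed upload safe: nothing is removed until the PR has been created.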
Local Smoke Test
python app.py
Then open http://127.0.0.1:7860.
Before pushing Space changes, run at least:
python3 check_space_runtime.py
python3 -B - <<'PY'
from pathlib import Path
import py_compile
for path in Path(".").rglob("*.py"):
    if ".git" not in path.parts:
        py_compile.compile(str(path), doraise=True)
print("syntax ok")
PY
RH_COLLECTION_ENABLED=false python3 -B - <<'PY'
from fastapi.testclient import TestClient
import app
client = TestClient(app.app)
response = client.get("/")
assert response.status_code == 200
assert "ResearchHarness" in response.text
print("app ok")
PY
node --check frontend/static/app.js
git diff --check