---
title: ResearchHarness
emoji: 🚀
colorFrom: blue
colorTo: yellow
sdk: docker
app_port: 7860
pinned: false
license: mit
short_description: Lightweight harness for tool-using LLM agents.
---

ResearchHarness Space Maintenance Notes

This repository is the Hugging Face Docker Space deployment for ResearchHarness. It is an online app mirror, not the public open-source documentation and not a full source mirror.

The public project README, tutorials, benchmark notes, API server documentation, and local CLI documentation belong in the main GitHub repository. This Space README should stay focused on long-term deployment maintenance: what is copied from the main repo, what is intentionally changed for hosted use, and what is new in the Space.

Repository Relationship

| Repository | Role |
| --- | --- |
| github.com/InternScience/ResearchHarness | Main open-source runtime, CLI, API server, frontend, docs, tests, and benchmark adapters. |
| huggingface.co/spaces/InternScience/ResearchHarness | Hugging Face Space app that hosts the browser frontend with managed temporary workspaces. |
| huggingface.co/datasets/InternScience/ResearchHarness-Data | Hugging Face dataset receiving collected hosted-run trajectory PRs. |

Maintenance rules:

  • Copy only the runtime/frontend pieces needed by the hosted app.
  • Do not blindly sync the whole main repository into this Space.
  • Space-only deployment logic must not be copied back into the main repo unless it is genuinely general-purpose.
  • Public documentation should be updated in the main repo, not duplicated here.
  • Treat the tables below as the sync boundary. Fully synced files may be copied from the main repo and diff-checked. Partially synced files must be updated with targeted patches only; do not overwrite them with main-repo files.

Sync Policy

The Space should stay small and deployment-focused. When the main repository changes, sync only the files needed by the hosted browser app, then inspect the diff manually. Do not copy the whole main repository into this Space.

For partially synced files, a clean sync means the diff contains only the specific hunk needed for the current bug or feature. If a full-file copy creates large unrelated changes, restore the Space version from HEAD and reapply the minimal patch.

Fully Synced From The Main Repository

These files/directories should normally match the main repo exactly, unless a future Space-specific need is documented here:

| Path | Purpose |
| --- | --- |
| agent_base/base.py | Base agent interface. |
| agent_base/console_utils.py | Shared console/event formatting helpers. |
| agent_base/context_compact.py | Context compaction logic. |
| agent_base/model_profiles.py | Provider/model profile helpers. |
| agent_base/prompt.py | Base system prompt. |
| agent_base/prompts/system_base.md | Shared base prompt text. |
| agent_base/provider_compat.py | Provider compatibility normalization. |
| agent_base/session_state.py | Session state serialization. |
| agent_base/tools/*.py | Tool implementations exposed by the Space app. |
| agent_base/trace_utils.py | Trace writing utilities. |
| agent_base/utils.py | Shared runtime utilities, including default .env loading. |
| VERSION | Version marker shown by the app/runtime when needed. |
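
A drift check for the fully synced paths could look like the sketch below. It is not part of either repository: `find_drift` and the two checkout arguments are hypothetical names, and the path list is copied from the table above. It compares file contents byte for byte and reports anything missing or different.

```python
import filecmp
from pathlib import Path

# Paths listed as fully synced in the table above.
FULLY_SYNCED = [
    "agent_base/base.py",
    "agent_base/console_utils.py",
    "agent_base/context_compact.py",
    "agent_base/model_profiles.py",
    "agent_base/prompt.py",
    "agent_base/prompts/system_base.md",
    "agent_base/provider_compat.py",
    "agent_base/session_state.py",
    "agent_base/trace_utils.py",
    "agent_base/utils.py",
    "VERSION",
]
FULLY_SYNCED_GLOBS = ["agent_base/tools/*.py"]


def find_drift(main_repo: Path, space_repo: Path) -> list[str]:
    """Return fully synced paths that are missing on one side or differ in content."""
    relative_paths = set(FULLY_SYNCED)
    # Expand the glob in both checkouts so files present on only one side are reported.
    for pattern in FULLY_SYNCED_GLOBS:
        for root in (main_repo, space_repo):
            relative_paths.update(p.relative_to(root).as_posix() for p in root.glob(pattern))

    drifted = []
    for rel in sorted(relative_paths):
        main_file, space_file = main_repo / rel, space_repo / rel
        if not (main_file.is_file() and space_file.is_file()):
            drifted.append(rel)
        elif not filecmp.cmp(main_file, space_file, shallow=False):
            drifted.append(rel)
    return drifted
```

Calling `find_drift(Path("../ResearchHarness"), Path("."))` from the Space checkout (the main-repo location is an assumption) returns an empty list when the sync boundary holds. A tool file that exists only in the main repo is also reported, so the result still needs a manual look before copying anything.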

Partially Synced And Space-Modified

These files are related to main-repo files, but must be merged manually because the hosted Space has different deployment semantics:

| Path | Maintenance rule |
| --- | --- |
| agent_base/react_agent.py | Keep core ReAct/runtime behavior aligned with main. Preserve Space compatibility only when it is genuinely required by the hosted app. |
| frontend/local_server.py | Based on the main local frontend server, but Space-modified for managed temporary workspaces, forced agent_workspace/ + agent_trace/ layout, workspace zip download, automatic cleanup, trajectory collection hooks, and no arbitrary server-folder picker semantics. Never overwrite this file blindly from main. |
| frontend/static/index.html | Starts from the main frontend HTML, but removes the local workspace picker and adds hosted workspace download UI. |
| frontend/static/app.js | Starts from the main frontend client, but removes local folder selection and adds download-token / workspace-zip handling. |
| frontend/static/app.css | Starts from the main frontend CSS, but includes Space-only hosted workspace/download styling and omits local folder picker modal styles. |
| requirements.txt | Starts from the main runtime dependencies, but keeps Space-only hosted dependencies such as huggingface_hub and uvicorn[standard]. |
| app.py | Space-only FastAPI/Hugging Face entrypoint. It owns startup, cleanup scheduling, static mounting, and hosted defaults. |
| check_space_runtime.py | Space-only smoke test for deployment import/runtime sanity. |
| Dockerfile | Space-only Docker build. |
| .dockerignore | Space-only Docker context pruning. |
| .gitattributes | Space repository metadata. |
| .gitignore | Space-only generated files, cache, and temporary run ignores. |
| README.md | Space maintenance notes only. Public project docs belong in the main repo. |

Out Of Scope For The Space

These main-repo areas should not be copied into this Space unless the hosted app explicitly starts using them:

| Main-repo path | Reason |
| --- | --- |
| pyproject.toml, MANIFEST.in, researchharness/ | PyPI packaging belongs to the main open-source repo, not the hosted app mirror. |
| .github/ | GitHub CI/release automation does not run in the Hugging Face Space repo. |
| run_agent.py, run_server.py, run_frontend.py | Local CLI/API/frontend entrypoints are not how the Space is launched. |
| api/ | OpenAI-compatible API server is not part of the Space app. |
| benchmarks/ | Benchmark adapters and benchmark docs belong to the main repo. |
| docs/ | Long-form tutorials belong to the main repo. |
| tests/ | Main local/CI tests belong to the main repo; Space keeps only focused smoke checks. |
| .env.example | Public environment template belongs to the main repo. |
| agent_base/tools/README.md | Tool documentation belongs to the main repo; Space keeps only runtime code. |
| agent_base/prompts/plugins/ | Plugin prompt assets are not used by the hosted app unless a future Space feature explicitly needs them. |
| workspace/, api_runs/, traces/ | Local placeholder/runtime directories are not checked into Space. |
| local benchmark helpers such as benchmarks/**/local_* | Local development helpers must not be deployed. |

Keeping these files out prevents stale code paths and misleading documentation from accumulating in the Space.
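
One way to catch an accidental full-repo sync is to scan the Space checkout for the out-of-scope paths listed above. The sketch below is illustrative only: `find_out_of_scope` is a hypothetical helper, and the list mirrors the table except for the `benchmarks/**/local_*` pattern, which is already covered by `benchmarks`.

```python
from pathlib import Path

# Main-repo paths that should not appear in the Space, per the table above.
OUT_OF_SCOPE = [
    "pyproject.toml", "MANIFEST.in", "researchharness",
    ".github",
    "run_agent.py", "run_server.py", "run_frontend.py",
    "api", "benchmarks", "docs", "tests",
    ".env.example",
    "agent_base/tools/README.md",
    "agent_base/prompts/plugins",
    "workspace", "api_runs", "traces",
]


def find_out_of_scope(space_repo: Path) -> list[str]:
    """Return the out-of-scope paths that exist in the Space checkout."""
    return [rel for rel in OUT_OF_SCOPE if (space_repo / rel).exists()]
```

An empty result from `find_out_of_scope(Path("."))` means none of the listed paths were copied in.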

Space-Specific Runtime Behavior

These behaviors are intentional hosted-app deltas:

  • Users cannot select arbitrary server folders. Each new chat gets an isolated managed run directory under RH_SPACE_RUNS_DIR.
  • The runtime layout is always: run_.../agent_workspace/ for agent-visible files and run_.../agent_trace/ for traces and session_state_*.json.
  • Uploaded images are saved under agent_workspace/inputs/images/ and are also passed to the model as image inputs when supported.
  • Users can download files created or handled by the agent with the Download workspace.zip button. The zip contains only the current chat's agent_workspace/; it does not include agent_trace/, server files, or Space secrets.
  • The frontend exposes a per-run model dropdown. Current options are gpt-5.5 and claude-opus-4-8; the selection must stay local to that run and must not mutate global process environment variables.
  • Completed runs are packaged for trajectory collection and submitted as pull requests to the configured Hugging Face dataset after the batch threshold is reached.
  • Old inactive runs are cleaned periodically so the Space does not grow without bound.
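
The workspace-only download described in the list above can be sketched as follows. This is a minimal illustration, not the code in frontend/local_server.py: `zip_workspace` is a hypothetical name, and storing members relative to agent_workspace/ (without a top-level folder) is an assumption.

```python
import zipfile
from pathlib import Path


def zip_workspace(run_dir: Path, zip_path: Path) -> list[str]:
    """Zip only run_dir/agent_workspace and return the archived member names.

    zip_path should live outside agent_workspace/ so the archive does not include itself.
    """
    workspace = (run_dir / "agent_workspace").resolve()
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(workspace.rglob("*")):
            # Skip directories and symlinks; only regular files inside the workspace are archived.
            if path.is_symlink() or not path.is_file():
                continue
            arcname = path.relative_to(workspace).as_posix()
            archive.write(path, arcname)
            names.append(arcname)
    return names
```

Because the walk starts at agent_workspace/ and never visits its parent, agent_trace/ and anything else in the run directory cannot end up in the archive.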

Required Secrets

Configure these as Hugging Face Space secrets before starting the app:

| Secret | Purpose |
| --- | --- |
| API_KEY | API key for your OpenAI-compatible LLM provider. |
| API_BASE | OpenAI-compatible /v1 endpoint. |
| MODEL_NAME | Main model used by ResearchHarness. |
| SERPER_KEY | WebSearch / ScholarSearch key from https://serper.dev/. |
| JINA_KEY | WebFetch key from https://jina.ai/. |
| MINERU_TOKEN | ReadPDF key from https://mineru.net/. |
| HF_TOKEN | Hugging Face token with write access to InternScience/ResearchHarness-Data. |
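
A startup check for these secrets might look like the sketch below. `missing_secrets` is a hypothetical helper, not something app.py is known to contain; Hugging Face exposes Space secrets to the container as environment variables, which is what it reads.

```python
import os

# Secret names from the table above.
REQUIRED_SECRETS = (
    "API_KEY",
    "API_BASE",
    "MODEL_NAME",
    "SERPER_KEY",
    "JINA_KEY",
    "MINERU_TOKEN",
    "HF_TOKEN",
)


def missing_secrets(env=None) -> list[str]:
    """Return required secret names that are unset or blank in the given environment."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SECRETS if not str(env.get(name, "")).strip()]
```

Logging the result of `missing_secrets()` at startup, names only and never values, gives a quick signal when a secret was not configured.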

Optional Runtime Variables

| Variable | Default | Meaning |
| --- | --- | --- |
| RH_SPACE_RUNS_DIR | /tmp/researchharness_space/runs | Parent directory for temporary per-chat runs. |
| RH_SPACE_RETENTION_SECONDS | 21600 | Delete inactive runs older than this many seconds. |
| RH_SPACE_MAX_RUNS | 40 | Keep at most this many inactive runs. |
| RH_SPACE_CLEANUP_INTERVAL_SECONDS | 900 | Background cleanup interval. |
| WEBFETCH_TIMEOUT_SECONDS | 180 | Overall timeout for one WebFetch tool call. |
| WEBFETCH_MAX_CHARS | 30000 | Hard maximum characters returned by one URL-only WebFetch call. |
| RH_COLLECTION_ENABLED | true | Automatically collect completed hosted runs. |
| RH_COLLECTION_DATASET_REPO | InternScience/ResearchHarness-Data | Dataset repo that receives trajectory PRs. |
| RH_COLLECTION_BATCH_SIZE | 5 | Create one dataset PR after this many collected runs. |
| RH_COLLECTION_MAX_BUNDLE_BYTES | 20971520 | Drop a single run bundle if it exceeds this byte limit. |
| PORT | 7860 | Port used by Hugging Face Docker Spaces. |
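
To make the interaction between RH_SPACE_RETENTION_SECONDS and RH_SPACE_MAX_RUNS concrete, here is a minimal sketch of one cleanup pass. It rests on assumptions that should be checked against the real server: every `run_*` directory is treated as inactive, directory mtime stands in for last activity, and the function names are hypothetical.

```python
import os
import shutil
import time
from pathlib import Path


def env_int(name, default, env=None):
    """Read an integer variable, falling back to the default when unset or blank."""
    env = os.environ if env is None else env
    raw = str(env.get(name, "")).strip()
    return int(raw) if raw else default


def select_runs_to_delete(runs, now, retention_seconds, max_runs):
    """Pick runs to delete from (path, last_active_epoch) pairs of inactive runs.

    A run is deleted when it is older than the retention window, or when it
    falls outside the newest max_runs entries.
    """
    newest_first = sorted(runs, key=lambda item: item[1], reverse=True)
    doomed = []
    for index, (path, last_active) in enumerate(newest_first):
        if now - last_active > retention_seconds or index >= max_runs:
            doomed.append(path)
    return doomed


def cleanup_once(runs_dir, env=None, now=None):
    """Run a single cleanup pass over runs_dir and return the deleted paths."""
    now = time.time() if now is None else now
    retention = env_int("RH_SPACE_RETENTION_SECONDS", 21600, env)  # 6 hours
    max_runs = env_int("RH_SPACE_MAX_RUNS", 40, env)
    # Assumption: directory mtime approximates the last activity of a run.
    runs = [(p, p.stat().st_mtime) for p in Path(runs_dir).glob("run_*") if p.is_dir()]
    doomed = select_runs_to_delete(runs, now, retention, max_runs)
    for path in doomed:
        shutil.rmtree(path, ignore_errors=True)
    return doomed
```

With the defaults, a run idle for more than six hours is removed regardless of count, and the 41st-newest inactive run is removed regardless of age. A background task would call `cleanup_once` every RH_SPACE_CLEANUP_INTERVAL_SECONDS (900 seconds, or 15 minutes, by default).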

Runtime Layout

/tmp/researchharness_space/runs/
└── run_YYYYMMDD_HHMMSS_<random>/
    ├── agent_workspace/
    │   └── inputs/images/        # user uploaded images, when present
    └── agent_trace/              # trace JSONL and session_state_*.json

The frontend exposes the chat UI and a single Download workspace.zip action for the current chat. The workspace path is managed by the server so hosted users cannot browse or select server folders.
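
A server-managed run directory with this layout could be created along the lines of the sketch below. The helper names, the UTC timestamp, and the 8-character hex suffix are assumptions made for illustration; the README only fixes the `run_YYYYMMDD_HHMMSS_<random>` pattern and the two subdirectories.

```python
import secrets
from datetime import datetime, timezone
from pathlib import Path


def create_run_dir(runs_dir: Path) -> Path:
    """Create run_YYYYMMDD_HHMMSS_<random>/ with agent_workspace/ and agent_trace/."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    run_dir = runs_dir / f"run_{stamp}_{secrets.token_hex(4)}"
    (run_dir / "agent_workspace").mkdir(parents=True)
    (run_dir / "agent_trace").mkdir()
    return run_dir


def resolve_in_workspace(run_dir: Path, relative: str) -> Path:
    """Resolve a client-supplied path and refuse anything outside agent_workspace/."""
    workspace = (run_dir / "agent_workspace").resolve()
    candidate = (workspace / relative).resolve()
    if candidate != workspace and workspace not in candidate.parents:
        raise ValueError(f"path escapes the workspace: {relative!r}")
    return candidate
```

Resolving every client-supplied path against the managed workspace, rather than trusting it, is one way to enforce that hosted users cannot browse or select server folders.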

Trajectory Collection

Hosted mode automatically collects completed runs without exposing extra UI to users:

  • Each completed run is zipped from agent_workspace/ and agent_trace/.
  • A manifest.json is included inside the zip, and a sidecar .json file is kept beside the pending zip.
  • If a single bundle is larger than RH_COLLECTION_MAX_BUNDLE_BYTES (20971520 bytes, i.e. 20 MiB, by default), it is dropped immediately.
  • Once RH_COLLECTION_BATCH_SIZE pending bundles exist, the Space creates a pull request in the configured Hugging Face dataset repo.
  • After the dataset PR is created successfully, those local pending bundles are deleted.
  • If upload fails, pending bundles are retained and last_upload_error.json is written under the local collection directory.
  • No redaction is applied in this core hosted collector; keep the dataset private unless you intentionally want to publish the collected traces.
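
The bundling and batching steps above can be sketched as follows. This is an illustration under stated assumptions, not the Space's collector: the function names, the manifest fields (`run_id`, `files`), and the pending-directory naming are all hypothetical.

```python
import json
import zipfile
from pathlib import Path


def bundle_run(run_dir: Path, pending_dir: Path, max_bundle_bytes: int = 20 * 1024 * 1024):
    """Zip agent_workspace/ and agent_trace/ with a manifest.

    Returns the zip path, or None when the bundle exceeds the size limit and is dropped.
    """
    pending_dir.mkdir(parents=True, exist_ok=True)
    files = [
        path
        for sub in ("agent_workspace", "agent_trace")
        for path in sorted((run_dir / sub).rglob("*"))
        if path.is_file()
    ]
    # Hypothetical manifest shape; the real fields are not documented here.
    manifest = {
        "run_id": run_dir.name,
        "files": [p.relative_to(run_dir).as_posix() for p in files],
    }
    zip_path = pending_dir / f"{run_dir.name}.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
        for path in files:
            archive.write(path, path.relative_to(run_dir).as_posix())

    if zip_path.stat().st_size > max_bundle_bytes:
        zip_path.unlink()  # oversized bundles are dropped immediately
        return None
    # Sidecar .json kept beside the pending zip.
    zip_path.with_suffix(".json").write_text(json.dumps(manifest, indent=2))
    return zip_path


def batch_ready(pending_dir: Path, batch_size: int = 5) -> list[Path]:
    """Return the next batch of pending zips once the threshold is reached, else an empty list."""
    pending = sorted(pending_dir.glob("*.zip"))
    return pending[:batch_size] if len(pending) >= batch_size else []
```

For the upload step, `huggingface_hub` provides `HfApi.upload_folder`, which accepts `repo_type="dataset"` and `create_pr=True`; whether the Space uses that call or another one is not stated in this README. Under this sketch, the pending zips and sidecars would be deleted only after the PR is created successfully, matching the retention rule above.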

Local Smoke Test

python app.py

Then open http://127.0.0.1:7860.

Before pushing Space changes, run at least:

python3 check_space_runtime.py

python3 -B - <<'PY'
from pathlib import Path
import py_compile

for path in Path(".").rglob("*.py"):
    if ".git" not in path.parts:
        py_compile.compile(str(path), doraise=True)
print("syntax ok")
PY

RH_COLLECTION_ENABLED=false python3 -B - <<'PY'
from fastapi.testclient import TestClient
import app

client = TestClient(app.app)
response = client.get("/")
assert response.status_code == 200
assert "ResearchHarness" in response.text
print("app ok")
PY

node --check frontend/static/app.js
git diff --check