---
title: Hugging Face Harbor Visualiser
emoji: 🤗
colorFrom: yellow
colorTo: red
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: Browse Harbor task specs from HF Hub, GitHub, or local
---

# 🤗 Hugging Face Harbor Visualiser

A FastAPI Space for browsing [Harbor](https://www.harborframework.com/) task spec directories, the dataset format Harbor uses for agent evaluation and RL environments.

Enter a Hugging Face dataset id, a GitHub repo, or a local Harbor dataset directory; the viewer renders each task's metadata, instruction, oracle patch, test script, and Dockerfile side by side. Large datasets (2k+ tasks) list and open quickly: task ids come from a shallow Hub listing and only the opened task's files are fetched, so nothing is bulk-downloaded.

## Use it

Open the Space and paste a dataset URI in the input box.

Prefill the input box with a URL parameter:
```
https://huggingface.co/spaces/AdithyaSK/harbor-visualiser?dataset=<owner>/<dataset>
```

**Inputs accepted:**

| Form | Source |
|---|---|
| `owner/name` | HF Hub dataset (default) |
| `hf://owner/name` | HF Hub (explicit) |
| `hf://owner/name@<rev>` | HF Hub revision pin |
| `gh://owner/repo` | GitHub repo |
| `gh://owner/repo@<ref>` | GitHub at branch / tag / SHA |
| `https://github.com/owner/repo` | Full GitHub URL |

## Run locally

```bash
pip install -r requirements.txt
uvicorn app:app --port 7860
# → http://127.0.0.1:7860
```

## What it shows per task

| Tab | Source file |
|---|---|
| Overview | parsed `task.toml` (`[task]`, `[metadata]`, plus `[metadata.repo2env]` if present) |
| Instruction | `instruction.md` |
| Patch (oracle) | `solution/patch.diff` |
| `test.sh` | `tests/test.sh` |
| Dockerfile | `environment/Dockerfile` |
| `solve.sh` | `solution/solve.sh` (when present) |
| Raw `task.toml` | full file |

## Dataset layout it expects (Harbor's standard)

Either of these:

```
# Layout A — flat (what Repo2RLEnv emits + most git repos use)
<dataset-root>/
├── <task-id>/
│   ├── task.toml
│   ├── instruction.md
│   ├── solution/
│   │   ├── patch.diff
│   │   └── solve.sh
│   ├── tests/test.sh
│   └── environment/Dockerfile
└── <task-id>/...

# Layout B — nested (what `repo2rlenv push` stages on the Hub)
<dataset-root>/
├── registry.json
├── README.md
└── tasks/
    └── <task-id>/
        └── ... (same as Layout A)
```
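For a local directory, telling the two layouts apart could be done along these lines. `find_task_dirs` is a hypothetical helper, and using `task.toml` as the marker of a task directory is an assumption based on the trees above.

```python
from pathlib import Path


def find_task_dirs(root: Path) -> list[Path]:
    """Return task directories for either layout, sorted by path."""
    nested = root / "tasks"
    # Layout B keeps tasks under <root>/tasks/; Layout A keeps them at the root.
    base = nested if nested.is_dir() else root
    return sorted(
        d for d in base.iterdir()
        if d.is_dir() and (d / "task.toml").is_file()
    )
```

Checking for `task.toml` filters out unrelated folders, so a stray `docs/` directory next to the tasks is not listed as a task.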

## Stack

- [FastAPI](https://fastapi.tiangolo.com/) + [uvicorn](https://www.uvicorn.org/) — server
- Vanilla-JS single-page UI (hash-routed) with a Hugging Face theme
- [huggingface_hub](https://github.com/huggingface/huggingface_hub) — Hub listing + per-task download
- `git` (system binary) — GitHub clone
- Python stdlib `tomllib` — task.toml parsing