---
title: Hugging Face Harbor Visualiser
emoji: 🤗
colorFrom: yellow
colorTo: red
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
short_description: Browse Harbor task specs from HF Hub, GitHub, or local
---

# 🤗 Hugging Face Harbor Visualiser

A FastAPI Space for browsing [Harbor](https://www.harborframework.com/) task spec directories, the dataset format Harbor uses for agent evaluation and RL environments.

Drop in a Hugging Face dataset id, a GitHub repo, or a local Harbor dataset directory; the viewer renders every task's metadata, instruction, oracle patch, test script, and Dockerfile side by side. Large datasets (2k+ tasks) list and open quickly: task ids come from a shallow Hub listing and only the opened task's files are fetched, so nothing is bulk-downloaded.
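The lazy-loading idea can be sketched roughly as follows. This is not the Space's actual code: the function names `task_ids_from_folders`, `list_task_ids`, and `fetch_task_file` are hypothetical, and the sketch assumes a recent `huggingface_hub` that provides `HfApi.list_repo_tree`.

```python
from pathlib import PurePosixPath


def task_ids_from_folders(folder_paths, tasks_prefix=""):
    """Return sorted task ids: folder names sitting directly under tasks_prefix."""
    prefix = PurePosixPath(tasks_prefix) if tasks_prefix else PurePosixPath(".")
    ids = []
    for raw in folder_paths:
        path = PurePosixPath(raw)
        # Keep only direct children of the prefix; skip hidden folders.
        if path.parent == prefix and not path.name.startswith("."):
            ids.append(path.name)
    return sorted(ids)


def list_task_ids(repo_id, revision=None, tasks_prefix=""):
    """Shallow (non-recursive) Hub listing; no file contents are downloaded."""
    # Imported lazily so the pure helper above works without the dependency.
    from huggingface_hub import HfApi
    from huggingface_hub.hf_api import RepoFolder

    entries = HfApi().list_repo_tree(
        repo_id,
        path_in_repo=tasks_prefix or None,
        recursive=False,
        repo_type="dataset",
        revision=revision,
    )
    folders = [entry.path for entry in entries if isinstance(entry, RepoFolder)]
    return task_ids_from_folders(folders, tasks_prefix)


def fetch_task_file(repo_id, task_dir, relative_path, revision=None):
    """Download a single file of a single task and return its local cache path."""
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id,
        filename=f"{task_dir}/{relative_path}",
        repo_type="dataset",
        revision=revision,
    )
```

With this split, listing costs one directory request regardless of dataset size, and opening a task costs one download per file shown.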

## Use it

Open the Space and paste a dataset URI in the input box.

Prefill via URL param:
```
https://huggingface.co/spaces/AdithyaSK/harbor-visualiser?dataset=<owner>/<dataset>
```
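A prefilled link can be built programmatically. The sketch below is illustrative only: the helper name `prefill_url` is hypothetical, and it assumes the Space reads the `dataset` query parameter with standard URL decoding.

```python
from urllib.parse import urlencode

SPACE_URL = "https://huggingface.co/spaces/AdithyaSK/harbor-visualiser"


def prefill_url(dataset_uri):
    """Build a Space link with the dataset input prefilled."""
    # Keep "/", ":" and "@" readable instead of percent-encoding them.
    query = urlencode({"dataset": dataset_uri}, safe="/:@")
    return f"{SPACE_URL}?{query}"
```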

**Inputs accepted:**

| Form | Source |
|---|---|
| `owner/name` | HF Hub dataset (default) |
| `hf://owner/name` | HF Hub (explicit) |
| `hf://owner/name@<rev>` | HF Hub revision pin |
| `gh://owner/repo` | GitHub repo |
| `gh://owner/repo@<ref>` | GitHub at branch / tag / SHA |
| `https://github.com/owner/repo` | Full GitHub URL |
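One way to interpret the forms in the table is sketched below. It is an assumption about how such parsing could work, not the Space's implementation; `DatasetRef` and `parse_dataset_uri` are hypothetical names.

```python
from typing import NamedTuple, Optional


class DatasetRef(NamedTuple):
    source: str          # "hf" or "gh"
    repo: str            # "owner/name"
    rev: Optional[str]   # revision / branch / tag / SHA, or None if unpinned


GITHUB_PREFIX = "https://github.com/"


def parse_dataset_uri(uri: str) -> DatasetRef:
    """Map an accepted input form to (source, repo, rev)."""
    uri = uri.strip()
    if uri.startswith(GITHUB_PREFIX):
        source, rest = "gh", uri[len(GITHUB_PREFIX):].rstrip("/")
        if rest.endswith(".git"):
            rest = rest[: -len(".git")]
    elif uri.startswith("gh://"):
        source, rest = "gh", uri[len("gh://"):]
    elif uri.startswith("hf://"):
        source, rest = "hf", uri[len("hf://"):]
    else:
        # A bare owner/name defaults to the HF Hub.
        source, rest = "hf", uri
    repo, _, rev = rest.partition("@")
    parts = repo.split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected owner/name, got {uri!r}")
    return DatasetRef(source, repo, rev or None)
```

For example, `parse_dataset_uri("gh://owner/repo@main")` yields a `DatasetRef` with source `"gh"`, repo `"owner/repo"`, and rev `"main"`.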

## Run locally

```bash
pip install -r requirements.txt
uvicorn app:app --port 7860
# → http://127.0.0.1:7860
```

## What it shows per task

| Tab | Source file |
|---|---|
| Overview | parsed `task.toml` (`[task]`, `[metadata]`) + `[metadata.repo2env]` if present |
| Instruction | `instruction.md` |
| Patch (oracle) | `solution/patch.diff` |
| `test.sh` | `tests/test.sh` |
| Dockerfile | `environment/Dockerfile` |
| `solve.sh` | `solution/solve.sh` (when present) |
| Raw `task.toml` | full file |

## Dataset layout it expects (Harbor's standard)

Either of these:

```
# Layout A: flat (what Repo2RLEnv emits + most git repos use)
<dataset-root>/
├── <task-id>/
│   ├── task.toml
│   ├── instruction.md
│   ├── solution/
│   │   ├── patch.diff
│   │   └── solve.sh
│   ├── tests/test.sh
│   └── environment/Dockerfile
└── <task-id>/...

# Layout B: nested (what `repo2rlenv push` stages on the Hub)
<dataset-root>/
├── registry.json
├── README.md
└── tasks/
    └── <task-id>/
        └── ... (same as Layout A)
```
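For a local directory, telling the two layouts apart could look like the sketch below. It assumes a task directory is any folder containing `task.toml`; the names `find_task_dirs` and `_task_dirs_in` are hypothetical, and the Space may detect layouts differently.

```python
from pathlib import Path


def _task_dirs_in(base):
    """Return sorted subdirectories of base that contain a task.toml."""
    if not base.is_dir():
        return []
    return sorted(
        child for child in base.iterdir()
        if child.is_dir() and (child / "task.toml").is_file()
    )


def find_task_dirs(dataset_root):
    """Return (layout, task_dirs), where layout is "nested" or "flat"."""
    root = Path(dataset_root)
    # Layout B: tasks live under <dataset-root>/tasks/.
    nested = _task_dirs_in(root / "tasks")
    if nested:
        return "nested", nested
    # Layout A: tasks sit directly under the dataset root.
    return "flat", _task_dirs_in(root)
```

Checking the nested location first means a stray top-level folder does not shadow the `tasks/` directory that `repo2rlenv push` stages.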

## Stack

- [FastAPI](https://fastapi.tiangolo.com/) + [uvicorn](https://www.uvicorn.org/): server
- Vanilla-JS single-page UI (hash-routed) with a Hugging Face theme
- [huggingface_hub](https://github.com/huggingface/huggingface_hub): Hub listing + per-task download
- `git` (system binary): GitHub clone
- Python stdlib `tomllib`: `task.toml` parsing