# Managing AGG_VIS_PRESETS Programmatically
## Overview
The agg_visualizer stores presets in the HuggingFace dataset repo `reasoning-degeneration-dev/AGG_VIS_PRESETS`. Each visualizer type has its own JSON file:
| Type | File | Extra Fields |
|------|------|-------------|
| `model` | `model_presets.json` | `column` (default: `"model_responses"`) |
| `arena` | `arena_presets.json` | none |
| `rlm` | `rlm_presets.json` | `config` (default: `"rlm_call_traces"`) |
| `harbor` | `harbor_presets.json` | none |
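For scripting, the table above can be captured as a small lookup. This is a sketch; the filenames and defaults come straight from the table, with `None` where a type has no extra field:

```python
# The preset-file table as a lookup: filename plus the type-specific
# extra field and its default (None where the type has no extra field).
PRESET_FILES = {
    "model":  ("model_presets.json",  ("column", "model_responses")),
    "arena":  ("arena_presets.json",  None),
    "rlm":    ("rlm_presets.json",    ("config", "rlm_call_traces")),
    "harbor": ("harbor_presets.json", None),
}

print(PRESET_FILES["rlm"][0])  # rlm_presets.json
```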
## Preset Schema
Every preset has these base fields:
```json
{
"id": "8-char hex",
"name": "Human-readable name",
"repo": "org/dataset-name",
"split": "train"
}
```
Plus the type-specific fields listed in the table above.
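For instance, a complete `model` preset combines the base fields with its `column` override (the `id` value here is illustrative):

```json
{
  "id": "a1b2c3d4",
  "name": "SC Countdown K2-Inst TreeSearch",
  "repo": "reasoning-degeneration-dev/t1-strategy-countdown-treesearch-kimi-k2-instruct-kimi-inst",
  "split": "train",
  "column": "response"
}
```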
## How to Add Presets from Experiment Markdown Files
### Step 1: Identify repos and their visualizer type
Read the experiment markdown file(s) and extract all HuggingFace repo links. Categorize each:
- **Countdown / MuSR datasets** (model response traces) → `model` type, set `column: "response"`
- **FrozenLake / arena datasets** (game episodes) → `arena` type
- **Harbor / SWE-bench datasets** → `harbor` type
- **RLM call traces** → `rlm` type, set `config: "rlm_call_traces"`
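The categorization above can be sketched as a keyword heuristic. The substrings below are assumptions drawn from the bullet list, not an official mapping; adjust them to match your actual repo names:

```python
# Hypothetical heuristic: guess a repo's visualizer type from its name.
def guess_vis_type(repo: str) -> str:
    name = repo.lower()
    if "frozenlake" in name or "arena" in name:
        return "arena"
    if "harbor" in name or "swe-bench" in name:
        return "harbor"
    if "rlm" in name:
        return "rlm"
    return "model"  # Countdown / MuSR response traces default to "model"

print(guess_vis_type(
    "reasoning-degeneration-dev/t1-strategy-countdown-treesearch-kimi-k2-instruct-kimi-inst"
))  # model
```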
### Step 2: Download existing presets from HF
```python
from huggingface_hub import hf_hub_download
import json

PRESETS_REPO = "reasoning-degeneration-dev/AGG_VIS_PRESETS"

def load_hf_presets(vis_type):
    try:
        path = hf_hub_download(PRESETS_REPO, f"{vis_type}_presets.json", repo_type="dataset")
        with open(path) as f:
            return json.load(f)
    except Exception:
        return []

existing_model = load_hf_presets("model")
existing_arena = load_hf_presets("arena")
# ... etc for rlm, harbor

# Build the set of repos already present
existing_repos = set()
for presets_list in [existing_model, existing_arena]:
    for p in presets_list:
        existing_repos.add(p["repo"])
```
### Step 3: Build new presets, skipping duplicates
```python
import uuid

new_presets = []  # list of (vis_type, name, repo)

# Example: adding strategy compliance countdown presets
new_presets.append(("model", "SC Countdown K2-Inst TreeSearch",
                    "reasoning-degeneration-dev/t1-strategy-countdown-treesearch-kimi-k2-instruct-kimi-inst"))
# ... add all repos from the markdown ...

# Filter out existing
to_add = {"model": [], "arena": [], "rlm": [], "harbor": []}
for vis_type, name, repo in new_presets:
    if repo in existing_repos:
        continue  # skip duplicates
    preset = {
        "id": uuid.uuid4().hex[:8],
        "name": name,
        "repo": repo,
        "split": "train",
    }
    if vis_type == "model":
        preset["column"] = "response"
    elif vis_type == "rlm":
        preset["config"] = "rlm_call_traces"
    to_add[vis_type].append(preset)
```
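Before uploading, a quick schema check catches presets that are missing a type-specific field. This is a sketch that uses only the field names defined above; `validate_preset` is a hypothetical helper, not part of the visualizer:

```python
# Sanity-check a preset dict against the schema above (a sketch).
def validate_preset(preset, vis_type):
    required = {"id", "name", "repo", "split"}
    if vis_type == "model":
        required.add("column")   # model presets must name their response column
    elif vis_type == "rlm":
        required.add("config")   # rlm presets must name their dataset config
    missing = required - preset.keys()
    if missing:
        raise ValueError(f"{preset.get('name', '?')} missing {sorted(missing)}")

# Example: this passes; dropping "column" would raise ValueError.
validate_preset(
    {"id": "a1b2c3d4", "name": "SC Countdown K2-Inst TreeSearch",
     "repo": "reasoning-degeneration-dev/t1-strategy-countdown-treesearch-kimi-k2-instruct-kimi-inst",
     "split": "train", "column": "response"},
    "model",
)
```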
### Step 4: Merge and upload to HF
```python
import tempfile, os
from huggingface_hub import HfApi

api = HfApi()

# Merge new presets with existing
final_model = existing_model + to_add["model"]
final_arena = existing_arena + to_add["arena"]

for vis_type, presets in [("model", final_model), ("arena", final_arena)]:
    if not presets:
        continue
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
        json.dump(presets, f, indent=2)
        tmp = f.name
    api.upload_file(
        path_or_fileobj=tmp,
        path_in_repo=f"{vis_type}_presets.json",
        repo_id=PRESETS_REPO,
        repo_type="dataset",
    )
    os.unlink(tmp)
```
### Step 5: Sync the deployed HF Space
After uploading to the HF dataset, tell the running Space to re-download presets:
```bash
curl -X POST "https://reasoning-degeneration-dev-agg-trace-visualizer.hf.space/api/presets/sync"
```
This forces the Space to re-download all preset files from `AGG_VIS_PRESETS` without needing a restart or redeployment.
### Step 6: Sync local preset files
```python
import shutil
from huggingface_hub import hf_hub_download

local_dir = "/Users/rs2020/Research/tools/visualizers/agg_visualizer/backend/presets"
for vis_type in ["model", "arena", "rlm", "harbor"]:
    try:
        path = hf_hub_download(PRESETS_REPO, f"{vis_type}_presets.json", repo_type="dataset")
        shutil.copy2(path, f"{local_dir}/{vis_type}_presets.json")
    except Exception:
        pass
```
## Naming Convention
Preset names follow this pattern to be descriptive and avoid future conflicts:
```
{Experiment} {Task} {Model} {Variant}
```
### Experiment prefixes
- `SC` → Strategy Compliance
- `Wing` → Wingdings Compliance
### Model abbreviations
- `K2-Inst` → Kimi-K2-Instruct (RLHF)
- `K2-Think` → Kimi-K2-Thinking (RLVR)
- `Q3-Inst` → Qwen3-Next-80B Instruct (RLHF)
- `Q3-Think` → Qwen3-Next-80B Thinking (RLVR)
### Task names
- `Countdown` → 8-arg arithmetic countdown
- `MuSR` → MuSR murder mysteries
- `FrozenLake` → FrozenLake grid navigation
### Variant names (strategy compliance only)
- `TreeSearch` / `Baseline` / `Anti` → countdown tree search experiment
- `CritFirst` / `Anti-CritFirst` → criterion-first cross-cutting analysis
- `Counterfactual` / `Anti-Counterfactual` → counterfactual hypothesis testing
- `BackChain` → backward chaining (FrozenLake)
### Examples
```
SC Countdown K2-Inst TreeSearch # Strategy compliance, countdown, Kimi instruct, tree search variant
SC MuSR Q3-Think Counterfactual # Strategy compliance, MuSR, Qwen thinking, counterfactual variant
SC FrozenLake K2-Think BackChain # Strategy compliance, FrozenLake, Kimi thinking, backward chaining
Wing Countdown Q3-Inst           # Wingdings, countdown, Qwen instruct (no variant: wingdings has one condition)
Wing MuSR K2-Think # Wingdings, MuSR, Kimi thinking
```
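The parts above can be assembled mechanically. `build_preset_name` is a hypothetical helper; the convention itself remains the source of truth:

```python
def build_preset_name(experiment, task, model, variant=""):
    """Join name parts per the convention above; wingdings presets omit the variant."""
    parts = [experiment, task, model]
    if variant:
        parts.append(variant)
    return " ".join(parts)

print(build_preset_name("SC", "Countdown", "K2-Inst", "TreeSearch"))  # SC Countdown K2-Inst TreeSearch
print(build_preset_name("Wing", "MuSR", "K2-Think"))                  # Wing MuSR K2-Think
```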
## Important Notes
- **Always check for existing repos** before adding. The script above uses `existing_repos` set to skip duplicates.
- **The `column` field matters for model presets.** Strategy compliance and wingdings datasets use `"response"` as the response column, not the default `"model_responses"`.
- **Local files are a fallback cache.** The agg_visualizer downloads presets from HF on startup and caches them locally. After uploading to HF, sync the local files so the running app picks up changes without a restart (or hit the `/api/presets/sync` endpoint).
- **Don't modify rlm or harbor presets** unless adding datasets of those types. The script above only touches model and arena.