Add BrowseComp+ tab: dataset explorer with question, evidence docs, gold answer 5cc2a94 timchen0618 committed on 1 day ago
Derive new_status from new_trajectory; fix sidebar check mark; fix question for incomplete d14bce3 timchen0618 committed on 6 days ago
Show incomplete runs as incorrect; fix missing questions via BrowseComp JSONL fallback 1eb493c timchen0618 committed on 7 days ago
Add question/answer/accuracy to Scout Runs tab; fix selected-tools reload cache 30fb9c5 timchen0618 committed on 7 days ago
Remove debug traceback from reload error; add fix_readme_features script 8026e0e timchen0618 committed on 7 days ago
Fix reload endpoint to force-redownload when schema changed 94039e3 timchen0618 committed on 7 days ago
Add question, correct answer, and accuracy to Selected Tools tab d0a0739 timchen0618 committed on 7 days ago
Fix selected_indices parsing for new datasets (stored as JSON string) 58fe58d timchen0618 committed on 15 days ago
Fix Selected Tools default variant (was hardcoded to removed traj_summary_ext) 74c4b8c timchen0618 committed on 15 days ago
Relabel selected tools variants; remove traj_summary_ext; rename gemini/less-chars 5b00900 timchen0618 committed on 15 days ago
Add 11 test300 Selected Tools variants extracted from trajectory_summary tag cffb305 timchen0618 committed on 15 days ago
Fix Scout Runs default variant (was hardcoded to removed budget5) 921df29 timchen0618 committed on 15 days ago
Remove test150 variant; rename mode-c/d to filtered/unfiltered for best8-random and best4-gemini af3d4e1 timchen0618 committed on 15 days ago
Add Scout Runs tab: new backend endpoint, ScoutRunsApp frontend, budget5 variant 52b0adf timchen0618 committed on Apr 24
Add traj_summary_orig_ext variant to Selected Tools tab with variant switcher fa6b40b timchen0618 committed on Apr 23
SFT Diff: add qwen template dropdown, 830 rows (both templates) a45ef56 timchen0618 committed on Apr 17
Add SFT Diff tab: side-by-side original excerpt vs converted Axolotl messages 47d3d12 timchen0618 committed on Apr 16
traj_ext: fix 1.1GB response — lazy-load formatted_prompt, truncate tool results in list 3c34d20 timchen0618 committed on Apr 13
fix: use correct index.html from fresh build (was stale from rebase conflict) 52302c1 timchen0618 committed on Apr 13
traj_ext: add run selector, run_name badge, prompt-on-top in both view; load bcp-full-runs-v1 7ea306b timchen0618 committed on Apr 13
Add Selected Tools tab with side-by-side excerpt/trajectory viewer bd51d10 timchen0618 committed on Apr 9
Add Traj Ext viewer tab with trajectory block rendering and HF dataset c682eb7 timchen0618 committed on Apr 8