Spaces: MukulRay/recon
main
recon
/
eval
215 kB
3 contributors
History:
8 commits
MukulRay
Phase 2: add v2 full eval runner script
dffe992
21 days ago
archived
Phase 1.1: archive patch_contradiction.py - research integrity fix
22 days ago
results
docs: commit eval summary; clarify critic as LLM-assisted-judge; fix test imports
22 days ago
calibration.py
Safe
8.63 kB
Phase 13: HF Spaces deploy ready - verdict logging, clean requirements
about 2 months ago
check_progress.py
Safe
1.16 kB
Phase 1.3: smoke test scripts, eval runner, check_progress, fix unicode in run_eval.py
22 days ago
contradiction_viz.py
Safe
12.5 kB
Phase 13: HF Spaces deploy ready - verdict logging, clean requirements
about 2 months ago
curate_surveys.py
Safe
11.4 kB
Phase 13: HF Spaces deploy ready - verdict logging, clean requirements
about 2 months ago
debug_contradiction.py
Safe
3.04 kB
Phase 1.3: eval results, test scripts, gap filter reverted - no improvement, changelog update
21 days ago
generate_leaderboard.py
Safe
11.8 kB
Phase 13: HF Spaces deploy ready - verdict logging, clean requirements
about 2 months ago
ground_truth.json
Safe
51.6 kB
Phase 13: HF Spaces deploy ready - verdict logging, clean requirements
about 2 months ago
questions.json
Safe
64.5 kB
Phase 13: HF Spaces deploy ready - verdict logging, clean requirements
about 2 months ago
run_eval.py
Safe
28 kB
Phase 1.3: smoke test scripts, eval runner, check_progress, fix unicode in run_eval.py
22 days ago
run_recon_linear.py
Safe
2.39 kB
Phase 1.3: smoke test scripts, eval runner, check_progress, fix unicode in run_eval.py
22 days ago
run_v2_eval.py
Safe
1.82 kB
Phase 2: add v2 full eval runner script
21 days ago
smoke_test.py
Safe
1.17 kB
Phase 1.3: smoke test scripts, eval runner, check_progress, fix unicode in run_eval.py
22 days ago
test_catc.py
Safe
1.26 kB
Phase 1.3: eval results, test scripts, gap filter reverted - no improvement, changelog update
21 days ago