Spaces:
MukulRay
/
recon
main
recon
/
eval
215 kB
3 contributors
History:
8 commits
MukulRay
Phase 2: add v2 full eval runner script
dffe992
2 months ago
archived
Phase 1.1: archive patch_contradiction.py - research integrity fix
2 months ago
results
docs: commit eval summary; clarify critic as LLM-assisted-judge; fix test imports
2 months ago
calibration.py
Safe
8.63 kB
Phase 13: HF Spaces deploy ready - verdict logging, clean requirements
3 months ago
check_progress.py
Safe
1.16 kB
Phase 1.3: smoke test scripts, eval runner, check_progress, fix unicode in run_eval.py
2 months ago
contradiction_viz.py
Safe
12.5 kB
Phase 13: HF Spaces deploy ready - verdict logging, clean requirements
3 months ago
curate_surveys.py
Safe
11.4 kB
Phase 13: HF Spaces deploy ready - verdict logging, clean requirements
3 months ago
debug_contradiction.py
Safe
3.04 kB
Phase 1.3: eval results, test scripts, gap filter reverted - no improvement, changelog update
2 months ago
generate_leaderboard.py
Safe
11.8 kB
Phase 13: HF Spaces deploy ready - verdict logging, clean requirements
3 months ago
ground_truth.json
Safe
51.6 kB
Phase 13: HF Spaces deploy ready - verdict logging, clean requirements
3 months ago
questions.json
Safe
64.5 kB
Phase 13: HF Spaces deploy ready - verdict logging, clean requirements
3 months ago
run_eval.py
Safe
28 kB
Phase 1.3: smoke test scripts, eval runner, check_progress, fix unicode in run_eval.py
2 months ago
run_recon_linear.py
Safe
2.39 kB
Phase 1.3: smoke test scripts, eval runner, check_progress, fix unicode in run_eval.py
2 months ago
run_v2_eval.py
Safe
1.82 kB
Phase 2: add v2 full eval runner script
2 months ago
smoke_test.py
Safe
1.17 kB
Phase 1.3: smoke test scripts, eval runner, check_progress, fix unicode in run_eval.py
2 months ago
test_catc.py
Safe
1.26 kB
Phase 1.3: eval results, test scripts, gap filter reverted - no improvement, changelog update
2 months ago