metadata
title: StructViz-Bench Human Eval
emoji: 📊
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 4.44.1
python_version: 3.10.13
app_file: app.py
pinned: false
StructViz-Bench Human Evaluation Space
This Hugging Face Space hosts the human evaluation workflow for StructViz-Bench.
What It Supports
- Task A: answer correctness verification on 100 stratified items
- Task B: visualization-sensitivity plausibility on 50 paired items
Data Files Required
Place these files in data/ before pushing the Space:
- task_a_items.jsonl
- task_b_pairs.jsonl
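Before pushing, it can help to confirm that both files are present and parse cleanly. The following is a minimal sketch, not part of the Space itself: `load_jsonl` and `check_data_dir` are hypothetical helper names, and the only assumption about the files is that each non-empty line is one JSON object.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Read one JSON object per non-empty line of a JSONL file."""
    items = []
    with Path(path).open(encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                items.append(json.loads(line))
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_no}: invalid JSON ({exc.msg})") from exc
    return items


def check_data_dir(data_dir="data"):
    """Return {filename: item_count}; raises FileNotFoundError if a file is missing."""
    required = ("task_a_items.jsonl", "task_b_pairs.jsonl")
    return {name: len(load_jsonl(Path(data_dir) / name)) for name in required}
```

Given the task sizes listed above, the counts returned for a complete bundle would be expected to be 100 and 50.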
Place the referenced images under either:
- images/<safe_filename>.png
- or benchmark/rendered/benchmark/rendered/<modality>/<question_id>_<viz_type>.png
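The two accepted locations can be checked with a small lookup. This is a sketch under stated assumptions: `resolve_image` is a hypothetical helper, the preference for `images/` over the rendered tree is a choice made here, and how `<safe_filename>` is derived from an item is not stated in this README, so the caller supplies it.

```python
from pathlib import Path


def resolve_image(root, safe_filename, modality, question_id, viz_type):
    """Return the first existing PNG among the two documented locations, else None."""
    root = Path(root)
    candidates = [
        # Flat layout: images/<safe_filename>.png
        root / "images" / f"{safe_filename}.png",
        # Rendered layout, path segments copied as written in this README
        root / "benchmark" / "rendered" / "benchmark" / "rendered"
        / modality / f"{question_id}_{viz_type}.png",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Returning `None` rather than raising lets a caller collect every missing image in one pass before deciding whether to push.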
Response Storage Format
Responses are written to both JSONL and CSV under responses/:
- responses/task_a_responses.jsonl
- responses/task_a_responses.csv
- responses/task_b_responses.jsonl
- responses/task_b_responses.csv
Each record contains:
- timestamp
- session_id
- evaluator
- task
- item_index
- question_id
- task metadata (modality, difficulty, source, viz_type or viz_a/viz_b)
- rating
- notes
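To illustrate the dual-format storage, here is a minimal sketch of appending one record to both files. It is not the code in app.py: `append_response` and `FIELDS` are hypothetical names, the CSV column order is a choice made here, and the sketch assumes the `task` value is the letter `a` or `b` and that the timestamp is a UTC ISO 8601 string.

```python
import csv
import json
from datetime import datetime, timezone
from pathlib import Path

# Assumed column order; metadata fields that do not apply to a task stay empty.
FIELDS = [
    "timestamp", "session_id", "evaluator", "task", "item_index", "question_id",
    "modality", "difficulty", "source", "viz_type", "viz_a", "viz_b",
    "rating", "notes",
]


def append_response(record, task, responses_dir="responses"):
    """Append one record to responses/task_<task>_responses.{jsonl,csv}."""
    if task not in ("a", "b"):
        raise ValueError("task must be 'a' or 'b'")
    out = Path(responses_dir)
    out.mkdir(parents=True, exist_ok=True)
    row = {"timestamp": datetime.now(timezone.utc).isoformat(), "task": task, **record}

    jsonl_path = out / f"task_{task}_responses.jsonl"
    with jsonl_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(row, ensure_ascii=False) + "\n")

    csv_path = out / f"task_{task}_responses.csv"
    # Write the header only when the CSV is new or empty.
    write_header = not csv_path.exists() or csv_path.stat().st_size == 0
    with csv_path.open("a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS, restval="", extrasaction="ignore")
        if write_header:
            writer.writeheader()
        writer.writerow(row)
    return row
```

Appending one line per submission keeps earlier responses intact if the Space restarts mid-session; the JSONL file keeps any extra keys, while the CSV is limited to the fixed columns.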
Deployment
From the project root, build a minimal Space bundle with:
python3 scripts/export_human_eval_space.py
Then push release/huggingface/human_eval_space/ to a new Hugging Face Space.
If you want responses to persist across restarts, enable Hugging Face persistent storage and keep the responses/ directory mounted there.
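One way to keep responses/ on persistent storage is to choose the directory at startup. The sketch below assumes the persistent volume is mounted at /data, which is where Hugging Face documents it for Spaces; `pick_responses_dir` and the `HUMAN_EVAL_RESPONSES_DIR` environment variable are hypothetical names, not something app.py is known to read.

```python
import os
from pathlib import Path


def pick_responses_dir(persistent_root="/data", fallback="responses"):
    """Prefer an explicit override, then the persistent volume, then a local directory."""
    override = os.environ.get("HUMAN_EVAL_RESPONSES_DIR")  # hypothetical variable
    if override:
        return Path(override)
    root = Path(persistent_root)
    # Use the persistent volume only if it is mounted and writable.
    if root.is_dir() and os.access(root, os.W_OK):
        return root / "responses"
    return Path(fallback)
```

Without persistent storage the fallback directory lives on the Space's ephemeral disk, so responses written there are lost on restart.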
Note: this repo can be pushed without images first; add PNG assets later using Hugging Face Xet/LFS-compatible storage.