title: StructViz-Bench Human Eval
emoji: 📊
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 4.44.1
python_version: 3.10.13
app_file: app.py
pinned: false

StructViz-Bench Human Evaluation Space

This Hugging Face Space hosts the human evaluation workflow for StructViz-Bench.

What It Supports

  • Task A: answer correctness verification on 100 stratified items
  • Task B: visualization-sensitivity plausibility on 50 paired items

Data Files Required

Place these files in data/ before pushing the Space:

  • task_a_items.jsonl
  • task_b_pairs.jsonl
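
As a rough illustration of how these files might be read, the sketch below parses a JSONL file into a list of dictionaries, one per non-empty line. The helper name read_jsonl is hypothetical, and no particular item schema is assumed; the actual loader in app.py may differ.

```python
import json
from pathlib import Path


def read_jsonl(path: Path) -> list[dict]:
    """Parse a JSONL file: one JSON object per non-empty line."""
    items = []
    with path.open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                items.append(json.loads(line))
            except json.JSONDecodeError as exc:
                # Report the offending line so a bad export is easy to locate.
                raise ValueError(f"{path}:{line_no}: invalid JSON") from exc
    return items
```

For example, read_jsonl(Path("data/task_a_items.jsonl")) would be expected to return the 100 Task A items if the export is complete.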

Place the referenced images under either:

  • images/<safe_filename>.png
  • or benchmark/rendered/benchmark/rendered/<modality>/<question_id>_<viz_type>.png
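
The two image locations above suggest a simple lookup order. The following is a minimal sketch, assuming the app tries images/ first and falls back to the rendered path; the function name resolve_image and the order of the lookup are assumptions, not the Space's confirmed behavior. The directory names are copied verbatim from the list above.

```python
from pathlib import Path
from typing import Optional


def resolve_image(root: Path, safe_filename: str, modality: str,
                  question_id: str, viz_type: str) -> Optional[Path]:
    """Return the first existing candidate image path, or None."""
    candidates = [
        # Flat layout: images/<safe_filename>.png
        root / "images" / f"{safe_filename}.png",
        # Rendered layout, path segments exactly as listed in this README
        root / "benchmark" / "rendered" / "benchmark" / "rendered"
        / modality / f"{question_id}_{viz_type}.png",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Returning None rather than raising lets the caller decide how to present a missing image, which fits the case where the repo is pushed before the PNG assets.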

Response Storage Format

Responses are written to both JSONL and CSV under responses/:

  • responses/task_a_responses.jsonl
  • responses/task_a_responses.csv
  • responses/task_b_responses.jsonl
  • responses/task_b_responses.csv

Each record contains:

  • timestamp
  • session_id
  • evaluator
  • task
  • item_index
  • question_id
  • task metadata (modality, difficulty, source, viz_type or viz_a/viz_b)
  • rating
  • notes
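
To make the storage format concrete, here is a minimal sketch of appending one record to both files for a task. The helper name append_response, the single "metadata" column, and the choice to serialize metadata as a JSON string in the CSV are all assumptions for illustration; the real app may flatten the metadata fields into separate columns.

```python
import csv
import json
from pathlib import Path

# Field order follows the record description above; "metadata" is an
# assumed container for the task-specific fields (modality, difficulty, ...).
FIELDS = ["timestamp", "session_id", "evaluator", "task", "item_index",
          "question_id", "metadata", "rating", "notes"]


def append_response(responses_dir: Path, task: str, record: dict) -> None:
    """Append one record to <task>_responses.jsonl and <task>_responses.csv."""
    responses_dir.mkdir(parents=True, exist_ok=True)
    row = {key: record.get(key, "") for key in FIELDS}

    jsonl_path = responses_dir / f"{task}_responses.jsonl"
    with jsonl_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(row, ensure_ascii=False) + "\n")

    csv_path = responses_dir / f"{task}_responses.csv"
    # Write the header only when the CSV is new or empty.
    write_header = not csv_path.exists() or csv_path.stat().st_size == 0
    csv_row = dict(row, metadata=json.dumps(row["metadata"], ensure_ascii=False))
    with csv_path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow(csv_row)
```

Calling append_response(Path("responses"), "task_a", record) would write to responses/task_a_responses.jsonl and responses/task_a_responses.csv, matching the file names listed above.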

Deployment

From the project root, build a minimal Space bundle with:

python3 scripts/export_human_eval_space.py

Then push release/huggingface/human_eval_space/ to a new Hugging Face Space.

If you want responses to persist across restarts, enable Hugging Face persistent storage and keep the responses/ directory mounted there.
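
One way to wire this up is to pick the responses directory at startup. The sketch below assumes persistent storage is mounted at /data (the usual mount point for Hugging Face Spaces; verify it for your Space) and falls back to a local responses/ directory otherwise. The function name responses_dir is hypothetical.

```python
import os
from pathlib import Path


def responses_dir(persistent_root: str = "/data") -> Path:
    """Prefer a writable persistent mount; otherwise use ./responses."""
    root = Path(persistent_root)
    if root.is_dir() and os.access(root, os.W_OK):
        return root / "responses"
    # No persistent storage: files here are lost when the Space restarts.
    return Path("responses")
```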

Note: the repo can be pushed without images at first; add the PNG assets later using Hugging Face Xet- or LFS-compatible storage.