title: MMIB Dataset Visualizer
sdk: docker
app_port: 7860

MMIB Dataset Visualizer

Visualizes MMIB-style CSVs, either uploaded through the sidebar or found under output/ and data/. Dataset: scholo/MMB_dataset

Data source

In the sidebar, choose either Upload dataset or CSV files.

Upload dataset

Upload your own dataset as a ZIP or CSV:

Select Upload dataset in the sidebar.
Upload a ZIP (recommended) or CSV file.
Click Use this file. To remove the upload and load a different one, click Clear uploaded dataset.

ZIP structure (same layout as on disk): include your CSV at the root (or in a subfolder), an images/ folder with the files referenced by the CSV, and optionally scenes/ for counterfactual types. Example:

mydata.zip
  image_mapping_with_questions.csv
  images/
    scene_0001_original.png
    scene_0001_cf1.png
    ...
  scenes/   (optional)
    scene_0001_cf1.json
    ...

CSV-only upload: you can upload just a CSV. Image columns will show filenames only (no thumbnails) unless you use a ZIP with images/.
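
Before uploading, it can help to check that a ZIP follows the layout above. The following is a minimal sketch, not part of the app: check_zip_layout is a hypothetical helper that only inspects entry names and assumes images/ and scenes/ sit next to the first CSV found in the archive.

```python
import zipfile
from pathlib import PurePosixPath


def check_zip_layout(zip_path):
    """Summarize whether a ZIP matches the layout described above.

    Hypothetical helper, not part of the app: it only inspects entry names.
    """
    with zipfile.ZipFile(zip_path) as zf:
        # Skip directory entries; ZIP entry names always use forward slashes.
        names = [PurePosixPath(n) for n in zf.namelist() if not n.endswith("/")]

    csvs = [p for p in names if p.suffix.lower() == ".csv"]
    report = {
        "csv_files": [str(p) for p in csvs],
        "has_images": False,
        "has_scenes": False,
    }
    if not csvs:
        return report

    # Assumption: images/ and scenes/ are looked up next to the first CSV found.
    base = csvs[0].parent
    report["has_images"] = any(p.parent == base / "images" for p in names)
    report["has_scenes"] = any(p.parent == base / "scenes" for p in names)
    return report
```

Calling check_zip_layout("mydata.zip") returns a small dictionary. If has_images is False, expect filenames instead of thumbnails, as with a CSV-only upload.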

CSV files

The app discovers CSVs under output/, data/, and hf_dataset/ (recursive). Use the sidebar CSV file dropdown to pick one.
On the Hugging Face Space, data/ and hf_dataset/ are not in the repo (binary images are excluded), so use Upload dataset there instead.

image_mapping_with_questions.csv — Original + counterfactual images, questions, difficulties, answer matrix (e.g. from the MMB Counterfactual Image Generation Tool).
image_mapping.csv — Images only (original_image, counterfactual1_image, counterfactual2_image).
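
The recursive discovery described above could look roughly like the sketch below. The folder names come from this README; the function name discover_csvs and its behavior (skipping missing folders, returning sorted paths) are assumptions for illustration, not the app's actual code.

```python
from pathlib import Path

# Folder names taken from the text above; the function itself is a hypothetical sketch.
SEARCH_DIRS = ("output", "data", "hf_dataset")


def discover_csvs(root="."):
    """Return CSV paths found recursively under the search folders, sorted."""
    root = Path(root)
    found = []
    for name in SEARCH_DIRS:
        folder = root / name
        # data/ and hf_dataset/ may be absent, for example on the Space.
        if folder.is_dir():
            found.extend(p for p in folder.rglob("*.csv") if p.is_file())
    return sorted(found)
```

A dropdown could then list the results of discover_csvs() relative to the project root.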

Put your CSV next to images/ and, optionally, scenes/ (e.g. data/example/image_mapping_with_questions.csv, data/example/images/, data/example/scenes/). The app then shows:

Image sets — Each scene set as a row with thumbnails (Original, CF1, CF2), counterfactual types (e.g. CF1 change_color, CF2 change_lighting), scene ID, questions & difficulties. Optional “Include answer matrix in each row.”
Overview — Scene-set count, column summary.
Difficulty & questions — Bar charts of difficulty (easy/medium/hard) by question type (Original, CF1, CF2).
Counterfactual types — Table of scene → CF1 type, CF2 type; bar chart of type counts by slot (from scenes/*_cf1.json, *_cf2.json).
Answer matrix — 3×3 grid per scene (image × question).

Thumbnails use <csv_directory>/images/. If missing, filenames are shown. Counterfactual types come from <csv_directory>/scenes/ (cf_metadata.cf_type in *_cf1.json, *_cf2.json). If scenes/ is missing, types are omitted.
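
The lookup rules above can be sketched as follows. This is an illustration under stated assumptions, not the app's implementation: the helper names are hypothetical, and scene files are assumed to be named <scene_id>_cf1.json or <scene_id>_cf2.json with a top-level cf_metadata object, as suggested by the ZIP example and the cf_metadata.cf_type path.

```python
import json
from pathlib import Path


def resolve_thumbnail(csv_path, image_name):
    """Return the image path under <csv_directory>/images/, or None if it is missing."""
    candidate = Path(csv_path).parent / "images" / image_name
    return candidate if candidate.is_file() else None


def read_cf_type(csv_path, scene_id, slot):
    """Return cf_metadata.cf_type for a scene and slot ("cf1" or "cf2"), or None.

    Assumes scene files are named <scene_id>_<slot>.json next to the CSV.
    """
    scene_file = Path(csv_path).parent / "scenes" / f"{scene_id}_{slot}.json"
    if not scene_file.is_file():
        return None  # scenes/ or the file is missing: the type is omitted
    with scene_file.open(encoding="utf-8") as fh:
        data = json.load(fh)
    # Tolerate a missing or null cf_metadata entry.
    return (data.get("cf_metadata") or {}).get("cf_type")
```

When resolve_thumbnail returns None, a caller would fall back to showing the filename. When read_cf_type returns None, the type is left out.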

Deploy to Hugging Face (dataset + Space)

Push to both the dataset and the Streamlit Space:

pip install -r requirements-upload.txt huggingface_hub
hf auth login   # if not in PATH: python -m huggingface_hub.cli.hf auth login

python scripts/deploy_both.py

This uploads:

Dataset → scholo/MMB_dataset
Space → scholo/Datasetviewer

Options:

--dataset-only — Only push the dataset
--space-only — Only push the Space

Dataset source: The dataset is read from the hf_dataset/ folder. Put your CSV, images/, and scenes/ there. See hf_dataset/README.md.

To push only the dataset manually:

python scripts/upload_to_huggingface.py hf_dataset/image_mapping_with_questions.csv --repo-id scholo/MMB_dataset
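
For reference, a folder can also be pushed with the huggingface_hub Python API directly. The sketch below is an assumption-laden alternative, not what scripts/deploy_both.py or scripts/upload_to_huggingface.py do internally: push_dataset_folder is a hypothetical helper that uploads the whole hf_dataset/ folder with HfApi.upload_folder, and it requires a prior login.

```python
def push_dataset_folder(folder="hf_dataset", repo_id="scholo/MMB_dataset", api=None):
    """Upload a local folder to a Hugging Face dataset repo.

    Sketch only: this is not the code used by the scripts in this repo.
    """
    if api is None:
        # Imported lazily so the function can be exercised with a stand-in client.
        from huggingface_hub import HfApi

        api = HfApi()
    # repo_type="dataset" targets the dataset repo rather than a model or Space.
    return api.upload_folder(
        folder_path=folder,
        repo_id=repo_id,
        repo_type="dataset",
    )
```

Passing a different folder or repo_id targets another dataset repo. The api parameter exists only so the call can be tried against a stand-in object without network access.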

Setup and run

pip install -r requirements.txt
streamlit run app.py

Open the URL shown (e.g. http://localhost:8501).