title: MMIB Dataset Visualizer
sdk: docker
app_port: 7860

MMB Dataset Visualizer

Visualizes MMIB-style CSVs, either uploaded through the sidebar or found under output/ and data/. Dataset: scholo/MMB_dataset

Data source

In the sidebar, choose either Upload dataset or CSV files.

Upload dataset

Upload your own dataset as a ZIP or CSV:

  1. Select Upload dataset in the sidebar.
  2. Upload a ZIP (recommended) or CSV file.
  3. Click Use this file. To remove the file and upload another, click Clear uploaded dataset.

ZIP structure (same layout as on disk): include your CSV at the root (or in a subfolder), an images/ folder with the files referenced by the CSV, and optionally scenes/ for counterfactual types. Example:

mydata.zip
  image_mapping_with_questions.csv
  images/
    scene_0001_original.png
    scene_0001_cf1.png
    ...
  scenes/   (optional)
    scene_0001_cf1.json
    ...
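
Before uploading, the layout above can be checked locally. The following is a minimal sketch, not part of the app: check_dataset_zip is a hypothetical helper name, and it assumes that images/ sits in the same folder as the CSV inside the ZIP.

```python
import zipfile
from pathlib import PurePosixPath


def check_dataset_zip(zip_path):
    """Return a list of warnings about the ZIP layout (empty if none)."""
    with zipfile.ZipFile(zip_path) as zf:
        # Keep file members only; directory entries end with "/".
        names = [n for n in zf.namelist() if not n.endswith("/")]

    csv_names = [n for n in names if n.lower().endswith(".csv")]
    if not csv_names:
        return ["no CSV file found"]

    warnings = []
    for csv_name in csv_names:
        # Assumption: images/ lives next to the CSV (root or same subfolder).
        base = PurePosixPath(csv_name).parent
        images_prefix = (base / "images").as_posix() + "/"
        if not any(n.startswith(images_prefix) for n in names):
            warnings.append(f"no images/ folder next to {csv_name}")
    return warnings
```

A warning about a missing images/ folder is not fatal: the upload still works, but image columns will show filenames only. scenes/ is optional, so the sketch does not check for it.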

CSV-only upload: you can upload just a CSV. Image columns will show filenames only (no thumbnails) unless you use a ZIP with images/.

CSV files

The app recursively discovers CSVs under output/, data/, and hf_dataset/. Use the sidebar CSV file dropdown to pick one.
On the Hugging Face Space, data/ and hf_dataset/ are not in the repo (binary images are excluded), so use Upload dataset to visualize your data there.

  • image_mapping_with_questions.csv: Original + counterfactual images, questions, difficulties, answer matrix (e.g. from the MMB Counterfactual Image Generation Tool).
  • image_mapping.csv: Images only (original_image, counterfactual1_image, counterfactual2_image).
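
The recursive discovery described above can be approximated as follows. This is a sketch under assumptions rather than the code in app.py: discover_csvs and SEARCH_DIRS are hypothetical names, and the search is taken to be relative to the directory the app runs from.

```python
from pathlib import Path

# Folders searched for CSVs, in the order listed in this README.
SEARCH_DIRS = ("output", "data", "hf_dataset")


def discover_csvs(root="."):
    """Return every *.csv found recursively under the search folders."""
    root = Path(root)
    found = []
    for name in SEARCH_DIRS:
        folder = root / name
        # Missing folders are skipped (e.g. data/ on the Hugging Face Space).
        if folder.is_dir():
            found.extend(sorted(folder.rglob("*.csv")))
    return found
```

The returned paths could then populate a dropdown such as the CSV file selector in the sidebar.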

Put your CSV next to images/ and optionally scenes/ (e.g. data/example/image_mapping_with_questions.csv, data/example/images/, data/example/scenes/). You get:

  • Image sets: Each scene set as a row with thumbnails (Original, CF1, CF2), counterfactual types (e.g. CF1 change_color, CF2 change_lighting), scene ID, questions & difficulties. Optional: "Include answer matrix in each row."
  • Overview: Scene-set count, column summary.
  • Difficulty & questions: Bar charts of difficulty (easy/medium/hard) by question type (Original, CF1, CF2).
  • Counterfactual types: Table of scene → CF1 type, CF2 type; bar chart of type counts by slot (from scenes/*_cf1.json, *_cf2.json).
  • Answer matrix: 3×3 grid per scene (image × question).

Thumbnails use <csv_directory>/images/. If missing, filenames are shown. Counterfactual types come from <csv_directory>/scenes/ (cf_metadata.cf_type in *_cf1.json, *_cf2.json). If scenes/ is missing, types are omitted.
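
The two lookups above (thumbnails and counterfactual types) can be sketched as follows. The function names are hypothetical, and the sketch assumes scene files are named <scene_id>_cf<slot>.json, as in the ZIP example (scene_0001_cf1.json); the app's own logic may differ.

```python
import json
from pathlib import Path


def resolve_thumbnail(csv_path, filename):
    """Return the path under <csv_directory>/images/, or None if the file is missing."""
    candidate = Path(csv_path).parent / "images" / filename
    return candidate if candidate.is_file() else None


def read_cf_type(csv_path, scene_id, slot):
    """Read cf_metadata.cf_type from <csv_directory>/scenes/<scene_id>_cf<slot>.json."""
    scene_file = Path(csv_path).parent / "scenes" / f"{scene_id}_cf{slot}.json"
    if not scene_file.is_file():
        return None  # scenes/ or the file is missing: the type is omitted
    with scene_file.open(encoding="utf-8") as fh:
        data = json.load(fh)
    if not isinstance(data, dict):
        return None
    meta = data.get("cf_metadata")
    return meta.get("cf_type") if isinstance(meta, dict) else None
```

Returning None in both helpers mirrors the fallbacks described above: a missing image falls back to the filename, and a missing scene file means no type is shown.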

Deploy to Hugging Face (dataset + Space)

Push to both the dataset and the Streamlit Space:

pip install -r requirements-upload.txt huggingface_hub
hf auth login   # if not in PATH: python -m huggingface_hub.cli.hf auth login

python scripts/deploy_both.py

This uploads both the dataset and the Space.

Options:

  • --dataset-only: Only push the dataset
  • --space-only: Only push the Space

Dataset source: The dataset is read from the hf_dataset/ folder. Put your CSV, images/, and scenes/ there. See hf_dataset/README.md.

To push only the dataset manually:

python scripts/upload_to_huggingface.py hf_dataset/image_mapping_with_questions.csv --repo-id scholo/MMB_dataset
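
For reference, a dataset folder can also be pushed directly with the huggingface_hub Python API. This is a sketch of one possible approach, not a description of what scripts/upload_to_huggingface.py does; push_dataset_folder is a hypothetical helper, and the api argument is expected to be a huggingface_hub.HfApi instance.

```python
def push_dataset_folder(api, folder="hf_dataset", repo_id="scholo/MMB_dataset"):
    """Upload a local folder (CSV, images/, scenes/) to a Hugging Face dataset repo."""
    return api.upload_folder(
        folder_path=folder,
        repo_id=repo_id,
        repo_type="dataset",  # target a dataset repo rather than a model or Space
    )


# Usage (requires `hf auth login` first):
#   from huggingface_hub import HfApi
#   push_dataset_folder(HfApi())
```

Passing the API object in as a parameter keeps the helper easy to try out with a stand-in object before running a real upload.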

Setup and run

pip install -r requirements.txt
streamlit run app.py

Open the URL shown (e.g. http://localhost:8501).