---
title: MMB Dataset Visualizer
sdk: docker
app_port: 7860
---

# MMB Dataset Visualizer

Visualizes **MMB-style CSVs** from an **uploaded file** or from CSVs found under **`output/`**, **`data/`**, or **`hf_dataset/`**. Dataset: [scholo/MMB_dataset](https://huggingface.co/datasets/scholo/MMB_dataset)

## Data source

**Sidebar:** choose **Upload dataset** or **CSV files**.

### Upload dataset

Upload your own dataset as a **ZIP** or **CSV**:

1. Select **Upload dataset** in the sidebar.
2. Upload a **ZIP** (recommended) or **CSV** file.
3. Click **Use this file**. To remove the file and upload another, click **Clear uploaded dataset**.

**ZIP structure** (same layout as on disk): include your CSV at the root (or in a subfolder), an **`images/`** folder with the files referenced by the CSV, and optionally **`scenes/`** for counterfactual types. Example:

```
mydata.zip
  image_mapping_with_questions.csv
  images/
    scene_0001_original.png
    scene_0001_cf1.png
    ...
  scenes/   (optional)
    scene_0001_cf1.json
    ...
```
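One way to produce an archive with this layout is sketched below. It is an illustration only, not part of the app: `build_dataset_zip` is a hypothetical helper name, and it assumes the CSV, `images/`, and `scenes/` all sit directly inside one local dataset folder.

```python
import zipfile
from pathlib import Path


def build_dataset_zip(dataset_dir: Path, zip_path: Path) -> list[str]:
    """Pack top-level CSVs plus images/ and scenes/ (if present) into a ZIP.

    Hypothetical helper. Returns the sorted archive member names so the
    layout can be checked before uploading.
    """
    members: list[Path] = sorted(dataset_dir.glob("*.csv"))
    for folder in ("images", "scenes"):  # scenes/ is optional
        sub = dataset_dir / folder
        if sub.is_dir():
            members.extend(sorted(p for p in sub.rglob("*") if p.is_file()))

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in members:
            # Store paths relative to the dataset folder so the CSV lands at the ZIP root.
            zf.write(path, path.relative_to(dataset_dir).as_posix())

    with zipfile.ZipFile(zip_path) as zf:
        return sorted(zf.namelist())
```

For example, `build_dataset_zip(Path("data/example"), Path("mydata.zip"))` would write `mydata.zip` and return its member list.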

**CSV-only upload**: you can upload just a CSV. Image columns will show filenames only (no thumbnails) unless you use a ZIP with `images/`.
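Before uploading, it can help to check which referenced images are actually present. The sketch below is a hypothetical pre-upload check, not the app's own logic. It assumes the image columns are named `original_image`, `counterfactual1_image`, and `counterfactual2_image`, and that the images live in an `images/` folder next to the CSV.

```python
import csv
from pathlib import Path

# Assumed column names; adjust if your CSV uses different ones.
IMAGE_COLUMNS = ("original_image", "counterfactual1_image", "counterfactual2_image")


def missing_images(csv_path: Path) -> list[str]:
    """Return referenced image filenames that are absent from <csv_directory>/images/."""
    images_dir = csv_path.parent / "images"
    missing: list[str] = []
    with csv_path.open(newline="", encoding="utf-8") as fh:
        for row in csv.DictReader(fh):
            for column in IMAGE_COLUMNS:
                name = (row.get(column) or "").strip()
                # Empty cells are skipped; only named files are checked.
                if name and not (images_dir / name).is_file():
                    missing.append(name)
    return missing
```

An empty result suggests every referenced file should be available for thumbnails once the folder is zipped together with the CSV.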

### CSV files

The app discovers CSVs under **`output/`**, **`data/`**, and **`hf_dataset/`** (recursive). Use the sidebar **CSV file** dropdown to pick one.  
**On the Hugging Face Space**, `data/` and `hf_dataset/` are not in the repo (binary images are excluded), so use **Upload dataset** there instead.

- **`image_mapping_with_questions.csv`**: Original + counterfactual images, questions, difficulties, answer matrix (e.g. from the MMB Counterfactual Image Generation Tool).
- **`image_mapping.csv`**: Images only (`original_image`, `counterfactual1_image`, `counterfactual2_image`).
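The recursive discovery over the three folders can be approximated as follows. This is a minimal sketch under stated assumptions, not the code in `app.py`: `discover_csvs` and `SEARCH_ROOTS` are hypothetical names, and the app may filter or order results differently.

```python
from pathlib import Path

# Folder names taken from this README.
SEARCH_ROOTS = ("output", "data", "hf_dataset")


def discover_csvs(base_dir: Path) -> list[Path]:
    """Recursively collect CSV files under the known data folders, skipping missing ones."""
    found: list[Path] = []
    for root in SEARCH_ROOTS:
        folder = base_dir / root
        if folder.is_dir():
            found.extend(folder.rglob("*.csv"))
    return sorted(found)
```

Running `discover_csvs(Path("."))` from the repository root lists candidate files similar to those offered in the dropdown.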

Put your CSV next to **`images/`** and optionally **`scenes/`** (e.g. `data/example/image_mapping_with_questions.csv`, `data/example/images/`, `data/example/scenes/`). You get:

- **Image sets**: Each scene set as a row with **thumbnails** (Original, CF1, CF2), **counterfactual types** (e.g. CF1 `change_color`, CF2 `change_lighting`), scene ID, questions & difficulties. Optional: "Include answer matrix in each row."
- **Overview**: Scene-set count, column summary.
- **Difficulty & questions**: Bar charts of difficulty (easy/medium/hard) by question type (Original, CF1, CF2).
- **Counterfactual types**: Table of scene → CF1 type, CF2 type; bar chart of type counts by slot (from `scenes/*_cf1.json`, `*_cf2.json`).
- **Answer matrix**: 3×3 grid per scene (image × question).

Thumbnails use **`<csv_directory>/images/`**. If missing, filenames are shown. Counterfactual types come from **`<csv_directory>/scenes/`** (`cf_metadata.cf_type` in `*_cf1.json`, `*_cf2.json`). If `scenes/` is missing, types are omitted.
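The lookup of counterfactual types can be sketched as below. This is an illustration, not the app's implementation: `read_cf_types` is a hypothetical name, and it assumes each scene file is a JSON object and that the scene ID is the filename with the `_cf1` / `_cf2` suffix removed.

```python
import json
from pathlib import Path


def read_cf_types(scenes_dir: Path) -> dict[str, dict[str, str]]:
    """Map scene ID -> {"cf1": type, "cf2": type} from *_cf1.json and *_cf2.json."""
    types: dict[str, dict[str, str]] = {}
    if not scenes_dir.is_dir():
        return types  # no scenes/ folder: types are omitted
    for slot in ("cf1", "cf2"):
        for path in sorted(scenes_dir.glob(f"*_{slot}.json")):
            # "scene_0001_cf1" -> "scene_0001" (assumed naming convention)
            scene_id = path.stem[: -len(f"_{slot}")]
            data = json.loads(path.read_text(encoding="utf-8"))
            cf_type = (data.get("cf_metadata") or {}).get("cf_type")
            if cf_type:
                types.setdefault(scene_id, {})[slot] = cf_type
    return types
```

Files without a `cf_metadata.cf_type` value are skipped here, so their slot is left out of the result.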

## Deploy to Hugging Face (dataset + Space)

Push to both the dataset and the Streamlit Space:

```bash
pip install -r requirements-upload.txt huggingface_hub
hf auth login   # if not in PATH: python -m huggingface_hub.cli.hf auth login

python scripts/deploy_both.py
```

This uploads:
- **Dataset** → [scholo/MMB_dataset](https://huggingface.co/datasets/scholo/MMB_dataset)
- **Space** → [scholo/Datasetviewer](https://huggingface.co/spaces/scholo/Datasetviewer)

Options:
- `--dataset-only`: Only push the dataset
- `--space-only`: Only push the Space

**Dataset source:** The dataset is read from the **`hf_dataset/`** folder. Put your CSV, `images/`, and `scenes/` there. See `hf_dataset/README.md`.

To push only the dataset manually:

```bash
python scripts/upload_to_huggingface.py hf_dataset/image_mapping_with_questions.csv --repo-id scholo/MMB_dataset
```

## Setup and run

```bash
pip install -r requirements.txt
streamlit run app.py
```

Open the URL shown (e.g. `http://localhost:8501`).