File size: 3,512 Bytes

0453c63

## Dataset

To train our affordance segmentation model, we use two types of data:
* **General Segmentation Data**: This follows [LISA](https://github.com/dvlab-research/LISA).
* **Affordance Segmentation Data**: This is a large-scale dataset that we collect.

### General Segmentation Data
These data is organized as follows:
```
./data/
├── lisa_data
│   ├── ade20k
│   ├── coco
│   ├── cocostuff
│   ├── llava_dataset
│   ├── mapillary
│   ├── reason_seg
│   ├── refer_seg
│   ├── vlpart
```

### Affordance Segmentation Data

We employ images from HANDAL, Open-X, GraspNet, EgoObjects, and RLBench in our affordance segmentation task. 

The HANDAL data is downloaded and organized according to its official [repo](https://github.com/NVlabs/HANDAL).
Other data can be downloaded from the [Hugging Face](https://huggingface.co/datasets/Dongming97/RAGNet).

The training data is organized as follows:
```
./data/
├── openx_train.pkl
├── graspnet_train.pkl
├── egoobjects_train.pkl
├── rlbench_train.pkl
├── handal_hard_reasoning_train.pkl
├── egoobjects_easy_reasoning_train.pkl
├── egoobjects_hard_reasoning_train.pkl
├── HANDAL
│   ├── without_depth
│       ├── handal_dataset_adjustable_wrenches
│       ├── handal_dataset_combinational_wrenches
│       ├── handal_dataset_fixed_joint_pliers
│       ├── ...
├── openx
│   ├── images
│       ├── fractal20220817_data
│       ├── bridge
│   ├── masks
│       ├── fractal20220817_data
│       ├── bridge
├── graspnet
│   ├── images
│   ├── masks
│   ├── test_seen
│   ├── test_novel
├── egoobjects
│   ├── images
│   ├── masks
├── rlbench
│   ├── images
│   ├── masks
├── 3doi
│   ├── images
│   ├── masks
```

The evaluation data is also in the same dictory, but with the `*_eval.pkl` files instead of `*_train.pkl`.

```
./data/
├── handal_mini_val.pkl
├── graspnet_test_seen_val.pkl
├── graspnet_test_novel_val.pkl
├── 3doi_val.pkl
├── handal_easy_reasoning_val.pkl
├── handal_hard_reasoning_val.pkl
├── 3doi_easy_reasoning_val.pkl
```

You can use the following script to confirm if data is organized correctly:
```bash
python data_curation/check_dataset.py
```

### About data curation
1. **SAM2**: We use SAM2 to generate affordance mask if the dataset provides box annotation.
2. **Florence-2 + SAM2**: We use Florence-2 to generate the initial segmentation masks of some complete objects, and then refine them with SAM2. Please see [Florence-2+SAM2](https://github.com/IDEA-Research/Grounded-SAM-2).
3. **VLPart + SAM2**: We use VLPart to generate box of object part, and then refine them with SAM2. We refer to [VLPart](https://github.com/facebookresearch/VLPart). 
We provide our inference demo scripts in `data_curation/build_vlpart.py` and `data_curation/vlpart_sam2_tracking.py`.
4. **Reasoning Instruction**: We provide two example scripts to generate reasoning instructions for the affordance segmentation task:
   - `data_curation/prompt_generation_handal_easy_reasoning.py`: This script generates easy reasoning instructions for the HANDAL dataset.
   - `data_curation/prompt_generation_handal_hard_reasoning.py`: This script generates hard reasoning instructions for the HANDAL dataset.