| ## Dataset |
|
|
| To train our affordance segmentation model, we use two types of data: |
| * **General Segmentation Data**: This is prepared following [LISA](https://github.com/dvlab-research/LISA). |
| * **Affordance Segmentation Data**: This is a large-scale dataset that we collected. |
|
|
| ### General Segmentation Data |
| This data is organized as follows: |
| ``` |
| ./data/ |
| ├── lisa_data |
| │   ├── ade20k |
| │   ├── coco |
| │   ├── cocostuff |
| │   ├── llava_dataset |
| │   ├── mapillary |
| │   ├── reason_seg |
| │   ├── refer_seg |
| │   ├── vlpart |
| ``` |
|
|
| ### Affordance Segmentation Data |
|
|
| We employ images from HANDAL, Open-X, GraspNet, EgoObjects, and RLBench in our affordance segmentation task. |
|
|
| The HANDAL data is downloaded and organized according to its official [repo](https://github.com/NVlabs/HANDAL). |
| The other data can be downloaded from [Hugging Face](https://huggingface.co/datasets/Dongming97/RAGNet). |
|
|
| The training data is organized as follows: |
| ``` |
| ./data/ |
| ├── openx_train.pkl |
| ├── graspnet_train.pkl |
| ├── egoobjects_train.pkl |
| ├── rlbench_train.pkl |
| ├── handal_hard_reasoning_train.pkl |
| ├── egoobjects_easy_reasoning_train.pkl |
| ├── egoobjects_hard_reasoning_train.pkl |
| ├── HANDAL |
| │   ├── without_depth |
| │   │   ├── handal_dataset_adjustable_wrenches |
| │   │   ├── handal_dataset_combinational_wrenches |
| │   │   ├── handal_dataset_fixed_joint_pliers |
| │   │   ├── ... |
| ├── openx |
| │   ├── images |
| │   │   ├── fractal20220817_data |
| │   │   ├── bridge |
| │   ├── masks |
| │   │   ├── fractal20220817_data |
| │   │   ├── bridge |
| ├── graspnet |
| │   ├── images |
| │   ├── masks |
| │   ├── test_seen |
| │   ├── test_novel |
| ├── egoobjects |
| │   ├── images |
| │   ├── masks |
| ├── rlbench |
| │   ├── images |
| │   ├── masks |
| ├── 3doi |
| │   ├── images |
| │   ├── masks |
| ``` |
|
|
| The evaluation data is stored in the same directory, but uses `*_val.pkl` files instead of `*_train.pkl`. |
|
|
| ``` |
| ./data/ |
| ├── handal_mini_val.pkl |
| ├── graspnet_test_seen_val.pkl |
| ├── graspnet_test_novel_val.pkl |
| ├── 3doi_val.pkl |
| ├── handal_easy_reasoning_val.pkl |
| ├── handal_hard_reasoning_val.pkl |
| ├── 3doi_easy_reasoning_val.pkl |
| ``` |
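Before training, it can help to confirm that each annotation file loads at all. The following is a minimal sketch that only reports the top-level Python type and entry count of each `.pkl` file; it makes no assumption about the internal schema, which is not documented here. The helper name `summarize_pickle` is hypothetical and not part of this repository. Only unpickle files from sources you trust, since `pickle.load` can execute arbitrary code.

```python
import pickle
from pathlib import Path


def summarize_pickle(path):
    """Load one annotation pickle and return (type name, entry count or None).

    Hypothetical helper for illustration; the schema of the files is unknown,
    so only the top-level container is inspected.
    """
    with open(path, "rb") as f:
        obj = pickle.load(f)
    size = len(obj) if hasattr(obj, "__len__") else None
    return type(obj).__name__, size


if __name__ == "__main__":
    # Assumes the ./data/ layout described in this section.
    for pkl_path in sorted(Path("./data").glob("*.pkl")):
        type_name, size = summarize_pickle(pkl_path)
        print(f"{pkl_path.name}: {type_name}, entries={size}")
```

A file that fails to load or reports zero entries is usually a sign of an incomplete download.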
|
|
| You can use the following script to confirm that the data is organized correctly: |
| ```bash |
| python data_curation/check_dataset.py |
| ``` |
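For readers who want a quick check without the repository script, the sketch below shows one way such a layout check could work. It is not the content of `data_curation/check_dataset.py`, whose behavior is not described here. The list `EXPECTED` and the function `find_missing` are hypothetical names, and the list covers only a subset of the top-level entries from the listings above.

```python
from pathlib import Path

# Subset of top-level entries from the directory listings in this section.
# Extend this list to cover the files and folders you actually need.
EXPECTED = [
    "lisa_data",
    "openx_train.pkl",
    "graspnet_train.pkl",
    "egoobjects_train.pkl",
    "rlbench_train.pkl",
    "HANDAL",
    "openx",
    "graspnet",
    "egoobjects",
    "rlbench",
    "3doi",
    "handal_mini_val.pkl",
    "3doi_val.pkl",
]


def find_missing(root, expected=EXPECTED):
    """Return the entries of `expected` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = find_missing("./data")
    if missing:
        print("Missing entries:")
        for rel in missing:
            print(f"  ./data/{rel}")
    else:
        print("All listed entries are present.")
```

This only checks that paths exist; it does not validate image counts or annotation contents.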
|
|
| ### About data curation |
| 1. **SAM2**: We use SAM2 to generate affordance masks when the dataset provides box annotations. |
| 2. **Florence-2 + SAM2**: We use Florence-2 to generate initial segmentation masks for some complete objects, and then refine them with SAM2. Please see [Florence-2+SAM2](https://github.com/IDEA-Research/Grounded-SAM-2). |
| 3. **VLPart + SAM2**: We use VLPart to generate boxes for object parts, and then refine them with SAM2. See [VLPart](https://github.com/facebookresearch/VLPart). |
| We provide our inference demo scripts in `data_curation/build_vlpart.py` and `data_curation/vlpart_sam2_tracking.py`. |
| 4. **Reasoning Instruction**: We provide two example scripts that generate reasoning instructions for the affordance segmentation task: |
| - `data_curation/prompt_generation_handal_easy_reasoning.py`: This script generates easy reasoning instructions for the HANDAL dataset. |
| - `data_curation/prompt_generation_handal_hard_reasoning.py`: This script generates hard reasoning instructions for the HANDAL dataset. |
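The box-to-mask step in item 1 can be sketched as follows. This is an illustration rather than the curation code used for this dataset. It assumes `predictor` exposes the `set_image` and `predict(box=..., multimask_output=...)` methods of SAM2's `SAM2ImagePredictor`, and that boxes are given in `[x_min, y_min, x_max, y_max]` pixel coordinates. The function name `boxes_to_masks` is hypothetical.

```python
def boxes_to_masks(predictor, image, boxes):
    """Prompt a SAM2-style image predictor with one box at a time.

    Assumption: `predictor` follows the SAM2ImagePredictor interface, where
    `predict` returns (masks, scores, low_res_logits).
    Returns one mask per input box.
    """
    # The image embedding is computed once and reused for every box prompt.
    predictor.set_image(image)
    masks_per_box = []
    for box in boxes:
        # multimask_output=False asks for a single mask per prompt.
        masks, _scores, _logits = predictor.predict(
            box=box, multimask_output=False
        )
        masks_per_box.append(masks[0])
    return masks_per_box
```

With the official `sam2` package installed, a predictor could be created with `SAM2ImagePredictor.from_pretrained(...)` from `sam2.sam2_image_predictor`; the checkpoint used for this dataset is not stated here, so that choice is left open.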