## Dataset
To train our affordance segmentation model, we use two types of data:
* **General Segmentation Data**: This follows [LISA](https://github.com/dvlab-research/LISA).
* **Affordance Segmentation Data**: This is a large-scale dataset that we collected.
### General Segmentation Data
This data is organized as follows:
```
./data/
└── lisa_data
    ├── ade20k
    ├── coco
    ├── cocostuff
    ├── llava_dataset
    ├── mapillary
    ├── reason_seg
    ├── refer_seg
    └── vlpart
```
### Affordance Segmentation Data
We employ images from HANDAL, Open-X, GraspNet, EgoObjects, and RLBench in our affordance segmentation task.
The HANDAL data is downloaded and organized according to its official [repo](https://github.com/NVlabs/HANDAL).
The other datasets can be downloaded from [Hugging Face](https://huggingface.co/datasets/Dongming97/RAGNet).
The training data is organized as follows:
```
./data/
├── openx_train.pkl
├── graspnet_train.pkl
├── egoobjects_train.pkl
├── rlbench_train.pkl
├── handal_hard_reasoning_train.pkl
├── egoobjects_easy_reasoning_train.pkl
├── egoobjects_hard_reasoning_train.pkl
├── HANDAL
│   └── without_depth
│       ├── handal_dataset_adjustable_wrenches
│       ├── handal_dataset_combinational_wrenches
│       ├── handal_dataset_fixed_joint_pliers
│       └── ...
├── openx
│   ├── images
│   │   ├── fractal20220817_data
│   │   └── bridge
│   └── masks
│       ├── fractal20220817_data
│       └── bridge
├── graspnet
│   ├── images
│   ├── masks
│   ├── test_seen
│   └── test_novel
├── egoobjects
│   ├── images
│   └── masks
├── rlbench
│   ├── images
│   └── masks
└── 3doi
    ├── images
    └── masks
```
The evaluation data is stored in the same directory, but uses `*_val.pkl` files instead of `*_train.pkl`.
```
./data/
├── handal_mini_val.pkl
├── graspnet_test_seen_val.pkl
├── graspnet_test_novel_val.pkl
├── 3doi_val.pkl
├── handal_easy_reasoning_val.pkl
├── handal_hard_reasoning_val.pkl
└── 3doi_easy_reasoning_val.pkl
```
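The internal structure of the `.pkl` annotation files is not documented here, so a quick way to get oriented is to load one and look at its top-level type and size. The sketch below assumes only that the files are standard Python pickles; `summarize_pickle` is a hypothetical helper and not part of this repo. Only unpickle files from a source you trust, since unpickling can execute arbitrary code.

```python
import pickle
from pathlib import Path


def summarize_pickle(path):
    """Return the top-level type, size, and a small preview of a pickle file."""
    with Path(path).open("rb") as f:
        obj = pickle.load(f)  # only load files from a trusted source

    summary = {"type": type(obj).__name__}
    if hasattr(obj, "__len__"):
        summary["length"] = len(obj)
    if isinstance(obj, dict):
        # Show a few keys; the real key names are not documented in this README.
        summary["sample_keys"] = list(obj)[:5]
    elif isinstance(obj, (list, tuple)) and obj:
        summary["first_item_type"] = type(obj[0]).__name__
    return summary


if __name__ == "__main__":
    # Path taken from the layout above; change it to the file you want to inspect.
    target = Path("./data/handal_mini_val.pkl")
    if target.exists():
        print(summarize_pickle(target))
    else:
        print(f"{target} not found")
```

Printing the summary for one train file and one val file should show whether both splits share the same top-level layout.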
You can use the following script to confirm that the data is organized correctly:
```bash
python data_curation/check_dataset.py
```
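The contents of `data_curation/check_dataset.py` are not shown in this README. As a rough illustration of what such a check involves, the sketch below tests whether a handful of paths from the layouts above exist under `./data/`. The `EXPECTED` list and the `find_missing` helper are hypothetical and cover only part of the layout; the repo script remains the authoritative check.

```python
from pathlib import Path

# A partial, illustrative subset of the layout described above.
EXPECTED = [
    "lisa_data/ade20k",
    "lisa_data/coco",
    "openx_train.pkl",
    "graspnet_train.pkl",
    "handal_mini_val.pkl",
    "openx/images",
    "openx/masks",
    "graspnet/images",
    "graspnet/masks",
]


def find_missing(root, expected=EXPECTED):
    """Return the relative paths from `expected` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = find_missing("./data")
    if missing:
        print("Missing entries:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All listed entries were found.")
```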
### About data curation
1. **SAM2**: We use SAM2 to generate affordance masks when the dataset provides box annotations.
2. **Florence-2 + SAM2**: We use Florence-2 to generate initial segmentation masks for some complete objects, and then refine them with SAM2. Please see [Florence-2+SAM2](https://github.com/IDEA-Research/Grounded-SAM-2).
3. **VLPart + SAM2**: We use VLPart to generate boxes for object parts, and then refine them with SAM2. We refer to [VLPart](https://github.com/facebookresearch/VLPart).
   We provide our inference demo scripts in `data_curation/build_vlpart.py` and `data_curation/vlpart_sam2_tracking.py`.
4. **Reasoning Instruction**: We provide two example scripts to generate reasoning instructions for the affordance segmentation task:
   - `data_curation/prompt_generation_handal_easy_reasoning.py`: This script generates easy reasoning instructions for the HANDAL dataset.
   - `data_curation/prompt_generation_handal_hard_reasoning.py`: This script generates hard reasoning instructions for the HANDAL dataset.
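
For the box-prompted SAM2 step in item 1, the following is a minimal sketch using the image predictor from the public `sam2` package. It is not the authors' curation code: the checkpoint name `facebook/sam2-hiera-large`, the helper names, and the assumption that boxes arrive as `[x, y, width, height]` are all illustrative. SAM2's box prompt expects `[x_min, y_min, x_max, y_max]`, so the sketch converts the box first.

```python
def xywh_to_xyxy(box):
    """Convert [x, y, width, height] to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def box_to_mask(image_path, box_xywh, model_id="facebook/sam2-hiera-large"):
    """Predict one binary mask for a box prompt (assumed XYWH input)."""
    # Heavy dependencies are imported lazily so the helper above works without them.
    import numpy as np
    import torch
    from PIL import Image
    from sam2.sam2_image_predictor import SAM2ImagePredictor

    device = "cuda" if torch.cuda.is_available() else "cpu"
    image = np.array(Image.open(image_path).convert("RGB"))
    predictor = SAM2ImagePredictor.from_pretrained(model_id, device=device)

    with torch.inference_mode():
        predictor.set_image(image)
        masks, scores, _ = predictor.predict(
            box=np.array(xywh_to_xyxy(box_xywh), dtype=np.float32),
            multimask_output=False,  # return a single mask for the box
        )
    # masks has shape (1, H, W); return the single mask as a boolean array.
    return masks[0].astype(bool)
```

If a dataset already stores boxes as corner coordinates, skip `xywh_to_xyxy` and pass the box through unchanged.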