Dataset
To train our affordance segmentation model, we use two types of data:
- General Segmentation Data: This follows LISA.
- Affordance Segmentation Data: This is a large-scale dataset that we collect.
General Segmentation Data
The data are organized as follows:
./data/
├── lisa_data
│   ├── ade20k
│   ├── coco
│   ├── cocostuff
│   ├── llava_dataset
│   ├── mapillary
│   ├── reason_seg
│   ├── refer_seg
│   └── vlpart
Affordance Segmentation Data
We employ images from HANDAL, Open-X, GraspNet, EgoObjects, and RLBench in our affordance segmentation task.
The HANDAL data is downloaded and organized according to its official repo. The other data can be downloaded from Hugging Face.
The training data is organized as follows:
./data/
├── openx_train.pkl
├── graspnet_train.pkl
├── egoobjects_train.pkl
├── rlbench_train.pkl
├── handal_hard_reasoning_train.pkl
├── egoobjects_easy_reasoning_train.pkl
├── egoobjects_hard_reasoning_train.pkl
├── HANDAL
│   └── without_depth
│       ├── handal_dataset_adjustable_wrenches
│       ├── handal_dataset_combinational_wrenches
│       ├── handal_dataset_fixed_joint_pliers
│       └── ...
├── openx
│   ├── images
│   │   ├── fractal20220817_data
│   │   └── bridge
│   └── masks
│       ├── fractal20220817_data
│       └── bridge
├── graspnet
│   ├── images
│   ├── masks
│   ├── test_seen
│   └── test_novel
├── egoobjects
│   ├── images
│   └── masks
├── rlbench
│   ├── images
│   └── masks
├── 3doi
│   ├── images
│   └── masks
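The internal structure of the *_train.pkl files is not described in this section. Before writing a loader, it can help to look at what a file contains. The following is a minimal sketch that assumes only that the files are standard Python pickles; the helper name summarize_pickle is hypothetical and not part of this repository.

```python
import pickle
import sys
from pathlib import Path


def summarize_pickle(path):
    """Load a pickle file and describe its top-level object.

    Assumption: the file is a standard Python pickle. Only load pickles
    from a source you trust, because unpickling can execute arbitrary code.
    """
    with Path(path).open("rb") as f:
        obj = pickle.load(f)

    summary = {"type": type(obj).__name__}
    if hasattr(obj, "__len__"):
        summary["len"] = len(obj)
    if isinstance(obj, dict):
        # Show a few keys so the schema can be guessed without dumping everything.
        summary["keys"] = list(obj.keys())[:5]
    elif isinstance(obj, (list, tuple)) and obj:
        summary["first_item_type"] = type(obj[0]).__name__
    return summary


if __name__ == "__main__":
    # Usage: python summarize_pickle.py ./data/openx_train.pkl
    for arg in sys.argv[1:]:
        print(arg, summarize_pickle(arg))
```

The summary reports only the container type, its length, and a few keys or the type of the first item, so it makes no claim about the actual annotation fields.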
The evaluation data is in the same directory, but uses *_val.pkl files instead of *_train.pkl:
./data/
├── handal_mini_val.pkl
├── graspnet_test_seen_val.pkl
├── graspnet_test_novel_val.pkl
├── 3doi_val.pkl
├── handal_easy_reasoning_val.pkl
├── handal_hard_reasoning_val.pkl
└── 3doi_easy_reasoning_val.pkl
You can use the following script to check whether the data is organized correctly:
python data_curation/check_dataset.py
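For a quick look at which top-level entries are missing before running the full check, something like the following may be enough. It is a sketch and not the repository's data_curation/check_dataset.py, whose contents are not shown here. The entry names are copied from the listings in this section, and the helper names find_missing and report are hypothetical.

```python
from pathlib import Path

# Top-level entries copied from the training listing in this section.
EXPECTED_TRAIN = [
    "openx_train.pkl", "graspnet_train.pkl", "egoobjects_train.pkl",
    "rlbench_train.pkl", "handal_hard_reasoning_train.pkl",
    "egoobjects_easy_reasoning_train.pkl", "egoobjects_hard_reasoning_train.pkl",
    "HANDAL", "openx", "graspnet", "egoobjects", "rlbench", "3doi",
]

# Entries copied from the evaluation listing in this section.
EXPECTED_VAL = [
    "handal_mini_val.pkl", "graspnet_test_seen_val.pkl",
    "graspnet_test_novel_val.pkl", "3doi_val.pkl",
    "handal_easy_reasoning_val.pkl", "handal_hard_reasoning_val.pkl",
    "3doi_easy_reasoning_val.pkl",
]


def find_missing(root, expected):
    """Return the entries of `expected` that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


def report(root="./data"):
    """Print one status line per split: 'ok' or the list of missing entries."""
    for name, expected in (("train", EXPECTED_TRAIN), ("val", EXPECTED_VAL)):
        missing = find_missing(root, expected)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{name}: {status}")


if __name__ == "__main__":
    report()
```

This only tests for existence. It does not open the pickle files or verify that images and masks match.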
About data curation
- SAM2: We use SAM2 to generate affordance masks when the dataset provides box annotations.
- Florence-2 + SAM2: We use Florence-2 to generate the initial segmentation masks of some complete objects, and then refine them with SAM2. Please see Florence-2+SAM2.
- VLPart + SAM2: We use VLPart to generate boxes of object parts, and then refine them with SAM2. Please see VLPart. We provide our inference demo scripts in data_curation/build_vlpart.py and data_curation/vlpart_sam2_tracking.py.
- Reasoning Instruction: We provide two example scripts to generate reasoning instructions for the affordance segmentation task:
  - data_curation/prompt_generation_handal_easy_reasoning.py: This script generates easy reasoning instructions for the HANDAL dataset.
  - data_curation/prompt_generation_handal_hard_reasoning.py: This script generates hard reasoning instructions for the HANDAL dataset.
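The SAM2 item in the list above describes turning box annotations into masks. A minimal sketch of that step is shown below. It is not the curation code used for this dataset. It assumes the public sam2 package and its SAM2ImagePredictor class. The checkpoint name facebook/sam2-hiera-large is an illustrative choice, and the helper names build_predictor and box_to_mask are hypothetical.

```python
def build_predictor(model_id="facebook/sam2-hiera-large"):
    """Load a SAM2 image predictor.

    Assumptions: the `sam2` package is installed and the checkpoint named by
    `model_id` can be downloaded. The import is deferred so that this module
    can be imported without SAM2 present.
    """
    from sam2.sam2_image_predictor import SAM2ImagePredictor
    return SAM2ImagePredictor.from_pretrained(model_id)


def box_to_mask(predictor, image, box_xyxy):
    """Prompt the predictor with a single box and return (mask, score).

    `image` is an RGB image (H x W x 3 array or PIL image) and `box_xyxy`
    is a length-4 box in pixel coordinates, ordered x_min, y_min, x_max, y_max.
    """
    predictor.set_image(image)
    # multimask_output=False asks for a single mask for the box prompt.
    masks, scores, _ = predictor.predict(box=box_xyxy, multimask_output=False)
    return masks[0], float(scores[0])


# Example usage (requires SAM2 and an image loaded as an RGB array):
#   predictor = build_predictor()
#   mask, score = box_to_mask(predictor, image, [x_min, y_min, x_max, y_max])
```

Passing the predictor in as an argument keeps the mask step separate from model loading, so the same function can be reused for boxes from dataset annotations or from a part detector such as VLPart.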