## Dataset
To train our affordance segmentation model, we use two types of data:
* **General Segmentation Data**: This follows [LISA](https://github.com/dvlab-research/LISA).
* **Affordance Segmentation Data**: This is a large-scale dataset that we collect.
### General Segmentation Data
These data are organized as follows:
```
./data/
├── lisa_data
│   ├── ade20k
│   ├── coco
│   ├── cocostuff
│   ├── llava_dataset
│   ├── mapillary
│   ├── reason_seg
│   ├── refer_seg
│   ├── vlpart
```
### Affordance Segmentation Data
We employ images from HANDAL, Open-X, GraspNet, EgoObjects, and RLBench in our affordance segmentation task.
The HANDAL data is downloaded and organized according to its official [repo](https://github.com/NVlabs/HANDAL).
The other data can be downloaded from [Hugging Face](https://huggingface.co/datasets/Dongming97/RAGNet).
The training data is organized as follows:
```
./data/
├── openx_train.pkl
├── graspnet_train.pkl
├── egoobjects_train.pkl
├── rlbench_train.pkl
├── handal_hard_reasoning_train.pkl
├── egoobjects_easy_reasoning_train.pkl
├── egoobjects_hard_reasoning_train.pkl
├── HANDAL
│   ├── without_depth
│   │   ├── handal_dataset_adjustable_wrenches
│   │   ├── handal_dataset_combinational_wrenches
│   │   ├── handal_dataset_fixed_joint_pliers
│   │   ├── ...
├── openx
│   ├── images
│   │   ├── fractal20220817_data
│   │   ├── bridge
│   ├── masks
│   │   ├── fractal20220817_data
│   │   ├── bridge
├── graspnet
│   ├── images
│   ├── masks
│   ├── test_seen
│   ├── test_novel
├── egoobjects
│   ├── images
│   ├── masks
├── rlbench
│   ├── images
│   ├── masks
├── 3doi
│   ├── images
│   ├── masks
```
The evaluation data is stored in the same directory, but uses `*_val.pkl` files instead of `*_train.pkl`:
```
./data/
├── handal_mini_val.pkl
├── graspnet_test_seen_val.pkl
├── graspnet_test_novel_val.pkl
├── 3doi_val.pkl
├── handal_easy_reasoning_val.pkl
├── handal_hard_reasoning_val.pkl
├── 3doi_easy_reasoning_val.pkl
```
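This document does not describe the internal layout of the `.pkl` annotation files, so it can help to inspect one before writing a loader. The following is a minimal sketch that makes no assumption about the schema; `summarize_pickle` is a hypothetical helper, not part of the repository. Only unpickle files from a source you trust, since `pickle.load` can execute arbitrary code.

```python
import pickle
from pathlib import Path


def summarize_pickle(path):
    """Load a pickle file and report its top-level type and size.

    Hypothetical helper: it does not assume any particular schema.
    """
    with Path(path).open("rb") as f:
        obj = pickle.load(f)

    summary = {"type": type(obj).__name__}
    if hasattr(obj, "__len__"):
        summary["length"] = len(obj)
    if isinstance(obj, dict):
        # Show a few keys so the structure can be explored further.
        summary["keys"] = list(obj)[:5]
    elif isinstance(obj, (list, tuple)) and obj:
        summary["first_item_type"] = type(obj[0]).__name__
    return summary


# Example usage (path taken from the listing above):
# print(summarize_pickle("./data/handal_mini_val.pkl"))
```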
You can use the following script to confirm that the data is organized correctly:
```bash
python data_curation/check_dataset.py
```
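For a quick check that does not depend on the repository script, the listings above can be turned into a short existence test. This is only a sketch: it checks that the listed files and top-level folders exist under `./data`, not that their contents are valid, and `find_missing` is a hypothetical helper whose behavior may differ from `check_dataset.py`.

```python
from pathlib import Path

# File and folder names copied from the listings in this document.
TRAIN_FILES = [
    "openx_train.pkl",
    "graspnet_train.pkl",
    "egoobjects_train.pkl",
    "rlbench_train.pkl",
    "handal_hard_reasoning_train.pkl",
    "egoobjects_easy_reasoning_train.pkl",
    "egoobjects_hard_reasoning_train.pkl",
]
VAL_FILES = [
    "handal_mini_val.pkl",
    "graspnet_test_seen_val.pkl",
    "graspnet_test_novel_val.pkl",
    "3doi_val.pkl",
    "handal_easy_reasoning_val.pkl",
    "handal_hard_reasoning_val.pkl",
    "3doi_easy_reasoning_val.pkl",
]
DATA_DIRS = ["HANDAL", "openx", "graspnet", "egoobjects", "rlbench", "3doi"]


def find_missing(data_root, names):
    """Return the entries of `names` that do not exist under `data_root`."""
    root = Path(data_root)
    return [name for name in names if not (root / name).exists()]


if __name__ == "__main__":
    groups = {"train": TRAIN_FILES, "val": VAL_FILES, "folders": DATA_DIRS}
    for label, names in groups.items():
        missing = find_missing("./data", names)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{label}: {status}")
```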
### About data curation
1. **SAM2**: We use SAM2 to generate affordance masks when the dataset provides box annotations.
2. **Florence-2 + SAM2**: We use Florence-2 to generate the initial segmentation masks of some complete objects, and then refine them with SAM2. Please see [Florence-2+SAM2](https://github.com/IDEA-Research/Grounded-SAM-2).
3. **VLPart + SAM2**: We use VLPart to generate boxes for object parts, and then refine them with SAM2. We refer to [VLPart](https://github.com/facebookresearch/VLPart).
We provide our inference demo scripts in `data_curation/build_vlpart.py` and `data_curation/vlpart_sam2_tracking.py`.
4. **Reasoning Instruction**: We provide two example scripts to generate reasoning instructions for the affordance segmentation task:
- `data_curation/prompt_generation_handal_easy_reasoning.py`: This script generates easy reasoning instructions for the HANDAL dataset.
- `data_curation/prompt_generation_handal_hard_reasoning.py`: This script generates hard reasoning instructions for the HANDAL dataset.
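As an illustration of the box-to-mask step (item 1 above), the following sketch prompts SAM2 with a single box and returns a binary mask. It is not the curation code used for this dataset: it assumes the official `sam2` package with its `SAM2ImagePredictor` class, boxes in pixel `(x0, y0, x1, y1)` format, and the `facebook/sam2-hiera-large` checkpoint, all of which are assumptions not stated in this document. `load_sam2_predictor` and `box_to_mask` are hypothetical helper names.

```python
import numpy as np


def load_sam2_predictor(model_id="facebook/sam2-hiera-large"):
    """Build a SAM2 image predictor (assumes the `sam2` package is installed)."""
    # Imported lazily so box_to_mask can be used without SAM2 installed.
    from sam2.sam2_image_predictor import SAM2ImagePredictor

    return SAM2ImagePredictor.from_pretrained(model_id)


def box_to_mask(predictor, image, box_xyxy):
    """Prompt the predictor with one box and return a boolean H x W mask.

    image: H x W x 3 uint8 RGB array.
    box_xyxy: (x0, y0, x1, y1) in pixels (assumed annotation format).
    """
    predictor.set_image(image)
    masks, scores, _ = predictor.predict(
        box=np.asarray(box_xyxy, dtype=np.float32),
        multimask_output=False,  # request a single mask for the box prompt
    )
    return masks[0] > 0


# Example usage (requires SAM2 and downloads a checkpoint):
# predictor = load_sam2_predictor()
# mask = box_to_mask(predictor, image, (40, 60, 200, 220))
```

Passing the predictor in as an argument keeps the model loading separate from the prompting logic, so the same helper can be reused for boxes that come from dataset annotations or from a detector such as VLPart.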