## Download Raw Data

Our final model uses a subset of [Objaverse](https://huggingface.co/datasets/allenai/objaverse) provided by [LGM](https://github.com/ashawkey/objaverse_filter).

## Data Preprocess

We provide several scripts to preprocess the raw GLB files [here](./preprocess/). These scripts are minimal implementations and illustrate the whole preprocessing pipeline on a single 3D object.

1. Sample points from the mesh surface
```
python datasets/preprocess/mesh_to_point.py --input assets/objects/scissors.glb --output preprocessed_data
```

2. Render images
```
python datasets/preprocess/render.py --input assets/objects/scissors.glb --output preprocessed_data
```

3. Remove the background from the rendered images and resize to 90%
```
python datasets/preprocess/rmbg.py --input preprocessed_data/scissors/rendering.png --output preprocessed_data
```

4. (Optional) Calculate IoU
```
python datasets/preprocess/calculate_iou.py --input assets/objects/scissors.glb --output preprocessed_data
```

After preprocessing, you can write a dataset configuration file that points to your own data paths, following the example configuration file.

To preprocess a folder of meshes, run
```
python datasets/preprocess/preprocess.py --input assets/objects --output preprocessed_data
```
This also generates a configuration file at `./preprocessed_data/object_part_configs.json`.

## Dataset Configuration

The training code requires a specific dataset configuration format. We provide an example configuration [here](example_configs.json). You can use it as a template to configure your own dataset. A minimal valid configuration file looks like this:
```
[
    {
        "mesh_path": "/path/to/object.glb",
        "surface_path": "/path/to/object.npy",
        "image_path": "/path/to/object.png",
        "num_parts": 4,
        "iou_mean": 0.5,
        "iou_max": 0.9,
        "valid": true
    },
    {
        ...
    },
    ...
]
```

Explanation:
- `mesh_path`: The path to the GLB file of the object.
- `surface_path`: The path to the npy file of the object surface points.
- `image_path`: The path to the rendered image of the object (after removing background).
- `num_parts`: The number of parts of the object.
- `iou_mean`: The mean IoU of the object parts.
- `iou_max`: The max IoU of the object parts.
- `valid`: Whether the object is valid. If set to false, the object will be filtered out during training.
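
Before training on your own data, it can help to check that a configuration file matches the fields listed above. The following is a minimal sketch, not part of the provided scripts: `check_entry` and `load_valid_entries` are hypothetical helper names, and the expected types are inferred from the example configuration rather than taken from the training code.

```python
import json
from pathlib import Path

# Keys and expected types, inferred from the minimal configuration example.
# This mapping is an assumption; the training code may accept other types.
REQUIRED_FIELDS = {
    "mesh_path": str,
    "surface_path": str,
    "image_path": str,
    "num_parts": int,
    "iou_mean": (int, float),
    "iou_max": (int, float),
    "valid": bool,
}


def check_entry(entry):
    """Return a list of problems found in one configuration entry."""
    if not isinstance(entry, dict):
        return ["entry is not a JSON object"]
    problems = []
    for key, expected in REQUIRED_FIELDS.items():
        if key not in entry:
            problems.append(f"missing key: {key}")
            continue
        value = entry[key]
        # bool is a subclass of int in Python, so reject it for numeric fields.
        if expected is not bool and isinstance(value, bool):
            problems.append(f"wrong type for {key}: bool")
        elif not isinstance(value, expected):
            problems.append(f"wrong type for {key}: {type(value).__name__}")
    return problems


def load_valid_entries(config_path):
    """Load a configuration file and keep only entries whose `valid` is true."""
    entries = json.loads(Path(config_path).read_text())
    if not isinstance(entries, list):
        raise ValueError("top-level JSON value must be a list")
    kept = []
    for index, entry in enumerate(entries):
        problems = check_entry(entry)
        if problems:
            raise ValueError(f"entry {index}: {'; '.join(problems)}")
        if entry["valid"]:
            kept.append(entry)
    return kept
```

For example, `load_valid_entries("preprocessed_data/object_part_configs.json")` would return the entries that are not filtered out by `valid`, and raise a `ValueError` naming the first malformed entry otherwise. The sketch does not check that the referenced files exist on disk.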