---
license: mit
---

# [CVPR 2025] GFS-VL: Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model

## Overview

GFS-VL is a framework proposed in our CVPR 2025 paper: [**Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model**](https://arxiv.org/pdf/2503.16282).

Our approach targets generalized few-shot 3D point cloud segmentation (GFS-PCS) by combining the strengths of two complementary data sources:

- **Dense but noisy pseudo-labels** from 3D Vision-Language Models
- **Precise yet sparse few-shot samples**

## Released Model Weights

This repository contains the following pre-trained weights:

- **PTv3 Backbones**: Our pre-trained Point Transformer V3 (PTv3) backbones
- **GFS-VL Models**: The complete GFS-VL few-shot segmentation framework

## Usage

For detailed usage instructions, model implementation, and training code, please refer to our [GitHub repository](https://github.com/ZhaochongAn/GFS-VL).

## Benchmarks

We introduce **two new challenging GFS-PCS benchmarks** with diverse novel classes for comprehensive evaluation of generalization. These benchmarks lay a foundation for real-world GFS-PCS advancements.

The benchmark datasets can be downloaded from our [Hugging Face dataset repository](https://huggingface.co/datasets/ZhaochongAn/GFS_PCS_Datasets).

## Citation

If you find our work useful, please consider citing our paper:

```bibtex
@inproceedings{an2025generalized,
  title={Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model},
  author={An, Zhaochong and Sun, Guolei and Liu, Yun and Li, Runjia and Han, Junlin and Konukoglu, Ender and Belongie, Serge},
  booktitle={CVPR},
  year={2025}
}
```
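
## Example: Downloading the Benchmarks

The benchmark datasets referenced in the Benchmarks section can also be fetched programmatically. The following is a minimal sketch, assuming the `huggingface_hub` package is installed (`pip install huggingface_hub`). The helper name `download_benchmarks` and the default target directory `data/gfs_pcs` are hypothetical choices made for illustration; they are not part of the official GFS-VL code, and the GitHub repository remains the authoritative source for the expected data layout.

```python
from pathlib import Path

# Dataset repository named in the Benchmarks section of this README.
DATASET_REPO_ID = "ZhaochongAn/GFS_PCS_Datasets"


def download_benchmarks(local_dir: str = "data/gfs_pcs") -> Path:
    """Download the benchmark dataset repository and return its local path.

    `local_dir` is an illustrative default, not a path required by GFS-VL.
    """
    # Imported lazily so this module can be imported without the dependency.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id=DATASET_REPO_ID,
        repo_type="dataset",  # the benchmarks live in a dataset repo, not a model repo
        local_dir=local_dir,
    )
    return Path(path)
```

Calling `download_benchmarks()` fetches every file in the dataset repository, which may be large. To fetch only a subset, `snapshot_download` also accepts an `allow_patterns` argument.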