---
license: mit
---
# [CVPR 2025] GFS-VL: Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model
## Overview
GFS-VL is a novel framework proposed in our CVPR 2025 paper: [**Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model**](https://arxiv.org/pdf/2503.16282).
Our approach combines two complementary sources of supervision:
- **Dense but noisy pseudo-labels** from 3D vision-language models
- **Precise yet sparse few-shot samples**

Exploiting the strengths of both enables effective generalized few-shot 3D point cloud segmentation (GFS-PCS).
## Released Model Weights
This repository contains the following pre-trained weights:
- **PTv3 Backbones**: Our pre-trained Point Transformer V3 (PTv3) backbones
- **GFS-VL Models**: Weights for the complete GFS-VL few-shot segmentation framework
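
As a minimal sketch, the weights can likely be fetched with the `huggingface_hub` client. The repository id `ZhaochongAn/GFS_VL` is an assumption inferred from where this model card is hosted, and the target directory `checkpoints/gfs_vl` is a hypothetical name. The sketch downloads the whole repository rather than guessing individual checkpoint file names.

```python
# Assumption: the weights live in the Hub model repository "ZhaochongAn/GFS_VL".
REPO_ID = "ZhaochongAn/GFS_VL"


def download_kwargs(local_dir="checkpoints/gfs_vl"):
    """Build the keyword arguments passed to huggingface_hub.snapshot_download."""
    return {
        "repo_id": REPO_ID,
        "repo_type": "model",
        # Hypothetical target folder; any writable path works.
        "local_dir": local_dir,
    }


def download_weights(local_dir="checkpoints/gfs_vl"):
    """Download every file in the repository and return the local folder path."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**download_kwargs(local_dir))
```

Calling `download_weights()` requires `pip install huggingface_hub` and network access. How the downloaded checkpoints are loaded is covered by the training code, not by this sketch.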
## Usage
For detailed usage instructions, model implementation, and training code, please refer to our [GitHub repository](https://github.com/ZhaochongAn/GFS-VL).
## Benchmarks
We introduce **two new challenging GFS-PCS benchmarks** with diverse novel classes for comprehensive generalization evaluation. These benchmarks lay a solid foundation for real-world GFS-PCS advancements.
The benchmark datasets can be downloaded from our [Hugging Face dataset repository](https://huggingface.co/datasets/ZhaochongAn/GFS_PCS_Datasets).
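
Because the file layout of the dataset repository is not described here, it can help to inspect it before downloading. The sketch below is one possible way to do that with `huggingface_hub`. The repository id comes from the link above; the helper names and the `data/gfs_pcs` directory are hypothetical.

```python
from collections import Counter

# Repository id taken from the dataset link above.
DATASET_REPO = "ZhaochongAn/GFS_PCS_Datasets"


def count_by_top_level(paths):
    """Count files per top-level entry (a folder name, or a file name at the root)."""
    return Counter(path.split("/", 1)[0] for path in paths)


def list_benchmark_files():
    """Return the paths of all files in the dataset repository (needs network)."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import list_repo_files

    return list_repo_files(DATASET_REPO, repo_type="dataset")


def download_benchmarks(local_dir="data/gfs_pcs"):
    """Download the full dataset repository into local_dir (hypothetical path)."""
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=DATASET_REPO, repo_type="dataset", local_dir=local_dir
    )
```

For example, `count_by_top_level(list_benchmark_files())` gives a quick overview of what the repository contains before `download_benchmarks()` is run.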
## Citation
If you find our work useful, please consider citing our paper:
```bibtex
@inproceedings{an2025generalized,
title={Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model},
author={An, Zhaochong and Sun, Guolei and Liu, Yun and Li, Runjia and Han, Junlin and Konukoglu, Ender and Belongie, Serge},
booktitle={CVPR},
year={2025}
}
```