---
license: mit
---

# [CVPR 2025] GFS-VL: Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model

## Overview

GFS-VL is a novel framework proposed in our CVPR 2025 paper: [**Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model**](https://arxiv.org/pdf/2503.16282).

Our approach combines two complementary sources of supervision:
- **Dense but noisy pseudo-labels** from 3D vision-language models
- **Precise yet sparse few-shot samples**

Exploiting the strengths of both enables effective generalized few-shot 3D point cloud segmentation (GFS-PCS).

## Released Model Weights

This repository contains the following pre-trained weights:
- **PTv3 Backbones**: Our pre-trained Point Transformer V3 (PTv3) backbones
- **GFS-VL Models**: Weights for the complete GFS-VL few-shot segmentation framework

## Usage

For detailed usage instructions, model implementation, and training code, please refer to our [GitHub repository](https://github.com/ZhaochongAn/GFS-VL).

## Benchmarks

We introduce **two new challenging GFS-PCS benchmarks** with diverse novel classes for comprehensive generalization evaluation. These benchmarks lay a solid foundation for real-world GFS-PCS advancements.

The benchmark datasets can be downloaded from our [Hugging Face dataset repository](https://huggingface.co/datasets/ZhaochongAn/GFS_PCS_Datasets).
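
As a minimal sketch, the benchmark files could also be fetched programmatically with the `huggingface_hub` Python package (assumed to be installed via `pip install huggingface_hub`). The helper names `repo_id_from_url` and `download_benchmarks`, and the default target directory, are hypothetical and chosen only for illustration. The directory layout expected by the training code is documented in the GitHub repository, not here.

```python
from __future__ import annotations

from urllib.parse import urlparse

# Dataset URL exactly as given in this README.
DATASET_URL = "https://huggingface.co/datasets/ZhaochongAn/GFS_PCS_Datasets"

# URL path prefixes the Hub uses for non-model repositories.
_PREFIX_TO_TYPE = {"datasets": "dataset", "spaces": "space"}


def repo_id_from_url(url: str) -> tuple[str, str]:
    """Split a Hugging Face Hub URL into (repo_type, repo_id)."""
    parts = urlparse(url).path.strip("/").split("/")
    if parts[0] in _PREFIX_TO_TYPE:
        return _PREFIX_TO_TYPE[parts[0]], "/".join(parts[1:3])
    # URLs without a prefix point at model repositories.
    return "model", "/".join(parts[0:2])


def download_benchmarks(local_dir: str = "data/GFS_PCS_Datasets") -> str:
    """Download the full dataset repository and return the local path.

    The default `local_dir` is an assumption for this sketch.
    """
    # Imported lazily so the URL helper works without the package installed.
    from huggingface_hub import snapshot_download

    repo_type, repo_id = repo_id_from_url(DATASET_URL)
    return snapshot_download(
        repo_id=repo_id,
        repo_type=repo_type,
        local_dir=local_dir,
    )


if __name__ == "__main__":
    print(download_benchmarks())
```

Running the script downloads every file in the dataset repository, so make sure enough disk space is available first.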

## Citation

If you find our work useful, please consider citing our paper:

```bibtex
@inproceedings{an2025generalized,
  title={Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model},
  author={An, Zhaochong and Sun, Guolei and Liu, Yun and Li, Runjia and Han, Junlin and Konukoglu, Ender and Belongie, Serge},
  booktitle={CVPR},
  year={2025}
}
```