| --- |
| license: cc-by-nc-sa-4.0 |
| library_name: transformers |
| pipeline_tag: image-classification |
| --- |
| |
| # SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning (ICLR 2024) |
|
|
| This repository contains the model described in https://arxiv.org/abs/2403.13684. |
|
|
| Code: https://github.com/Visual-AI/SPTNet |
|
|
| <p align="center"> |
| <a href="https://arxiv.org/abs/2403.13684"><img src="https://img.shields.io/badge/arXiv-2403.13684-b31b1b"></a> <a href="https://visual-ai.github.io/sptnet/"><img src="https://img.shields.io/badge/Project-Website-blue"></a> <a href="#jump"><img src="https://img.shields.io/badge/Citation-8A2BE2"></a> |
| </p> |
| <p align="center"> |
| SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning <br> |
| By |
| <a href="https://whj363636.github.io/">Hongjun Wang</a>, |
| <a href="https://sgvaze.github.io/">Sagar Vaze</a>, and |
| <a href="https://www.kaihan.org/">Kai Han</a>. |
| </p> |
| |
|
|
| [05.2024] We have updated the results of SPTNet with DINOv2 on CUB. Please see the latest version on [arXiv](https://arxiv.org/abs/2403.13684). |
|
|
| | | All | Old | New | |
| |---------------|------|------|------| |
| | CUB (DINO) | 65.8 | 68.8 | 65.1 | |
| | CUB (DINOv2) | 76.3 | 79.5 | 74.6 | |
|
| ## Results |
| Generic datasets, clustering accuracy (%) on All / Old / New classes: |
|
| | | All | Old | New | |
| |--------------|------|------|------| |
| | CIFAR-10 | 97.3 | 95.0 | 98.6 | |
| | CIFAR-100 | 81.3 | 84.3 | 75.6 | |
| | ImageNet-100 | 85.4 | 93.2 | 81.4 | |
|
|
| Fine-grained datasets, clustering accuracy (%) on All / Old / New classes: |
|
| | | All | Old | New | |
| |---------------|------|------|------| |
| | CUB | 65.8 | 68.8 | 65.1 | |
| | Stanford Cars | 59.0 | 79.2 | 49.3 | |
| | FGVC-Aircraft | 59.3 | 61.8 | 58.1 | |
| | Herbarium19 | 43.4 | 58.7 | 35.2 | |
|
| ## Citing this work |
| <span id="jump"></span> |
| If you find this repo useful for your research, please consider citing our paper: |
|
|
| ``` |
| @inproceedings{wang2024sptnet, |
| author = {Wang, Hongjun and Vaze, Sagar and Han, Kai}, |
| title = {SPTNet: An Efficient Alternative Framework for Generalized Category Discovery with Spatial Prompt Tuning}, |
| booktitle = {International Conference on Learning Representations (ICLR)}, |
| year = {2024} |
| } |
| ``` |