Enhance model card for Label Anything with metadata, links, abstract, and usage
#1
by nielsr (HF Staff) - opened

README.md CHANGED
@@ -1,12 +1,102 @@
Previous version:

---
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
---

- Code: [More Information Needed]
- ArXiv: https://arxiv.org/abs/2407.02075
- Docs: [More Information Needed]
New version:

---
license: mit
pipeline_tag: image-segmentation
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
---

# 🏷️ [Label Anything](https://pasqualedem.github.io/LabelAnything/)

### Multi-Class Few-Shot Semantic Segmentation with Visual Prompts

[Project Page](https://pasqualedem.github.io/LabelAnything/) · [Papers with Code](https://paperswithcode.com/sota/few-shot-semantic-segmentation-on-coco-20i-2-1?p=label-anything-multi-class-few-shot-semantic) · [arXiv](https://arxiv.org/abs/2407.02075) · [ECAI 2025](https://ecai2025.org/) · [Python](https://www.python.org/downloads/) · [License](https://github.com/pasqualedem/LabelAnything/blob/main/LICENSE)

## Overview

**Label Anything** is a novel method for multi-class few-shot semantic segmentation using visual prompts. This repository contains the official implementation of our ECAI 2025 paper, enabling precise segmentation with just a few prompted examples.

The model was presented in the paper [Label Anything: Multi-Class Few-Shot Semantic Segmentation with Visual Prompts](https://huggingface.co/papers/2407.02075).

### Abstract

Few-shot semantic segmentation aims to segment objects from previously unseen classes using only a limited number of labeled examples. In this paper, we introduce Label Anything, a novel transformer-based architecture designed for multi-prompt, multi-way few-shot semantic segmentation. Our approach leverages diverse visual prompts -- points, bounding boxes, and masks -- to create a highly flexible and generalizable framework that significantly reduces annotation burden while maintaining high accuracy. Label Anything makes three key contributions: (*i*) we introduce a new task formulation that relaxes conventional few-shot segmentation constraints by supporting various types of prompts, multi-class classification, and enabling multiple prompts within a single image; (*ii*) we propose a novel architecture based on transformers and attention mechanisms; and (*iii*) we design a versatile training procedure allowing our model to operate seamlessly across different $N$-way $K$-shot and prompt-type configurations with a single trained model. Our extensive experimental evaluation on the widely used COCO-$20^i$ benchmark demonstrates that Label Anything achieves state-of-the-art performance among existing multi-way few-shot segmentation methods, while significantly outperforming leading single-class models when evaluated in multi-class settings. Code and trained models are available at https://github.com/pasqualedem/LabelAnything.
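The relaxed task formulation of contribution (*i*), with mixed prompt types, multiple classes, and multiple prompts per support image, can be sketched with plain data structures. This is purely illustrative; these are not the repository's actual classes.

```python
from dataclasses import dataclass, field

@dataclass
class Prompt:
    """One visual prompt: a point, a bounding box, or a mask reference."""
    kind: str       # "point" | "box" | "mask"
    class_id: int   # which of the N episode classes this prompt annotates
    data: tuple     # e.g. (x, y) for a point, (x1, y1, x2, y2) for a box

@dataclass
class SupportImage:
    path: str
    prompts: list[Prompt] = field(default_factory=list)

@dataclass
class Episode:
    """One N-way K-shot episode: a query image plus prompted support images."""
    query_path: str
    support: list[SupportImage]
    n_way: int
    k_shot: int

# A 2-way 1-shot episode: a single support image carries prompts for
# BOTH classes, which the relaxed formulation explicitly allows.
episode = Episode(
    query_path="query.jpg",
    support=[
        SupportImage("support_0.jpg", [
            Prompt("point", class_id=0, data=(120, 88)),
            Prompt("box", class_id=1, data=(10, 20, 200, 180)),
        ]),
    ],
    n_way=2,
    k_shot=1,
)

prompted_classes = {p.class_id for s in episode.support for p in s.prompts}
print(sorted(prompted_classes))  # -> [0, 1]
```

A single trained model handles any such configuration, so changing `n_way`, `k_shot`, or the prompt mix does not require retraining.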
<div align="center">

*Visual prompting meets few-shot learning with a new fast and efficient architecture.*

</div>

## Links

* **Paper on Hugging Face:** [https://huggingface.co/papers/2407.02075](https://huggingface.co/papers/2407.02075)
* **Project Page:** [https://pasqualedem.github.io/LabelAnything/](https://pasqualedem.github.io/LabelAnything/)
* **GitHub Repository:** [https://github.com/pasqualedem/LabelAnything](https://github.com/pasqualedem/LabelAnything)

## 🚀 Quick Start

### ⚡ One-Line Demo

Experience Label Anything instantly with our streamlined demo:

```bash
uvx --from git+https://github.com/pasqualedem/LabelAnything app
```

> **💡 Pro Tip**: This command uses [uv](https://docs.astral.sh/uv/) for lightning-fast package management and execution.

### 🛠️ Manual Installation

For development and customization:

```bash
# Clone the repository
git clone https://github.com/pasqualedem/LabelAnything.git
cd LabelAnything

# Create virtual environment with uv
uv sync
source .venv/bin/activate
```

> **⚠️ System Requirements**: Linux environment with CUDA 12.1 support

### 🔄 Model Loading

```python
from label_anything.models import LabelAnything

# Load pre-trained model
model = LabelAnything.from_pretrained("pasqualedem/label_anything_sam_1024_coco")
```
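Here `from_pretrained` comes from `PyTorchModelHubMixin` (hence the `pytorch_model_hub_mixin` tag above), which serializes the model's constructor arguments next to the weights and replays them at load time. The following stdlib-only stand-in sketches that round-trip pattern; it is not the real mixin, and the config keys are made up for illustration:

```python
import json
import os
import tempfile

class TinyHubMixin:
    """Stand-in for the hub-mixin pattern: save constructor kwargs to
    config.json, then rebuild the object from them on load."""

    def __init__(self, **config):
        self.config = config

    def save_pretrained(self, save_dir):
        os.makedirs(save_dir, exist_ok=True)
        with open(os.path.join(save_dir, "config.json"), "w") as f:
            json.dump(self.config, f)
        # (the real mixin also writes the weights, e.g. model.safetensors)

    @classmethod
    def from_pretrained(cls, save_dir):
        with open(os.path.join(save_dir, "config.json")) as f:
            return cls(**json.load(f))

with tempfile.TemporaryDirectory() as d:
    TinyHubMixin(embed_dim=512, image_size=1024).save_pretrained(d)
    reloaded = TinyHubMixin.from_pretrained(d)

print(reloaded.config)  # -> {'embed_dim': 512, 'image_size': 1024}
```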
## 📦 Pre-trained Models

Access our collection of state-of-the-art checkpoints:

| 🧠 Encoder | 📏 Embedding Size | 🖼️ Image Size | 📂 Fold | 🔗 Checkpoint |
|------------|-------------------|----------------|---------|---------------|
| **SAM** | 512 | 1024 | - | [Hugging Face](https://huggingface.co/pasqualedem/label_anything_sam_1024_coco) |
| **ViT-MAE** | 256 | 480 | - | [Hugging Face](https://huggingface.co/pasqualedem/label_anything_mae_480_coco) |
| **ViT-MAE** | 256 | 480 | 0 | [Hugging Face](https://huggingface.co/pasqualedem/label_anything_coco_fold0_mae_7a5p0t63) |
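The table above can be mirrored as a small lookup for selecting a checkpoint programmatically. This is a convenience sketch, not part of the library; only the repo ids come from the table:

```python
# Checkpoints from the table above: (encoder, embed_dim, image_size, fold) -> repo id
CHECKPOINTS = {
    ("SAM", 512, 1024, None): "pasqualedem/label_anything_sam_1024_coco",
    ("ViT-MAE", 256, 480, None): "pasqualedem/label_anything_mae_480_coco",
    ("ViT-MAE", 256, 480, 0): "pasqualedem/label_anything_coco_fold0_mae_7a5p0t63",
}

def pick_checkpoint(encoder, fold=None):
    """Return the first repo id matching the requested encoder and fold."""
    for (enc, _dim, _size, f), repo_id in CHECKPOINTS.items():
        if enc == encoder and f == fold:
            return repo_id
    raise KeyError(f"no checkpoint for encoder={encoder!r}, fold={fold!r}")

print(pick_checkpoint("SAM"))  # -> pasqualedem/label_anything_sam_1024_coco
```

The returned repo id can then be passed to `LabelAnything.from_pretrained` as shown earlier.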
## 📚 Citation

If you find Label Anything useful in your research, please cite our work:

```bibtex
@inproceedings{labelanything2025,
  title={Label Anything: Multi-Class Few-Shot Semantic Segmentation with Visual Prompts},
  author={De Marinis, Pasquale and Fanelli, Nicola and Scaringi, Raffaele and Colonna, Emanuele and Fiameni, Giuseppe and Vessio, Gennaro and Castellano, Giovanna},
  booktitle={ECAI 2025},
  year={2025}
}
```