Update model card with paper, project links, and metadata
Hi! I'm Niels from the Hugging Face community team. This PR improves your model card by adding relevant metadata and linking it to the official research paper, project page, and GitHub repository.
Specifically, it adds:
- The `image-segmentation` pipeline tag to help with discoverability.
- Links to the paper [Conversational Image Segmentation: Grounding Abstract Concepts with Scalable Supervision](https://huggingface.co/papers/2602.13195), the [Project Webpage](https://glab-caltech.github.io/converseg/), and the [GitHub Repository](https://github.com/AadSah/ConverSeg).
- A sample usage section explaining how to run inference with these checkpoints using your official codebase.
This should make the repository more informative and easier to discover for the community.
@@ -1,17 +1,59 @@
 ---
 license: apache-2.0
+pipeline_tag: image-segmentation
+tags:
+- conversational-image-segmentation
+- lora
 ---
 
 # ConverSeg-Net-3B
 
-
+This repository contains raw checkpoints for **ConverSeg-Net-3B**, introduced in the paper [Conversational Image Segmentation: Grounding Abstract Concepts with Scalable Supervision](https://huggingface.co/papers/2602.13195).
 
-
-
-https://github.
+ConverSeg-Net is designed for Conversational Image Segmentation (CIS), which focuses on grounding abstract, intent-driven concepts, including functional and physical reasoning, into pixel-accurate masks.
+
+- **Project Page:** [https://glab-caltech.github.io/converseg/](https://glab-caltech.github.io/converseg/)
+- **Code:** [https://github.com/AadSah/ConverSeg](https://github.com/AadSah/ConverSeg)
+- **Paper:** [arXiv:2602.13195](https://arxiv.org/abs/2602.13195)
+
+## Important Note
+
+These are **not** Hugging Face `from_pretrained` model files. They are raw checkpoint files and LoRA adapter files meant to be downloaded and used with the official [ConverSeg codebase](https://github.com/AadSah/ConverSeg).
 
 ## Download
 
 ```bash
 git lfs install
 git clone https://huggingface.co/aadarsh99/ConverSeg-Net-3B ./checkpoints/ConverSeg-Net-3B
+```
+
+## Sample Usage
+
+After cloning the [ConverSeg codebase](https://github.com/AadSah/ConverSeg) and setting up the environment, you can run inference using the `demo.py` script by pointing to the downloaded checkpoint paths:
+
+```bash
+python demo.py \
+--final_ckpt ./checkpoints/ConverSeg-Net-3B/ConverSeg-Net_sam2_90000.torch.torch \
+--plm_ckpt ./checkpoints/ConverSeg-Net-3B/ConverSeg-Net_plm_90000.torch \
+--lora_ckpt ./checkpoints/ConverSeg-Net-3B/lora_plm_adapter_90000 \
+--model_cfg sam2_hiera_l.yaml \
+--base_ckpt /path/to/sam2_hiera_large.pt \
+--image /path/to/image.jpg \
+--prompt "the left-most person" \
+--device cuda \
+--out_dir ./demo_outputs
+```
+
+## Citation
+
+```bibtex
+@misc{sahoo2026conversationalimagesegmentationgrounding,
+title = {Conversational Image Segmentation: Grounding Abstract Concepts with Scalable Supervision},
+author = {Aadarsh Sahoo and Georgia Gkioxari},
+year = {2026},
+eprint = {2602.13195},
+archivePrefix = {arXiv},
+primaryClass = {cs.CV},
+url = {https://arxiv.org/abs/2602.13195},
+}
+```
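
The sample command above depends on three checkpoint paths inside the cloned directory. A pre-flight check along the following lines could confirm they exist before `demo.py` is launched. This is only a sketch: `check_converseg_ckpts` is a hypothetical helper, not part of the ConverSeg codebase, and the file names are copied verbatim from the sample command, so they are assumed rather than verified against the repository contents.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the ConverSeg codebase): report which of the
# checkpoint paths named in the sample command are missing from a directory.
check_converseg_ckpts() {
  local ckpt_dir="$1"
  local missing=0
  local name
  # Names assumed from the sample usage command; adjust if the repo differs.
  for name in \
    "ConverSeg-Net_sam2_90000.torch.torch" \
    "ConverSeg-Net_plm_90000.torch" \
    "lora_plm_adapter_90000"; do
    # -e accepts both regular files and the LoRA adapter directory.
    if [ ! -e "${ckpt_dir}/${name}" ]; then
      echo "missing: ${ckpt_dir}/${name}" >&2
      missing=$((missing + 1))
    fi
  done
  # Succeed only when every expected path is present.
  [ "${missing}" -eq 0 ]
}
```

Calling `check_converseg_ckpts ./checkpoints/ConverSeg-Net-3B` before the `python demo.py` invocation would print any missing path to stderr and return a non-zero status, which makes a mismatch between the documented file names and the downloaded files visible early.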