nielsr (HF Staff) committed · Commit a0fd499 · verified · 1 Parent(s): b03759e

Update model card with paper, project links, and metadata


Hi! I'm Niels from the Hugging Face community team. This PR improves your model card by adding relevant metadata and linking it to the official research paper, project page, and GitHub repository.

Specifically, it adds:
- The `image-segmentation` pipeline tag to help with discoverability.
- Links to the paper [Conversational Image Segmentation: Grounding Abstract Concepts with Scalable Supervision](https://huggingface.co/papers/2602.13195), the [Project Webpage](https://glab-caltech.github.io/converseg/), and the [GitHub Repository](https://github.com/AadSah/ConverSeg).
- A sample usage section explaining how to run inference with these checkpoints using your official codebase.

This should make the repository more informative and easier to discover for the community.
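As a rough illustration of the sample usage this PR documents, the sketch below assembles the `demo.py` invocation from a downloaded checkpoint directory and reports which checkpoint files are missing. The flag names and file names are copied from the README diff in this PR; the helper functions `build_demo_command` and `missing_checkpoints` are hypothetical conveniences and are not part of the ConverSeg codebase.

```python
import shlex
from pathlib import Path

# Flags and file names as they appear in the Sample Usage section of the README diff.
# The mapping and the helpers below are illustrative assumptions, not ConverSeg APIs.
CKPT_FILES = {
    "--final_ckpt": "ConverSeg-Net_sam2_90000.torch.torch",
    "--plm_ckpt": "ConverSeg-Net_plm_90000.torch",
    "--lora_ckpt": "lora_plm_adapter_90000",
}


def missing_checkpoints(ckpt_dir):
    """Return the expected checkpoint names that do not exist under ckpt_dir."""
    ckpt_dir = Path(ckpt_dir)
    return [name for name in CKPT_FILES.values() if not (ckpt_dir / name).exists()]


def build_demo_command(ckpt_dir, base_ckpt, image, prompt,
                       model_cfg="sam2_hiera_l.yaml",
                       device="cuda",
                       out_dir="./demo_outputs"):
    """Build the argument list for demo.py without running it."""
    ckpt_dir = Path(ckpt_dir)
    cmd = ["python", "demo.py"]
    for flag, name in CKPT_FILES.items():
        cmd += [flag, str(ckpt_dir / name)]
    cmd += [
        "--model_cfg", model_cfg,
        "--base_ckpt", str(base_ckpt),
        "--image", str(image),
        "--prompt", prompt,
        "--device", device,
        "--out_dir", out_dir,
    ]
    return cmd


if __name__ == "__main__":
    ckpt_dir = "./checkpoints/ConverSeg-Net-3B"
    missing = missing_checkpoints(ckpt_dir)
    if missing:
        print("Missing checkpoint files:", ", ".join(missing))
    cmd = build_demo_command(
        ckpt_dir,
        base_ckpt="/path/to/sam2_hiera_large.pt",
        image="/path/to/image.jpg",
        prompt="the left-most person",
    )
    # Print a shell-quoted command that can be pasted into a terminal
    # from the root of the ConverSeg checkout.
    print(shlex.join(cmd))
```

Building the command as a list and quoting it with `shlex.join` keeps prompts that contain spaces or quotes intact when the command is pasted into a shell.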

Files changed (1)
  1. README.md +46 -4
README.md CHANGED
@@ -1,17 +1,59 @@
  ---
  license: apache-2.0
+ pipeline_tag: image-segmentation
+ tags:
+ - conversational-image-segmentation
+ - lora
  ---

  # ConverSeg-Net-3B

- Raw checkpoints for **ConverSeg-Net**.
+ This repository contains raw checkpoints for **ConverSeg-Net-3B**, introduced in the paper [Conversational Image Segmentation: Grounding Abstract Concepts with Scalable Supervision](https://huggingface.co/papers/2602.13195).

- These are **not** Hugging Face `from_pretrained` model files.
- They are meant to be downloaded and used with the ConverSeg codebase:
- https://github.com/AadSah/ConverSeg
+ ConverSeg-Net is designed for Conversational Image Segmentation (CIS), which focuses on grounding abstract, intent-driven concepts, including functional and physical reasoning, into pixel-accurate masks.
+
+ - **Project Page:** [https://glab-caltech.github.io/converseg/](https://glab-caltech.github.io/converseg/)
+ - **Code:** [https://github.com/AadSah/ConverSeg](https://github.com/AadSah/ConverSeg)
+ - **Paper:** [arXiv:2602.13195](https://arxiv.org/abs/2602.13195)
+
+ ## Important Note
+
+ These are **not** Hugging Face `from_pretrained` model files. They are raw checkpoint files and LoRA adapter files meant to be downloaded and used with the official [ConverSeg codebase](https://github.com/AadSah/ConverSeg).

  ## Download

  ```bash
  git lfs install
  git clone https://huggingface.co/aadarsh99/ConverSeg-Net-3B ./checkpoints/ConverSeg-Net-3B
+ ```
+
+ ## Sample Usage
+
+ After cloning the [ConverSeg codebase](https://github.com/AadSah/ConverSeg) and setting up the environment, you can run inference using the `demo.py` script by pointing to the downloaded checkpoint paths:
+
+ ```bash
+ python demo.py \
+ --final_ckpt ./checkpoints/ConverSeg-Net-3B/ConverSeg-Net_sam2_90000.torch.torch \
+ --plm_ckpt ./checkpoints/ConverSeg-Net-3B/ConverSeg-Net_plm_90000.torch \
+ --lora_ckpt ./checkpoints/ConverSeg-Net-3B/lora_plm_adapter_90000 \
+ --model_cfg sam2_hiera_l.yaml \
+ --base_ckpt /path/to/sam2_hiera_large.pt \
+ --image /path/to/image.jpg \
+ --prompt "the left-most person" \
+ --device cuda \
+ --out_dir ./demo_outputs
+ ```
+
+ ## Citation
+
+ ```bibtex
+ @misc{sahoo2026conversationalimagesegmentationgrounding,
+ title = {Conversational Image Segmentation: Grounding Abstract Concepts with Scalable Supervision},
+ author = {Aadarsh Sahoo and Georgia Gkioxari},
+ year = {2026},
+ eprint = {2602.13195},
+ archivePrefix = {arXiv},
+ primaryClass = {cs.CV},
+ url = {https://arxiv.org/abs/2602.13195},
+ }
+ ```