Update model card with paper, project links, and metadata

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +46 -4
README.md CHANGED
@@ -1,17 +1,59 @@
  ---
  license: apache-2.0
+ pipeline_tag: image-segmentation
+ tags:
+ - conversational-image-segmentation
+ - lora
  ---
 
  # ConverSeg-Net-3B
 
- Raw checkpoints for **ConverSeg-Net**.
+ This repository contains raw checkpoints for **ConverSeg-Net-3B**, introduced in the paper [Conversational Image Segmentation: Grounding Abstract Concepts with Scalable Supervision](https://huggingface.co/papers/2602.13195).
 
- These are **not** Hugging Face `from_pretrained` model files.
- They are meant to be downloaded and used with the ConverSeg codebase:
- https://github.com/AadSah/ConverSeg
+ ConverSeg-Net is designed for Conversational Image Segmentation (CIS), which focuses on grounding abstract, intent-driven concepts, including functional and physical reasoning, into pixel-accurate masks.
+
+ - **Project Page:** [https://glab-caltech.github.io/converseg/](https://glab-caltech.github.io/converseg/)
+ - **Code:** [https://github.com/AadSah/ConverSeg](https://github.com/AadSah/ConverSeg)
+ - **Paper:** [arXiv:2602.13195](https://arxiv.org/abs/2602.13195)
+
+ ## Important Note
+
+ These are **not** Hugging Face `from_pretrained` model files. They are raw checkpoint files and LoRA adapter files meant to be downloaded and used with the official [ConverSeg codebase](https://github.com/AadSah/ConverSeg).
 
  ## Download
 
  ```bash
  git lfs install
  git clone https://huggingface.co/aadarsh99/ConverSeg-Net-3B ./checkpoints/ConverSeg-Net-3B
+ ```
+
+ ## Sample Usage
+
+ After cloning the [ConverSeg codebase](https://github.com/AadSah/ConverSeg) and setting up the environment, you can run inference using the `demo.py` script by pointing to the downloaded checkpoint paths:
+
+ ```bash
+ python demo.py \
+ --final_ckpt ./checkpoints/ConverSeg-Net-3B/ConverSeg-Net_sam2_90000.torch.torch \
+ --plm_ckpt ./checkpoints/ConverSeg-Net-3B/ConverSeg-Net_plm_90000.torch \
+ --lora_ckpt ./checkpoints/ConverSeg-Net-3B/lora_plm_adapter_90000 \
+ --model_cfg sam2_hiera_l.yaml \
+ --base_ckpt /path/to/sam2_hiera_large.pt \
+ --image /path/to/image.jpg \
+ --prompt "the left-most person" \
+ --device cuda \
+ --out_dir ./demo_outputs
+ ```
+
+ ## Citation
+
+ ```bibtex
+ @misc{sahoo2026conversationalimagesegmentationgrounding,
+ title = {Conversational Image Segmentation: Grounding Abstract Concepts with Scalable Supervision},
+ author = {Aadarsh Sahoo and Georgia Gkioxari},
+ year = {2026},
+ eprint = {2602.13195},
+ archivePrefix = {arXiv},
+ primaryClass = {cs.CV},
+ url = {https://arxiv.org/abs/2602.13195},
+ }
+ ```
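
The `demo.py` invocation in the Sample Usage section above can also be assembled programmatically, which makes it easier to check that every checkpoint path exists before launching inference. The following is a minimal sketch, not part of the ConverSeg codebase: `build_demo_command` and `missing_checkpoints` are hypothetical helper names, and the flags and file names are copied verbatim from the command in this diff, so they should be checked against the files actually present in the cloned repository.

```python
import shlex
from pathlib import Path


def build_demo_command(ckpt_dir, base_ckpt, image, prompt,
                       step=90000, device="cuda", out_dir="./demo_outputs"):
    """Assemble the demo.py argument list (hypothetical helper).

    Flags and file names mirror the Sample Usage command in the README diff;
    verify them against the files in your local clone.
    """
    ckpt_dir = Path(ckpt_dir)
    return [
        "python", "demo.py",
        "--final_ckpt", str(ckpt_dir / f"ConverSeg-Net_sam2_{step}.torch.torch"),
        "--plm_ckpt", str(ckpt_dir / f"ConverSeg-Net_plm_{step}.torch"),
        "--lora_ckpt", str(ckpt_dir / f"lora_plm_adapter_{step}"),
        "--model_cfg", "sam2_hiera_l.yaml",
        "--base_ckpt", str(base_ckpt),
        "--image", str(image),
        "--prompt", prompt,
        "--device", device,
        "--out_dir", str(out_dir),
    ]


def missing_checkpoints(cmd):
    """Return the checkpoint paths in cmd that do not exist on disk."""
    flags = ("--final_ckpt", "--plm_ckpt", "--lora_ckpt", "--base_ckpt")
    paths = [cmd[cmd.index(flag) + 1] for flag in flags]
    return [p for p in paths if not Path(p).exists()]


if __name__ == "__main__":
    cmd = build_demo_command(
        "./checkpoints/ConverSeg-Net-3B",
        "/path/to/sam2_hiera_large.pt",   # placeholder path from the README
        "/path/to/image.jpg",             # placeholder path from the README
        "the left-most person",
    )
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(shlex.join(cmd))
    for path in missing_checkpoints(cmd):
        print(f"missing: {path}")
```

Building the command as a list keeps the prompt as a single argument regardless of spaces or quotes; the same list can be passed to `subprocess.run(cmd, check=True)` from the root of the ConverSeg checkout once no paths are reported missing.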