| --- |
| datasets: |
| - lmms-lab/RefCOCO |
| license: apache-2.0 |
| pipeline_tag: image-segmentation |
| tags: |
| - multimodal |
| - referring-image-segmentation |
| --- |
| |
| # TALENT: Target-aware Efficient Tuning for Referring Image Segmentation |
|
|
| TALENT is a framework for Referring Image Segmentation (RIS) that addresses the "non-target activation" (NTA) issue in parameter-efficient tuning. It introduces two components: a Rectified Cost Aggregator (RCA), which aggregates text-referred features, and a Target-aware Learning Mechanism (TLM), which calibrates activations toward accurate target localization.
|
|
| ## Resources |
|
|
| - **Paper:** [TALENT: Target-aware Efficient Tuning for Referring Image Segmentation](https://huggingface.co/papers/2604.00609) |
| - **Repository:** [GitHub - Kimsure/TALENT](https://github.com/Kimsure/TALENT) |
|
|
| ## Usage |
|
|
| To evaluate the model, follow the installation instructions in the [GitHub repository](https://github.com/Kimsure/TALENT) and run the following script: |
|
|
| ```bash |
| bash run_scripts/test.sh |
| ``` |
|
|
| To visualize the results, set the `visualize` flag to `True` in the configuration file before running the script.
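
If you prefer to flip the flag from the command line, the sketch below shows one way to do it. It rests on assumptions this README does not confirm: that the configuration file is YAML-style text containing a line of the form `visualize: False`, and that you supply the path to that file yourself. The helper name `enable_visualize` and the example path are hypothetical and not part of TALENT.

```bash
# Sketch only. ASSUMPTIONS (not confirmed by this README):
#   - the configuration file is YAML-style text with a line `visualize: False`
#   - you pass the path to that file yourself; no path is taken from the repo
# `enable_visualize` is a hypothetical helper, not part of TALENT.

enable_visualize() {
  local config_path="$1"
  # Rewrite `visualize: False` to `visualize: True` in place, preserving
  # indentation; the original file is kept alongside with a .bak suffix.
  sed -i.bak -E 's/^([[:space:]]*visualize:[[:space:]]*)False/\1True/' "$config_path"
}

# Example (hypothetical path):
#   enable_visualize path/to/config.yaml
#   bash run_scripts/test.sh
```

The edit is made in place because the evaluation script reads whichever configuration file it is already set up to use. To undo the change, restore the `.bak` copy.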
|
|
| ## Acknowledgements |
|
|
| The code for TALENT is based on [CRIS](https://github.com/DerrickWang005/CRIS.pytorch), [ETRIS](https://github.com/kkakkkka/ETRIS), and previous TALENT implementations. We thank the authors for their open-sourced code. |
|
|
| ## Citation |
|
|
| If you find this work useful, please cite: |
|
|
| ```bibtex |
| @article{talent2026, |
| title={TALENT: Target-aware Efficient Tuning for Referring Image Segmentation}, |
| author={Shuo Jin and Siyue Yu and Bingfeng Zhang and Chao Yao and Meiqin Liu and Jimin Xiao},
| journal={arXiv preprint arXiv:2604.00609}, |
| year={2026} |
| } |
| ``` |