Improve model card: Add metadata and project page link
This PR improves the model card for the DenseVLM model by:
- Setting the correct `license` in the metadata to `cc-by-nc-4.0`, as explicitly stated in the repository's license section.
- Adding the `pipeline_tag: zero-shot-object-detection` to ensure the model can be discovered via relevant filters on the Hugging Face Hub, aligning with its capabilities for "open-vocabulary dense prediction".
- Specifying the `library_name: open_clip`, as the model's usage examples clearly show compatibility and interaction with the `open_clip` library.
- Including a link to the first author's personal page (`https://lyhisme.github.io/`) as the `Project Page`, since no dedicated project page is provided and it serves as a suitable project overview.
- Retaining the existing arXiv paper link, per the instruction not to replace an existing arXiv link with a Hugging Face paper link.
- Making a minor improvement to the "Quick Start" section by adding `import open_clip`, so the snippet is self-contained.
These changes enhance the model card's accuracy, completeness, and discoverability for users.
````diff
@@ -1,3 +1,8 @@
+---
+license: cc-by-nc-4.0
+pipeline_tag: zero-shot-object-detection
+library_name: open_clip
+---
 
 <p align="center">
 <h1 align="center">Unbiased Region-Language Alignment for Open-Vocabulary Dense Prediction</h1>
@@ -16,7 +21,7 @@
 </p>
 <h2 align="center">Accepted By ICCV 2025!</h2>
 
-### [[Paper](https://arxiv.org/pdf/2412.06244)] [[Github](https://github.com/HVision-NKU/DenseVLM)] [[Pretrained models](https://github.com/HVision-NKU/DenseVLM/tree/main#)]
+### [[Paper](https://arxiv.org/pdf/2412.06244)] [[Project Page](https://lyhisme.github.io/)] [[Github](https://github.com/HVision-NKU/DenseVLM)] [[Pretrained models](https://github.com/HVision-NKU/DenseVLM/tree/main#)]
 
 ## Contributions
 - 🔥 We identify the foreground bias issue in existing VLMs and propose region-text alignment by incorporating explicit semantic structuring through category guidance.
@@ -111,6 +116,8 @@ DenseVLM/
 If using a fine-tuned CLIP, you can directly use it. For example:
 
 ```python
+import open_clip
+
 model = open_clip.create_model(
     'EVA02-CLIP-B-16', pretrained='eva', cache_dir='checkpoints/densevlm_coco_6_save6_512_eva_vib16_12layers.pt'
 )
@@ -151,9 +158,4 @@ If you find this project useful, please consider citing:
 month={21--27 Jul},
 publisher={PMLR}
 }
-```
-
-## License
-Licensed under a [Creative Commons Attribution-NonCommercial 4.0 International](https://creativecommons.org/licenses/by-nc/4.0/) for Non-commercial use only.
-Any commercial use should get formal permission first.
-
+```
````
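For reference, the metadata block this PR adds is plain YAML frontmatter that the Hub indexes for license and pipeline filtering. A minimal stdlib-only sketch of how those flat `key: value` pairs read (the Hub uses a full YAML parser; `parse_frontmatter` here is a hypothetical helper, not a Hub API):

```python
# The exact frontmatter block added at the top of the model card in this PR.
FRONTMATTER = """\
---
license: cc-by-nc-4.0
pipeline_tag: zero-shot-object-detection
library_name: open_clip
---
"""

def parse_frontmatter(text):
    """Split simple flat `key: value` frontmatter into a dict (no nesting)."""
    lines = text.strip().splitlines()
    assert lines[0] == "---" and lines[-1] == "---", "missing frontmatter fences"
    meta = {}
    for line in lines[1:-1]:
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta

meta = parse_frontmatter(FRONTMATTER)
print(meta["license"])       # → cc-by-nc-4.0
print(meta["pipeline_tag"])  # → zero-shot-object-detection
```

These three keys are what make the model discoverable under the `zero-shot-object-detection` pipeline filter and the `open_clip` library filter on the Hub.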