Improve model card: Add metadata and project page link
This PR improves the model card for the DenseVLM model by:
- Setting the correct `license` in the metadata to `cc-by-nc-4.0`, as explicitly stated in the repository's license section.
- Adding the `pipeline_tag: zero-shot-object-detection` to ensure the model can be discovered via relevant filters on the Hugging Face Hub, aligning with its capabilities for "open-vocabulary dense prediction".
- Specifying the `library_name: open_clip`, as the model's usage examples clearly show compatibility and interaction with the `open_clip` library.
- Including a link to the first author's personal page (`https://lyhisme.github.io/`) as the `Project Page`, since no dedicated project page is provided and it serves as a suitable project overview.
- Retaining the existing arXiv paper link, per the instruction not to replace an existing arXiv link with a Hugging Face paper link.
- Making a minor improvement to the "Quick Start" section by adding `import open_clip`, so the snippet is self-contained.
These changes enhance the model card's accuracy, completeness, and discoverability for users.
````diff
@@ -1,3 +1,8 @@
+---
+license: cc-by-nc-4.0
+pipeline_tag: zero-shot-object-detection
+library_name: open_clip
+---
 
 <p align="center">
 <h1 align="center">Unbiased Region-Language Alignment for Open-Vocabulary Dense Prediction</h1>
@@ -16,7 +21,7 @@
 </p>
 <h2 align="center">Accepted By ICCV 2025!</h2>
 
-### [[Paper](https://arxiv.org/pdf/2412.06244)] [[Github](https://github.com/HVision-NKU/DenseVLM)] [[Pretrained models](https://github.com/HVision-NKU/DenseVLM/tree/main#)]
+### [[Paper](https://arxiv.org/pdf/2412.06244)] [[Project Page](https://lyhisme.github.io/)] [[Github](https://github.com/HVision-NKU/DenseVLM)] [[Pretrained models](https://github.com/HVision-NKU/DenseVLM/tree/main#)]
 
 ## Contributions
 - 🔥 We identify the foreground bias issue in existing VLMs and propose region-text alignment by incorporating explicit semantic structuring through category guidance.
@@ -111,6 +116,8 @@ DenseVLM/
 If using a fine-tuned CLIP, you can directly use it. For example:
 
 ```python
+import open_clip
+
 model = open_clip.create_model(
     'EVA02-CLIP-B-16', pretrained='eva', cache_dir='checkpoints/densevlm_coco_6_save6_512_eva_vib16_12layers.pt'
 )
@@ -151,9 +158,4 @@ If you find this project useful, please consider citing:
 month={21--27 Jul},
 publisher={PMLR}
 }
-```
-
-## License
-Licensed under a [Creative Commons Attribution-NonCommercial 4.0 International](https://creativecommons.org/licenses/by-nc/4.0/) for Non-commercial use only.
-Any commercial use should get formal permission first.
-
+```
````
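For reference, the metadata block this PR adds is plain YAML frontmatter that the Hub indexes for license and pipeline filtering. A minimal stdlib-only sketch of how those flat `key: value` pairs read (the Hub uses a full YAML parser; `parse_frontmatter` here is a hypothetical helper, not a Hub API):

```python
# The exact frontmatter block added at the top of the model card in this PR.
FRONTMATTER = """\
---
license: cc-by-nc-4.0
pipeline_tag: zero-shot-object-detection
library_name: open_clip
---
"""

def parse_frontmatter(text):
    """Split simple flat `key: value` frontmatter into a dict (no nesting)."""
    lines = text.strip().splitlines()
    assert lines[0] == "---" and lines[-1] == "---", "missing frontmatter fences"
    meta = {}
    for line in lines[1:-1]:
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta

meta = parse_frontmatter(FRONTMATTER)
print(meta["license"])       # → cc-by-nc-4.0
print(meta["pipeline_tag"])  # → zero-shot-object-detection
```

These three keys are what make the model discoverable under the `zero-shot-object-detection` pipeline filter and the `open_clip` library filter on the Hub.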