Commit
·
18a3ab7
1
Parent(s):
bffd5b1
Update README.md
README.md
CHANGED
|
@@ -103,202 +103,6 @@ Marrying <a href="https://github.com/IDEA-Research/GroundingDINO">Grounding DINO
|
|
| 103 |

|
| 104 |

|
| 105 |
|
| 106 |
-
## :label: TODO
|
| 107 |
-
|
| 108 |
-
- [x] Release inference code and demo.
|
| 109 |
-
- [x] Release checkpoints.
|
| 110 |
-
- [x] Grounding DINO with Stable Diffusion and GLIGEN demos.
|
| 111 |
-
- [ ] Release training codes.
|
| 112 |
-
|
| 113 |
-
## :hammer_and_wrench: Install
|
| 114 |
-
|
| 115 |
-
**Note:**
|
| 116 |
-
|
| 117 |
-
0. If you have a CUDA environment, make sure the environment variable `CUDA_HOME` is set. The package is compiled in CPU-only mode if CUDA is not available.
|
| 118 |
-
|
| 119 |
-
Follow the installation steps exactly; otherwise the program may fail with:
|
| 120 |
-
```bash
|
| 121 |
-
NameError: name '_C' is not defined
|
| 122 |
-
```
|
| 123 |
-
|
| 124 |
-
If this happens, reinstall GroundingDINO: clone the repository again and repeat all of the installation steps.
|
| 125 |
-
|
| 126 |
-
#### How to check CUDA:
|
| 127 |
-
```bash
|
| 128 |
-
echo $CUDA_HOME
|
| 129 |
-
```
|
| 130 |
-
If it prints nothing, `CUDA_HOME` has not been set.
|
| 131 |
-
|
| 132 |
-
Run the following to set the environment variable for the current shell:
|
| 133 |
-
```bash
|
| 134 |
-
export CUDA_HOME=/path/to/cuda-11.3
|
| 135 |
-
```
|
| 136 |
-
|
| 137 |
-
Note that the CUDA version must match your CUDA runtime, since several CUDA versions may be installed at the same time.
|
| 138 |
-
|
| 139 |
-
To set `CUDA_HOME` permanently, append it to your `~/.bashrc`:
|
| 140 |
-
|
| 141 |
-
```bash
|
| 142 |
-
echo 'export CUDA_HOME=/path/to/cuda' >> ~/.bashrc
|
| 143 |
-
```
|
| 144 |
-
After that, source the `~/.bashrc` file and check `CUDA_HOME`:
|
| 145 |
-
```bash
|
| 146 |
-
source ~/.bashrc
|
| 147 |
-
echo $CUDA_HOME
|
| 148 |
-
```
|
| 149 |
-
|
| 150 |
-
In this example, `/path/to/cuda-11.3` should be replaced with the path where your CUDA toolkit is installed. You can find it by running `which nvcc` in your terminal.
|
| 151 |
-
|
| 152 |
-
For instance,
|
| 153 |
-
if the output is /usr/local/cuda/bin/nvcc, then:
|
| 154 |
-
```bash
|
| 155 |
-
export CUDA_HOME=/usr/local/cuda
|
| 156 |
-
```
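The rule above (drop the trailing `bin/nvcc` from the compiler path) can be sketched in Python. This is only an illustrative helper and not part of GroundingDINO: the names `cuda_home_from_nvcc` and `detect_cuda_home` are hypothetical, and the sketch assumes a POSIX system with the conventional `<CUDA_HOME>/bin/nvcc` layout.

```python
import shutil
from pathlib import PurePosixPath
from typing import Optional


def cuda_home_from_nvcc(nvcc_path: str) -> str:
    """Return the toolkit root for an nvcc path laid out as <root>/bin/nvcc."""
    path = PurePosixPath(nvcc_path)
    # Reject paths that do not follow the assumed <root>/bin/nvcc layout.
    if path.name != "nvcc" or path.parent.name != "bin":
        raise ValueError(f"unexpected nvcc location: {nvcc_path}")
    return str(path.parent.parent)


def detect_cuda_home() -> Optional[str]:
    """Look up nvcc on PATH (like `which nvcc`) and derive the toolkit root."""
    nvcc = shutil.which("nvcc")
    return cuda_home_from_nvcc(nvcc) if nvcc else None
```

If `detect_cuda_home()` returns `None`, `nvcc` is not on your `PATH` and the toolkit location has to be found some other way.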
|
| 157 |
-
**Installation:**
|
| 158 |
-
|
| 159 |
-
1. Clone the GroundingDINO repository from GitHub.
|
| 160 |
-
|
| 161 |
-
```bash
|
| 162 |
-
git clone https://github.com/IDEA-Research/GroundingDINO.git
|
| 163 |
-
```
|
| 164 |
-
|
| 165 |
-
2. Change the current directory to the GroundingDINO folder.
|
| 166 |
-
|
| 167 |
-
```bash
|
| 168 |
-
cd GroundingDINO/
|
| 169 |
-
```
|
| 170 |
-
|
| 171 |
-
3. Install the required dependencies in the current directory.
|
| 172 |
-
|
| 173 |
-
```bash
|
| 174 |
-
pip install -e .
|
| 175 |
-
```
|
| 176 |
-
|
| 177 |
-
4. Download pre-trained model weights.
|
| 178 |
-
|
| 179 |
-
```bash
|
| 180 |
-
mkdir weights
|
| 181 |
-
cd weights
|
| 182 |
-
wget -q https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
|
| 183 |
-
cd ..
|
| 184 |
-
```
|
| 185 |
-
|
| 186 |
-
## :arrow_forward: Demo
|
| 187 |
-
Check your GPU ID (only if you're using a GPU)
|
| 188 |
-
|
| 189 |
-
```bash
|
| 190 |
-
nvidia-smi
|
| 191 |
-
```
|
| 192 |
-
Replace `{GPU ID}`, `image_you_want_to_detect.jpg`, and `"dir you want to save the output"` with appropriate values in the following command:
|
| 193 |
-
```bash
|
| 194 |
-
CUDA_VISIBLE_DEVICES={GPU ID} python demo/inference_on_a_image.py \
|
| 195 |
-
-c groundingdino/config/GroundingDINO_SwinT_OGC.py \
|
| 196 |
-
-p weights/groundingdino_swint_ogc.pth \
|
| 197 |
-
-i image_you_want_to_detect.jpg \
|
| 198 |
-
-o "dir you want to save the output" \
|
| 199 |
-
-t "chair"
|
| 200 |
-
# Append --cpu-only to the command above to run in CPU mode.
|
| 201 |
-
```
|
| 202 |
-
|
| 203 |
-
If you would like to specify the phrases to detect, here is a demo:
|
| 204 |
-
```bash
|
| 205 |
-
CUDA_VISIBLE_DEVICES={GPU ID} python demo/inference_on_a_image.py \
|
| 206 |
-
-c groundingdino/config/GroundingDINO_SwinT_OGC.py \
|
| 207 |
-
-p weights/groundingdino_swint_ogc.pth \
|
| 208 |
-
-i .asset/cat_dog.jpeg \
|
| 209 |
-
-o logs/1111 \
|
| 210 |
-
-t "There is a cat and a dog in the image ." \
|
| 211 |
-
--token_spans "[[[9, 10], [11, 14]], [[19, 20], [21, 24]]]"
|
| 212 |
-
# Append --cpu-only to the command above to run in CPU mode.
|
| 213 |
-
```
|
| 214 |
-
The `--token_spans` argument gives the start and end character positions of each phrase in the text prompt. For example, the first phrase is `[[9, 10], [11, 14]]`: `"There is a cat and a dog in the image ."[9:10] = 'a'` and `"There is a cat and a dog in the image ."[11:14] = 'cat'`, so it refers to the phrase `a cat`. Similarly, `[[19, 20], [21, 24]]` refers to the phrase `a dog`.
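Counting character offsets by hand is error-prone, so the spans can also be computed. The following is a minimal sketch, not code from the repository: `phrase_to_token_spans` is a hypothetical helper that assumes each phrase occurs verbatim in the caption and that its words are separated by whitespace. It matches plain substrings, so a phrase such as `a cat` would also match inside `a category`.

```python
from typing import List


def phrase_to_token_spans(caption: str, phrase: str) -> List[List[int]]:
    """Return [start, end) character offsets for every word of `phrase`."""
    start = caption.find(phrase)
    if start < 0:
        raise ValueError(f"phrase {phrase!r} not found in caption")
    spans = []
    cursor = start
    for word in phrase.split():
        # Search from the cursor so repeated words map to the right occurrence.
        begin = caption.index(word, cursor)
        end = begin + len(word)
        spans.append([begin, end])
        cursor = end
    return spans


caption = "There is a cat and a dog in the image ."
token_spans = [phrase_to_token_spans(caption, p) for p in ("a cat", "a dog")]
print(token_spans)  # [[[9, 10], [11, 14]], [[19, 20], [21, 24]]]
```

The printed list, converted to a string, is the value passed to `--token_spans` in the command above.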
|
| 215 |
-
|
| 216 |
-
See `demo/inference_on_a_image.py` for more details.
|
| 217 |
-
|
| 218 |
-
**Running with Python:**
|
| 219 |
-
|
| 220 |
-
```python
|
| 221 |
-
from groundingdino.util.inference import load_model, load_image, predict, annotate
|
| 222 |
-
import cv2
|
| 223 |
-
|
| 224 |
-
model = load_model("groundingdino/config/GroundingDINO_SwinT_OGC.py", "weights/groundingdino_swint_ogc.pth")
|
| 225 |
-
IMAGE_PATH = "weights/dog-3.jpeg"
|
| 226 |
-
TEXT_PROMPT = "chair . person . dog ."
|
| 227 |
-
BOX_TRESHOLD = 0.35
|
| 228 |
-
TEXT_TRESHOLD = 0.25
|
| 229 |
-
|
| 230 |
-
image_source, image = load_image(IMAGE_PATH)
|
| 231 |
-
|
| 232 |
-
boxes, logits, phrases = predict(
|
| 233 |
-
model=model,
|
| 234 |
-
image=image,
|
| 235 |
-
caption=TEXT_PROMPT,
|
| 236 |
-
box_threshold=BOX_TRESHOLD,
|
| 237 |
-
text_threshold=TEXT_TRESHOLD
|
| 238 |
-
)
|
| 239 |
-
|
| 240 |
-
annotated_frame = annotate(image_source=image_source, boxes=boxes, logits=logits, phrases=phrases)
|
| 241 |
-
cv2.imwrite("annotated_image.jpg", annotated_frame)
|
| 242 |
-
```
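The `TEXT_PROMPT` in the example above lists the categories separated by ` . ` and ends with a trailing ` .`. Assuming that convention, inferred here from the example string only, a prompt can be built from a list of class names as sketched below. `build_caption` is a hypothetical helper, not part of `groundingdino.util.inference`.

```python
from typing import Iterable


def build_caption(class_names: Iterable[str]) -> str:
    """Join class names into a prompt such as "chair . person . dog ."."""
    # Drop empty entries and surrounding whitespace before joining.
    names = [name.strip() for name in class_names if name.strip()]
    if not names:
        raise ValueError("at least one class name is required")
    return " . ".join(names) + " ."


TEXT_PROMPT = build_caption(["chair", "person", "dog"])
print(TEXT_PROMPT)  # chair . person . dog .
```

The resulting string can be passed as the `caption` argument of `predict` in the example above.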
|
| 243 |
-
**Web UI**
|
| 244 |
-
|
| 245 |
-
We also provide demo code that integrates Grounding DINO with a Gradio Web UI. See `demo/gradio_app.py` for more details.
|
| 246 |
-
|
| 247 |
-
**Notebooks**
|
| 248 |
-
|
| 249 |
-
- We release [demos](demo/image_editing_with_groundingdino_gligen.ipynb) that combine [Grounding DINO](https://arxiv.org/abs/2303.05499) with [GLIGEN](https://github.com/gligen/GLIGEN) for more controllable image editing.
|
| 250 |
-
- We release [demos](demo/image_editing_with_groundingdino_stablediffusion.ipynb) that combine [Grounding DINO](https://arxiv.org/abs/2303.05499) with [Stable Diffusion](https://github.com/Stability-AI/StableDiffusion) for image editing.
|
| 251 |
-
|
| 252 |
-
## COCO Zero-shot Evaluations
|
| 253 |
-
|
| 254 |
-
We provide an example for evaluating Grounding DINO's zero-shot performance on COCO. The resulting box AP should be **48.5**.
|
| 255 |
-
|
| 256 |
-
```bash
|
| 257 |
-
CUDA_VISIBLE_DEVICES=0 \
|
| 258 |
-
python demo/test_ap_on_coco.py \
|
| 259 |
-
-c groundingdino/config/GroundingDINO_SwinT_OGC.py \
|
| 260 |
-
-p weights/groundingdino_swint_ogc.pth \
|
| 261 |
-
--anno_path /path/to/annotations/ie/instances_val2017.json \
|
| 262 |
-
--image_dir /path/to/imagedir/ie/val2017
|
| 263 |
-
```
|
| 264 |
-
|
| 265 |
-
|
| 266 |
-
## :luggage: Checkpoints
|
| 267 |
-
|
| 268 |
-
<!-- insert a table -->
|
| 269 |
-
<table>
|
| 270 |
-
<thead>
|
| 271 |
-
<tr style="text-align: right;">
|
| 272 |
-
<th></th>
|
| 273 |
-
<th>name</th>
|
| 274 |
-
<th>backbone</th>
|
| 275 |
-
<th>Data</th>
|
| 276 |
-
<th>box AP on COCO</th>
|
| 277 |
-
<th>Checkpoint</th>
|
| 278 |
-
<th>Config</th>
|
| 279 |
-
</tr>
|
| 280 |
-
</thead>
|
| 281 |
-
<tbody>
|
| 282 |
-
<tr>
|
| 283 |
-
<th>1</th>
|
| 284 |
-
<td>GroundingDINO-T</td>
|
| 285 |
-
<td>Swin-T</td>
|
| 286 |
-
<td>O365,GoldG,Cap4M</td>
|
| 287 |
-
<td>48.4 (zero-shot) / 57.2 (fine-tune)</td>
|
| 288 |
-
<td><a href="https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth">GitHub link</a> | <a href="https://huggingface.co/ShilongLiu/GroundingDINO/resolve/main/groundingdino_swint_ogc.pth">HF link</a></td>
|
| 289 |
-
<td><a href="https://github.com/IDEA-Research/GroundingDINO/blob/main/groundingdino/config/GroundingDINO_SwinT_OGC.py">link</a></td>
|
| 290 |
-
</tr>
|
| 291 |
-
<tr>
|
| 292 |
-
<th>2</th>
|
| 293 |
-
<td>GroundingDINO-B</td>
|
| 294 |
-
<td>Swin-B</td>
|
| 295 |
-
<td>COCO,O365,GoldG,Cap4M,OpenImage,ODinW-35,RefCOCO</td>
|
| 296 |
-
<td>56.7 </td>
|
| 297 |
-
<td><a href="https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha2/groundingdino_swinb_cogcoor.pth">GitHub link</a> | <a href="https://huggingface.co/ShilongLiu/GroundingDINO/resolve/main/groundingdino_swinb_cogcoor.pth">HF link</a></td>
|
| 298 |
-
<td><a href="https://github.com/IDEA-Research/GroundingDINO/blob/main/groundingdino/config/GroundingDINO_SwinB.cfg.py">link</a></td>
|
| 299 |
-
</tr>
|
| 300 |
-
</tbody>
|
| 301 |
-
</table>
|
| 302 |
|
| 303 |
## :medal_military: Results
|
| 304 |
|
| 103 |

|
| 104 |

|
| 105 |
|
| 106 |
|
| 107 |
## :medal_military: Results
|
| 108 |
|