---
language:
- en
library_name: CountGD
license: mit
tags:
- computer-vision
- counting
- grounding-dino
- model_hub_mixin
- multi-modal
- open-vocabulary
- pytorch_model_hub_mixin
- transformers
---

# CountGD

CountGD is a multi-modal open-world counting model that counts objects in an image using text and image prompts.
For more details, see the following links:

- Project page: https://www.robots.ox.ac.uk/~vgg/research/countgd/
- Code: https://github.com/niki-amini-naieni/CountGD
- Demo: https://huggingface.co/spaces/nikigoli/countgd
- Paper: https://arxiv.org/pdf/2407.04619

![Sample prediction](https://www.robots.ox.ac.uk/~vgg/research/countgd/images/teaser-improved.png)

## Architecture

![CountGD Architecture](https://www.robots.ox.ac.uk/~vgg/research/countgd/images/architecture.png)


## Citation

```bibtex
@inproceedings{AminiNaieni24,
    author       = "Amini-Naieni, N. and Han, T. and Zisserman, A.",
    title        = "CountGD: Multi-Modal Open-World Counting",
    booktitle    = "NeurIPS",
    year         = "2024",
}
```