IDEA-Research/ChatRex-7B

@@ -1,10 +1,11 @@
 ---
-language:
-- en
 base_model:
 - lmsys/vicuna-7b-v1.5
 - openai/clip-vit-large-patch14
 - laion/CLIP-convnext_large_d.laion2B-s26B-b102K-augreg
 pipeline_tag: image-text-to-text
 tags:
 - chatrex
@@ -17,6 +18,8 @@ arxiv.org/abs/2411.18363
   <img src="assets/teaser.jpg" width=600 >
 </div>
 ----
 # 1. Introduction 📚
@@ -43,7 +46,7 @@ cd chatrex/upn/ops
 pip install -v -e .
 ```
-## 2.1 Download Pre-trained UPN Models
 We provide model checkpoints for both the ***Universal Proposal Network (UPN)*** and the ***ChatRex model***. You can download the pre-trained models from the following links:
 - [UPN Checkpoint](https://github.com/IDEA-Research/ChatRex/releases/download/upn-large/upn_large.pth)
 - [ChatRex-7B Checkpoint](https://huggingface.co/IDEA-Research/ChatRex-7B)
@@ -173,6 +176,8 @@ Please detect person in the car; cat below the table in this image. Answer the q
 <details close>
 <summary><strong>Example Code</strong></summary>
 ```python
 import torch
 from PIL import Image
@@ -289,6 +294,8 @@ Can you provide me with a one sentence of <obji>? Answer the question with one s
 <details close>
 <summary><strong>Example Code</strong></summary>
 ```python
 import torch
 from PIL import Image
@@ -381,6 +388,8 @@ Please provide a detailed description of the image and detect all the mentioned
 <details close>
 <summary><strong>Example Code</strong></summary>
 ```python
 import torch
 from PIL import Image
@@ -483,6 +492,8 @@ Answer the question in Grounded format. Question
 <details close>
 <summary><strong>Example Code</strong></summary>
 ```python
 import torch
 from PIL import Image
@@ -558,7 +569,6 @@ if __name__ == "__main__":
     )
     vis_image.save("tests/test_chatrex_grounded_conversation.jpeg")
     print(f"prediction is saved at tests/test_chatrex_grounded_conversation.jpeg")
 ```
 The output from LLM is like:
@@ -576,7 +586,6 @@ The visualization of the output is like:
 ----
 # 5. LICENSE
 ChatRex is licensed under the IDEA License 1.0, Copyright (c) IDEA. All Rights Reserved. Note that this project utilizes certain datasets and checkpoints that are subject to their respective original licenses. Users must comply with all terms and conditions of these original licenses including but not limited to the:
@@ -595,4 +604,4 @@ ChatRex is licensed under the IDEA License 1.0, Copyright (c) IDEA. All Rights R
       primaryClass={cs.CV},
       url={https://arxiv.org/abs/2411.18363},
 }
-```

 ---
 base_model:
 - lmsys/vicuna-7b-v1.5
 - openai/clip-vit-large-patch14
 - laion/CLIP-convnext_large_d.laion2B-s26B-b102K-augreg
+language:
+- en
+library_name: transformers
 pipeline_tag: image-text-to-text
 tags:
 - chatrex
   <img src="assets/teaser.jpg" width=600 >
 </div>
+The code for the model can be found at: https://github.com/IDEA-Research/ChatRex
 ----
 # 1. Introduction 📚
 pip install -v -e .
 ```
+## 2.1 Download Pre-trained Models
 We provide model checkpoints for both the ***Universal Proposal Network (UPN)*** and the ***ChatRex model***. You can download the pre-trained models from the following links:
 - [UPN Checkpoint](https://github.com/IDEA-Research/ChatRex/releases/download/upn-large/upn_large.pth)
 - [ChatRex-7B Checkpoint](https://huggingface.co/IDEA-Research/ChatRex-7B)
 <details close>
 <summary><strong>Example Code</strong></summary>
+- [Example Code in python file](tests/test_chatrex_detection.py)
 ```python
 import torch
 from PIL import Image
 <details close>
 <summary><strong>Example Code</strong></summary>
+- [Example Code in python file](tests/test_chatrex_region_caption.py)
 ```python
 import torch
 from PIL import Image
 <details close>
 <summary><strong>Example Code</strong></summary>
+- [Example Code in python file](tests/test_chatrex_grounded_image_caption.py)
 ```python
 import torch
 from PIL import Image
 <details close>
 <summary><strong>Example Code</strong></summary>
+- [Example Code in python file](tests/test_chatrex_grounded_conversation.py)
 ```python
 import torch
 from PIL import Image
     )
     vis_image.save("tests/test_chatrex_grounded_conversation.jpeg")
     print(f"prediction is saved at tests/test_chatrex_grounded_conversation.jpeg")
 ```
 The output from LLM is like:
 ----
 # 5. LICENSE
 ChatRex is licensed under the IDEA License 1.0, Copyright (c) IDEA. All Rights Reserved. Note that this project utilizes certain datasets and checkpoints that are subject to their respective original licenses. Users must comply with all terms and conditions of these original licenses including but not limited to the:
       primaryClass={cs.CV},
       url={https://arxiv.org/abs/2411.18363},
 }
+```
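
The two checkpoint links listed under "2.1 Download Pre-trained Models" can also be fetched from a script. The following is a minimal sketch, not part of the repository: the `checkpoints/` directory layout and the helper names `checkpoint_path`, `download_upn`, and `download_chatrex` are assumptions made for illustration, and the ChatRex-7B download assumes the `huggingface_hub` package is installed.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve

# URL and repo id copied from the README's checkpoint list.
UPN_URL = "https://github.com/IDEA-Research/ChatRex/releases/download/upn-large/upn_large.pth"
CHATREX_REPO_ID = "IDEA-Research/ChatRex-7B"


def checkpoint_path(url: str, target_dir: str = "checkpoints") -> Path:
    """Return the local path a URL would be saved to (hypothetical layout)."""
    return Path(target_dir) / Path(urlparse(url).path).name


def download_upn(url: str = UPN_URL, target_dir: str = "checkpoints") -> Path:
    """Download the UPN checkpoint unless the file is already present."""
    dest = checkpoint_path(url, target_dir)
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():
        urlretrieve(url, str(dest))
    return dest


def download_chatrex(target_dir: str = "checkpoints/ChatRex-7B") -> str:
    """Fetch a snapshot of the ChatRex-7B repository from the Hugging Face Hub."""
    # Imported lazily so the UPN helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=CHATREX_REPO_ID, local_dir=target_dir)
```

Calling `download_upn()` followed by `download_chatrex()` would place both checkpoints under `checkpoints/`; paths passed to the example scripts would then need to match wherever the files actually end up.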