IDEA-Research/ChatRex-7B

@@ -5,8 +5,12 @@
 # 1. Introduction 📚
-**TL;DR: ChatRex is a MLLM skilled in perception that can respond to questions while simultaneously grounding its answers to the referenced objects.**
 ChatRex is a Multimodal Large Language Model (MLLM) designed to seamlessly integrate fine-grained object perception and robust language understanding. By adopting a decoupled architecture with a retrieval-based approach for object detection and leveraging high-resolution visual inputs, ChatRex addresses key challenges in perception tasks. It is powered by the Rexverse-2M dataset with diverse image-region-text annotations. ChatRex can be applied to various scenarios requiring fine-grained perception, such as object detection, grounded conversation, grounded image captioning and region
 understanding.
@@ -29,7 +33,7 @@ pip install -v -e .
 ## 2.1 Download Pre-trained Models
 We provide model checkpoints for both the ***Universal Proposal Network (UPN)*** and the ***ChatRex model***. You can download the pre-trained models from the following links:
-- [UPN Checkpoint](https://drive.google)
 - [ChatRex-7B Checkpoint](https://huggingface.co/IDEA-Research/ChatRex-7B)
 Or you can also using the following command to download the pre-trained models:
@@ -37,7 +41,7 @@ Or you can also using the following command to download the pre-trained models:
 mkdir checkpoints
 mkdir checkpoints/upn
 # download UPN checkpoint
-wget -O checkpoints/upn/upn_large.pth https://drive.google.com/file/d/
 # download ChatRex checkpoint from huggingface IDEA-Research/ChatRex-7B
 # Download ChatRex checkpoint from Hugging Face
 git lfs install
@@ -174,7 +178,7 @@ from chatrex.upn import UPNWrapper
 if __name__ == "__main__":
     # load the processor
     processor = AutoProcessor.from_pretrained(
-        "checkpoints/chatrex7b",
         trust_remote_code=True,
         device_map="cuda",
     )
@@ -182,7 +186,7 @@ if __name__ == "__main__":
     print(f"loading chatrex model...")
     # load chatrex model
     model = AutoModelForCausalLM.from_pretrained(
-        "checkpoints/chatrex7b",
         trust_remote_code=True,
         use_safetensors=True,
     ).to("cuda")
@@ -292,7 +296,7 @@ from chatrex.upn import UPNWrapper
 if __name__ == "__main__":
     # load the processor
     processor = AutoProcessor.from_pretrained(
-        "checkpoints/chatrex7b",
         trust_remote_code=True,
         device_map="cuda",
     )
@@ -300,7 +304,7 @@ if __name__ == "__main__":
     print(f"loading chatrex model...")
     # load chatrex model
     model = AutoModelForCausalLM.from_pretrained(
-        "checkpoints/chatrex7b",
         trust_remote_code=True,
         use_safetensors=True,
     ).to("cuda")
@@ -386,7 +390,7 @@ from chatrex.upn import UPNWrapper
 if __name__ == "__main__":
     # load the processor
     processor = AutoProcessor.from_pretrained(
-        "checkpoints/chatrex7b",
         trust_remote_code=True,
         device_map="cuda",
     )
@@ -394,7 +398,7 @@ if __name__ == "__main__":
     print(f"loading chatrex model...")
     # load chatrex model
     model = AutoModelForCausalLM.from_pretrained(
-        "checkpoints/chatrex7b",
         trust_remote_code=True,
         use_safetensors=True,
     ).to("cuda")
@@ -490,7 +494,7 @@ from chatrex.upn import UPNWrapper
 if __name__ == "__main__":
     # load the processor
     processor = AutoProcessor.from_pretrained(
-        "checkpoints/chatrex7b",
         trust_remote_code=True,
         device_map="cuda",
     )
@@ -498,7 +502,7 @@ if __name__ == "__main__":
     print(f"loading chatrex model...")
     # load chatrex model
     model = AutoModelForCausalLM.from_pretrained(
-        "checkpoints/chatrex7b",
         trust_remote_code=True,
         use_safetensors=True,
     ).to("cuda")
@@ -572,35 +576,6 @@ The visualization of the output is like:
 ----
-# 4. Gradio Demos 🎨
-## 4.1 Gradio Demo for UPN
-We provide a gradio demo for UPN to visualize the object proposals generated by UPN. You can run the following command to start the gradio demo:
-```bash
-python gradio_demos/upn_demo.py
-# if there is permission error, please run the following command
-mkdir tmp
-TMPDIR='/tmp' python gradio_demos/upn_demo.py
-```
-<div align=center>
-  <img src="assets/upn_gradio.jpg" width=600 >
-</div>
-## 4.2 Gradio Demo for ChatRex
-We also provide a gradio demo for ChatRex.
-```bash
-python gradio_demos/chatrex_demo.py
-# if there is permission error, please run the following command
-mkdir tmp
-TMPDIR='/tmp' python gradio_demos/upn_demo.py
-```
-<div align=center>
-  <img src="assets/chatrex_gradio.jpg" width=600 >
-</div>
 # 5. LICENSE

+<div align=center>
+----
 # 1. Introduction 📚
+**TL;DR: ChatRex is an MLLM skilled in perception that can respond to questions while simultaneously grounding its answers to the referenced objects.**
 ChatRex is a Multimodal Large Language Model (MLLM) designed to seamlessly integrate fine-grained object perception and robust language understanding. By adopting a decoupled architecture with a retrieval-based approach for object detection and leveraging high-resolution visual inputs, ChatRex addresses key challenges in perception tasks. It is powered by the Rexverse-2M dataset with diverse image-region-text annotations. ChatRex can be applied to various scenarios requiring fine-grained perception, such as object detection, grounded conversation, grounded image captioning and region
 understanding.
 ## 2.1 Download Pre-trained Models
 We provide model checkpoints for both the ***Universal Proposal Network (UPN)*** and the ***ChatRex model***. You can download the pre-trained models from the following links:
+- [UPN Checkpoint](https://github.com/IDEA-Research/ChatRex/releases/download/upn-large/upn_large.pth)
 - [ChatRex-7B Checkpoint](https://huggingface.co/IDEA-Research/ChatRex-7B)
 Or you can also using the following command to download the pre-trained models:
 mkdir checkpoints
 mkdir checkpoints/upn
 # download UPN checkpoint
+wget -O checkpoints/upn/upn_large.pth https://github.com/IDEA-Research/ChatRex/releases/download/upn-large/upn_large.pth
 # download ChatRex checkpoint from huggingface IDEA-Research/ChatRex-7B
 # Download ChatRex checkpoint from Hugging Face
 git lfs install
 if __name__ == "__main__":
     # load the processor
     processor = AutoProcessor.from_pretrained(
+        "IDEA-Research/ChatRex-7B",
         trust_remote_code=True,
         device_map="cuda",
     )
     print(f"loading chatrex model...")
     # load chatrex model
     model = AutoModelForCausalLM.from_pretrained(
+        "IDEA-Research/ChatRex-7B",
         trust_remote_code=True,
         use_safetensors=True,
     ).to("cuda")
 if __name__ == "__main__":
     # load the processor
     processor = AutoProcessor.from_pretrained(
+        "IDEA-Research/ChatRex-7B",
         trust_remote_code=True,
         device_map="cuda",
     )
     print(f"loading chatrex model...")
     # load chatrex model
     model = AutoModelForCausalLM.from_pretrained(
+        "IDEA-Research/ChatRex-7B",
         trust_remote_code=True,
         use_safetensors=True,
     ).to("cuda")
 if __name__ == "__main__":
     # load the processor
     processor = AutoProcessor.from_pretrained(
+        "IDEA-Research/ChatRex-7B",
         trust_remote_code=True,
         device_map="cuda",
     )
     print(f"loading chatrex model...")
     # load chatrex model
     model = AutoModelForCausalLM.from_pretrained(
+        "IDEA-Research/ChatRex-7B",
         trust_remote_code=True,
         use_safetensors=True,
     ).to("cuda")
 if __name__ == "__main__":
     # load the processor
     processor = AutoProcessor.from_pretrained(
+        "IDEA-Research/ChatRex-7B",
         trust_remote_code=True,
         device_map="cuda",
     )
     print(f"loading chatrex model...")
     # load chatrex model
     model = AutoModelForCausalLM.from_pretrained(
+        "IDEA-Research/ChatRex-7B",
         trust_remote_code=True,
         use_safetensors=True,
     ).to("cuda")
 ----
 # 5. LICENSE
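
The diff above replaces the local path `checkpoints/chatrex7b` with the Hub repo ID `IDEA-Research/ChatRex-7B` in every `from_pretrained` call. For readers who already have a local copy, the following is a minimal sketch of a fallback that prefers the local directory and otherwise uses the repo ID. The helper name `resolve_model_source` is hypothetical and not part of the ChatRex code, and the `config.json` check is an assumption about what a usable local checkpoint directory contains.

```python
from pathlib import Path

# Repo ID used on the "+" side of the diff.
HUB_REPO_ID = "IDEA-Research/ChatRex-7B"
# Local path used on the "-" side of the diff.
LOCAL_CHECKPOINT_DIR = "checkpoints/chatrex7b"


def resolve_model_source(
    local_dir: str = LOCAL_CHECKPOINT_DIR,
    repo_id: str = HUB_REPO_ID,
) -> str:
    """Return local_dir if it looks like a downloaded checkpoint, else repo_id.

    Hypothetical helper: the presence of config.json is assumed to be a
    sufficient sign that the directory holds a usable checkpoint.
    """
    path = Path(local_dir)
    if path.is_dir() and (path / "config.json").is_file():
        return str(path)
    return repo_id


if __name__ == "__main__":
    # The returned string can be passed as the first argument of
    # AutoProcessor.from_pretrained / AutoModelForCausalLM.from_pretrained.
    print(resolve_model_source())
```

The returned string would take the place of the hard-coded first argument in the loading calls shown in the diff; the remaining arguments (`trust_remote_code=True` and so on) stay unchanged.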