jcwang0602 committed on
Commit 7a53e1e · verified · 1 Parent(s): ab7284e

Update README.md

Files changed (1)
  1. README.md +61 -3
README.md CHANGED
@@ -1,3 +1,61 @@
- ---
- license: mit
- ---
+ # VPTracker: Global Vision-Language Tracking via Visual Prompt and MLLM
+
+ [![arXiv](https://img.shields.io/badge/arXiv-2512.22799-b31b1b.svg?logo=arXiv)](https://arxiv.org/abs/2512.22799)
+ [![Python](https://img.shields.io/badge/Python-3.10-blue.svg)](https://www.python.org/downloads/)
+ [![PyTorch](https://img.shields.io/badge/PyTorch-2.5.1-red.svg)](https://pytorch.org/)
+ [![Transformers](https://img.shields.io/badge/Transformers-4.37.2-green.svg)](https://huggingface.co/docs/transformers/)
+
+ <img src="assets/VPTracker.jpg" width="800">
+
+ ## 🚀 Quick Start
+
+ ### Installation
+
+ ```bash
+ conda create -n gltrack python=3.10
+ conda activate gltrack
+
+ cd ms-swift
+ conda install -c conda-forge pyarrow sentencepiece
+ pip install -e .
+ pip install "sglang[all]" -U
+ pip install "vllm>=0.5.1" "transformers<4.55" "trl<0.21" -U
+ pip install "lmdeploy>=0.5" -U
+ pip install autoawq -U --no-deps
+ pip install auto_gptq optimum bitsandbytes "gradio<5.33" -U
+ pip install git+https://github.com/modelscope/ms-swift.git
+ pip install timm -U
+ pip install "deepspeed" -U
+ pip install flash-attn==2.7.4.post1 --no-build-isolation
+
+ conda install av -c conda-forge
+ pip install qwen_vl_utils qwen_omni_utils decord librosa icecream soundfile -U
+ pip install liger_kernel nvitop pre-commit math_verify py-spy -U
+
+ ```
+
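Because several of the install commands use `-U` together with upper bounds (`transformers<4.55`, `trl<0.21`, `gradio<5.33`), a later upgrade can silently move a package past its bound. The sketch below is one possible post-install check, not part of the VPTracker repository: the helper names `version_lt` and `check_upper_bound` are hypothetical, and it assumes a `sort` that supports `-V` (GNU coreutils) and that `python` resolves to the interpreter of the activated `gltrack` environment.

```shell
#!/usr/bin/env bash
# Hypothetical post-install check (assumption: not shipped with the repository).

# version_lt A B: succeed when version A sorts strictly before version B.
# Relies on `sort -V` (version sort); pre-release suffixes are only roughly handled.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# check_upper_bound PKG BOUND: report whether the installed PKG is below BOUND.
check_upper_bound() {
    local pkg="$1" bound="$2" installed
    # `pip show` prints a "Version: x.y.z" line for installed packages.
    installed="$(python -m pip show "$pkg" 2>/dev/null | awk '/^Version:/ {print $2}')"
    if [ -z "$installed" ]; then
        echo "$pkg: not installed"
    elif version_lt "$installed" "$bound"; then
        echo "$pkg $installed: ok (< $bound)"
    else
        echo "$pkg $installed: violates < $bound"
    fi
}

# Upper bounds taken from the pip commands in the installation steps.
check_upper_bound transformers 4.55
check_upper_bound trl 0.21
check_upper_bound gradio 5.33
```

Run it inside the activated environment. A line reporting a violation suggests reinstalling that package with the quoted constraint from the installation steps.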
+ ## 👀 Visualization
+ <img src="assets/Results.jpg" width="800">
+
+ ## 🙏 Acknowledgments
+ This code is built on top of [ms-swift](https://github.com/modelscope/ms-swift).
+
+ ## ✉️ Contact
+
+ Email: jcwang@stu.ecnu.edu.cn. Discussions of any kind are welcome!
+
+ ---
+
+ ## 📖 Citation
+ If our work is useful for your research, please consider citing:
+ ```bibtex
+ @misc{wang2025vptrackerglobalvisionlanguagetracking,
+ title={VPTracker: Global Vision-Language Tracking via Visual Prompt and MLLM},
+ author={Jingchao Wang and Kaiwen Zhou and Zhijian Wu and Kunhua Ji and Dingjiang Huang and Yefeng Zheng},
+ year={2025},
+ eprint={2512.22799},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV},
+ url={https://arxiv.org/abs/2512.22799},
+ }
+ ```