---
title: SCOPE Chat Demo
emoji: ๐
colorFrom: purple
colorTo: red
sdk: gradio
sdk_version: 4.43.0
app_file: app.py
pinned: false
license: mit
short_description: An interactive chat demo for SCOPE.
---
# SCOPE Chat Demo

This demo compares the chat outputs of different visual token compression methods.
## Deploy the Demo Locally

To run the demo, follow the same steps as for LLaVA:
1. In **Terminal 1**, start the controller:
```bash
python -m llava.serve.controller --host 0.0.0.0 --port 10000
```
2. In **Terminal 2**, launch the Gradio web server:
```bash
python -m llava.serve.gradio_web_server_SCOPE --controller http://localhost:10000 --model-list-mode reload
```
3. In **Terminal 3**, start the model worker:
```bash
python -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40000 --worker http://localhost:40000 --model-path liuhaotian/llava-v1.5-7b
```
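If you prefer a single terminal, the three steps above can be wrapped in a small launcher script. This is a hypothetical helper, not part of the SCOPE repository; the commands, ports, and model path mirror the steps above, and `DRY_RUN` (an assumption of this sketch, defaulting to on) lets you preview the commands without starting anything:

```shell
#!/usr/bin/env bash
# launch_demo.sh -- hypothetical one-terminal launcher for the three services.
# Set DRY_RUN=0 to actually start them; the default just prints the commands.
set -u

CONTROLLER_PORT=10000
WORKER_PORT=40000
MODEL_PATH="liuhaotian/llava-v1.5-7b"   # adjust for your checkpoint

# The three commands from the steps above, in start order.
CMDS=(
  "python -m llava.serve.controller --host 0.0.0.0 --port ${CONTROLLER_PORT}"
  "python -m llava.serve.gradio_web_server_SCOPE --controller http://localhost:${CONTROLLER_PORT} --model-list-mode reload"
  "python -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:${CONTROLLER_PORT} --port ${WORKER_PORT} --worker http://localhost:${WORKER_PORT} --model-path ${MODEL_PATH}"
)

for cmd in "${CMDS[@]}"; do
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "would run: ${cmd}"        # preview only
  else
    ${cmd} &                        # start the service in the background
    sleep 5                         # give it time to come up before the next
  fi
done

# When actually launching, keep the script alive while the services run.
[ "${DRY_RUN:-1}" = "1" ] || wait
```

The controller must be up before the web server and worker register with it, which is why the launcher starts the services in order with a short delay between them.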
## Citation

If you find this project useful in your research, please consider citing:
```
@inproceedings{deng2025scope,
  title={SCOPE: Saliency-Coverage Oriented Token Pruning for Efficient Multimodal LLMs},
  author={Deng, Jinhong and Li, Wen and Zhou, Joey Tianyi and He, Yang},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2025}
}
```
## Acknowledgement

This chat demo is based on [VisionZip](https://github.com/dvlab-research/VisionZip.git); many thanks to its authors.