caca9527 committed on
Commit ecc61fc · verified · 1 Parent(s): 2410073
Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -5,6 +5,6 @@ license: apache-2.0
 
  [\[💻Code\]](https://sh-code.mthreads.com/liang.yang/mt-gui)[\[📝Paper\]](https://arxiv.org/abs/2503.11170) [\[🤗Models\]](https://huggingface.co/caca9527/GUIExplorer)[\[🤗Data\]]()
 
- ![GUIExplorer](https://sh-code.mthreads.com/liang.yang/mt-gui/-/raw/master/assets/overview.png?ref_type=heads){:height="60%" width="60%"}
+ <div align=center><img width="300" height="150" src="https://sh-code.mthreads.com/liang.yang/mt-gui/-/raw/master/assets/overview.png?ref_type=heads"/></div>
 
  🔥🔥🔥 We have open-sourced our self-developed GUI multimodal visual understanding model GUIExplorer, which is based on the model architecture of LLaVA OneVision 7B. It has basic GUI visual understanding capabilities, including regional OCR, Grounding, and single-step command execution capabilities. For details on how to train and use this model, please refer to the [\[💻Code\]](https://sh-code.mthreads.com/liang.yang/mt-gui).