Xu0307 committed on
Commit 5030509 · verified · 1 Parent(s): 8a3ce70

Update README.md

Files changed (1)
  1. README.md +7 -0
README.md CHANGED
@@ -9,6 +9,7 @@ The development of GUI agents could revolutionize the next generation of human-c
  
  ## 🏆 Results
  
+ ### Grounding
  MAI-UI establishes new state-of-the-art across GUI grounding and mobile navigation.
  
  - On grounding benchmarks, it reaches 73.5% on ScreenSpot-Pro, 91.3% on MMBench GUI L2, 70.9% on OSWorld-G, and 49.2% on UI-Vision, surpassing Gemini-3-Pro and Seed1.8 on ScreenSpot-Pro.
@@ -17,10 +18,16 @@ MAI-UI establishes new state-of-the-art across GUI grounding and mobile navigati
  ![mmbench](https://cdn-uploads.huggingface.co/production/uploads/63525c3a6cfb8f1498127a34/dwc8np4JLxTWoNS8wu18p.jpeg)
  ![osworld-g](https://cdn-uploads.huggingface.co/production/uploads/63525c3a6cfb8f1498127a34/lff2k7ZkMgCrgt3lPZChb.jpeg)
  
+ ### Mobile Navigation
  - On mobile GUI navigation, it sets a new SOTA of 76.7% on AndroidWorld, surpassing UI-Tars-2, Gemini-2.5-Pro and Seed1.8. On MobileWorld, MAI-UI obtains 41.7% success rate, significantly outperforming end-to-end GUI models and competitive with Gemini-3-Pro based agentic frameworks.
  ![aw](https://cdn-uploads.huggingface.co/production/uploads/63525c3a6cfb8f1498127a34/EUXl1osfQ26WGYcV4hhsG.jpeg)
  ![mw](https://cdn-uploads.huggingface.co/production/uploads/63525c3a6cfb8f1498127a34/gsbjUtIoUFWyR6Nw3qRz6.jpeg)
  
+ ### Online RL
  - Our online RL experiments show significant gains from scaling parallel environments from 32 to 512 (+5.2 points) and increasing environment step budget from 15 to 50 (+4.3 points).
  ![rl](https://cdn-uploads.huggingface.co/production/uploads/63525c3a6cfb8f1498127a34/BO8GnYXsJ51ZTvVRLxn6u.jpeg)
  ![rl_env](https://cdn-uploads.huggingface.co/production/uploads/63525c3a6cfb8f1498127a34/39mhKfUvt7159dKM_oeGa.jpeg)
+
+ ### Device-Cloud Collaboration
+ - Our device-cloud collaboration framework can dynamically select on-device or cloud execution based on task execution state and data sensitivity. It improves on-device performance by 33% and reduces cloud API calls by over 40%.
+ ![dcc](https://cdn-uploads.huggingface.co/production/uploads/63525c3a6cfb8f1498127a34/Fm2PPxbRpASfdvVBxkjLw.jpeg)
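
Note on the added Device-Cloud Collaboration bullet: the README says execution is selected "based on task execution state and data sensitivity" but does not give the rule. The sketch below is one possible reading, not MAI-UI's implementation. The names `StepState`, `off_track`, `sensitive_data`, and `select_executor` are hypothetical, and the policy itself (stay on-device whenever sensitive data is present, escalate to the cloud only when the on-device agent appears off track) is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Executor(Enum):
    DEVICE = "device"
    CLOUD = "cloud"


@dataclass
class StepState:
    """Hypothetical per-step signals; field names are not taken from MAI-UI."""
    off_track: bool       # on-device agent seems to have deviated from the task
    sensitive_data: bool  # current screen or instruction contains private data


def select_executor(state: StepState) -> Executor:
    """Assumed policy: sensitive data never leaves the device; otherwise
    escalate to the cloud only when the on-device agent is off track."""
    if state.sensitive_data:
        return Executor.DEVICE
    if state.off_track:
        return Executor.CLOUD
    return Executor.DEVICE
```

Under this assumed policy, cloud calls happen only for non-sensitive steps where the local agent is struggling, which is one way a router could reduce cloud API usage while keeping private data local.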