Update README.md
README.md
CHANGED
@@ -9,6 +9,7 @@ The development of GUI agents could revolutionize the next generation of human-c

 ## 🏆 Results

+### Grounding
 MAI-UI establishes new state-of-the-art across GUI grounding and mobile navigation.

 - On grounding benchmarks, it reaches 73.5% on ScreenSpot-Pro, 91.3% on MMBench GUI L2, 70.9% on OSWorld-G, and 49.2% on UI-Vision, surpassing Gemini-3-Pro and Seed1.8 on ScreenSpot-Pro.
@@ -17,10 +18,16 @@ MAI-UI establishes new state-of-the-art across GUI grounding and mobile navigati



+### Mobile Navigation
 - On mobile GUI navigation, it sets a new SOTA of 76.7% on AndroidWorld, surpassing UI-Tars-2, Gemini-2.5-Pro and Seed1.8. On MobileWorld, MAI-UI obtains 41.7% success rate, significantly outperforming end-to-end GUI models and competitive with Gemini-3-Pro based agentic frameworks.



+### Online RL
 - Our online RL experiments show significant gains from scaling parallel environments from 32 to 512 (+5.2 points) and increasing environment step budget from 15 to 50 (+4.3 points).


+
+### Device-Cloud Collaboration
+- Our device-cloud collaboration framework can dynamically select on-device or cloud execution based on task execution state and data sensitivity. It improves on-device performance by 33% and reduces cloud API calls by over 40%.
+
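The Device-Cloud Collaboration bullet added in this change says execution is selected from task execution state and data sensitivity, but the README does not spell out the policy. The following is a minimal Python sketch of one way such a selection rule could look. Every name here (`StepState`, `choose_target`, `max_failures`) and both rules are assumptions made for illustration, not MAI-UI's actual implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    DEVICE = "device"
    CLOUD = "cloud"


@dataclass
class StepState:
    """Hypothetical per-step snapshot; these fields are not taken from the README."""
    contains_sensitive_data: bool  # e.g. a password field is visible on screen
    consecutive_failures: int      # recent on-device steps that made no progress


def choose_target(state: StepState, max_failures: int = 2) -> Target:
    """Pick where the next step runs (assumed policy, for illustration only)."""
    # Assumption: sensitive screens are always handled on the device.
    if state.contains_sensitive_data:
        return Target.DEVICE
    # Assumption: escalate to the cloud only once the on-device model looks stuck.
    if state.consecutive_failures >= max_failures:
        return Target.CLOUD
    return Target.DEVICE
```

Under these assumptions, checking sensitivity first means a stuck task that shows private data still stays on the device, and the failure threshold is what would keep cloud calls to a minimum.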