UI Agent Collection • A collection of algorithmic agents for user interfaces/interactions, program synthesis, and robotics • 501 items • Updated 2 days ago • 69
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published Dec 18, 2025 • 97
Article LeRobot v0.4.0: Comprehensively Improving the Learning Capabilities of Open-Source Robots +7 imstevenpmwork, aractingi, pepijn223, CarolinePascal, jadechoghari, fracapuano, AdilZtn, nepyope, thomwolf • Oct 24, 2025 • 14
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents Paper • 2506.03143 • Published Jun 3, 2025 • 54
GUI-AIMA: Aligning Intrinsic Multimodal Attention with a Context Anchor for GUI Grounding Paper • 2511.00810 • Published Nov 2, 2025 • 5
Article A failed experiment: Infini-Attention, and why we should keep trying? +1 neuralink, lvwerra, thomwolf • Aug 14, 2024 • 76
Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming Paper • 2408.16725 • Published Aug 29, 2024 • 53
Article SeeMoE: Implementing a MoE Vision Language Model from Scratch AviSoori1x • Jun 23, 2024 • 40
Article makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch AviSoori1x • May 7, 2024 • 122