š Takeaways 1ļøā£ Even the best computer-using models struggle with simple, context-dependent tasks.Ā 2ļøā£ Visual perception and reasoning remain major hurdles for multimodal agents. 3ļøā£ Real-world use cases reveal significant gaps between hype and reality. Perception accuracy drops to near zero by the last turn š