view reply I'd imagine, similar to an agentic loop controlling a web browser. Observe to determine what action to take, take the action, and observe any visual changes to decide what to do next. Repeats until the goal is achieved.