A11y-CUA Dataset: Characterizing the Accessibility Gap in Computer Use Agents
Abstract
A11y-CUA dataset reveals distinct interaction patterns between sighted and blind/low-vision users, demonstrating significant performance gaps in current computer use agents when adapted for assistive technology use.
Computer Use Agents (CUAs) operate interfaces by pointing, clicking, and typing -- mirroring interactions of sighted users (SUs) who can thus monitor CUAs and share control. CUAs do not reflect interactions by blind and low-vision users (BLVUs) who use assistive technology (AT). BLVUs thus cannot easily collaborate with CUAs. To characterize the accessibility gap of CUAs, we present A11y-CUA, a dataset of BLVUs and SUs performing 60 everyday tasks with 40.4 hours and 158,325 events. Our dataset analysis reveals that our collected interaction traces quantitatively confirm distinct interaction styles between SU and BLVU groups (mouse- vs. keyboard-dominant) and demonstrate interaction diversity within each group (sequential vs. shortcut navigation for BLVUs). We then compare collected traces to state-of-the-art CUAs under default and AT conditions (keyboard-only, magnifier). The default CUA executed 78.3% of tasks successfully. But with the AT conditions, CUA's performance dropped to 41.67% and 28.3% with keyboard-only and magnifier conditions respectively, and did not reflect nuances of real AT use. With our open A11y-CUA dataset, we aim to promote collaborative and accessible CUAs for everyone.
Get this paper in your agent:
hf papers read 2602.09310 Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash Models citing this paper 0
No model linking this paper
Datasets citing this paper 2
berkeley-hci/Reduced-A11y-CUA
Spaces citing this paper 0
No Space linking this paper
Collections including this paper 0
No Collection including this paper