---
title: Open Computer Use Agent
emoji: 🖥️
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: mit
---

# Open Computer Use Agent

An open-source alternative to OpenAI's Operator and Anthropic's Computer Use.

## Features

- 🖥️ Full Linux desktop (Xfce) running in the browser
- 🖱️ Mouse control (click, double-click, right-click)
- ⌨️ Keyboard input (typing, key combinations)
- 📷 Real-time screenshots
- 🌐 Firefox ESR browser included

## How It Works

This Space runs a virtual X11 desktop using:
- **Xvfb**: Virtual framebuffer for headless display
- **Xfce4**: Lightweight desktop environment
- **xdotool**: Mouse and keyboard automation
- **Gradio**: Web UI for control
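
As a rough illustration of how these pieces could be started together, here is a minimal Python sketch. It is not the Space's actual startup code: the helper names (`xvfb_command`, `desktop_env`, `start_desktop`) and the 1280x800x24 screen size are assumptions made for the example. Only the display number `:99` comes from the architecture diagram in this README.

```python
import os
import subprocess

DISPLAY = ":99"  # display number shown in the architecture diagram


def xvfb_command(display=DISPLAY, width=1280, height=800, depth=24):
    """Build the argv for Xvfb. The screen geometry is an assumed value."""
    return ["Xvfb", display, "-screen", "0", f"{width}x{height}x{depth}"]


def desktop_env(display=DISPLAY):
    """Copy the current environment and point X clients at the virtual display."""
    env = dict(os.environ)
    env["DISPLAY"] = display
    return env


def start_desktop(display=DISPLAY):
    """Launch Xvfb, then an Xfce session on it (hypothetical helper).

    A real entrypoint would probably wait until Xvfb accepts connections
    before starting the session; that step is omitted here.
    """
    xvfb = subprocess.Popen(xvfb_command(display))
    session = subprocess.Popen(["startxfce4"], env=desktop_env(display))
    return xvfb, session
```

Keeping the command construction separate from the `Popen` calls makes it possible to inspect or log the exact command line without needing an X server.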

## Usage

1. Click "Take Screenshot" to see the current desktop
2. Use the action controls to interact:
   - Enter X,Y coordinates and click
   - Type text
   - Press keyboard shortcuts
   - Scroll up/down
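
The actions above map fairly directly onto `xdotool` command lines. The sketch below shows one plausible mapping. The function names, the 50 ms typing delay, and the three-step scroll are illustrative assumptions, not the Space's real code. It relies on the X11 convention that buttons 4 and 5 are the scroll wheel.

```python
import os
import subprocess

# X11 button numbers: 1=left, 2=middle, 3=right, 4/5=scroll wheel up/down
BUTTONS = {"left": 1, "middle": 2, "right": 3, "scroll_up": 4, "scroll_down": 5}


def click_cmd(x, y, button="left", double=False):
    """Move the pointer to (x, y) and click; --repeat 2 gives a double-click."""
    cmd = ["xdotool", "mousemove", str(x), str(y), "click"]
    if double:
        cmd += ["--repeat", "2"]
    return cmd + [str(BUTTONS[button])]


def type_cmd(text, delay_ms=50):
    """Type literal text with a per-keystroke delay (the delay is an assumed value)."""
    return ["xdotool", "type", "--delay", str(delay_ms), text]


def key_cmd(combo):
    """Press a key or combination such as 'ctrl+c' or 'Return'."""
    return ["xdotool", "key", combo]


def scroll_cmd(direction, steps=3):
    """Scroll by sending repeated wheel-button clicks."""
    button = BUTTONS["scroll_up" if direction == "up" else "scroll_down"]
    return ["xdotool", "click", "--repeat", str(steps), str(button)]


def run(cmd, display=":99"):
    """Execute a command against the virtual display (hypothetical helper)."""
    subprocess.run(cmd, env=dict(os.environ, DISPLAY=display), check=True)
```

For example, `run(click_cmd(640, 400, button="right"))` would right-click near the middle of a 1280x800 screen, assuming a display is running on `:99`.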

## Architecture

```
┌─────────────────────────────────┐
│  HuggingFace Spaces Container   │
│  ┌───────────┐  ┌────────────┐  │
│  │   Xvfb    │──│   Xfce4    │  │
│  │   :99     │  │  Desktop   │  │
│  └─────┬─────┘  └────────────┘  │
│        │                        │
│  ┌─────▼─────┐  ┌────────────┐  │
│  │  xdotool  │──│   Gradio   │  │
│  │  control  │  │  UI :7860  │  │
│  └───────────┘  └────────────┘  │
└─────────────────────────────────┘
```
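
To make the Gradio-to-xdotool link in the diagram concrete, here is a hedged sketch of how a small Gradio app could sit in front of the virtual display. The screenshot command is an assumption: it uses ImageMagick's `import`, and the Space may capture the screen with a different tool. The function names and UI layout are likewise hypothetical. Only the display `:99` and port `7860` come from this README.

```python
import os
import subprocess
import tempfile

DISPLAY = ":99"   # from the diagram
APP_PORT = 7860   # from the diagram and the app_port setting


def screenshot_cmd(path):
    """Capture the root window to a file. Assumes ImageMagick's `import` is installed."""
    return ["import", "-window", "root", path]


def take_screenshot(display=DISPLAY):
    """Save a screenshot of the virtual display and return the file path."""
    path = os.path.join(tempfile.mkdtemp(), "screen.png")
    subprocess.run(screenshot_cmd(path), env=dict(os.environ, DISPLAY=display), check=True)
    return path


def click_at(x, y, display=DISPLAY):
    """Left-click at (x, y) via xdotool, then return a fresh screenshot."""
    cmd = ["xdotool", "mousemove", str(int(x)), str(int(y)), "click", "1"]
    subprocess.run(cmd, env=dict(os.environ, DISPLAY=display), check=True)
    return take_screenshot(display)


def build_ui():
    """Assemble the web UI. Gradio is imported lazily so the helpers work without it."""
    import gradio as gr

    with gr.Blocks(title="Open Computer Use Agent") as demo:
        screen = gr.Image(label="Desktop", type="filepath")
        with gr.Row():
            x = gr.Number(label="X", value=0, precision=0)
            y = gr.Number(label="Y", value=0, precision=0)
        gr.Button("Take Screenshot").click(take_screenshot, outputs=screen)
        gr.Button("Click").click(click_at, inputs=[x, y], outputs=screen)
    return demo


def main():
    """Serve on all interfaces so the container's exposed port reaches the app."""
    build_ui().launch(server_name="0.0.0.0", server_port=APP_PORT)
```

Calling `main()` starts the server. Returning a new screenshot after each action keeps the image in the browser in step with the desktop without a separate refresh.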

## License

MIT