metadata
title: Open Computer Use Agent
emoji: 🖥️
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: mit

Open Computer Use Agent

An open-source alternative to OpenAI's Operator and Anthropic's Computer Use.

Features

  • 🖥️ Full Linux desktop (Xfce) running in the browser
  • 🖱️ Mouse control (click, double-click, right-click)
  • ⌨️ Keyboard input (typing, key combinations)
  • 📷 Real-time screenshots
  • 🌐 Firefox ESR browser included

How It Works

This Space runs a virtual X11 desktop using:

  • Xvfb: Virtual framebuffer for headless display
  • Xfce4: Lightweight desktop environment
  • xdotool: Mouse and keyboard automation
  • Gradio: Web UI for control
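As a rough illustration of how these pieces could be started together, here is a minimal sketch. It assumes the display number `:99` shown in the architecture diagram; the 1280x800x24 screen size, the two-second settle delay, and the helper names (`xvfb_command`, `desktop_env`, `start_desktop`) are illustrative assumptions, not taken from this Space's actual startup script.

```python
import os
import subprocess
import time

DISPLAY = ":99"  # display number taken from the architecture diagram


def xvfb_command(display=DISPLAY, width=1280, height=800, depth=24):
    """Build the argv for the virtual framebuffer (resolution is an assumed default)."""
    return ["Xvfb", display, "-screen", "0", f"{width}x{height}x{depth}"]


def desktop_env(display=DISPLAY, base_env=None):
    """Return a copy of the environment with DISPLAY pointing at the virtual screen."""
    env = dict(os.environ if base_env is None else base_env)
    env["DISPLAY"] = display
    return env


def start_desktop(display=DISPLAY, settle_seconds=2.0):
    """Start Xvfb, wait briefly, then start an Xfce session on that display."""
    xvfb = subprocess.Popen(xvfb_command(display))
    time.sleep(settle_seconds)  # crude wait for the X server to accept connections
    xfce = subprocess.Popen(["startxfce4"], env=desktop_env(display))
    return xvfb, xfce
```

Any process that should draw on or control the desktop (Firefox, xdotool, a screenshot tool) would need the same `DISPLAY` value in its environment.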

Usage

  1. Click "Take Screenshot" to see the current desktop
  2. Use the action controls to interact:
    • Enter X,Y coordinates and click
    • Type text
    • Press keyboard shortcuts
    • Scroll up/down
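The actions above map fairly directly onto xdotool invocations. The sketch below only builds the argument lists, so it can be read and tested without a running desktop. The function names, the `X,Y` text format, and the default of three scroll steps are assumptions for illustration; the app's real handlers may differ.

```python
BUTTONS = {"left": 1, "middle": 2, "right": 3}  # X11 pointer button numbers
SCROLL = {"up": 4, "down": 5}                   # wheel events are buttons 4 and 5


def parse_point(text):
    """Parse an 'X,Y' string into a pair of non-negative integers."""
    x_str, y_str = text.split(",")
    x, y = int(x_str.strip()), int(y_str.strip())
    if x < 0 or y < 0:
        raise ValueError("coordinates must be non-negative")
    return x, y


def click_args(x, y, button="left", double=False):
    """Move the pointer to (x, y), then click once or twice."""
    args = ["xdotool", "mousemove", str(x), str(y), "click"]
    if double:
        args += ["--repeat", "2"]
    args.append(str(BUTTONS[button]))
    return args


def type_args(text):
    """Type literal text; '--' keeps text that starts with '-' from being read as an option."""
    return ["xdotool", "type", "--", text]


def key_args(combo):
    """Press a key combination written with X keysym names, e.g. 'ctrl+c' or 'Return'."""
    return ["xdotool", "key", combo]


def scroll_args(direction, steps=3):
    """Scroll by sending repeated wheel-button clicks."""
    return ["xdotool", "click", "--repeat", str(steps), str(SCROLL[direction])]
```

For example, `click_args(*parse_point("120, 45"), button="right")` produces the arguments for a right-click at (120, 45), which could then be passed to `subprocess.run` with `DISPLAY` set to the virtual screen.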

Architecture

┌─────────────────────────────────┐
│  HuggingFace Spaces Container   │
│  ┌───────────┐  ┌────────────┐  │
│  │   Xvfb    │◄─│   Xfce4    │  │
│  │   :99     │  │  Desktop   │  │
│  └─────┬─────┘  └────────────┘  │
│        │                        │
│  ┌─────▼─────┐  ┌────────────┐  │
│  │  xdotool  │◄─│  Gradio    │  │
│  │  control  │  │  UI :7860  │  │
│  └───────────┘  └────────────┘  │
└─────────────────────────────────┘
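The Gradio-to-xdotool path in the diagram could be wired up roughly as follows. This is a sketch under stated assumptions, not the Space's actual `app.py`: the names `run_xdotool`, `on_click`, and `build_ui` are hypothetical, and the `runner` parameter exists only so the callback can be exercised without a real X server. Display `:99` comes from the diagram.

```python
import os
import subprocess

DISPLAY = ":99"  # virtual display from the diagram


def run_xdotool(args, runner=subprocess.run):
    """Run xdotool against the virtual display and return a short status string."""
    env = dict(os.environ, DISPLAY=DISPLAY)
    result = runner(["xdotool", *args], env=env, capture_output=True, text=True)
    return "ok" if result.returncode == 0 else f"error: {result.stderr.strip()}"


def on_click(x, y, runner=subprocess.run):
    """UI callback: move the pointer to (x, y) and left-click."""
    return run_xdotool(["mousemove", str(int(x)), str(int(y)), "click", "1"], runner)


def build_ui():
    """Assemble a minimal Blocks UI; gradio is imported lazily so the helpers stay testable."""
    import gradio as gr

    with gr.Blocks() as demo:
        x = gr.Number(label="X", value=0, precision=0)
        y = gr.Number(label="Y", value=0, precision=0)
        status = gr.Textbox(label="Status")
        gr.Button("Click").click(on_click, inputs=[x, y], outputs=status)
    return demo
```

To serve this on the port declared in the Space metadata, one would call `build_ui().launch(server_name="0.0.0.0", server_port=7860)` inside the container.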

License

MIT