Browser Actor
Browser Actor is a web automation library built on CDP (Chrome DevTools Protocol) that provides low-level browser automation capabilities within the browser-use ecosystem.
Usage
Integrated with Browser (Recommended)
from browser_use import Browser # Alias for BrowserSession
# Create and start browser session
browser = Browser()
await browser.start()
# Create new tabs and navigate
page = await browser.new_page("https://example.com")
pages = await browser.get_pages()
current_page = await browser.get_current_page()
Direct Page Access (Advanced)
from browser_use.actor import Page, Element, Mouse
# Create page with existing browser session
page = Page(browser_session, target_id, session_id)
Basic Operations
# Tab Management
page = await browser.new_page() # Create blank tab
page = await browser.new_page("https://example.com") # Create tab with URL
pages = await browser.get_pages() # Get all existing tabs
await browser.close_page(page) # Close specific tab
# Navigation
await page.goto("https://example.com")
await page.go_back()
await page.go_forward()
await page.reload()
Element Operations
# Find elements by CSS selector
elements = await page.get_elements_by_css_selector("input[type='text']")
buttons = await page.get_elements_by_css_selector("button.submit")
# Get element by backend node ID
element = await page.get_element(backend_node_id=12345)
# AI-powered element finding (requires LLM)
element = await page.get_element_by_prompt("search button", llm=your_llm)
element = await page.must_get_element_by_prompt("login form", llm=your_llm)
Note: get_elements_by_css_selector returns immediately without waiting for visibility.
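Because the lookup returns immediately, content that renders late may need a retry loop. The following is a minimal sketch of one way to poll, not part of browser-use: `wait_for_elements`, `timeout`, and `interval` are hypothetical names, and the helper relies only on the documented `page.get_elements_by_css_selector()` call.

```python
import asyncio
import time


async def wait_for_elements(page, selector: str, timeout: float = 5.0, interval: float = 0.25) -> list:
    """Poll page.get_elements_by_css_selector() until it matches or the timeout expires.

    Hypothetical helper (an assumption for illustration), not a browser-use API.
    """
    deadline = time.monotonic() + timeout
    while True:
        elements = await page.get_elements_by_css_selector(selector)
        if elements:
            return elements
        if time.monotonic() >= deadline:
            raise TimeoutError(f"No element matched {selector!r} within {timeout}s")
        # Yield to the event loop before the next attempt.
        await asyncio.sleep(interval)
```

This only checks that a match exists in the DOM; it does not verify visibility.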
Element Interactions
# Element actions
await element.click(button='left', click_count=1, modifiers=['Control'])
await element.fill("Hello World") # Clears first, then types
await element.hover()
await element.focus()
await element.check() # Toggle checkbox/radio
await element.select_option(["option1", "option2"]) # For dropdown/select
await element.drag_to(target_element) # Drag and drop
# Element properties
value = await element.get_attribute("value")
box = await element.get_bounding_box() # Returns BoundingBox or None
info = await element.get_basic_info() # Comprehensive element info
screenshot_b64 = await element.screenshot(format='jpeg')
# Execute JavaScript on element (this context is the element)
text = await element.evaluate("() => this.textContent")
await element.evaluate("(color) => this.style.backgroundColor = color", "yellow")
classes = await element.evaluate("() => Array.from(this.classList)")
Mouse Operations
# Mouse operations
mouse = await page.mouse
await mouse.click(x=100, y=200, button='left', click_count=1)
await mouse.move(x=300, y=400, steps=1)
await mouse.down(button='left') # Press button
await mouse.up(button='left') # Release button
await mouse.scroll(x=0, y=100, delta_x=0, delta_y=-500) # Scroll at coordinates
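The low-level mouse calls can be combined into a coordinate-based drag. This is a rough sketch under assumptions: `drag` is a hypothetical helper, and it assumes that `steps` makes `move()` travel to the target in several increments, which the signature suggests but this document does not spell out.

```python
async def drag(mouse, start: tuple[int, int], end: tuple[int, int], steps: int = 10) -> None:
    """Press at `start`, move to `end`, then release (hypothetical helper)."""
    await mouse.move(x=start[0], y=start[1])
    await mouse.down(button="left")
    try:
        await mouse.move(x=end[0], y=end[1], steps=steps)
    finally:
        # Release even if the move fails so the button is not left pressed.
        await mouse.up(button="left")


# Usage inside an async function, with `page` obtained from browser.new_page():
#     mouse = await page.mouse
#     await drag(mouse, (100, 200), (300, 400))
```

For dragging one element onto another, `element.drag_to()` is likely the simpler choice.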
Page Operations
# JavaScript evaluation
result = await page.evaluate('() => document.title') # Must use arrow function format
result = await page.evaluate('(x, y) => x + y', 10, 20) # With arguments
# Keyboard input
await page.press("Control+A") # Key combinations supported
await page.press("Escape") # Single keys
# Page controls
await page.set_viewport_size(width=1920, height=1080)
page_screenshot = await page.screenshot() # JPEG by default
page_png = await page.screenshot(format="png", quality=90)
# Page information
url = await page.get_url()
title = await page.get_title()
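Since `page.evaluate()` hands back a string, structured results need decoding on the Python side. One possible approach, sketched here with a hypothetical `evaluate_json` helper, is to call `JSON.stringify` explicitly in the page so the result can always be passed to `json.loads`. It assumes that a string produced in the page is returned to Python unchanged.

```python
import json
from typing import Any


async def evaluate_json(page, expression: str) -> Any:
    """Evaluate a JS expression in the page and decode the result as JSON.

    Hypothetical helper, not a browser-use API. `expression` is inserted
    verbatim into the arrow function, so it must be trusted JS source.
    """
    # `?? null` turns undefined into null so JSON.stringify always yields text.
    raw = await page.evaluate(f"() => JSON.stringify(({expression}) ?? null)")
    return json.loads(raw)


# Usage inside an async function:
#     info = await evaluate_json(page, "{title: document.title, links: document.links.length}")
#     print(info["title"], info["links"])
```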
AI-Powered Features
# Content extraction using LLM
from pydantic import BaseModel
class ProductInfo(BaseModel):
    name: str
    price: float
    description: str
# Extract structured data from current page
products = await page.extract_content(
    "Find all products with their names, prices and descriptions",
    ProductInfo,
    llm=your_llm
)
Core Classes
- BrowserSession (aliased as Browser): Main browser session manager with tab operations
- Page: Represents a single browser tab or iframe for page-level operations
- Element: Individual DOM element for interactions and property access
- Mouse: Mouse operations within a page (click, move, scroll)
API Reference
BrowserSession Methods (Tab Management)
- start() - Initialize and start the browser session
- stop() - Stop the browser session (keeps browser alive)
- kill() - Kill the browser process and reset all state
- new_page(url=None) → Page - Create blank tab or navigate to URL
- get_pages() → list[Page] - Get all available pages
- get_current_page() → Page | None - Get the currently focused page
- close_page(page: Page | str) - Close page by object or ID
- Session management and CDP client operations
Page Methods (Page Operations)
- get_elements_by_css_selector(selector: str) → list[Element] - Find elements by CSS selector
- get_element(backend_node_id: int) → Element - Get element by backend node ID
- get_element_by_prompt(prompt: str, llm) → Element | None - AI-powered element finding
- must_get_element_by_prompt(prompt: str, llm) → Element - AI element finding (raises if not found)
- extract_content(prompt: str, structured_output: type[T], llm) → T - Extract structured data using LLM
- goto(url: str) - Navigate this page to URL
- go_back(), go_forward() - Navigate history (with error handling)
- reload() - Reload the current page
- evaluate(page_function: str, *args) → str - Execute JavaScript (MUST use (...args) => format)
- press(key: str) - Press key on page (supports "Control+A" format)
- set_viewport_size(width: int, height: int) - Set viewport dimensions
- screenshot(format='jpeg', quality=None) → str - Take page screenshot, return base64
- get_url() → str, get_title() → str - Get page information
- mouse → Mouse - Get mouse interface for this page
Element Methods (DOM Interactions)
- click(button='left', click_count=1, modifiers=None) - Click element with advanced fallbacks
- fill(text: str, clear=True) - Fill input with text (clears first by default)
- hover() - Hover over element
- focus() - Focus the element
- check() - Toggle checkbox/radio button (clicks to change state)
- select_option(values: str | list[str]) - Select dropdown options
- drag_to(target_element: Element | Position, source_position=None, target_position=None) - Drag to target element
- evaluate(page_function: str, *args) → str - Execute JavaScript on element (this = element)
- get_attribute(name: str) → str | None - Get attribute value
- get_bounding_box() → BoundingBox | None - Get element position/size
- screenshot(format='jpeg', quality=None) → str - Take element screenshot, return base64
- get_basic_info() → ElementInfo - Get comprehensive element information
Mouse Methods (Coordinate-Based Operations)
- click(x: int, y: int, button='left', click_count=1) - Click at coordinates
- move(x: int, y: int, steps=1) - Move to coordinates
- down(button='left', click_count=1), up(button='left', click_count=1) - Press/release button
- scroll(x=0, y=0, delta_x=None, delta_y=None) - Scroll page at coordinates
Type Definitions
Position
class Position(TypedDict):
    x: float
    y: float
BoundingBox
class BoundingBox(TypedDict):
    x: float
    y: float
    width: float
    height: float
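A bounding box can be turned into a click target for the coordinate-based mouse methods. The sketch below repeats the two type definitions so it runs on its own; `center_of` and `click_point` are hypothetical helpers, and rounding to integers is an assumption based on the `int` parameters shown for the mouse methods.

```python
from typing import TypedDict


class Position(TypedDict):
    x: float
    y: float


class BoundingBox(TypedDict):
    x: float
    y: float
    width: float
    height: float


def center_of(box: BoundingBox) -> Position:
    """Return the midpoint of a bounding box."""
    return {"x": box["x"] + box["width"] / 2, "y": box["y"] + box["height"] / 2}


def click_point(box: BoundingBox) -> tuple[int, int]:
    """Round the midpoint to whole pixels for coordinate-based mouse calls."""
    center = center_of(box)
    return round(center["x"]), round(center["y"])


# Usage inside an async function:
#     box = await element.get_bounding_box()
#     if box is not None:
#         x, y = click_point(box)
#         await mouse.click(x=x, y=y)
```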
ElementInfo
class ElementInfo(TypedDict):
    backendNodeId: int # CDP backend node ID
    nodeId: int | None # CDP node ID
    nodeName: str # HTML tag name (e.g., "DIV", "INPUT")
    nodeType: int # DOM node type
    nodeValue: str | None # Text content for text nodes
    attributes: dict[str, str] # HTML attributes
    boundingBox: BoundingBox | None # Element position and size
    error: str | None # Error message if info retrieval failed
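Because `error` is carried inside the returned dictionary, callers may want to check it before trusting the other fields. A small sketch, with `describe_element` as a hypothetical helper that builds a short label from the fields listed above:

```python
def describe_element(info: dict) -> str:
    """Return a label such as 'INPUT#email', or raise if info retrieval failed.

    Hypothetical helper; `info` is expected to have the ElementInfo shape.
    """
    if info.get("error"):
        raise RuntimeError(f"get_basic_info() reported an error: {info['error']}")
    label = info["nodeName"]
    element_id = info.get("attributes", {}).get("id")
    if element_id:
        label += f"#{element_id}"
    return label


# Usage inside an async function:
#     info = await element.get_basic_info()
#     print(describe_element(info))
```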
Important Usage Notes
This is the browser-use actor API, NOT Playwright or Selenium. Only use the methods documented above.
Critical JavaScript Rules
- page.evaluate() and element.evaluate() MUST use the (...args) => {} arrow function format
- Always returns a string (objects are JSON-stringified automatically)
- Use single quotes around the function: page.evaluate('() => document.title')
- For complex selectors in JS: '() => document.querySelector("input[name=\\"email\\"]")'
- element.evaluate(): the this context is bound to the element automatically
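Hand-escaping quotes inside a selector is easy to get wrong. One way to avoid it, sketched here with a hypothetical `query_selector_js` helper, is to let `json.dumps` produce the JavaScript string literal; JSON string syntax is valid JavaScript, so inner quotes and backslashes come out escaped correctly.

```python
import json


def query_selector_js(selector: str) -> str:
    """Build an arrow-function string that calls document.querySelector(selector).

    Hypothetical helper, not a browser-use API.
    """
    # json.dumps returns a double-quoted literal with inner quotes escaped.
    return f"() => document.querySelector({json.dumps(selector)})"


# Usage inside an async function:
#     js = query_selector_js('input[name="email"]')
#     result = await page.evaluate(js)
```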
Method Restrictions
- get_elements_by_css_selector() returns immediately (no automatic waiting)
- For dropdowns: use element.select_option(), NOT element.fill()
- Form submission: click submit button or use page.press("Enter")
- No methods like: element.submit(), element.dispatch_event(), element.get_property()
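With no `element.submit()`, a form is typically filled field by field and then submitted with a key press. The sketch below uses only the documented `get_elements_by_css_selector()`, `fill()`, and `press()` calls; `fill_and_submit` is a hypothetical helper, and it assumes keyboard focus stays in the last filled field so that Enter submits the form.

```python
async def fill_and_submit(page, fields: dict[str, str]) -> None:
    """Fill the first match of each CSS selector, then press Enter to submit.

    Hypothetical helper; `fields` maps CSS selectors to the text to enter.
    """
    for selector, value in fields.items():
        elements = await page.get_elements_by_css_selector(selector)
        if not elements:
            raise LookupError(f"No element matched {selector!r}")
        await elements[0].fill(value)
    await page.press("Enter")


# Usage inside an async function:
#     await fill_and_submit(page, {"input[name='q']": "browser automation"})
```

If Enter does not submit a particular form, clicking its submit button is the documented alternative.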
Error Prevention
- Always verify page state changes with page.get_url(), page.get_title()
- Use element.get_attribute() to check element properties
- Validate CSS selectors before use
- Handle navigation timing with appropriate asyncio.sleep() calls
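Verifying state and handling timing can be combined: rather than one fixed sleep, a short loop can sleep in small steps until `page.get_url()` reports a change. This is a sketch under assumptions; `wait_for_url_change` and its parameters are hypothetical, and a URL change is only a rough signal that navigation happened.

```python
import asyncio
import time


async def wait_for_url_change(page, previous_url: str, timeout: float = 10.0, interval: float = 0.2) -> str:
    """Sleep in short steps until page.get_url() differs from previous_url.

    Hypothetical helper, not a browser-use API. Returns the new URL.
    """
    deadline = time.monotonic() + timeout
    while True:
        url = await page.get_url()
        if url != previous_url:
            return url
        if time.monotonic() >= deadline:
            raise TimeoutError(f"URL still {previous_url!r} after {timeout}s")
        await asyncio.sleep(interval)


# Usage inside an async function:
#     before = await page.get_url()
#     await element.click()
#     after = await wait_for_url_change(page, before)
```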
AI Features
- get_element_by_prompt() and extract_content() require an LLM instance
- These methods use DOM analysis and structured output parsing
- Best for complex page understanding and data extraction tasks
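Since `get_element_by_prompt()` returns None when nothing is found, it can be paired with a plain CSS lookup as a fallback. A minimal sketch, where `find_with_fallback` and `fallback_selector` are hypothetical names and `llm` is whatever LLM instance the session already uses:

```python
async def find_with_fallback(page, prompt: str, llm, fallback_selector: str):
    """Try the AI-powered lookup first, then fall back to a CSS selector.

    Hypothetical helper. Returns an element, or None if both lookups miss.
    """
    element = await page.get_element_by_prompt(prompt, llm=llm)
    if element is not None:
        return element
    matches = await page.get_elements_by_css_selector(fallback_selector)
    return matches[0] if matches else None


# Usage inside an async function:
#     button = await find_with_fallback(page, "search button", your_llm, "button[type='submit']")
```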