WitNote / docs /mcp.md
AUXteam's picture
Upload folder using huggingface_hub
6a7089a verified

MCP Server

PinchTab includes a native Model Context Protocol (MCP) server that lets AI agents control the browser directly through the standardized MCP interface.

Quick Start

  1. Start PinchTab in any mode (server or bridge):

    pinchtab server
    #or
    pinchtab daemon install
    
  2. Start the MCP server (in a separate terminal or from your MCP client config):

    pinchtab mcp
    

The MCP server communicates over stdio (stdin/stdout with JSON-RPC 2.0), which is the standard transport for MCP tools.

Client Configuration

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "pinchtab": {
      "command": "pinchtab",
      "args": ["mcp"]
    }
  }
}

VS Code / GitHub Copilot

Create .vscode/mcp.json in your workspace:

{
  "servers": {
    "pinchtab": {
      "type": "stdio",
      "command": "pinchtab",
      "args": ["mcp"]
    }
  }
}

Cursor

Add to your Cursor MCP settings:

{
  "mcpServers": {
    "pinchtab": {
      "command": "pinchtab",
      "args": ["mcp"]
    }
  }
}

Environment Variables

Variable Default Description
PINCHTAB_TOKEN (from config) Auth token for secured servers

For remote servers, use the --server flag: pinchtab --server http://remote:9867 mcp

Available Tools

Navigation (4 tools)

Tool Description
pinchtab_navigate Navigate to a URL
pinchtab_snapshot Get accessibility tree snapshot
pinchtab_screenshot Take a screenshot (base64 JPEG or PNG)
pinchtab_get_text Extract readable text content

Interaction (8 tools)

Tool Description
pinchtab_click Click an element by ref
pinchtab_type Type text into an input
pinchtab_press Press a keyboard key
pinchtab_hover Hover over an element
pinchtab_focus Focus an element
pinchtab_select Select a dropdown option
pinchtab_scroll Scroll page or element
pinchtab_fill Fill input via JS dispatch

Content (3 tools)

Tool Description
pinchtab_eval Execute JavaScript
pinchtab_pdf Export page as PDF
pinchtab_find Find elements by text or CSS

Tab Management (4 tools)

Tool Description
pinchtab_list_tabs List open browser tabs
pinchtab_close_tab Close a tab
pinchtab_health Check server health
pinchtab_cookies Get page cookies

Utility (2 tools)

Tool Description
pinchtab_wait Wait for N milliseconds
pinchtab_wait_for_selector Wait for a CSS selector to appear

Tool Reference

pinchtab_navigate

Navigate the browser to a URL.

Parameter Type Required Description
url string Yes URL to navigate to
tabId string No Target tab (uses current if empty)

pinchtab_snapshot

Get an accessibility tree snapshot of the page. This is the primary way agents understand page structure.

Parameter Type Required Description
tabId string No Target tab
interactive boolean No Only interactive elements
compact boolean No Compact format (fewer tokens)
diff boolean No Only changes since last snapshot
selector string No CSS selector to scope

pinchtab_screenshot

Capture a screenshot of the page. Defaults to JPEG.

Parameter Type Required Description
tabId string No Target tab
format string No jpeg (default) or png
quality number No JPEG quality 0-100

pinchtab_get_text

Extract readable text from the page.

Parameter Type Required Description
tabId string No Target tab
raw boolean No Raw text without formatting

pinchtab_click

Click an element identified by its ref from a snapshot.

Parameter Type Required Description
ref string Yes Element ref (e.g., e5)
tabId string No Target tab

pinchtab_type

Type text into an input element.

Parameter Type Required Description
ref string Yes Element ref
text string Yes Text to type
tabId string No Target tab

pinchtab_press

Press a keyboard key.

Parameter Type Required Description
key string Yes Key name (Enter, Tab, Escape, etc.)
tabId string No Target tab

pinchtab_hover

Hover over an element.

Parameter Type Required Description
ref string Yes Element ref
tabId string No Target tab

pinchtab_focus

Focus an element.

Parameter Type Required Description
ref string Yes Element ref
tabId string No Target tab

pinchtab_select

Select a dropdown option.

Parameter Type Required Description
ref string Yes Select element ref
value string Yes Option value
tabId string No Target tab

pinchtab_scroll

Scroll the page or a specific element.

Parameter Type Required Description
ref string No Element ref (omit for page scroll)
pixels number No Pixels to scroll (positive=down)
tabId string No Target tab

pinchtab_fill

Fill an input using JavaScript dispatch (works with React/Vue/Angular).

Parameter Type Required Description
ref string Yes Element ref or CSS selector
value string Yes Value to fill
tabId string No Target tab

pinchtab_eval

Execute JavaScript in the browser. Requires security.allowEvaluate: true in config.

Parameter Type Required Description
expression string Yes JavaScript expression
tabId string No Target tab

pinchtab_pdf

Export the page as a PDF document.

Parameter Type Required Description
tabId string No Target tab
landscape boolean No Landscape orientation
scale number No Print scale 0.1-2.0
pageRanges string No Pages (e.g., "1-3,5")

pinchtab_find

Find elements by text content or CSS selector using semantic matching.

Parameter Type Required Description
query string Yes Text or CSS selector
tabId string No Target tab

pinchtab_list_tabs

List all open browser tabs. No parameters.

pinchtab_close_tab

Close a browser tab.

Parameter Type Required Description
tabId string No Tab to close (current if empty)

pinchtab_health

Check server health. No parameters.

pinchtab_cookies

Get cookies for the current page.

Parameter Type Required Description
tabId string No Target tab

pinchtab_wait

Wait for a specified duration.

Parameter Type Required Description
ms number Yes Milliseconds (max 30000)

pinchtab_wait_for_selector

Wait for a CSS selector to appear on the page. Polls every 250ms.

Parameter Type Required Description
selector string Yes CSS selector
timeout number No Timeout ms (default 10000, max 30000)
tabId string No Target tab

Typical Agent Workflow

1. pinchtab_navigate β†’ go to URL
2. pinchtab_snapshot  β†’ understand page structure
3. pinchtab_click     β†’ interact with elements
4. pinchtab_type      β†’ fill in forms
5. pinchtab_snapshot  β†’ verify changes
6. pinchtab_get_text  β†’ extract results

Architecture

The MCP server is a thin stdio-based JSON-RPC layer that translates MCP tool calls into HTTP requests to a running PinchTab instance:

LLM Client ──stdio──▢ pinchtab mcp ──HTTP──▢ PinchTab Server

This architecture means:

  • The MCP server works with any PinchTab deployment (local, remote, Docker)
  • No direct Chrome dependency β€” the server delegates to PinchTab
  • Built with mcp-go SDK (MCP spec 2025-11-25)

Code Layout

internal/mcp/
β”œβ”€β”€ server.go      # MCP server setup and stdio transport
β”œβ”€β”€ tools.go       # Tool definitions with JSON schemas
β”œβ”€β”€ handlers.go    # Tool handler implementations
└── client.go      # HTTP client for PinchTab API

cmd/pinchtab/
└── cmd_mcp.go     # `pinchtab mcp` subcommand