Type what you want in natural language. Crab-Agent navigates pages, clicks elements, fills forms, reads content, manages tabs — all autonomously via Chrome DevTools Protocol. Bring your own API key. Works with any LLM provider.
A complete agent toolkit built into a Chrome side panel. No coding required.
Uses Chromium's native Accessibility tree with ref IDs (covers Shadow DOM, cross-origin iframes, custom elements). Pixel-perfect coordinates from the renderer.
Click, type, scroll, drag, navigate, open tabs, fill forms, upload files, execute JavaScript — hardware-level events through Chrome DevTools Protocol, not synthetic JS.
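To illustrate what "hardware-level events" can look like in practice, here is a minimal sketch of a left click expressed as Chrome DevTools Protocol commands. `Input.dispatchMouseEvent` is a real CDP method; the helper name `buildClickCommands` and the `CdpCommand` type are hypothetical and are not taken from Crab-Agent's source.

```typescript
// Hypothetical shape for a CDP command; not Crab-Agent's actual type.
type CdpCommand = { method: string; params: Record<string, unknown> };

// Builds the move / press / release sequence for a single left click
// at viewport coordinates (x, y).
function buildClickCommands(x: number, y: number): CdpCommand[] {
  const press = { x, y, button: "left", clickCount: 1 };
  return [
    { method: "Input.dispatchMouseEvent", params: { type: "mouseMoved", x, y } },
    { method: "Input.dispatchMouseEvent", params: { ...press, type: "mousePressed" } },
    { method: "Input.dispatchMouseEvent", params: { ...press, type: "mouseReleased" } },
  ];
}

// In an extension with the "debugger" permission, each command could be sent
// after chrome.debugger.attach({ tabId }, "1.3") with:
//   await chrome.debugger.sendCommand({ tabId }, cmd.method, cmd.params);
```

Because the events are dispatched by the browser itself rather than created in page JavaScript, the page sees them as trusted input.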
Uses each provider's native tool-use API (Anthropic tool_use, OpenAI function calling, Gemini function_declarations). Structured responses, not text JSON parsing.
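As a rough sketch of what native tool-use means across providers, one internal tool definition can be converted into the shape each API expects. The `ToolSpec` type and the three converter functions are hypothetical names used for illustration; the output field names follow the public Anthropic, OpenAI, and Gemini request formats, and schema dialect differences between providers are ignored here.

```typescript
// Hypothetical provider-neutral tool definition (JSON Schema parameters).
interface ToolSpec {
  name: string;
  description: string;
  parameters: Record<string, unknown>;
}

// Anthropic Messages API: tools carry an `input_schema`.
function toAnthropic(t: ToolSpec) {
  return { name: t.name, description: t.description, input_schema: t.parameters };
}

// OpenAI Chat Completions: tools are wrapped as `type: "function"`.
function toOpenAI(t: ToolSpec) {
  return {
    type: "function" as const,
    function: { name: t.name, description: t.description, parameters: t.parameters },
  };
}

// Gemini: all declarations are grouped under one `function_declarations` list.
function toGemini(tools: ToolSpec[]) {
  return {
    function_declarations: tools.map((t) => ({
      name: t.name,
      description: t.description,
      parameters: t.parameters,
    })),
  };
}

// Example tool, assumed for illustration only.
const clickTool: ToolSpec = {
  name: "click",
  description: "Click the element with the given ref ID.",
  parameters: {
    type: "object",
    properties: { ref: { type: "string" } },
    required: ["ref"],
  },
};
```

The model then replies with a structured tool call (name plus arguments) instead of free text that has to be parsed as JSON.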
Record every task as a replay in GIF, HTML, or JSON format. Review what the agent did, debug failed flows, or share runs as visual artifacts.
Schedule tasks for the future — one-time or recurring. Natural language time parsing via LLM. The agent runs them automatically via Chrome alarms.
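A scheduled task could map onto the `chrome.alarms` API roughly as follows. This is a sketch under assumptions: the `ScheduledTask` shape, the `task:` name prefix, and the helper names are hypothetical, while `when` and `periodInMinutes` are real fields accepted by `chrome.alarms.create`.

```typescript
// Hypothetical task record, e.g. produced after the LLM parses
// "every day at 9am" into a concrete time and interval.
interface ScheduledTask {
  id: string;
  prompt: string;
  runAt: Date;
  repeatEveryMinutes?: number; // omitted for one-time tasks
}

// Subset of the options object accepted by chrome.alarms.create.
interface AlarmInfo {
  when: number; // milliseconds since the epoch
  periodInMinutes?: number;
}

function alarmName(task: ScheduledTask): string {
  return "task:" + task.id;
}

function toAlarmInfo(task: ScheduledTask): AlarmInfo {
  const info: AlarmInfo = { when: task.runAt.getTime() };
  if (task.repeatEveryMinutes !== undefined) {
    info.periodInMinutes = task.repeatEveryMinutes;
  }
  return info;
}

// In the MV3 service worker (requires the "alarms" permission):
//   chrome.alarms.create(alarmName(task), toAlarmInfo(task));
//   chrome.alarms.onAlarm.addListener((alarm) => {
//     if (alarm.name.startsWith("task:")) { /* look up and run the task */ }
//   });
```

Alarms survive service worker shutdowns, which is why they suit a Manifest V3 extension better than `setTimeout`.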
Domain-based permissions keep you in control. Smart message compaction with progressive token budgeting keeps long sessions stable.
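One simple way to picture a domain-based permission check is an allow list matched against the page's hostname. The sketch below is an assumption about how such a check might work, not Crab-Agent's actual rules; `isDomainAllowed` is a hypothetical name.

```typescript
// Returns true if the URL's hostname equals an allowed domain
// or is a subdomain of one. Invalid URLs are rejected.
function isDomainAllowed(url: string, allowedDomains: string[]): boolean {
  let host: string;
  try {
    host = new URL(url).hostname.toLowerCase();
  } catch {
    return false;
  }
  return allowedDomains.some((domain) => {
    const d = domain.toLowerCase();
    // The "." prefix prevents "evil-example.com" from matching "example.com".
    return host === d || host.endsWith("." + d);
  });
}
```

An agent could run a check like this before executing any tool call that touches a page, and ask the user when the domain is not yet on the list.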
A tool-use agent loop that keeps going until the task is done.
Open the side panel and type what you want in plain language. Attach screenshots if needed. "Book the cheapest flight to Tokyo for next Friday"
Crab-Agent takes a screenshot and pulls the native CDP Accessibility tree, mapping every interactive element to a ref ID (e.g., ref_42) with pixel-perfect coordinates.
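The ref ID mapping can be pictured with a small sketch. It assumes nodes shaped like those returned by the CDP method `Accessibility.getFullAXTree` and box quads like those returned by `DOM.getBoxModel`; the role list, the `AXNodeLite` type, and the function names are hypothetical simplifications.

```typescript
// Simplified view of a CDP AXNode (only the fields used here).
interface AXNodeLite {
  nodeId: string;
  ignored: boolean;
  role?: { value: string };
  name?: { value: string };
  backendDOMNodeId?: number;
}

interface RefEntry {
  ref: string;
  role: string;
  name: string;
  backendDOMNodeId: number;
}

// Assumed set of roles worth exposing to the model.
const INTERACTIVE_ROLES = new Set([
  "button", "link", "textbox", "checkbox", "radio", "combobox", "menuitem", "tab",
]);

// Assigns sequential ref IDs (ref_1, ref_2, ...) to interactive nodes.
function assignRefs(nodes: AXNodeLite[]): RefEntry[] {
  const refs: RefEntry[] = [];
  for (const node of nodes) {
    const role = node.role?.value ?? "";
    if (node.ignored || node.backendDOMNodeId === undefined) continue;
    if (!INTERACTIVE_ROLES.has(role)) continue;
    refs.push({
      ref: "ref_" + (refs.length + 1),
      role,
      name: node.name?.value ?? "",
      backendDOMNodeId: node.backendDOMNodeId,
    });
  }
  return refs;
}

// Center of a CDP quad [x1, y1, x2, y2, x3, y3, x4, y4], usable as a click target.
function quadCenter(quad: number[]): { x: number; y: number } {
  let x = 0;
  let y = 0;
  for (let i = 0; i < 8; i += 2) {
    x += quad[i] ?? 0;
    y += quad[i + 1] ?? 0;
  }
  return { x: x / 4, y: y / 4 };
}
```

The model only ever sees the ref IDs, roles, and names; the extension resolves a ref back to a DOM node and coordinates when a tool call arrives.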
The conversation (including visual context) is sent to your chosen LLM via native tool-calling APIs. It selects a tool — click, type, navigate, read — and the extension executes it via CDP.
The result is appended to the conversation and the loop continues. A state manager handles loop detection, and the agent handles multi-step flows, tab switching, and error recovery automatically.
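The steps above can be sketched as a small loop. This is an illustrative outline under assumptions, not Crab-Agent's implementation: `decide` stands in for the LLM call, `execute` for the CDP tool runner, and the loop detector here only flags identical consecutive tool calls. All names are hypothetical.

```typescript
type ToolCall = { name: string; args: Record<string, unknown> };
type Decision = { done: true; answer: string } | { done: false; call: ToolCall };

// Flags the case where the same tool call repeats `limit` times in a row.
class LoopDetector {
  private last = "";
  private repeats = 0;
  private readonly limit: number;

  constructor(limit = 3) {
    this.limit = limit;
  }

  record(call: ToolCall): boolean {
    const key = call.name + ":" + JSON.stringify(call.args);
    this.repeats = key === this.last ? this.repeats + 1 : 1;
    this.last = key;
    return this.repeats >= this.limit;
  }
}

async function runAgent(
  decide: (history: string[]) => Promise<Decision>,
  execute: (call: ToolCall) => Promise<string>,
  maxSteps = 50,
): Promise<string> {
  const history: string[] = [];
  const detector = new LoopDetector();
  for (let step = 0; step < maxSteps; step++) {
    const decision = await decide(history);
    if (decision.done) return decision.answer;
    if (detector.record(decision.call)) {
      // Skip the repeated action and tell the model to change course.
      history.push("Loop detected: try a different approach.");
      continue;
    }
    // Append the tool result so the next decision can see it.
    history.push(await execute(decision.call));
  }
  return "Stopped: step limit reached.";
}
```

The step limit is a second safety net: even if the detector misses a more subtle cycle, the loop still terminates.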
22 external tools + 2 internal. The agent picks the right tool for each step.
React 18 + TypeScript + Vite. Chrome MV3 service worker. Multi-provider LLM gateway.
Free to use. Bring your own API key and let Clawd handle the rest.
Crab-Agent wouldn't exist without these amazing projects and their creators.
The adorable "Clawd" crab pixel-art mascot and SVG animations used throughout Crab-Agent are derived from the Clawd Tank project by Marcio Granzotto. Thank you for the amazing crab character!
Core agent loop architecture and browser automation logic inspired by Anthropic's Claude for Chrome. The tool-use cycle pattern — screenshot, observe, decide, act — draws heavily from their pioneering work on AI browser agents.