| --- |
| summary: "Firecrawl fallback for web_fetch (anti-bot + cached extraction)" |
| read_when: |
| - You want Firecrawl-backed web extraction |
| - You need a Firecrawl API key |
| - You want anti-bot extraction for web_fetch |
| title: "Firecrawl" |
| --- |
| |
| # Firecrawl |
|
|
| OpenClaw can use **Firecrawl** as a fallback extractor for `web_fetch`. It is a hosted |
| content extraction service that supports bot circumvention and caching, which helps |
| with JS-heavy sites or pages that block plain HTTP fetches. |
|
|
| ## Get an API key |
|
|
| 1. Create a Firecrawl account and generate an API key. |
| 2. Store it in config as `tools.web.fetch.firecrawl.apiKey`, or set `FIRECRAWL_API_KEY` in the gateway environment. |
|
|
| ## Configure Firecrawl |
|
|
| ```json5 |
| { |
|   tools: { |
|     web: { |
|       fetch: { |
|         firecrawl: { |
|           apiKey: "FIRECRAWL_API_KEY_HERE", |
|           baseUrl: "https://api.firecrawl.dev", |
|           onlyMainContent: true, |
|           maxAgeMs: 172800000, |
|           timeoutSeconds: 60, |
|         }, |
|       }, |
|     }, |
|   }, |
| } |
| ``` |
|
|
| Notes: |
|
|
| - `firecrawl.enabled` defaults to `true`; set it to `false` to turn the Firecrawl fallback off. |
| - Firecrawl fallback attempts run only when an API key is available (`tools.web.fetch.firecrawl.apiKey` or `FIRECRAWL_API_KEY`). |
| - `maxAgeMs` sets the maximum age, in milliseconds, of a cached result that Firecrawl may return. The default is 172800000 ms (2 days). |
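
The notes above can be read as a small resolution routine. The following is a minimal sketch in Python, not OpenClaw's actual code: the function name `resolve_firecrawl` and the returned dictionary shape are hypothetical, and the choice to prefer the config key over the environment variable is an assumption, since this page does not state a precedence.

```python
import os
from typing import Mapping, Optional

# 2 days expressed in milliseconds (matches the documented default).
DEFAULT_MAX_AGE_MS = 2 * 24 * 60 * 60 * 1000


def resolve_firecrawl(config: Mapping, env: Mapping[str, str] = os.environ) -> Optional[dict]:
    """Return effective Firecrawl settings, or None if the fallback should be skipped.

    Hypothetical helper for illustration only.
    """
    fc = config.get("tools", {}).get("web", {}).get("fetch", {}).get("firecrawl", {})

    # `enabled` defaults to true unless explicitly set to false.
    if fc.get("enabled", True) is False:
        return None

    # Assumption: the config value wins over the environment variable.
    api_key = fc.get("apiKey") or env.get("FIRECRAWL_API_KEY")
    if not api_key:
        # No key available, so no Firecrawl attempt is made.
        return None

    return {"apiKey": api_key, "maxAgeMs": fc.get("maxAgeMs", DEFAULT_MAX_AGE_MS)}
```

With an empty config and `FIRECRAWL_API_KEY` set in the environment, this sketch returns the key together with the 2-day default for `maxAgeMs`; with no key anywhere, it returns `None`.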
|
|
| ## Stealth / bot circumvention |
|
|
| Firecrawl exposes a **proxy mode** parameter for bot circumvention (`basic`, `stealth`, or `auto`). |
| OpenClaw always sends `proxy: "auto"` and `storeInCache: true` with Firecrawl requests. |
| `auto` is also Firecrawl's default when `proxy` is omitted. In `auto` mode, Firecrawl retries with stealth proxies |
| if the basic attempt fails, which may use more credits than basic-only scraping. |
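
To make the fixed parameters concrete, here is a hedged Python sketch of how a request body might be assembled from the config shown earlier. Only `proxy: "auto"` and `storeInCache: true` come from this page. The function name `build_scrape_body` is hypothetical, the body field names `formats`, `maxAge`, and `timeout` (in milliseconds) are assumptions about Firecrawl's request shape, and the fallback values for `onlyMainContent` and `timeoutSeconds` simply reuse the example config rather than documented defaults.

```python
def build_scrape_body(url: str, settings: dict) -> dict:
    """Assemble a Firecrawl-style scrape request body (illustrative only)."""
    return {
        "url": url,
        # Assumed field: ask for markdown output.
        "formats": ["markdown"],
        # Fallback value taken from the example config, not a documented default.
        "onlyMainContent": settings.get("onlyMainContent", True),
        # Assumed field name for the cache-age limit, in milliseconds.
        "maxAge": settings.get("maxAgeMs", 172800000),
        # Assumed to be milliseconds, so convert from the config's seconds.
        "timeout": settings.get("timeoutSeconds", 60) * 1000,
        # These two are fixed regardless of config, per this page.
        "proxy": "auto",
        "storeInCache": True,
    }
```

Note that in this sketch nothing in `settings` can override `proxy` or `storeInCache`, which mirrors the "always" in the text above.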
|
|
| ## How `web_fetch` uses Firecrawl |
| |
| `web_fetch` extraction order: |
|
|
| 1. Readability (local) |
| 2. Firecrawl (if configured) |
| 3. Basic HTML cleanup (last fallback) |
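
The extraction order above is a plain fallback chain. A minimal Python sketch of that pattern follows; the name `extract_with_fallback` and the zero-argument extractor callables are hypothetical, and treating an exception or an empty result as "try the next extractor" is an assumption, not a description of OpenClaw's internals.

```python
from typing import Callable, Optional, Sequence, Tuple

# Each extractor is a zero-argument callable returning text, or None on failure.
Extractor = Callable[[], Optional[str]]


def extract_with_fallback(
    extractors: Sequence[Tuple[str, Extractor]],
) -> Optional[Tuple[str, str]]:
    """Try extractors in order; return (name, text) from the first that succeeds."""
    for name, extractor in extractors:
        try:
            text = extractor()
        except Exception:
            # Assumption: a failing extractor is skipped rather than aborting the fetch.
            text = None
        if text:
            return name, text
    return None
```

Here the Firecrawl step would simply be left out of the sequence when it is not configured, so the chain falls straight through from Readability to basic HTML cleanup.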
|
|
| See [Web tools](/tools/web) for the full web tool setup. |
|
|