	---
	summary: "Firecrawl fallback for web_fetch (anti-bot + cached extraction)"
	read_when:
	- You want Firecrawl-backed web extraction
	- You need a Firecrawl API key
	- You want anti-bot extraction for web_fetch
	title: "Firecrawl"
	---

	# Firecrawl

	OpenClaw can use Firecrawl as a fallback extractor for `web_fetch`. Firecrawl is a hosted
	content extraction service that supports bot circumvention and caching, which helps
	with JS-heavy sites or pages that block plain HTTP fetches.

	## Get an API key

	1. Create a Firecrawl account and generate an API key.
	2. Store it in config (`tools.web.fetch.firecrawl.apiKey`) or set `FIRECRAWL_API_KEY` in the gateway environment.

	## Configure Firecrawl

	```json5
	{
	  tools: {
	    web: {
	      fetch: {
	        firecrawl: {
	          apiKey: "FIRECRAWL_API_KEY_HERE",
	          baseUrl: "https://api.firecrawl.dev",
	          onlyMainContent: true,
	          maxAgeMs: 172800000,
	          timeoutSeconds: 60,
	        },
	      },
	    },
	  },
	}
	```

	Notes:

	- `firecrawl.enabled` defaults to `true` unless explicitly set to `false`.
	- Firecrawl fallback attempts run only when an API key is available (`tools.web.fetch.firecrawl.apiKey` or `FIRECRAWL_API_KEY`).
	- `maxAgeMs` sets the maximum age, in milliseconds, of cached results that Firecrawl may return. The default is 172800000 (2 days).
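
The notes above amount to a small gating rule: the fallback runs only when Firecrawl is not explicitly disabled and an API key can be found. The following is a minimal Python sketch of that rule, not OpenClaw's actual implementation. The function names `resolve_api_key` and `firecrawl_active` are hypothetical, and the precedence of the config value over the environment variable is an assumption, since the notes do not say which one wins when both are set.

```python
import os
from typing import Any, Mapping, Optional

# 2 days expressed in milliseconds, matching the documented maxAgeMs default.
TWO_DAYS_MS = 2 * 24 * 60 * 60 * 1000


def resolve_api_key(firecrawl_cfg: Mapping[str, Any],
                    env: Mapping[str, str]) -> Optional[str]:
    """Return the key from config, else from FIRECRAWL_API_KEY, else None.

    Assumption: the config value takes precedence when both are set.
    """
    return firecrawl_cfg.get("apiKey") or env.get("FIRECRAWL_API_KEY") or None


def firecrawl_active(firecrawl_cfg: Mapping[str, Any],
                     env: Mapping[str, str] = os.environ) -> bool:
    """True when Firecrawl is not explicitly disabled and a key is available."""
    # `enabled` counts as true unless it is explicitly set to False.
    enabled = firecrawl_cfg.get("enabled", True) is not False
    return enabled and resolve_api_key(firecrawl_cfg, env) is not None
```

For example, `firecrawl_active({"enabled": False, "apiKey": "k"}, {})` is `False`, while `firecrawl_active({}, {"FIRECRAWL_API_KEY": "k"})` is `True`.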

	## Stealth / bot circumvention

	Firecrawl exposes a `proxy` request parameter for bot circumvention (`basic`, `stealth`, or `auto`).
	OpenClaw always sends `proxy: "auto"` plus `storeInCache: true` with Firecrawl requests.
	If `proxy` is omitted, Firecrawl also defaults to `auto`. In `auto` mode, Firecrawl retries with stealth proxies when a basic attempt fails,
	which may use more credits than basic-only scraping.
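
To make the fixed request options concrete, here is a hedged Python sketch of how a request body could be assembled from the config. Only `proxy: "auto"` and `storeInCache: true` come from the text above. The helper name `build_scrape_payload`, the `url` field, the mapping of `maxAgeMs` to a request field called `maxAge`, and the `onlyMainContent` fallback value are assumptions made for illustration. The sketch builds a dictionary only and sends nothing over the network.

```python
from typing import Any, Dict


def build_scrape_payload(url: str, firecrawl_cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Assemble a request body for one Firecrawl extraction (illustrative only)."""
    return {
        "url": url,  # assumed field name
        # Assumed fallback: mirror the value shown in the example config.
        "onlyMainContent": firecrawl_cfg.get("onlyMainContent", True),
        # Assumed mapping: maxAgeMs from config, 2 days when unset.
        "maxAge": firecrawl_cfg.get("maxAgeMs", 172_800_000),
        # These two values are fixed and are not read from config.
        "proxy": "auto",
        "storeInCache": True,
    }
```

Because `proxy` and `storeInCache` are hard-coded in this sketch, a stray `proxy` key in the config has no effect, which mirrors the "always uses" wording above.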

	## How `web_fetch` uses Firecrawl

	`web_fetch` extraction order:

	1. Readability (local)
	2. Firecrawl (if configured)
	3. Basic HTML cleanup (last fallback)
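
The order above can be sketched as a simple fallback chain. This is a minimal Python illustration under stated assumptions, not the actual `web_fetch` code: the name `extract_with_fallback` and the `Extractor` type are hypothetical, and treating an exception or an empty result as the trigger for the next step is an assumption, since the list does not say what counts as a failed extraction.

```python
from typing import Callable, Optional, Sequence, Tuple

# An extractor takes a page reference (URL or HTML) and returns text or None.
Extractor = Callable[[str], Optional[str]]


def extract_with_fallback(page: str,
                          extractors: Sequence[Tuple[str, Extractor]]) -> Tuple[str, str]:
    """Try each extractor in order; return (name, text) from the first success."""
    for name, extractor in extractors:
        try:
            text = extractor(page)
        except Exception:
            continue  # a failing extractor counts as "no result"; try the next one
        if text and text.strip():
            return name, text
    raise ValueError("no extractor produced content")
```

A caller would pass the steps in the documented order, for example `[("readability", ...), ("firecrawl", ...), ("basic", ...)]`, and leave out the Firecrawl entry when no API key is configured.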

	See [Web tools](/tools/web) for the full web tool setup.