Add Cloudflare /crawl Support

thomasumstattd · March 11, 2026, 1:42pm

Cloudflare Just Launched the /crawl Endpoint – Crawl Entire Websites with ONE API Call (Open Beta)

Cloudflare dropped a game-changer yesterday (March 10, 2026): the new Browser Rendering /crawl endpoint.

Unlike Markdown for Agents, this Cloudflare feature is free and enabled automatically. Most Pickaxe users already have Cloudflare, so this feature already works on their websites. It’s just a matter of adding it to the

This is huge for anyone building AI agents, RAG systems, knowledge bases, or data ingestion tools inside Pickaxe. You no longer need to roll your own crawler, manage Puppeteer queues, or fight anti-bot measures — just one API call and you get clean, fully rendered content from an entire site (or any section of it).

What the /crawl Endpoint Actually Does

Submit a starting URL and Cloudflare automatically:

Discovers pages via sitemaps (including deeply nested sitemaps like the ones Yoast SEO generates on WordPress), internal links, or both
Renders each page in a real headless browser (JavaScript fully executed) or in fast static mode
Returns the content in the exact format you want (yes, including Markdown)

Jobs are asynchronous (fire the request → get a job ID → poll until done). Results are stored for 14 days.

Why This Is Perfect for Pickaxe

Instant RAG ingestion — Pull clean Markdown or structured JSON from docs sites, blogs, product catalogs, or client websites.
LLM-ready output — Native Markdown support means fewer tokens and better agent performance.
Structured data extraction — Use Workers AI to pull exactly what you need (products, FAQs, pricing, etc.) with a prompt + JSON schema.
Incremental & smart — Only re-crawl changed pages on repeat runs.
Well-behaved bot — Fully respects robots.txt (including Crawl-delay and Sitemap directives). No angry site owners.
Handles real-world sites perfectly — depth up to 100,000 links, wildcards for include/exclude, subdomains, external links, custom headers, auth, etc.

Simple Example (Markdown + JSON crawl)

curl -X POST 'https://api.cloudflare.com/client/v4/accounts/{account_id}/browser-rendering/crawl' \
  -H 'Authorization: Bearer <YOUR_TOKEN>' \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://example.com/docs",
    "formats": ["markdown", "json"],
    "limit": 200,
    "depth": 5,
    "source": "all",
    "render": true
  }'

You instantly get back a job_id. Then poll it:

curl -X GET 'https://api.cloudflare.com/client/v4/accounts/{account_id}/browser-rendering/crawl/{job_id}' \
  -H 'Authorization: Bearer <YOUR_TOKEN>'

When done, the response contains an array of records with your chosen formats + metadata (title, status, final URL, etc.).
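
For anyone wiring this into an agent, here is a minimal Python sketch of the poll step. It is not official client code. The URL shape comes from the curl example above, but the response layout is my assumption: I'm guessing the job state sits under result.status and reads "running" while the crawl is in progress. Check the reference before relying on either. The helper names (http_json, wait_for_crawl) are made up for illustration.

```python
import json
import time
import urllib.request

API_BASE = "https://api.cloudflare.com/client/v4/accounts"


def http_json(method, url, token, body=None):
    """Send one JSON request with urllib and decode the JSON reply."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_crawl(account_id, job_id, token, fetch=http_json,
                   interval=5.0, max_polls=120,
                   pending_statuses=("running",), sleep=time.sleep):
    """Poll the crawl job until it leaves a pending state, then return it.

    ASSUMPTION: the reply looks like {"result": {"status": ...}} and
    "running" means "not finished yet". Adjust pending_statuses to match
    the real API.
    """
    url = f"{API_BASE}/{account_id}/browser-rendering/crawl/{job_id}"
    for _ in range(max_polls):
        result = fetch("GET", url, token).get("result", {})
        if result.get("status") not in pending_statuses:
            return result
        sleep(interval)
    raise TimeoutError(f"crawl job {job_id} still pending after {max_polls} polls")
```

The fetch and sleep arguments are injectable so the loop can be exercised without hitting the network. In real use you would call wait_for_crawl(account_id, job_id, token) and read the records from whatever it returns.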

Key Features & Controls

formats: ["html"], ["markdown"], ["json"], or any combo
source: "all" (default), "sitemaps", or "links"
limit / depth — full control (default limit 10, depth up to 100k)
render: false — super-fast static HTML (no browser cost during beta)
options.includePatterns / excludePatterns — wildcard targeting (e.g. ["**/docs/**"])
jsonOptions — AI-powered extraction (prompt + schema)
modifiedSince / maxAge — incremental crawling
Block images/fonts/stylesheets, custom User-Agent, auth, headers, waitForSelector, etc.
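
To show how these controls fit together, here is a rough Python sketch that assembles a request body from the parameters listed above. The key names (formats, source, limit, depth, render, options.includePatterns, options.excludePatterns, jsonOptions) are the ones from this post. The function name, the Markdown default for formats, and the validation are my own choices, and jsonOptions is passed through untouched because I don't want to guess its inner fields.

```python
from typing import Any, Dict, Optional, Sequence

ALLOWED_FORMATS = {"html", "markdown", "json"}
ALLOWED_SOURCES = {"all", "sitemaps", "links"}


def build_crawl_payload(
    url: str,
    formats: Sequence[str] = ("markdown",),  # my default, not the API's
    limit: int = 10,                         # default limit per the list above
    depth: Optional[int] = None,
    source: str = "all",
    render: bool = True,
    include_patterns: Optional[Sequence[str]] = None,
    exclude_patterns: Optional[Sequence[str]] = None,
    json_options: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a /crawl request body, rejecting values the post doesn't list."""
    unknown = set(formats) - ALLOWED_FORMATS
    if unknown:
        raise ValueError(f"unsupported formats: {sorted(unknown)}")
    if source not in ALLOWED_SOURCES:
        raise ValueError(f"unsupported source: {source!r}")

    payload: Dict[str, Any] = {
        "url": url,
        "formats": list(formats),
        "limit": limit,
        "source": source,
        "render": render,
    }
    if depth is not None:
        payload["depth"] = depth

    # Wildcard targeting lives under "options", as in options.includePatterns.
    options: Dict[str, Any] = {}
    if include_patterns:
        options["includePatterns"] = list(include_patterns)
    if exclude_patterns:
        options["excludePatterns"] = list(exclude_patterns)
    if options:
        payload["options"] = options

    # Passed through as-is; see the full reference for the real shape.
    if json_options is not None:
        payload["jsonOptions"] = dict(json_options)
    return payload
```

For example, build_crawl_payload("https://example.com/docs", include_patterns=["**/docs/**"], render=False) gives a static-mode crawl restricted to the docs section, ready to be serialized with json.dumps and sent as the POST body.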

Full reference (with every parameter explained):

Official announcement:

Limits (Open Beta)

Workers Free plan: 5 crawl jobs per day, max 100 pages per crawl (+ 10 min browser time/day)
Workers Paid plan: Much higher limits (billed by browser hours used — ~$0.09/hr beyond included allowance)
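
If you want to ballpark the paid-plan cost, the arithmetic is simple enough to sketch in Python. The ~$0.09/hr rate is the figure quoted above. The included allowance is not stated in this post, so it is a parameter you have to fill in yourself, and the function name is just illustrative.

```python
def estimate_overage_cost(hours_used: float, included_hours: float,
                          rate_per_hour: float = 0.09) -> float:
    """Estimate the charge for browser hours beyond the included allowance.

    ASSUMPTION: billing is linear per browser hour past the allowance,
    at the approximate rate quoted in this post.
    """
    billable_hours = max(0.0, hours_used - included_hours)
    return round(billable_hours * rate_per_hour, 2)
```

With 15 browser hours used and a hypothetical allowance of 10, that is 5 billable hours, or roughly $0.45.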

Quick Note for Website Owners

If you run a site, there is nothing to turn on. The crawler is polite and honors robots.txt. You can block it via WAF if you want, but most people are leaving it open.
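
Site owners who want to see what their robots.txt actually permits can check it with Python's standard urllib.robotparser. This is only a sketch of what a robots.txt-respecting crawler would be allowed to do. It says nothing about Cloudflare's internals, and "ExampleCrawler" is a placeholder user agent, not the real crawler's name.

```python
from urllib.robotparser import RobotFileParser

# A sample robots.txt; paste your own file's contents here instead.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
Sitemap: https://example.com/sitemap.xml
"""


def summarize_robots(robots_txt: str, user_agent: str, urls):
    """Report which URLs a compliant crawler may fetch, plus delay and sitemaps."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        "allowed": {url: parser.can_fetch(user_agent, url) for url in urls},
        "crawl_delay": parser.crawl_delay(user_agent),
        "sitemaps": parser.site_maps(),  # available in Python 3.8+
    }


summary = summarize_robots(
    ROBOTS_TXT,
    "ExampleCrawler",  # placeholder user agent
    ["https://example.com/docs/", "https://example.com/private/report"],
)
print(summary)
```

Swapping in your own robots.txt and a few representative URLs is a quick way to confirm that the sections you want kept out of a crawl are really disallowed.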

This feels like it could become a core primitive for Pickaxe agents and data sources. Cleaner, cheaper, and more reliable than anything we’ve had before.
