Cloudflare Demands AI Crawlers Be Separated from Search Crawlers, Blocks Mixed-Use Bots by Default

N.R. Finch

Published today · About 10 min read

Starting September 15, 2026, Cloudflare will by default block "dual-use crawlers," bots that handle both search indexing and AI training, on ad-supported pages. The change effectively forces AI companies to run separate crawlers for search and AI, and it rewrites the default rules for how website content feeds AI.

What is a "dual-use crawler," and why split it?

A dual-use crawler is a single bot doing three jobs at once: indexing pages for search, feeding data to AI model training, and running tasks for AI agents. Most major search companies operate this way today.

Cloudflare's new rule boils down to one demand: if you want to crawl our sites for search, fine — but you can't quietly funnel the same content into AI training. The two functions must run on separate, separately labeled crawlers.

This means → AI companies that still want content from Cloudflare-hosted sites must deploy and identify a dedicated AI crawler — no more piggybacking inside a search bot.

Who does the new rule cover? Can site owners override it?

It applies to all new customers, new sites from existing customers, and every current free-tier user — a vast footprint, given Cloudflare hosts a huge share of the world's small and mid-sized websites.

The default is "blocked." Site owners can manually allow dual-use crawlers, but if they do nothing, the answer is no.

In plain terms = the old default was "open unless you opt out." Now it flips to "closed unless you opt in." That reversal sharply cuts the volume of content AI companies can access without negotiation.
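
The flipped default can be sketched as a small decision function. This is only an illustration of the logic described above, not Cloudflare's implementation: the purpose labels, the `SitePolicy` fields, and the crawler names are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical purpose labels; not Cloudflare's actual taxonomy.
SEARCH = "search"
AI_TRAINING = "ai_training"
AI_AGENT = "ai_agent"


@dataclass
class SitePolicy:
    """Hypothetical per-site settings."""
    ad_supported: bool
    # Dual-use crawlers the site owner has explicitly allowed.
    allowed_dual_use: set[str] = field(default_factory=set)


def is_dual_use(purposes: set[str]) -> bool:
    """Treat a crawler as dual-use if it mixes search with any AI purpose."""
    return SEARCH in purposes and bool(purposes & {AI_TRAINING, AI_AGENT})


def decide(crawler: str, purposes: set[str], policy: SitePolicy) -> str:
    """Return "allow" or "block" for one crawler on one site."""
    if not is_dual_use(purposes):
        return "allow"  # not covered by this rule; other rules may still apply
    if not policy.ad_supported:
        return "allow"  # the default described here targets ad-supported pages
    if crawler in policy.allowed_dual_use:
        return "allow"  # the owner opted in
    return "block"      # no action from the owner means no
```

The last line is the point: a site owner who never touches the settings ends up with "block" for a mixed-purpose bot, where the old default would have let it through.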

Why is Cloudflare singling out Google?

The announcement points to "the world's largest search engine," meaning Google, and says it holds roughly twice as much information as other AI companies. Cloudflare's explanation is that Google makes it hard for a site to stay discoverable in search while refusing AI use of its content.

Google disputes this, noting that it already offers Google-Extended, a robots.txt control that lets sites opt out of AI training and of products like Gemini without losing search visibility.

This reflects a deeper power struggle: who gets to draw the line between "search" and "AI use"? Google says it already split the two; Cloudflare says the split isn't real enough. At its core, the fight is over data control.
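
For readers who have not seen the opt-out Google refers to, it is a robots.txt rule addressed to the `Google-Extended` token. The sketch below uses Python's standard `urllib.robotparser` to show how such a rule separates the two user agents. The robots.txt content and the example.com URL are made up for illustration, and the Python parser only demonstrates the matching rules; it says nothing about how Google applies them internally.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: refuse the AI token, leave everything else open.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/article"  # placeholder URL
can_search = parser.can_fetch("Googlebot", url)     # falls through to the * group
can_ai = parser.can_fetch("Google-Extended", url)   # matches the Disallow group

print("search crawl allowed:", can_search)
print("AI use allowed:", can_ai)
```

Cloudflare's complaint, as reported here, is not that this rule is missing but that it does not separate the two uses cleanly enough.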

What does "Pay Per Use" change?

Cloudflare is upgrading its old "Pay Per Crawl" model to "Pay Per Use" — publishers no longer get paid just when content is scraped, but when it actually generates value downstream.

According to Cloudflare's data, more than 50% of AI crawler traffic goes to re-fetching pages that have not changed. The new model should cut publishers' wasted bandwidth and compute costs at the same time.

This means → the payment trigger shifts from "you visited" to "you used my content" — fairer for publishers, and it pressures AI companies to stop redundant crawling.
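
To see why re-fetching unchanged pages is avoidable waste, here is a minimal sketch of HTTP conditional requests, the standard mechanism (ETag, If-None-Match, 304 Not Modified) that lets a crawler skip pages that have not changed. It is not taken from Cloudflare's announcement: the in-memory `PAGES` store, the paths, and the helper names are assumptions made for illustration, and no network is involved.

```python
import hashlib
from typing import Optional

# In-memory stand-in for an origin server; paths and bodies are made up.
PAGES = {"/a": b"first article", "/b": b"second article"}


def etag_for(body: bytes) -> str:
    """Derive a validator from the page content."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def serve(path: str, if_none_match: Optional[str] = None) -> tuple[int, bytes, str]:
    """Answer 304 with an empty body when the client's validator still matches."""
    body = PAGES[path]
    tag = etag_for(body)
    if if_none_match == tag:
        return 304, b"", tag
    return 200, body, tag


def crawl(paths: list[str], cache: dict[str, str]) -> int:
    """Request every path, sending cached validators; return body bytes transferred."""
    transferred = 0
    for path in paths:
        status, body, tag = serve(path, cache.get(path))
        transferred += len(body)
        if status == 200:
            cache[path] = tag  # remember the validator for the next visit
    return transferred


cache: dict[str, str] = {}
first_pass = crawl(["/a", "/b"], cache)    # nothing cached yet, both bodies are sent
PAGES["/b"] = b"second article, updated"   # one page changes between visits
second_pass = crawl(["/a", "/b"], cache)   # only the changed page is sent again
print(first_pass, second_pass)
```

A crawler that ignores validators would transfer every body on every pass, which is the pattern the re-fetch figure describes.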

Will publishers actually make money from this?

Two AI companies have signed on so far: Ceramic.ai and You.com. Once a publisher opts in, they earn revenue when their content surfaces in Ceramic's AI search results or when You.com accesses their paywalled content.

The critical variable: whether more AI companies agree to join this payment framework. Without major players participating, publisher revenue stays small.

In plain terms = Cloudflare has built the toll booth. How much toll it collects depends on how many cars are willing to take the road.

Content is for reference only, not financial advice.