---
title: "What Is an AI Crawler? GPTBot, ClaudeBot & More"
description: "An AI crawler is a bot operated by an AI company to fetch web content for training or live retrieval — GPTBot, ClaudeBot, PerplexityBot, Google-Extended. Controlled via robots.txt."
canonical: https://aiovsseo.com/glossary/ai-crawler.html
date: 2026-06-07
---
# What is an AI crawler?

An **AI crawler** is an automated agent operated by an AI provider that fetches web pages either to train models or to retrieve content for live answers. Examples include GPTBot, ClaudeBot, PerplexityBot and Google-Extended. AI crawlers are distinct from classic search crawlers, though some companies run both. You control their access via `robots.txt`.

## Common AI crawlers (2026)

- **GPTBot** (OpenAI) — training; **OAI-SearchBot** / **ChatGPT-User** — search & browsing
- **ClaudeBot** (Anthropic) — training; **Claude-SearchBot** — retrieval
- **PerplexityBot** — retrieval
- **Google-Extended** — Gemini training (separate from Googlebot)
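
The list above can be turned into a small lookup for tagging log entries. The following is a minimal sketch, not an official detection method: it assumes each crawler's request header contains its token as a substring, and the names `AI_CRAWLERS` and `classify_user_agent` are hypothetical. The sample header string is illustrative only.

```python
# Token -> (operator, purpose), copied from the list above.
# Google-Extended is included for completeness; this page describes it as a
# control token, so it may never show up in a real request header.
AI_CRAWLERS = {
    "GPTBot": ("OpenAI", "training"),
    "OAI-SearchBot": ("OpenAI", "search & browsing"),
    "ChatGPT-User": ("OpenAI", "search & browsing"),
    "ClaudeBot": ("Anthropic", "training"),
    "Claude-SearchBot": ("Anthropic", "retrieval"),
    "PerplexityBot": ("Perplexity", "retrieval"),
    "Google-Extended": ("Google", "training"),
}


def classify_user_agent(user_agent):
    """Return (token, operator, purpose) for a known AI crawler, else None."""
    ua = user_agent.lower()
    for token, (operator, purpose) in AI_CRAWLERS.items():
        # Case-insensitive substring match on the crawler token (an assumption).
        if token.lower() in ua:
            return (token, operator, purpose)
    return None


if __name__ == "__main__":
    # Illustrative header value, not a real crawler's exact string.
    print(classify_user_agent("Mozilla/5.0 (compatible; GPTBot/1.0)"))
    print(classify_user_agent("Mozilla/5.0 (X11; Linux x86_64) Firefox/126.0"))
```

User-agent headers can be spoofed, so a match here is a hint about who is crawling rather than proof.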

## Controlling access

Control AI crawlers with `robots.txt` rules, and use the [Content-Signal](/glossary/content-signal.html) directive to set separate permissions for search indexing, live retrieval and training.
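
As an illustration, a `robots.txt` file combining both mechanisms could be generated like this. It is a sketch under stated assumptions: `render_robots_txt` is a hypothetical helper, and the `Content-Signal` key names (`search`, `ai-input`, `ai-train`) reflect the Content Signals proposal as understood here, so check them against the linked glossary entry before relying on them.

```python
def render_robots_txt(blocked_tokens, content_signal=None):
    """Build robots.txt text that disallows the given crawler tokens.

    blocked_tokens: crawler tokens to disallow site-wide.
    content_signal: optional dict such as {"ai-train": "no"}; the key names
                    are an assumption, not confirmed by this page.
    """
    lines = []
    if blocked_tokens:
        # One group: several User-agent lines sharing a single Disallow rule.
        lines += [f"User-agent: {token}" for token in blocked_tokens]
        lines += ["Disallow: /", ""]
    # Default group for every other crawler.
    lines.append("User-agent: *")
    if content_signal:
        pairs = ", ".join(f"{key}={value}" for key, value in content_signal.items())
        lines.append(f"Content-Signal: {pairs}")
    lines.append("Allow: /")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_robots_txt(
        ["GPTBot", "Google-Extended"],
        {"search": "yes", "ai-input": "yes", "ai-train": "no"},
    ))
```

Both mechanisms are requests that a crawler chooses to honor; neither one technically prevents a fetch.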

## Frequently asked questions

**Should I block AI crawlers?**

It depends on your goal. Blocking training crawlers (e.g. GPTBot, Google-Extended) while allowing retrieval crawlers keeps your pages citable in AI answers without supplying them as training data. Blocking every AI crawler risks losing AI visibility entirely.

**Is Google-Extended the same as Googlebot?**

No. Googlebot handles search indexing; Google-Extended is a separate control for whether your content trains Gemini. You can allow one and disallow the other.
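
Both answers above can be checked locally with Python's standard `urllib.robotparser` before a rule set goes live. This is a minimal sketch: the rules and the `example.com` URL are illustrative, and the standard-library parser approximates how a compliant crawler reads the file rather than reproducing any vendor's exact matching.

```python
from urllib.robotparser import RobotFileParser

# Illustrative policy: block training tokens, allow everything else.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(token, url="https://example.com/glossary/page.html"):
    """Return True if the parsed rules let this crawler token fetch the URL."""
    return parser.can_fetch(token, url)


if __name__ == "__main__":
    for token in ["GPTBot", "Google-Extended", "Googlebot",
                  "OAI-SearchBot", "PerplexityBot"]:
        print(f"{token}: {'allowed' if allowed(token) else 'blocked'}")
```

With these rules, Googlebot and the retrieval crawlers fall through to the default group and stay allowed, while only the two training tokens are disallowed.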
