How to Opt Out of AI Training Data
Every major AI provider offers some way to prevent your conversations, documents, or published content from being used to train future models. Some are one-toggle settings; some require a formal legal objection. This guide covers all of them, with direct opt-out instructions and the legal basis behind each.
Fast path
1. Log in to each AI service and toggle off training/history settings.
2. For providers without a setting: send a formal GDPR Art. 21 objection (EU/EEA) or a CCPA/state-law deletion request (US).
3. Block Common Crawl (CCBot) in your own website's robots.txt.
4. Choose enterprise tiers when available; most contractually exclude customer data from training.
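For step 2, the objection itself is just a short email citing the relevant articles. A minimal sketch of rendering one from a template; the recipient address, wording, and all example values are illustrative, not legal advice:

```python
# Sketch of a GDPR Art. 21 objection email. The wording is illustrative
# and should be adapted; it is not legal advice.
OBJECTION_TEMPLATE = """\
To: {privacy_email}
Subject: Objection to processing under GDPR Article 21

Dear {provider} privacy team,

I object under Article 21 GDPR to the processing of my personal data
for model training on the basis of legitimate interest (Art. 6(1)(f)
GDPR). Please confirm that data associated with {account_email} will
be excluded from training, and that stored conversation logs are
erased per Article 17 GDPR.

Regards,
{name}
"""

def render_objection(provider: str, privacy_email: str,
                     account_email: str, name: str) -> str:
    """Fill the objection template for one provider."""
    return OBJECTION_TEMPLATE.format(provider=provider,
                                     privacy_email=privacy_email,
                                     account_email=account_email,
                                     name=name)

# Example: Anthropic publishes privacy@anthropic.com for objections.
print(render_objection("Anthropic", "privacy@anthropic.com",
                       "me@example.com", "Jane Doe"))
```

Providers that publish a privacy contact (Anthropic and Cohere, per the list below) can receive this directly; for the rest, the in-product settings are the faster route.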
Provider-by-provider opt-out
OpenAI
Free to opt out
Models: ChatGPT, GPT-5, DALL-E, Sora
ChatGPT settings: toggle off "Improve the model for everyone". API users: default is no training use. Legal basis: OpenAI relies on legitimate interest (GDPR Art. 6(1)(f)) which can be objected to per Art. 21.
Anthropic
Not used by default
Models: Claude (all versions)
Anthropic does not train on API inputs or Claude.ai conversations by default. To formally object to any use, email privacy@anthropic.com citing GDPR Art. 21 or applicable state law.
Google
Models: Gemini, Gemini Advanced
Gemini Apps Activity setting at myactivity.google.com — turn off to prevent Gemini from using your conversations for training. Enterprise and Workspace usage is separate.
Meta
Opt-out available
Models: Llama, Meta AI
EU/UK users: object via Instagram/Facebook settings after receiving notification. US users: submit objection at facebook.com/help/contact. Meta trained Llama on scraped public data; the opt-out is narrower than other providers.
Perplexity
Free to opt out
Models: Perplexity AI, Pro Search
Settings > Privacy > toggle "AI Data Retention" off. Does not use conversations for model improvement when disabled.
Mistral AI
Opt-out via settings
Models: Le Chat, Codestral, Mistral Large
Le Chat Privacy Settings — disable "Use my data to improve the model." API users: no training by default.
Microsoft
Free to opt out
Models: Copilot, Bing Chat
Microsoft Privacy Dashboard: account.microsoft.com/privacy — turn off "Chat history." Enterprise Copilot is covered by Microsoft 365 terms (no training on customer data).
xAI (Grok)
Opt-out available
Models: Grok
X settings > Data sharing > disable "Share your posts for training." Grok API: no training by default.
Cohere
Not used by default
Models: Command R, Command R+
API inputs not used for training by default. To object to any historical use, email privacy@cohere.com.
Common Crawl (upstream dataset)
Robots.txt respected
Used by most training pipelines
Add CCBot to your robots.txt:

User-agent: CCBot
Disallow: /

Common Crawl is the most influential web-scrape dataset, used to train nearly every major LLM.
Blocking AI training crawlers on your own website
If you publish content on your own site and want to block AI training crawlers, add the following to your robots.txt:
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: Meta-ExternalAgent
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Applebot-Extended
Disallow: /
Note: Major AI providers publicly commit to honoring robots.txt, but enforcement varies, and some training-data collectors ignore it entirely. Litigation such as The New York Times v. OpenAI is still shaping the legal landscape around training-data scraping.
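To confirm your rules actually disallow these crawlers, you can parse the file with Python's standard library. A minimal sketch, assuming a two-entry robots.txt; the crawler list here is a short illustrative subset of the block above:

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content: only GPTBot and CCBot are disallowed here.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /
"""

# Illustrative subset of known AI training crawlers.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "CCBot", "PerplexityBot"]

def blocked_crawlers(robots_txt: str, path: str = "/") -> list[str]:
    """Return the crawlers from AI_CRAWLERS that these rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Crawlers with no matching entry (and no "*" entry) default to allowed.
    return [bot for bot in AI_CRAWLERS if not parser.can_fetch(bot, path)]

print(blocked_crawlers(ROBOTS_TXT))  # → ['GPTBot', 'CCBot']
```

In production, point `RobotFileParser.set_url()` at your live `https://example.com/robots.txt` and call `read()` instead of parsing a string, so you test exactly what crawlers see.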
Legal basis: your rights
GDPR Article 21
Right to object to processing based on legitimate interest. Most AI providers cite legitimate interest for training — this is the primary objection pathway in EU/EEA.
GDPR Article 17
Right to erasure. Applies to training data if the lawful basis for processing is missing or withdrawn.
CCPA (California)
Right to delete (§ 1798.105) and right to opt out of sale/sharing (§ 1798.120). Applies to AI providers meeting CCPA thresholds.
CA AB 2013 (2026)
California's AB 2013 (effective 2026) requires developers of generative AI systems to publish documentation summarizing their training datasets, enabling targeted opt-out requests.
Colorado CPA profiling opt-out
Right to opt out of profiling that produces legal or similarly significant effects.
Minnesota MCDPA
Right to question automated decisions and demand human review.
After AI opt-out
Complete the privacy pass
Opting out of AI training stops future exposure. Removing data brokers stops current exposure. OfflistMe handles 200+ brokers in one run for $2.
Start broker opt-out for $2 →

FAQ
Can I remove my data from a model that is already trained?
Not directly. Once a model is trained, individual data points are embedded across billions of parameters and cannot be mechanically deleted. You can, however: (1) object to future use under GDPR Article 21 or the CCPA, (2) request deletion of conversation logs that might be used for fine-tuning, (3) prevent future inclusion by opting out of conversation storage, and (4) submit a formal erasure demand, which may obligate providers to exclude your data from future model versions.
What legal rights do I have against AI training?
Under GDPR, Article 21 grants the right to object to processing based on legitimate interest, which most AI providers rely on. Under GDPR Article 17, the right to erasure covers training data if lawful basis is absent. Under CCPA, the right to delete and right to opt out of sale both apply. State laws (Colorado CPA, Minnesota MCDPA) grant specific opt-outs from profiling and automated decision-making. California and Illinois have also passed AI-specific transparency laws (AB 2013 and HB 5116 respectively).
What is the difference between "training opt-out" and "chat history opt-out"?
They are related but distinct. Chat-history opt-out prevents your conversations from being stored at all. Training opt-out prevents stored conversations from being used to improve future models. Some providers offer both (e.g., OpenAI); others offer only one. For complete protection, enable both where available.
Do enterprise AI subscriptions include training opt-out by default?
Most enterprise tiers (OpenAI Enterprise, ChatGPT Team, Microsoft Copilot for Microsoft 365, Anthropic Claude Enterprise, Google Gemini for Workspace) contractually guarantee no training on customer data. This is a major reason enterprise tiers exist. Verify in the specific DPA (Data Processing Agreement).
Can I opt my copyrighted content out of being training data?
Partially. Common Crawl and other upstream datasets respect robots.txt: add "User-agent: CCBot" with "Disallow: /" to prevent inclusion in future crawls. Content already in existing datasets remains, however. Platforms like Substack, Medium, and GitHub are adding bulk opt-out mechanisms, and The New York Times v. OpenAI is shaping case law on this.