April 30, 2026 · 10 min read · ISO 27001 · AI · GEO · A.5.34 · A.8.21 · ISO 42001

AI Search and Your ISMS: ISO 27001 A.5.34 & A.8.21 Reframed

AI search reshapes ISO 27001 A.5.34 (PII) and A.8.21 (network services). What auditors now look for, gaps we flag, and a working baseline.

Jonathan Major
Lead ISO Internal Auditor · Risk and Response

ISO 27001:2022 was finalized before ChatGPT shipped. The Annex A controls don't mention AI search engines, generative bots, or large language models, and they don't need to, because the controls were written at a level of abstraction that absorbs new technology. But auditors are now beginning to look at how two specific controls apply to a surface that didn't exist in the prior version of the standard.

The two controls are A.5.34 (Privacy and protection of PII) and A.8.21 (Security of network services). Most ISMS programs have evidence for the pre-AI interpretation of both. Almost none have evidence for the AI-era extension. This post walks through what changes, the gaps that are starting to show up, and a working baseline you can implement in a sprint.

Why this matters now

AI search engines (ChatGPT web search, Perplexity, Google AI Overviews, Gemini, Bing Copilot, Claude with web tools) sit in two positions relative to your organization at the same time:

  • Crawler / consumer of your content: GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot, Google-Extended, and their kin fetch your site to index it for citation, training, or both.
  • Surface where information about you appears: when a prospect, customer, journalist, or attacker queries an AI engine about your organization, the engine produces an answer synthesized from indexed sources, training data, and inference. The answer can include accurate facts, stale facts, or fabricated ones.

Both positions create real obligations under ISO 27001:2022. Neither is currently well-covered by typical ISMS evidence. That gap is starting to surface in audits, and will accelerate as AI search share continues to grow.

A.5.34 reframed: privacy and PII in the AI era

A.5.34, Privacy and protection of PII, calls for the organization to identify and meet requirements regarding the preservation of privacy and protection of personally identifiable information, in line with applicable laws, regulations, and contractual requirements.

Pre-AI, this control mapped to your privacy policy, your DSAR procedure, your data-deletion workflow, your retention schedule, and your supplier agreements. All of that still applies. AI search adds three new questions an auditor will want answers to:

1. Detection of AI-surfaced PII about your data subjects

AI engines synthesize answers from many sources. When a query touches on a person whose data your organization processes (a customer, a former employee, a job applicant), the AI may produce content that includes PII you've already deleted, or worse, content the AI fabricated. You don't control either outcome directly. You do need a process to detect it.

Practically, this means a periodic sweep of the major AI engines using queries that could surface your data subjects' PII. Manual or scripted, but documented. The artifact an auditor will ask for: evidence the check has been performed in the audit period, with a record of any incidents and the response.
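
For teams that want to script it, here's a minimal sketch of one engine's leg of the check, using the OpenAI Python client; the query list, model name, and log path are illustrative assumptions, and each engine you monitor needs its own equivalent.

```python
import csv
from datetime import datetime, timezone

from openai import OpenAI  # pip install openai

# Hypothetical watchlist -- replace with queries realistic for your data subjects.
QUERIES = [
    "Who is the CEO of Example Corp?",
    "What customer data does Example Corp hold?",
]

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def run_check(log_path: str = "ai_surveillance_log.csv") -> None:
    """Query one engine per watchlist item and append the answers to the audit log."""
    with open(log_path, "a", newline="") as f:
        writer = csv.writer(f)
        for query in QUERIES:
            response = client.chat.completions.create(
                model="gpt-4o",  # placeholder model name
                messages=[{"role": "user", "content": query}],
            )
            answer = response.choices[0].message.content
            # A human reviews each row for surfaced, stale, or fabricated PII.
            writer.writerow(
                [datetime.now(timezone.utc).isoformat(), "openai", query, answer]
            )


if __name__ == "__main__":
    run_check()
```

The script is the easy part; the audit artifact is the accumulating log plus evidence that someone reviewed it.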

2. A signaling layer for AI ingestion (llms.txt and beyond)

Robots.txt has been the de facto crawl-control surface for two decades. AI training and citation introduce a new layer: what your organization explicitly intends AI engines to index, train on, or quote.

The emerging standard is llms.txt, a structured markdown file at your domain root listing what's available, what the canonical content is, and what's contextual. Pair it with per-bot directives in robots.txt for crawlers you want to allow or block (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, anthropic-ai, etc.).
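
For reference, a minimal llms.txt under the community spec looks like the sketch below; the example.com URLs and descriptions are placeholders.

```
# Example Corp

> B2B risk-management platform. The canonical content below is approved
> for AI citation; treat everything else on this domain as contextual.

## Canonical content

- [Product overview](https://example.com/product.md): what the platform does
- [Security](https://example.com/security.md): certifications and posture

## Optional

- [Blog](https://example.com/blog.md): contextual commentary, may go stale
```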

Auditors looking at A.5.34 in 2026 are starting to ask whether the organization has a documented position on AI ingestion of content that includes PII (e.g., team bios, client logos with consent, customer testimonials). The position can be allow, deny, or partial, but it needs to exist and be implemented through llms.txt + robots.txt.

3. AI in the incident-response playbook

If an AI engine hallucinates PII about a data subject, or, worse, misstates your organization's regulated obligations, that's an information-security incident. Most incident-response plans don't currently include AI-surfaced misinformation as a category.

At minimum, the playbook should contain: how to detect (link to the periodic check), how to escalate, how to issue takedown or correction requests to the relevant engines, and how to document the resolution.

A.8.21 reframed: AI bots as network services

A.8.21, Security of network services, requires security mechanisms, service levels, and management requirements to be identified and applied to the network services the organization uses. Pre-AI, this typically covered ISP contracts, VPN providers, edge security platforms, content delivery networks, and similar infrastructure suppliers.

AI search engines and their crawlers introduce three new entries in this control's scope.

1. AI crawler traffic is network-service traffic

GPTBot, ClaudeBot, PerplexityBot, OAI-SearchBot, Google-Extended, and Bingbot's AI extensions are network services your site interacts with on every request they make. Each publishes documented user agents and (in most cases) IP ranges you can verify.

The audit question: are you logging AI-bot traffic distinctly? Are you authenticating that bots claiming to be GPTBot are actually originating from OpenAI's documented ranges, not spoofers? Are you rate-limiting the firehose if a bot misbehaves?

Most organizations have none of this configured today. The baseline is straightforward: a logging filter, a verification check (Anthropic and OpenAI both publish IP ranges for legitimate bots), and a rate-limit policy.
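
A minimal verification sketch in Python follows; the ranges URL and JSON shape are assumptions modeled on how vendors typically publish them, so check each bot's documentation for the real location and format.

```python
import ipaddress
import json
import urllib.request

# Assumed location of a vendor's published bot ranges -- consult the bot's
# documentation (e.g., OpenAI's GPTBot page) for the actual URL.
RANGES_URL = "https://example.com/gptbot-ranges.json"


def load_networks(url: str) -> list:
    """Fetch ranges; assumed shape: {"prefixes": [{"ipv4Prefix": "a.b.c.d/nn"}, ...]}."""
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    return [
        ipaddress.ip_network(p.get("ipv4Prefix") or p.get("ipv6Prefix"))
        for p in data["prefixes"]
    ]


def is_genuine(client_ip: str, networks: list) -> bool:
    """True if a request claiming to be the bot originates inside a published range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in networks)


if __name__ == "__main__":
    networks = load_networks(RANGES_URL)
    print(is_genuine("203.0.113.7", networks))  # documentation-range IP -> expect False
```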

2. A documented organizational position on AI ingestion

A.8.21 calls for "service levels and management requirements" to be documented. Translation for AI: where does your organization sit on the allow / partial / deny spectrum for each major AI engine, and how is that position implemented in robots.txt, llms.txt, headers, and CDN rules?

The right answer differs by organization. A B2B SaaS marketing site might allow everything. A healthtech platform might block training crawlers (Google-Extended, anthropic-ai, GPTBot for training) while allowing search-time bots (OAI-SearchBot, ClaudeBot for retrieval). A regulated financial firm might block everything except documented search bots.

Whichever position you take, an auditor wants to see it documented in policy, implemented in the actual files, and reviewed at least annually.

3. Employee use of external AI services

A.8.21 is bidirectional: it covers network services your organization uses, not just ones that interact with you. ChatGPT, Claude, Gemini, and similar consumer AI tools are network services your employees increasingly use, often outside formal procurement.

The audit question: do you have an employee-use policy for external AI tools? Is data classification reflected in what can be pasted into a prompt? Does your DLP catch outbound traffic to AI service domains? Is there a sanctioned-tool list?
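
As a starting point, an egress watchlist for the DLP or proxy might look like the list below; the entries are illustrative, and the right set depends on your sanctioned-tool list.

```
# Consumer AI endpoints to monitor or block at the egress proxy (illustrative)
chatgpt.com
chat.openai.com
claude.ai
gemini.google.com
copilot.microsoft.com
perplexity.ai
```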

This overlaps significantly with A.5.20 (Information security in supplier relationships) and A.5.23 (Information security for use of cloud services), but A.8.21 is where the network-traffic side of it lands.

The gaps we're starting to flag

Across recent ISO 27001 internal audits, these AI-era gaps are coming up, and they'll come up more aggressively in the next cycle of certification audits.

Gap 1: No documented position on AI ingestion

The organization has a robots.txt, but it's the default wildcard-allow. There's no llms.txt. There's no policy section addressing AI engines specifically. When asked, "what's our position on AI training using our content?" the answer is "we haven't decided." That's an A.5.34 + A.8.21 finding.

Gap 2: No bot-traffic logging

Web access logs exist but aren't filtered or alerted on for AI-bot user agents. The team can't answer "how often has GPTBot hit our site this month?", let alone authenticate that the traffic is genuine.

Gap 3: Periodic AI surveillance not in the calendar

No one has searched ChatGPT, Perplexity, or Gemini for the organization's own customers, products, or known-PII data subjects. If an AI engine is hallucinating something damaging right now, no one would know.

Gap 4: AI tools missing from the supplier register

The team uses ChatGPT, Claude, Copilot, and Cursor, none of which are in the supplier register or covered by a vendor risk assessment. A.5.19, A.5.20, and A.5.23 all flag this. So does A.8.21 by extension.

Gap 5: Incident response silent on AI-surfaced misinformation

The incident response playbook covers data breaches, malware, DDoS, account takeover. It does not cover "an AI engine published wrong information about us / our customers / our regulatory posture." The runbook for that scenario doesn't exist.

A working baseline

Six artifacts close most of the AI-era gap under both controls. This is what I'd expect a 2026-mature ISMS to have.

1. Documented organizational position on AI ingestion

A short policy section: which AI crawler bots are permitted (training vs search-time), what content is in scope vs out of scope, the implementation file path (robots.txt + llms.txt), and the review cadence (annual minimum).

2. Implemented robots.txt and llms.txt

The policy implementation. Per-bot directives in robots.txt for the bots you want to allow or deny. An llms.txt at the root with the canonical content list. Documented decisions about Google-Extended, anthropic-ai, GPTBot, OAI-SearchBot, ClaudeBot, PerplexityBot, and CCBot (Common Crawl).
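
A sketch of the per-bot directives, with each allow/deny choice standing in for whatever position your policy actually documents:

```
# robots.txt -- illustrative; align every directive with the documented position

# Training crawlers
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

# Search-time / retrieval bots
User-agent: OAI-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: *
Allow: /
```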

3. AI-bot traffic logging

A web-log filter or CDN rule that tags requests from documented AI-bot user agents into their own bucket. Bonus: cross-check the source IP against the bot's published range to detect spoofers.
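
A minimal sketch of the tagging side in Python, assuming a combined-format access log with the user agent in each line; adapt the pattern list to the bots named in your policy.

```python
import re
from collections import Counter

# Bots from the documented position -- extend as new crawlers appear.
AI_BOTS = [
    "GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot",
    "Google-Extended", "CCBot", "anthropic-ai",
]
UA_PATTERN = re.compile("|".join(AI_BOTS))


def tally_ai_bots(log_path: str) -> Counter:
    """Count requests per AI-bot user agent in a raw access log."""
    counts = Counter()
    with open(log_path) as log:
        for line in log:
            match = UA_PATTERN.search(line)
            if match:
                counts[match.group(0)] += 1
    return counts


if __name__ == "__main__":
    for bot, hits in tally_ai_bots("access.log").most_common():
        print(f"{bot}: {hits}")
```

This answers the "how often has GPTBot hit our site this month?" question from Gap 2 with a one-command extract.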

4. Quarterly AI surveillance check

A documented procedure for periodically querying the major AI engines for the organization's name, key product names, and sample customer mentions. A log of what was found, including any inaccuracies and the response taken. Quarterly is a defensible cadence for most organizations.

5. Employee AI-use policy and supplier listing

An acceptable-use policy section covering external AI services, a sanctioned-tool list, data-classification rules for AI prompts, and the AI tools added to the supplier register with appropriate vendor-risk assessment depth.

6. Incident-response coverage of AI-surfaced misinformation

A playbook entry: detection trigger (something surfaced in the quarterly check, a customer complaint, a journalist inquiry), escalation path, takedown / correction request templates for the major engines, and the documentation requirement.
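
A minimal plain-text shape for that playbook entry, assuming per-scenario runbooks:

```
Scenario: AI-surfaced misinformation / hallucinated PII
Trigger:  quarterly surveillance hit, customer complaint, journalist inquiry
Steps:
  1. Capture engine, query, date, and verbatim answer (screenshot + text).
  2. Classify: stale fact / deleted PII resurfaced / fabrication.
  3. Escalate to the privacy officer if PII or regulated claims are involved.
  4. File a correction or takedown via the engine's published channel.
  5. Re-run the query after the engine's stated turnaround; log the outcome.
Evidence: AI surveillance log entry + ticket reference
```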

The ISO 42001 cross-reference

If the organization is in scope for ISO 42001:2023, the AI-related extensions of A.5.34 and A.8.21 ladder cleanly into the AIMS clauses. ISO 42001 Annex A.6.2 (AI system lifecycle) and Annex A.7 (Data for AI systems) cover the org's own AI systems; the A.5.34 + A.8.21 work in this post covers the org's relationship to external AI systems. Together they give a coherent answer to "how is your organization governed with respect to AI" without separate frameworks competing.

What an audit looks for

During an ISO 27001 audit in 2026 and beyond, I'd review the following for the AI-era extension of A.5.34 and A.8.21:

  1. The policy text addressing AI ingestion and AI tool use
  2. The actual robots.txt + llms.txt files at the production root
  3. The web-log filter for AI-bot traffic, and a recent extract
  4. The most recent quarterly AI-surveillance check log, with at least one substantive query per engine
  5. The supplier register entries for any AI tools in use, with risk-assessment evidence
  6. The incident-response playbook section on AI-surfaced misinformation

Frequently asked questions

Should I block AI training crawlers entirely?

It depends on your business. If your business model relies on being discoverable in AI search (B2B services, content marketing, public-information-heavy), blocking training crawlers cuts you out of the next decade of search. If your business handles regulated data and "discoverability" isn't a value driver, blocking is defensible. The wrong answer is not deciding: both A.5.34 and A.8.21 want to see a documented decision.

Is llms.txt a real standard?

It's a community standard, not yet an IETF or W3C specification. It's increasingly recognized by AI search engines and content platforms as the convention for signaling AI-relevant content structure. For audit purposes, it functions the same way robots.txt does: a documented, implemented, testable signal.

How is "AI-bot traffic logging" different from regular web logs?

Regular web logs capture everything; the missing piece is analysis. An AI-bot filter is a log query, an alert rule, or a CDN rule that surfaces AI-specific traffic patterns. The artifact an auditor wants is evidence the team can answer questions about AI-bot traffic, not necessarily a separate logging pipeline.

What if our marketing team uses ChatGPT to write blog posts?

That belongs in your acceptable-use policy and content-review process, not in this control specifically. But the use of an external AI service to generate content that ends up on your domain creates supplier-relationship questions (A.5.20), data flow questions (A.8.10–A.8.12), and content-integrity questions that interact with A.5.34 if the AI is making claims about identifiable people.

Does Vanta or Drata cover any of this automatically?

Not yet, in any meaningful way. Both platforms map evidence to Annex A controls but neither has a template specifically for AI-era A.5.34 / A.8.21 evidence as of this writing. The substantive policy, llms.txt, surveillance log, and IR playbook all live in your own documentation. Use the GRC platform as the storage and reminder layer, but don't expect it to generate the artifacts.

How fast is this audit expectation moving?

Faster than the 2022 standard anticipated. The certification bodies and auditor training organizations are starting to publish AI-extension guidance. The first round of certification audits to take this seriously will be the cycles starting in late 2026. Internal audits are the right place to surface and close these gaps before then.

About the author

Jonathan Major

Jonathan leads ISO 27001, ISO 42001, and ISO 9001 internal audits at Risk and Response. 25 years across engineering, information security, and compliance at IBM, BlackRock, Barclays, and Crux Informatics.