Structured Data for AI Search: 6 Schema Types That Move the Needle

June 2, 2026

Why AI search engines read schema differently than crawlers

If you are investigating structured data for AI search, the short answer is that six schema types do most of the work: FAQ, Article, Person, HowTo, Speakable, and Organization. Pages with two or three of these implemented correctly are measurably more likely to appear in AI-generated responses: up to 36% more likely, according to WPRiders' citation testing. The rest of this post walks through the JSON-LD patterns Gravidy ships on its own blog, in the order we ship them.

Key Takeaways

Six schema types (FAQ, Article, Person, HowTo, Speakable, Organization) do roughly 90% of the citation work in ChatGPT, Perplexity, and AI Overviews.
Pages with schema get 40% higher click-through versus pages without it, per Schema App data cited by SEOptimer.
FAQ schema is the highest-leverage starting point because question-and-answer pairs map directly to how LLMs retrieve passages.
Schema validates authorship and entity context, which is what AI engines use to decide who to cite when answers are contested.

Traditional Google crawlers treat schema as a SERP-feature trigger. Star ratings, FAQ dropdowns, breadcrumb links, the visual stuff. AI engines treat the same markup as a classification layer. They use it to decide what your page is about, who wrote it, and whether the passage they want to quote actually belongs to the entity they think they are citing.

This matters because AI Overviews, ChatGPT, and Perplexity sit at different points in the pipeline. AI Overviews extracts at query time from Google's index. ChatGPT pulls from its training set and, increasingly, from live Bing retrieval. Perplexity hits a live web index per query. Same schema, three different jobs. Per SEOptimer's review of Backlinko data, 72% of pages on Google's first page already use some form of schema, so the floor is high. The ceiling is whether your markup is precise enough for AI extraction, not just rich-result eligibility.

Which schema types matter for structured data in AI search?

Six. In priority order: FAQ, Article, Person, HowTo, Speakable, Organization. Everything else (Product, Review, Event, Recipe) is vertical-specific and only matters if your business sells those things directly. For B2B SaaS and services, the six above carry the weight.

Verified schema impact metrics for AI search

Metric (source)	Value
Citation lift with schema (WPRiders)	Up to 36%
CTR uplift with schema (Schema App)	40%
First-page pages using schema (Backlinko)	72%

The pattern we see in audits: most B2B sites either ship no schema or they ship one type (usually Organization) and stop. The lift comes from stacking. Article tells AI what the content is, Person tells it who is responsible, FAQ extracts the question-answer pairs the LLM actually wants to quote, and Organization anchors all of it to a verifiable entity. Each one alone is weak. Together they make your page a clean training example, which is what citation engines reward.

A practical sequence: ship Article and Organization sitewide first (one-time template work). Add Person to author bios. Add FAQ to posts that contain genuine Q&A. Add HowTo to step-by-step guides. Speakable last, and only if you care about voice or audio rendering.
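The sequence above can be sketched as a small lookup that decides which schema types a given page should emit. This is an illustration only: the page flags (has_author_bio, has_genuine_qa, is_step_by_step, targets_voice) are hypothetical names, not fields from any real CMS.

```python
def schema_types_for(page):
    """Return the schema types to emit for a page, in rollout order.

    `page` is a dict of hypothetical boolean flags describing the content.
    """
    types = ["Article", "Organization"]  # sitewide, one-time template work
    if page.get("has_author_bio"):
        types.append("Person")
    if page.get("has_genuine_qa"):
        types.append("FAQPage")
    if page.get("is_step_by_step"):
        types.append("HowTo")
    if page.get("targets_voice"):
        types.append("SpeakableSpecification")  # last, and only for voice
    return types


# A blog post with a named author and a real Q&A section.
blog_post = {"has_author_bio": True, "has_genuine_qa": True}
print(schema_types_for(blog_post))
```

Keeping the decision in one function makes it harder for a template to emit FAQ or HowTo markup on pages that do not actually contain that content.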

How does FAQ schema actually drive AI citations?

FAQ schema is the highest-leverage starting point because question-and-answer pairs map one-to-one with how retrieval-augmented systems chunk and index passages. When ChatGPT or Perplexity ingests a page, the FAQPage JSON-LD gives it pre-segmented Q&A blocks it can lift verbatim. No guessing where the answer starts. No surrounding fluff to parse around.

The catch: the questions in your schema must match the questions your audience actually asks. Stuffing FAQPage with keyword variants of "best schema for ChatGPT" is the fastest way to get the markup ignored. We pull our FAQ questions from People Also Ask, from sales-call transcripts, and from search-console queries that hit the page with no satisfying answer.

A working pattern, abbreviated:
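What follows is a minimal sketch rather than production markup. It builds the FAQPage object in Python so the same function can feed a page template, and it uses one question from this post's own FAQ as sample input. The function name faq_jsonld is an assumption made for illustration.

```python
import json


def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Sample input; these strings should be the visible on-page text.
faq = faq_jsonld([
    ("What schema do I need for AI search?",
     "FAQ, Article, Person, HowTo, Speakable, and Organization."),
])

# Serialized body for a <script type="application/ld+json"> tag.
faq_script = json.dumps(faq, indent=2)
print(faq_script)
```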

The answer text in the schema should mirror, not contradict, the visible answer on the page. Mismatch is a soft penalty in AI extraction because it signals manipulation. For the deeper pattern library, see our Schema Markup Cheat Sheet for SaaS Companies.

Article and Person schema: how AI verifies who wrote what

Article and Person schema solve a problem most marketers do not realize they have: when an LLM quotes your post, it needs to know who to credit. If your byline is a plain text string with no Person entity behind it, the AI either skips attribution or invents one. Both outcomes hurt you.

Article schema does three jobs. It tells AI engines the type of content (BlogPosting, NewsArticle, TechArticle), the publication date, and the canonical author. Person schema then makes that author a real entity with credentials, a job title, and ideally a sameAs link to LinkedIn or a verified profile. This is the spine of E-E-A-T for AI, and it is why pages with named, schema-backed authors get cited more often than pages with "Team" bylines.

Per SEOptimer reporting Schema App data, pages with schema receive 40% higher click-through rates than pages without. The author entity carries a disproportionate share of that lift because trust attribution is what gates citation, not just retrieval. We covered the underlying mechanics in Getting Cited by ChatGPT: 90 Days of Citation Tracking Data.

A note on what we do not do: we do not invent credentials. Person schema with "Expert" in the jobTitle and no verifiable profile behind it gets discounted fast. Use real titles, real sameAs URLs, real publications.
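The Article and Person pairing can be sketched as a BlogPosting whose author is a full Person entity instead of a bare string. Treat this as an assumption-laden example: the author name, job title, and LinkedIn URL are placeholders, and article_jsonld is a hypothetical helper.

```python
import json


def article_jsonld(headline, date_published, author):
    """Build a BlogPosting whose author is a Person entity, not plain text."""
    return {
        "@context": "https://schema.org",
        "@type": "BlogPosting",
        "headline": headline,
        "datePublished": date_published,  # ISO 8601 date string
        "author": {
            "@type": "Person",
            "name": author["name"],
            "jobTitle": author["job_title"],
            "sameAs": author["same_as"],  # verifiable profile URLs only
        },
    }


# Placeholder author record; replace with a real, verifiable person.
author = {
    "name": "Jane Doe",
    "job_title": "Head of SEO",
    "same_as": ["https://www.linkedin.com/in/jane-doe-example"],
}

article = article_jsonld(
    "Structured Data for AI Search: 6 Schema Types That Move the Needle",
    "2026-06-02",
    author,
)
print(json.dumps(article, indent=2))
```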

HowTo, Speakable, and Organization: the supporting trio

HowTo is straightforward when you have actual steps. If your post is a procedure (install this, configure that, run the audit), HowTo schema tells AI engines the sequence, the tools required, and the expected outcome. Voice assistants and AI Overviews extract HowTo blocks aggressively because they map cleanly to "how do I" queries.
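As a rough sketch of what that looks like, the helper below turns an ordered list of steps into a HowTo object with numbered HowToStep entries and a tool list. The helper name howto_jsonld and the step wording are assumptions made for illustration.

```python
import json


def howto_jsonld(name, tools, steps):
    """Build a schema.org HowTo from tool names and (step name, text) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "tool": [{"@type": "HowToTool", "name": tool} for tool in tools],
        "step": [
            {"@type": "HowToStep", "position": i, "name": step_name, "text": text}
            for i, (step_name, text) in enumerate(steps, start=1)
        ],
    }


# Illustrative procedure; the step text should match the visible page.
howto = howto_jsonld(
    "Validate schema for AI extraction",
    ["Schema.org validator", "curl"],
    [
        ("Validate the JSON-LD", "Check the markup against Schema.org's structure."),
        ("Fetch the raw HTML", "Confirm the schema is in the server response."),
        ("Prompt an LLM", "Ask which FAQ questions the page answers."),
    ],
)
print(json.dumps(howto, indent=2))
```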

Speakable schema is narrower. It marks the sections of a page best suited for text-to-speech rendering, which matters if you want exposure on Google Assistant or Alexa-class surfaces. For most B2B blogs, Speakable is a low-priority add. Worth it for high-traffic posts answering common voice queries, skippable for everything else. The realistic ROI question on voice is covered in our take on Voice Search SEO in 2026: Is It Still Worth It?
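For reference, a Speakable block is small. The sketch below attaches a SpeakableSpecification to a WebPage and points it at the sections to read aloud by CSS selector. The URL and the selectors are hypothetical and would need to match your own templates.

```python
import json


def speakable_jsonld(page_url, css_selectors):
    """Mark the parts of a page suited to text-to-speech via CSS selectors."""
    return {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "url": page_url,
        "speakable": {
            "@type": "SpeakableSpecification",
            "cssSelector": css_selectors,
        },
    }


# Placeholder URL and selectors for illustration only.
speakable = speakable_jsonld(
    "https://www.example.com/blog/structured-data-ai-search",
    [".post-summary", ".faq-answer"],
)
print(json.dumps(speakable, indent=2))
```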

Organization schema is the anchor. One block, sitewide, declaring who you are, what you do, your logo, your social profiles. It is what AI engines match against to confirm that the entity making the claim on this page is the same entity that appeared in five other citations last week. Skip it and your individual posts have to do entity-resolution work from scratch every time.
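One way to sketch the anchoring idea is to give the Organization block a stable @id and have each post reference it as publisher. Everything URL-shaped below is a placeholder on example.com, and with_publisher is a hypothetical helper, not part of any library.

```python
import json

# Hypothetical stable identifier for the sitewide entity.
ORG_ID = "https://www.example.com/#organization"

organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "@id": ORG_ID,
    "name": "Gravidy",
    "url": "https://www.example.com/",           # placeholder
    "logo": "https://www.example.com/logo.png",  # placeholder
    "sameAs": ["https://www.linkedin.com/company/example"],  # placeholder
}


def with_publisher(article):
    """Return a copy of an article object that points at the Organization."""
    return {**article, "publisher": {"@id": ORG_ID}}


post = with_publisher({
    "@context": "https://schema.org",
    "@type": "BlogPosting",
    "headline": "Example post",
})
print(json.dumps([organization, post], indent=2))
```

Referencing the entity by @id means every post resolves to the same Organization node instead of restating it with slightly different details.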

How do you validate schema for AI extraction, not just Google?

Google's Rich Results Test only checks whether your schema qualifies for visual SERP features. It says nothing about whether ChatGPT or Perplexity can parse it. For AI validation we run a three-step check.

First, validate the JSON-LD itself against Schema.org's vocabulary (use Schema.org's validator, not just Google's). Second, fetch the raw server response (curl, not a browser) and confirm the schema appears in the HTML as served, not injected by client-side JavaScript. AI crawlers frequently skip JS execution. Third, paste the rendered page into ChatGPT with a direct extraction prompt ("What FAQ questions does this page answer?") and see what comes back. If the model misses or paraphrases your marked-up questions, the markup is not doing its job.
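The second step can be automated with the standard library. The sketch below parses raw HTML (as returned by curl, with no JavaScript executed) and lists the @type of every JSON-LD block it finds. The class and function names are assumptions, and the sample HTML is a stand-in for a real response.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            # json.loads raises on malformed markup, which is itself a finding.
            parsed = json.loads("".join(self._buffer))
            self.blocks.extend(parsed if isinstance(parsed, list) else [parsed])
            self._in_jsonld = False


def schema_types_in_raw_html(html):
    """Return the @type of each top-level JSON-LD object in the raw HTML."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [block.get("@type") for block in parser.blocks]


raw_html = (
    '<html><head><script type="application/ld+json">'
    '{"@context": "https://schema.org", "@type": "FAQPage", "mainEntity": []}'
    "</script></head><body></body></html>"
)
print(schema_types_in_raw_html(raw_html))
```

An empty list for a page that shows schema in the browser's inspector is the signal that the markup is being injected client-side. This sketch does not unpack @graph containers, so a block that uses one will report a type of None.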

The most common failure we find in audits is schema present in the HTML but contradicted by the visible content. AI engines treat that as a manipulation signal and downweight the page in retrieval. The fix is boring: make the visible page match the schema, word for word in the critical fields.
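A simple version of that check can be scripted. The sketch below flags any FAQ question whose schema answer does not appear in the visible page text once whitespace and case are normalized. It assumes you have already extracted the visible text, and the function names are hypothetical.

```python
import re


def normalize(text):
    """Collapse whitespace and lowercase so cosmetic differences are ignored."""
    return re.sub(r"\s+", " ", text).strip().lower()


def mismatched_answers(faq_page, visible_text):
    """Return questions whose schema answer is missing from the visible text."""
    page = normalize(visible_text)
    return [
        item["name"]
        for item in faq_page.get("mainEntity", [])
        if normalize(item["acceptedAnswer"]["text"]) not in page
    ]


# Sample data: the second answer does not appear on the page.
faq_page = {
    "@type": "FAQPage",
    "mainEntity": [
        {"@type": "Question", "name": "What schema do I need for AI search?",
         "acceptedAnswer": {"@type": "Answer", "text": "Six types carry the weight."}},
        {"@type": "Question", "name": "Does FAQ schema help?",
         "acceptedAnswer": {"@type": "Answer", "text": "Always, on every page."}},
    ],
}
visible_text = "What schema do I need?\n  Six types   carry the weight."
print(mismatched_answers(faq_page, visible_text))
```

Exact substring matching is deliberately strict here, in line with the word-for-word rule for critical fields.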

Frequently Asked Questions

What schema do I need for AI search?

FAQ, Article, Person, HowTo, Speakable, and Organization. Ship Article and Organization sitewide first, then layer FAQ and Person onto individual posts. HowTo for procedural content. Speakable last.

What is the best schema for ChatGPT citations?

FAQ schema combined with Article and Person. FAQ gives ChatGPT pre-chunked Q&A pairs it can quote verbatim. Article identifies the content type and publication date. Person attaches the answer to a verifiable author entity ChatGPT can attribute.

Does FAQ schema actually help with AI Overviews?

Yes, when the questions match real user queries and the answers in the schema match the visible page. FAQ schema accelerates passage extraction for AI Overviews. Fake or stuffed FAQ blocks get ignored or penalized.

Will AI engines strip schema during tokenization?

Some testing suggests LLMs may discard schema markers during training tokenization. The retrieval and indexing layers still use it heavily at query time, which is where citation decisions happen. Ship the schema either way.

How do I track whether my schema is driving AI citations?

You need to measure citations directly, not infer from rankings. We cover the tracking stack in AI Search Tracking: How to Measure Citations Across ChatGPT/Perplexity/AI Overviews.

Most B2B sites we audit ship one schema type and call it done, missing the three or four additions that actually move citation rates. If you want a specific list of which fixes are draining your traffic, book a Free SEO Audit Call. Thirty minutes, exact findings, no slide decks.
