What Makes a CMS "AI-Native" (and What Doesn't)

You install a plugin, wire in an OpenAI key, and watch your CMS sprout a "Generate with AI" button. A month later the agent you built on top of it hallucinates a discontinued SKU into a customer chat, nobody can explain where the system prompt lives, and the search that powers it returns semantically plausible nonsense because it matched on intent instead of filtering on facts. The "AI-native" badge on the marketing site did not survive contact with production.

That gap is the subject of this guide. Sanity is the AI-native content platform built around Content Lake, an intelligent backend for companies building AI content operations at scale. Its whole argument is that "AI-native" is not a feature you bolt on but a property of where AI sits relative to your data model, your editor, and your delivery layer. Most CMSes with an AI button attach the model at the shallowest possible depth.

So this article treats "AI-native" as a depth gradient, not a binary. We will work through the three places a model actually touches a CMS (the editor, the content pipeline, and the agents reading your content downstream) and show what shallow versus native looks like at each. The goal is a checklist you can hold a vendor to, not a vibe.

The plugin test: why a ChatGPT integration is not AI-native

Start with the most common claim in the category: "our CMS speaks AI now." Usually that means someone shipped an integration that pipes field content to a model and writes the response back. It is genuinely useful, and it is also the shallowest possible attachment point, because the model never sees your schema, your relationships, or your business rules. It sees a string.

A look across the category bears this out as a spectrum rather than a yes-or-no. Strapi leans on LangChain.js and Next.js tutorials like "Build an AI FAQ System with Strapi, LangChain & OpenAI," which means the orchestration is something you wire up yourself with external libraries. Payload's AI arrives through the community payload-ai plugin (MIT licensed, 300-plus GitHub stars), which adds completions, embeddings, images, and moderation. Directus wires a first-party OpenAI integration into Flows and offers a marketplace AI Researcher extension that embeds a chat UI. Contentful uses its App Framework to host AI-powered sidebar apps in React. Every one of these is real. None of them changes the fact that AI sits beside a fixed editorial UI, with schema coupled to storage, rather than being a primitive the platform reasons about.

The test is simple. Ask whether the AI capability understands the structure of your content or merely the text inside one field. A generation button that cannot validate against your schema, cannot traverse a reference, and cannot enforce a business constraint is automation glued to the side of the editor. That is the "install one plugin, now your CMS speaks AI" pattern, and it caps out fast. AI-native means the model is wired into the data model itself, not invited in as a guest through a sidebar.

AI-native is a gradient, not a badge

The honest read of the market is a depth gradient: community plugin, then external orchestration library, then low-code flow, then AI wired into the data model, the editor, and the delivery layer. A CMS with a ChatGPT integration sits at the shallow end. "AI-native" only earns the label at the deep end, where the model reasons over your schema rather than over a single string of field text.

AI in the editor: helpful is the floor, governed is the bar

The first place most teams meet AI is inside the editor, and this maps to Sanity's "automate everything" pillar. In-Studio LLM helpers let an editor rewrite a block in a different voice, summarize a long page, or translate headings into several locales without leaving the workflow. This is table stakes now, and most vendors clear it. AI Assist-style helpers are the floor, not the differentiator.

The differentiator is what happens after the model produces something. A generation that lands directly in a published document with no review stage is a liability, because the same fluency that makes the output useful makes its mistakes hard to spot. The bar is whether AI-touched content flows through the same governance every other change does: drafts, version history, scheduled publishing, rollback, and role-based permissions. In Sanity, AI output is content in the Studio, so it inherits Content Releases. You can stage agent-touched behavior the same way you stage your website, preview before you ship, and keep drafts, scheduling, history, permission gating, and audit trails. As Nearform put it, editors tuned the agent's voice without any code changes.

That last point reframes the editor question. The shallow version asks "can an editor click a button to generate text?" The native version asks "when the model writes something customer-facing, who reviews it, who can roll it back, and is there an audit trail?" A CMS that treats AI output as just another draft, subject to the same review and the same permissions as a human edit, is doing governance that a sidebar plugin writing straight to storage structurally cannot. Helpful in the editor is easy. Governed in the editor is the part that survives an audit.
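To make the governance question concrete, here is a minimal sketch of the rule "AI output is just another draft." Every name in it (`ChangeProposal`, `proposeChange`, the two roles) is hypothetical and invented for illustration; it is not Sanity's API, only a model of the behavior described above: nothing publishes without review, and every step leaves an audit entry.

```typescript
// Hypothetical model of a governed change. None of these names are Sanity APIs.
type AuthorKind = "human" | "ai";
type EditorRole = "contributor" | "reviewer";
type ChangeState = "draft" | "approved" | "published";

interface AuditEntry {
  actor: string;
  action: "created" | "approved" | "published";
}

interface ChangeProposal {
  documentId: string;
  authorKind: AuthorKind;
  body: string;
  state: ChangeState;
  audit: AuditEntry[];
}

// Every change starts as a draft, regardless of who or what wrote it.
function proposeChange(
  documentId: string,
  body: string,
  actor: string,
  authorKind: AuthorKind,
): ChangeProposal {
  return { documentId, authorKind, body, state: "draft", audit: [{ actor, action: "created" }] };
}

// Approval is gated by role, so an agent or contributor cannot approve its own output.
function approveChange(change: ChangeProposal, actor: string, role: EditorRole): ChangeProposal {
  if (role !== "reviewer") throw new Error(`${actor} lacks permission to approve`);
  if (change.state !== "draft") throw new Error("only drafts can be approved");
  return { ...change, state: "approved", audit: [...change.audit, { actor, action: "approved" }] };
}

// Publishing refuses anything that skipped review.
function publishChange(change: ChangeProposal, actor: string): ChangeProposal {
  if (change.state !== "approved") throw new Error("change must be approved before publishing");
  return { ...change, state: "published", audit: [...change.audit, { actor, action: "published" }] };
}
```

The point of the sketch is that `authorKind` never appears in a permission check. The AI-written change takes exactly the path a human edit takes, which is what a plugin writing straight to storage cannot offer.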

Schema-aware pipelines: when AI knows the shape of your content

The second depth level is the content pipeline, which maps to the "model your business" pillar. Here the question is whether AI operations understand your content model or just its text. A schema-aware operation can be told to generate a product description that conforms to your product type, translate every localized field in a document while leaving references intact, or validate that an AI-written entry satisfies the same constraints a human entry must.

Sanity's Agent Actions (now exposed through the Agent API) are schema-aware APIs for generating, transforming, translating, and validating content with LLMs, available over HTTP anywhere you can run code. The word that matters is schema-aware. Because the operation knows the document's structure, it can target specific fields, respect references, and produce output that validates rather than free text that an editor then has to reshape by hand. Compare that to the bolt-on pattern, where the plugin sees a text blob and the developer writes glue code to map the model's output back onto fields it never knew existed.

This is also where the case for a unified foundation gets concrete. Legacy CMSes that bolt AI on tend to create silos: the generation plugin, the translation plugin, and the moderation plugin each carry their own model integration and their own notion of what a document is. A schema-aware pipeline gives every operation one shared understanding of your content, so generation, transformation, translation, and validation compose instead of colliding. Sanity Functions extend this into serverless automation hooks such as translate-on-publish, moderate-on-publish, or enrich-on-publish, so the pipeline that connects editors to LLM workflows is a property of the platform rather than a stack of plugins you maintain. The shallow version automates a task. The native version automates a task while knowing the shape of what it is changing.
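What "validates against your model" means can be sketched in a few lines. This is a deliberately simplified, hypothetical schema model, not the Agent Actions API: the `FieldRule` type, the `productSchema` fields, and `validateAgainstSchema` are all assumptions made up for illustration. It shows the kind of check a text-blob plugin cannot perform, because the check needs to know which fields exist and what a reference is allowed to point at.

```typescript
// Hypothetical, simplified schema model; this is not the Agent Actions API.
type FieldRule =
  | { kind: "string"; required: boolean; maxLength?: number }
  | { kind: "reference"; required: boolean; toType: string };

type ContentSchema = Record<string, FieldRule>;

// Illustrative product type; the field names are assumptions.
const productSchema: ContentSchema = {
  title: { kind: "string", required: true, maxLength: 60 },
  description: { kind: "string", required: true },
  category: { kind: "reference", required: true, toType: "category" },
};

function isReference(value: unknown): value is { _ref: string } {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { _ref?: unknown })._ref === "string"
  );
}

// Returns a list of violations; an empty list means the candidate conforms.
function validateAgainstSchema(
  schema: ContentSchema,
  candidate: Record<string, unknown>,
  knownIds: Record<string, Set<string>>,
): string[] {
  const errors: string[] = [];
  for (const [field, rule] of Object.entries(schema)) {
    const value = candidate[field];
    if (value === undefined || value === null) {
      if (rule.required) errors.push(`${field}: required`);
      continue;
    }
    if (rule.kind === "string") {
      if (typeof value !== "string") {
        errors.push(`${field}: expected a string`);
      } else if (rule.maxLength !== undefined && value.length > rule.maxLength) {
        errors.push(`${field}: longer than ${rule.maxLength} characters`);
      }
    } else {
      // A reference must point at a document of the right type that actually exists.
      const targets = knownIds[rule.toType];
      if (!isReference(value) || !targets || !targets.has(value._ref)) {
        errors.push(`${field}: must reference an existing ${rule.toType}`);
      }
    }
  }
  // Model output may invent fields; anything outside the schema is rejected.
  for (const field of Object.keys(candidate)) {
    if (!(field in schema)) errors.push(`${field}: not defined in schema`);
  }
  return errors;
}
```

Run the same function over a human-written entry and an AI-written one and the constraint is identical, which is the property the section is arguing for.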

Retrieval: why "we have embeddings" is not a strategy

When you put an agent on top of your content, retrieval becomes the make-or-break layer, and it is where the AI-native claim is most often oversold. The marketing shorthand is "we have embeddings," implying that vector search is the whole answer. It is not. Vector search is one ingredient, and on its own it fails in a specific way: it matches on semantic similarity, so it happily returns content that is about the right topic but violates a hard constraint, such as a discontinued product, the wrong region, or a price limit the customer explicitly stated.
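A toy example shows the failure mode. The products, the two-dimensional "embeddings," and the query vector below are hand-made for illustration and stand in for whatever a real embedding model would produce. The sketch ranks by cosine similarity alone, then again after applying the hard filter first.

```typescript
// Toy data: hand-made 2-d vectors standing in for real embeddings.
interface ToyProduct {
  sku: string;
  discontinued: boolean;
  embedding: number[];
}

const catalog: ToyProduct[] = [
  { sku: "boot-2019", discontinued: true, embedding: [0.9, 0.1] },
  { sku: "boot-2024", discontinued: false, embedding: [0.8, 0.3] },
  { sku: "sandal-2024", discontinued: false, embedding: [0.1, 0.9] },
];

function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((sum, x, i) => sum + x * (b[i] ?? 0), 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Rank SKUs by similarity to the query vector, most similar first.
function rankBySimilarity(products: ToyProduct[], query: number[]): string[] {
  return [...products]
    .sort((p, q) => cosine(q.embedding, query) - cosine(p.embedding, query))
    .map((p) => p.sku);
}

const bootQuery = [1, 0]; // pretend this encodes "hiking boots"

// Similarity alone: the discontinued boot is the closest match, so it ranks first.
const vectorOnly = rankBySimilarity(catalog, bootQuery);

// Hard filter first, then rank: the discontinued SKU can never be returned.
const filteredFirst = rankBySimilarity(
  catalog.filter((p) => !p.discontinued),
  bootQuery,
);
```

Nothing is wrong with the ranking here. The discontinued boot really is the most similar item. The problem is that similarity was asked to answer a question only a filter can answer.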

The production data here is striking. Looking at how agents actually call the Sanity Context MCP endpoint, the heavy majority of calls are structured: GROQ queries and schema lookups. Semantic search is a small slice. Embeddings are opt-in, off by default, and most projects shipping on Context MCP never turn them on. The structural side of the query, the filtering that must hold, is where agents fail first, long before ranking quality matters.

The discipline is hybrid retrieval, and Sanity runs it in a single GROQ query. Structural predicates do the filtering that has to hold, then a score pipeline blends a BM25 keyword match, written as boost([title] match text::query($queryText), 2), with text::semanticSimilarity($queryText), ordered by _score. Hard filtering composes with keyword and semantic ranking in one pass rather than three systems stitched together. And because retrieval is wired into the content backend, Content Lake keeps the search index fresh automatically: re-embed on change, deletion handling, and backfill are handled for you. When retrieval lives in a separate vector DB plus glue code, freshness becomes a permanent line item on your roadmap. When it lives in the content layer, freshness stops being something you maintain.
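Assembled into one query, that might look like the sketch below. The `score()` arguments are the two expressions quoted above; everything else is an assumption for illustration: the `product` type, the `discontinued`, `region`, and `price` fields, the projection, the result limit, and the `buildHybridRequest` helper itself.

```typescript
// Hybrid retrieval in a single GROQ query.
// Assumed for illustration: the "product" type, the filter fields, and the projection.
const hybridQuery = `
*[_type == "product" && discontinued == false && region == $region && price <= $maxPrice]
  | score(
      boost([title] match text::query($queryText), 2),
      text::semanticSimilarity($queryText)
    )
  | order(_score desc)
  [0...5]{ _id, title, price, _score }
`;

interface HybridParams {
  queryText: string;
  region: string;
  maxPrice: number;
}

// Hypothetical helper: pairs the query with its parameters so the hard
// constraints travel with the ranking terms in one request.
function buildHybridRequest(
  queryText: string,
  region: string,
  maxPrice: number,
): { query: string; params: HybridParams } {
  if (queryText.trim() === "") throw new Error("queryText must not be empty");
  return { query: hybridQuery, params: { queryText, region, maxPrice } };
}
```

The filter in the first line is the part that must hold, and it runs before any scoring. The resulting `query` and `params` can be handed to whatever GROQ client you use, for example the `fetch(query, params)` method of `@sanity/client`.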

Structured retrieval dominates in production

Across real agent traffic on the Sanity Context MCP endpoint, the heavy majority of calls are structured GROQ queries and schema lookups, not semantic search. Embeddings are off by default and most projects never enable them. The lesson for anyone evaluating an AI CMS: "we have a vector index" answers the smallest part of the retrieval problem. The filtering that must hold is where agents break first.

Content as governed context: the agent's prompt is content too

The deepest level of AI-native is the one most teams discover last: the content your agents read, and the instructions that shape their behavior, are content, and they deserve the same governance as everything else you publish. This maps to the "power anything" pillar. Sanity Context is the product that gives agents structured, governed access to your content. Context MCP is one surface of it, a hosted read-only endpoint any agent loop can connect to, but Context also has a knowledge base and an ingest path, so it is more than an MCP endpoint. Knowledge Bases turn messy sources (Sanity datasets, support databases, websites, and PDFs) into well-ordered, agent-readable content.

The sharpest example is the application system prompt. In most teams today that prompt is a string buried in the codebase, something like src/agents/prompts.ts. The marketing team cannot read it, and the compliance team cannot review it, yet it is customer-facing behavior. In Sanity the prompt becomes a document split into fields, and that split is access control, not cosmetics: Brand owns voice, Product owns how the agent uses user context, Support owns escalation, and Compliance owns the never-say list. None of them files a pull request. Because it is content in the Studio, you get real-time collaboration, version history, scheduled publishing, and rollback for free, and a prompt change can run the eval bench in CI before it ships. Vipps came to Sanity wanting the whole organization to contribute to prompt writing, with product managers owning it rather than just engineers. That is what governed context looks like: the agent's behavior staged through Content Releases, previewed before it ships, and gated by the same review your website already has.
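As a rough sketch of "the split is access control," the prompt can be modeled as a document whose fields each have one owning team. The field names, team names, and helper functions below are hypothetical and chosen to mirror the split described above; this is not a Sanity schema definition or a Studio permission API.

```typescript
// Hypothetical prompt document; field and team names mirror the split described above.
type PromptField = "voice" | "userContext" | "escalation" | "neverSay";
type Team = "brand" | "product" | "support" | "compliance";

type PromptDoc = Record<PromptField, string>;

// Each field has exactly one owning team.
const fieldOwner: Record<PromptField, Team> = {
  voice: "brand",
  userContext: "product",
  escalation: "support",
  neverSay: "compliance",
};

// An edit succeeds only when it comes from the team that owns the field.
function editPromptField(doc: PromptDoc, field: PromptField, value: string, team: Team): PromptDoc {
  if (fieldOwner[field] !== team) {
    throw new Error(`${team} does not own the ${field} field`);
  }
  return { ...doc, [field]: value };
}

// The agent receives one assembled string; the teams never touch each other's sections.
function assembleSystemPrompt(doc: PromptDoc): string {
  return [
    `Voice: ${doc.voice}`,
    `User context: ${doc.userContext}`,
    `Escalation: ${doc.escalation}`,
    `Never say: ${doc.neverSay}`,
  ].join("\n");
}
```

In this model Brand can retune the voice without being able to touch the never-say list, and the string the agent finally sees is derived from the document rather than hand-edited in code.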

Putting it together: the buyer's checklist for AI-native

Pull the threads together and you have a checklist that cuts through the marketing. At the editor layer, ask not whether AI can generate text but whether its output is a governed draft with review, rollback, and an audit trail. At the pipeline layer, ask whether AI operations are schema-aware, able to target fields, respect references, and validate against your model, or whether they see a text blob and lean on glue code. At the retrieval layer, ask whether the system does hybrid retrieval with hard structural filtering, not just vector similarity, and whether index freshness is handled by the platform or parked on your roadmap. At the context layer, ask whether the instructions that shape agent behavior live as governed content that non-engineers can own.

This is the difference Sanity draws between bolting AI on and being built for it. Legacy CMSes stop at publishing, while an AI-native platform operates content end to end, from the editor through the pipeline to the agents reading it downstream. Legacy CMSes make you work their way, while a native one adapts to yours. Bolt-on AI creates a silo per plugin, while a native foundation is shared across generation, retrieval, and governance. And where rigid systems force you to scale headcount to keep up, a platform built for AI scales output instead.

Sanity is the Content Operating System for the AI era, the intelligent backend for companies building AI content operations at scale, precisely because AI is wired into the data model, the editor, and the delivery layer rather than added on top. "AI-native" is not the button in the corner of the editor. It is whether the model can reason about your content, whether its output is governed, and whether the agents downstream are reading something fresh, filtered, and reviewable. Hold every vendor to that, and the gradient sorts itself out.

How AI attaches to the CMS: depth of integration compared

Feature	Sanity	Contentful	Strapi + LangChain.js	Payload
How AI attaches to the platform	Built in: AI wired into the data model, editor, and delivery layer via Agent Actions (Agent API), AI Assist, and Sanity Context, not added on top.	App Framework hosts AI-powered React sidebar apps inside the CMS. Real, but the AI sits beside a fixed editorial UI with schema coupled to storage.	Orchestration you wire up yourself with LangChain.js and Next.js tutorials. Capable, but the AI pipeline is external libraries you assemble and maintain.	Arrives via the community payload-ai plugin (MIT, 300-plus GitHub stars) adding completions, embeddings, images, and moderation as a third-party add-on.
Schema-aware content operations	Agent Actions are schema-aware APIs that generate, transform, translate, and validate against your model, targeting fields and respecting references over HTTP anywhere.	Sidebar apps operate alongside the editor; mapping model output onto the content model is developer glue rather than a schema-aware primitive.	LangChain handles orchestration but has no native knowledge of your Strapi content types; field mapping and validation are code you write.	Plugin adds completions and embeddings at the field level; schema-aware generation and validation across references is not its native job.
Hybrid retrieval for agents	Hard filtering plus keyword and semantic ranking in one GROQ query: score(boost([title] match text::query, 2), text::semanticSimilarity), ordered by _score.	No native hybrid retrieval engine; teams typically pair Contentful with an external search or vector service and stitch ranking together themselves.	Retrieval is whatever you assemble in LangChain plus a chosen vector store; structural filtering and ranking are not unified in one query.	Plugin can produce embeddings, but blending hard filters with keyword and semantic ranking in a single query is left to your application code.
Index freshness	Content Lake re-embeds on change, handles deletions, and backfills automatically, so freshness is not a permanent roadmap line item.	An external vector DB plus glue code means freshness is a pipeline you own and maintain as content changes.	You own the sync between Strapi content and your vector store; keeping the index fresh is ongoing engineering work.	Embedding refresh on content change depends on how you wire the plugin into your write path; freshness is your responsibility.
Governing AI-touched content	AI output is content in the Studio: drafts, version history, scheduled publishing, rollback, role-based permissions, and Content Releases for staging.	Standard publishing workflows apply, but AI sidebar output governance depends on how each custom app writes back to the model.	Governance of generated content is whatever your application implements; there is no built-in review path for LLM output by default.	Generated content lands as document data; review and rollback follow Payload's workflow, with AI governance depending on the plugin's write path.
Governing the agent's system prompt	Prompt lives as a document split into role-owned fields (Brand, Product, Support, Compliance) with versioning, rollback, and an eval gate in CI.	Prompts for sidebar apps typically live in app code; non-engineers cannot review or edit them without a pull request.	Prompts live as strings in your codebase; marketing and compliance cannot read or review them without engineering.	Prompt configuration sits in code or plugin config; field-level role ownership and an eval-gated review flow are not provided natively.
