Top 5 AI CMS Workflows for E-Commerce Catalogs

A product manager pastes a new supplier feed into the catalog: 4,000 SKUs, half of them with one-line descriptions, none translated, none tagged for the on-site search that customers actually use. Three weeks later the same products are still returning empty results for "trail runners under $150 like a Hoka," because the search index never learned what they are. This is the daily reality of running an ecommerce catalog, and it is the failure mode that "AI CMS" pitches keep waving at without solving.

Sanity is the Content Operating System for the AI era: an intelligent backend designed to model a catalog, automate the work of enriching and translating it, and power the search and shopping agents that sell from it. That is the lens this article takes: not AI as a sidebar widget, but AI wired into the data model, the editor, and the delivery layer.

Below are the five catalog workflows that separate an AI-native content platform from a CMS with a chatbot bolted on, ranked by how much leverage they give a team that ships catalog content at scale. Each is mapped to one of Sanity's three pillars: model your business, automate everything, and power anything.

1. Hybrid catalog search that blends filters, keywords, and intent

The highest-leverage workflow is the one customers touch every session: search. Pure keyword search misses intent (a shopper typing "like a Hoka" wants a vibe, not a literal match), and pure vector search produces the dreaded empty-result problem because it has no way to honor a hard constraint like in-stock or under $150. The fix is hybrid retrieval, which blends three signals: structured predicates that must hold, BM25 keyword matching, and semantic similarity. Anthropic's contextual retrieval research measured the payoff directly: contextual embeddings cut top-20 retrieval failures by 35%, adding contextual BM25 took that to 49%, and adding reranking reached 67%. None of the three layers alone was enough.

In Sanity, this is one GROQ query rather than three systems stitched together. The query "trail runners under $150 like a Hoka" decomposes into a structured predicate (_type == "product" && category == $category && price < $maxPrice && stockLocation == $warehouse), then a | score() pipeline that blends boost([title] match text::query($queryText), 2) with text::semanticSimilarity($queryText), then | order(_score desc)[0...10]. The predicates do the filtering that has to hold; the score pipeline ranks for relevance and intent. The boost of 2 doubles the weight of a keyword hit in the title, on the premise that a term appearing in the product title is a stronger relevance signal than general semantic closeness.
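Assembled into one piece, the fragments above might look like the following sketch. The GROQ string only joins the pieces quoted in the paragraph; the parameter values and the missingParams helper are hypothetical additions for illustration, and no client call is shown.

```typescript
// Parameters referenced by the query. The names come from the fragments
// quoted above; the example values further down are made up.
interface HybridSearchParams {
  category: string;
  maxPrice: number;
  warehouse: string;
  queryText: string;
}

// Hard constraints filter first, then keyword and semantic signals are
// blended into _score, then the top ten results are returned.
const HYBRID_PRODUCT_QUERY = [
  '*[_type == "product" && category == $category && price < $maxPrice && stockLocation == $warehouse]',
  "| score(boost([title] match text::query($queryText), 2), text::semanticSimilarity($queryText))",
  "| order(_score desc)[0...10]",
].join("\n");

// Hypothetical guard: list every $param the query references that the
// caller has not supplied, so a missing filter fails before the request.
function missingParams(query: string, params: object): string[] {
  const referenced: string[] = query.match(/\$[A-Za-z_]\w*/g) || [];
  const unique = referenced.filter((name, i) => referenced.indexOf(name) === i);
  return unique.map((name) => name.slice(1)).filter((name) => !(name in params));
}

// Example values for "trail runners under $150 like a Hoka" (illustrative).
const exampleParams: HybridSearchParams = {
  category: "trail-running",
  maxPrice: 150,
  warehouse: "warehouse-eu-1",
  queryText: "trail runners like a Hoka",
};
```

Keeping the price and stock constraints in the filter rather than in the score means a semantically close but out-of-budget shoe can never outrank a valid one.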

This maps to the power anything pillar. A merchandiser building this on a bolt-on vector database owns incremental indexing, re-embedding on change, and deletion handling forever. With Content Lake, retrieval is wired into the backend, so the index already knows when a price changes or a SKU is deleted.

2. Schema-aware catalog enrichment with Agent Actions

The second workflow attacks the supplier-feed problem head-on: turning thin, inconsistent product records into complete, structured catalog entries. The naive approach is to ask an LLM to "write a better description," get back a wall of prose, and then spend an afternoon pulling fields out of it by hand. That does not scale to 4,000 SKUs, and it quietly corrupts the data model because the output never validates against your schema.

Agent Actions are schema-aware APIs for generating, transforming, and translating content with LLMs, exposed over HTTP anywhere you can run code. The distinguishing detail is that outputs validate against the Studio schema and fit your existing content model: an action asked to enrich a product returns a product shaped exactly like your product type, with the right fields populated, not a paragraph someone has to reverse-engineer. The Sanity blog demonstrated exactly this, enriching an ecommerce product catalog with AI using schema-aware outputs, and the docs show combining Agent Actions with Functions and Content Releases for automated, schema-driven pipelines.

This is the automate everything pillar in its purest form. A Function can fire enrich-on-publish, an Agent Action fills the missing summary, materials, and care fields, and Content Releases stage the batch so a human reviews before it goes live. Legacy CMSes stop at publishing; Sanity operates content end to end. Where a competitor would force you to scale headcount to enrich a growing catalog, this scales output instead, with the same small team.
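One way to picture the review-safe part of that pipeline is a merge step that accepts generated values only for fields that are actually missing. The sketch below is not the Agent Actions API; the Product shape, the field list, and both helper functions are hypothetical, and the candidate object stands in for whatever a generation step returns.

```typescript
// Hypothetical product shape with the three fields named in the text.
interface Product {
  _id: string;
  title: string;
  summary?: string;
  materials?: string[];
  care?: string;
}

const ENRICHABLE_FIELDS = ["summary", "materials", "care"] as const;
type EnrichableField = (typeof ENRICHABLE_FIELDS)[number];

// A field counts as missing when it is absent, blank, or an empty list.
function missingFields(product: Product): EnrichableField[] {
  return ENRICHABLE_FIELDS.filter((field) => {
    const value = product[field];
    return (
      value === undefined ||
      (typeof value === "string" ? value.trim() === "" : value.length === 0)
    );
  });
}

// Merge a generated candidate into the product: only missing fields are
// filled, values must have the expected type, unknown keys are ignored.
function applyEnrichment(product: Product, candidate: Record<string, unknown>): Product {
  const patch: Partial<Product> = {};
  for (const field of missingFields(product)) {
    const value = candidate[field];
    if (field === "materials") {
      if (Array.isArray(value) && value.every((v) => typeof v === "string")) {
        patch.materials = value as string[];
      }
    } else if (typeof value === "string" && value.trim() !== "") {
      patch[field] = value;
    }
  }
  return { ...product, ...patch };
}
```

The design choice here is that human-authored values always win: a generated care instruction never overwrites one an editor already wrote, which keeps the staged batch easy to review.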

3. Automated translation and localization across the catalog

The third workflow is the one that blocks international launches: localizing the entire catalog into every market you sell in. Doing it by hand means a translation vendor, a spreadsheet round-trip, and a multi-week lag during which the localized catalog drifts out of sync with the source. Doing it with a generic translate button means losing the structure (the bolded ingredient, the linked size guide, the annotated callout) that makes a product description useful, because flat-string translation flattens rich text into mush.

The structural answer is Portable Text, Sanity's structured rich-text format. Because annotations, marks, and blocks are preserved as data rather than baked into HTML, an LLM translation step can rewrite the words while keeping the structure intact across chunking, retrieval, and generation. Agent Actions handle the translation itself as a schema-aware transform, so the French product still has the same field shape, the same linked references, and the same validation guarantees as the English one.
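To make the "rewrite the words, keep the structure" idea concrete, here is a minimal sketch over a simplified Portable Text block. The two helpers are hypothetical: one pulls out only the span text for a translation step, the other writes the translated strings back so marks, keys, and link annotations are untouched. The translation call itself is left out.

```typescript
// Simplified Portable Text shapes: a block holds spans, and spans point at
// annotations (such as links) through the keys listed in markDefs.
interface Span {
  _type: "span";
  _key: string;
  text: string;
  marks: string[];
}

interface MarkDef {
  _key: string;
  _type: string;
  href?: string;
}

interface Block {
  _type: "block";
  _key: string;
  style: string;
  markDefs: MarkDef[];
  children: Span[];
}

// Collect only the human-readable text, in document order.
function extractTexts(blocks: Block[]): string[] {
  const texts: string[] = [];
  for (const block of blocks) {
    for (const span of block.children) {
      texts.push(span.text);
    }
  }
  return texts;
}

// Write translated strings back in the same order. Everything except
// span.text is copied as-is, so structure survives the round trip.
function applyTexts(blocks: Block[], translated: string[]): Block[] {
  if (translated.length !== extractTexts(blocks).length) {
    throw new Error("Translated text count does not match span count");
  }
  let index = 0;
  return blocks.map((block) => ({
    ...block,
    children: block.children.map((span) => ({ ...span, text: translated[index++] })),
  }));
}
```

Translating span by span can lose sentence-level context, so a real pipeline would likely pass the surrounding block to the model as context; the point of the sketch is only that the structure never has to leave the data layer.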

Wired through Functions, this becomes translate-on-publish: an editor approves the English source, the pipeline produces eight locales, and Content Releases schedule them to go live together. This maps to model your business and automate everything at once. The model captures locale as structure; the automation produces the variants. Crucially, the per-locale system prompt and glossary live in the Studio as governed content, so Brand and regional teams shape the voice without filing a pull request or waiting for a deploy.

4. A governed AI shopping assistant grounded in real catalog data

The fourth workflow is the customer-facing agent: a shopping assistant that answers "which of these waterproof jackets packs down smallest?" The failure mode here is notorious. An assistant that is not grounded in your actual catalog hallucinates specs, invents prices, and recommends discontinued SKUs, which is a brand and compliance problem, not just a quality one.

Sanity Context is the product that gives agents structured, governed access to your content. Context MCP is one surface of it, a hosted read-only endpoint any agent loop can connect to, but Context also has a knowledge base and an ingest path, so it is more than an MCP server. The Sanity Context for ecommerce solution gives a shopping assistant structured access to the product catalog, inventory, and pricing, so it answers from real data rather than hallucinated specs. A second principle matters as much: tools should return structured, schema-shaped data, not prose. As the team puts it, paraphrasing is where facts go to die. An agent asked for three products should return three product objects, not a paragraph; the emerging MCP UI spec and Vercel AI SDK generative-UI primitives even let a tool return a rendered product card or comparison table that the client displays natively.
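As a rough illustration of "product objects, not a paragraph", the sketch below answers the packs-down-smallest question with a plain function over an in-memory list. It is not the Context MCP interface; the ProductResult fields and the function name are hypothetical stand-ins for whatever a real tool would read from the catalog.

```typescript
// Hypothetical result shape a tool could hand back to the agent loop.
interface ProductResult {
  sku: string;
  title: string;
  price: number;
  inStock: boolean;
  packedVolumeLitres: number;
}

// Return the in-stock products that pack down smallest, as data.
// filter() creates a new array, so sorting does not mutate the input.
function smallestPacking(catalog: ProductResult[], limit: number): ProductResult[] {
  return catalog
    .filter((product) => product.inStock)
    .sort((a, b) => a.packedVolumeLitres - b.packedVolumeLitres)
    .slice(0, limit);
}
```

Because the tool returns objects, the price and SKU the customer sees are the stored values, and the model is left to phrase the comparison rather than to recall the facts.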

This is power anything with a governance spine. Because the catalog is the single shared foundation, the assistant, the website, and the search index all read the same source of truth rather than three drifting copies.

5. Governing the assistant's prompt as versioned, reviewable content

The fifth workflow is the one teams discover only after an incident: governing the shopping assistant's system prompt. That prompt is customer-facing behavior. It decides the voice, what the agent may say about a product, when to escalate to a human, and what it must never claim. Buried as a string in src/agents/prompts.ts, it is invisible to the people accountable for it and changeable only by an engineer pushing a deploy.

Stored as a Sanity document split into fields, the prompt becomes a matter of access control rather than a code change. Brand owns voice. Product owns how the agent uses user context. Support owns escalation. Compliance owns the never-say list. None of them files a pull request, none waits for a deploy, and the fields stitch together into one final system prompt at runtime. Because it is content in the Studio, you get real-time collaboration, version history, scheduled publishing, and rollback for free, and evals act as the gate: a prompt change runs the eval bench in CI before it can ship. Vipps came to Sanity wanting exactly this, the whole organization contributing to prompt writing and product managers owning it, not just engineers.
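The stitching step can be sketched in a few lines. The document shape below mirrors the four owners named above, but the field names, section headings, and the buildSystemPrompt function are hypothetical; fetching the document from the Studio is out of scope here.

```typescript
// Hypothetical prompt document: one field per owning team.
interface AgentPromptDoc {
  voice: string; // owned by Brand
  userContext: string; // owned by Product
  escalation: string; // owned by Support
  neverSay: string[]; // owned by Compliance
}

// Assemble the final system prompt at runtime. An empty field fails
// loudly instead of silently shipping a prompt with a missing section.
function buildSystemPrompt(doc: AgentPromptDoc): string {
  const sections: Array<[string, string]> = [
    ["Voice", doc.voice],
    ["Use of user context", doc.userContext],
    ["Escalation", doc.escalation],
    ["Never say", doc.neverSay.map((item) => `- ${item}`).join("\n")],
  ];
  const empty = sections
    .filter(([, body]) => body.trim() === "")
    .map(([heading]) => heading);
  if (empty.length > 0) {
    throw new Error(`Prompt fields missing: ${empty.join(", ")}`);
  }
  return sections
    .map(([heading, body]) => `## ${heading}\n${body.trim()}`)
    .join("\n\n");
}
```

A fixed section order keeps the assembled prompt stable between edits, which makes eval results comparable from one prompt version to the next.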

This closes the loop on model your business. CMSes that bolt on AI leave the most consequential AI artifact, the prompt, outside the content system entirely. Treating it as governed content is what scales output safely instead of scaling the on-call engineer's risk.

How catalog AI workflows compare across platforms

| Feature | Sanity | Contentful | Strapi + LangChain.js | Pinecone + metadata filters |
| --- | --- | --- | --- | --- |
| Hybrid catalog search | Native: structured predicates, BM25 match(), and text::semanticSimilarity() blended in one GROQ score() pipeline, ordered by _score. | Possible via App Framework and an external search vendor; the blend of predicates, keyword, and vector is yours to wire and maintain. | Achievable with LangChain.js plus a vector store; orchestration and ranking logic are tutorial-driven and built by you. | Strong vector search with metadata filters, but BM25 keyword blending and the content pipeline are separate systems you assemble. |
| Schema-aware enrichment | Agent Actions generate and transform content whose outputs validate against the Studio schema, so an enriched product fits the existing model. | AI sidebar apps via the App Framework can call an LLM, but outputs are not natively validated against the content model. | No first-party schema-aware generation primitive; you prompt the LLM and parse fields back into the model yourself. | Not a CMS; it stores vectors and metadata and has no concept of your content schema or generation. |
| Structured translation | Agent Actions translate as schema-aware transforms; Portable Text preserves marks, blocks, and annotations across locales. | Localization is supported; AI translation depends on add-on apps and does not natively preserve structured rich text through the LLM step. | Translation is a custom LangChain.js chain you build; structure preservation is your responsibility. | No translation capability; it is a vector index, not a content layer. |
| Index freshness on change | Content Lake keeps retrieval fresh automatically; price changes, edits, and deletions update because search is wired into the backend. | External search index must be kept fresh via webhooks and glue code you own. | Incremental indexing, re-embedding, and deletion handling are a project you build and maintain. | Re-embedding on change, deletion handling, and backfill are permanent roadmap line items for your team. |
| Grounded shopping assistant | Sanity Context gives agents structured, governed access to catalog, inventory, and pricing; Context MCP is a hosted read-only endpoint. | Third-party chatbot add-ons can read content via API, but grounding and structured tool outputs are integration work. | RAG and chat are well-documented with LangChain.js, but grounding, governance, and structured returns are yours to engineer. | Provides the retrieval substrate for grounding, but governance, structured tool outputs, and the agent loop sit outside it. |
| Prompt governed as content | System prompt stored as a Studio document split into role, voice, and never-say fields with version history, rollback, and an eval gate in CI. | No native pattern for governing an agent prompt as field-level, role-owned content with review. | Prompts live in application code; governance, versioning, and review are not CMS-native. | Out of scope; it stores vectors, not editorial or behavioral content. |