How to Use Sanity Agent Actions to Generate Product Copy

You publish a new SKU and the merchandising team needs a title, a 60-word description, three bullet benefits, an SEO meta description, and German and Japanese variants, for 400 products, by Friday. The first instinct is to wire a script to the OpenAI API: fetch the document, prompt for copy, parse the JSON, write it back. It works in the demo. Then it meets reality: the model invents a "color" field your schema doesn't have, drops the rich-text structure your storefront expects, overwrites a field an editor was in the middle of editing, and produces copy nobody reviewed before it hit production.

The problem isn't the model. It's that copy generation is being treated as a thing that happens outside the CMS, against a schema the LLM can't see and a governance layer it can't respect. The output has to be reshaped, revalidated, and re-governed on the way back in, and every one of those steps is where pipelines break.

This guide reframes product-copy generation as a content operation that belongs inside the CMS. Using Sanity Agent Actions as the worked example, we'll show how schema-aware generation, structure-preserving output, and Studio governance turn a brittle one-off script into a repeatable workflow that scales to your whole catalog.

Why DIY copy-generation scripts break at catalog scale

The naive product-copy pipeline has four moving parts: read the document, prompt the model, parse the response, write it back. Each works in isolation. The failure is in the seams. The model has no native knowledge of your content model, so it returns prose where you needed Portable Text, a flat string where you needed an array of bullet objects, or a field name that doesn't exist on the document. Your script becomes a brittle translation layer that has to coerce free-form text into a strict schema, and it breaks the first time the model phrases things differently.

Then there's concurrency and state. A catalog job touching 400 documents runs for minutes. Editors are working in the same dataset. A blind write-back clobbers in-flight edits, or worse, races against a publish and produces a document that's half human, half machine, with no record of which is which. There's no review gate: the generated copy goes straight to the field, so the first human to see it is a customer.
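One common way to avoid the blind write-back described above is to make every write conditional on the revision the job originally read. The sketch below is a minimal in-memory illustration of that idea. The `RevDoc` type, the `guardedWrite` helper, and the store object are hypothetical; only the `_rev` field name is modelled on the revision field Sanity documents carry, and nothing here is the Content Lake API.

```typescript
// Hypothetical in-memory stand-in for a document store; not a Sanity API.
interface RevDoc {
  _id: string;
  _rev: string; // revision marker that changes on every write
  description: string;
}

type WriteResult = { ok: true; doc: RevDoc } | { ok: false; reason: string };

// Apply a generated description only if the document has not changed
// since the generation job read it.
function guardedWrite(
  store: { [id: string]: RevDoc },
  id: string,
  expectedRev: string,
  description: string
): WriteResult {
  const current = store[id];
  if (!current) {
    return { ok: false, reason: "missing document" };
  }
  if (current._rev !== expectedRev) {
    // Someone else wrote in the meantime: skip instead of clobbering.
    return { ok: false, reason: "revision changed" };
  }
  const next: RevDoc = { _id: id, _rev: expectedRev + "+1", description: description };
  store[id] = next;
  return { ok: true, doc: next };
}

// Usage: the job read revision "a1", then an editor saved and moved the doc to "a2".
const revStore: { [id: string]: RevDoc } = {
  "product-1": { _id: "product-1", _rev: "a2", description: "Editor's wording" },
};
const staleWrite = guardedWrite(revStore, "product-1", "a1", "Machine wording");
const freshWrite = guardedWrite(revStore, "product-1", "a2", "Machine wording");
```

The stale write is rejected and the editor's text survives; only the write that names the current revision goes through.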

Finally, freshness and traceability rot. The script ran once; nobody can re-run just the descriptions that changed, or tell which fields were AI-touched, or roll back a bad batch. What looked like a weekend automation becomes a maintenance liability nobody wants to own. The fix isn't a better prompt; it's moving the operation to where the schema, the concurrency model, and the governance already live: inside the CMS itself.

What schema-aware generation actually means

Schema-aware generation means the generation step knows the shape of the target document before it writes a single token. Instead of prompting a raw model for 'a product description' and hoping the output fits, the operation is told: this document has a `title` (string, max 60 chars), a `description` (Portable Text), a `benefits` (array of 3 bullet objects), and a `seo.metaDescription` (string). The model generates into those slots, and the result is validated against the schema before it's ever committed.
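To make "validated against the schema before it's ever committed" concrete, here is a small sketch of a pre-commit check for the four fields listed above. The type names, the `Benefit` shape, and `validateCopy` are assumptions made for illustration; in a real Sanity project these rules would live in the schema definition rather than in a hand-written function.

```typescript
// Hypothetical shapes for illustration; field names follow the example above.
interface SpanLike { _type: string; text: string }
interface BlockLike { _type: string; children: SpanLike[] }
interface Benefit { text: string }

interface GeneratedCopy {
  title: string;
  description: BlockLike[]; // Portable Text: an array of blocks
  benefits: Benefit[];
  seo: { metaDescription: string };
}

// Return a list of problems; an empty list means the copy fits the shape.
function validateCopy(copy: GeneratedCopy): string[] {
  const problems: string[] = [];
  if (copy.title.length === 0 || copy.title.length > 60) {
    problems.push("title must be 1-60 characters");
  }
  if (copy.description.length === 0) {
    problems.push("description must contain at least one block");
  }
  for (const block of copy.description) {
    if (block._type !== "block" || block.children.length === 0) {
      problems.push("description contains a malformed block");
    }
  }
  if (copy.benefits.length !== 3) {
    problems.push("benefits must contain exactly 3 items");
  }
  if (copy.seo.metaDescription.trim().length === 0) {
    problems.push("seo.metaDescription must not be empty");
  }
  return problems;
}

// Usage: a candidate with only two benefits is rejected before any write.
const candidate: GeneratedCopy = {
  title: "Merino Crew Sock",
  description: [
    { _type: "block", children: [{ _type: "span", text: "Soft, breathable merino." }] },
  ],
  benefits: [{ text: "Warm" }, { text: "Breathable" }],
  seo: { metaDescription: "Merino crew socks for everyday wear." },
};
const copyProblems = validateCopy(candidate);
```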

This is the difference between a script that parses LLM output and a content operation that produces valid content by construction. When the target field is Portable Text, Sanity's structured rich-text format, the generated copy keeps its block structure, marks, and annotations intact rather than collapsing into a single flat string. That matters downstream: the same structured copy survives chunking for retrieval, renders correctly across your web and mobile surfaces, and stays addressable field-by-field for later edits or translations.

Sanity's Agent Actions are built on exactly this premise. They're schema-aware APIs for LLM-driven content workflows (generate, transform, translate, validate) that operate on documents as documents, not as opaque text blobs. The action reads your existing schema definition, targets named fields, and writes back through the same Content Lake APIs your editors use. There's no separate parsing-and-coercion layer to maintain, because the operation never leaves the structured-content world in the first place. The model's job shrinks to what models are good at (language), and the structure is the platform's responsibility, not your prompt's.

Designing the generation operation: fields, prompts, and guardrails

Start by deciding which fields the operation owns and which it must never touch. For product copy, a sensible split is: generate `title`, `description`, `benefits`, and `seo.metaDescription`; treat `price`, `sku`, `inventory`, and any legally reviewed claims as off-limits. Encoding this in the operation rather than trusting the prompt to behave is what makes the workflow safe to run unattended across hundreds of documents.
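The ownership split above can be enforced in code rather than in the prompt. The following sketch filters a set of proposed field writes against an allow-list and a deny-list. The helper names and the dotted-path convention are assumptions for illustration, not part of Agent Actions.

```typescript
// Fields the generation job may write, and fields it must never touch.
const GENERATED_FIELDS = ["title", "description", "benefits", "seo.metaDescription"];
const PROTECTED_FIELDS = ["price", "sku", "inventory"];

// True when `path` is `root` itself or a nested path under it.
function isUnder(path: string, root: string): boolean {
  return path === root || path.indexOf(root + ".") === 0;
}

interface FilterResult {
  allowed: { [path: string]: unknown };
  rejected: string[];
}

// Keep only writes to owned fields; report everything else instead of writing it.
function filterWritable(proposed: { [path: string]: unknown }): FilterResult {
  const result: FilterResult = { allowed: {}, rejected: [] };
  for (const path of Object.keys(proposed)) {
    const isProtected = PROTECTED_FIELDS.some((root) => isUnder(path, root));
    const isOwned = GENERATED_FIELDS.some((root) => isUnder(path, root));
    if (isOwned && !isProtected) {
      result.allowed[path] = proposed[path];
    } else {
      result.rejected.push(path);
    }
  }
  return result;
}

// Usage: the model tried to set a price and invented a "color" field.
const filtered = filterWritable({
  title: "Merino Crew Sock",
  price: 0,
  "seo.metaDescription": "Merino crew socks for everyday wear.",
  color: "red",
});
```

Because unknown fields are rejected by default, a hallucinated field name is dropped rather than written.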

Next, ground the generation in the data you already have. A product document usually carries structured attributes (material, dimensions, category, audience) long before anyone writes marketing copy. Feed those into the operation as context so the model describes the actual product instead of confabulating features. With Agent Actions you point the generate operation at the source fields and the target fields on the same document; the instruction is a template ('write a benefit-led 60-word description for a {{category}} aimed at {{audience}}') rather than a per-document hand-written prompt.
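As a rough sketch of the templating idea, the function below fills `{{name}}` placeholders from a document's attributes and fails loudly when an attribute is missing. The placeholder syntax follows the example in the paragraph above; the parameter syntax Agent Actions actually accepts may differ, so treat `renderInstruction` as a hypothetical helper and check the Agent Actions documentation for the real mechanism.

```typescript
// Fill {{name}} placeholders from product attributes.
// Throws if the template references an attribute the document lacks,
// so the model is never asked to describe a product with a blank in the brief.
function renderInstruction(
  template: string,
  attrs: { [name: string]: string }
): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match: string, name: string) => {
    if (!Object.prototype.hasOwnProperty.call(attrs, name)) {
      throw new Error("Missing attribute for placeholder: " + name);
    }
    return attrs[name];
  });
}

// Usage: one template, reused for every product in the batch.
const instruction = renderInstruction(
  "Write a benefit-led 60-word description for a {{category}} aimed at {{audience}}.",
  { category: "trail running shoe", audience: "beginner runners" }
);
```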

Then add the guardrails that a raw API call leaves out. Length and format constraints come from the schema, so a meta description that overruns is caught before write. A validate operation can fact-check generated claims against a Knowledge Base of approved product information, flagging copy that asserts something the source data doesn't support. And because the whole thing runs as a content operation, re-runs are scoped: re-running the job regenerates only the fields you target, leaving editor-owned fields and unrelated documents untouched. The result is a generation step you can schedule, repeat, and reason about, not a one-shot script you're afraid to run twice.
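The claim-checking guardrail can be illustrated with a deliberately crude stand-in: flag any number in the generated copy that does not appear in the approved product facts. A real validate operation would do far more than compare digits, and the function names here are hypothetical, but the sketch shows the principle of checking output against source data before it is accepted.

```typescript
// Pull every integer or decimal number out of a piece of text.
function extractNumbers(text: string): string[] {
  const found = text.match(/\d+(?:\.\d+)?/g);
  return found === null ? [] : found;
}

// Return the numbers the copy asserts that the approved facts do not contain.
function unsupportedNumbers(copy: string, approvedFacts: string[]): string[] {
  const approved = extractNumbers(approvedFacts.join(" "));
  return extractNumbers(copy).filter((n) => approved.indexOf(n) === -1);
}

// Usage: the copy gets the width right but invents a weight.
const approvedFacts = ["Width: 30 cm", "Weight: 1.2 kg"];
const generatedCopy = "A compact 30 cm frame that weighs just 1.5 kg.";
const flagged = unsupportedNumbers(generatedCopy, approvedFacts);
```

Here `flagged` contains the single value "1.5", which is enough to route the draft to a reviewer instead of accepting it.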

Keeping a human in the loop with Studio governance

Generated copy is a draft, not a publish. The most common way DIY pipelines go wrong is collapsing those two states by writing model output straight to the live field. That is a mistake because the model's confidence and the model's correctness are unrelated. A product description that reads beautifully can still claim the wrong dimensions or strike the wrong brand tone, and you want a human to catch that before a customer does.

The governance layer is what makes AI-generated copy shippable in an organization that has a brand team and a legal team. In Sanity, Agent Actions write into drafts that flow through the same Studio review surfaces your editors already use. Content Releases let you stage a whole batch of regenerated product copy, review it as a set, schedule it, and roll it back if a reviewer rejects the batch, rather than reconciling 400 individual document writes by hand. The AI touch becomes a reviewable proposal, not an unaccountable mutation.
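Sanity keeps a document's draft under the same ID with a `drafts.` prefix, which gives a simple way to reason about the draft-versus-published split. The sketch below builds a draft proposal from a published document without touching the published one. The helper names and the `ProductDoc` shape are hypothetical, and this is not how Agent Actions or Content Releases are invoked; it only illustrates the "proposal, not mutation" idea.

```typescript
const DRAFT_PREFIX = "drafts.";

interface ProductDoc {
  _id: string;
  _type: string;
  title: string;
}

// Map any document ID to its draft ID (idempotent).
function toDraftId(id: string): string {
  return id.indexOf(DRAFT_PREFIX) === 0 ? id : DRAFT_PREFIX + id;
}

// Build a draft copy carrying the generated title; the input is left unchanged.
function asDraftProposal(published: ProductDoc, generatedTitle: string): ProductDoc {
  return { ...published, _id: toDraftId(published._id), title: generatedTitle };
}

// Usage: the live document keeps its human-written title until a reviewer publishes.
const liveDoc: ProductDoc = { _id: "product-1", _type: "product", title: "Crew Sock" };
const proposal = asDraftProposal(liveDoc, "Merino Crew Sock");
```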

This also solves traceability. Because the generation runs through the platform's content APIs rather than a side-channel script, the change history records what was generated and when, and reviewers can diff machine output against the previous human version. For teams shipping copy at scale, that audit trail is the difference between 'we let AI write our catalog' as a liability and as a defensible, governed workflow. The reviewer's time goes to judgment (does this claim hold, does this tone fit?) instead of to babysitting a pipeline.

Scaling across the catalog: batching, translation, and freshness

A single product is a demo; a catalog is the job. The operation that generates copy for one document should generate for a query's worth of documents (for example `*[_type == 'product' && !defined(description)]`) without you writing orchestration glue. Treating generation as a content operation means you express the batch as a query over the Content Lake and let the platform fan out, rather than hand-rolling pagination, rate-limiting, and retry logic around a raw model API.
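To show what that query selects, the sketch below mirrors its filter over an in-memory array. In practice the query runs in the Content Lake and you never write this loop; the `CatalogDoc` type and `selectMissingDescriptions` are hypothetical and exist only to make the selection logic visible. The check treats a missing or null `description` as undefined, matching how GROQ's `defined()` behaves.

```typescript
// The batch, expressed as a GROQ query (copied from the paragraph above).
const MISSING_DESCRIPTION_QUERY = "*[_type == 'product' && !defined(description)]";

interface CatalogDoc {
  _id: string;
  _type: string;
  description?: unknown;
}

// In-memory equivalent of the query's filter, for illustration only.
function selectMissingDescriptions(docs: CatalogDoc[]): string[] {
  return docs
    .filter(
      (d) =>
        d._type === "product" &&
        (d.description === undefined || d.description === null)
    )
    .map((d) => d._id);
}

// Usage: only products without a description end up in the batch.
const catalog: CatalogDoc[] = [
  { _id: "p1", _type: "product", description: [{ _type: "block", children: [] }] },
  { _id: "p2", _type: "product" },
  { _id: "c1", _type: "category" },
  { _id: "p4", _type: "product", description: null },
];
const batchIds = selectMissingDescriptions(catalog);
```

Because the batch is defined by a filter rather than a fixed ID list, re-running it later picks up only the products that still lack a description.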

Translation is the same operation pointed at a different target. Once English copy is approved, a transform/translate action can produce localized variants into your locale fields, preserving the Portable Text structure so headings stay headings and bullets stay bullets across German, Japanese, and the rest. The structure-preserving property is what stops localization from degrading into reformatting work: the LLM translates the language inside the blocks, not the blocks themselves.
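The structure-preserving property can be sketched as a function that rewrites only the text inside each span and copies everything else through untouched. The block and span shapes below are a simplified subset of Portable Text, and the dictionary lookup is a toy stand-in for the model; `translateBlocks` is a hypothetical helper, not the translate action itself.

```typescript
// Simplified subset of Portable Text, enough to show the idea.
interface PtSpan { _type: "span"; text: string; marks: string[] }
interface PtBlock {
  _type: "block";
  style: string;       // e.g. "h2" or "normal"
  listItem?: string;   // e.g. "bullet"
  children: PtSpan[];
}

// Translate the text of every span; styles, list markers, and marks pass through.
function translateBlocks(
  blocks: PtBlock[],
  translate: (text: string) => string
): PtBlock[] {
  return blocks.map((block) => ({
    ...block,
    children: block.children.map((span) => ({ ...span, text: translate(span.text) })),
  }));
}

// Toy translator standing in for the model: a fixed English-to-German lookup.
const toyDictionary: { [source: string]: string } = {
  "Why you'll love it": "Warum Sie es lieben werden",
  "Machine washable": "Maschinenwaschbar",
};
function toyTranslate(text: string): string {
  return Object.prototype.hasOwnProperty.call(toyDictionary, text)
    ? toyDictionary[text]
    : text;
}

// Usage: a heading and a bullet stay a heading and a bullet.
const englishBlocks: PtBlock[] = [
  { _type: "block", style: "h2", children: [{ _type: "span", text: "Why you'll love it", marks: [] }] },
  { _type: "block", style: "normal", listItem: "bullet", children: [{ _type: "span", text: "Machine washable", marks: ["strong"] }] },
];
const germanBlocks = translateBlocks(englishBlocks, toyTranslate);
```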

Freshness is where the content-operation framing pays off long-term. Sanity Functions let you attach generation to content events (enrich-on-publish, translate-on-publish), so a new SKU gets draft copy generated the moment it's created, and Content Lake real-time subscriptions can feed the latest content into downstream LLM workflows as it changes. Combined with the Embeddings Index API and dataset embeddings, generated copy is immediately searchable and the embeddings stay tied to the content, so there's no separate vector-sync job to keep current. The catalog stops being a Friday-deadline scramble and becomes a steady-state pipeline that handles new products as they arrive.

Native content operation vs. bolt-on script vs. AI add-on

It helps to name the three architectures teams actually choose between. The first is the DIY bolt-on: your own code calling a model API, parsing JSON, writing back. Maximum flexibility, maximum maintenance: you own the schema-coercion, concurrency, and governance problems forever. The second is a CMS AI add-on: an in-product helper that generates copy in the editor. Convenient for one-off editor assistance, but often limited to single-field, single-document, in-the-UI use, with shallower hooks for unattended batch pipelines.

The third is generation as a first-class content operation, which is what Agent Actions are designed to be. The distinguishing trait is that AI is wired into the data model and the delivery layer, not added on top: the operation reads your schema, writes structured Portable Text, runs over a query's worth of documents, and flows through the same governance and APIs as human edits. You're not bridging an LLM to a CMS; the CMS treats generation as one more way content gets written.

The right choice depends on volume and accountability. For a handful of ad-hoc descriptions, an in-editor AI helper is plenty. For a catalog of hundreds or thousands of products that ship under brand and legal review and need ongoing freshness, the bolt-on script's maintenance cost and the add-on's batch limits both bite, and the native content-operation model is what keeps the workflow repeatable as the catalog grows.

Generating product copy: native content operation vs. bolt-on vs. AI add-on

Feature	Sanity	Contentful
Schema-awareness of generation	Agent Actions read your schema and generate into named, typed fields; output is valid by construction, not parsed-and-coerced.	In-editor AI generates per-field copy with knowledge of the content type; batch generation across many entries is less of a first-class primitive.
Structured rich-text preservation	Generates and translates into Portable Text, so blocks, marks, and annotations survive across chunking, retrieval, and localization.	Rich Text is structured (a JSON document model); preservation through generation depends on the AI integration's handling.
Batch generation over a catalog	Express the batch as a GROQ query and run Agent Actions over the result set; no hand-rolled pagination or retry orchestration.	App Framework lets you build batch jobs against the Management API, but you assemble the orchestration yourself.
Governance / human review	Generated copy lands in drafts and flows through Studio review and Content Releases: stage, review, schedule, and roll back a whole batch.	Mature editorial workflows and roles; AI output can route through them depending on how the integration writes.
Localization of generated copy	Same operation, different target: translate Agent Action produces locale variants while preserving Portable Text structure.	Strong localization model; AI-assisted translation available, integration-dependent for structure preservation.
Embeddings tied to content	Embeddings Index API + dataset embeddings keep semantic search current automatically, with no separate vector-sync job to maintain.	No native content-tied embeddings; pair with an external vector DB and a sync pipeline.
Event-driven freshness	Functions (enrich/translate-on-publish) + Content Lake real-time subscriptions generate and propagate copy as content changes.	Webhooks + App Framework can trigger jobs on publish; the generation logic is custom.