Curators are the adversarial layer AI markets need

One question comes up in every Mentat investor conversation, and it goes roughly: “Cool, but how soon do you remove the humans?” The honest answer is: never. The curator console is not a temporary scaffold around an AI we are racing to make autonomous. It is permanent infrastructure that the protocol’s safety properties depend on. This post makes that argument.

The category error

The reason the question keeps coming up is that “AI-assisted market authoring” pattern-matches to other domains where humans are obvious bottlenecks waiting to be automated away. Customer support, content moderation, basic legal review. The dominant story arc in those domains is “AI gets good enough, humans get removed, marginal cost approaches zero.”

Prediction market authoring is structurally different because its errors cannot be undone the way errors in those domains can.

A misclassified support ticket is annoying. A misclassified moderation decision can be reversed. A bad legal review can be redone. The cost of an error in those domains is bounded by the cost of redoing the work plus some reputational damage.

A bad prediction market is irreversible. Once a market is on-chain and trading, the question text is the question text. If the trigger condition was ambiguous and the market settled the “wrong” way, the capital is gone. If the AI generated a market that violates content policy, it lives on-chain forever even if subsequently invalidated. The cost of an error is not the cost of redoing the work; it is the capital at risk in the market plus the trust damage to the protocol.

When an error costs real capital and cannot be reversed, the standard “automate everything” arc breaks. You want a human gate forever.

What the curator console actually does

The curator console is a UI for one job: keep bad markets off-chain.

Features in the M2 console, in rough order of importance:

Diff view. The most-used feature. Side-by-side comparison of the AI’s draft versus the latest curator-edited version. Highlights what the curator changed and why. Crucial for spot-checking AI quality drift.

Bulk actions. Curators can claim, approve, or reject N drafts at once when the validator pre-filtered them as a coherent batch. Saves enormous time for repetitive categories (price markets, scheduled event markets) where the AI is doing well.

Validation dashboard. Aggregate Validator scores, issue counts by severity, time-to-decision metrics. Lets curators triage queue depth and see where AI quality is regressing.

Version history. Every draft’s complete revision trail: which Scout candidates, which Draft iterations, which Validator reports, which curator edits, in chronological order. Crucial for post-mortem when a market resolves badly.

Request AI revision with notes. Curator can write a paragraph of context and have the Draft agent regenerate. The notes accumulate into the few-shot example set if the curator-edited version is later approved.

Audit log. Append-only record of every curation action. The audit log is part of the protocol’s accountability story — anyone can later verify that a specific market was reviewed and by whom.
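One way to make an append-only log independently verifiable is to chain entries by hash, so that editing any past entry breaks every hash after it. The sketch below illustrates that idea only. The hash chain itself, the field names, and the `verify` and `reviewers_of` helpers are assumptions for illustration, not a description of how the Mentat audit log is built.

```python
import hashlib
import json
from dataclasses import dataclass

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


@dataclass(frozen=True)
class AuditEntry:
    draft_id: str
    curator_id: str
    action: str      # hypothetical action names: "claim", "approve", "reject"
    prev_hash: str   # hash of the entry before this one
    entry_hash: str  # hash over this entry's fields plus prev_hash


def _digest(draft_id, curator_id, action, prev_hash):
    payload = json.dumps([draft_id, curator_id, action, prev_hash]).encode()
    return hashlib.sha256(payload).hexdigest()


class AuditLog:
    """Exposes append and read only; there is no update or delete."""

    def __init__(self):
        self._entries = []

    def append(self, draft_id, curator_id, action):
        prev = self._entries[-1].entry_hash if self._entries else GENESIS
        entry = AuditEntry(draft_id, curator_id, action, prev,
                           _digest(draft_id, curator_id, action, prev))
        self._entries.append(entry)
        return entry

    def entries(self):
        return tuple(self._entries)


def verify(entries):
    """Recompute the chain; any edited or reordered entry fails the check."""
    prev = GENESIS
    for e in entries:
        expected = _digest(e.draft_id, e.curator_id, e.action, prev)
        if e.prev_hash != prev or e.entry_hash != expected:
            return False
        prev = e.entry_hash
    return True


def reviewers_of(entries, draft_id):
    """Who approved or rejected a given draft, in log order."""
    return [e.curator_id for e in entries
            if e.draft_id == draft_id and e.action in ("approve", "reject")]
```

With a structure like this, a third party holding a copy of the entries can call `verify` and `reviewers_of` without trusting whoever served the log.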

None of this is exotic. The interesting thing is that none of it should ever go away.

Where AI is fast and curators are slow

AI is fast at:

Schema conformance. Did the Draft emit every required field? Yes/no. Validator catches most of this.
Surface-level clarity. Is the question text well-formed English? Almost always yes from a GPT-4-class model.
Source allowlist matching. Are the listed sources in the protocol’s allowed set? Validator can check this in constant time.
Policy categorization. Does this market fall into a blocked category (self-harm, violence, illegal activity)? Classification accuracy is high.
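The first and third checks in that list are mechanical enough to sketch. The example below is a minimal illustration, assuming a draft is a plain dictionary. The field names and the allowlisted hosts are hypothetical placeholders, not the Mentat draft schema or source list.

```python
# Hypothetical required fields; the real draft schema is not specified here.
REQUIRED_FIELDS = ("question", "trigger", "sources", "closes_at")

# Hypothetical allowlist. A frozenset gives average constant-time membership tests.
ALLOWED_SOURCES = frozenset({"example-oracle.test", "example-news.test"})


def fast_checks(draft: dict) -> list[str]:
    """Return a list of issues; an empty list means both checks passed."""
    issues = []

    # Schema conformance: every required field is present and non-empty.
    for name in REQUIRED_FIELDS:
        if not draft.get(name):
            issues.append(f"missing field: {name}")

    # Source allowlist matching: one set lookup per listed source.
    for source in draft.get("sources") or []:
        if source not in ALLOWED_SOURCES:
            issues.append(f"source not allowlisted: {source}")

    return issues
```

Neither check says anything about whether the question is a good one. They only establish that the draft is well-formed enough to be worth a curator's time.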

Curators are slower, but better at:

Edge-case interpretation. “This question is technically clear but practically ambiguous because of X subtle context.” AI is bad at this; humans are not.
Domain-specific bullshit detection. A sports curator knows that “first to score” has specific edge cases (own goals, video review reversal) that the AI does not surface unless prompted. A political curator knows that “concedes” is a fuzzy predicate that has tripped UMA twice.
Quality-relative-to-peers. Is this market a worse version of a market we already have? Curators see the whole queue; AI sees one draft at a time.
Vibe checks. “This market technically passes validation but it does not feel like a Mentat market.” Hard to formalize; valuable nonetheless.

The pipeline puts AI on the fast checks and curators on the slow ones. Trying to put AI on the slow ones is where things go wrong.

What “AI safety” means in this context

The AI safety community has done good work on prompt injection, jailbreaks, and adversarial inputs. Mentat’s Validator pulls from that work for its category filters and prompt sanitization. But the structural safety property of the protocol is not the Validator’s classifier. It is the curator gate.

Here is the threat model in concrete terms:

A creator submits a malicious prompt. Validator’s prompt sanitization handles most of these. Edge cases that slip through hit Scout and Draft, which produce malicious or policy-violating output, which the Validator either flags or doesn’t. Either way, the draft enters the curator queue with whatever Validator flags it earned. A curator sees it, rejects it, and the draft never reaches on-chain deployment.
An adversarial creator finds a prompt injection that earns a clean Validator score for a misleading market. This is the dangerous case. The curator catches it because the curator reads the draft. The Validator’s classifier is fooled; the human reading the question is not.
A creator submits an honest prompt that produces a subtly miscalibrated trigger. The Validator does not catch it because the trigger is technically deterministic. The curator catches it because they think about what would happen at resolution time and notice the gap.

In each of these scenarios, the curator is the backstop. Trying to make the Validator strong enough to replace the curator is a fool’s errand because the curator is doing a different job: judgment, not classification.
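The structural point can be put in a few lines of code: deployability depends only on a recorded curator approval, and Validator output is never consulted to open the gate. This is a sketch under assumed names (`Draft`, `Decision`, `record_decision`, `can_deploy`), not the protocol's actual data model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Draft:
    question: str
    validator_flags: list                       # advisory; shown to the curator
    curator_decision: Optional[Decision] = None
    curator_id: Optional[str] = None


def record_decision(draft: Draft, curator_id: str, decision: Decision) -> None:
    draft.curator_id = curator_id
    draft.curator_decision = decision


def can_deploy(draft: Draft) -> bool:
    # Deliberately ignores validator_flags: a clean Validator report
    # is not sufficient, only an explicit curator approval is.
    return (draft.curator_decision is Decision.APPROVED
            and draft.curator_id is not None)
```

A draft with zero Validator flags and no curator decision is still not deployable, which is exactly the second scenario: fooling the classifier does not get a market on-chain.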

What this means for the protocol’s economics

If curators are permanent infrastructure, the protocol has to fund them. That is a real cost and we treat it as one.

The M3 economic model funds curators three ways:

Settlement fee share. A small slice of settlement fees flows to the curator who approved the market. Aligns curator incentive with markets that resolve cleanly (because contested markets generate disputes which eat into settlement fees).
Creator stake forfeit. If a curator rejects a draft for policy violation, a portion of the creator’s deposit (small, anti-spam-sized) goes to the curator. Aligns creator incentive against spamming the queue.
Protocol grants. Custom-engagement deployments fund their own curators. Self-host deployments fund whoever runs them. The Cryptuon-operated mainnet allocates protocol grant funding to a curator cohort.
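The first two funding streams are simple arithmetic once the splits are chosen. The sketch below uses integer amounts in the token's smallest unit and basis points for the shares. Both percentages are hypothetical placeholders picked for the example; this post does not specify the actual splits.

```python
BPS = 10_000  # basis points in 100%

# Hypothetical placeholder values, not the protocol's parameters.
CURATOR_FEE_SHARE_BPS = 500      # share of settlement fees to the approving curator
CURATOR_FORFEIT_SHARE_BPS = 5_000  # share of a forfeited deposit to the rejecting curator


def curator_fee_share(settlement_fee: int) -> int:
    """Slice of a market's settlement fee paid to the curator who approved it."""
    return settlement_fee * CURATOR_FEE_SHARE_BPS // BPS


def curator_forfeit_share(creator_deposit: int, rejected_for_policy: bool) -> int:
    """Portion of the creator's deposit paid out on a policy-violation rejection."""
    if not rejected_for_policy:
        return 0
    return creator_deposit * CURATOR_FORFEIT_SHARE_BPS // BPS
```

Integer division rounds down, so any remainder stays with the protocol rather than being overpaid to the curator.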

Curator economics are not optional. A protocol that wants permanent human-in-the-loop has to pay for it. We are designing the fee splits and incentive model with that constraint front and center.

What changes as AI gets better

We should be honest about what does change. The pipeline’s economics shift as AI quality improves:

Approval-to-rejection ratio increases. As Draft gets better, more drafts pass Validator and reach curators in a state that needs less editing. Curators do less rewriting and more vibe-checking. Throughput per curator goes up.
Edge cases get sharper. As the Validator catches more surface failures, the cases that reach curators are the genuinely hard ones. Curator skill matters more, not less.
Specialization deepens. Per-category curators (sports, politics, crypto, science) emerge naturally as the queue grows. Generalist curators stay for the protocol-level policy work.
Tooling matures. Diff view, audit log, request-revision-with-notes — all the curator-side features have room to grow. Most of M4 product work is on this side.

What does not change is the existence of the gate. The gate stays.

What this means for engagement clients

Clients that want Mentat embedded in their stack often ask whether they can use the AI pipeline without the curator console. We say no.

It is not that the technology cannot be unbundled. It is that the protocol’s safety property depends on the gate, and we will not let a deployment ship Mentat-branded markets without it. Engagement clients get help staffing their own curator pipeline, training their curators on category-specific edge cases, and integrating the curator console into their existing operational stack. They do not get to skip the gate.

This is a commercial tradeoff we are deliberately making. Some prospective clients want a fully automated AI market generator with no humans in the loop. Those clients are not Mentat customers; we are happy to refer them to other vendors. The protocol’s value proposition is honesty about resolution, and that requires honesty about the authoring layer too.

The summary

If you remember one thing from this post: the curator console is not technical debt to be paid down. It is the layer that makes the AI pipeline safe to deploy at all. Mentat without curators is a different protocol with worse safety properties. We are not building that protocol and we will not pretend the human layer is going away.

The fastest way to make AI prediction markets fail is to remove the people who say “no” to drafts that look almost-right. We are building the console because we want the protocol to survive its first hard year, not just ship its first impressive demo.
