This operating-model entry is part of our series on how the firm works, how knowledge is governed, and how AI-native delivery changes client service.
Editorial status: PUBLISH HOLD – draft brief or seed outline. This page is not a complete insight; it needs a full rewrite or merger into a larger article before publication review.
AI Quality Control and Expert Judgment
AI can make consulting work faster. It can also make weak thinking look finished. That is the quality problem every serious AI-native advisory firm has to solve.
A polished answer is not the same as a defensible recommendation. A fluent synthesis is not the same as evidence. A slide that reads well can still hide false assumptions, missing counterexamples, stale sources, or advice that does not fit the client's sector reality.
The New Failure Mode
Traditional consulting quality control looked for analytical gaps, inconsistent logic, weak evidence, formatting errors, and recommendations that would not survive partner review. AI adds a new set of risks: hallucinated facts, overconfident summaries, source laundering, prompt sensitivity, hidden bias, untested calculations, and shallow pattern matching.
The danger is not that AI will always be wrong. The danger is that it will often be plausible enough to pass a tired review.
Red Teaming as a Working Routine
A serious quality model needs structured challenge. What would make this recommendation false? Which source is weakest? Which assumption is imported from another market? What would the regulator, CFO, frontline manager, or competitor say? Where has the model smoothed over disagreement?
Red teaming should happen before the client sees the answer, not after a senior person has fallen in love with the storyline.
Expert Judgment Moves Up
AI should not replace expert judgment. It should move expert judgment to higher-value moments: framing the question, defining what evidence counts, spotting sector nuance, challenging the synthesis, and deciding what the client should actually do.
That means senior reviewers cannot simply skim outputs. They need to know where AI participated, which sources were used, what checks were performed, and what uncertainty remains.
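One way to make that visibility concrete is a small provenance record attached to each section of a deliverable. The sketch below is illustrative only: the `ReviewRecord` name, its fields, and the readiness rule are assumptions made for this example, not a description of any existing firm tool.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewRecord:
    """Hypothetical provenance record for one section of a deliverable."""

    section: str
    ai_assisted: bool                                          # where AI participated
    sources: list[str] = field(default_factory=list)           # which sources were used
    checks_performed: list[str] = field(default_factory=list)  # what checks were performed
    open_uncertainties: list[str] = field(default_factory=list)  # what uncertainty remains

    def ready_for_senior_review(self) -> bool:
        # Assumed rule for this sketch: AI-assisted work needs at least one
        # logged source and one logged check before a senior reviewer sees it.
        if not self.ai_assisted:
            return True
        return bool(self.sources) and bool(self.checks_performed)


record = ReviewRecord(
    section="Control map draft",
    ai_assisted=True,
    sources=["Internal model-risk policy"],
    open_uncertainties=["Regulator history not yet reviewed"],
)
print(record.ready_for_senior_review())  # no checks logged yet
```

The record does not judge the work. It only ensures the reviewer starts with the four facts named above instead of reconstructing them from a finished slide.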
A Client Example
In a bank model-risk engagement, AI can help compare policies, summarize regulatory expectations, draft control maps, and identify gaps across use cases. But the final judgment depends on risk appetite, product mix, regulator history, data architecture, and the bank's ability to operate controls.
Without expert review, the output may sound right and still be unusable.
The Review Questions That Matter
A quality review should ask whether the answer is useful, true, and safe to act on. Which claims are sourced? Which assumptions are unstated? Which counterargument is strongest? Which calculation has been independently checked? Which sector nuance could change the recommendation? Which part of the work is AI-assisted and therefore needs extra challenge?
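The questions above can also be kept as an explicit checklist so that an unanswered question is visible rather than silently skipped. This is a minimal sketch: the dictionary keys and the `open_questions` helper are hypothetical names chosen for illustration, and a written answer here means only that someone recorded one, not that it is correct.

```python
# Review questions taken from the paragraph above; the keys are illustrative.
REVIEW_QUESTIONS = {
    "claims_sourced": "Which claims are sourced?",
    "assumptions_unstated": "Which assumptions are unstated?",
    "strongest_counterargument": "Which counterargument is strongest?",
    "calculation_checked": "Which calculation has been independently checked?",
    "sector_nuance": "Which sector nuance could change the recommendation?",
    "ai_assisted_parts": "Which part of the work is AI-assisted and therefore needs extra challenge?",
}


def open_questions(answers: dict[str, str]) -> list[str]:
    """Return the review questions that have no written answer yet."""
    return [
        question
        for key, question in REVIEW_QUESTIONS.items()
        if not answers.get(key, "").strip()
    ]


answers = {
    "claims_sourced": "Sections 2 and 3 cite the policy documents reviewed.",
    "calculation_checked": "Gap counts recomputed by a second analyst.",
}
for question in open_questions(answers):
    print("Still open:", question)
```

A list like this supports the review; it does not replace it. The expert still has to decide whether each answer is good enough to act on.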
Those questions create a different culture. Teams stop treating AI output as a draft to polish and start treating it as a hypothesis to test.