
Review Is a Contract


You open the diff and your first reaction is physical.

Something tightens.

Not because the change is obviously broken, but because it’s large and confident in the way AI output often is.

It touches files you didn’t expect.

It reorganizes code that “was fine.”

It adds a helper you never asked for.

You can’t immediately point to an error, but you still feel resistance.

So you start reviewing in the most common way: by mood.

You read until you’re uncomfortable, then you request changes until you feel better.

This is how AI-assisted work becomes exhausting even when the output is decent.

You are not reviewing code.

You are trying to soothe uncertainty.

The Problem

In traditional engineering, reviews are often informal.

You know the codebase.

You know the patterns.

You can smell a bad change.

That intuition can work when changes are small and familiar.

AI breaks that dynamic.

It produces bigger diffs, faster.

It can be right in a way that feels wrong.

And it can be wrong in a way that looks right.

When you don’t have an explicit reference, review becomes subjective.

Subjective review creates two failure modes:

  1. Endless polishing.
    You keep requesting changes because you can’t define what “good” is.

  2. Silent drift.
    You approve because you’re tired, and the codebase slowly moves away from what you intended.

Both are symptoms of the same missing piece:

There is no contract.

Vibes Are Not a Review Strategy

“This feels off.”

“This is too complex.”

“I don’t like this pattern.”

These reactions are not useless.

They are signals.

But signals are not criteria.

A signal tells you where to look.

Criteria tell you what to do next.

Without criteria, you either overcorrect or undercorrect.

And over time, that creates a peculiar kind of fatigue: you are constantly making judgment calls you shouldn’t have to make.

The Solution (The Calm Way)

Review works best when it is comparative.

Not you versus the code.

Reality versus expectation.

Operators solve this by reviewing against artifacts: small documents that define what should be true.

That turns review from a debate into a check.

Turn “What Should Happen?” Into a File

Before implementation, write down the expectations:

  • goal
  • constraints
  • acceptance criteria
  • edge cases
  • any decisions you care about

This is not busywork.

It is the reference your future self will use when the diff is large and your energy is low.
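
For concreteness, here is a minimal sketch of such an artifact, assuming a hypothetical “retry failed uploads” change. Every field value is invented for illustration, and a plain text or markdown file works just as well; the small Python script below only makes the structure explicit.

  from pathlib import Path

  # A sketch of an expectation artifact for a hypothetical change
  # ("retry failed uploads"). Every value is invented for illustration.
  artifact = {
      "Goal": ["Retry failed uploads instead of surfacing an immediate error."],
      "Constraints": [
          "No new dependencies.",
          "Retries are capped; no infinite loops.",
      ],
      "Acceptance criteria": [
          "A failed upload is retried up to 3 times with backoff.",
          "The user sees one error, only after the final retry fails.",
      ],
      "Edge cases": [
          "Offline at the start of the upload.",
          "Connection drops mid-upload.",
      ],
      "Decisions": [
          "Retry logic lives in the upload service, not in the UI layer.",
      ],
  }

  # Write it next to the work so review has a fixed reference to compare against.
  lines = []
  for section, items in artifact.items():
      lines.append(section)
      lines.extend(f"  - {item}" for item in items)
  Path("expectations.txt").write_text("\n".join(lines) + "\n")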

Then review becomes simple:

Does the change satisfy the criteria?

Did it violate a constraint?

Did it introduce behavior that isn’t specified?

Now you are not relying on intuition alone.

You are comparing to a contract.

The Two-Column Review

If you want a mechanical method, use two columns:

  • Expected (from artifacts)
  • Observed (from code and behavior)

Then write down the mismatches.

This creates a review output that is actionable and emotionally neutral.
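
As a sketch of how mechanical this can get, assume both columns are phrased as short, checkable statements (all of the statements below are invented). The mismatch list is then simply whatever appears in one column but not the other:

  # A sketch of the two-column review. Expectations come from the artifact,
  # observations from reading and running the code. All statements are hypothetical.
  expected = {
      "Retries capped at 3 with backoff",
      "No new dependencies",
      "One error shown, only after the final retry",
  }
  observed = {
      "Retries capped at 3 with backoff",
      "New third-party retry library added",
      "Error toast shown on every failed attempt",
  }

  # Mismatches in both directions: expectations not met, and behavior nobody asked for.
  for item in sorted(expected - observed):
      print(f"Not satisfied (expected, not observed): {item}")
  for item in sorted(observed - expected):
      print(f"Not specified (observed, not expected): {item}")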

You are not saying “I don’t like this.”

You are saying “this deviates from the stated rule.”

That distinction matters.

It keeps the process calm.

It also makes fixes faster, because the model (or a human) can follow the contract precisely.

Fix at the Source: Code or Artifact?

Here is a question that saves enormous time:

Is the mismatch a code problem, or an artifact problem?

Sometimes the code is wrong.

But sometimes the artifact was incomplete, and the model made a reasonable choice inside a gap you didn’t notice.

If you only fix the code, you preserve the gap.

The next change will reintroduce the same debate.

Operators fix at the source:

  • if the code violates the artifact, fix the code
  • if the artifact is missing a decision, fix the artifact first, then rerun or refactor

This is how you prevent the same confusion from repeating.

Make Review Output Boring

The best review comments are boring.

Boring means specific.

Specific means testable.

Instead of:

“This seems messy.”

Write:

  • “Constraint violated: no new dependencies were approved.”
  • “Acceptance criterion missing: the error state for a slow network isn’t handled.”
  • “Decision mismatch: we agreed on server-side validation, not client-only.”

Now the model can fix without guessing.

And you can approve without anxiety.

The Small Template I Use

If you want something you can paste into a review artifact, keep it short:

  • What changed: one paragraph
  • Expected (artifact): 3-7 bullets
  • Observed: 3-7 bullets
  • Mismatches: list of deviations
  • Action: fix code / update artifact / both
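
Filled in for the same hypothetical retry change used above, it might read like this; every detail is invented for illustration.

  # A filled-in copy of the template for a hypothetical change.
  review_note = """\
  What changed: upload failures are now retried by the upload service.
  Expected (artifact):
    - Retries capped at 3 with backoff
    - No new dependencies
    - One error shown, only after the final retry
  Observed:
    - Retries capped at 3 with backoff
    - New third-party retry library added
    - Error toast shown on every failed attempt
  Mismatches:
    - Constraint violated: unapproved dependency
    - Criterion not met: user sees an error on every attempt
  Action: fix code (drop the dependency, surface the error only after the final retry)
  """
  print(review_note)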

This turns review into a repeatable loop.

You stop relying on the fragile resource of “being in the right mood.”

Why This Is an Operator Skill

When AI speeds up implementation, review becomes the human bottleneck.

Not because humans are slow.

Because humans get overwhelmed.

The Operator mindset doesn’t try to eliminate review.

It tries to make review stable.

Stable review is not a heroic act of reading every line perfectly.

Stable review is a system that still works when you are tired, distracted, or under time pressure.

Artifacts are that system.

If you need a short primer on writing those artifacts, The Kindness of Definition is the guide to turning vibes into executable criteria.

And if you want a way to capture intent while work is moving, The Post-It Note Analogy shows how to create the lightweight briefs that keep reviews grounded.

Conclusion

If you want calmer AI-assisted engineering, don’t ask for “better output.”

Ask for a better contract.

Write down what must be true.

Review against that.

And when you catch a mismatch, fix it at the source, whether that means the code or the artifact, so you don’t have to argue with the same uncertainty again next week.

The goal is not to win every review.

The goal is to make review predictable enough that you can keep building without carrying the whole system in your head.
