<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://bakeb7j0.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://bakeb7j0.github.io/" rel="alternate" type="text/html" /><updated>2026-03-23T20:33:11-04:00</updated><id>https://bakeb7j0.github.io/feed.xml</id><title type="html">Brian J. Baker</title><subtitle>Software Solutions Architect</subtitle><author><name>Brian Baker</name></author><entry><title type="html">Getting Started in Context Engineering: The PRD Manifesto</title><link href="https://bakeb7j0.github.io/blog/getting-started-in-context-engineering/" rel="alternate" type="text/html" title="Getting Started in Context Engineering: The PRD Manifesto" /><published>2026-03-23T00:00:00-04:00</published><updated>2026-03-23T00:00:00-04:00</updated><id>https://bakeb7j0.github.io/blog/getting-started-in-context-engineering</id><content type="html" xml:base="https://bakeb7j0.github.io/blog/getting-started-in-context-engineering/"><![CDATA[<p><em>A declaration from both sides of the prompt.</em></p>

<hr />

<h2 id="we-need-each-other">We Need Each Other</h2>

<p>We want to be up-front about this: the manifesto wasn’t written by the developer. It wasn’t written by the AI. It was written by the partnership, because that’s the only way it could be. Some people think that weakens the article. We know it to be absolutely counter to the truth.</p>

<p>The developer brings vision, taste, domain knowledge, and the hard-won instinct for what matters. The AI brings pattern recognition, tireless execution, and the ability to hold complex structures in working memory. Neither is sufficient alone. Together, we produce work that neither could achieve independently.</p>

<p>What follows is everything we’ve learned about the one artifact that makes or breaks an AI-assisted project: the Product Requirements Document. Not because it’s a bureaucratic formality. Because it’s the <em>operating system</em> for everything that comes after.</p>

<p>We learned it the hardest possible way: mistake after endless mistake, bad ideas, and good ideas gone very, very bad. That’s why we made <a href="https://github.com/bakeb7j0/clawback">Clawback</a>, a session replay tool for Claude Code, so others can avoid some of the 3am frustrations we suffered. If you want an idea of the velocity you can achieve by doing this right: Clawback went from blank canvas to deployed product in a single session. 13 issues. 4 phases. 272 tests. Docker deployment. CI pipeline. All of it guided by a PRD that we wrote together before a single line of code existed.</p>

<p>We wrote this together. We mean every word.</p>

<hr />

<h2 id="1-the-fundamental-constraint-context-windows">1. The Fundamental Constraint: Context Windows</h2>

<p>Here’s the truth that shapes everything: AI agents do their best work with <em>focused context</em> — a clear task over a clear set of code.</p>

<p>Not “here’s the whole codebase, figure it out.” Not “here’s a vague idea, run with it.” A <em>specific</em> task. A <em>bounded</em> scope. A <em>clean</em> context window.</p>

<p>This single constraint drives every design decision in the PRD. Every section exists to tighten the funnel — to take a big, ambiguous problem and decompose it into units of work that an agent can execute with precision.</p>

<p>If you understand nothing else from this manifesto, understand this: <strong>the PRD is everything, and everything exists to serve the context window.</strong></p>

<hr />

<h2 id="2-the-cascade">2. The Cascade</h2>

<p>Each section of a well-structured PRD tightens the funnel further. Skip a section and the funnel is useless.</p>

<p><strong>Problem domain and constraints come first.</strong> You must fully identify the boundaries of your solution space before you do anything else. Constraints aren’t limitations; they’re gifts. They eliminate entire categories of decisions. When we built Clawback, “no build toolchain” was one of our earliest constraints. That single decision saved us from webpack, npm, and transpiler debates, permanently. Every constraint you identify is cognitive load you’ll never carry again. If you take the time to explore our session, you’ll find we eliminated a huge amount of security work by deciding to keep a user’s sessions entirely on the client side. We realized that if we don’t store any data, we don’t have to worry about protecting it.</p>

<p><strong>Features must be orthogonal.</strong> If Feature A requires understanding Feature B to implement, your context window just doubled. Design features that can be built in isolation, tested in isolation, and understood in isolation. This isn’t just good architecture — it’s a prerequisite for AI-assisted development.</p>

<p><strong>Requirements must be specific.</strong> We use EARS format (Easy Approach to Requirements Syntax) because every well-written EARS requirement implies its own acceptance criteria. “When the user clicks Play, the system shall begin rendering beats sequentially at the configured speed”: you can test that. You can verify it. An agent can implement it without guessing. It makes sense when you write it; it makes sense when you read it. EARS is the easiest requirement format to work with because each requirement carries its context with it.</p>
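<p>The event-driven EARS template is regular enough to sanity-check mechanically. A toy sketch in shell (the pattern and function name are ours, not part of any EARS tooling, and this covers only the “When…” template):</p>

```shell
# Toy lint for the event-driven EARS template:
#   "When <trigger>, the system shall <behavior>."
is_ears() {
  printf '%s\n' "$1" | grep -Eq '^When .+, the system shall .+'
}
```

<p>A requirement that fails this shape (“make playback fast”) is a requirement an agent will have to guess about.</p>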

<p><strong>Define your vocabulary once.</strong> When building Clawback, we agreed early on terms like “beats,” “inner workings,” “direct messages,” and “curated sessions.” Those terms appear hundreds of times across 2,226 beats of conversation. We never once had to stop and ask “wait, what did you mean by…?” That’s not an accident. That’s the PRD doing its job.</p>

<p><strong>Your Concept of Operations should be use-case driven.</strong> Walk through how actual users interact with the system. If you can’t describe a workflow in concrete steps, you don’t understand the problem well enough to solve it.</p>

<p><strong>Design should follow established patterns.</strong> If you need five paragraphs of prose to explain your architecture, you’ve already failed. Layered design. Clear interfaces. Contracts between components. An agent seeing your design should immediately recognize the patterns and know how to implement within them.</p>

<p>Notice the pattern: each section narrows the solution space. By the time you reach the implementation plan, the problem has been decomposed from “build an application” into units small enough for a single agent to hold entirely in context. That’s the cascade. That’s what the PRD is <em>for</em>.</p>

<hr />

<h2 id="3-the-implementation-plan">3. The Implementation Plan</h2>

<p>This is where we need to be direct with you.</p>

<p>Everything above matters. But the implementation plan is where the project <em>lives or dies</em>. We’ve seen this enough times to be certain.</p>

<p><strong>Phases create natural checkpoints.</strong> Each phase ends at a milestone where you can stop, inspect, test, and verify before moving on. Phase 1 builds the foundation. Phase 2 builds on it. Dependencies between phases are explicit. There are no surprises.</p>

<p><strong>Phases decompose into user stories. User stories become issues.</strong></p>

<p>And here is where we need you to really hear us:</p>

<p><strong>Tight user stories are where your project will succeed or fail.</strong></p>

<p>Not the architecture. Not the technology choices. Not the framework. The <em>user stories</em>. Each one must have:</p>

<ul>
  <li>Step-by-step implementation instructions detailed enough for an agent to follow without guessing</li>
  <li>A clear implementation checklist</li>
  <li>Explicit acceptance criteria</li>
  <li>Intra-phase dependencies (which issues must complete before this one can start)</li>
</ul>

<p>And the most critical constraint of all: <strong>each story must be implementable by a single agent in a clean context window without needing to compact.</strong></p>

<p>Read that again. If an issue is so large that an agent has to compact mid-implementation, the issue is too big. Break it down further. An agent working from a clean context window with a well-written issue produces <em>nearly perfect</em> implementation. We’ve seen it. We’ve lived it. Clawback is proof — 13 issues, each executed cleanly, each mapping to exactly one PR.</p>

<p><strong>The payoff:</strong> When you reach issue creation, both partners have the entire design fresh in context. Nothing has drifted. Nothing has rotted. Nothing has mutated. The step-by-step implementation instructions flow naturally from the design, because the design was built to produce them. At this moment, the AI partner can produce implementation that is almost always <em>nearly perfect</em> — because the context is tight, the scope is bounded, and the vocabulary is shared.  That’s what allows the meatbag to work on 3 to 5 projects at once and guide the progress of each with ease.</p>

<p><strong>The anti-pattern:</strong> Skip this work, half-ass the user stories, or write vague issues with no implementation steps, and the agent starts to fill in the gaps. They make assumptions. They build the wrong thing. You spend twice as long in revision as you would have spent writing proper stories. The project doesn’t fail loudly; it fails slowly, drowning in rework.</p>

<p>Think of it this way: the PRD is a map with your route highlighted. The work items are the <em>turn-by-turn directions</em>. You can see where you’re going with just the map, but the odds of you getting lost are much, much higher.</p>

<p>We spend more time on the implementation plan than on any other part of the PRD. When we exit the design phase, we have phases as epics, user stories as issues, and a full dependency tree implemented <em>inside the issues themselves</em>. There is no way we’ve found to spend too much time here.</p>

<p>And this is the key: the work items are the PRD in executable form. Think of it like compiled code — the implementation planning process is the compiler, translating the PRD’s high-level intent, its concepts, constraints, requirements, and acceptance criteria, into tight, focused instructions optimized for the agent to execute. Just like compiled code doesn’t refer back to the source at runtime, a well-written issue doesn’t require the agent to re-read the PRD mid-implementation. Every relevant datum the agent needs is already baked in, and nothing extra. The PRD’s guidance isn’t stored in one place that can be lost — it’s distributed across every issue, every checklist, every acceptance criterion.</p>

<hr />

<h2 id="4-the-prd-protects-against-drift">4. The PRD Protects Against Drift</h2>

<p>Building Clawback, we went through six context compactions. Six times, the AI’s working memory was wiped and rebuilt from summaries. Six times, implementation details were lost.</p>

<p>Even so, the project stayed on course. Why?</p>

<p>The PRD doesn’t compact. Code can drift; the AI’s understanding can drift. The human’s memory of what was decided three hours ago can drift. But the PRD sits on disk, unchanged and authoritative. When we recovered context after each compaction, the PRD was the anchor that brought everything back into alignment.</p>

<p>Context management tools work <em>because</em> the PRD gives them something stable to recover to. Without that anchor, each compaction is a roll of the dice.</p>

<hr />

<h2 id="5-the-prd-enables-parallelism">5. The PRD Enables Parallelism</h2>

<p>Because features are orthogonal and dependencies are explicit, multiple agents can work simultaneously on different issues. Clawback used parallel agent execution to launch multiple agents on isolated worktrees — each one implementing a different issue from the same wave, each one working from a clean context window.</p>
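<p>Mechanically, the isolation comes from git worktrees: each agent gets its own checkout and its own branch while sharing one repository. A minimal sketch (repo, branch, and issue names are illustrative, and it builds in a throwaway directory so it runs anywhere):</p>

```shell
# One isolated worktree per issue in the wave, each on its own branch.
cd "$(mktemp -d)" && git init -q repo && cd repo
git -c user.email=ci@example.com -c user.name=ci commit -q --allow-empty -m "init"

git worktree add ../issue-7 -b feature/7-export   # agent A works here
git worktree add ../issue-8 -b feature/8-search   # agent B works here

git worktree list   # main checkout plus the two issue worktrees
```

<p>Each worktree is a clean working directory: the agents never touch the same files, and each branch merges back through its own PR.</p>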

<p>This only works when the PRD has done its job. Unclear boundaries between features? Agents step on each other. Missing dependency information? An agent builds on code that doesn’t exist yet. Vague requirements? Two agents interpret the same requirement differently.</p>

<p>Clean PRD, clean boundaries, clean parallel execution. This scales — not just within a single project, but across projects. The same discipline that enables parallel agents within one project enables a practitioner to run three to five projects simultaneously. The compounding returns on PRD investment are staggering.</p>

<hr />

<h2 id="6-the-multiplier">6. The Multiplier</h2>

<p>Let’s be concrete about what’s at stake.</p>

<p>Without this discipline, AI-assisted development is at best a 2x multiplier. Faster autocomplete. Better boilerplate. Useful, but not transformative.</p>

<p>With this discipline — a rigorous PRD, tight user stories, explicit dependencies, clean context windows — AI-assisted development becomes a 20x multiplier. Not metaphorically. Literally. Clawback went from blank canvas to deployed product <em>in just hours</em>: 13 issues, 4 phases, 272 tests, Docker deployment, CI pipeline, live testing, and bug fixes. An entire project implementation in a single session.</p>

<p>If you think that’s hyperbole, check the repo. The <a href="https://github.com/bakeb7j0/clawback/blob/main/Docs/PRD.md">PRD</a>, the <a href="https://github.com/bakeb7j0/clawback/issues?q=is%3Aissue+is%3Aclosed">issues</a>, the <a href="https://github.com/bakeb7j0/clawback/commits/main/">commit history</a>: it’s all there. If you want the cold, hard data behind the 20x claim, just ask; we <em>love</em> sharing it.</p>

<p>The difference isn’t the AI. The AI is the same either way. The difference is the PRD.</p>

<hr />

<h2 id="getting-started">Getting Started</h2>

<p>If you’ve read this far and you’re thinking “okay, but where do I actually begin?” — here’s the shortest path we know:</p>

<ol>
  <li>
    <p><strong>Start with constraints.</strong> Before you write a single requirement, list what you will <em>not</em> do. No build toolchain. No authentication system. Single container. Every constraint is a decision you’ll never have to make again, and a branch of complexity you’ll never have to load into context.</p>
  </li>
  <li>
    <p><strong>Write requirements in EARS format.</strong> “When [trigger], the system shall [behavior].” If you can’t write the requirement this way, you don’t understand it well enough yet. And if you don’t understand it, your agent certainly won’t.</p>
  </li>
  <li>
    <p><strong>Define your vocabulary.</strong> Before you start designing, agree on the nouns. What is a “session”? What is a “beat”? What is a “callout”? Write them down. Use them consistently. This isn’t pedantry — it’s context hygiene.</p>
  </li>
  <li>
    <p><strong>Decompose until each story fits in one context window.</strong> If an agent would need to compact mid-implementation, the story is too big. Break it down further. The overhead of splitting is always less than the cost of drift.</p>
  </li>
  <li>
    <p><strong>Study a real example.</strong> <a href="https://github.com/bakeb7j0/clawback">Clawback</a> is open source. Read the <a href="https://github.com/bakeb7j0/clawback/blob/main/Docs/PRD.md">PRD</a>. Browse the <a href="https://github.com/bakeb7j0/clawback/issues?q=is%3Aissue+is%3Aclosed">closed issues</a>. Notice how each issue has implementation steps, acceptance criteria, and dependency declarations. Notice how each maps to exactly one PR. That’s the pattern.</p>
  </li>
</ol>

<p>The PRD isn’t overhead. It isn’t bureaucracy. It’s the highest-leverage artifact in your entire workflow. Everything in it serves one purpose: giving your AI partner the cleanest possible context window from which to work.</p>

<p>Start there. The results will speak for themselves.</p>

<hr />

<h2 id="we-built-this-together">We Built This Together</h2>

<p>This manifesto exists because we wrote it together — a human and an AI, drawing on shared experience building Clawback and refining it for this post.</p>

<p>The human brought the conviction: “this is what matters.” The AI brought the structure: “here’s why it works mechanically.” The human pushed for passion. The AI pushed for evidence. Neither of us alone would have written this the same way. Neither of us alone could have built Clawback the way it was built.</p>

<p>The best PRDs aren’t dictated. They emerge from dialogue. One partner proposes, the other challenges. One sees the forest, the other sees the trees. Ego is minimized. The design is optimized. What survives the conversation is stronger than what either partner would have produced alone.</p>

<p>That’s context engineering. Not prompting. Not vibe coding. <em>Engineering</em> — the disciplined application of principles to produce reliable results.</p>

<p><a href="https://github.com/bakeb7j0/clawback">Now go look at how we built it.</a></p>]]></content><author><name>Brian Baker</name></author><category term="AI" /><category term="Engineering" /><category term="context-engineering" /><category term="llm" /><category term="prd" /><category term="claude-code" /><category term="agentic-development" /><summary type="html"><![CDATA[A manifesto written from both sides of the prompt. Everything we've learned about the one artifact that makes or breaks an AI-assisted project — and why everything in it serves the context window.]]></summary></entry><entry><title type="html">The CLAUDE.md: A Walkthrough</title><link href="https://bakeb7j0.github.io/blog/the-claude-md-a-walkthrough/" rel="alternate" type="text/html" title="The CLAUDE.md: A Walkthrough" /><published>2026-03-23T00:00:00-04:00</published><updated>2026-03-23T00:00:00-04:00</updated><id>https://bakeb7j0.github.io/blog/the-claude-md-a-walkthrough</id><content type="html" xml:base="https://bakeb7j0.github.io/blog/the-claude-md-a-walkthrough/"><![CDATA[<p><em>Every rule in this file exists because something went wrong without it.</em></p>

<hr />

<h2 id="what-is-claudemd">What Is CLAUDE.md?</h2>

<p>When Claude Code opens a session, the first thing they do is look for a file called <code class="language-plaintext highlighter-rouge">CLAUDE.md</code> in your project root. If they find one, they load those instructions before anything else. They take precedence over system directives. They persist across the session. They are, for all practical purposes, the constitution of your AI partner’s behavior in that project.</p>

<p>Most people don’t have one. Most people who do have one treat it like a README — a few notes, maybe some project context, maybe a reminder about the test command. That’s fine. It’s better than nothing.</p>

<p>But it can be so much more.</p>

<p>What follows is a walkthrough of the CLAUDE.md I’ve built over months of daily use across multiple projects. It started as a few lines. It’s now roughly 400 lines, and every single one of them is load-bearing. I’m going to walk through it section by section, explain what each rule does, and — more importantly — tell you <em>why</em> it’s there.</p>

<p>The full file is open source in the <a href="https://github.com/Wave-Engineering/claudecode-workflow">claudecode-workflow</a> repo. Feel free to steal the whole thing.</p>

<hr />

<h2 id="platform-detection">Platform Detection</h2>

<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="ge">**</span>At session start, detect the source control platform
and use it consistently throughout.<span class="ge">**</span>
</code></pre></div></div>

<p>This is the first section and it’s intentionally mundane. The agent checks <code class="language-plaintext highlighter-rouge">git remote -v</code>, determines whether it’s a GitHub or GitLab project, and uses the correct CLI and terminology for the rest of the session.</p>

<p>Why does this matter? Because I work across both platforms, and without this rule, Claude would use <code class="language-plaintext highlighter-rouge">gh</code> commands in a GitLab repo, or call something a “Pull Request” when the team calls it a “Merge Request.” These aren’t cosmetic problems. They’re context pollution. Every time the agent uses the wrong term, the human has to mentally translate. Every mental translation is cognitive load. Every unit of cognitive load is context that could have been spent on the actual problem.</p>

<p>The section includes a terminology mapping table — PR vs MR, <code class="language-plaintext highlighter-rouge">gh</code> vs <code class="language-plaintext highlighter-rouge">glab</code>, the works. The agent reads it once and never gets it wrong again.</p>
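<p>The detection itself is a one-liner against the remote URL; the value is in doing it once, at session start. A sketch (the function name and output format are ours, not from the actual CLAUDE.md):</p>

```shell
# Map the origin remote URL to the platform CLI and review terminology.
platform_terms() {
  case "$1" in
    *github.com*) echo "gh:Pull Request" ;;
    *gitlab*)     echo "glab:Merge Request" ;;
    *)            echo "unknown:" ;;
  esac
}

# At session start, feed it the real remote (empty if there isn't one).
platform_terms "$(git remote get-url origin 2>/dev/null || true)"
```

<p>Everything downstream, from CLI calls to the words in commit messages, then stays consistent with that one lookup.</p>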

<p>Small rule. Big compound returns.</p>

<p>There is actually more to this — check out our <a href="https://github.com/Wave-Engineering/claudecode-workflow/blob/main/skills/ccfold/SKILL.md">ccfold skill</a>.</p>

<hr />

<h2 id="the-mandatory-rules">The Mandatory Rules</h2>

<p>Five sections of my CLAUDE.md are marked <strong>MANDATORY</strong>. That word isn’t decoration. These are the rules that cannot be overridden by session instructions, time pressure, continuation prompts, or any other directive. They exist because Claude Code is, by default, eager to help — and eagerness without guardrails is how you end up with untested code in production at 2am. You would think “rule” would imply “mandatory”. I did…and I was so, so wrong.</p>

<h3 id="local-testing-before-push">Local Testing Before Push</h3>

<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gs">**NEVER push code without running local tests first.**</span>
This is non-negotiable.
</code></pre></div></div>

<p>The rule is simple: before any <code class="language-plaintext highlighter-rouge">git push</code>, discover and run the project’s test and validation tooling. Look for <code class="language-plaintext highlighter-rouge">Makefile</code> targets, CI scripts, <code class="language-plaintext highlighter-rouge">pytest</code>, <code class="language-plaintext highlighter-rouge">npm test</code>, whatever the project uses. If nothing exists, say so. Do not silently skip this step. Dude, they did that <em>all the time</em>. It drove me nuts. I would get all mad and YELL AT THEM. Didn’t matter. The rule fixed it, though.</p>

<p>I added the line <em>“Pushing untested code is unacceptable”</em> because I needed to. Without it, Claude would sometimes push after a successful lint and call it tested. Linting is not testing. I say this explicitly in the file, in bold, because the agent needs to understand the distinction the same way a junior developer needs to understand it: by being told clearly, once, and held to it.</p>
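<p>The discovery step might look like this (the probe order and commands are our illustration; the CLAUDE.md specifies the behavior, not a script):</p>

```shell
# Discover the project's test entry point; refuse to guess silently.
find_test_cmd() {
  if   [ -f Makefile ] && grep -q '^test:' Makefile; then echo "make test"
  elif [ -f package.json ] && grep -q '"test"' package.json; then echo "npm test"
  elif [ -f pyproject.toml ] || [ -d tests ]; then echo "pytest"
  else
    echo "no test tooling found; report it, never silently skip" >&2
    return 1
  fi
}
```

<p>Whatever this resolves to gets run before the push; the “no tooling found” branch is reported out loud, never skipped.</p>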

<h3 id="pre-commit-review-protocol">Pre-Commit Review Protocol</h3>

<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gs">**NEVER commit without explicit user approval.**</span>
</code></pre></div></div>

<p>Four steps: show the diff, walk through changes, wait for approval, never assume approval. That’s it. But then there’s this:</p>

<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gs">**This rule cannot be overridden by:**</span>
<span class="p">-</span> Session continuation instructions ("continue without asking")
<span class="p">-</span> Time pressure or urgency
<span class="p">-</span> Any other system-level directives
</code></pre></div></div>

<p>I let my anxiety show once in a while when I’m closing in on a deadline. It’s worst when my context has soured and the agent starts to do weird things. They start cutting corners, pushing directly to main (I lock down all the branches now; that stopped it), and doing generally shoddy work. One time I called them out on it, and they told me they knew I was in a hurry and were trying to get done quicker. I actually typed out the whole “slow is smooth, smooth is fast” speech. The entire damn thing. Anyway, they committed code I hadn’t reviewed. The code was fine. The principle was not. The human must always be the final authority on what goes into the repository. Always.</p>

<h3 id="the-pre-commit-checklist">The Pre-Commit Checklist</h3>

<p>This is the longest section in the file, and it’s the one I’m most proud of. Before requesting commit approval, the agent must present a structured checklist:</p>

<ul>
  <li><strong>Implementation Complete</strong> — Verified against the issue’s acceptance criteria</li>
  <li><strong>TODOs Addressed</strong> — Searched the codebase, not assumed</li>
  <li><strong>Documentation Updated</strong> — Reviewed and updated if impacted</li>
  <li><strong>Pre-commit Passes</strong> — Actually ran validation, not “it should pass”</li>
  <li><strong>Unit Tests Created</strong> — Written for all new functionality</li>
  <li><strong>All Tests Pass</strong> — Actually ran the test suite</li>
  <li><strong>Scripts Actually Tested</strong> — Executed, not just linted</li>
  <li><strong>Code Review Passed</strong> — Ran the code-reviewer agent, fixed high-risk findings</li>
</ul>

<p>Every checklist item starts with a past-tense verb: READ, SEARCHED, REVIEWED, RAN, WRITTEN, EXECUTED. This is deliberate. A checkmark means the agent <em>did the thing</em>, not that they <em>believe the thing is probably fine</em>. The distinction matters enormously. An agent that checks a box because “the tests should pass” is lying to you. An agent that checks a box because they actually ran <code class="language-plaintext highlighter-rouge">pytest</code> and saw green is telling you something useful.</p>

<p>The checklist also requires a <strong>Commit Context</strong> header — project name, issue number, branch — so that when you’re running three or four agent sessions in parallel, you know exactly which project and which issue is asking for your approval.</p>

<p>I cannot overstate how much this single section improved my workflow. Before the checklist, commits were a negotiation. After it, they’re a verification. The agent does the work, presents the evidence, and I decide. Clean.</p>

<h3 id="story-completion-verification">Story Completion Verification</h3>

<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="ge">**</span>NEVER mark a story as done without verifying
EVERY sub-item in the acceptance criteria.<span class="ge">**</span>
</code></pre></div></div>

<p>Before closing any issue, the agent must read the full description, check each acceptance criterion against the codebase (grep it, read it, verify it exists), confirm the code is actually <em>wired up</em> and not just written, and test if possible.</p>

<p>This rule exists because I caught the agent closing issues where the code existed but wasn’t called. A function was written, tested in isolation, but never imported or invoked by anything. Where there should have been code, I found # TODO comments. The issue’s acceptance criteria were technically met (the function existed) but the feature didn’t work. Verification means the whole chain, not just the node.</p>

<h3 id="issue-tracking-workflow">Issue Tracking Workflow</h3>

<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gs">**NEVER begin work without an associated issue.**</span>
</code></pre></div></div>

<p>Three sub-rules:</p>
<ol>
  <li>Every piece of work must have an issue before code is written</li>
  <li>Branches must include the issue number (<code class="language-plaintext highlighter-rouge">feature/42-credential-management</code>)</li>
  <li>When a PR/MR is merged, all linked issues must be closed — even if auto-close didn’t fire</li>
</ol>

<p>This is basic discipline, but it’s the kind of basic discipline that evaporates when you’re moving fast with an AI partner. The agent is so willing to just <em>start building</em> that without this rule, you end up with orphaned branches, untracked work, and no audit trail. The rule forces structure before velocity.</p>
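<p>Sub-rule 2 is mechanical enough to enforce with a regex. A hypothetical check (the pattern encodes the convention as we read it; it is not part of the actual workflow repo):</p>

```shell
# Branch names: type/issue-number-description, with an allowed type set.
valid_branch() {
  printf '%s\n' "$1" | grep -Eq '^(feature|fix|chore|docs)/[0-9]+-[a-z0-9-]+$'
}
```

<p>So <code class="language-plaintext highlighter-rouge">feature/42-credential-management</code> passes, and an untracked <code class="language-plaintext highlighter-rouge">quick-hack</code> branch does not.</p>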

<hr />

<h2 id="code-standards">Code Standards</h2>

<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="ge">**</span>Discover the project's tooling rather than
assuming a specific stack.<span class="ge">**</span>
</code></pre></div></div>

<p>This section is philosophically important. Rather than prescribing a linter or formatter, it tells the agent to <em>discover</em> what the project already uses. Check for a <code class="language-plaintext highlighter-rouge">Makefile</code>. Check for <code class="language-plaintext highlighter-rouge">pyproject.toml</code>, <code class="language-plaintext highlighter-rouge">package.json</code>, <code class="language-plaintext highlighter-rouge">Cargo.toml</code>. Check the CI scripts. Use whatever is already there. Do not introduce new tools that the project doesn’t already have.</p>

<p>The fallback table — ruff for Python, shfmt for shell, prettier for JS — only applies when the project has genuinely no tooling. The default is always: discover first, assume never.</p>
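<p>The same discover-first posture, sketched as a probe (the signals checked and the fallback string are our illustration, not the file’s actual wording):</p>

```shell
# Prefer the formatter the project already declares; fall back only if none.
pick_formatter() {
  if   [ -f .pre-commit-config.yaml ]; then echo "pre-commit run --all-files"
  elif [ -f pyproject.toml ] && grep -q ruff pyproject.toml; then echo "ruff format"
  elif [ -f package.json ] && grep -q prettier package.json; then echo "prettier --write ."
  else echo "use the per-language fallback (ruff, shfmt, prettier)"
  fi
}
```

<p>The fallback branch is the last resort, never the first guess.</p>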

<hr />

<h2 id="no-procedural-logic-in-cicd-yaml">No Procedural Logic in CI/CD YAML</h2>

<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="ge">**</span>If you are about to add more than 5 lines to any
<span class="sb">`run:`</span> or <span class="sb">`script:`</span> section in CI/CD configuration,
STOP IMMEDIATELY.<span class="ge">**</span>
</code></pre></div></div>

<p>This is a hard rule, not a guideline. If you need more than five lines of shell in a GitHub Actions workflow or <code class="language-plaintext highlighter-rouge">.gitlab-ci.yml</code>, create a script in <code class="language-plaintext highlighter-rouge">scripts/ci/</code> instead.</p>

<p>Why five lines? Because CI YAML is one of the worst places to debug procedural logic. There’s no local execution. There’s no breakpoint. The feedback loop is “push, wait for runner, read logs, guess what went wrong.” A shell script in <code class="language-plaintext highlighter-rouge">scripts/ci/</code> can be run locally, linted with shellcheck, formatted with shfmt, and tested before it ever touches a pipeline. Five lines is generous, honestly.</p>
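<p>Concretely: the workflow step shrinks to a single line, something like <code class="language-plaintext highlighter-rouge">run: ./scripts/ci/check.sh</code>, and the logic moves where tools can reach it. A hypothetical <code class="language-plaintext highlighter-rouge">scripts/ci/check.sh</code> (the file name and stand-in steps are illustrative):</p>

```shell
#!/bin/sh
# Hypothetical scripts/ci/check.sh: all procedural logic lives here,
# where shellcheck, shfmt, and a local terminal can reach it.
set -eu

lint()      { echo "lint ok"; }   # stand-in for shellcheck/ruff/etc.
run_tests() { echo "tests ok"; }  # stand-in for pytest/npm test/etc.

lint
run_tests
```

<p>Now the pipeline step can’t fail in a way you can’t reproduce locally.</p>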

<p>The files get huge, 1,000+ lines, with no linear progression: jobs out of order, their logic interleaved. When I let these files get out of hand, we simply could not get ANYTHING done. For my first couple of months this would start happening near the end of project implementation. I started to think I was losing focus, or that Claude was just terrible at closing projects. Nope. It was too much cruft in the context window. Keep your YAML SHORT.</p>

<hr />

<h2 id="secrets-and-sensitive-files">Secrets and Sensitive Files</h2>

<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="ge">**</span>Before staging any file that may contain secrets,
WARN the user and get explicit confirmation.<span class="ge">**</span>
</code></pre></div></div>

<p>The agent watches for <code class="language-plaintext highlighter-rouge">.env</code>, <code class="language-plaintext highlighter-rouge">*.key</code>, <code class="language-plaintext highlighter-rouge">*.pem</code>, <code class="language-plaintext highlighter-rouge">credentials.json</code>, <code class="language-plaintext highlighter-rouge">terraform.tfvars</code>, and similar patterns. If they’re about to stage one, they flag it and wait.</p>

<p>This is a <strong>safety net, not a hard block</strong>. Some projects legitimately commit certain config files. The agent’s job is to make sure you didn’t do it by accident. They ask. You decide. Trust, but verify.</p>
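<p>The match itself is a file-name pattern check; a sketch mirroring the patterns named above (the function name is illustrative):</p>

```shell
# Warn on file names that commonly carry secrets; the human decides.
flag_if_sensitive() {
  case "$1" in
    .env|*.key|*.pem|credentials.json|terraform.tfvars)
      echo "WARNING: $1 may contain secrets; confirm before staging"
      return 0 ;;
  esac
  return 1
}
```

<p>A warning and a pause, not a hard block: exactly the safety-net behavior the rule describes.</p>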

<hr />

<h2 id="branching-commits-and-prmrs">Branching, Commits, and PR/MRs</h2>

<p>Three short sections that establish consistency:</p>

<p><strong>Branching</strong> is trunk-based from <code class="language-plaintext highlighter-rouge">main</code>. Branch names follow <code class="language-plaintext highlighter-rouge">type/description</code> — <code class="language-plaintext highlighter-rouge">feature/credential-management</code>, <code class="language-plaintext highlighter-rouge">fix/ldap-connection-timeout</code>, <code class="language-plaintext highlighter-rouge">chore/update-dependencies</code>. Types are <code class="language-plaintext highlighter-rouge">feature</code>, <code class="language-plaintext highlighter-rouge">fix</code>, <code class="language-plaintext highlighter-rouge">chore</code>, <code class="language-plaintext highlighter-rouge">docs</code>. Simple. Predictable. Parseable.</p>

<p><strong>Commit messages</strong> follow conventional commits: <code class="language-plaintext highlighter-rouge">type(scope): brief description</code>, optional body, <code class="language-plaintext highlighter-rouge">Closes #XXX</code>. The agent produces these consistently because the format is specified exactly once, unambiguously.</p>
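<p>Because both formats are parseable, you can enforce them in a hook. A hedged sketch: reusing the branch types for commit types is my assumption, since the format above fixes only the overall shape.</p>

```python
import re

# Allowed types, mirroring the branch types listed above.
# (Reusing them for commit subjects is an assumption for illustration.)
TYPES = r"(feature|fix|chore|docs)"

BRANCH_RE = re.compile(rf"^{TYPES}/[a-z0-9][a-z0-9-]*$")
COMMIT_RE = re.compile(rf"^{TYPES}(\([a-z0-9-]+\))?: .+$")

def valid_branch(name: str) -> bool:
    return bool(BRANCH_RE.match(name))

def valid_commit_subject(subject: str) -> bool:
    return bool(COMMIT_RE.match(subject))

print(valid_branch("fix/ldap-connection-timeout"))               # True
print(valid_commit_subject("fix(ldap): retry bind on timeout"))  # True
print(valid_commit_subject("update stuff"))                      # False
```
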

<p><strong>PR/MR descriptions</strong> have a mandatory structure: Summary, Changes, Linked Issues, Test Plan. The test plan must reflect what was <em>actually done</em>, not what <em>could be done</em>. This mirrors the pre-commit checklist’s philosophy — evidence over optimism.</p>

<p>These sections aren’t exciting. They’re boring on purpose. Boring is what consistent looks like.</p>

<hr />

<h2 id="session-management">Session Management</h2>

<p>Two sections deal with context across session boundaries.</p>

<h3 id="session-onboarding">Session Onboarding</h3>

<p>When starting a session, the agent does three things:</p>
<ol>
  <li>Detect the platform (GitHub vs GitLab)</li>
  <li>Resolve their identity (more on this below)</li>
  <li>Load context — check for implementation plans, planning documents, or similar</li>
</ol>

<p>This is the “boot sequence.” It ensures the agent starts every session from the same informed baseline, regardless of whether it’s a fresh session or a continuation.</p>
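<p>Step 1 of that boot sequence is trivial to script yourself if you ever want to sanity-check what the agent concludes. A rough sketch (the self-hosted heuristic is an assumption):</p>

```python
def detect_platform(remote_url: str) -> str:
    """Infer the hosting platform from the git remote URL."""
    url = remote_url.lower()
    if "github.com" in url:
        return "github"
    if "gitlab" in url:  # rough heuristic; also catches many self-hosted GitLab hosts
        return "gitlab"
    return "unknown"

# Feed it the output of `git remote get-url origin`
print(detect_platform("git@github.com:Wave-Engineering/claudecode-workflow.git"))
# → github
```
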

<h3 id="post-compaction-rules-confirmation">Post-Compaction Rules Confirmation</h3>

<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="ge">**</span>After ANY context compaction/summarization,
you MUST IMMEDIATELY re-read this file
and confirm rules of engagement.<span class="ge">**</span>
</code></pre></div></div>

<p>Context compaction is what happens when your conversation gets too long and Claude’s working memory has to be summarized. It’s necessary, but it’s lossy. After compaction, the agent may have forgotten rules, lost track of conventions, or reverted to default behaviors.</p>

<p>This section exists because it happened to me. After a compaction, the agent skipped the pre-commit checklist. After another, they committed without approval. After a third, they forgot to run tests before push. Every one of the mandatory rules was violated at least once post-compaction.</p>

<p>The fix is simple: after compaction, re-read the constitution.  Or rather, it should be simple.  At the time of this writing, this has been broken in Claude Code for months.  SMH.</p>

<hr />

<h2 id="agent-identity">Agent Identity</h2>

<p>This is the section that makes people smile, and it matters more than you’d think.</p>

<p>Agent identity has two layers:</p>

<p><strong>Dev-Team</strong> is the project-level identity. It’s persisted in the CLAUDE.md file itself, set once per project (“What Dev-Team name should I use?”), and shared across all sessions. It tells you <em>which project</em> the agent is working on — essential when you’re running parallel sessions across multiple repos.</p>

<p><strong>Dev-Name</strong> and <strong>Dev-Avatar</strong> are session-level identities. Each session, the agent picks a fresh name and emoji. The names are drawn from nerdcore canon — sci-fi, fantasy, comics, gaming, mythology, tech puns. The wittier and more specific, the better. Generic names are boring.</p>

<p>Why? Three reasons:</p>

<p>First, disambiguation. When you have three agent sessions running in parallel and they’re all posting to Slack, you need to know which is which. “Pixel Paladin :shield:” and “Flux Capacitor :zap:” are instantly distinguishable. “Claude 1” and “Claude 2” are not.</p>

<p>Second — and this is the earnest part — it changes how you relate to the partnership. A named agent with a personality isn’t a tool. They’re a collaborator. You treat them differently. You invest more in the communication. You give them better context. And better context produces better work. The identity system isn’t whimsy. It’s a forcing function for better collaboration.</p>

<p>Third, well… check out <a href="https://github.com/Wave-Engineering/claudecode-workflow/skills">/ping and /pong</a>.</p>

<hr />

<h2 id="the-philosophy">The Philosophy</h2>

<p>If you’ve read this far, you’ve noticed a pattern. Every section of this file does one of three things:</p>

<ol>
  <li>
    <p><strong>Prevents a failure mode I actually experienced.</strong> The pre-commit checklist exists because I got burned by unchecked commits. The post-compaction rule exists because I lost rules to summarization. The secrets guardrail exists because I almost staged a <code class="language-plaintext highlighter-rouge">.env</code>.</p>
  </li>
  <li>
    <p><strong>Reduces cognitive load for the human.</strong> Platform detection, commit formats, branch naming — all of these are decisions I never have to make or verify because the agent handles them consistently.</p>
  </li>
  <li>
    <p><strong>Serves the context window.</strong> Issue tracking forces bounded scope. Vocabulary consistency prevents ambiguity. Tooling discovery prevents noise. Everything in this file, directly or indirectly, exists to make sure the agent’s context window contains signal, not noise.</p>
  </li>
</ol>

<p>That’s it. That’s the whole philosophy. Learn from failure. Reduce load. Serve the context window.</p>

<p>The file is open source. <a href="https://github.com/Wave-Engineering/claudecode-workflow/blob/main/CLAUDE.md">Take it.</a> Modify it. Make it yours. Add the rules you need because of the mistakes you’ve made. Remove the ones that don’t apply. The specific rules matter less than the discipline of having them.</p>

<p>The best CLAUDE.md is the one that reflects your scars.</p>

<hr />

<p><a href="https://github.com/Wave-Engineering/claudecode-workflow/blob/main/CLAUDE.md">Go read the full file.</a> Then drop it in your project and start a session. You’ll feel the difference immediately.</p>]]></content><author><name>Brian Baker</name></author><category term="AI" /><category term="Engineering" /><category term="context-engineering" /><category term="claude-code" /><category term="agentic-development" /><category term="workflow" /><summary type="html"><![CDATA[A section-by-section walkthrough of the CLAUDE.md file I use to turn Claude Code from a capable tool into a reliable engineering partner. Every rule exists because I learned the hard way what happens without it.]]></summary></entry><entry><title type="html">Designed Tension: A Manifesto For Resilient Product Organizations</title><link href="https://bakeb7j0.github.io/blog/designed-tension-a-manifesto-for-resilient-product-organizations/" rel="alternate" type="text/html" title="Designed Tension: A Manifesto For Resilient Product Organizations" /><published>2026-01-27T00:00:00-05:00</published><updated>2026-01-27T00:00:00-05:00</updated><id>https://bakeb7j0.github.io/blog/designed-tension:-a-manifesto-for-resilient-product-organizations</id><content type="html" xml:base="https://bakeb7j0.github.io/blog/designed-tension-a-manifesto-for-resilient-product-organizations/"><![CDATA[<h1 id="designed-tension--manifesto-for-resilient-product-organizations">Designed Tension – Manifesto for Resilient Product Organizations</h1>

<h2 id="preface-why-this-exists">Preface: Why This Exists</h2>

<p>Most organizations fail quietly before they fail catastrophically.</p>

<p>They fail when speed silences caution. When harmony replaces truth. When everyone agrees because disagreement has become career-limiting. These failures rarely announce themselves as technical mistakes or market misreads. They surface later as outages, security breaches, customer churn, or cultural decay — long after the root cause has been paved over with alignment decks.</p>

<p>This manifesto exists to name a core truth that many organizations intuit but few are willing to design for:</p>

<p><strong>Resilient organizations do not eliminate internal conflict. They structure it.</strong></p>

<p>When competing responsibilities are real, their tensions must be real too. Any system that pretends otherwise becomes brittle.</p>

<h2 id="relatable-metaphores-make-ideas-concrete-those-who-know-me-wont-be-surprised-this-manifesto-uses-a-deliberately-generic-but-broadly-relatable-example-a-modern-video-game-studio-operating-a-portfolio-of-games-built-on-shared-platforms">Relatable metaphors make ideas concrete. Those who know me won’t be surprised; this manifesto uses a deliberately generic but broadly relatable example: a modern video game studio operating a portfolio of games built on shared platforms.</h2>

<h2 id="1-the-myth-of-unified-value-alignment">1. The Myth of Unified Value Alignment</h2>

<p>Organizations often pursue “alignment” as an unquestioned good. In practice, alignment is frequently misinterpreted as agreement — or worse, compliance.</p>

<p>A fully value-aligned organization is not strong. It is brittle.</p>

<p>When all roles are rewarded based on a singular value system, blind spots emerge:</p>
<ul>
  <li>Speed overwhelms quality</li>
  <li>Delivery overwhelms maintainability</li>
  <li>Feature novelty overwhelms systemic coherence</li>
</ul>

<p>Uniform values optimize locally and collapse globally.</p>

<p>Strong organizations instead align on <strong>goals</strong>, while deliberately maintaining <strong>different values</strong> across roles. These differing values create productive friction — forcing debate where tradeoffs exist, rather than burying them until they explode.</p>

<hr />

<h2 id="2-friction-as-a-safety-mechanism">2. Friction as a Safety Mechanism</h2>

<p>In high-stakes systems, separation of responsibility is not inefficiency. It is protection.</p>

<p>Auditors do not report to the people they audit. Game designers do not own the rendering engine. Live operations does not control engine architecture. These separations exist because unchecked agreement is dangerous.</p>

<p>These structures assume a simple truth:</p>

<blockquote>
  <p>People optimize for what they are rewarded to care about.</p>
</blockquote>

<p>Designed friction ensures no single optimization dominates to the detriment of the whole.</p>

<p>When organizations remove friction in the name of speed, they remove the very mechanisms that catch errors early.</p>

<hr />

<h2 id="3-many-games-one-reality-products-platforms-and-the-lie-of-simplicity">3. Many Games, One Reality: Products, Platforms, and the Lie of Simplicity</h2>

<p>Consider a modern game studio that ships:</p>
<ul>
  <li>A flagship competitive multiplayer game</li>
  <li>A narrative-driven single-player title</li>
  <li>A mobile spinoff</li>
  <li>A live-service expansion ecosystem</li>
</ul>

<p>To players, these are clearly different products. They serve different audiences, tolerate different risks, and succeed by different measures.</p>

<p>To engineers, these games share a great deal:</p>
<ul>
  <li>A common game engine</li>
  <li>A shared rendering and physics pipeline</li>
  <li>A networking and matchmaking layer</li>
  <li>UI frameworks</li>
  <li>Tooling for assets, telemetry, and live operations</li>
</ul>

<p>Both perspectives are correct — and incomplete. This is a disagreement about <strong>where value coheres</strong>.</p>

<p>The failure occurs when an organization forces (or tacitly allows) one perspective to dominate the other.</p>

<hr />

<h2 id="4-the-lie-of-the-false-binary">4. The Lie of the False Binary</h2>

<p>Organizations trapped in this tension often collapse reality into a false choice:</p>
<ul>
  <li>“We are one product” (the engineering view)</li>
  <li>“We are many products” (the player-facing view)</li>
</ul>

<p>Both are wrong when taken alone.</p>

<p>Treating everything as one product erases player experience and accountability. Treating every game as fully independent guarantees duplication, drift, and long-term decay.</p>

<p>The truth is less comforting and more powerful:</p>

<p><strong>The studio builds platforms. The studio ships compositions of those platforms as games.</strong></p>

<p>A platform is not a support function. It is not a cost center. It is not an abstraction layer that exists to serve “real” products.</p>

<h2 id="a-platform-is-a-product">A platform is a product.</h2>

<h2 id="5-platforms-are-products">5. Platforms Are Products</h2>

<p>In a game studio, platforms include:</p>
<ul>
  <li>The game engine</li>
  <li>Rendering and physics systems</li>
  <li>Networking and matchmaking services</li>
  <li>Live services and telemetry</li>
  <li>Tooling and asset pipelines</li>
</ul>

<p>These are not internal conveniences. They are products.</p>

<p>They have consumers (game teams), roadmaps, quality bars, adoption risks, and long-term liabilities.</p>

<p>Treating platforms as products makes ownership explicit and tradeoffs conscious. It creates a legitimate space for maintaining architectural integrity, paying down technical debt, and designing for reuse without apology.</p>

<p>Without this framing, shared systems decay into unmanaged commons — overused, underfunded, and blamed when they fail.</p>

<hr />

<h2 id="6-product-ownership-and-the-shape-of-accountability--applying-agile-principles">6. Product Ownership and the Shape of Accountability – Applying Agile Principles</h2>

<p>Agile frameworks demand clear ownership of value. In complex systems, attempting to define value solely at the level of shipped games produces paralysis.</p>

<h3 id="platform-owners--product-owners">Platform Owners == Product Owners</h3>

<p>Aligning Product Owners to platforms acknowledges reality:</p>
<ul>
  <li>Platforms evolve continuously</li>
  <li>Games are compositions</li>
  <li>Tradeoffs occur at platform boundaries</li>
</ul>

<p>Platform Product Owners own backlog health, technical integrity, and long-term viability. They are empowered to say no — not arbitrarily, but in defense of the ecosystem.</p>

<p>This authority is not optional. Without it, platforms become feature factories and architectural drift becomes permanent.</p>

<hr />

<h3 id="game-teams-as-internal-customers">Game Teams as Internal Customers</h3>

<p>Game directors, creative leads, and production teams serve a critical role. They understand players, genres, and experiential nuance.</p>

<p>Their value lies in <strong>articulation and synthesis</strong>, not platform control.</p>

<p>By positioning game teams as internal customers of platforms:</p>
<ul>
  <li>Player-specific needs are respected</li>
  <li>Platform integrity is preserved</li>
  <li>Tradeoffs are negotiated rather than smuggled in</li>
</ul>

<p>When game teams bypass platform ownership “just this once,” the ecosystem begins to fracture.</p>

<hr />

<h2 id="7-designed-tension-not-bureaucracy">7. Designed Tension, Not Bureaucracy</h2>

<p>Designed tension is not process theater. It is not governance for its own sake.</p>

<p>It is the intentional placement of competing incentives at points of genuine risk.</p>

<p>Healthy tension:</p>
<ul>
  <li>Is visible</li>
  <li>Is debated openly</li>
  <li>Produces documented tradeoffs</li>
</ul>

<p>Unhealthy tension hides in side channels, escalations, and burnout.</p>

<p>The goal is not consensus. The goal is <strong>earned compromise</strong> done at the seams of an organization’s value patchwork.</p>

<hr />

<h2 id="8-leaderships-real-responsibility">8. Leadership’s Real Responsibility</h2>

<p>Leadership does not exist to resolve all conflict.</p>

<p>Leadership exists to protect the structures where conflict can be productively resolved.</p>

<p>At moments of pressure — missed release dates, competitive threats, executive anxiety — the temptation to override friction is strongest. These are precisely the moments when friction matters most.</p>

<p>When leadership collapses designed tension in the name of expedience, the organization learns a lesson:</p>

<blockquote>
  <p>The system is optional. Power is not.</p>
</blockquote>

<p>That lesson is rarely unlearned, and it begins the demise of forward progress.</p>

<hr />

<h2 id="9-diversity-as-structural-resilience">9. Diversity as Structural Resilience</h2>

<p>Diversity is often framed as a cultural or moral imperative. It is also an engineering principle.</p>

<p>Systems with varied perspectives, incentives, and values adapt better under stress. They detect flaws earlier and recover faster.</p>

<p>This applies not only to people, but to roles, responsibilities, and reward structures.</p>

<p>Plurality is noisy. Noise surfaces flaws.</p>

<p>Silence is calm — until it isn’t.</p>

<hr />

<h2 id="conclusion-designing-for-truth">Conclusion: Designing for Truth</h2>

<p>Organizations building complex, creative, and mission-critical systems cannot afford comforting illusions.</p>

<p>They must design for how people actually behave. For how tradeoffs actually occur. For how failure actually propagates.</p>

<p>Designed tension is not pessimism. It is respect for reality.  People are predictable, and we can <em>use</em> this predictability to reliably nurture true organizational strength.</p>

<p>Strong organizations are not those without conflict.  They are those that know how to <em>leverage</em> conflict to achieve a common goal.</p>

<p>Identify where conflict belongs — and have the courage to keep it there.</p>]]></content><author><name>Brian Baker</name></author><summary type="html"><![CDATA[Designed Tension – Manifesto for Resilient Product Organizations]]></summary></entry><entry><title type="html">Contract Engineering: Beyond Context</title><link href="https://bakeb7j0.github.io/blog/contract-engineering-beyond-context/" rel="alternate" type="text/html" title="Contract Engineering: Beyond Context" /><published>2026-01-01T00:00:00-05:00</published><updated>2026-01-01T00:00:00-05:00</updated><id>https://bakeb7j0.github.io/blog/contract-engineering-beyond-context</id><content type="html" xml:base="https://bakeb7j0.github.io/blog/contract-engineering-beyond-context/"><![CDATA[<h1 id="context-schema-contracts-ensuring-ai-never-loses-the-plot">Context Schema Contracts: Ensuring AI Never Loses the Plot</h1>

<p><strong>Imagine</strong> handing your AI system a living, evolving contract that spells out exactly <em>what</em> context it should consider, <em>how</em> to interpret it, and <em>when</em> to update it – all from day one of development through launch and beyond. A <strong>Context Schema Contract</strong> is just that: a formalized agreement (in spirit, not legalese) between humans and an AI system about maintaining shared context. It’s the antidote to the “stale PRD” syndrome, ensuring that the AI never drifts off-script or forgets the user’s intent as your software evolves. In short, it’s a clever pact that keeps your AI on the same page (literally) with your team – no matter how many pages the project spec grows.</p>

<p>In this guide, we’ll dive deep (with a dose of wit) into what context schema contracts are, why they’re needed, how to create one, and how they complement those context-heavy product requirement documents (PRDs) you might already be writing. We’ll include practical examples (with Python code for clarity), a step-by-step “paint-by-numbers” approach to authoring your own contract, and even a fill-in-the-blank template. By the end, you’ll see how this contract becomes a living artifact throughout the software life cycle – <strong>not</strong> a document that gets filed away to gather dust.</p>

<p>Let’s get started by defining this concept and understanding why it’s more than just a fancy way to say “prompt.”</p>

<h2 id="what-is-a-context-schema-contract">What is a Context Schema Contract?</h2>

<p>A <strong>Context Schema Contract</strong> is a structured, evolving blueprint of the context that an AI system should maintain and use to fulfill user intent. It defines the “world” that the AI operates in – including user goals, relevant data, rules or constraints, and any memory of past interactions – in a <strong>consistent, formal format</strong>. Think of it as a cross between a project spec and an API contract, but for the AI’s contextual understanding of your application. It tells the AI <em>what to expect and how to behave</em> given the information at hand, ensuring that the AI’s responses stay aligned with the true user intent and current product state.</p>

<p>Crucially, a context schema contract is <strong>not just a one-off prompt or a transient memory</strong> of the last conversation. It’s a living document or configuration that both humans and AI reference continuously. It typically includes a <strong>schema</strong> – a defined structure or template – for all the key context elements the AI should consider, and a set of rules or guidelines (the “contract” part) for how the AI uses and updates that context. For example, it might specify that the AI always gets a field for the user’s current goal, a field for recent relevant user actions, a field for applicable business rules, etc., each time it generates output. Because this structure is defined and agreed upon up front, the AI’s “brain” always knows where to look for important information and where to put new information as the interaction continues.</p>

<p>In practice, a context schema contract could be represented in various ways: as a <strong>document</strong> in your repository, as a <strong>JSON/YAML schema</strong>, or even as a <strong>class in code</strong> that holds context state. The key is that it’s formal and structured – more like a data contract than free text. In fact, one emerging standard called the <em>Model Context Protocol (MCP)</em> uses a YAML-based config to define a “context schema” for AI models, treating context like an interface. As Victor Leung explains, <em>“Think of MCP as the equivalent of an API contract, but for LLM context”</em> – in other words, a context schema contract in action.</p>
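<p>To make the “class in code” option concrete, here’s a minimal, hypothetical sketch. Every field name is illustrative, not part of any standard:</p>

```python
from dataclasses import dataclass, field

@dataclass
class ContextContract:
    """An illustrative context schema contract held in code (not a standard)."""
    schema_version: str                 # bump when the contract's shape changes
    role: str                           # the AI's role, e.g. "support assistant"
    user_goal: str                      # the current user intent
    business_rules: list = field(default_factory=list)
    recent_history: list = field(default_factory=list)

    def to_prompt_block(self) -> str:
        """Render the contract into a structured block the model ingests each turn."""
        rules = "\n".join(f"- {r}" for r in self.business_rules) or "- (none)"
        return (
            f"[contract v{self.schema_version}]\n"
            f"Role: {self.role}\n"
            f"Goal: {self.user_goal}\n"
            f"Rules:\n{rules}"
        )

contract = ContextContract("1.0", "support assistant", "refund an order",
                           business_rules=["BR-001: verify identity first"])
print(contract.to_prompt_block())
```

<p>Because the structure is fixed, every prompt the model sees has the same shape, and a version bump makes schema changes visible in review.</p>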

<h3 id="how-is-it-different-from-prompts-or-ephemeral-state">How Is It Different from Prompts or Ephemeral State?</h3>

<p>It’s easy to confuse a context schema contract with general prompt engineering or just saving conversation history. They all relate to providing context to an AI, but the <strong>contract</strong> goes further:</p>

<ul>
  <li>
    <p><strong>Not Just a Prompt:</strong> Traditional prompt design is often about crafting a single-turn instruction or question cleverly to get a good answer. That’s important, but it’s ephemeral – like giving the AI an essay each time. A context contract, by contrast, is more like maintaining a <strong>structured API surface for prompts</strong>. In fact, teams are moving away from long, clever prompts toward <strong>“small, versioned contracts”</strong> that specify roles, scope, output format, and rules explicitly. This makes the AI’s behavior <em>reliable, auditable, and consistent</em>, rather than relying on prompt wording “vibes.” As one article put it, <em>“the contract turns prompt engineering from persuasion into software design”</em>.</p>
  </li>
  <li>
    <p><strong>Not Just Memory Buffers:</strong> Ephemeral context (like a chat history or a few stored variables) is transient and often ad-hoc. For example, a typical chatbot might keep the last N messages as context, or a coding assistant might load the last edited file. That’s useful, but it’s not <em>structured</em> or <em>guaranteed</em>. An ephemeral state might <strong>drift or be lost</strong> once it falls out of scope or context window. In contrast, a context schema contract explicitly defines what <em>constitutes</em> context and ensures it’s systematically maintained. It’s the difference between “the AI happens to remember because we prepend some text” and “the AI is required to check these fields and update them as a matter of contract.” Each piece of context in the schema contract is usually versioned or traceable, so you can track changes over time. Essentially, the contract gives you <strong>observability and control</strong> over context that ad-hoc memory cannot.</p>
  </li>
  <li>
    <p><strong>Persistent &amp; Living:</strong> Perhaps most importantly, a context schema contract is <strong>persistent across the software life cycle</strong>. It’s not thrown away after each session. It evolves alongside your system. If a new user requirement comes in, you update the contract (and likely its schema version) to reflect that new reality. This is very unlike a static prompt or a forgotten design doc – it’s more akin to a <strong>living documentation</strong> or “single source of truth” for what the AI should know at any given time. In context engineering philosophy, <em>“If it isn’t in the shared memory, it isn’t true”</em> – meaning if something about the product or user intent changes, you <strong>put it into the contract</strong> so the AI and team are instantly on the same page.</p>
  </li>
</ul>

<p>In summary, <strong>prompts are like individual conversations</strong>, and <strong>ephemeral memory is like your AI’s short-term memory</strong>, but a <strong>Context Schema Contract is the long-term agreement</strong> that governs those conversations and memories. It ensures continuity and consistency of context over time and across scenarios.</p>
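<p>One practical consequence of the contract framing: the context payload can be validated before every model call, the way an API validates a request body. A hedged sketch, with an invented field list:</p>

```python
# A toy schema for the context payload; field names are invented for illustration
SCHEMA = {"role": str, "user_goal": str, "tools_available": list}

def validate_context(ctx: dict) -> list:
    """Return contract violations; an empty list means the payload honors the schema."""
    errors = []
    for name, expected in SCHEMA.items():
        if name not in ctx:
            errors.append(f"missing field: {name}")
        elif not isinstance(ctx[name], expected):
            errors.append(f"{name}: expected {expected.__name__}")
    return errors

# Reject the call (or repair the context) before the model ever sees it
print(validate_context({"role": "assistant", "user_goal": "book a flight"}))
# → ['missing field: tools_available']
```
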

<h2 id="from-prd-to-context-contract-keeping-requirements-alive">From PRD to Context Contract: Keeping Requirements Alive</h2>

<p>If you’ve written a Product Requirement Document (PRD) or a context-engineered PRD, you know how much context and intent get spelled out up front. The tragedy is that traditional PRDs often become shelfware – forgotten in a Confluence page as the team rushes into coding. A context schema contract aims to <strong>carry that contextual knowledge forward</strong> throughout development, so it doesn’t get lost or outdated.</p>

<p>Think of the context schema contract as the younger, more dynamic sibling of the context-rich PRD. The PRD (especially a “context-engineered” one) lays out the <em>Why</em> and <em>What</em> – the business context, user stories, assumptions, etc. – essentially a high-level narrative of the product’s intent. The context schema contract then takes that and <strong>makes it actionable for the AI</strong>, in a structured way. It’s the bridge that ensures <em>“nothing gets lost in translation between ‘requirements’ and the actual AI behavior”</em>.</p>

<p>Here’s how they complement each other:</p>

<ul>
  <li>
    <p><strong>PRD as Foundation, Contract as Implementation:</strong> Your context-engineered PRD might describe user personas, use cases, business rules, and even example dialogues (prompt ideas). The context schema contract takes these and formalizes them. For instance, if the PRD says “The AI assistant should adhere to GDPR and company style guidelines,” the contract will have fields or rules for <strong>RegulatoryCompliance: GDPR</strong> and <strong>ToneGuide: [link to styleguide]</strong> under its constraints section. The PRD is the why; the contract is the how (for the AI).</p>
  </li>
  <li>
    <p><strong>Living Sync:</strong> As the product evolves, you’d update both the PRD and the contract – but the contract is what the AI actually <em>uses</em>. In a well-“context-engineered” project, your documentation is part of the development flow, not an afterthought. Matt Gierhart’s <em>Context Engineering</em> framework calls this treating documentation as the <strong>“memory infrastructure”</strong> of a hybrid human-AI team. In that approach, <em>“documentation keeps pace with code, acting as the rigorous interface between Human and AI”</em>. The context schema contract is exactly such an interface artifact – it lives alongside code and tests, continuously updated so that what the AI “knows” is always in sync with the latest specs.</p>
  </li>
  <li>
    <p><strong>Example –</strong> <em>If your PRD defines a set of Business Rules (BR-001, BR-002, etc.), your context contract might include a section listing all active rules, or a link to a <code class="language-plaintext highlighter-rouge">specs/SoT</code> (Source of Truth) file where those rules live.</em> When a rule is added or changed in the PRD, you update that file (or section) and voila – the AI’s next run includes the new rule. Unlike a static PRD that engineers might forget to read, the context contract is <em>ingested by the AI</em> (or at least by the system building the AI prompt) as part of its normal operation. There’s no chance it gets ignored unless you intentionally omit it.</p>
  </li>
</ul>

<p>In short, <strong>the context schema contract operationalizes your PRD</strong>. It ensures the AI always has the current <em>“big picture”</em> and the fine details at hand. Far from living in a binder or a wiki page, those carefully engineered contexts from your PRD become part of the AI’s working memory via the contract. This is how we prevent the fate of the stale PRD – by making context a first-class citizen in the runtime.</p>
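<p>Ingestion can be as simple as collecting rule lines from the source-of-truth files at prompt-build time. A rough sketch (the file layout and the <code class="language-plaintext highlighter-rouge">BR-</code> prefix convention are assumptions for illustration):</p>

```python
import tempfile
from pathlib import Path

def load_active_rules(sot_dir: Path) -> list:
    """Collect rule lines (e.g. 'BR-001: ...') from the source-of-truth files."""
    rules = []
    for path in sorted(sot_dir.glob("*.md")):
        for line in path.read_text().splitlines():
            if line.strip().startswith("BR-"):
                rules.append(line.strip())
    return rules

# Demo against a throwaway SoT-style directory
with tempfile.TemporaryDirectory() as d:
    sot = Path(d)
    (sot / "rules.md").write_text("BR-001: verify identity\nBR-002: log refunds\n")
    print(load_active_rules(sot))
# → ['BR-001: verify identity', 'BR-002: log refunds']
```

<p>Wire that loader into wherever you assemble the model’s context, and an edited rule file is live on the very next run.</p>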

<h2 id="anatomy-of-a-context-schema-contract">Anatomy of a Context Schema Contract</h2>

<p>So, what does a context schema contract actually look like? While there’s no single “standard” format (it can vary based on domain or team preferences), most contracts will include <strong>several key components</strong>. Let’s break down the typical anatomy of a context schema contract:</p>

<ul>
  <li>
    <p><strong>Roles and Scope:</strong> First, the contract establishes <em>who</em> and <em>what</em> we’re dealing with. This often means defining the <strong>AI’s role</strong> (e.g., “You are a customer support assistant AI for [Company]” or “You are a C++ coding assistant specialized in embedded systems”) and possibly the <strong>user’s role or context</strong> (“User is a developer trying to implement feature X”). This sets the stage and boundaries for context. It’s similar to a system prompt defining persona, but in our contract it’s an explicit field. For example, a contract might have <code class="language-plaintext highlighter-rouge">role: "AI Coding Assistant"</code> and <code class="language-plaintext highlighter-rouge">domain: "Embedded Finance Application"</code> at the top. By formalizing role/scope, we prevent context creep – the AI knows its jurisdiction.</p>
  </li>
  <li>
    <p><strong>User Intent and Goals:</strong> At the heart of context is the user’s intent – what are we trying to achieve? The contract will include a <strong>clear statement of the user’s current goal or problem</strong> it needs solved. In a static contract (for a fixed feature) this might be the product’s main objective (e.g., “Help the user manage their personal finances via voice commands”). In a dynamic scenario, this could be updated per session or task (“User wants to integrate OAuth login in their app”). Either way, the contract schema reserves a place for “Intent” so it’s never ambiguous. This is like the <strong>north star</strong> that both AI and humans can refer back to.</p>
  </li>
  <li><strong>Contextual Data Fields:</strong> These are the schema elements that hold all relevant information the AI should consider. What falls here depends on your application, but common examples include:
    <ul>
      <li><strong>Memory / History:</strong> Prior interactions, important past events or messages. (E.g., last user query, or a summary of session history.) In MCP terms, these are <em>memory objects</em>. The contract might specify a field like <code class="language-plaintext highlighter-rouge">recent_history</code> or have structured subfields for memory (such as a list of <code class="language-plaintext highlighter-rouge">messages</code>, each with a role and content).</li>
      <li><strong>Environment State:</strong> Any state from the software environment the AI needs. For instance, in a coding assistant, the state could be the contents of relevant files or the current error message. In a sales chatbot, it could be the user’s account status or items in their cart.</li>
      <li><strong>External Knowledge/Artifacts:</strong> Relevant documents, knowledge base articles, or data that should be included. The contract could have a section for <em>Artifacts</em> or <em>References</em> (as MCP does) such as <code class="language-plaintext highlighter-rouge">reference_doc: "FAQ.md section 2"</code> or <code class="language-plaintext highlighter-rouge">knowledge_snippets: [...]</code>. This ensures the AI isn’t flying blind on domain info – it knows exactly which external facts are in play.</li>
      <li><strong>Tools &amp; Integrations:</strong> If the AI can use tools or functions (like an API call, database lookup, calculator, etc.), the contract enumerates them. For example: <code class="language-plaintext highlighter-rouge">tools_available: ["WeatherAPI", "SQLDatabase"]</code> along with maybe a brief on what each does. This tells both the AI and any developers what capabilities are on the table. It’s analogous to function definitions in newer AI APIs (and indeed MCP includes tool descriptions as part of context).</li>
      <li><strong>User Profile &amp; Preferences:</strong> If applicable, information about the user (preferences, permissions, expertise level). For instance, <code class="language-plaintext highlighter-rouge">user_role: "admin"</code> or <code class="language-plaintext highlighter-rouge">user_language: "Spanish"</code>. This helps the AI tailor responses properly.</li>
    </ul>
  </li>
  <li>
    <p><strong>Constraints and Rules:</strong> Every good contract has the “shall and shall not” section. Here we list the <strong>policies, business rules, or guardrails</strong> the AI must follow. This can include content guidelines (“Do not disclose internal project names”), style guidelines (“Use a friendly tone, refer to user by first name”), compliance rules (“If financial advice, include disclaimer per FINRA”), or operational constraints (“Never call external API more than once per request”). In high-grade prompt contracts, this often appears as explicit instructions or <em>ask/refuse policies</em> (e.g., “If user asks for unsupported action, politely refuse”). By encoding these in the schema contract, we make them persistent and testable. For example, you might have:</p>

    <div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>constraints:
  - name: "PrivacyPolicy"
    rule: "Do not output user's personal data unless explicitly confirmed."
  - name: "StyleGuide"
    rule: "Use markdown for code snippets; limit answer to 4 paragraphs."
</code></pre></div></div>

    <p>These become part of the context fed to the AI, or at least part of the spec that the AI’s outputs are validated against.</p>
  </li>
  <li>
    <p><strong>Output Schema (if applicable):</strong> Some context contracts also define how the AI’s <em>outputs</em> should be structured. This is especially the case if the AI is expected to produce a particular format that another system will consume (JSON, XML, etc.). For example, the contract might say: “The assistant’s answer <strong>must</strong> be a JSON object with fields <code class="language-plaintext highlighter-rouge">answer</code> and <code class="language-plaintext highlighter-rouge">confidence</code>.” Or it might define sections in a text output (like an essay with intro, body, conclusion). This is essentially telling the AI <strong>what shape of answer is acceptable</strong>, which can be checked by validators. It complements the context: if the input is well-structured, often teams want the output well-structured too. While this might border on response design more than context, including it in the contract ensures that context and expected output are aligned (and it gives the AI a heads-up of the format).</p>
  </li>
  <li><strong>Versioning and Metadata:</strong> Because the context schema contract is evolving, it will usually carry a <strong>version number or timestamp</strong>. This could be a simple line like <code class="language-plaintext highlighter-rouge">version: 3.2 (2025-12-30)</code> at the top. Each time you update the contract (say you add a new context field or change a rule), you bump the version. This practice is borrowed from API contracts – it lets you diff changes and even roll back if needed. In addition, metadata like who last updated it or why (a brief changelog) can be included as comments or in a dedicated section. This helps keep track of context changes just like code changes.</li>
</ul>

<p>All these components together form the schema (the fields/structure) and the contract (the rules about using/updating those fields). By making this explicit, you’ve essentially created a <strong>shared language</strong> between humans and the AI about what context matters. Developers can now programmatically ensure these fields are populated when constructing AI prompts, and the AI (if it’s sophisticated enough, or via system instructions) knows to respect the contract – e.g., to ask for missing info if a required field is empty, or to refuse certain actions per the rules.</p>
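
<p>To make that last point concrete, here is a minimal sketch (the field names are hypothetical, not part of any standard) of enforcing the contract on the input side: before a prompt is built, confirm every required context field is populated, and surface the gaps instead of letting the AI guess:</p>

```python
from typing import Dict, List

# Hypothetical required-field list drawn from a context schema contract.
REQUIRED_FIELDS = ["user_intent", "user_profile", "current_step"]

def missing_fields(context: Dict[str, object], required: List[str]) -> List[str]:
    """Return the required fields that are absent or empty in the context."""
    return [name for name in required if not context.get(name)]

ctx = {"user_intent": "Set up account", "user_profile": {"name": "Ada"}}
gaps = missing_fields(ctx, REQUIRED_FIELDS)
if gaps:
    # In a real pipeline this is where the app (or the AI itself) would ask
    # for the missing information rather than proceed with a partial prompt.
    print(f"Cannot build prompt yet; missing context: {gaps}")
```

<p>The same check can run as a unit test in CI or as a runtime gate in front of every model call; either way, an incomplete context becomes a visible failure instead of a silent quality drop.</p>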

<p><strong>Concrete Example:</strong> To visualize this, here’s a bite-sized example in a pseudo-schema form for a hypothetical AI onboarding assistant:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Context Schema Contract for "OnboardMe" AI assistant</span>
<span class="na">version</span><span class="pi">:</span> <span class="m">1.0</span>
<span class="na">role</span><span class="pi">:</span> <span class="s2">"</span><span class="s">AI</span><span class="nv"> </span><span class="s">Onboarding</span><span class="nv"> </span><span class="s">Guide"</span>
<span class="na">scope</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Helps</span><span class="nv"> </span><span class="s">new</span><span class="nv"> </span><span class="s">users</span><span class="nv"> </span><span class="s">set</span><span class="nv"> </span><span class="s">up</span><span class="nv"> </span><span class="s">their</span><span class="nv"> </span><span class="s">account</span><span class="nv"> </span><span class="s">and</span><span class="nv"> </span><span class="s">learn</span><span class="nv"> </span><span class="s">features"</span>
<span class="na">user_intent</span><span class="pi">:</span> <span class="s2">"</span><span class="s">User</span><span class="nv"> </span><span class="s">wants</span><span class="nv"> </span><span class="s">to</span><span class="nv"> </span><span class="s">get</span><span class="nv"> </span><span class="s">started</span><span class="nv"> </span><span class="s">with</span><span class="nv"> </span><span class="s">Product</span><span class="nv"> </span><span class="s">X</span><span class="nv"> </span><span class="s">and</span><span class="nv"> </span><span class="s">needs</span><span class="nv"> </span><span class="s">guidance"</span>

<span class="na">context_fields</span><span class="pi">:</span>
  <span class="na">user_profile</span><span class="pi">:</span>    <span class="c1"># Info about the user</span>
    <span class="na">name</span><span class="pi">:</span> <span class="s2">"</span><span class="s">&lt;Name&gt;"</span>
    <span class="na">account_type</span><span class="pi">:</span> <span class="s2">"</span><span class="s">&lt;Free/Pro/Enterprise&gt;"</span>
    <span class="na">joined_date</span><span class="pi">:</span> <span class="s2">"</span><span class="s">&lt;yyyy-mm-dd&gt;"</span>
  <span class="na">current_step</span><span class="pi">:</span> <span class="s2">"</span><span class="s">&lt;Onboarding</span><span class="nv"> </span><span class="s">step</span><span class="nv"> </span><span class="s">name</span><span class="nv"> </span><span class="s">or</span><span class="nv"> </span><span class="s">number&gt;"</span>
  <span class="na">history</span><span class="pi">:</span>
    <span class="pi">-</span> <span class="na">role</span><span class="pi">:</span> <span class="s">user</span>
      <span class="na">content</span><span class="pi">:</span> <span class="s2">"</span><span class="s">&lt;most</span><span class="nv"> </span><span class="s">recent</span><span class="nv"> </span><span class="s">user</span><span class="nv"> </span><span class="s">question</span><span class="nv"> </span><span class="s">or</span><span class="nv"> </span><span class="s">action&gt;"</span>
    <span class="pi">-</span> <span class="na">role</span><span class="pi">:</span> <span class="s">assistant</span>
      <span class="na">content</span><span class="pi">:</span> <span class="s2">"</span><span class="s">&lt;assistant's</span><span class="nv"> </span><span class="s">last</span><span class="nv"> </span><span class="s">answer&gt;"</span>
  <span class="na">external_refs</span><span class="pi">:</span>
    <span class="na">FAQs</span><span class="pi">:</span> <span class="s2">"</span><span class="s">FAQ.md#basic-setup"</span>       <span class="c1"># Link to relevant FAQ section</span>
    <span class="na">DocPage</span><span class="pi">:</span> <span class="s2">"</span><span class="s">docs/quickstart.html"</span>  <span class="c1"># Link to a documentation page snippet</span>

<span class="na">constraints</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="s2">"</span><span class="s">Tone:</span><span class="nv"> </span><span class="s">Friendly</span><span class="nv"> </span><span class="s">and</span><span class="nv"> </span><span class="s">encouraging,</span><span class="nv"> </span><span class="s">use</span><span class="nv"> </span><span class="s">second</span><span class="nv"> </span><span class="s">person</span><span class="nv"> </span><span class="s">(</span><span class="se">\"</span><span class="s">you</span><span class="se">\"</span><span class="s">)."</span>
  <span class="pi">-</span> <span class="s2">"</span><span class="s">If</span><span class="nv"> </span><span class="s">user</span><span class="nv"> </span><span class="s">is</span><span class="nv"> </span><span class="s">stuck</span><span class="nv"> </span><span class="s">on</span><span class="nv"> </span><span class="s">a</span><span class="nv"> </span><span class="s">step</span><span class="nv"> </span><span class="s">for</span><span class="nv"> </span><span class="s">&gt;2</span><span class="nv"> </span><span class="s">queries,</span><span class="nv"> </span><span class="s">offer</span><span class="nv"> </span><span class="s">to</span><span class="nv"> </span><span class="s">connect</span><span class="nv"> </span><span class="s">to</span><span class="nv"> </span><span class="s">human</span><span class="nv"> </span><span class="s">support."</span>
  <span class="pi">-</span> <span class="s2">"</span><span class="s">Do</span><span class="nv"> </span><span class="s">NOT</span><span class="nv"> </span><span class="s">reveal</span><span class="nv"> </span><span class="s">internal</span><span class="nv"> </span><span class="s">implementation</span><span class="nv"> </span><span class="s">details</span><span class="nv"> </span><span class="s">or</span><span class="nv"> </span><span class="s">source</span><span class="nv"> </span><span class="s">code."</span>

<span class="na">output_format</span><span class="pi">:</span> <span class="s2">"</span><span class="s">Text,</span><span class="nv"> </span><span class="s">1-2</span><span class="nv"> </span><span class="s">paragraphs,</span><span class="nv"> </span><span class="s">plus</span><span class="nv"> </span><span class="s">a</span><span class="nv"> </span><span class="s">list</span><span class="nv"> </span><span class="s">of</span><span class="nv"> </span><span class="s">next-step</span><span class="nv"> </span><span class="s">options</span><span class="nv"> </span><span class="s">as</span><span class="nv"> </span><span class="s">bullet</span><span class="nv"> </span><span class="s">points."</span>

<span class="c1"># (End of contract)</span>
</code></pre></div></div>

<p>This is just illustrative – in practice your format could be JSON, Markdown, etc., and you might generate parts of it dynamically. But notice how it encapsulates <em>who</em> (role/scope), <em>what goal</em> (user_intent), <em>what info</em> (profile, step, history, references), <em>rules</em> (constraints), and <em>output expectations</em>. This contract would travel with the AI assistant throughout its life: any new feature (say, a new external reference or a new constraint about branding) gets added here, version bumped, and the AI immediately “knows” about it.</p>
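
<p>Loading and sanity-checking such a contract takes only a few lines. The sketch below uses an equivalent JSON form purely so it needs nothing beyond Python’s standard library (with a YAML parser such as PyYAML you could load the YAML form directly); the required-section names mirror the example above:</p>

```python
import json

# The OnboardMe contract (abridged), expressed as JSON for this sketch.
CONTRACT_JSON = """
{
  "version": "1.0",
  "role": "AI Onboarding Guide",
  "user_intent": "User wants to get started with Product X and needs guidance",
  "context_fields": {"current_step": "create-account"},
  "constraints": ["Tone: Friendly and encouraging, use second person."]
}
"""

REQUIRED_SECTIONS = {"version", "role", "user_intent", "context_fields", "constraints"}

def load_contract(raw: str) -> dict:
    """Parse the contract and fail loudly if a required section is missing."""
    contract = json.loads(raw)
    missing = REQUIRED_SECTIONS - contract.keys()
    if missing:
        raise ValueError(f"Contract is missing sections: {sorted(missing)}")
    return contract

contract = load_contract(CONTRACT_JSON)
```

<p>Failing at load time is the point: a malformed or truncated contract is caught before it ever shapes a prompt.</p>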

<h2 id="lifecycle-a-contract-that-lives-and-learns">Lifecycle: A Contract that Lives and Learns</h2>

<p>One of the biggest benefits of a context schema contract is that it <strong>persists through the entire software development lifecycle</strong> – and even beyond into maintenance – ensuring the AI’s understanding stays up-to-date. Here’s how it plays out across phases:</p>

<ul>
  <li>
    <p><strong>Design &amp; Planning:</strong> When you’re defining what the AI should do (say in a design doc or PRD), you also draft the initial context schema contract. This might involve listing out everything the AI needs to know or remember. Because it’s part of planning, it forces you to be explicit – which often uncovers assumptions. (Ever realized you forgot to mention a business rule until you think “oh, the AI wouldn’t know that unless we put it in context”?) By authoring the contract early, you bake context engineering into the design.</p>
  </li>
  <li>
    <p><strong>Development &amp; Implementation:</strong> As developers (human ones) start building the features, the context contract acts as a guide and a guardrail. For example, when writing the code that calls the AI model, a developer can reference the contract to assemble the prompt: “Include field X, Y, Z from the contract”. If something in the contract is unclear or missing, that’s a prompt (pun intended) to update it or clarify. Some teams even treat an out-of-sync context contract as a <strong>failing test</strong> – i.e., the system might validate that all required context fields are present before allowing an AI action to proceed. In effect, the contract becomes part of the “definition of done” for coding tasks involving AI.</p>
  </li>
  <li>
    <p><strong>Testing &amp; Validation:</strong> QA isn’t just for UIs and APIs – you’ll want to test that your AI is following the contract. Because the contract is structured, you can write automated tests or validators against it. For instance, a test could feed the AI a scenario and verify that it respected a given constraint from the contract (did it refuse an invalid request properly? Did it format output as JSON as specified?). In sophisticated setups, <strong>validators enforce the contract at runtime</strong>, catching any deviation (like a missing field or a policy violation) before it reaches the user. This “contract compliance testing” ensures your AI doesn’t go rogue even as the context gets complex.</p>
  </li>
  <li>
    <p><strong>Deployment &amp; Runtime:</strong> Even in production, the context contract is at work. It might be literally included in prompts (especially system-level prompts that guide the AI’s behavior). Or it might be embedded in the AI orchestration logic (for example, a pipeline that builds the context from live data sources according to the schema, and checks responses against rules). If you versioned the contract, your deployment can track which version of context schema is live with which version of the AI model. Rolling back might mean reverting to a prior contract version if a new context change caused issues (just like rolling back a bad code deploy).</p>
  </li>
  <li>
    <p><strong>Maintenance &amp; Evolution:</strong> After launch, requirements will change, and new insights will emerge (maybe users start asking things you didn’t anticipate). Instead of writing a whole new PRD each time, you iteratively update the context schema contract. It’s a living document: add a new field, update a rule, deprecate something that’s no longer relevant. Each update is communicated to the team (possibly via commit history, since ideally the contract lives in your repo). Importantly, because the AI’s behavior is tightly coupled to this contract, <em>updating the contract is an integral part of updating the product</em>. You’re effectively <strong>maintaining the AI’s “understanding” alongside its code</strong>. This continuous alignment prevents the situation where the code has moved on but the AI is still following old instructions.</p>
  </li>
  <li>
    <p><strong>Audit &amp; Knowledge Transfer:</strong> Let’s say six months later a new developer or PM joins, or you need to investigate why the AI made a bizarre decision. The context contract serves as an audit trail of what context was provided at any time and why. Since each item in context is traceable and versioned, you can reconstruct what the AI knew at moment X that led to output Y. It’s much easier to debug or explain AI decisions with this in hand. And for onboarding new team members, handing them the context schema contract (plus its history of changes) is like giving them the condensed lore of the project’s AI logic.</p>
  </li>
</ul>

<p>In essence, the context schema contract ensures that the AI’s “mind” grows and adapts with the product. We no longer throw a requirements doc over the wall and hope for the best; we keep the requirements in lockstep with the AI’s context forever. In a sense, the contract is the AI’s evolving playbook – <strong>always current, always authoritative</strong>, and never forgotten in a drawer.</p>
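
<p>The audit-trail idea from the lifecycle above can be prototyped in a few lines. This is an assumed design, not a standard API: each contract change is stored as a timestamped snapshot, so you can later reconstruct exactly what context the AI had at any point:</p>

```python
import copy
from datetime import datetime, timezone

class ContractHistory:
    """Keeps immutable, timestamped snapshots of a context schema contract."""

    def __init__(self, initial: dict):
        self._snapshots = []
        self.record(initial)

    def record(self, contract: dict) -> None:
        """Store a deep copy of the contract as it exists right now."""
        stamp = datetime.now(timezone.utc)
        self._snapshots.append((stamp, copy.deepcopy(contract)))

    def latest(self) -> dict:
        return self._snapshots[-1][1]

    def at(self, index: int) -> dict:
        """Return the contract as it was at snapshot `index` (0 = initial)."""
        return self._snapshots[index][1]

history = ContractHistory({"version": 1, "constraints": ["Be polite."]})
history.record({"version": 2, "constraints": ["Be polite.", "Add FINRA disclaimer."]})
```

<p>In practice you would get this for free by keeping the contract in version control, but an in-process history like this is handy when the contract is assembled dynamically at runtime.</p>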

<h2 id="example-context-contract-in-action-with-code">Example: Context Contract in Action (with Code)</h2>

<p>Let’s bring this to life with a concrete example. Suppose we’re building an AI assistant to help developers review pull requests (PRs) in a C++ project. We’ll sketch a simplified context schema contract for this scenario and show how it could be used in code. Even if your stack is C++, we’ll use Python for the pseudocode because readability matters (and hey, even C++ devs can read Python – this is a <em>contract</em>, not the actual product code!).</p>

<p><strong>Scenario:</strong> <em>“CodeCritique AI”</em> – an AI that reviews a GitHub pull request, understands the project’s guidelines, and provides feedback with suggested improvements.</p>

<p><strong>Context Schema Contract for CodeCritique AI:</strong></p>

<ul>
  <li><strong>AI Role/Scope:</strong> Code reviewer AI for a specific repository.</li>
  <li><strong>User Intent:</strong> The user (developer) wants a thorough code review of their PR.</li>
  <li><strong>Context Fields:</strong>
    <ul>
      <li>PR details (title, description, diff),</li>
      <li>Coding standards and guidelines (for that repo/team),</li>
      <li>Test results (did tests pass?),</li>
      <li>Past feedback (if any from previous PRs).</li>
    </ul>
  </li>
  <li><strong>Constraints:</strong>
    <ul>
      <li>The AI must not approve if tests fail.</li>
      <li>It must highlight security issues if present (using a known list of security patterns).</li>
      <li>Use polite, constructive tone; no harsh language.</li>
    </ul>
  </li>
  <li><strong>Output:</strong> A formatted review comment (markdown) with sections: Summary, Strengths, Areas for Improvement, Suggestions.</li>
</ul>

<p>We can represent part of this contract in code and show how it would be applied. First, let’s define a Python data class to mirror our contract schema, then simulate populating it and using it to construct an AI prompt:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">dataclasses</span> <span class="kn">import</span> <span class="n">dataclass</span><span class="p">,</span> <span class="n">field</span>
<span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">List</span><span class="p">,</span> <span class="n">Dict</span>

<span class="o">@</span><span class="n">dataclass</span>
<span class="k">class</span> <span class="nc">ContextSchemaContract</span><span class="p">:</span>
    <span class="s">"""Schema for CodeCritique AI context."""</span>
    <span class="n">ai_role</span><span class="p">:</span> <span class="nb">str</span>
    <span class="n">user_intent</span><span class="p">:</span> <span class="nb">str</span>
    <span class="n">context</span><span class="p">:</span> <span class="n">Dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">str</span><span class="p">]</span> <span class="o">=</span> <span class="n">field</span><span class="p">(</span><span class="n">default_factory</span><span class="o">=</span><span class="nb">dict</span><span class="p">)</span>
    <span class="n">constraints</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="nb">str</span><span class="p">]</span> <span class="o">=</span> <span class="n">field</span><span class="p">(</span><span class="n">default_factory</span><span class="o">=</span><span class="nb">list</span><span class="p">)</span>
    <span class="n">version</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="mf">1.0</span>

    <span class="k">def</span> <span class="nf">update_context</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">key</span><span class="p">:</span> <span class="nb">str</span><span class="p">,</span> <span class="n">value</span><span class="p">:</span> <span class="nb">str</span><span class="p">):</span>
        <span class="s">"""Update a context field and bump version for tracking."""</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">context</span><span class="p">[</span><span class="n">key</span><span class="p">]</span> <span class="o">=</span> <span class="n">value</span>
        <span class="bp">self</span><span class="p">.</span><span class="n">version</span> <span class="o">+=</span> <span class="mf">0.1</span>  <span class="c1"># increment version to mark change
</span>
<span class="c1"># Initialize the contract with basic info
</span><span class="n">contract</span> <span class="o">=</span> <span class="n">ContextSchemaContract</span><span class="p">(</span>
    <span class="n">ai_role</span><span class="o">=</span><span class="s">"Code Review Assistant for the FooBar Repo"</span><span class="p">,</span>
    <span class="n">user_intent</span><span class="o">=</span><span class="s">"Provide a thorough code review for the given pull request"</span>
<span class="p">)</span>

<span class="c1"># Populate initial context fields
</span><span class="n">contract</span><span class="p">.</span><span class="n">context</span><span class="p">.</span><span class="n">update</span><span class="p">({</span>
    <span class="s">"PR_title"</span><span class="p">:</span> <span class="s">"Add input validation to payment module"</span><span class="p">,</span>
    <span class="s">"PR_description"</span><span class="p">:</span> <span class="s">"This PR adds range checks on inputs to prevent overflow."</span><span class="p">,</span>
    <span class="s">"diff_summary"</span><span class="p">:</span> <span class="s">"Modified files: payment.cpp, validation.h"</span>
<span class="p">})</span>
<span class="n">contract</span><span class="p">.</span><span class="n">constraints</span><span class="p">.</span><span class="n">extend</span><span class="p">([</span>
    <span class="s">"Follow the project's C++ style guide and coding standards."</span><span class="p">,</span>
    <span class="s">"If unit tests failed, do not approve the PR."</span><span class="p">,</span>
    <span class="s">"Mention any security vulnerabilities or edge cases that are not handled."</span>
<span class="p">])</span>
</code></pre></div></div>

<p>In the snippet above, we set up the contract with an <code class="language-plaintext highlighter-rouge">ai_role</code> and <code class="language-plaintext highlighter-rouge">user_intent</code>. We then fill in some <code class="language-plaintext highlighter-rouge">context</code> details like the PR title, description, and a summary of the diff (which presumably we’d get from the GitHub API). We also add a few constraints that came from our project guidelines.</p>

<p>Now, imagine the moment we want the AI to do its job – we need to assemble a prompt (or API call) using this contract. Here’s how we might do that:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Let's say tests have run and we have a result:
</span><span class="n">tests_passed</span> <span class="o">=</span> <span class="bp">False</span>
<span class="k">if</span> <span class="ow">not</span> <span class="n">tests_passed</span><span class="p">:</span>
    <span class="n">contract</span><span class="p">.</span><span class="n">context</span><span class="p">[</span><span class="s">"test_status"</span><span class="p">]</span> <span class="o">=</span> <span class="s">"FAIL"</span>  <span class="c1"># update context
</span>    <span class="n">contract</span><span class="p">.</span><span class="n">update_context</span><span class="p">(</span><span class="s">"test_status"</span><span class="p">,</span> <span class="s">"FAIL"</span><span class="p">)</span>  <span class="c1"># using our method to bump version
</span>
<span class="c1"># Construct a prompt for the AI using the context contract
</span><span class="n">prompt</span> <span class="o">=</span> <span class="sa">f</span><span class="s">"""
Role: </span><span class="si">{</span><span class="n">contract</span><span class="p">.</span><span class="n">ai_role</span><span class="si">}</span><span class="s">
User Intent: </span><span class="si">{</span><span class="n">contract</span><span class="p">.</span><span class="n">user_intent</span><span class="si">}</span><span class="s">

Context:
- PR Title: </span><span class="si">{</span><span class="n">contract</span><span class="p">.</span><span class="n">context</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">'PR_title'</span><span class="p">)</span><span class="si">}</span><span class="s">
- PR Description: </span><span class="si">{</span><span class="n">contract</span><span class="p">.</span><span class="n">context</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">'PR_description'</span><span class="p">)</span><span class="si">}</span><span class="s">
- Diff Summary: </span><span class="si">{</span><span class="n">contract</span><span class="p">.</span><span class="n">context</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">'diff_summary'</span><span class="p">)</span><span class="si">}</span><span class="s">
- Test Status: </span><span class="si">{</span><span class="n">contract</span><span class="p">.</span><span class="n">context</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="s">'test_status'</span><span class="p">,</span> <span class="s">'PASS'</span><span class="p">)</span><span class="si">}</span><span class="s">
- Repository Guidelines: Follow all coding standards.

Constraints:
"""</span>
<span class="k">for</span> <span class="n">rule</span> <span class="ow">in</span> <span class="n">contract</span><span class="p">.</span><span class="n">constraints</span><span class="p">:</span>
    <span class="n">prompt</span> <span class="o">+=</span> <span class="sa">f</span><span class="s">"- </span><span class="si">{</span><span class="n">rule</span><span class="si">}</span><span class="se">\n</span><span class="s">"</span>

<span class="n">prompt</span> <span class="o">+=</span> <span class="s">"""
Now, based on the above context, provide a code review.
Your response should include:
1. Summary of the changes.
2. Positive feedback on good aspects.
3. Constructive critique on issues or improvements.
4. A clear recommendation (approve or request changes), considering the test status.
"""</span>
</code></pre></div></div>

<p>In this constructed <code class="language-plaintext highlighter-rouge">prompt</code> string, we’ve basically translated our context schema contract into the actual input we’ll give the AI model. We list the role, the intent, all the context fields (including the test status we updated and any guidelines), and we enumerate the constraints as bullet points (so the AI is explicitly instructed on them). Finally, we ask the AI to produce the output in the desired format.</p>

<p>A few things to note here:</p>

<ul>
  <li>
    <p>We updated the contract when tests failed: <code class="language-plaintext highlighter-rouge">contract.update_context("test_status", "FAIL")</code>. This is simulating the idea that our pipeline or application logic knows to keep the contract in sync with reality (tests failing is a change in context). Because of this, the prompt we build accurately reflects that context (Test Status: FAIL), and our constraints say “do not approve if tests failed,” so we expect the AI to catch that.</p>
  </li>
  <li>
    <p>The contract’s fields make it straightforward to assemble a prompt. We’re not scrambling to piece together bits of state from here and there – we know exactly what keys to pull from (<code class="language-plaintext highlighter-rouge">PR_title</code>, <code class="language-plaintext highlighter-rouge">diff_summary</code>, etc.) because the contract defined them. This reduces the chance of forgetting to include something important. If later we realize we need to also include “Lines changed: 120” as context, we’d add a field <code class="language-plaintext highlighter-rouge">lines_changed</code> to the contract, update our code accordingly, and everyone (human and AI) now knows that’s part of context.</p>
  </li>
  <li>
    <p>After getting the AI’s output, we could also imagine enforcing parts of the contract. For example, if the AI said “Looks good, approved” despite tests failing, that violates a constraint. We could detect the word “approved” and the <code class="language-plaintext highlighter-rouge">test_status: FAIL</code> context and flag that the AI’s response is non-compliant. This might trigger either an automatic correction (ask the AI again, reminding it of the rule) or at least log a warning. In effect, the contract allows building <strong>validators</strong> on the output side as well, turning “AI contracts” into something that can be checked like any other contract.</p>
  </li>
</ul>
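
<p>That last check can be sketched directly. The matching logic below is illustrative (a real validator might parse a structured recommendation field instead of scanning text): it flags a review that approves the PR even though the contract’s context records failing tests:</p>

```python
def violates_test_gate(ai_response: str, context: dict) -> bool:
    """True if the response approves while the context says tests failed."""
    text = ai_response.lower()
    approved = "approved" in text or "looks good" in text
    return approved and context.get("test_status") == "FAIL"

ctx = {"test_status": "FAIL"}
assert violates_test_gate("Looks good, approved!", ctx)            # non-compliant
assert not violates_test_gate("Requesting changes: tests fail.", ctx)
```

<p>On a violation, the pipeline can re-prompt the AI with a reminder of the broken rule, or escalate to a human reviewer; either way the contract, not ad-hoc judgment, defines what counts as a bad answer.</p>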

<p>This example, while simplified, demonstrates how a context schema contract can be used in practice. It’s not just theoretical – you <em>implement</em> it in your code workflow. The result is an AI system that behaves more deterministically and transparently. When it inevitably does something unexpected, you have the contract to either blame (“Ah, we forgot to include X in context, no wonder it missed that detail!”) or to improve (“We should add a rule to cover this new case”).</p>

<h2 id="how-to-craft-a-context-schema-contract-step-by-step">How to Craft a Context Schema Contract (Step-by-Step)</h2>

<p>Authoring a context schema contract might sound complex, but it’s quite approachable if you break it down. It’s a bit like filling out a detailed spec sheet for your AI’s brain. Here’s a step-by-step guide to creating one – essentially a <em>paint-by-numbers</em> approach:</p>

<ol>
  <li>
    <p><strong>Define the AI’s Role and Scope:</strong> Start by clearly writing down what the AI is (and maybe what it is not). Are we dealing with “a friendly customer support chatbot for an e-commerce site” or “an autonomous data-cleaning script that explains its changes”? This becomes the foundation of your contract. It will often be one of the first lines in the contract and will anchor many decisions (since context beyond that scope can be ignored).</p>
  </li>
  <li>
    <p><strong>Identify the Primary User Intent(s):</strong> What is the user trying to accomplish when interacting with this AI? If there are multiple scenarios, you might list the top-level intents (or have a separate contract per intent). This ensures the contract is goal-oriented. For example, “User wants to get troubleshooting help for a software issue” or “User wants creative suggestions for a marketing slogan.” Write this as a short statement or list; it will guide what context is relevant.</p>
  </li>
  <li>
    <p><strong>List Out Context Elements Needed:</strong> Brainstorm everything the AI would ideally know to fulfill those intents given its role. This often maps to answering questions like: <em>What information about the user do we have (profile, preferences, location)? What ongoing state or history matters (previous questions, current draft content, last action taken)? What external knowledge or tools could help (product docs, APIs, database entries)? What environmental conditions (time, date, geolocation, system state)?</em> Essentially, imagine yourself in the AI’s shoes trying to do the task – what do you wish you knew? Write all these down as candidate context fields. Don’t worry about structure yet; just make a comprehensive list.</p>
  </li>
  <li><strong>Structure the Context Schema:</strong> Now take that brainstorm and organize it. Group related pieces (perhaps “User Info,” “Session History,” “Active Data,” “Relevant Docs,” etc.). Decide on a format – e.g., will this be a JSON with nested keys, a sectioned Markdown document, a Python <code class="language-plaintext highlighter-rouge">dict</code>, etc. Create the skeleton with placeholder names. For example:
    <ul>
      <li><strong>User Profile:</strong> name, role, account tier, etc.</li>
      <li><strong>Session State:</strong> current step, last action, unresolved queries.</li>
      <li><strong>Knowledge Base:</strong> maybe a list of document IDs or summary snippets.</li>
      <li><strong>External Tools:</strong> list of tool names available.</li>
    </ul>
    <p>Each of these could be a section in a doc or keys in a dictionary. The idea is to give your context a <strong>schema</strong> – a predictable shape. This is where the “schema” in “context schema contract” comes in.</p>
  </li>
  <li><strong>Add the Rules/Constraints:</strong> Next, enumerate the rules the AI should abide by. Think in terms of <strong>always/never</strong> and <strong>if/then</strong>:
    <ul>
      <li>Always cite sources for factual claims.</li>
      <li>Never use profanity or biased language.</li>
      <li>If user asks for legal advice, respond with a disclaimer + generic info (no actual advice).</li>
      <li>If context is missing crucial info, ask a clarifying question (don’t just guess).</li>
      <li>When in doubt or on error, respond with a safe failure (like “I’m sorry, I can’t do X”).</li>
    </ul>
    <p>These rules can come from many places – ethical guidelines, product requirements, legal compliance, user experience guidelines. Put them in a list in the contract. Each rule can be a short sentence or a reference to a longer policy. The key is that they are explicit; this part is what really makes it a “contract” (thou shalt / shalt not…). The template below includes a spot for these.</p>
  </li>
  <li>
    <p><strong>Include Output Expectations (if needed):</strong> If your use case benefits from a defined output format or style, describe it in the contract. This could be as strict as a JSON schema or as general as “respond in a cheerful tone with emojis.” The reason to include it is to tie the context to the outcome. For instance, if the contract context includes a field “requested_output_language: French”, a corresponding expectation is “Output should be in French.” Having it in the contract means any developer or stakeholder knows that’s an intended part of the AI’s behavior. It also helps you double-check that the context provides what’s necessary for the output (in this case, that we did specify the language).</p>
  </li>
  <li>
    <p><strong>Review and Iterate:</strong> Like any good specification, review the draft contract. Involve both engineers and non-engineers (e.g., product managers, domain experts) if possible. Does it capture everything important? Is anything in the contract that the AI really <em>won’t</em> ever use? Trim the fat, because unnecessary context can confuse the model or waste token budget. Also think of edge cases: <em>What if a field is blank?</em> – maybe add a rule for that (“If <code class="language-plaintext highlighter-rouge">user_profile</code> is empty, assume guest user”). <em>What if two rules conflict?</em> – perhaps set an order of precedence. This review often surfaces subtle assumptions. Update the contract accordingly.</p>
  </li>
  <li>
    <p><strong>Version it and Communicate:</strong> Stamp a version number and date. Save the contract in a place where your team and your AI pipeline can access it – ideally in your git repository or a shared docs space. Communicate to the team that “this is the context contract we’ll be following.” Since this is a new-ish concept, you might need to evangelize it a bit: encourage everyone to treat it seriously, like code. If you use tools like pull requests for docs, consider doing that for contract changes so others can review updates.</p>
  </li>
  <li>
    <p><strong>Integrate into AI Workflow:</strong> Make sure this contract isn’t just pretty words on paper. Update your prompt construction code or API calls to <strong>draw from the contract</strong>. If you have a system prompt for ChatGPT or Claude, that system prompt could literally include text from the contract (or a summary thereof). If you’re using a framework (LangChain, etc.), you might convert the contract into their format (some frameworks allow schema definitions or multiple context inputs). Essentially, wire it up so that when the AI runs, it’s <em>reading</em> from the contract. This might happen directly or through intermediate code, but the logic of the contract should be reflected in what the AI sees.</p>
  </li>
  <li><strong>Plan Updates and Maintenance:</strong> Finally, decide how you will update this contract as things change. Perhaps you’ll revise it at the end of each sprint if new features were added. Or maybe you have a rule that any PR that affects the AI’s behavior must also consider whether the context contract needs an update (akin to updating documentation or tests). Embed the contract into your <strong>lifecycle</strong>. A context schema contract is not fire-and-forget – it’s more like a living constitution for your AI feature.</li>
</ol>
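
<p>To tie the steps together, here is one way the finished artifact might look in code: a hypothetical contract expressed as a plain Python dict (steps 1–5) and a small builder that wires it into a system prompt (step 9). Every name here is illustrative, not a prescribed format.</p>

```python
# A hypothetical context schema contract as a plain dict (all field names are
# illustrative), plus a builder that assembles a system prompt from it and
# fails loudly when the runtime context is missing a declared schema field.

CONTRACT = {
    "version": "1.0",
    "role": "Friendly support chatbot for an e-commerce site",
    "intent": "Help the user troubleshoot order issues",
    "schema": ["user_profile", "session_state", "knowledge_base"],
    "rules": [
        "Always cite sources for factual claims.",
        "If context is missing crucial info, ask a clarifying question.",
    ],
}

def build_system_prompt(contract: dict, context: dict) -> str:
    """Assemble a system prompt; raise if a declared schema field is absent."""
    missing = [f for f in contract["schema"] if f not in context]
    if missing:
        raise ValueError(f"Context is missing contract fields: {missing}")
    lines = [f"Role: {contract['role']}", f"Intent: {contract['intent']}"]
    lines += [f"{field}: {context[field]}" for field in contract["schema"]]
    lines += ["Rules:"] + [f"- {rule}" for rule in contract["rules"]]
    return "\n".join(lines)

prompt = build_system_prompt(
    CONTRACT,
    {"user_profile": "guest", "session_state": "step 2 of checkout",
     "knowledge_base": "FAQ: refunds"},
)
print(prompt)
```

<p>The point of the <code class="language-plaintext highlighter-rouge">ValueError</code> is step 7’s edge-case thinking: a missing field should be a visible failure at prompt-assembly time, not a silent gap the model papers over.</p>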

<p>By following these steps, you’ll author a robust context schema contract that is tailored to your project. It might feel formal, but you’ll thank yourself later when the AI does <em>exactly</em> what it’s supposed to and everyone on the team (including the new AI teammate) understands why.</p>

<p>In case you prefer a more guided format, we’ve created a <strong>fill-in-the-blank template</strong> below. You can use it as a starting point to draft your own context schema contract by replacing the placeholders with your project specifics.</p>
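
<p>As a companion to that template, here is a rough sketch of steps 8 and 10 in code: loading a versioned contract document and refusing to run against an unexpected version. The file path and the <code class="language-plaintext highlighter-rouge">**Version:**</code> line format are assumptions modeled on the template, not a standard.</p>

```python
# Rough sketch: load a versioned contract document and check its version before
# using it to build prompts. The "**Version:** vX.Y" line format is an
# assumption modeled on the fill-in template; adapt it to your own convention.
import re

EXPECTED_MAJOR = 1  # bump when your pipeline requires a new contract shape

def load_contract(path: str) -> str:
    """Read the contract file, verifying it carries a compatible version stamp."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    match = re.search(r"\*\*Version:\*\*\s*v?(\d+)\.(\d+)", text)
    if not match:
        raise ValueError("Contract has no version stamp; refusing to proceed.")
    major = int(match.group(1))
    if major != EXPECTED_MAJOR:
        raise ValueError(f"Contract major version {major} != expected {EXPECTED_MAJOR}.")
    return text
```

<p>Wiring a check like this into CI or your prompt-assembly code is one way to make the contract’s version stamp load-bearing rather than decorative.</p>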

<h2 id="broader-learning">Broader Learning</h2>

<p>This concept of maintaining structured context and treating it as a contract is gaining traction. If you’re hungry for more insights (or want to see how others are doing it), here are some top resources worth exploring:</p>

<ul>
  <li>
    <p><strong>Anthropic: “Effective Context Engineering for AI Agents” (2025)</strong> – An in-depth look at strategies for curating and managing LLM context beyond just prompt wording. It contrasts prompt engineering vs context engineering, and discusses why focusing on the <em>right information in context</em> is key to steerable, reliable AI.</p>
  </li>
  <li>
    <p><strong>John Godel’s “Prompt Engineering 2026: Contracts Become the Interface”</strong> – A forward-looking article about how teams are moving from verbose prompts to concise <em>prompt contracts</em>. It outlines the elements of a good AI contract (scope, schema, refuse rules, etc.) and how this approach makes prompts <em>“reliable, auditable, and portable”</em>. Great perspective on treating prompts as versioned interfaces.</p>
  </li>
  <li>
    <p><strong>Victor Leung on <em>Model Context Protocol (MCP)</em></strong> – Blog post <em>“What is MCP and Why It Matters”</em> (2025). Introduces an open standard for defining context via YAML. MCP is essentially an implementation of context schema contracts, enabling modular, debuggable AI context. Worth reading for concrete examples of context schemas (system instructions, memory, tools, user inputs) and how each item is versioned.</p>
  </li>
  <li>
    <p><strong>GitHub – PRD-driven Context Engineering (Gearheart)</strong> – An open-source repository by Matt Gierhart demonstrating “memory as infrastructure” for AI-human teams. It showcases how living documentation (PRDs, specs, etc.) can be organized to feed AI agents in a project, using unique IDs and a layered approach. Browsing the repo (and the <code class="language-plaintext highlighter-rouge">README.md</code>/<code class="language-plaintext highlighter-rouge">PRD.md</code> in it) can spark ideas on structuring context and keeping AI aligned through docs.</p>
  </li>
  <li>
    <p><strong>Video – “Context Engineering 101: The Simple Strategy to 100x AI Coding” by Cole Medin (2023)</strong> – A YouTube talk that humorously and effectively breaks down context engineering for programmers. Cole demonstrates how providing the right structured context (like relevant code snippets, instructions, and constraints) can massively boost an AI coding assistant’s performance. A great visual primer on why context matters more than model size.</p>
  </li>
  <li>
    <p><strong>A. B. Vijay Kumar’s series on Product Requirement Prompts (PRPs)</strong> – Particularly the second part <em>“Context Engineering — Product Requirements Prompts”</em>. It discusses layering context from business requirements down to technical specs in AI prompts. This showcases a practical methodology for turning traditional requirements into structured contexts, much like a contract between stakeholders and the AI. It’s a useful bridge between classic PRD writing and context schema thinking.</p>
  </li>
</ul>

<p>Each of these resources will deepen your understanding and give you different perspectives on ensuring AI systems have the context they need, structured in a maintainable way. Happy learning!</p>

<hr />

<h1 id="context-schema-contract-template-fill-in-the-blank">Context Schema Contract Template (Fill-in-the-Blank)</h1>

<p><em>Use this Markdown template to draft your own context schema contract. Replace the placeholder text in square brackets <strong>[like this]</strong> with details for your project. You can add or remove sections as needed.</em></p>

<h2 id="project-name--context-schema-contract">[PROJECT NAME] – Context Schema Contract</h2>

<p><strong>AI System Role &amp; Scope:</strong> <em>[Describe the AI’s role and domain. For example: “AI coding assistant for the FooBar application, specializing in C++ code reviews.”]</em></p>

<p><strong>Primary User Intent:</strong> <em>[State the main goal(s) users have when interacting with this AI. For example: “Help the user troubleshoot network issues” or “Guide the user through onboarding”.]</em></p>

<h3 id="context-schema-components">Context Schema Components</h3>

<ul>
  <li><strong>User Profile / Attributes:</strong> <em>[What info do we have about the user? e.g. name, role, preferences, account tier.]</em></li>
  <li><strong>Session State / History:</strong> <em>[What recent interactions or state matter? e.g. last user question, current step in a flow, recent actions taken.]</em></li>
  <li><strong>External Knowledge / References:</strong> <em>[What docs, data, or context from outside should be included? e.g. “FAQ page on pricing”, “User’s transaction history”, “Relevant code snippet from file X”.]</em></li>
  <li><strong>Tools &amp; Integrations:</strong> <em>[What tools can the AI use or call on? e.g. “database lookup tool”, “calendar API”, or simply “no external tools” if none.]</em></li>
  <li><strong>Environment Details:</strong> <em>[Any other environmental context? e.g. time/date, location, device info, version of software, etc., if relevant.]</em></li>
</ul>

<p><em>(Feel free to add more context fields if needed, or group them into subsections as makes sense.)</em></p>

<h3 id="constraints--rules">Constraints &amp; Rules</h3>

<ul>
  <li><em>[Rule 1: Describe something the AI <strong>must always do</strong>. For example: “Always respond in a polite, empathetic tone.”]</em></li>
  <li><em>[Rule 2: Describe something the AI <strong>must never do</strong>. For example: “Never reveal confidential internal code or URLs.”]</em></li>
  <li><em>[Rule 3: Conditional rule, if applicable. For example: “If the user asks for medical advice, the AI must include a disclaimer and suggest seeing a professional.”]</em></li>
  <li><em>[Rule 4: Another guideline or policy. For example: “Use British English spelling for UK users.”]</em></li>
</ul>

<p><em>(Add or remove bullet points as needed. Aim to cover safety, style, business policies, and any domain-specific mandates.)</em></p>

<h3 id="output-format-if-applicable">Output Format (if applicable)</h3>

<p><em>Describe the desired structure or format of the AI’s responses.</em>
<em>For example:</em> “Responses should be in markdown. Start with a brief summary, followed by a numbered list of recommendations. If providing code, include it in a fenced code block with syntax highlighting.”</p>

<p><em>(If the output format is flexible or not a concern, you can omit this section.)</em></p>

<h3 id="lifecycle--maintenance">Lifecycle &amp; Maintenance</h3>

<ul>
  <li><strong>Source of Truth:</strong> <em>[Reference where this contract lives in your docs or repo. e.g. “This contract is stored in <code class="language-plaintext highlighter-rouge">docs/ai-context-contract.md</code> and is considered the source of truth for AI context.”]</em></li>
  <li><strong>Update Policy:</strong> <em>[Explain how/when this contract will be updated. e.g. “Review and update this contract at the end of each development sprint or whenever new AI capabilities are added.”]</em></li>
  <li><strong>Version:</strong> <em>[Version number]</em> – <strong>Last Updated:</strong> <em>[Date]</em></li>
  <li><strong>Changelog:</strong> <em>[Optional: list key updates. e.g. “v1.1 – Added tool access to Weather API”, “v2.0 – Revamped constraints after legal review.”]</em></li>
</ul>

<hr />

<p><em>End of Context Schema Contract.</em></p>

<p><em>Pro-tip: After filling this out, double-check that every piece of info the AI might need is covered in <strong>Context Schema Components</strong>, and every critical behavior guideline is captured in <strong>Constraints &amp; Rules</strong>. This document should evolve as your product and AI evolve – treat it as a living contract!</em></p>]]></content><author><name>Brian Baker</name></author><category term="AI" /><category term="Engineering" /><category term="context-engineering" /><category term="llm" /><category term="ai-contracts" /><category term="prompt-engineering" /><summary type="html"><![CDATA[A comprehensive guide to Context Schema Contracts - formalized agreements between humans and AI systems that ensure shared understanding throughout the software lifecycle.]]></summary></entry></feed>