
Context matters more than code

We started with a small test, “make the logout button work”, and ended up rethinking how we build our side projects with AI. This is the first part of a series documenting what I’m learning while experimenting with coding agents, from writing PRDs for machines to designing the “teams” they work in.

When code writes itself

While looking for my next product role, I’ve been working on side projects with a close friend. We’re building a few tools (more on that soon) and experimenting with AI agents to see how far we can push them—not the “build an app in 10 minutes” kind, but tools that could actually grow into real micro-SaaS products.

We mostly use Claude Code and Cursor to build features, but lately the output quality has been awful. So we jumped on a call to rethink our workflow and ran a simple test: make the logout button work. Easy, right?

Within minutes, Claude spun up a new route, added Zustand for state management, and rewrote unrelated parts of the app. Technically impressive. Practically a disaster.

That’s when it clicked. The agent wasn’t wrong—it was just too unconstrained. We had AGENTS.md and CLAUDE.md files and some basic setup, but we hadn’t explained how the system worked, what mattered, or what to leave alone. It did exactly what we asked, not what we meant.

So we started thinking: how do we make AI build the way we would?
How do we get structure, context, and control without killing the speed?

Our plan

We need to stop “only” prompting the AI and start specifying and giving more context.
Less chat, more structure.

The idea is to build a framework inside the repo that gives agents the same kind of context a human teammate would have.

We talked about starting with two simple files:

/CLAUDE.md or /AGENTS.md — the operating manual
A clear guide that explains how the system works and what not to touch.

  • Overall architecture and folder structure
  • Non-negotiables and coding standards
  • Key dependencies and conventions
  • Deployment and environment details

But most importantly: the database schema
That’s the foundation. The schema defines the real shape of the data, the relations between entities, and the single source of truth the AI should never override.
If the agent understands the schema, it understands the product. It can reason about data flow, relationships, and constraints before touching a line of code.
We want the schema to act as the anchor for every decision the agent makes — the bridge between how the system is modeled and how the AI interprets it.

/context.md — the running memory
A lightweight changelog the agent can read before doing anything.

  • New routes, components, or configs
  • Edge cases and gotchas
  • Known limitations
  • Design or architectural notes worth remembering
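For instance, an entry might look like this (a hypothetical sketch: the feature, route, and notes are illustrative, not from our actual repo):

```markdown
# context.md

## Logout fix
- New: POST /api/logout route that clears the session cookie
- Gotcha: auth state lives in the existing session provider; do not add a new client store
- Limitation: no rate limiting on auth routes yet
```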

Once the schema and context live side by side, the agent has both a map and a memory: the two things it’s been missing all along.

AGENTS.md example
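A minimal sketch of the shape we have in mind; the stack, paths, and rules below are illustrative assumptions, not our actual setup:

```markdown
# AGENTS.md

## Architecture
- Next.js app: routes in /app, shared UI in /components, data access in /lib/db

## Non-negotiables
- No new state-management libraries; auth state lives in the session
- No new routes or tables without a PRD in /prds

## Schema (single source of truth)
- /db/schema.ts defines all entities and relations; never modify it on your own

## Conventions
- TypeScript strict mode; follow existing component patterns

## Deployment
- Vercel; environment variables documented in .env.example
```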

From prompts to PRDs

After seeing the agents go rogue a few times, we realized that simple prompts and loose context don’t scale.
They’re too vague, too easy to misread, and too inconsistent.

That’s when we started putting more focus on PRDs.
I heard something in a podcast recently that stuck with me: code is cheap; the real value is in the PRD.
It’s where clarity lives and where the real debates should happen.

If we want consistent results, we need to give the AI the same level of clarity we’d give a human developer: clear, structured specs that explain what to build, why it matters, and what to leave out.

Instead of conversational nudges, we want explicit intent: rules, constraints, and definitions that can be reused and refined over time.
This shifts AI development from ad-hoc prompting to a repeatable process.

We’ll keep them simple at first, probably one PRD per feature, living in a /prds folder. Each one should capture:

  • Goal and scope
  • Inputs and data contracts
  • UI states and empty states
  • API requests and responses
  • Out-of-scope notes
  • Done criteria
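As a concrete example, here’s roughly what the PRD for our logout test could have looked like (the route name, states, and criteria are illustrative):

```markdown
# PRD: Logout

## Goal and scope
Make the existing logout button end the session. Touch only the auth module.

## Inputs and data contracts
- Current session cookie from the existing auth provider

## UI states
- Default: button enabled while a session exists
- Loading: button disabled while the request is in flight
- After: redirect to /login

## API
- POST /api/logout returns 204 and clears the session cookie

## Out of scope
- No new routes beyond /api/logout, no new state libraries, no unrelated refactors

## Done criteria
- Clicking logout ends the session and redirects to /login; nothing else changes
```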

In practice, these docs become our contract with the agent. They front-load all the context (what matters, what’s fixed, and what success looks like) before execution starts.
We can review each PRD together, tweak the wording, and then see how even small phrasing changes affect the agent’s output.

The goal isn’t just better prompts. It’s to build a shared language between us and the AI, one that’s precise enough to guide the work and structured enough to learn from.

Thinking like an organization

Halfway through the call, we realized we were talking less about code and more about team design.

If one agent can ship a feature, what happens when we run ten in parallel?
They’ll need boundaries, just like people.

We started imagining small “agent teams,” each with a clear domain:

  • A frontend agent that owns React components and UI logic
  • A backend agent that handles routes and data models
  • A platform agent that maintains shared utils and standards
  • A spec agent that reviews PRDs before implementation

Each agent team would then have a specific purpose, like owning the onboarding flow or the core functionality.

Each one would have its own rules file and coordinate through shared context, not long prompts.
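For instance, the frontend agent’s rules file might look something like this (the boundaries and paths are hypothetical):

```markdown
# AGENTS.md (frontend agent)

## Owns
- /components and the UI logic inside /app

## Must not touch
- /lib/db, API routes, or the schema

## Coordinates via
- context.md entries tagged “frontend”
- PRDs in /prds, reviewed by the spec agent before implementation
```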
It’s basically Conway’s Law for machines: your system architecture will mirror your agent structure, whether you plan for it or not.

Our first iteration

Our first goal is to make this loop work:

  1. Review current context and schema
  2. Write or update a PRD
  3. Run the agent to generate code
  4. Review, fix, or rerun
  5. Update context.md

We want to see if this cycle can self-improve. If the output is off, we fix the PRD or rules instead of rewriting prompts. The workflow should learn as we go.

We’ll probably start with two parallel work streams:

  • Hygiene: small, low-risk things like login, logout, invites
  • Core product: the more complex logic around our boards, data translation, and audience views

The hygiene stream lets us experiment fast. The core stream keeps us focused on the product we actually want to build.

Where we want to go next

We also threw around some bigger ideas we want to test later:

  • PRD diffing: when one spec changes, downstream PRDs get auto-checked for updates
  • Visual PRD graphs: nodes as features, edges as dependencies
  • PRD pipeline: break every task into repeatable steps (plan, build, test, update)
  • Self-updating context: after each PR, the agent logs what it changed
  • PRD sequencing: ordering PRDs over time, which sounds like a roadmap topic

We’re not there yet. But it already feels like the right direction, or at least a super interesting one to explore: designing systems that teach AI how and why to build, not just what to build.

We ended the call talking about how strange this feels. We’re not just managing code anymore; we’re designing a lightweight organization where humans set intent and AI handles execution.

That’s where we’ll pick things up next: exploring what it means to design for synthetic teams, how to structure roles and rules for agents, and what new kinds of product organizations might emerge when the code starts writing itself.

Stay tuned for the learnings.

Article by Alexander Hipp (Product builder and advisor)