
Context matters more than code

We started with a small test, “make the logout button work”, and ended up rethinking how we build our side projects with AI. This is the first part of a series documenting what I’m learning while experimenting with coding agents, from writing PRDs for machines to designing the “teams” they work in.

When code writes itself

While looking for my next product role, I’ve been working on side projects with a close friend. We’re building a few tools (more on that soon) and experimenting with AI agents to see how far we can push them—not the “build an app in 10 minutes” kind, but tools that could actually grow into real micro-SaaS products.

We mostly use Claude Code and Cursor to build features, but lately the output quality has been awful. So we jumped on a call to rethink our workflow and ran a simple test: make the logout button work. Easy, right?

Within minutes, Claude spun up a new route, added Zustand for state management, and rewrote unrelated parts of the app. Technically impressive. Practically a disaster.

That’s when it clicked. The agent wasn’t wrong—it was just too unconstrained. We had AGENTS.md and CLAUDE.md files and some basic setup, but we hadn’t explained how the system worked, what mattered, or what to leave alone. It did exactly what we asked, not what we meant.

So we started thinking: how do we make AI build the way we would?
How do we get structure, context, and control without killing the speed?

Our plan

We need to stop “only” prompting the AI and start specifying and giving more context.
Less chat, more structure.

The idea is to build a framework inside the repo that gives agents the same kind of context a human teammate would have.

We talked about starting with two simple files:

/CLAUDE.md or /AGENTS.md — the operating manual
A clear guide that explains how the system works and what not to touch.

  • Overall architecture and folder structure
  • Non-negotiables and coding standards
  • Key dependencies and conventions
  • Deployment and environment details

But most importantly: the database schema
That’s the foundation. The schema defines the real shape of the data, the relations between entities, and the single source of truth the AI should never override.
If the agent understands the schema, it understands the product. It can reason about data flow, relationships, and constraints before touching a line of code.
We want the schema to act as the anchor for every decision the agent makes — the bridge between how the system is modeled and how the AI interprets it.

/context.md — the running memory
A lightweight changelog the agent can read before doing anything.

  • New routes, components, or configs
  • Edge cases and gotchas
  • Known limitations
  • Design or architectural notes worth remembering
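For instance, an entry might look like this (a hypothetical sketch: the feature, route, and notes are illustrative, not from our actual repo):

```markdown
# context.md

## Logout fix
- New: POST /api/logout route that clears the session cookie
- Gotcha: auth state lives in the existing session provider; do not add a new client store
- Limitation: no rate limiting on auth routes yet
```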

Once the schema and context live side by side, the agent has both a map and a memory: the two things it’s been missing all along.

AGENTS.md example
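A minimal sketch of the shape we have in mind; the stack, paths, and rules below are illustrative assumptions, not our actual setup:

```markdown
# AGENTS.md

## Architecture
- Next.js app: routes in /app, shared UI in /components, data access in /lib/db

## Non-negotiables
- No new state-management libraries; auth state lives in the session
- No new routes or tables without a PRD in /prds

## Schema (single source of truth)
- /db/schema.ts defines all entities and relations; never modify it on your own

## Conventions
- TypeScript strict mode; follow existing component patterns

## Deployment
- Vercel; environment variables documented in .env.example
```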

From prompts to PRDs

After seeing the agents go rogue a few times, we realized that simple prompts and loose context don’t scale.
They’re too vague, too easy to misread, and too inconsistent.

That’s when we started putting more focus on PRDs.
I heard something in a podcast recently that stuck with me: code is cheap; the real value is in the PRD.
It’s where clarity lives and where the real debates should happen.

If we want consistent results, we need to give the AI the same level of clarity we’d give a human developer: clear, structured specs that explain what to build, why it matters, and what to leave out.

Instead of conversational nudges, we want explicit intent: rules, constraints, and definitions that can be reused and refined over time.
This shifts AI development from ad-hoc prompting to a repeatable process.

We’ll keep them simple at first, probably one PRD per feature, living in a /prds folder. Each one should capture:

  • Goal and scope
  • Inputs and data contracts
  • UI states and empty states
  • API requests and responses
  • Out-of-scope notes
  • Done criteria
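As a concrete example, here’s roughly what the PRD for our logout test could have looked like (the route name, states, and criteria are illustrative):

```markdown
# PRD: Logout

## Goal and scope
Make the existing logout button end the session. Touch only the auth module.

## Inputs and data contracts
- Current session cookie from the existing auth provider

## UI states
- Default: button enabled while a session exists
- Loading: button disabled while the request is in flight
- After: redirect to /login

## API
- POST /api/logout returns 204 and clears the session cookie

## Out of scope
- No new routes beyond /api/logout, no new state libraries, no unrelated refactors

## Done criteria
- Clicking logout ends the session and redirects to /login; nothing else changes
```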

In practice, these docs become our contract with the agent. They front-load all the context (what matters, what’s fixed, and what success looks like) before execution starts.
We can review each PRD together, tweak the wording, and then see how even small phrasing changes affect the agent’s output.

The goal isn’t just better prompts. It’s to build a shared language between us and the AI, one that’s precise enough to guide the work and structured enough to learn from.

Thinking like an organization

Halfway through the call, we realized we were talking less about code and more about team design.

If one agent can ship a feature, what happens when we run ten in parallel?
They’ll need boundaries, just like people.

We started imagining small “agent teams,” each with a clear domain:

  • A frontend agent that owns React components and UI logic
  • A backend agent that handles routes and data models
  • A platform agent that maintains shared utils and standards
  • A spec agent that reviews PRDs before implementation

Each agent team would then have a specific purpose, like owning the onboarding flow or the core functionality.

Each one would have its own rules file and coordinate through shared context, not long prompts.
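For instance, the frontend agent’s rules file might look something like this (the boundaries and paths are hypothetical):

```markdown
# AGENTS.md (frontend agent)

## Owns
- /components and the UI logic inside /app

## Must not touch
- /lib/db, API routes, or the schema

## Coordinates via
- context.md entries tagged “frontend”
- PRDs in /prds, reviewed by the spec agent before implementation
```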
It’s basically Conway’s Law for machines: your system architecture will mirror your agent structure, whether you plan for it or not.

Our first iteration

Our first goal is to make this loop work:

  1. Review current context and schema
  2. Write or update a PRD
  3. Run the agent to generate code
  4. Review, fix, or rerun
  5. Update context.md

We want to see if this cycle can self-improve. If the output is off, we fix the PRD or rules instead of rewriting prompts. The workflow should learn as we go.

We’ll probably start with two parallel work streams:

  • Hygiene: small, low-risk things like login, logout, invites
  • Core product: the more complex logic around our boards, data translation, and audience views

The hygiene stream lets us experiment fast. The core stream keeps us focused on the product we actually want to build.

Where we want to go next

We also threw around some bigger ideas we want to test later:

  • PRD diffing: when one spec changes, downstream PRDs get auto-checked for updates
  • Visual PRD graphs: nodes as features, edges as dependencies
  • PRD pipeline: break every task into repeatable steps (plan, build, test, update)
  • Self-updating context: after each PR, the agent logs what it changed
  • PRD sequencing: ordering PRDs over time, which sounds like a roadmap topic

We’re not there yet. But it already feels like the right direction, or at least a super interesting one to explore: designing systems that teach AI how and why to build, not just what to build.

We ended the call talking about how strange this feels. We’re not just managing code anymore; we’re designing a lightweight organization where humans set intent and AI handles execution.

That’s where we’ll pick things up next: exploring what it means to design for synthetic teams, how to structure roles and rules for agents, and what new kinds of product organizations might emerge when the code starts writing itself.

Stay tuned for the learnings.

Article by Alexander Hipp (Product builder and advisor)