Canonical specs that agents trust

A professional view on the paradigm shift in software engineering and why a single source of truth matters in the AI era.

Benedikt Bingler

Software engineering has gone through several paradigm shifts in the last 30 years: from waterfall to agile, from monolithic systems to microservices, and from hand-written documentation to API-first design. Each shift demanded new mental models and tooling. The current shift is driven by AI coding agents and may be the most impactful so far. Rather than adding automation to existing software development workflows, it fundamentally questions what constitutes the source of truth for software systems.

Andreessen Horowitz, in a May 2025 analysis of emerging developer patterns, articulates this precisely: "In agent-driven workflows, the source of truth may shift upstream toward prompts, data schemas, API contracts, and architectural intent. Code becomes the byproduct of those inputs, more like a compiled artifact than a manually authored source." The Git SHA, long the canonical reference for "the state of the codebase," begins to lose semantic value. A SHA tells you that something changed, but not why or whether it is valid. In AI-first workflows, a more useful unit of truth might be the combination of the prompt that generated the code and the tests that verify its behavior.

This is not theoretical. Gartner predicts that by 2028, 75% of enterprise software engineers will use AI code assistants, up from less than 10% in early 2023. The shift is underway. The question is whether we will be ready for it.

A brief history of shifts

To understand the magnitude of the current shift, it helps to recall what came before. In the early 1990s, source code often lived on shared file systems, and version control was far from universal. CVS and later Subversion centralized the repository, making "the latest trunk" the de facto source of truth. Then Git decentralized history itself - every clone was a full copy - and the merge became the critical operation. Teams that mastered branching and merging scaled; those that did not struggled.

Starting in the 2000s, the rise of REST and then API-first design shifted truth again: the contract became the boundary. Schema registries, API versioning practices, and later OpenAPI specs emerged. Teams that treated the contract as the source of truth could evolve services independently. In the 2010s, infrastructure as code and GitOps made the repository the source of truth for deployment: what is in the repo is what runs. Each shift required new tools, new practices, and new mental models.

The AI era demands another shift. If code is increasingly generated from specs, prompts, and constraints, then those inputs are the true source of truth. The repository remains important - it holds the artifacts - but the semantic truth moves upstream. This is not a minor adjustment. It is a fundamental reordering of what we consider authoritative.

Scattered truth

Today, most teams operate with fragmented sources of truth. Specifications live in Confluence, Notion, or scattered markdown files. API contracts may be in OpenAPI specs, Postman collections, or only implicit in undocumented code. Architecture decisions are captured in ADRs - if they are captured at all - but often in different formats, tools, and locations. AI coding agents, when they run, are fed whatever context we give them: open files, recent chat history, maybe a RAG index over docs. They do not have a canonical, authoritative view. They guess, interpolate, and sometimes hallucinate.

The result is predictable: specs diverge from code, code diverges from intent, and agents amplify the divergence. An agent updating a spec in one place may not know about a conflicting change in another. An agent implementing a feature may use an outdated API contract. There is no single place that both humans and agents treat as the system of record.

Microsoft, Oracle, and others have begun standardizing agent specifications - AgentSchema, Open Agent Specification, ai-guide.md - precisely because they recognize that agents need structured, machine-readable definitions. But these efforts focus on defining agents themselves, not on defining the domain that agents operate over: the specs, architecture, and constraints of the software being built.

Specs as upstream of code

The insight from a16z and from practitioners building agentic workflows is that we must treat specifications as upstream of code. Not as documentation written after the fact, but as the primary artifact that drives generation. Prompts, schemas, and architectural intent become the inputs; code becomes the output. This inverts the traditional model, where code was the source of truth and docs were derived or aspirational.

Replit CEO Amjad Masad has described the evolution from completion tools (Copilot) to assistants (ChatGPT) to truly autonomous agents that understand context and execute multi-step programming tasks. In that model, the agent needs something to trust. It cannot trust scattered files, conflicting versions, or stale Confluence pages. It needs a canonical workspace - a single place where specs live, where updates propagate, and where the agent can read and write with clear semantics.

The Model Context Protocol (MCP), introduced by Anthropic in late 2024 and since adopted by OpenAI and others, provides a standard interface for connecting LLMs to external data sources and tools. MCP servers expose resources - context and data - that agents can consume. The protocol itself does not define what should be the source of truth; it enables toolmakers to expose one. The opportunity is to make specifications first-class MCP resources: structured, versioned, and authoritative.
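
To make "structured, versioned, and authoritative" concrete, here is a minimal sketch of a spec catalog that lists and reads specifications as addressable resources. It deliberately does not use any MCP SDK: the `SpecResource` and `SpecCatalog` classes, the `spec://` URI scheme, and the `version` field are assumptions made for illustration, and the descriptor shape is only loosely modelled on the idea of a resource with a URI, a name, and a MIME type.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecResource:
    """One specification exposed as an addressable, versioned resource (hypothetical)."""
    uri: str        # hypothetical scheme, e.g. "spec://billing/invoice-api"
    name: str
    mime_type: str
    text: str

    @property
    def version(self) -> str:
        # Content-derived version: identical text always yields the same id.
        return hashlib.sha256(self.text.encode("utf-8")).hexdigest()[:12]


class SpecCatalog:
    """In-memory stand-in for a server that lists and reads spec resources."""

    def __init__(self) -> None:
        self._resources: dict[str, SpecResource] = {}

    def publish(self, resource: SpecResource) -> None:
        # Publishing under an existing URI replaces the previous content.
        self._resources[resource.uri] = resource

    def list_resources(self) -> list[dict]:
        # Descriptors only: enough for an agent to decide what to fetch.
        return [
            {"uri": r.uri, "name": r.name, "mimeType": r.mime_type, "version": r.version}
            for r in sorted(self._resources.values(), key=lambda r: r.uri)
        ]

    def read_resource(self, uri: str) -> dict:
        r = self._resources[uri]  # raises KeyError for unknown URIs
        return {"uri": r.uri, "mimeType": r.mime_type, "version": r.version, "text": r.text}


catalog = SpecCatalog()
catalog.publish(SpecResource(
    uri="spec://billing/invoice-api",
    name="Invoice API contract",
    mime_type="text/markdown",
    text="# Invoice API\n\nPOST /invoices creates a draft invoice.\n",
))
print(json.dumps(catalog.list_resources(), indent=2))
```

The point of the content-derived version is that an agent can cite exactly which spec text it read, rather than "whatever was in the file at the time."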

Gartner has also noted that AI code assistants will increasingly rely on "context layers" - curated, validated sources of truth that reduce hallucination and inconsistency. A canonical spec workspace is precisely such a context layer: not a random sampling of files, but a deliberately maintained, authoritative corpus that agents can trust.

Expected changes and innovations

We can anticipate several innovations in the coming years:

Specs as first-class artifacts. Specifications will be stored, versioned, and synced like code - but with semantics optimized for both human and agent consumption. Markdown or structured formats (YAML, JSON Schema) will carry not just prose but machine-readable constraints, API contracts, and architectural rules.
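
As a rough illustration of a spec that carries machine-readable constraints alongside prose, the sketch below pairs a human-readable description with field rules that a tool or agent can check mechanically. The `Spec` and `FieldRule` types and the invoice example are hypothetical; a real system would more likely build on JSON Schema or an OpenAPI document.

```python
from dataclasses import dataclass, field


@dataclass
class FieldRule:
    """A machine-checkable constraint on one field of a payload."""
    name: str
    type_: type
    required: bool = True


@dataclass
class Spec:
    """Prose for humans plus rules for machines, kept in one artifact."""
    title: str
    prose: str
    fields: list[FieldRule] = field(default_factory=list)

    def violations(self, payload: dict) -> list[str]:
        problems = []
        for rule in self.fields:
            if rule.name not in payload:
                if rule.required:
                    problems.append(f"missing required field: {rule.name}")
                continue
            value = payload[rule.name]
            if not isinstance(value, rule.type_):
                problems.append(
                    f"{rule.name}: expected {rule.type_.__name__}, got {type(value).__name__}"
                )
        return problems


invoice_spec = Spec(
    title="Create invoice response",
    prose="A created invoice always carries an id and an integer amount in cents.",
    fields=[
        FieldRule("id", str),
        FieldRule("amount_cents", int),
        FieldRule("memo", str, required=False),
    ],
)

good = {"id": "inv_1", "amount_cents": 1200}
bad = {"id": "inv_2", "amount_cents": "12.00"}
print(invoice_spec.violations(good))  # prints []
print(invoice_spec.violations(bad))
```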

Sync ahead of commits. Today, we commit code to Git and push. The source of truth is the repo. In an agentic world, specs can - and should - stay in sync ahead of commits. Agents can read and update specs in a dedicated workspace, outside GitHub, so that context is current even before code is written. This decouples the "what we are building" from the "what we have committed."

Documentation for machines. a16z notes that "docs are written as much for machines as for humans." Products like Mintlify are already structuring documentation as semantically searchable databases that coding agents cite as context. The next step is to treat specifications themselves as agent-consumable resources - with clear structure, minimal ambiguity, and explicit actionability.

Prompt + test bundles as versionable units. As a16z suggests, we may track "prompt + test bundles" as versionable units. The "state" of an application might be represented by the inputs to generation (prompt, spec, constraints) and a suite of passing assertions, rather than a frozen commit hash. Git starts to function as an artifact log - tracking not just what changed, but why and by whom.
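
One way to picture such a bundle, as a sketch under stated assumptions: the identity of a unit is derived from its generation inputs (prompt, spec version, test names), and its validity from whether the named tests pass. The `Bundle` type, the `verify` helper, and the `slugify` stand-in for generated code are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Bundle:
    """Generation inputs plus the names of the tests that must pass."""
    prompt: str
    spec_version: str
    test_names: tuple[str, ...]

    def bundle_id(self) -> str:
        # Canonical JSON so the same inputs always hash to the same id.
        canonical = json.dumps(
            {
                "prompt": self.prompt,
                "spec_version": self.spec_version,
                "tests": sorted(self.test_names),
            },
            sort_keys=True,
        )
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def verify(bundle: Bundle, tests: dict[str, Callable[[], bool]]) -> dict[str, bool]:
    """Run every test the bundle names and report pass/fail per test."""
    return {name: bool(tests[name]()) for name in bundle.test_names}


# Stand-in for code an agent generated from the prompt.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())


TESTS = {
    "lowercases": lambda: slugify("Hello") == "hello",
    "joins_words_with_hyphens": lambda: slugify("Hello World") == "hello-world",
}

bundle = Bundle(
    prompt="Write slugify(title): lowercase, words joined by hyphens.",
    spec_version="spec-v3",  # hypothetical version label
    test_names=tuple(TESTS),
)
results = verify(bundle, TESTS)
print(bundle.bundle_id(), results)
```

Changing the prompt, the spec version, or the set of tests yields a different id, which is the property a commit hash alone does not give you: the id names the intent and its checks, not just the resulting bytes.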

Pain points today

Teams already feel the pain. When multiple agents - or agents and humans - edit the same domain, conflicts arise. Without a canonical source, there is no single place to resolve them. When an agent is given stale context, it produces wrong or inconsistent output. When specs live in Confluence and code lives in repos, nobody is sure which is current. The overhead of keeping things aligned grows with every new tool and every new agent.

The pain will intensify as agent adoption grows. More agents mean more potential sources of drift. More automation means faster divergence if there is no governance. The teams that thrive will be those that establish a canonical spec workspace early - a place that agents sync to, read from, and treat as authoritative.

Solutions

We need systems that:

  1. Provide a single source of truth for specifications. One workspace, one canonical state. Agents and humans read from and write to the same place. No scattered copies, no "which version is current?"

  2. Stay in sync ahead of commits and pushes. Specs should be updatable outside the code repo. They can live in a dedicated workspace that agents and humans access directly, so context is always current - even before code is written or committed.

  3. Expose specs in agent-consumable form. Structured, machine-readable, with clear semantics. Compatible with MCP or similar protocols so that any agent can pull context reliably.

  4. Integrate with existing toolchains. Cursor, Claude Code, Gemini CLI, and other agents must be able to sync specs with minimal friction. A single command - sync specs - should be sufficient to keep local and canonical state aligned.

  5. Support multiple agents and humans without overwrites. Concurrent edits create branches, not blind overwrites. Conflict resolution happens in the workspace, with clear attribution and auditability.

  6. Enable governance and traceability. A canonical workspace is not enough if changes are not versioned, attributable, and reversible. The same governance layer that applies to living specs - history, provenance, restore - must apply to the canonical source. Otherwise, "canonical" becomes "latest overwrite wins," which is no improvement.
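
Requirements 5 and 6 can be sketched together. The toy workspace below is an assumption-laden illustration, not a description of any product: every write names the revision it was based on, a write against a stale base lands on a separate branch instead of overwriting the head, and a restore is recorded as a new attributed revision so history is never rewritten. The class names and the `conflict/...` branch naming are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Revision:
    number: int
    author: str
    text: str
    parent: Optional[int]  # revision this edit was based on
    branch: str


class SpecWorkspace:
    """Append-only history with branch-on-conflict instead of overwrite."""

    def __init__(self, initial_text: str, author: str) -> None:
        self.history = [Revision(1, author, initial_text, None, "main")]
        self.heads = {"main": 1}

    def head(self, branch: str = "main") -> Revision:
        return self.history[self.heads[branch] - 1]

    def write(self, author: str, text: str, base: int) -> Revision:
        number = len(self.history) + 1
        if base == self.heads["main"]:
            branch = "main"
        else:
            # Stale base: main has moved on. Keep both edits by branching.
            branch = f"conflict/{author}/{number}"
        rev = Revision(number, author, text, base, branch)
        self.history.append(rev)
        self.heads[branch] = number
        return rev

    def restore(self, number: int, author: str) -> Revision:
        # Restoring adds a new revision; earlier revisions stay in history.
        old = self.history[number - 1]
        return self.write(author, old.text, base=self.heads["main"])


ws = SpecWorkspace("v1: invoices are immutable", author="alice")
a = ws.write("agent-a", "v2: invoices can be voided", base=1)
b = ws.write("agent-b", "v2: invoices can be edited", base=1)  # based on a stale head
print(a.branch, "|", b.branch, "|", ws.head().text)
restored = ws.restore(1, author="alice")
print(restored.number, ws.head().text)
```

Here the second agent's edit is preserved on its own branch for later resolution, and the restore shows up as revision 4 by alice rather than as a silent reset.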

The "Outside GitHub" advantage

A critical insight is that specs can - and arguably should - live outside the code repository. Today, many teams keep specs in the repo alongside code. That creates a lockstep: you cannot update the spec without a commit; you cannot have a spec that represents "what we are about to build" before any code exists. In an agentic world, the spec is the input to generation. It should be current ahead of commits. A dedicated spec workspace, outside GitHub, allows exactly that. Agents and humans edit specs in real time. When code is generated, it is generated from the current spec. The commit happens after the fact, as an artifact of the process, not as the gating mechanism for truth.

This decouples "what we are building" from "what we have committed." It is a subtle but profound shift. The repository remains the log of what was built. The spec workspace is the source of what we intend to build - and what we are building right now.
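
If the commit is an artifact of the process, it helps to record which spec state it was generated from. The following sketch assumes a hypothetical `Provenance` record stored next to each generated file; comparing it with the current spec shows whether the committed code has fallen behind the intent.

```python
import hashlib
from dataclasses import dataclass


def spec_version(text: str) -> str:
    """Content-derived version id for a spec (illustrative scheme)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]


@dataclass(frozen=True)
class Provenance:
    artifact_path: str
    generated_from: str  # spec version at generation time


def is_stale(record: Provenance, current_spec_text: str) -> bool:
    # The artifact is stale once the spec has changed since generation.
    return record.generated_from != spec_version(current_spec_text)


spec = "Invoices can be voided but never edited."
record = Provenance("src/invoices.py", spec_version(spec))
print(is_stale(record, spec))  # prints False

spec = "Invoices can be voided; drafts can also be edited."
print(is_stale(record, spec))  # prints True
```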

Why this matters

Economically, the cost of fragmented context is hidden but real. Rework, miscommunication, and agent-induced errors compound when there is no canonical source. The teams that invest in a single source of truth for specs will reduce that cost; those that delay will pay it repeatedly as agent adoption grows.

The last thirty years have taught us that the teams that adapt to paradigm shifts early gain lasting advantages. Those who treated tests as first-class artifacts prospered when CI/CD arrived. Those who adopted API-first design were ready for microservices. The shift to agentic development is no different. The teams that establish a canonical spec workspace now will have a durable advantage: their agents will have something to trust, their humans will have something to align around, and their context will stay correct and up to date instead of fragmented and stale.

Specularis exists to be that workspace: specs as the canonical system of record, outside GitHub and the repo, where agents and humans work from the same source of truth. The paradigm is shifting. The question is whether we will lead it or be overwhelmed by it.


References

  • Andreessen Horowitz (Yoko Li), "Emerging Developer Patterns for the AI Era," May 2025. a16z.com
  • Gartner, "Gartner Says 75% of Enterprise Software Engineers Will Use AI Code Assistants by 2028," April 2024.
  • Replit, "Replit Agent" documentation; Amjad Masad on AI coding agents and autonomy.
  • Model Context Protocol (MCP) Specification, modelcontextprotocol.info.
  • Microsoft AgentSchema, Oracle Open Agent Specification, ai-guide.md - specifications for agent-defined structures.