The AI Control Plane for Software Engineering

AI is not a feature to add to individual tools. It is the coordination layer that should sit above all of them. Here is the model and what I am seeing in practice.


Modern software engineering is a coordination problem masquerading as a technical one. The raw capability to write, test, deploy, and observe software is well-understood. What is largely unsolved is how to coordinate the flow of intent, context, and action across the systems that execute these activities.

Think about what a senior engineer actually does when resolving a production incident. They move from an alert in their observability platform, to a log query interface, to a source code repository, to a deployment system, to a communication thread, and back to code. Each transition requires them to carry context that no system holds. The engineer is the integration layer.

This is not incidental. It is structural. The ecosystem evolved as a collection of best-of-breed point solutions: version control, CI/CD, issue tracking, code review, artifact registries, deployment platforms, observability stacks. Each is independently valuable. None coordinates the others. The result is a high coordination tax paid by every engineer, every day.

Larger organizations build internal platforms to reduce this tax - developer portals, golden paths, paved roads. These are genuine improvements. But they optimize the human-as-integrator model rather than replacing it. The human remains in the coordination loop, now with better tools but still responsible for translation between systems.

There is also a compounding context loss problem. The decisions made during software development - why a particular approach was chosen, what alternatives were rejected, what constraints shaped a solution - are rarely captured in any durable form. They live in the heads of engineers, in ephemeral chat messages, in commit messages written under time pressure. When engineers leave, this context goes with them.

The Control Plane Model

In distributed infrastructure, a control plane is the component responsible for managing and coordinating data plane components. The Kubernetes control plane does not execute workloads - it coordinates the systems that do. It maintains desired state, reconciles actual state against desired state, and dispatches instructions to actuators. The data plane executes. The control plane orchestrates.
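The reconciliation pattern at the heart of a control plane can be sketched in a few lines. This is an illustrative sketch, not any real Kubernetes API; the `ControlPlane`, `desired`, and `observed` names are assumptions made for the example:

```python
# Minimal sketch of a control-plane reconcile loop (illustrative names,
# not a real Kubernetes API): compare desired state to observed state
# and emit the corrective actions the data plane should execute.
from dataclasses import dataclass, field

@dataclass
class ControlPlane:
    desired: dict = field(default_factory=dict)   # e.g. {"web": 3} replicas wanted
    observed: dict = field(default_factory=dict)  # what the data plane reports

    def reconcile(self) -> list:
        """Return the actions needed to move observed state toward desired state."""
        actions = []
        for name, want in self.desired.items():
            have = self.observed.get(name, 0)
            if have < want:
                actions.append(f"scale-up {name} {want - have}")
            elif have > want:
                actions.append(f"scale-down {name} {have - want}")
        return actions

cp = ControlPlane(desired={"web": 3}, observed={"web": 1})
print(cp.reconcile())  # → ['scale-up web 2']
```

The loop never executes workloads itself; it only computes the delta and dispatches instructions - which is exactly the division of labor the control plane model imposes.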

This maps directly to software engineering. There are three layers.

The Tool Layer

The tool layer is the data plane: Git repositories, CI runners, deployment systems, observability backends, code analysis tools, testing frameworks. Each has well-defined capabilities and interfaces. They execute work. They do not coordinate.

The Orchestration Layer

The orchestration layer translates high-level intent into sequences of tool invocations, monitors execution, handles failures, and aggregates results. In most organizations today, this layer is occupied by humans. Platform engineering teams build automation for specific, well-defined workflows - but the long tail of coordination tasks remains human-executed.

AI systems are now capable of operating at this layer. Large language models can interpret natural language intent, decompose it into tool calls, execute sequences across heterogeneous APIs, observe results, and adapt. The architectural move I am advocating is deliberate: design AI into the orchestration layer rather than as a feature of individual tools.

The Intent Layer

Above orchestration sits intent. Intent is the expression of what an engineer, team, or organization wants to achieve - not the specific tool invocations required. “Resolve this incident” is intent. The specific sequence of queries, code changes, deployments, and validations required to resolve it is orchestration.
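One way to picture the split: intent is a single declarative statement, and orchestration is the concrete plan derived from it. A hypothetical sketch, where every tool name is invented for illustration:

```python
# Hypothetical sketch: an orchestration layer expands one intent into an
# ordered sequence of tool invocations. All tool names are illustrative.
INTENT = "resolve incident INC-123"

def plan(intent: str) -> list:
    """Decompose a high-level intent into ordered tool calls."""
    incident_id = intent.split()[-1]
    return [
        {"tool": "observability.query_logs", "args": {"incident": incident_id}},
        {"tool": "vcs.checkout_branch",      "args": {"name": f"fix/{incident_id}"}},
        {"tool": "ci.run_tests",             "args": {"branch": f"fix/{incident_id}"}},
        {"tool": "deploy.rollout",           "args": {"branch": f"fix/{incident_id}"}},
        {"tool": "observability.verify",     "args": {"incident": incident_id}},
    ]

steps = plan(INTENT)
print(len(steps))  # → 5
```

The engineer expressed one sentence of intent; the five-step plan is orchestration. In a real system the plan would be constructed dynamically and revised as results come back, not fixed in advance.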

Current AI coding assistants operate at the intent layer - they accept natural language and generate code or suggestions. But they are disconnected from the orchestration layer. They produce artifacts; they do not coordinate the downstream systems that consume those artifacts.

A mature AI control plane connects all three layers: it accepts intent, constructs and executes orchestration across the tool layer, observes results, and closes the loop back to the intent layer.

What a Control Plane Must Have

For AI to function as the control plane for software engineering, the architecture needs:

  • Persistent context: State that spans sessions, tools, and time. The control plane must remember what was decided, why, and with what constraints.
  • Tool federation: Standardized interfaces to heterogeneous tools. Model Context Protocol (MCP) is an early instantiation of this.
  • Bidirectional observability: The control plane must emit signals that humans can inspect, and must consume signals from the tool layer to close feedback loops.
  • Auditability: Every action must be traceable to the intent that triggered it.
  • Human-in-the-loop hooks: The control plane must support delegation of decisions back to humans when confidence is low or stakes are high.
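Two of these requirements - auditability and human-in-the-loop hooks - compose naturally. A sketch under assumed names (the `AuditRecord` shape and the confidence threshold are illustrative, not a standard):

```python
# Sketch of an audit record tying every action back to its originating
# intent, plus a human-in-the-loop gate that defers low-confidence or
# high-stakes actions. All names and thresholds are illustrative.
from dataclasses import dataclass

@dataclass
class AuditRecord:
    intent_id: str   # the intent that triggered this action
    action: str      # what the control plane did (or tried to do)
    actor: str       # "agent" or a human identifier

CONFIDENCE_THRESHOLD = 0.8

def execute(intent_id: str, action: str, confidence: float,
            log: list) -> str:
    """Run an action if confidence is high; otherwise escalate to a human."""
    if confidence < CONFIDENCE_THRESHOLD:
        log.append(AuditRecord(intent_id, f"escalated: {action}", "agent"))
        return "needs-human-approval"
    log.append(AuditRecord(intent_id, action, "agent"))
    return "executed"

log = []
print(execute("INC-123", "rollback deploy", confidence=0.95, log=log))  # → executed
print(execute("INC-123", "delete database", confidence=0.40, log=log))  # → needs-human-approval
```

Note that the escalation itself is audited: deferring to a human is an action the control plane took, and it must be traceable like any other.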

What I Am Seeing in Practice

A few patterns that have emerged from watching agentic systems operate in real software engineering contexts:

The context window is the current bottleneck. AI systems treat the context window as the primary state store. This creates a fundamental architectural tension: the context window is transient, limited in size, and not shared across instances. Control planes require persistent, shared state. The teams deploying AI agents most effectively are building explicit state management layers outside the model - memory stores, structured logs, durable context objects. This is becoming a first-class architectural primitive.
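A minimal sketch of such a state layer: a durable, append-only decision log that any session or agent can rehydrate. The JSON-lines format and field names here are assumptions, not an established standard:

```python
# Sketch of a durable context store kept outside the model's context
# window: decisions and their rationale persist across sessions and
# agents as structured records. Format and fields are illustrative.
import json
import os
import tempfile
import time

def record_decision(path: str, decision: str, rationale: str) -> None:
    """Append a decision with its rationale to a shared JSON-lines log."""
    entry = {"ts": time.time(), "decision": decision, "rationale": rationale}
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")

def load_context(path: str) -> list:
    """Rehydrate prior decisions for a new session or a different agent."""
    with open(path) as f:
        return [json.loads(line) for line in f]

store = os.path.join(tempfile.mkdtemp(), "context.jsonl")
record_decision(store, "use queue-based retries", "db cannot absorb retry bursts")
print(len(load_context(store)))  # → 1
```

The point is not the storage mechanism - a file, a database, a vector store all work - but that decisions and rationale survive the session that produced them, which is exactly what the context window cannot guarantee.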

Tool interface heterogeneity is the primary integration tax. AI agents attempting to coordinate across tools face the same integration tax that human engineers face, but at higher speed and without the cognitive flexibility to handle undocumented edge cases. Standardized tool interfaces (MCP, OpenAPI with semantic annotations) reduce this tax significantly. Teams that invest in clean tool interfaces unlock qualitatively different automation than those working against poorly-documented APIs.
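The value of standardization is easiest to see in miniature: heterogeneous tools registered behind one uniform call interface. This mirrors the spirit of MCP-style federation but is not the MCP wire format; the registry and schema shapes below are illustrative:

```python
# Sketch of tool federation: heterogeneous tools registered behind a
# single uniform interface with declared input schemas. This is in the
# spirit of MCP but is NOT the MCP protocol; names are illustrative.
REGISTRY = {}

def register(name: str, schema: dict, fn) -> None:
    """Expose a tool under a standard name with a declared input schema."""
    REGISTRY[name] = {"schema": schema, "fn": fn}

def call(name: str, args: dict):
    """Invoke any registered tool through the same interface,
    validating required arguments against its declared schema."""
    tool = REGISTRY[name]
    missing = [k for k in tool["schema"] if k not in args]
    if missing:
        raise ValueError(f"missing args for {name}: {missing}")
    return tool["fn"](**args)

register("ci.run_tests", {"branch": "str"},
         lambda branch: f"tests passed on {branch}")
print(call("ci.run_tests", {"branch": "main"}))  # → tests passed on main
```

An agent coordinating through this interface never needs per-tool glue code; the declared schema is what lets it discover, validate, and sequence calls without human translation at each seam.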

Humans remain load-bearing at the intent layer. Current agentic systems handle orchestration well for bounded, well-defined intents. They struggle when intent is ambiguous, when goals conflict, or when constraints are implicit. This is not a temporary limitation to be engineered away - it reflects a genuine division of labor that platform designs should accommodate explicitly rather than design around.

The most interesting automation is not the automation of existing workflows. It is the workflows that emerge when coordination overhead drops to near-zero - the things that were not worth attempting when humans had to execute them manually. Examples observed in early deployments include automated post-incident context capture, continuous architectural drift detection, and proactive dependency health monitoring. None of these was anyone’s explicit goal - they emerged from the capability becoming available.

What This Means

For platform engineering: The mandate expands from “reduce friction for humans” to “build the infrastructure that makes AI coordination possible.” This means investing in tool interface quality, observability instrumentation accessible to AI systems, state management primitives, and auditability infrastructure. Platform engineers who understand these requirements will be positioned as AI-era architects, not just toolchain operators.

For developer tooling: Tools designed as standalone products will face increasing pressure to expose clean, composable interfaces. The competitive advantage of a best-of-breed tool will increasingly depend on how well it participates in AI-coordinated workflows - not just on its standalone UX. Teams building developer tools should treat AI agent consumers as first-class API consumers.

For organization design: If AI systems assume coordination responsibilities that humans currently hold, engineering organizations will see role redistribution. Some coordination roles will shrink. The roles that will grow are those that operate at the intent layer - engineers who can express clear, well-constrained intent; architects who can design systems with AI coordination in mind; platform engineers who build and maintain the control plane itself.

Open Questions

Several things I do not have good answers to yet:

  • What is the right granularity for intent expression? Too coarse and the system cannot act with confidence. Too fine and the human is back to doing orchestration manually.
  • How should conflict resolution work when multiple agents share a control plane and their actions interact? Locking, consensus, and rollback semantics for AI-coordinated workflows are not yet well-defined.
  • What auditability standards are appropriate for AI-executed engineering actions? Current audit logs are designed for human actions. AI actions may require different granularity and different semantic content.
  • How does the control plane model degrade gracefully? A control plane that fails completely is more dangerous than no control plane. What are the failure modes?
  • What does “desired state” mean for a software engineering system, analogous to Kubernetes desired state? This is harder than it sounds - software systems have many dimensions of desired state and they interact in complex ways.