AI and the Illusion of Productivity

Matt Asay
10 Min Read

Producing code without a strict validation framework isn’t genuine engineering; it’s merely creating a mountain of technical debt.

[Image: Sign post reading "Danger: Slippery Slope," a warning sign near a seafront on an overcast day. Credit: P.Cartwright / Shutterstock]

Consider this sarcastic remark from developer John Crickett on X:

Software engineers: Context switching kills productivity. Also software engineers: I’m now managing 19 AI agents and doing 1,800 commits a day.

Crickett’s observation lands because it’s less a joke than a preview of the next big management trend. We’re poised to swap one flawed productivity metric (lines of code) for an even worse one (agent output), only to be surprised when quality plummets.

And yes, I understand that no one is making 1,800 *meaningful* commits daily. But that’s precisely the core issue. The metric is already being manipulated, and AI agents make this manipulation effortless. If your organization starts praising “commit velocity” in the age of AI, you’re not measuring actual productivity. You’re quantifying the speed at which your team can generate potential liabilities.

The grand promise of generative artificial intelligence was that it would finally eliminate our backlogs. Automated coding agents were supposed to generate boilerplate code at incredible speeds, enabling teams to deliver precisely what the business required. However, as we move deeper into 2026, the reality is much less comfortable. Artificial intelligence isn’t going to rescue developer productivity because writing code was never the primary bottleneck in software engineering. The real impediment lies in validation, integration, and deep system understanding. Producing code without a rigorous validation framework isn’t engineering; it’s simply mass-producing technical debt.

So, what adjustments are necessary?

Revisiting our approach to code

First, as I recently argued, we must stop treating code as an isolated asset. Each line of code is a potential attack surface that demands security, monitoring, maintenance, and seamless integration with its surroundings. Making code creation cheaper therefore doesn’t reduce the overall workload; it amplifies it by increasing the amount of liability generated per hour.

For many years, developers were treated like highly compensated Jira ticket processors. The prevailing assumption was that you could take a clearly defined requirement, transform it into syntax, and deploy it. Crickett accurately points out that if this is the extent of your work, then your role is absolutely automatable. A machine is capable of basic translation and is perfectly content to perform it ceaselessly without complaint.

However, a machine lacks the ability to grasp crucial business context. AI cannot perceive the financial implications of a compliance error or examine a customer workflow and instinctively discern that the underlying requirement is fundamentally flawed. For these tasks, we require human insight, and we need people to thoughtfully consider precisely what they intend AI to accomplish.

Crickett characterizes this shift as a necessary progression towards spec-driven development. He is correct, but we must be exceptionally clear about what a “specification” entails in the agent era. It’s not merely another Jira ticket; instead, it’s a set of constraints sufficiently stringent to prevent an LLM from deviating. In essence, it’s an executable definition of “done,” fully supported by tests, API contracts, and strict production signals. This is precisely the kind of foundational work we have neglected for decades because it doesn’t appear as tangible output; it manifests as process. You know, the “unexciting stuff” that seems to slow you down.
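To make the idea concrete, here is a minimal sketch of a “spec” expressed as executable constraints rather than a prose ticket. The function and its domain (`apply_discount`, prices in cents) are illustrative assumptions, not anything from Crickett’s post; the point is that the definition of “done” runs as code.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Implementation under review -- could be human- or agent-written."""
    return price_cents - (price_cents * percent) // 100

def check_spec():
    """The spec: an executable definition of 'done' that any
    implementation, human or agent, must satisfy before merging."""
    # Contract: a 100% discount yields a zero price, never a negative one.
    assert apply_discount(1000, 100) == 0
    # Contract: a 0% discount is the identity.
    assert apply_discount(1000, 0) == 1000
    # Contract: prices stay in integer cents -- no float drift.
    assert isinstance(apply_discount(999, 15), int)
    # Boundary: rounding favors the customer (the discount is floored).
    assert apply_discount(999, 15) == 850

check_spec()
print("spec passed")
```

A prose ticket saying “add a discount feature” leaves an LLM free to wander; a failing `check_spec` does not. The constraints, not the reviewer’s patience, become the guardrail.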

The tensions are evident in real time; just examine the responses to Crickett’s post. You’ll observe individuals desperately attempting to reconcile the complexities of agent-driven development. One commentator tries to reframe the disorder as architecture versus engineering. Another insists that overseeing 19 agents is actually orchestration, not context switching. A third bluntly states that managing more than five agents concurrently starts to resemble “vibe coding,” which is a polite way of describing gambling with live production systems. All of these highlight the central issue: the work hasn’t been eliminated. It has merely shifted from implementation to supervision and review.

The more you parallelize your code generation, the greater the “review debt” you accumulate.

Observability as the solution

This is where Charity Majors, co-founder and CTO of Honeycomb, comes in. Majors has argued for years that true understanding of code only comes from running it in production, under actual load, with real users and genuine failure conditions. With AI agents, the development burden shifts entirely from writing code to validating it, and humans are notoriously poor at validating code by reviewing large pull requests. We confirm system integrity by observing its behavior in live environments.

Now, extend that concept further into the age of AI agents. For decades, one of the most common debugging strategies was inherently social. A production alert triggers. You examine the version control history, identify the person who authored the code, inquire about their objectives, and reconstruct the architectural intent. But what becomes of that process when no human actually wrote the code? What happens when a human merely skimmed a 3,000-line, agent-generated pull request, clicked merge, and moved on to the next task? When an incident occurs, where is the profound knowledge that once resided with the author?

This is precisely why extensive observability isn’t just a beneficial feature in the agent era; it’s the sole viable replacement for the absent human element. In the age of AI agents, we require instrumentation that captures both intent and business outcomes, not merely generic logs indicating an event happened. We need distributed traces and high-cardinality events rich enough to precisely answer what changed, its impact, and why it failed. Without this, we are attempting to operate a black box constructed by another black box.
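A rough sketch of what “capturing intent” could look like in practice: one wide, high-cardinality structured event per operation, carrying provenance and business context alongside the outcome. The field names (`code_author`, `spec_version`, `customer_tier`) are illustrative assumptions, not a standard schema; a real system would ship these events to an observability backend rather than stdout.

```python
import json
import time
import uuid

def emit_event(**fields):
    """Emit one wide structured event as JSON (sketch: prints to stdout)."""
    event = {"timestamp": time.time(), "trace_id": str(uuid.uuid4()), **fields}
    print(json.dumps(event))
    return event

# Provenance and intent ride along with the operational data, so an
# on-call engineer can answer "who wrote this path, and against what spec?"
evt = emit_event(
    service="checkout",
    operation="apply_discount",
    code_author="agent:pr-3042",      # assumption: provenance of the code path
    spec_version="discount-spec-v7",  # assumption: the intent it was built from
    customer_tier="enterprise",       # high-cardinality business dimension
    outcome="success",
    duration_ms=12,
)
```

The difference from a generic log line is the dimensionality: when no human remembers why the code exists, fields like `code_author` and `spec_version` are the closest thing to asking the author what they meant.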

Majors also provides crucial operational guidance: deployment freezes are fundamentally a quick fix. The common human reaction when change seems risky is to halt deployments. However, if you continue merging agent-generated code without deploying it, you’re simply accumulating risk, not reducing it. When you finally execute a deployment, you’ll have absolutely no idea which specific AI hallucination just disrupted your payment gateway. Therefore, if you must freeze anything, freeze merges. Better yet, make the merge and the deployment feel like a single, indivisible action. The faster that cycle operates, the less variance you encounter, and the simpler it becomes to pinpoint the exact cause of a breakdown.

Golden paths are the solution

The remedy for this impending chaos isn’t to depend on individual heroic engineers. As Majors emphasizes, resilient engineering demands a commitment to platform engineering and golden paths (a stance I’ve also advocated). Such golden paths make proper behavior incredibly straightforward and incorrect behavior exceedingly difficult. The most effective teams of the next decade won’t be those with the most freedom to adopt any framework an agent suggests, but rather those that operate securely within the most effective constraints.

So, how do we assess success in the agentic era?

The essential metrics remain the unglamorous ones, because they measure actual business results. The DORA metrics continue to be our most reliable sanity check as they directly link delivery speed to system stability. They quantify deployment frequency, lead time for changes, change failure rate, and time to restore service. None of these metrics are concerned with the number of commits your agents generated today. They only care whether your system can absorb changes without failing.
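The four DORA metrics fall out of delivery data you likely already have. Here is a minimal sketch; the record shape (`commit_time`, `deploy_time`, `failed`, `restore_minutes`) is an assumption about how your pipeline logs deployments, and the sample values are invented.

```python
from datetime import datetime

# Assumed record shape for illustration; real data would come from your CI/CD system.
deploys = [
    {"commit_time": datetime(2026, 1, 5, 9),  "deploy_time": datetime(2026, 1, 5, 11), "failed": False, "restore_minutes": 0},
    {"commit_time": datetime(2026, 1, 5, 10), "deploy_time": datetime(2026, 1, 6, 9),  "failed": True,  "restore_minutes": 45},
    {"commit_time": datetime(2026, 1, 7, 14), "deploy_time": datetime(2026, 1, 7, 15), "failed": False, "restore_minutes": 0},
    {"commit_time": datetime(2026, 1, 8, 8),  "deploy_time": datetime(2026, 1, 8, 9),  "failed": False, "restore_minutes": 0},
]
days_observed = 7

# 1. Deployment frequency: deploys per day over the window.
deploy_frequency = len(deploys) / days_observed

# 2. Lead time for changes: median commit-to-deploy delta.
lead_times = sorted(d["deploy_time"] - d["commit_time"] for d in deploys)
median_lead = lead_times[len(lead_times) // 2]

# 3. Change failure rate: fraction of deploys that caused a failure.
failures = [d for d in deploys if d["failed"]]
change_failure_rate = len(failures) / len(deploys)

# 4. Time to restore service: mean minutes to recover from failed deploys.
mttr = sum(d["restore_minutes"] for d in failures) / len(failures)

print(deploy_frequency, median_lead, change_failure_rate, mttr)
```

Note what is absent: commit counts and agent counts appear nowhere. An agent generating 1,800 commits a day moves none of these numbers unless those commits actually ship and survive.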

Therefore, absolutely utilize coding agents. Employ them vigorously! But do not conflate code generation with productivity. Productivity emerges after code generation, when code is properly constrained, validated, observed, deployed, rolled back, and understood. This is the cornerstone of enterprise safety and developer productivity.
