Building AI

Matt Asay

Forward-thinking developers are recognizing the benefits of crafting clear, uniform, and thoroughly documented code that AI agents can readily interpret. This ‘boring’ approach enhances agent dependability.


The significance of large language models (LLMs) and AI agents in software engineering doesn’t stem from their ability to generate code with incredible velocity. As previously discussed, unchecked speed merely leads to an abundance of technical debt. Rather, agent-driven coding is crucial because it redefines the very essence of sound software engineering practices.

Historically, developers often prioritized personal preferences. Whether a framework resonated with them, a workflow seemed graceful, or a codebase aligned with their specific vision of software construction, these individual inclinations often sufficed. Machines would ultimately execute the given instructions. However, AI agents alter this dynamic. They don’t favor the most ingenious approach, but rather the most understandable, and progressively, the one tailored for their operation. While this might appear daunting, it’s a beneficial shift.

Consider the insights of Hamel Husain.

Communicating with AI Systems

One often encounters opinionated developers on platforms like Hacker News who firmly believe in specific tools for development. Hamel Husain, however, stands apart. When he shared his decision to discontinue using nbdev, he wasn’t abandoning just any side venture. He was letting go of his own creation, a project he co-developed and advocated for years. The rationale? Its incompatibility with AI. He observed, “I was working against the current,” as nbdev’s unique methodology felt “like battling the AI instead of collaborating with it.” Instead, he articulates a desire to operate in a setting where AI can achieve “the greatest likelihood of success.” He is now orienting his development practices to align with machine preferences, rather than solely his own. He is likely not the only one adopting this stance.

Developers have traditionally viewed tools as a means of self-expression, and at times, this holds true. Yet, AI agents are transforming tools into something akin to infrastructure. Husain credits Cursor’s success to its familiarity, enabling developers to transition habits smoothly rather than requiring an immediate paradigm shift. This echoes my point in “Why ‘boring’ VS Code keeps winning.” While familiarity once appealed to humans seeking ease of use, it now also appeals to models. A repository structure, framework, or language mirroring the training data offers the model a higher chance of effective performance. In the age of agents, alignment is not surrender; it’s an advantage.

Data from GitHub’s recent Octoverse analysis supports this idea. By August 2025, TypeScript surpassed both Python and JavaScript to become GitHub’s most widely used language. GitHub posits that AI compatibility is now a core factor in technology selection, not merely an optional perk. The report also indicates a 66% year-over-year growth for TypeScript, attributing this to the fact that strongly typed languages provide models with more defined boundaries, leading to the generation of more trustworthy and contextually accurate code. As Husain articulated regarding his choice to move beyond a Python-exclusive approach towards TypeScript, “typed languages enhance the reliability of AI-generated code in production environments.”
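The point about typed languages can be made concrete. In this TypeScript sketch (the types and names are illustrative, not drawn from the article or any cited codebase), a discriminated union gives a code-generating model hard boundaries: the compiler forces every case to be handled, so a plausible-but-wrong branch fails type checking instead of shipping.

```typescript
// A discriminated union makes invalid states unrepresentable and
// exhaustive handling mandatory -- exactly the "defined boundaries"
// that help a model generate trustworthy code.
type PaymentResult =
  | { status: "success"; transactionId: string }
  | { status: "declined"; reason: string }
  | { status: "error"; retryable: boolean };

function describe(result: PaymentResult): string {
  // TypeScript's narrowing checks that every branch of the union
  // is covered; a missing case is a compile error, not a runtime bug.
  switch (result.status) {
    case "success":
      return `Paid (tx ${result.transactionId})`;
    case "declined":
      return `Declined: ${result.reason}`;
    case "error":
      return result.retryable ? "Temporary error, retry" : "Permanent error";
  }
}
```

An agent asked to extend `PaymentResult` with a new variant is immediately told, by the compiler, every call site it must update.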

This doesn’t imply that every team must immediately undertake a TypeScript refactor. However, it certainly indicates a declining justification for unconventional, poorly documented, or “trust-me-it’s-clever” engineering. Agents thrive on clarity. They appreciate structured schemas. They benefit from defined boundaries.

Essentially, they favor predictability.

Fundamentals of Engineering Economics

This represents a profound shift in software engineering. The narrative surrounding AI agents isn’t primarily focused on code creation. Instead, it revolves around the economics of engineering. When code production costs decrease, the primary limitation shifts. I’ve previously argued that actual typing speed is rarely the true bottleneck in software engineering; validation and integration are. Agents do not eliminate this issue; rather, they make code generation inexpensive while making verification costly, thereby restructuring the entire software development lifecycle.

The most compelling public evidence for this phenomenon comes from two distinct sources, or more precisely, from the apparent contradiction between them.

One compelling piece of evidence is a METR study involving seasoned open-source developers. In a controlled experiment, developers utilizing early-2025 AI tools required 19% more time to resolve issues in familiar repositories, despite believing they had worked faster. This contrasts sharply with OpenAI’s recent “harness engineering” publication, which describes how a small team leveraged Codex to generate approximately one million lines of code and integrate about 1,500 pull requests over five months. These findings appear contradictory at first glance, until one recognizes that the METR study measured the naive application of AI, while OpenAI’s case illustrates what happens when a team re-architects software development specifically for agents, rather than superficially bolting AI onto existing processes.

During OpenAI’s pilot, engineers’ roles shifted from writing code to primarily “designing environments, articulating intent, and establishing feedback loops” that enabled agents to perform dependably. Early on, they found that the agents’ operating environment lacked specificity; over time, they made it their priority to build systems in which generated code could be verified and trusted.

Naturally, this means AI-powered coding still requires human involvement as much as it ever did. What has changed is the nature of that involvement.

This trend is already evident in the job market as I write (and indeed, I authored this piece myself). Kenton Varda recently commented: “Concerns about software developer jobs disappearing are misguided.” He is generally correct. Should agents reduce software development costs, the probable outcome will be an increase in software creation, not a decrease. As he suggests, we can expect a surge in specialized applications, in-house tools, and bespoke systems that were previously uneconomical to develop. In fact, we observe the software developer employment sector growing considerably faster than the broader job market, despite claims that AI is poised to displace these roles.

That’s inaccurate. Human oversight remains essential for direction, even as agents assume greater responsibility for execution.

Evaluating AI Agents

This is precisely why Husain’s emphasis on evaluations is so critical. In his LLM Evals FAQ, he notes that teams he’s collaborated with dedicate 60% to 80% of their development efforts to error analysis and assessment. He has also provided one of the most lucid explanations I’ve encountered regarding agent-driven software development: Documentation instructs the agent, telemetry confirms its operation, and evaluations determine the quality of its output. Anthropic echoes this sentiment in its Best Practices for Claude Code, asserting that the “most impactful action” one can take is to equip the model with mechanisms for self-verification, such as tests, visual checks, or predefined results.
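Husain’s framing (documentation instructs, telemetry confirms, evals judge) can be sketched as a minimal harness. This is a hypothetical illustration, not the tooling he or Anthropic describe: it scores an agent’s output against deterministic checks and returns a pass rate, giving the model the self-verification mechanism Anthropic recommends.

```typescript
// Minimal eval-harness sketch. `EvalCase` and `runEvals` are
// hypothetical names for illustration; real eval suites (and the
// error analysis Husain describes) are far more involved.
type EvalCase = {
  input: string;
  check: (output: string) => boolean; // deterministic pass/fail
};

function runEvals(
  generate: (input: string) => string, // stand-in for an agent call
  cases: EvalCase[]
): number {
  let passed = 0;
  for (const c of cases) {
    if (c.check(generate(c.input))) passed++;
  }
  return passed / cases.length; // pass rate in [0, 1]
}
```

The design choice that matters is the `check` function: because it is deterministic, an agent can run the suite itself, see which cases fail, and iterate, rather than waiting for a human to eyeball the output.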

This also alters the fundamental nature of a repository. Previously, it served as a location for humans to store source code and offer hints to fellow human developers. Increasingly, it functions as an operational guide for AI agents. OpenAI reported that Codex initially used an AGENTS.md file, but they soon realized that a single, expansive agent manual quickly became outdated and ineffective. A more successful approach involved using AGENTS.md as a concise entry point to a well-organized, in-repository knowledge base. This represents a crucial agent-centric revelation. Build scripts, testing procedures, architectural diagrams, design specifications, limitations, and excluded objectives are no longer secondary documentation. They now form an integral part of the executable development context.
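OpenAI’s finding, that AGENTS.md works best as a concise entry point rather than a monolithic manual, might look something like this hypothetical layout (the file names and paths are invented for illustration, not taken from OpenAI’s repository):

```markdown
# AGENTS.md — entry point, kept deliberately short

Start here, then follow the links below. Do not duplicate their
content in this file; it will drift out of date.

- Build and test: `docs/agents/build.md` (one command per task, with expected output)
- Architecture: `docs/agents/architecture.md` (module boundaries, data flow)
- Constraints and non-goals: `docs/agents/constraints.md`
- Verification: run the test suite before proposing any change; failing tests block merges
```

The point is structural: the entry file stays small and stable, while the knowledge it points to lives beside the code it describes and is updated with it.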

To put it plainly? Context has become infrastructure.

Numerous teams are about to discover that their software development practices are more flawed than they imagined: undocumented scripts, opaque local configurations, unreliable tests, architecture that lives only in word-of-mouth, ambiguous tickets, inconsistent naming conventions, and the notion that “each senior engineer has their own method.” Humans have simply adapted to these issues. Agents, however, instantly highlight these deficiencies. An ill-defined environment doesn’t foster innovation; it generates poor quality. If an agent struggles within a disorganized codebase, it’s not always a flaw in the agent. Frequently, it serves as a highly effective assessment of your engineering rigor. The repository, at last, reveals its true state.

Hence, I would now assert that my previous claim—that AI coding necessitates developers becoming better managers—was accurate, albeit incomplete. Indeed, developers must excel at managing machines. Crucially, though, they also need to become better engineers in the traditional sense: more proficient in crafting specifications, defining boundaries, establishing “ideal workflows,” and so forth. The age of agents prioritizes discipline over ingenuity, a shift that is arguably long overdue.

Therefore, the central narrative of coding agents is not merely their capacity to generate code; even basic chatbots could mimic that. The profound story is how they are redefining what constitutes proficient software engineering. Agents favor precisely those attributes developers have long espoused but frequently neglected: clarity, uniformity, testability, and verification. In this agent-driven era, straightforward software engineering not only scales more effectively but also enhances nearly every aspect—including collaboration and debugging.
