AI Brings Back Real Connections

Matt Asay
9 Min Read

Cloud computing once allowed developers to overlook infrastructure; now, AI demands their renewed focus.

For a long time, the cloud’s primary advantage was that providers such as AWS handled the routine, complex work of infrastructure management. Need compute? A click. Storage? Another click. A database? Leave the intricacies to someone else. The core promise of managed infrastructure was freeing businesses from the daily grind of low-level systems engineering.

Artificial intelligence, however, is revealing the limits of this abstraction.

As previously discussed, the true hurdle for enterprise AI is no longer model training, but rather inference: the ongoing application of models to regulated enterprise data, adhering to practical latency, security, and budgetary limits. This transition is crucial, because as inference solidifies as a regular business operation, infrastructure — once considered mundane yet essential — suddenly gains strategic importance.

This is particularly evident when it comes to networking.

Networking: A surprising comeback?

For many years, networking’s value stemmed from being dependable and unremarkable. The whole point was to avoid excitement: standards bodies moved deliberately, and kernel updates were cautious, prioritizing reliability above all. That conservative approach made sense when most enterprise workloads were latency-tolerant and the network’s job was mainly to stay out of the way.

Curiously, networking gained considerable attention during periods of major technological disruption. Consider the internet infrastructure surge and dot-com boom from 1999 to 2001. Then, in 2007, broadband and mobile connectivity expanded significantly. Subsequently, cloud networking saw consolidation between 2015 and 2022. We are now on the verge of another substantial increase in networking relevance, driven by AI.

While many still focus on X posts detailing training processes, model dimensions, and massive data center investments, the true challenges likely lie elsewhere. For most organizations, occasional model training isn’t the primary difficulty. The greater task involves executing inference continually on sensitive data within shared environments, all while meeting stringent performance demands. Though network engineers might prefer to work unnoticed, AI eradicates that possibility. In the age of AI, network efficiency becomes a critical constraint, as applications no longer solely depend on CPU or storage. Instead, they rely on the seamless flow of context, tokens, embeddings, model invocations, and system states across distributed architectures.
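To make that constraint concrete, consider a back-of-the-envelope latency budget for a single retrieval-augmented inference request. The services, hop counts, and timings below are all hypothetical assumptions, not measurements, but they show how per-hop network latency compounds across a distributed inference path:

```python
# Illustrative latency budget for one retrieval-augmented inference request
# crossing several services. All figures below are hypothetical assumptions.

# Per-service compute time in milliseconds (assumed values).
compute_ms = {
    "gateway_auth": 2.0,
    "embedding_model": 15.0,
    "vector_search": 8.0,
    "llm_first_token": 120.0,
}

NETWORK_HOPS = 6     # assumed: client -> gateway -> embedder -> vector DB -> LLM -> back
RTT_MS_FAST = 0.5    # a well-tuned east-west fabric (assumed)
RTT_MS_SLOW = 5.0    # a congested or poorly routed network (assumed)

def total_latency(rtt_ms: float) -> float:
    """End-to-end latency: all compute time plus one round trip per hop."""
    return sum(compute_ms.values()) + NETWORK_HOPS * rtt_ms

fast = total_latency(RTT_MS_FAST)  # 145 ms compute + 3 ms network = 148.0 ms
slow = total_latency(RTT_MS_SLOW)  # 145 ms compute + 30 ms network = 175.0 ms
print(f"fast fabric: {fast:.1f} ms, slow fabric: {slow:.1f} ms")
```

Even with compute times held constant, a ten-fold difference in per-hop round-trip time moves tens of milliseconds of end-to-end latency, which is why east-west network performance shows up directly in user-perceived responsiveness.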

Simply put, AI isn’t just generating more traffic; it’s fundamentally altering the network’s function.

Rethinking network perception

This isn’t an unprecedented transformation in networking. As Thomas Graf, CTO of Cisco Security, cofounder of Isovalent, and the architect behind Cilium, noted in a discussion, “The emergence of Kubernetes and microservices initiated the initial surge in east-west traffic. We disassembled monolithic applications into smaller components, which instantly mandated security not merely at the perimeter firewall, but intrinsically within the infrastructure itself, covering east-west flows.”

AI intensifies this change significantly. These tasks aren’t merely additional services communicating. They encompass synchronized GPU arrays, data retrieval pipelines, vector searches, inference proxy servers, and a growing number of agents constantly sharing state across various systems. This represents a distinct operational landscape compared to what most traditional enterprise networks were designed for. “AI workloads,” Graf further explains, “generate a hundredfold increase [in data movement]. This isn’t due to increased componentization, but because AI operates on a much grander scale, demanding an extraordinary volume of data.”

This immense data volume is precisely why networking has regained prominence, and why developers must reconsider its role.

Within AI ecosystems, the network fabric is increasingly part of the compute system itself. GPUs continuously exchange gradients, activations, and model states, and packet loss is no minor inconvenience: it can stall collective operations and leave costly hardware idle. Conventional north-south monitoring is insufficient because much of the critical traffic never crosses a traditional perimeter the way a user request to a server does; it flows east-west within the cluster, so security policy cannot be confined to the network edge. And because organizations are still defining their AI requirements, scalability matters too: networks must grow incrementally, accommodate diverse workloads, and support evolving designs without requiring a complete overhaul with every AI roadmap adjustment.
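The point about packet loss stalling collectives can be sketched with a toy model. A synchronous collective such as all-reduce finishes only when the slowest worker’s transfer finishes, so even a small per-link loss rate, amplified across many workers, inflates every training step. The worker count, transfer time, and retransmit penalty below are illustrative assumptions, not measurements:

```python
import random

# Toy model of a synchronous collective (e.g., all-reduce) across GPU workers:
# each step completes only when the SLOWEST worker's transfer finishes, so a
# single lossy link stalls every GPU in the group. All figures are assumed.

random.seed(0)

NUM_WORKERS = 64               # assumed cluster size
BASE_TRANSFER_MS = 10.0        # normal per-step transfer time (assumed)
RETRANSMIT_PENALTY_MS = 200.0  # timeout + retransmit cost on loss (assumed)

def step_time(loss_prob: float) -> float:
    """One collective step, bounded by the slowest worker's transfer."""
    per_worker = [
        BASE_TRANSFER_MS
        + (RETRANSMIT_PENALTY_MS if random.random() < loss_prob else 0.0)
        for _ in range(NUM_WORKERS)
    ]
    return max(per_worker)

# Average step time over 100 steps, with and without a 1% per-link loss rate.
clean = sum(step_time(0.0) for _ in range(100)) / 100
lossy = sum(step_time(0.01) for _ in range(100)) / 100
print(f"mean step, no loss: {clean:.0f} ms; with 1% loss: {lossy:.0f} ms")
```

With 64 workers, the odds that at least one worker hits a retransmit in a given step are close to a coin flip, so a loss rate that looks negligible per link multiplies average step time several-fold for the whole cluster.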

Essentially, AI is transforming the network from mere utility infrastructure into an integral component of the application runtime.

Cilium: A critical examination

This is precisely why eBPF holds such significance. The eBPF project describes the technology as a way to run sandboxed programs inside the kernel, extending kernel capabilities without changing kernel source code or loading kernel modules. The technical specifics matter, but the overarching message is clear: eBPF brings monitoring and policy enforcement directly to the point where packets and system calls originate. In an ecosystem characterized by east-west traffic, short-lived services, and ultra-fast inference, that capability is profoundly impactful.

Cilium stands out as a key manifestation of this transformation. Built on eBPF, it delivers Kubernetes-native networking, observability, and policy enforcement at close to the network’s raw speed, so the data path itself doesn’t become the bottleneck. Predictably, Cilium has become an essential component in the networking infrastructure of major cloud providers. (Google’s GKE Dataplane V2, Microsoft’s Azure CNI Powered by Cilium, and AWS’s EKS Hybrid Nodes all either rely on or offer support for Cilium.) In fact, as highlighted by the 2025 State of Kubernetes Networking Report, the majority of Kubernetes users deploy Cilium-based networking.
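In practice, the east-west enforcement Cilium provides is expressed through Kubernetes custom resources. The sketch below shows the general shape of a CiliumNetworkPolicy that allows only inference-gateway pods to reach a vector-store service; the label names and port are hypothetical placeholders, not a recommended production policy:

```yaml
# Sketch: restrict east-west traffic so only inference-gateway pods
# may reach vector-store pods on their service port.
# Labels and port number are hypothetical.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: vector-store-ingress
spec:
  endpointSelector:
    matchLabels:
      app: vector-store
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: inference-gateway
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
```

Because the selectors match pod labels rather than IP addresses, a policy like this survives pod churn and rescheduling, which is what makes identity-based east-west enforcement tractable at Kubernetes scale.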

Despite Cilium’s significance, the broader narrative is that AI is compelling businesses to revisit infrastructure details they once happily abstracted away. This doesn’t mean every firm needs to build its own network stack from scratch, but it does mean platform teams can no longer treat networking as a static utility. If inference marks the true operationalization of enterprise AI, then latency, observability, segmentation, and internal traffic management cease to be minor considerations. Instead, they become fundamental to product quality, system reliability, and the developer workflow.

Beyond just networking

This phenomenon isn’t confined solely to Cilium or networking in general. AI consistently compels us to address aspects we previously wished to ignore. As I’ve previously noted, while impressive AI demonstrations are captivating, the genuine effort lies in ensuring these systems function dependably, securely, and affordably in live environments. Equally crucial, as we strive to deploy AI reliably at an enterprise level, we must not neglect the imperative to enhance the entire stack’s usability for developers, its manageability for IT/operations, and its speed under practical operational demands.

“An AI-powered service that reacts more quickly and responsively will achieve superior market performance. The bedrock for this is a high-performing, low-latency network free of bottlenecks,” Graf observes. “For me, this mirrors high-frequency trading closely. When computers took over from human traders, network latency and data throughput rapidly evolved into key competitive advantages.”

This perspective resonates. The leading enterprises in AI won’t merely be those possessing the largest models. Victory hinges on ensuring inference is dependable, well-managed, and cost-effective when applied to genuine data under actual operational stress. A portion of this challenge will be met through superior models. However, a greater share of success, often underestimated by many businesses, will be secured in the seemingly mundane foundational layers, such as networking.
