Powering the AI data center boom: the infrastructure upgrades behind innovation


As hyperscalers and other data center operators accelerate efforts to deliver the capacity required for generative AI and large-scale model training, modernizing data centers for the latest GPU technologies has become a defining challenge.

The shift to highly advanced accelerators demands radical improvements across power, cooling, and high-speed connectivity. What was once considered cutting-edge even a few years ago is no longer enough to support today’s AI workloads, forcing operators to rethink everything from rack design to thermal strategy.

Data Centre Director at Onnec.

The scale of global investment reflects this pivot. McKinsey estimates that data center spending will reach $6.7 trillion by 2030, with the majority funneled into facilities engineered specifically for AI.

Yet the industry’s rapid expansion is already confronting major constraints: supply chain bottlenecks for GPUs and interconnects, design limitations that restrict density, and shortages of skilled engineers capable of supporting complex builds.

These pressures have helped fuel the rise of “neocloud” providers whose business models revolve entirely around high-performance GPU compute.

A neocloud surge

Neoclouds have become one of the most dynamic forces reshaping data center infrastructure. Unlike traditional operators, which must balance AI capacity against broader cloud demands, these firms design everything around GPU acceleration.

With demand for generative AI growing faster than existing data centers can be upgraded, neoclouds are capturing momentum by deploying at extraordinary speed and offering high-performance compute at competitive rates.


The scale and ambition of these projects are unprecedented. CoreWeave, for instance, has rapidly grown from modest deployments to tens of thousands of GPUs per build, complemented by the rollout of NVIDIA’s GB300 NVL72 systems.

The performance gains are dramatic: up to ten times greater responsiveness and significant improvements in energy efficiency over previous generations. Meanwhile, NScale’s 230-megawatt facility in Norway aims to deliver 100,000 GPUs by 2026, powered entirely by renewable energy.
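Figures like 230 megawatts for 100,000 GPUs can be sanity-checked with back-of-envelope arithmetic. The sketch below uses the two numbers stated above; the PUE and the non-GPU share of IT power are illustrative assumptions, not figures from NScale.

```python
# Back-of-envelope per-GPU power budget for a 230 MW, 100,000-GPU facility.
# PUE and NON_GPU_SHARE are assumed values for illustration only.

FACILITY_MW = 230          # stated facility capacity
GPU_COUNT = 100_000        # stated GPU target
PUE = 1.2                  # assumed power usage effectiveness
NON_GPU_SHARE = 0.25       # assumed IT power going to CPUs, network, storage

it_power_w = FACILITY_MW * 1_000_000 / PUE      # power reaching IT equipment
gpu_power_w = it_power_w * (1 - NON_GPU_SHARE)  # power left for accelerators
per_gpu_w = gpu_power_w / GPU_COUNT

print(f"Per-GPU power budget: ~{per_gpu_w:.0f} W")  # ~1438 W
```

Under these assumptions the budget lands in the ~1.4 kW range per accelerator, which is roughly the scale of current top-end GPU modules and illustrates why power density, not floor space, is the binding constraint.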

Nebius underscored the scale of market appetite with a multi-billion-dollar GPU infrastructure agreement with Microsoft, a deal that instantly transformed its market position.

The motivation extends far beyond a technical race for capacity. Nations increasingly view AI infrastructure as a pillar of long-term competitiveness. Countries capable of deploying at speed stand to attract investment and talent. Those that move too slowly risk watching opportunities go elsewhere.

The engineering bottlenecks

Building AI-ready infrastructure exposes the limits of even recently constructed facilities. Power density requirements are rising sharply, while cooling and bandwidth constraints frequently demand wholesale redesigns.

Many operators face the uncomfortable reality that retrofits may be costlier or more disruptive than expected, resulting in delayed projects or cancelled expansions.

The most significant shift is the transition from traditional air-cooled systems to various forms of liquid cooling, particularly direct-to-chip.

These systems enable dense GPU clusters to operate within acceptable thermal limits, but they require entirely new facility considerations, from fluid distribution and containment to power integration and safety protocols.
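The flow rates involved in direct-to-chip cooling follow from first principles: the heat removed equals mass flow times specific heat times temperature rise (Q = ṁ · c_p · ΔT). The rack power and coolant temperature rise below are assumed illustrative values, not figures from the article.

```python
# First-order sizing sketch for direct-to-chip liquid cooling:
# solve Q = m_dot * c_p * dT for the coolant mass flow per rack.
# RACK_HEAT_W and DELTA_T are assumed illustrative values.

RACK_HEAT_W = 120_000      # assumed heat load of one dense GPU rack, W
CP_WATER = 4186            # specific heat of water, J/(kg*K)
DELTA_T = 10               # assumed coolant temperature rise, K

m_dot = RACK_HEAT_W / (CP_WATER * DELTA_T)   # required mass flow, kg/s
litres_per_min = m_dot * 60                  # 1 kg of water is ~1 litre

print(f"Required flow: {m_dot:.2f} kg/s (~{litres_per_min:.0f} L/min)")
```

Even this rough estimate, on the order of 170 litres per minute for a single rack, shows why fluid distribution, containment, and leak protocols become facility-level design problems rather than rack-level details.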

Connectivity presents another critical challenge. AI workloads depend on vast east–west traffic flows between GPUs, pushing interconnect technologies such as InfiniBand and advanced fiber optics to their limits.

Supplies of these components remain constrained globally, while the installation itself requires specialist skills and careful coordination. Dense GPU fabrics are only as strong as the cabling underpinning them; poorly designed or cheap deployments quickly become performance chokepoints.
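A rough estimate shows why this east–west traffic dominates. In a ring all-reduce, a standard pattern for synchronizing gradients across GPUs, each GPU moves roughly 2·(N−1)/N of the gradient volume through its links every training step. The model size, cluster size, and link rate below are all assumed values for illustration.

```python
# Rough lower bound on all-reduce time per training step using the
# ring all-reduce traffic formula: per-GPU bytes = 2*(N-1)/N * gradient bytes.
# PARAMS, N_GPUS, and LINK_GBPS are assumed illustrative values.

PARAMS = 70e9              # assumed model parameter count
BYTES_PER_PARAM = 2        # fp16 gradients
N_GPUS = 1024              # assumed cluster size
LINK_GBPS = 400            # assumed per-GPU link rate, Gb/s

grad_bytes = PARAMS * BYTES_PER_PARAM
per_gpu_bytes = 2 * (N_GPUS - 1) / N_GPUS * grad_bytes
seconds = per_gpu_bytes / (LINK_GBPS / 8 * 1e9)   # Gb/s -> bytes/s

print(f"All-reduce lower bound per step: ~{seconds:.1f} s")
```

Several seconds of pure communication per step, before any compute, is why operators push for the fastest fabrics available and why a single degraded link or poorly terminated fiber run can throttle an entire cluster.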

This surge in complexity is echoed in workforce requirements. AI data center builds routinely require several times the manpower of conventional projects, often involving teams with specialized fiber, power, and cooling expertise.

Coordinating these disciplines while maintaining speed, quality, and safety has become a defining operational challenge of the AI era.

Why the right partners matter

That’s why data center operators are increasingly turning to trusted partners capable of bringing technical depth, global experience, and operational scale. No single operator, regardless of size, can shoulder the full burden of AI projects alone.

Strong partners help bridge gaps across engineering, logistics, compliance, and workforce mobilization, enabling operators to move quickly without sacrificing quality or resilience.

These partners contribute in several critical ways. Their familiarity with high-density cabling architectures, advanced cooling solutions, and GPU cluster integration allows them to design and execute upgrades that match the demands of next-generation AI systems.

They also help navigate local regulatory and permitting environments, mitigating risks that can halt or delay builds. On the operational side, they can mobilize large, skilled teams at pace.

That means sourcing, training, and coordinating engineers while ensuring health, safety, and quality control remain robust under accelerated timelines.

In short, the ability to draw on partners with deep technical capability and agile delivery models can be the difference between an ambitious design on paper and a functioning AI data center ready for commercial workloads.

Winning the infrastructure race

The race to build AI-ready data centers is no longer about deploying the latest GPUs. It is a test of coordination between technology, regulation, labor, and supply chains. Operators that combine strong internal leadership with the right external partnerships will be best positioned to bring capacity online quickly and reliably.

As global demand for GPU compute continues to outstrip supply, those able to deliver advanced infrastructure at speed will secure a decisive competitive advantage. In this new era of hyperscale AI, collaboration and capability will determine who leads.


This article was produced as part of TechRadarPro's Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro

Sales director, UK, Onnec Group.
