- The AI boom is colliding with a shortage of GPUs and high-bandwidth memory
- Hyperscalers refresh their training clusters every few years, leaving millions of still-useful GPUs behind
- Redeploying that hardware to the network edge could unlock the real economic payoff of AI
Fact: the global AI revolution depends on two things—vast numbers of GPUs and enormous amounts of high-bandwidth memory. And both are becoming increasingly constrained.
The GPU shortage is already pushing prices for Nvidia chips sky-high and slowing hyperscaler buildouts. Less widely understood is the pressure on high-bandwidth memory (HBM), the specialized stacked DRAM used in AI accelerators. HBM supply is expected to remain tight through the next couple of years as demand from AI training clusters continues to surge.
The combination means hyperscalers will have to pay eye-watering prices for the hardware they need. That will make Nvidia even richer and absurdly rich hyperscalers minutely poorer. Both of those outcomes are acceptable.
But there is a more serious consequence. Hardware bottlenecks risk slowing the rollout of AI systems that could transform the global economy. And that is not OK, because the “A” in AI doesn’t stand for America. Artificial intelligence is an industrial technology with global consequences. It will reshape business, manufacturing, logistics, medicine and scientific research everywhere—right up until America starts World War III and blows the world into teeny tiny pieces, obviously.
I can’t prevent Armageddon or fix America’s war addiction. But I do have an idea for easing the computing crunch.
Ready? Recycling. (No, wait. Don’t go.)
The pace of development in AI hardware means hyperscalers typically refresh GPU clusters every two to three years as new architectures arrive. The latest generation becomes the backbone for large-scale training runs, while the previous generation quietly moves down the food chain to less demanding applications.
But here’s the thing: edge AI applications don’t need the newest GPUs.
Training massive models in hyperscale data centers requires the bleeding edge. Running inference for industrial automation, logistics systems, robotics or smart infrastructure does not. Those workloads can comfortably run on hardware that hyperscalers consider obsolete.
So here’s the idea: when hyperscalers replace GPUs in their training clusters, those chips—and the HBM attached to them—should be systematically redeployed into the network edge.
Older GPUs could power AI systems in factories, warehouses, hospitals, energy networks and transportation systems. They could sit inside enterprise infrastructure instead of hyperscale clouds. They could run the physical AI applications that actually drive productivity.
This already happens a little, but not in any organized way. That’s partly because hyperscalers are focused on the core of the network, where the money is minted.
Training ever larger models in ever larger clusters remains the dominant economic engine of the AI boom in the United States. Edge computing, by contrast, is fragmented, messy and harder to monetize; that’s why hyperscalers are happy to leave it to the carriers.
But that’s where the real economic value of AI may ultimately lie.
Consumer chatbots are entertaining. Industrial AI systems that optimize logistics networks, manufacturing lines or energy grids are transformational. And those applications don’t need the newest GPUs—just lots of reasonably capable ones distributed across the network.
China appears to have grasped this earlier than the U.S., with much of its AI strategy focused on industrial deployment and edge applications rather than purely on training ever larger models in centralized data centers.
Which brings us back to recycling.
Recycling laws in the United States are mostly controlled by state and local regulators and are generally as rubbish as the rubbish they’re meant to deal with. But a federal framework encouraging the reuse and redeployment of AI hardware could have real teeth.
In practice, the policy wouldn’t be complicated. A federal “AI hardware reuse” framework could require hyperscalers to report and certify retired accelerator hardware, create tax incentives for redeploying GPUs and memory into enterprise and industrial deployments, and establish a secondary market clearinghouse where companies building edge AI systems can access those chips at scale.
Instead of letting yesterday’s training hardware drift by accident into fragmented resale markets, the U.S. could deliberately steer it toward enterprise and industrial AI deployments, turning retired chips into tomorrow’s industrial infrastructure.
The real question facing the AI industry isn’t how to build ever larger training clusters; it’s how to put intelligence where the real economy lives. The fastest way to do that may not be building more data centers at all, but using the hardware we already have much more intelligently. That won’t make Nvidia any poorer. But it could make the rest of the world a lot smarter, without a single additional data center.