Alibaba researchers devise efficient GPU pooling system, reducing GPU use 82%

Drastically reducing the number of GPUs needed to run AI models could have big consequences for the scale of huge data centers, while benefiting smaller organizations. It could also reduce demand for pricey new GPUs from Nvidia.

Jon Keegan, Matt Phillips

Researchers at Peking University and Alibaba have announced a new system that can drastically reduce GPU demand, by efficiently “pooling” computing across multiple models rather than assigning each model its own GPU.

Named “Aegaeon,” the system addresses a problem with assigning computing resources to the many AI models on the market: dedicating a set of GPUs to a specific model leaves precious processing cycles underutilized when the model is not receiving a lot of requests.

In the research paper, the authors noted that a small number of popular models, like Meta’s Llama, DeepSeek, and Qwen, dominate utilization, and 17.7% of GPUs serve only 1.35% of requests. That’s a lot of wasted GPU cycles.

The researchers use a system of “token-level auto-scaling,” which assigns computing at a granular level using tokens (the smallest unit of text an LLM processes, sometimes only a few letters) rather than at the “request” level, which might see one heavy computational task holding up the queue.
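To make the difference between request-level and token-level scheduling concrete, here is a toy sketch in Python. It is not Aegaeon's scheduler: the function names, the job names, and the token counts are all hypothetical, and it assumes one GPU that produces one token per time step with no cost for switching between requests or models (a cost a real system would have to manage).

```python
from collections import deque

def request_level(requests):
    """Run each request to completion in arrival order.

    Returns the time step at which each request finishes.
    """
    clock = 0
    finish = {}
    for name, tokens in requests:
        clock += tokens  # the GPU is held until this request is done
        finish[name] = clock
    return finish

def token_level(requests):
    """Round-robin: produce one token for a request, then move to the next.

    Assumes switching between requests is free (a simplification).
    """
    queue = deque(requests)
    clock = 0
    finish = {}
    while queue:
        name, remaining = queue.popleft()
        clock += 1  # one token generated
        if remaining == 1:
            finish[name] = clock
        else:
            queue.append((name, remaining - 1))
    return finish

# Hypothetical workload: a heavy request arrives just before a light one.
requests = [("long_job", 1000), ("short_job", 5)]
by_request = request_level(requests)
by_token = token_level(requests)

print("request-level:", by_request)
print("token-level:  ", by_token)
```

In this toy workload the short job waits behind the long one under request-level scheduling, while under token-level scheduling it finishes after a handful of steps and the long job finishes at the same time as before.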

Using the Aegaeon system in Alibaba Cloud’s production tests, the company was able to reduce GPU demand by 82%. What would normally take 1,192 GPUs, the researchers were able to do with just 213 Nvidia H20 GPUs.
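The 82% figure follows directly from the two GPU counts reported above. A quick check in Python (the variable names are ours, chosen for illustration):

```python
baseline_gpus = 1192  # GPUs reportedly needed without pooling
pooled_gpus = 213     # GPUs reportedly used with Aegaeon

reduction = (baseline_gpus - pooled_gpus) / baseline_gpus
print(f"{reduction:.1%}")  # prints 82.1%
```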

The consequences of this system could be significant. If AI companies can do more with less, maybe those massive data centers running AI models don’t need to be so huge, and maybe they don’t have to find as many complicated financing schemes to pay for all those GPUs.

But this also means that smaller players could be more competitive, especially in places like China, where export controls are making the most powerful processors hard to come by.

It could also be bad news for Nvidia, though Aegaeon is built on Nvidia software. And on Monday, some analysts on Wall Street pointed to the reports on Aegaeon as a reason for the day’s weakness in some previously high-flying data center stocks.

Oracle was down sharply for the second straight session. Hard disk drive makers Seagate Technology Holdings and Western Digital — big beneficiaries of the data center trade this year — also declined, as did AI energy plays Constellation Energy and Vistra.

Sherwood Media, LLC and Chartr Limited produce fresh and unique perspectives on topical financial news and are fully owned subsidiaries of Robinhood Markets, Inc., and any views expressed here do not necessarily reflect the views of any other Robinhood affiliate, including Robinhood Markets, Inc., Robinhood Financial LLC, Robinhood Securities, LLC, Robinhood Crypto, LLC, Robinhood Money, LLC, Robinhood U.K. Ltd, Robinhood Derivatives, LLC, Robinhood Gold, LLC, Robinhood Asset Management, LLC, Robinhood Credit, Inc., Robinhood Ventures DE, LLC and, where applicable, its managed investment vehicles.