
Alibaba researchers devise efficient GPU pooling system, reducing GPU use 82%

Drastically reducing the number of GPUs needed to run AI models could have big consequences for the scale of huge data centers, while benefiting smaller organizations. It could also reduce demand for pricey new GPUs from Nvidia.

Jon Keegan, Matt Phillips

Researchers at Peking University and Alibaba have announced a new system that can drastically reduce GPU demand by efficiently “pooling” computing across multiple models rather than assigning each model its own GPU.

Named “Aegaeon,” the system addresses a problem with assigning computing resources to the many AI models on the market: dedicating a set of GPUs to a specific model leaves precious processing cycles underutilized when the model is not receiving a lot of requests.

In the research paper, the authors noted that a small number of popular models, like Meta’s Llama, DeepSeek, and Qwen, dominate utilization, and 17.7% of GPUs serve only 1.35% of requests. That’s a lot of wasted GPU cycles.

The researchers use a system of “token-level auto-scaling,” which assigns computing at a granular level using tokens (the smallest unit of text an LLM processes, sometimes only a few letters) rather than at the “request” level, which might see one heavy computational task holding up the queue.
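
To see why the granularity matters, here is a toy sketch in Python. It is not Aegaeon's actual scheduler: it assumes a single simulated GPU, all requests arriving at once, one time unit per generated token, and no cost for switching between requests or models. The function names and the "heavy" and "light" requests are made up for illustration.

```python
from collections import deque


def request_level(requests):
    """Run each request to completion in arrival order (FIFO)."""
    clock, finish = 0, {}
    for name, tokens in requests:
        clock += tokens  # assumed cost: one time unit per generated token
        finish[name] = clock
    return finish


def token_level(requests):
    """Round-robin: generate one token per turn, then requeue the request."""
    queue = deque(requests)
    clock, finish = 0, {}
    while queue:
        name, remaining = queue.popleft()
        clock += 1  # generate a single token for this request
        if remaining > 1:
            queue.append((name, remaining - 1))
        else:
            finish[name] = clock
    return finish


# Hypothetical workload: one 100-token request ahead of a 3-token request.
requests = [("heavy", 100), ("light", 3)]
print(request_level(requests))  # {'heavy': 100, 'light': 103}
print(token_level(requests))    # {'light': 6, 'heavy': 103}
```

In this simplified setup the short request finishes at time 6 instead of 103, while the long request finishes no earlier than before. A real system also has to pay for switching between models, which this sketch ignores.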

Using the Aegaeon system in Alibaba Cloud’s production tests, the company was able to reduce GPU demand by 82%. What would normally take 1,192 GPUs, the researchers were able to do with just 213 Nvidia H20 GPUs.
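
The 82% figure follows directly from the two GPU counts reported above. A quick check in Python, using only those two numbers (the variable names are ours):

```python
baseline_gpus = 1192  # GPUs needed without pooling, as reported
pooled_gpus = 213     # GPUs needed with Aegaeon, as reported

reduction = 1 - pooled_gpus / baseline_gpus
ratio = baseline_gpus / pooled_gpus

print(f"GPU reduction: {reduction:.1%}")        # GPU reduction: 82.1%
print(f"Baseline-to-pooled ratio: {ratio:.1f}x")  # Baseline-to-pooled ratio: 5.6x
```

Put another way, each pooled GPU did the work of roughly five and a half dedicated ones in the test.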

The consequences of this system could be significant. If AI companies can do more with less, maybe those massive data centers running AI models don’t need to be so huge, and maybe they don’t have to find as many complicated financing schemes to pay for all those GPUs.

But this also means that smaller players could be more competitive, especially in places like China, where export controls are making the most powerful processors hard to come by.

It could also be bad news for Nvidia, though Aegaeon is built on Nvidia software. And on Monday, some analysts on Wall Street pointed to the reports on Aegaeon as a reason for the day’s weakness in some previously high-flying data center stocks.

Oracle was down sharply for the second straight session. Hard disk drive makers Seagate Technology Holdings and Western Digital — big beneficiaries of the data center trade this year — also declined, as did AI energy plays Constellation Energy and Vistra.

More Tech


OpenAI’s models are officially coming to Amazon

Amazon is finally getting in on the hottest ticket in tech.

After Microsoft announced yesterday that it has agreed to give up its exclusive rights to sell OpenAI’s models, Amazon, as expected, will start offering them to customers — something Amazon Web Services CEO Matt Garman says users have been asking for “for a really long time.” Some models are available now in preview, and the most powerful GPT versions will show up “in the coming weeks.”

This is a big shift in the AI cloud wars. Microsoft’s early bet on OpenAI gave Azure an edge by locking up the most in-demand models. Now that exclusivity is gone, Amazon and other competitors can finally offer them too, closing a key gap and competing more directly for AI customers.


Ship-tracking app surges as Iran war continues

As Middle East peace talks stretch on, with Tehran reportedly offering to reopen the Strait of Hormuz if the US lifts its blockade and the war ends, the owner of shipping intelligence platform MarineTraffic revealed that the app has gained millions of new users since the conflict began.

MarineTraffic’s user count jumped to 8.5 million this April, up from 3.5 million a year ago, the cofounder of its parent company, Kpler, said in an interview with the Financial Times. Paid subscribers, often workers within companies and governments looking for more data on supply chains and commodities trading, rose 11,000 in the same period.

Kpler, which also owns shipping intelligence platform FleetMon, draws its data from a range of sources, including the Automatic Identification System, satellites, and more than 500 people on-site, like port terminal operators.

Per Appfigures data, MarineTraffic is estimated to have raked in almost $1 million in app revenue across March and April (through April 27), more than double the roughly $346,500 from the same months last year. For the full year, Kpler expects between $300 million and $400 million in annual recurring revenue.

Tom Jones

Google will supply AI models to Pentagon in classified deal, per The Information

Google has become the latest tech company to ink an agreement to supply the Department of Defense (War) with AI, having reportedly closed a classified deal that allows the Pentagon to use its AI for “any lawful government purpose,” according to The Information.

The Information initially reported talks between the Alphabet-owned company and the US government around two weeks ago, following the messy breakdown of the relationship between Anthropic and the Trump administration — and the rushed OpenAI deal that took its place.

The move has reportedly sparked opposition among Google employees, with The Washington Post reporting that over 600 workers signed a letter to CEO Sundar Pichai to ask him to bar the Defense Department from using the company’s AI models for any classified work.

Sherwood Media, LLC produces fresh and unique perspectives on topical financial news and is a fully owned subsidiary of Robinhood Markets, Inc., and any views expressed here do not necessarily reflect the views of any other Robinhood affiliate, including Robinhood Markets, Inc., Robinhood Financial LLC, Robinhood Securities, LLC, Robinhood Crypto, LLC, Robinhood Derivatives, LLC, or Robinhood Money, LLC. Futures and event contracts are offered through Robinhood Derivatives, LLC.