Tech

Alibaba researchers devise efficient GPU pooling system, reducing GPU use 82%

Drastically reducing the number of GPUs needed to run AI models could have big consequences for the scale of huge data centers, while benefiting smaller organizations. It could also reduce demand for pricey new GPUs from Nvidia.

Researchers at Peking University and Alibaba have announced a new system that can drastically reduce GPU demand, by efficiently “pooling” computing across multiple models rather than assigning each model its own GPU.

Named “Aegaeon,” the system addresses a problem with assigning computing resources to the many AI models on the market: dedicating a set of GPUs to a specific model leaves precious processing cycles underutilized when the model is not receiving a lot of requests.

In the research paper, the authors noted that a small number of popular models, like Meta’s Llama, DeepSeek, and Qwen, dominate utilization, and 17.7% of GPUs serve only 1.35% of requests. That’s a lot of wasted GPU cycles.

The researchers use a system of “token-level auto-scaling,” which assigns computing at a granular level using tokens (the smallest unit of text an LLM processes, sometimes only a few letters) rather than at the “request” level, which might see one heavy computational task holding up the queue.
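To see why scheduling at the token level can help, here is a toy sketch in Python. It is not Aegaeon's actual algorithm: the function names, the round-robin policy, and the job sizes are all made up for illustration, and it ignores the cost of switching a GPU between models. It simply compares when two jobs finish on one shared GPU under each approach.

```python
from collections import deque

def request_level(requests):
    """Run each request to completion before starting the next one.

    requests: list of (name, tokens_needed) pairs, tokens_needed > 0.
    Returns {name: step at which the request finished}.
    """
    finish, step = {}, 0
    for name, tokens in requests:
        step += tokens
        finish[name] = step
    return finish

def token_level(requests):
    """Round-robin: produce one token for a request, then move to the next."""
    queue = deque(requests)
    finish, step = {}, 0
    while queue:
        name, remaining = queue.popleft()
        step += 1          # one token of work on the shared GPU
        remaining -= 1
        if remaining == 0:
            finish[name] = step
        else:
            queue.append((name, remaining))
    return finish

# Hypothetical workload: a heavy request arrives just ahead of a light one.
requests = [("long_job", 100), ("short_job", 5)]

print(request_level(requests))  # short_job waits behind long_job
print(token_level(requests))    # short_job finishes early
```

In this toy run, the short job finishes at step 105 under request-level scheduling but at step 10 under token-level scheduling, while the long job finishes at step 105 instead of 100.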

Using the Aegaeon system in Alibaba Cloud’s production tests, the company was able to reduce GPU demand by 82%. What would normally take 1,192 GPUs, the researchers were able to do with just 213 Nvidia H20 GPUs.
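The 82% figure follows directly from the two GPU counts reported above. A quick check in Python (the variable names are ours, not the researchers'):

```python
gpus_before = 1192  # GPUs the workload would normally take, per the report
gpus_after = 213    # Nvidia H20 GPUs used with Aegaeon, per the report

# Fraction of GPUs no longer needed.
reduction = (gpus_before - gpus_after) / gpus_before
# How many dedicated GPUs each pooled GPU stands in for.
ratio = gpus_before / gpus_after

print(f"Reduction: {reduction:.1%}")      # Reduction: 82.1%
print(f"Consolidation: {ratio:.1f} to 1")  # Consolidation: 5.6 to 1
```

Put another way, each pooled GPU did the work of roughly five to six dedicated ones.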

The consequences of this system could be significant. If AI companies can do more with less, maybe those massive data centers running AI models don’t need to be so huge, and maybe they don’t have to find as many complicated financing schemes to pay for all those GPUs.

But this also means that smaller players could be more competitive, especially in places like China, where export controls are making the most powerful processors hard to come by.

It could also be bad news for Nvidia, though Aegaeon is built on Nvidia software. And on Monday, some analysts on Wall Street pointed to the reports on Aegaeon as a reason for the day’s weakness in some previously high-flying data center stocks.

Oracle was down sharply for the second straight session. Hard disk drive makers Seagate Technology Holdings and Western Digital — big beneficiaries of the data center trade this year — also declined, as did AI energy plays Constellation Energy and Vistra.

More Tech

tech

Apple closes at record high for first time in 2025

After spending the day at intraday highs, Apple set an all-time closing high of $262.24 Monday, following reports of increased iPhone 17 sales and an analyst upgrade. Loop Capital raised its price target to a Street high of $315.

The stock’s previous all-time closing high was in December 2024.

Apple reports its fiscal year 2025 results later this month, and analysts expect them to show the company’s all-important iPhone sales returning to growth.

two faces

A tale of two Teslas from two analyst notes by guys named Dan

Ahead of Tesla’s third-quarter earnings, Barclays’ Dan Levy and Wedbush Securities’ Dan Ives weigh in.

tech

Data center frenzy taxes natural resources, sparks anger around the globe

The race to build ever-larger power-hungry data centers isn’t limited to the US. In Ireland, more than 20% (!!!) of the country’s electricity is consumed by data centers. In Mexico, poor communities near data center sites are seeing water supplies dry up and their fragile power grids falter.

A New York Times report examines what these data center projects look like around the world and tracks the local opposition mounted by environmental groups seeking to block future projects.

The report notes that despite growing local opposition, countries are still bending over backward to lure the billions of dollars in investment that come with these data center projects, offering rich tax incentives to the companies developing the projects, in exchange for a relatively small number of jobs and promises of various, if vague, local benefits.

Much like in the US, the data center deals are shrouded in secrecy, with elected officials required to sign NDAs and the extensive use of shell companies masking the identity of the massive tech companies behind the projects.

OpenAI claimed a math breakthrough this weekend, only to be smacked down

The embarrassing episode sprouted from a misunderstood post, amplified by an OpenAI executive as proof of GPT-5’s mathematical prowess, but turned out not to be what it seemed.

tech

Analysts expect iPhone revenue to return to growth this year and next

Sales of Apple’s latest iPhone are shaping up for a good year, after a couple of pretty crappy ones, according to the latest analyst consensus estimates from FactSet.

Analysts have been revising up their iPhone revenue expectations for the fiscal year ended in late September — which includes a half month of the latest iPhone sales — and now expect iPhone revenue to rise 4.5% in FY 2025 to $210 billion. Growth for FY 2026 is now pegged at 5.5%. Last year, sales were basically flat after declining more than 2% in FY 2023. Of course, as Apple’s hold on the global smartphone market has grown over the years, its latest growth expectations pale in comparison to the early 2010s, but still represent the strongest growth since the pandemic.

Some are crediting the iPhone 17’s physical redesign for positive sales indicators, but we suspect the boost has more to do with a natural upgrade cycle than any specific features.

The stock is trading up nearly 2% premarket and is expected to open near a record high today, following positive early sales estimates from Counterpoint Research and an upgrade from Loop Capital, which raised its price target to $315, a Street high.

Apple reports its 2025 fiscal year results on October 30.

Sherwood Media, LLC produces fresh and unique perspectives on topical financial news and is a fully owned subsidiary of Robinhood Markets, Inc., and any views expressed here do not necessarily reflect the views of any other Robinhood affiliate, including Robinhood Markets, Inc., Robinhood Financial LLC, Robinhood Securities, LLC, Robinhood Crypto, LLC, or Robinhood Money, LLC.