
Alibaba researchers devise efficient GPU pooling system, reducing GPU use 82%

Drastically reducing the number of GPUs needed to run AI models could have big consequences for the scale of huge data centers, while benefiting smaller organizations. It could also reduce demand for pricey new GPUs from Nvidia.

Jon Keegan, Matt Phillips

Researchers at Peking University and Alibaba have announced a new system that can drastically reduce GPU demand, by efficiently “pooling” computing across multiple models rather than assigning each model its own GPU.

Named “Aegaeon,” the system addresses a problem with assigning computing resources to the many AI models on the market: dedicating a set of GPUs to a specific model leaves precious processing cycles underutilized when the model is not receiving a lot of requests.

In the research paper, the authors noted that a small number of popular models, like Meta’s Llama, DeepSeek, and Qwen, dominate utilization, and 17.7% of GPUs serve only 1.35% of requests. That’s a lot of wasted GPU cycles.

The researchers use a system of “token-level auto-scaling,” which assigns computing at a granular level using tokens (the smallest unit of text an LLM processes, sometimes only a few letters) rather than at the “request” level, which might see one heavy computational task holding up the queue.
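To see why the level of granularity matters, here is a toy sketch in Python. It is not Aegaeon's actual scheduler: the function names, the model names, and the token counts are all made up for illustration, and it assumes one shared GPU that produces one token per time step with no cost for switching between models (a cost any real system would have to keep small).

```python
from collections import deque


def request_level(requests):
    """Serve each request to completion before starting the next.

    `requests` is a list of (model_name, tokens_to_generate) pairs.
    Returns a dict mapping model_name to the time step its request finished.
    """
    clock = 0
    finished = {}
    for name, tokens in requests:
        clock += tokens  # the GPU is tied up for the whole request
        finished[name] = clock
    return finished


def token_level(requests):
    """Hand the GPU to the next waiting request after every single token."""
    queue = deque(requests)
    clock = 0
    finished = {}
    while queue:
        name, remaining = queue.popleft()
        clock += 1  # generate exactly one token, then give up the GPU
        if remaining == 1:
            finished[name] = clock
        else:
            queue.append((name, remaining - 1))
    return finished


# Hypothetical workload: one long request arrives ahead of two short ones.
requests = [("model_a", 6), ("model_b", 1), ("model_c", 2)]

print(request_level(requests))  # {'model_a': 6, 'model_b': 7, 'model_c': 9}
print(token_level(requests))    # {'model_b': 2, 'model_c': 5, 'model_a': 9}
```

In this toy run the total work is the same either way (nine time steps), but the short requests no longer wait behind the long one: model_b finishes at step 2 instead of step 7.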

Using the Aegaeon system in Alibaba Cloud’s production tests, the company was able to reduce GPU demand by 82%. What would normally take 1,192 GPUs, the researchers were able to do with just 213 Nvidia H20 GPUs.
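The 82% figure follows directly from the two GPU counts reported above. A quick check in Python, where the variable names are ours and only the two counts come from the reporting:

```python
gpus_before = 1192  # GPUs the workload would normally need, per the researchers
gpus_after = 213    # Nvidia H20 GPUs used with Aegaeon

saved = gpus_before - gpus_after
reduction = saved / gpus_before

print(f"GPUs saved: {saved}")         # GPUs saved: 979
print(f"Reduction: {reduction:.1%}")  # Reduction: 82.1%
```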

The consequences of this system could be significant. If AI companies can do more with less, maybe those massive data centers running AI models don’t need to be so huge, and maybe they don’t have to find as many complicated financing schemes to pay for all those GPUs.

But this also means that smaller players could be more competitive, especially in places like China, where export controls are making the most powerful processors hard to come by.

It could also be bad news for Nvidia, though Aegaeon is built on Nvidia software. And on Monday, some analysts on Wall Street pointed to the reports on Aegaeon as a reason for the day’s weakness in some previously high-flying data center stocks.

Oracle was down sharply for the second straight session. Hard disk drive makers Seagate Technology Holdings and Western Digital — big beneficiaries of the data center trade this year — also declined, as did AI energy plays Constellation Energy and Vistra.

More Tech


Waymo is now offering autonomous rides in Miami

Alphabet subsidiary Waymo announced Thursday that it’s officially open for autonomous ride-hailing in Miami, expanding the company’s coverage area to six US cities. The company will be “inviting new riders on a rolling basis” to take rides across its 60-square-mile service area, which includes the Design District, Wynwood, Brickell, and Coral Gables. Waymo said it plans to expand to Miami International Airport “soon.”

Competitor Tesla currently operates a ride-hailing service with a safety monitor in the vehicle in Austin and the Bay Area.


Apple to promote Siri from assistant to chatbot

Bloomberg reports that Apple plans to transform its Siri assistant into a full-fledged chatbot similar to OpenAI’s ChatGPT.

The chatbot would be integrated throughout the iPhone’s operating system rather than offered as a stand-alone app. It’s expected to arrive later this year and would be separate from more incremental, non-chatbot improvements to Siri rolling out in the coming months aimed at making the existing assistant more usable.

Both updates will be powered by Google’s AI models, Bloomberg reports, but the chatbot upgrade will be more advanced and akin to the much-lauded Gemini 3.

While the difference between an assistant and a chatbot may sound subtle, it represents a meaningful shift for Apple, which has long avoided a fully conversational interface and has lagged rivals that embraced one. Any new Siri chat capabilities could also eventually extend to other Apple devices under development, including wearables such as the pin Apple is developing.


OpenAI shares how it will charge for ChatGPT ads

Last week, OpenAI announced that ads were going to be rolling out in ChatGPT in the coming weeks.

Now we have more details about what OpenAI is telling advertisers. According to a report from The Information, the company has reached out to “dozens” of advertisers, and will charge based on ad views.

Advertisers are still waiting for further details, but OpenAI is asking for less than $1 million each in ad spending while it tests out the new system, per the report.

Ads are supposed to begin in February, and will only appear for free ChatGPT and ChatGPT Go users.


Apple is reportedly working on a wearable AI pin

Move over, OpenAI: Apple is reportedly also developing a mysterious AI-powered wearable device, a pin that looks like a thin, flat, circular disc with an aluminum-and-glass shell.

The Information reports that the device is the size of an Apple AirTag and has two cameras, a speaker, three microphones, and wireless charging. It could be available by early 2027.

Apple, which has lagged its peers in AI and recently teamed up with Google to support its upcoming Siri revamp, is hoping to keep up with ChatGPT and Google, which, like Apple, has an AI smartphone. Meta and Google are both also pushing into smart AI glasses.

It’s not to be confused with OpenAI’s secretive wearable AI device, which is being made in conjunction with former Apple designer Jony Ive and is expected to debut in late 2026. The latest rumors suggest the unnamed device, meant to eventually compete with smartphones, might be earbuds.


Sherwood Media, LLC produces fresh and unique perspectives on topical financial news and is a fully owned subsidiary of Robinhood Markets, Inc., and any views expressed here do not necessarily reflect the views of any other Robinhood affiliate, including Robinhood Markets, Inc., Robinhood Financial LLC, Robinhood Securities, LLC, Robinhood Crypto, LLC, or Robinhood Money, LLC.