Tech
tech
Jon Keegan

Quit the yapping: New AI technique could cut costs 90% by saying less

A consensus is emerging in AI circles that the way forward involves models that use “chain of reasoning” to get better performance, at the expense of costlier computing resources. This process involves instructing the model to break a problem down into detailed step-by-step instructions. The problem is that these steps can be pretty verbose, and when it comes to AI, more words = more cost.

A new paper from researchers at Zoom shows that using a new technique dubbed “chain of draft,” if you tell a model to simply limit those steps to succinct “drafts” of only five words or so, rather than wordy sentences, not only can you still achieve high performance on responses, but you can cut computing costs by up to 90%.

AI models are priced by the number of “tokens” — or portions of words — that are input and output by the model. For example: OpenAI’s o3-mini “reasoning” model costs $1.10 per million tokens input, and $4.40 per million tokens of output. That may seem cheap, but when you’re processing millions of queries, this can really add up.

“By reducing verbosity and focusing on critical insights, CoD matches or surpasses CoT in accuracy while using as little as only 7.6% of the tokens, significantly reducing cost and latency across various reasoning tasks,” the paper reports.

Translation: it’s faster, cheaper, and sometimes better than chain of thought.

This approach is also notable for its ease of use. You can simply change the prompts you enter to get this benefit. That said, most of the gains were found using larger models like OpenAI’s GPT-4o and Anthropic’s Claude 3.5 Sonnet, while using smaller models resulted in poorer performance.

Go deeper: Here are OpenAI’s 50 Laws of Robotics

A new paper from researchers at Zoom shows that using a new technique dubbed “chain of draft,” if you tell a model to simply limit those steps to succinct “drafts” of only five words or so, rather than wordy sentences, not only can you still achieve high performance on responses, but you can cut computing costs by up to 90%.

AI models are priced by the number of “tokens” — or portions of words — that are input and output by the model. For example: OpenAI’s o3-mini “reasoning” model costs $1.10 per million tokens input, and $4.40 per million tokens of output. That may seem cheap, but when you’re processing millions of queries, this can really add up.

“By reducing verbosity and focusing on critical insights, CoD matches or surpasses CoT in accuracy while using as little as only 7.6% of the tokens, significantly reducing cost and latency across various reasoning tasks,” the paper reports.

Translation: it’s faster, cheaper, and sometimes better than chain of thought.

This approach is also notable for its ease of use. You can simply change the prompts you enter to get this benefit. That said, most of the gains were found using larger models like OpenAI’s GPT-4o and Anthropic’s Claude 3.5 Sonnet, while using smaller models resulted in poorer performance.

Go deeper: Here are OpenAI’s 50 Laws of Robotics

More Tech

See all Tech
Form Energy iron-air battery system leaving Form Factory 1

Big batteries are the newest answer to Big Tech’s big energy needs

America’s booming energy demand is creating a powerful case for large-scale energy storage.

Patrick Sisson4h
Astronaut on the Moon

Over 50 years since it last sent astronauts to the moon, the US is now reentering a very different space race

The successful launch of the Artemis II lunar flyby marked one small step for NASA, while China’s already making giant leaps in its own space program.

tech
Jon Keegan

Judge blocks Pentagon’s move to blacklist Anthropic

A federal judge in Northern California has granted a preliminary injunction blocking the Pentagon from labeling Anthropic as a national security supply chain risk.

The ruling temporarily prevents the Defense Department from restricting the AI company’s access to federal contracts amid a dispute over its refusal to allow certain military and surveillance uses of its technology. The designation could also have shifted lucrative government work toward competitors, including OpenAI.

Earlier this month, Anthropic, the company behind Claude, sued 17 federal agencies and their heads, alleging the government exceeded its statutory authority.

Latest Stories

Sherwood Media, LLC produces fresh and unique perspectives on topical financial news and is a fully owned subsidiary of Robinhood Markets, Inc., and any views expressed here do not necessarily reflect the views of any other Robinhood affiliate, including Robinhood Markets, Inc., Robinhood Financial LLC, Robinhood Securities, LLC, Robinhood Crypto, LLC, Robinhood Derivatives, LLC, or Robinhood Money, LLC. Futures and event contracts are offered through Robinhood Derivatives, LLC.