With WWDC on deck, Apple says “reasoning” AI models collapse with complexity
Apple tested state-of-the-art “chain of thought” models and found that they aren’t “reasoning,” but merely pattern matching, calling into question the direction the industry is taking.
Apple’s troubled AI rollout was plagued by a series of remarkable feature failures and product delays.
What was supposed to be the year of “Apple Intelligence” has failed to deliver an AI-enhanced Siri on par with voice assistants from competitors like Google, OpenAI, and Meta. This week, all eyes are on Apple as it holds its Worldwide Developers Conference (WWDC) to see what it’s planning to get back in the AI race.
But behind the scenes, researchers at Apple have been digging into the competition’s latest and greatest “reasoning” models to see how they respond to tricky challenges as they scale in complexity.
In a new paper, Apple’s researchers found that the leading state-of-the-art “chain of thought” models “face a complete accuracy collapse” when the researchers dialed up the complexity of puzzle-based tests. The spectacular failures of the models led the researchers to question their “reasoning” label, calling it instead “the illusion of thinking.”
The suite of tests included puzzles like “Tower of Hanoi,” in which the player must move a stack of disks of various sizes from one post to another, one disk at a time, moving only the top disk and never placing a larger disk on a smaller one.
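For reference, the optimal Tower of Hanoi solution is a textbook recursion whose move count grows exponentially with the number of disks (2ⁿ − 1 moves for n disks), which is why the puzzle is easy to scale in complexity. A minimal sketch (illustrative only, not code from Apple’s paper):

```python
def hanoi(n, source, target, spare, moves=None):
    """Return the optimal move list for n disks from source to target."""
    if moves is None:
        moves = []
    if n == 0:
        return moves
    hanoi(n - 1, source, spare, target, moves)  # move n-1 disks out of the way
    moves.append((source, target))              # move the largest disk
    hanoi(n - 1, spare, target, source, moves)  # restack n-1 disks on top of it
    return moves

print(len(hanoi(3, "A", "C", "B")))   # 2**3 - 1 = 7
print(len(hanoi(10, "A", "C", "B")))  # 2**10 - 1 = 1023
```

Each additional disk roughly doubles the number of required moves, so even a small bump in disk count yields a much longer solution for a model to track.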
While the models could solve the simplest versions of the puzzles, they fell apart once things got more complex. The research tested the reasoning models DeepSeek-R1, OpenAI’s o3-mini, and Anthropic’s Claude 3.7 Sonnet Thinking.
Chain of “thought”
After hitting performance plateaus from the “more data, more compute” approach, the industry followed OpenAI’s o1 release and started to build “chain of thought” reasoning models, which showed their “thought” processes.
This technique did boost the performance of large language models to new levels, offering a promising pathway out of what looked to be a computational dead end. Although these models required vastly more computation and time, the approach seemed to be the way forward.
Apple’s research seems to show that rather than reasoning, these models are merely performing sophisticated pattern matching.
Apple’s researchers also examined the “thought” traces behind each solution to better understand exactly how the models approached the puzzles.
The fact of the matter is that very little is known about how these recent models actually work. It remains to be seen if Apple has been cooking up an alternate approach, but reports indicate an AI-enhanced Siri isn’t likely to make a debut at this week’s WWDC.