Tech
(Blueberries. Sunlight7/Getty Images)

GPT-5: “A legitimate PhD level expert in anything” that sucks at spelling and geography

OpenAI spent a lot of time talking about how “smart” GPT-5 is, yet it failed spectacularly at tasks a second grader could handle.

Jon Keegan

Yesterday, OpenAI spent a lot of time talking about how “smart” its new GPT-5 model was.

OpenAI cofounder and CEO Sam Altman said of GPT-5: “It’s like talking to an expert, a legitimate PhD level expert in anything,” and called it “the most powerful, the most smart, the fastest, the most reliable and the most robust reasoning model that we’ve shipped to date.”

Demos showed GPT-5 effortlessly creating an interactive simulation to explain the Bernoulli effect, diagnosing and fixing complex code errors, planning personalized running schedules, and “vibecoding” a cartoony 3D castle game. The company touted benchmarks showing how GPT-5 aced questions from a Harvard-MIT mathematics tournament and got high scores on coding, visual problem-solving, and other impressive tasks.

But once the public got its chance to kick GPT-5’s tires, some cracks started to emerge in this image of a superintelligent expert in everything.

AI models are famously bad at counting the letters in words like “blueberry,” and GPT-5 is no exception.

I had to try the “blueberry” thing myself with GPT5. I merely report the results.


— Kieran Healy (@kjhealy.co) August 7, 2025 at 8:04 PM

Another classic failure of being bad at maps persisted with GPT-5 when it was asked to list all the states with the letter R in their names. It even offered to map them, with hilarious results:

My goto is to ask LLMs how many states have R in their name. They always fail. GPT 5 included Indiana, Illinois, and Texas in its list. It then asked me if I wanted an alphabetical highlighted map. Sure, why not.


— radams (@radamssmash.bsky.social) August 7, 2025 at 8:40 PM
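For reference, the question that tripped up GPT-5 takes a few lines of ordinary code, because a program can inspect the actual characters in each name. A minimal sketch:

```python
# Character-level check of the question GPT-5 flubbed: which US states
# contain the letter "r"? Plain string operations make this trivial.
STATES = [
    "Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado",
    "Connecticut", "Delaware", "Florida", "Georgia", "Hawaii", "Idaho",
    "Illinois", "Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana",
    "Maine", "Maryland", "Massachusetts", "Michigan", "Minnesota",
    "Mississippi", "Missouri", "Montana", "Nebraska", "Nevada",
    "New Hampshire", "New Jersey", "New Mexico", "New York",
    "North Carolina", "North Dakota", "Ohio", "Oklahoma", "Oregon",
    "Pennsylvania", "Rhode Island", "South Carolina", "South Dakota",
    "Tennessee", "Texas", "Utah", "Vermont", "Virginia", "Washington",
    "West Virginia", "Wisconsin", "Wyoming",
]

# Case-insensitive membership test, one state at a time.
with_r = [s for s in STATES if "r" in s.lower()]

print(len(with_r))  # 21
# None of the states GPT-5 wrongly included actually contain an "r":
print("Indiana" in with_r, "Illinois" in with_r, "Texas" in with_r)  # False False False
```

The correct answer is 21 states, and Indiana, Illinois, and Texas are not among them.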

To be fair to OpenAI, this problem isn’t unique to GPT-5, as similar failures were documented with Google’s Gemini.

During the livestream, in an embarrassing screwup, OpenAI’s own presentation included charts that appeared to suffer from some of the same problems.

If this were all described as a “technical preview,” these kinds of mistakes might be understandable. But this is a real product from a company that’s pulling in $1 billion per month. OpenAI’s models are being sold to schools, hospitals, law firms, and government agencies, including for national security and defense.

OpenAI is also telling users that GPT-5 can be used for medical information, while cautioning that the model does not replace a medical professional.

“GPT‑5 is our best model yet for health-related questions, empowering users to be informed about and advocate for their health.”

Why is this so hard?

The reason such an advanced model can appear so capable at complex coding, math, and physics yet fail so spectacularly at spelling and maps is that generative models like GPT-5 are probabilistic systems at their core: they predict the most likely next token based on the volumes of data they have been trained on. They don’t know anything, and they don’t think or maintain any model of how the world works (though researchers are working on that).

When the model writes its response, you see whatever scored highest as the most likely continuation. With math and coding, the rules are stricter and the training examples more consistent, so the model is more accurate and can ace the math benchmarks.

But drawing a map with names or counting the letters in a word is weirdly tough: it requires a skill the model doesn’t really have and must approximate step by step from patterns, which can lead to odd results. That’s a simplification of a very complex and sophisticated system, but it applies to much of the generative-AI technology in use today.
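Part of the problem is that these models don’t see individual letters at all; text is chopped into subword tokens before the model ever processes it. The toy tokenizer below (a simplified illustration with a made-up vocabulary, not OpenAI’s actual tokenizer) shows why: once “blueberry” becomes two opaque chunks, counting its b’s is no longer a simple lookup.

```python
# Toy illustration of subword tokenization. The vocabulary is hypothetical;
# real models use learned vocabularies with tens of thousands of pieces.
VOCAB = ["blue", "berry", "straw", "rasp"]

def tokenize(word, vocab):
    """Greedy longest-match subword split, purely for illustration."""
    tokens = []
    i = 0
    while i < len(word):
        for piece in sorted(vocab, key=len, reverse=True):
            if word.startswith(piece, i):
                tokens.append(piece)
                i += len(piece)
                break
        else:
            tokens.append(word[i])  # fall back to a single character
            i += 1
    return tokens

# The model sees two tokens, not nine letters:
print(tokenize("blueberry", VOCAB))  # ['blue', 'berry']

# With direct character access, the count is trivial:
print("blueberry".count("b"))  # 2
```

A program counting characters gets 2 every time; a model reasoning over the tokens `blue` and `berry` has to recall how each chunk is spelled, which is where the errors creep in.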

That’s also a whole lot to explain to users, but OpenAI boils those complicated ideas down to a single warning below the text box: “ChatGPT can make mistakes. Check important info.”

OpenAI did not immediately respond to a request for comment.

More Tech

tech
Jon Keegan

Judge blocks Pentagon’s move to blacklist Anthropic

A federal judge in Northern California has granted a preliminary injunction blocking the Pentagon from labeling Anthropic as a national security supply chain risk.

The ruling temporarily prevents the Defense Department from restricting the AI company’s access to federal contracts amid a dispute over its refusal to allow certain military and surveillance uses of its technology. The designation could also have shifted lucrative government work toward competitors, including OpenAI.

Earlier this month, Anthropic, the company behind Claude, sued 17 federal agencies and their heads, alleging the government exceeded its statutory authority.

tech
Rani Molla

Report: SpaceX’s record IPO may grant preferential access to retail investors and Tesla shareholders

SpaceX’s impending IPO could raise $40 billion to $80 billion and rank as the largest ever — as well as one of the most unconventional.

The Wall Street Journal reports several ways CEO Elon Musk is considering breaking with IPO norms:

  • Investors in his other companies, including Tesla, could receive preferential access to shares.

  • Individual investors may get a third or more of the allocation, far above the typical ~10% mark.

  • Instead of a traditional road show, Musk wants investors to visit SpaceX facilities in person.


tech
Rani Molla

Tesla released estimates for Q1 deliveries and they’re lower than analysts expected

Ahead of first-quarter earnings next month, Tesla released its own company-compiled Wall Street consensus estimate for deliveries: 365,645 vehicles. While that’s lower than the 382,000 FactSet consensus estimate, it represents a nearly 9% jump from Q1 2025, when Tesla sold 336,681 vehicles.

Tesla started releasing its own consensus estimates to the public — not just institutional investors — for the first time in Q4 2025. The move was seen as a way to temper investor expectations, as other estimates were too high. Last quarter, Tesla’s compilation was closer to actual numbers, which fell 16% year over year.

The market-implied odds from event contracts suggest 64% of traders think Tesla’s Q1 deliveries will be more than 350,000, 44% think it will be higher than 360,000, and just 21% have it at higher than 370,000.

(Event contracts are offered through Robinhood Derivatives, LLC — probabilities referenced or sourced from KalshiEx LLC or ForecastEx LLC.)

ARC-AGI-3

The toughest AI benchmark just got a whole lot tougher

ARC-AGI-3 is the latest version of a clever benchmark that challenges AI models to solve mini video games with no written instructions.

Jon Keegan · 3/26/26


Sherwood Media, LLC produces fresh and unique perspectives on topical financial news and is a fully owned subsidiary of Robinhood Markets, Inc., and any views expressed here do not necessarily reflect the views of any other Robinhood affiliate, including Robinhood Markets, Inc., Robinhood Financial LLC, Robinhood Securities, LLC, Robinhood Crypto, LLC, Robinhood Derivatives, LLC, or Robinhood Money, LLC. Futures and event contracts are offered through Robinhood Derivatives, LLC.