Tech

[Image: close-up of blueberries (Sunlight7/Getty Images)]

GPT-5: “A legitimate PhD level expert in anything” that sucks at spelling and geography

OpenAI spent a lot of time talking about how “smart” GPT-5 is, yet it failed spectacularly at tasks a second grader could handle.

Jon Keegan

Yesterday, OpenAI spent a lot of time talking about how “smart” its new GPT-5 model was.

OpenAI cofounder and CEO Sam Altman said of GPT-5: “It’s like talking to an expert, a legitimate PhD level expert in anything,” and called it “the most powerful, the most smart, the fastest, the most reliable and the most robust reasoning model that we’ve shipped to date.”

Demos showed GPT-5 effortlessly creating an interactive simulation to explain the Bernoulli effect, diagnosing and fixing complex code errors, planning personalized running schedules, and “vibecoding” a cartoony 3D castle game. The company touted benchmarks showing how GPT-5 aced questions from a Harvard-MIT mathematics tournament and got high scores on coding, visual problem-solving, and other impressive tasks.

But once the public got its chance to kick GPT-5’s tires, some cracks started to emerge in this image of a superintelligent expert in everything.

AI models are famously bad at counting the letters in berry names like “strawberry,” and GPT-5 is no exception.

I had to try the “blueberry” thing myself with GPT5. I merely report the results.

[image or embed]

— Kieran Healy (@kjhealy.co) August 7, 2025 at 8:04 PM

Another classic failure, geographic confusion, persisted with GPT-5 when it was asked to list all the states with the letter R in their names. It even offered to map them, with hilarious results:

My goto is to ask LLMs how many states have R in their name. They always fail. GPT 5 included Indiana, Illinois, and Texas in its list. It then asked me if I wanted an alphabetical highlighted map. Sure, why not.

[image or embed]

— radams (@radamssmash.bsky.social) August 7, 2025 at 8:40 PM

To be fair to OpenAI, this problem isn’t unique to GPT-5, as similar failures were documented with Google’s Gemini.
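For contrast, the R-in-state-names question is trivial for deterministic code. A minimal sketch (the full state list is simply hard-coded):

```python
# All 50 U.S. state names.
STATES = [
    "Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado",
    "Connecticut", "Delaware", "Florida", "Georgia", "Hawaii", "Idaho",
    "Illinois", "Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana",
    "Maine", "Maryland", "Massachusetts", "Michigan", "Minnesota",
    "Mississippi", "Missouri", "Montana", "Nebraska", "Nevada",
    "New Hampshire", "New Jersey", "New Mexico", "New York",
    "North Carolina", "North Dakota", "Ohio", "Oklahoma", "Oregon",
    "Pennsylvania", "Rhode Island", "South Carolina", "South Dakota",
    "Tennessee", "Texas", "Utah", "Vermont", "Virginia", "Washington",
    "West Virginia", "Wisconsin", "Wyoming",
]

# A simple case-insensitive substring check settles the question exactly.
# Indiana, Illinois, and Texas (which GPT-5 listed) contain no letter R.
with_r = sorted(s for s in STATES if "r" in s.lower())
print(len(with_r), with_r)
```

A character-level check like this is exactly the kind of computation a next-token predictor does not perform.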

In an embarrassing screwup during the launch livestream, OpenAI’s own presentation included charts that appeared to have some of the same problems.

If this were all described as a “technical preview,” these kinds of mistakes might be understandable. But this is a real product from a company that’s pulling in $1 billion per month. OpenAI’s models are being sold to schools, hospitals, law firms, and government agencies, including for national security and defense.

OpenAI is also telling users that GPT-5 can be used for medical information, while cautioning that the model does not replace a medical professional.

“GPT‑5 is our best model yet for health-related questions, empowering users to be informed about and advocate for their health.”

Why is this so hard?

The reason such an advanced model can appear so capable at complex coding, math, and physics yet fail so spectacularly at spelling and maps is that generative models like GPT-5 are probabilistic systems at their core: they predict the most likely next token based on the volumes of data they have been trained on. They don’t know anything, and they don’t think or have any model of how the world works (though researchers are working on that).

When the model writes its response, it emits whichever token scores highest as the most likely continuation. With math and coding, though, the rules are stricter and the examples it’s been trained on are consistent, so its accuracy is higher and it can ace the math benchmarks.

But drawing a map with names or counting the letters in a word is weirdly tough: each requires a skill the model doesn’t really have and must piece together step by step from patterns, which can lead to odd results. That’s a simplification of a very complex and sophisticated system, but it applies to a lot of the generative-AI technology in use today.
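A toy sketch of the mismatch (the subword split and token IDs below are hypothetical, purely for illustration; real tokenizers choose their own boundaries): ordinary code counts characters directly, while a model operates on opaque token IDs.

```python
# Hypothetical subword split of "blueberry" -- real BPE tokenizers
# may split differently; the IDs are made up for illustration.
toy_tokens = ["blue", "berry"]
toy_ids = [1042, 7317]  # the model "sees" only IDs like these

# For ordinary code, counting letters is a one-liner over characters:
word = "".join(toy_tokens)
print(word.count("b"))  # character-level count of "b" in "blueberry"

# The model, by contrast, predicts the next ID from the previous IDs.
# Nothing in [1042, 7317] exposes the characters inside each token,
# so "how many b's?" must be recalled statistically, not computed.
```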

That’s also a whole lot to explain to users, but OpenAI boils those complicated ideas down to a single warning below the text box: “ChatGPT can make mistakes. Check important info.”

OpenAI did not immediately respond to a request for comment.

More Tech


Meta launches federal super PAC to fight state AI policy proposals

Meta has launched a federal super PAC called the American Technology Excellence Project, spending “tens of millions” of dollars to fight what it considers “onerous AI and tech policy bills across the country,” Axios reports. Last month, Meta launched a California super PAC to back pro-AI candidates in the state.

Silicon Valley in general has been rallying behind pro-AI PACs, seeking to fight proposals like Senator Mark Kelly’s, which would force AI companies to foot some of the bill for the societal ills they cause.


Wedbush: Nvidia investment in OpenAI is a “watershed moment”

Wedbush Securities analyst Dan Ives thinks Nvidia’s $100 billion investment in OpenAI says a lot of things about the importance of the moment we’re in. It’s a “watershed moment,” a “Ryder Cup moment,” and a “validation sign that the AI Arms Race is heating up among Big Tech firms.” In a note this morning, Ives wrote:

“We believe the AI Revolution is now heading into its next stage of growth as the tidal wave of Big Tech capex spending coupled by enterprise use cases now exploding across verticals is creating a number of AI winners in the tech world. The last few months we have seen a major validation moment for our AI Revolution bull thesis as the cloud stalwarts Microsoft, Amazon, and Google are leading the charge on this unprecedented spending cycle. Nvidia’s recent robust earnings and demand commentary from the Godfather of AI Jensen speaks to the evolution of AI spend now spreading beyond Big Tech to governments, enterprises, energy capacity, and overall infrastructure build outs around the globe.”

He does not consider it a bubble — or at least not yet. “While there are worries about an ‘AI Bubble’ and stretched valuations we continue to view this as a 1996 Moment for the Tech World and NOT a 1999 Moment,” Ives wrote, suggesting the situation is more like the early days of the internet, when there was a lot of investment in internet companies and a lot of experimentation — and when the dot-com bubble bursting was still a few years off.


If having multiple CEOs is better for stock market returns, Oracle is quadrupling down

But buyer beware: the last time Oracle had co-CEOs, shares underperformed.

Rani Molla

Ives raises Apple price target to Wall Street high of $310, citing a “real upgrade cycle” for iPhones

Wedbush Securities analyst Dan Ives raised his Apple price target to $310 from $270 thanks to “early strong demand signs” for the iPhone 17, which he says is tracking 10% to 15% ahead of the iPhone 16 at this point.

That $310 price target is the highest among Wall Street analysts polled by Bloomberg.

Ives said the Street’s estimate of about 230 million iPhone unit sales for Apple’s upcoming fiscal year is conservative and instead thinks the company is on track to sell 240 million to 250 million units in FY26. Ives wrote:

“The combination of a pent-up consumer upgrade cycle with our estimates of 315 million of 1.5 billion iPhones globally not upgrading their iPhones in the last 4 years, coupled with some design changes/enhancements have been the magical formula out of the gates.”

Sherwood News reported last week that redesigned iPhone models, which went on sale Friday, are seeing more interest than they have in three years — a phenomenon we speculate might have less to do with the iPhone itself and more to do with a natural upgrade cycle, as the rush of phones purchased in 2020 and 2021 become obsolete.


Sherwood Media, LLC produces fresh and unique perspectives on topical financial news and is a fully owned subsidiary of Robinhood Markets, Inc., and any views expressed here do not necessarily reflect the views of any other Robinhood affiliate, including Robinhood Markets, Inc., Robinhood Financial LLC, Robinhood Securities, LLC, Robinhood Crypto, LLC, or Robinhood Money, LLC.