[Image: close-up of blueberries (Sunlight7/Getty Images)]

GPT-5: “A legitimate PhD level expert in anything” that sucks at spelling and geography

OpenAI spent a lot of time talking about how “smart” GPT-5 is, yet it failed spectacularly at tasks a second grader could handle.

Jon Keegan

Yesterday, OpenAI spent a lot of time talking about how “smart” its new GPT-5 model was.

OpenAI cofounder and CEO Sam Altman said of GPT-5: “It’s like talking to an expert, a legitimate PhD level expert in anything,” and called it “the most powerful, the most smart, the fastest, the most reliable and the most robust reasoning model that we’ve shipped to date.”

Demos showed GPT-5 effortlessly creating an interactive simulation to explain the Bernoulli effect, diagnosing and fixing complex code errors, planning personalized running schedules, and “vibecoding” a cartoony 3D castle game. The company touted benchmarks showing how GPT-5 aced questions from a Harvard-MIT mathematics tournament and got high scores on coding, visual problem-solving, and other impressive tasks.

But once the public got its chance to kick GPT-5’s tires, some cracks started to emerge in this image of a superintelligent expert in everything.

AI models are famously bad at counting the letters in the names of different kinds of berries, and GPT-5 is no exception.

I had to try the “blueberry” thing myself with GPT5. I merely report the results.

— Kieran Healy (@kjhealy.co) August 7, 2025 at 8:04 PM

Another classic failure, being bad at maps, persisted with GPT-5 when it was asked to list all the states with the letter R in their names. It even offered to map them, with hilarious results:

My goto is to ask LLMs how many states have R in their name. They always fail. GPT 5 included Indiana, Illinois, and Texas in its list. It then asked me if I wanted an alphabetical highlighted map. Sure, why not.

— radams (@radamssmash.bsky.social) August 7, 2025 at 8:40 PM

To be fair to OpenAI, this problem isn’t unique to GPT-5, as similar failures were documented with Google’s Gemini.

During the livestream, in an embarrassing screwup, OpenAI’s own presentation included charts that appeared to have some of the same problems.

If this were all described as a “technical preview,” these kinds of mistakes might be understandable. But this is a real product from a company that’s pulling in $1 billion per month. OpenAI’s models are being sold to schools, hospitals, law firms, and government agencies, including for national security and defense.

OpenAI is also telling users that GPT-5 can be used for medical information, while cautioning that the model does not replace a medical professional.

“GPT‑5 is our best model yet for health-related questions, empowering users to be informed about and advocate for their health.”

Why is this so hard?

The reason such an advanced model can appear so capable at complex coding, math, and physics yet fail so spectacularly at spelling and maps is that generative models like GPT-5 are probabilistic systems at their core: they predict the most likely next token based on the volumes of data they have been trained on. They don’t know anything, and they don’t think or hold a model of how the world works (though researchers are working on that).
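
Here is what “predict the most likely next token” means in practice: the model scores every possible continuation, and the highest score wins, whether or not it happens to be true. The toy Python sketch below is only an illustration; the candidate answers and their scores are invented, and this is not how OpenAI’s actual model is built.

    import math

    def softmax(scores):
        """Turn raw scores into probabilities that sum to 1."""
        m = max(scores.values())
        exps = {tok: math.exp(s - m) for tok, s in scores.items()}
        total = sum(exps.values())
        return {tok: e / total for tok, e in exps.items()}

    # Hypothetical scores a model might assign to possible answers to
    # "How many b's are in 'blueberry'?" (values made up for this sketch)
    scores = {"2": 3.1, "3": 3.4, "two": 2.6, "several": 0.4}
    probs = softmax(scores)
    best = max(probs, key=probs.get)
    print(best, round(probs[best], 2))  # "3" wins on probability, not on truth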

When the model writes its response, you see whatever continuation scored highest. But with math and coding, the rules are stricter and the examples it’s been trained on are more consistent, so accuracy is higher and it can ace the math benchmarks.

But drawing a labeled map or counting the letters in a word is weirdly tough, because it requires a skill the model doesn’t really have and must approximate step by step from patterns, which can lead to odd results. That’s a simplification of a very complex and sophisticated system, but it applies to much of the generative-AI technology in use today.
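
Part of why letter-counting in particular goes wrong is tokenization: the model never sees individual characters, only sub-word chunks. A minimal sketch using the open-source tiktoken tokenizer makes the point; GPT-5’s exact tokenizer isn’t public, so the cl100k_base encoding here is only an assumption for illustration.

    import tiktoken  # pip install tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode("blueberry")
    pieces = [enc.decode_single_token_bytes(t).decode("utf-8") for t in tokens]
    print(pieces)  # a few sub-word chunks, not nine separate letters

    # For ordinary code, counting letters is trivial and deterministic...
    print("blueberry".count("b"))  # 2
    # ...but a token-predicting model has to infer the spelling of each chunk
    # from patterns in its training data, which is where wrong answers creep in.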

That’s also a whole lot to explain to users, but OpenAI boils those complicated ideas down to a single warning below the text box: “ChatGPT can make mistakes. Check important info.”

OpenAI did not immediately respond to a request for comment.

More Tech


OpenAI is now officially showing ads

Just a day after Anthropic’s Super Bowl ad aired, making fun of the concept of ad-backed AI chatbots, OpenAI began testing ads in ChatGPT for its free and Go subscription tiers.

In a blog post, OpenAI reiterated that ads wouldn’t affect ChatGPT’s responses and would be “clearly labeled as sponsored and visually separated from the organic answer.”

“Our goal is for ads to support broader access to more powerful ChatGPT features while maintaining the trust people place in ChatGPT for important and personal tasks,” the company wrote. “We’re starting with a test to learn, listen, and make sure we get the experience right.”

Advertising is one way the company, which is expected to go public late this year, could offset the massive cost of running its service.

The Information previously reported that OpenAI is aiming for ad spending commitments of less than $1 million per advertiser during the testing phase — far cheaper than a Super Bowl prime-time spot like Anthropic’s.

New study finds AI doesn’t reduce work — it intensifies it

The rapid adoption of AI by businesses was fueled by the promise of huge productivity boosts that could supercharge workers. A new study has found that while generative AI did indeed boost workers’ productivity, its use at work also made work more intense and let it creep into workers’ downtime.

Researchers Aruna Ranganathan and Xingqi Maggie Ye followed about 200 workers at a US tech company for eight months. They found that AI did speed up work, allowing employees to take on more responsibilities. But after the novelty of their newfound superpowers wore off, workers reported “cognitive fatigue, burnout, and weakened decision-making.”

The researchers noted that to avoid AI-inspired burnout and turnover, organizations should adopt an “AI practice,” spelling out how the technology is expected to be used and what kinds of limits are in place.

Report: Anthropic staffing up to build as much as 10 gigawatts’ worth of data centers

Anthropic has been hiring a team of executives with a very particular set of skills: building huge data centers. The Information is reporting that Anthropic may be planning to build up to 10 gigawatts of AI computing capacity over several years.

According to the report, Anthropic has hired several former Google executives with deep experience building data centers, which aligns with Anthropic’s heavy use of Google’s tensor processing units.

Ten gigawatts would be incredibly expensive. OpenAI executives reportedly have said that building a 1-gigawatt data center costs about $50 billion — putting the cost of 10 gigawatts in the ballpark of $500 billion. But Anthropic told investors it would spend only $180 billion on AI computing servers through 2029, per the report.

In November, Anthropic announced a deal with Fluidstack to build its first data centers, based in New York and Texas, investing $50 billion in the projects. Anthropic is racing alongside OpenAI to pull off an IPO later this year.

Report: OpenAI tells employees it is growing again, with Codex eating into Claude Code’s market share

The competition between OpenAI and Anthropic continues to intensify. Last night during the Super Bowl, a comedic Anthropic ad poked fun at OpenAI’s plans to add advertisements to ChatGPT, something Anthropic says it won’t do with its Claude chatbot. And both companies released new models last week with improved coding capabilities.

In case OpenAI employees were beginning to sweat from all the pressure, CEO Sam Altman sought to assure the team that the company has gotten its mojo back.

According to a new report from CNBC, Altman told employees in an internal Slack group that the company is “back to exceeding 10% monthly growth” and is seeing “insane” growth in its Codex coding tool.

A chart circulated among OpenAI employees shows that this new tool is winning market share from Claude Code, per a screenshot viewed by CNBC.

Per the report, Altman said another new model was coming this week. The company is reportedly working on what could end up being a $100 billion investment round.

Google plans $15 billion US bond sale as capex surges

Alphabet is preparing a roughly $15 billion US investment-grade bond sale, Bloomberg reports, citing people familiar with the deal. The offering is expected to be split into as many as seven tranches, with initial price talk for the longest maturity — a 2066 bond — at about 120 basis points over Treasurys. JPMorgan is leading the sale alongside Goldman Sachs and Bank of America.

In a sign of just how attractive lending money to Alphabet is to investors, the bond sale has already attracted more than $100 billion in orders.

The sale follows Google parent Alphabet’s $17.5 billion US bond deal in November and underscores how even tech companies flush with cash are turning to the bond market to finance their huge AI ambitions. Alphabet expects its capital spending to balloon to $175 billion to $185 billion this year, as it races other tech giants shelling out record sums to get ahead in artificial intelligence. In 2025, the company’s total operating income was $129 billion.

