Sam Altman - 11th Breakthrough Prize Ceremony - Arrivals

(Taylor Hill/FilmMagic)

AGENT FOR HIRE

OpenAI’s new “agent” is here to help you buy tuxedos, book luxury hotels, and order batches of cupcakes excruciatingly slowly

The new tool is trained to help execute real-world web-based tasks, but comes with significant risks and caveats.

7/18/25 9:29AM

“Agentic AI” is the latest buzzword in tech, as the industry furiously races to monetize the costly technology. Yesterday OpenAI announced its second offering in the emerging category: “ChatGPT Agent,” which comes on the heels of January’s browser-controlling “Operator” tool.

The promise of a superpowered AI agent is that with a simple prompt, your agent can toil away behind the scenes and take care of the busywork you can’t be bothered with. That could be boring tasks in your day-to-day life, chores, or the things you actually get paid to do for your job.

In a post on X, OpenAI cofounder and CEO Sam Altman said of the new technology:

watching chatgpt agent use a computer to do complex tasks has been a real "feel the agi" moment for me; something about seeing the computer think, plan, and execute hits different.
— Sam Altman (@sama) July 17, 2025

In the livestream announcing Agent, the OpenAI team demonstrated the tool tackling some of the boring tasks it could spare you from.

For example, you’re probably dreading all of the complex planning and research required to attend your dear friends’ wedding in Hawaii. What outfit should I buy? And what gift? Agent has you covered.

Perhaps Agent was trained to solve problems for people with AI researcher pay packages, but after thinking for about 18 minutes (with a break to ask the user a question), Agent came to the rescue to help find a $1,500 Brooks Brothers tuxedo and a nice hotel to stay at in Maui for $4,600 (five nights).

In a demo for Wired, an OpenAI researcher showed how Agent could order a batch of cupcakes — which took it one hour to complete.

The fact that a newly released, bleeding-edge AI tool like Agent takes so long to execute a query isn’t just about wasting your valuable time.

The kind of “reasoning” that Agent is undertaking while executing these queries is among the most computationally expensive of the services that OpenAI offers.

OpenAI describes the advanced “deep research” queries it offers, which the company says are part of Agent’s capabilities, as “very compute intensive.” OpenAI is currently losing money on its $200 per month all-you-can-eat plan for intensive queries.

That could also mean a significant amount of water and energy consumed for what would normally be very lightweight tasks when performed by a human.

The company said Agent would be rolling out to ChatGPT Plus, Pro, and Team users yesterday. Pro and Team subscribers get 400 Agent queries per month, and Plus users get 40 per month. (It wasn’t available for my account to try out before I wrote this.)

Striving to Excel

In addition to the ability to sift through your emails and calendars, OpenAI is playing up Agent’s ability to create and edit documents like slide decks and spreadsheets. The announcement highlighted Agent’s superior accuracy when editing spreadsheets “derived from real-world scenarios” compared to Microsoft Copilot using Excel: “ChatGPT agent outperforms existing models by a significant margin.”

Agent (with .xlsx access) scored a 45.5% accuracy rating, while Copilot in Excel scored only 20%. But ChatGPT Agent’s score is actually not great when you consider that humans scored 71.3%.
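To put those three numbers side by side, here is a minimal sketch in Python that works out the gaps. The scores are the ones quoted above; the `compare` helper and the dictionary labels are illustrative names, not anything from OpenAI's announcement.

```python
# Accuracy scores (percent) as quoted in the article.
scores = {
    "ChatGPT Agent (.xlsx access)": 45.5,
    "Copilot in Excel": 20.0,
    "Humans": 71.3,
}


def compare(scores, subject, baseline):
    """Return (gap in percentage points, ratio) of subject vs. baseline."""
    gap = scores[subject] - scores[baseline]
    ratio = scores[subject] / scores[baseline]
    return gap, ratio


agent = "ChatGPT Agent (.xlsx access)"
for baseline in ("Copilot in Excel", "Humans"):
    gap, ratio = compare(scores, agent, baseline)
    # A positive gap means Agent scored higher than the baseline.
    print(f"Agent vs. {baseline}: {gap:+.1f} points, {ratio:.2f}x the score")
```

On these figures, Agent sits about 25 points above Copilot and about 25 points below the human baseline, which is why the same result can be framed as either a big lead or a big shortfall.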

“New risk surface”

OpenAI has always offered frank assessments of the risks of its new tools, and Agent is no exception. Choosing to take a “precautionary approach,” OpenAI is treating Agent as having “High Biological and Chemical capabilities” according to its “Preparedness Framework.” That document describes this new “high” capability as:

“The model can provide meaningful counterfactual assistance (relative to unlimited access to baseline of tools available in 2021) to ‘novice’ actors (anyone with a basic relevant technical background) that enables them to create known biological or chemical threats.”

The “associated risk” of this threshold:

“Significantly increased likelihood and frequency of biological or chemical terror events by non-state actors using known reference-class threats.”

In an email to Sherwood News, OpenAI spokesperson Niko Felix explained that Agent mode is not the default model and users are free to choose to use it at any time. Felix said Agent is trained to explicitly ask users for permission before tasks with real-world consequences, such as online purchases. And for now, Agent is trained to refuse high-risk tasks like banking.

Felix also cited a caveat in the announcement:

“While ChatGPT agent is already a powerful tool for handling complex tasks, today’s launch is just the beginning. We’ll continue to iteratively add significant improvements regularly, making it more capable and useful to more people over time.”

One of the concerns that OpenAI red teams had was the risk of novel “prompt injections” for Agent that could trick users into sharing personal data or taking actions that they shouldn’t.

In a post on X, Altman said he would caution his own family against using Agent for “high-stakes uses” until the company has had a chance to study it in the wild.

“But we do want people to treat this as a new technology and a new risk surface,” Altman said in the livestream video. “But that said, we hope you’ll love it.”

Jon Keegan

39m

Here’s what to look for in today’s Microsoft earnings report

All eyes will be on Microsoft’s Azure cloud revenue growth, and just how much it’s increasing its capital expenditure.

Millie Giles

INSTA PLUS MAX

Meta is testing out premium subscriptions on Instagram, Facebook, and WhatsApp

Ahead of its earnings, expected after the bell today, Meta has announced plans to trial a paid tier on its apps.

Meta Apps - Facebook, WhatsApp, Instagram, and Threads

Rani Molla 7h

Amazon cuts another 16,000 roles after laying off 14,000 workers in October

Amazon announced Wednesday that it’s cutting 16,000 roles across the company, having laid off 14,000 workers only three months ago.

“As I shared in October, we’ve been working to strengthen our organization by reducing layers, increasing ownership, and removing bureaucracy,” Senior Vice President of People Experience and Technology Beth Galetti wrote in the press release. “While many teams finalized their organizational changes in October, other teams did not complete that work until now.”

CEO Andy Jassy previously said that the October layoffs were “about culture” rather than AI-related cost cutting. Galetti says layoffs, now totaling 30,000, won’t become a regular occurrence.

“Some of you might ask if this is the beginning of a new rhythm — where we announce broad reductions every few months. That’s not our plan.”

Update on our organization

Jon Keegan 22h

Anthropic reportedly doubles current fundraising round to $20 billion

Anthropic has doubled its current fundraising round to $20 billion on strong investor demand, according to reporting from the Financial Times. The new fundraising round would value the company at a staggering $350 billion. That’s up 91% from September, when it raised at a valuation of $183 billion.

The company reportedly received interest totaling 5x to 6x its original $10 billion fundraising goal, and it’s expected to haul in several billion more than that tally before the current round closes.
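The arithmetic behind those figures is easy to check. A quick sketch in Python, using only the numbers reported above (the function and variable names are illustrative):

```python
def percent_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


# Valuations in billions of dollars, as reported.
september_valuation = 183
new_valuation = 350
print(f"Valuation change: {percent_change(september_valuation, new_valuation):.0f}%")

# Reported investor interest: 5x to 6x the original $10 billion goal.
original_goal = 10
low, high = 5 * original_goal, 6 * original_goal
print(f"Implied investor interest: ${low} billion to ${high} billion")
```

The jump from $183 billion to $350 billion works out to roughly 91%, matching the reported figure, and the 5x to 6x multiple implies somewhere between $50 billion and $60 billion of interest.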

Anthropic’s success with enterprise customers and the popularity of its Claude Code product are boosting the company’s momentum as it chases the current valuation leader of the AI startup pack: OpenAI.

Anthropic doubles VC fundraising to $20bn on surging investor demand

Produce At Whole Foods Market's Flagship Store

Amazon says it’s doubling down on opening Whole Foods stores. That sounds familiar.

The company says it’ll open 100 Whole Foods locations in the next few years. That sounds similar to plans Whole Foods’ CEO laid out in 2024 for opening 30 stores a year. Since then, it appears to have added 14 in total.

Rani Molla 23h