Tech
A screenshot of OpenAI’s “Operator” agent (OpenAI)
SMOOTH OPERATOR

OpenAI’s “Operator” is here to slowly take over your computer and mess up your life

Operator made a consequential mistake 13% of the time in early testing, such as emailing the wrong person or messing up a reminder for a person to take medication.

Jon Keegan

OpenAI released a “research preview” of its AI agent that can control your web browser. Called “Operator,” it has the ability to control your mouse and keyboard and analyze things it “sees” on your computer — very, very slowly. Currently it’s only available to ChatGPT Pro users in the US.

Operator uses the multistep “reasoning” found in OpenAI’s o1 model and the multimodal “vision” capabilities of GPT-4o. This reasoning process achieves better (but slower) performance by breaking tasks into steps. Lots and lots of steps.

In the video demonstrations shared on the product page, you can watch Operator break a task down into dozens of distinct actions like “clicking,” “typing,” and “scrolling.” One example took 152 steps to complete a grammar quiz; another took 146 steps to determine the amount of a refund from a canceled online order.

A screenshot from a demo of OpenAI’s Operator (OpenAI)

OpenAI positions this kind of freewheeling, on-demand AI web browsing as an agent that can save you the drudgery of ordering groceries, researching holidays, making restaurant reservations, or buying concert tickets.

Operator makes high-stakes mistakes

It’s one thing when ChatGPT spits out an incorrect answer, but if your chatbot is actually spending your money and triggering things in the real world, the stakes are much, much higher.

In one test of 100 sample tasks, OpenAI found that Operator made a consequential mistake 13% of the time, such as emailing the wrong person, incorrectly bulk-removing email labels, setting the wrong date for a reminder to take the user’s medication, or ordering the wrong food item. Some of the other mistakes were easily reversible “nuisances.” OpenAI noted that, after mitigations, it reduced this error rate by approximately 90%.
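To put those two figures together, here is a rough back-of-the-envelope sketch in Python. It assumes the roughly 90% reduction applies directly to the 13% consequential-mistake rate, which is one reading of OpenAI’s statement rather than a figure OpenAI published. The function name `residual_rate` is made up for illustration.

```python
def residual_rate(baseline: float, reduction: float) -> float:
    """Return the error rate left after cutting `baseline` by the fraction `reduction`."""
    return baseline * (1.0 - reduction)


# Figures quoted in the article; how they combine is an assumption.
baseline = 0.13   # consequential mistakes in the 100-task test
reduction = 0.90  # "approximately 90%" reduction after mitigations

remaining = residual_rate(baseline, reduction)
print(f"Estimated remaining rate: {remaining:.1%}")
```

Under that assumption, the remaining rate works out to a little over one consequential mistake per 100 tasks.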

OpenAI stresses that you have the ability to grab the wheel from the AI at any time, and you can approve any action before it is executed, but in this early evaluation version, you’ll probably have to spend more time babysitting the agent than just going ahead and doing the task on your own.

For now, OpenAI limits the tasks you can use Operator for, prohibiting banking and job applications.

OpenAI shared a list of example tasks that some hypothetical user might want an AI to do for them. Ten out of ten times Operator was able to research bear habitats, create a grocery list, and make a ’90s playlist on Spotify.

Medium persuasion

The system card for the model behind Operator — Computer-Using Agent (CUA) — describes the process OpenAI used to assess the risks of letting a prerelease, novel AI agent go hog wild with your computer.

As with other model releases, OpenAI tested the model using red teams with expertise in social engineering, CBRN (chemical, biological, radiological, and nuclear) threats, and cybersecurity. OpenAI gave the model a “low” risk rating in every category except “persuasion,” which got a “medium” risk score, a level OpenAI considers safe enough for public release.

High consequence

But there are some important restrictions on how you can use Operator. Because there is a slightly elevated risk of Operator being used to influence people, the usage policy prohibits impersonating people or organizations, concealing the role of AI in tasks, or using it to spread disinformation or create false interactions, like fake reviews or fake profiles.

OpenAI prohibits people from using Operator to commit any crimes, and you are also prohibited from using it to bully, harass, defame, or discriminate against others based on protected attributes.

Under a heading titled “high consequence domains,” the policy notes that you can’t use Operator to make “high-stakes decisions” that might affect your safety or well-being, to automate stock trading, or for political campaigning or lobbying.

OpenAI’s announcement follows competitor Anthropic’s October release of a similar feature that can control your computer. There is widespread hype that “agentic AI” like Operator will be a breakthrough for how people use these tools.

OpenAI CEO Sam Altman said in an announcement video that Operator is expected to roll out to international ChatGPT Pro and ChatGPT Plus users “soon,” but noted that the European rollout “will unfortunately take a while.”

More Tech

tech

Elon Musk says Tesla Robotaxis are operating without drivers, sending stock higher

Tesla CEO Elon Musk said that Tesla’s Robotaxis are now operating in Austin without a safety monitor. Tesla has been testing driverless cars in the area for about a month, and Musk had previously said the company would remove safety drivers by the end of 2025.

It’s unclear exactly how many of the roughly 50 Robotaxis the company operates in the area don’t have drivers. “Starting with a few unsupervised vehicles mixed in with the broader robotaxi fleet with safety monitors, and the ratio will increase over time,” Tesla head of AI Ashok Elluswamy posted shortly after Musk. Ethan McKenna, the person behind Robotaxi Tracker, estimates it’s two or three vehicles.

What is clear is that the move is good for Tesla’s stock, which is currently up 3.5%, extending its gains after Musk’s tweet. Morgan Stanley said yesterday that it considers the removal of safety drivers a “precursor to personal unsupervised FSD rollout.” Unsupervised FSD is widely considered to be integral to the would-be autonomous company’s value proposition.

At Davos earlier on Thursday, Musk said, "self-driving cars is essentially a solved problem at this point."

tech

Survey: CEOs and workers have wildly different thoughts on AI productivity gains

One of the main reasons companies are rushing to adopt AI is to give their workers the miraculous productivity boost that AI companies have been promising, a boost they believe will quickly earn back their investment.

But now that companies have been using AI for a while, a growing perception gap is emerging between the C-suite and their employees.

The Wall Street Journal reported on new findings by research firm Section, which surveyed 5,000 white-collar workers from companies with more than 1,000 employees.

More than 70% of the corporate executives in the survey said they were “excited” by AI, and 19% of them said the tools have saved them more than 12 hours of work per week.

But nonmanagement workers had a very different take on AI. Almost 70% of this group said AI made them feel “anxious or overwhelmed,” and 40% said the tools saved them no time at all.

tech

Tesla jumps as Musk says he expects Optimus sales next year, European and Chinese FSD approval next month

Tesla CEO Elon Musk now says he thinks the company’s Optimus robots will be for sale to the public “by the end of next year.”

According to Musk, “That’s when we are confident that there is very high reliability, very high safety, and the range of functionality is also very high.”

Like many of Musk’s other timelines, that’s later than he previously predicted. In 2024, for example, Musk said the AI robots would be for sale in 2025.

Speaking with BlackRock CEO Larry Fink on a panel today at the World Economic Forum, Musk said the robots are currently doing “simple tasks” in Tesla factories, but he believes “they’ll be doing more complex tasks and be deployed in an industrial environment” by the end of this year, before going on sale to the public in 2027.

Musk forecasts a future with “billions” of AI robots that “saturate all human needs.”

On a separate topic, Musk was bullish on regulatory approval for what Tesla calls Full Self-Driving technology in markets outside the US. “We hope to get supervised Full Self-Driving approval in Europe, hopefully next month, and then maybe a similar timing for China,” he said. Musk has said in the past that the pending regulatory approval for FSD in Europe is a key reason why Tesla’s sales in the region have been tanking.

tech

Waymo is now offering autonomous rides in Miami

Google subsidiary Waymo announced Thursday that it’s officially open for autonomous ride-hailing in Miami, expanding the company’s coverage area to six US cities. The company will be “inviting new riders on a rolling basis” to take rides across its 60-square-mile service area, which includes the Design District, Wynwood, Brickell, and Coral Gables. Waymo said it plans to expand to Miami International Airport “soon.”

Competitor Tesla currently operates a ride-hailing service with a safety monitor in the vehicle in Austin and the Bay Area.

tech

Apple to promote Siri from assistant to chatbot

Bloomberg reports that Apple plans to transform its Siri assistant into a full-fledged chatbot similar to OpenAI’s ChatGPT.

The chatbot would be integrated throughout the iPhone’s operating system rather than offered as a stand-alone app. It’s expected to arrive later this year and would be separate from more incremental, non-chatbot improvements to Siri rolling out in the coming months aimed at making the existing assistant more usable.

Both updates will be powered by Google’s AI models, Bloomberg reports, but the chatbot upgrade will be more advanced and akin to the much-lauded Gemini 3.

While the difference between an assistant and a chatbot may sound subtle, it represents a meaningful shift for Apple, which has long avoided a fully conversational interface and has lagged rivals that embraced one. Any new Siri chat capabilities could also eventually extend to other Apple devices under development, including wearables such as the pin Apple is developing.

Sherwood Media, LLC produces fresh and unique perspectives on topical financial news and is a fully owned subsidiary of Robinhood Markets, Inc., and any views expressed here do not necessarily reflect the views of any other Robinhood affiliate, including Robinhood Markets, Inc., Robinhood Financial LLC, Robinhood Securities, LLC, Robinhood Crypto, LLC, or Robinhood Money, LLC.