ARC-AGI-3 (Arc Prize)

The toughest AI benchmark just got a whole lot tougher

ARC-AGI-3 is the latest version of a clever benchmark that challenges AI models to solve mini video games with no written instructions.

The flood of new AI models with increasingly advanced “reasoning” capabilities is forcing the AI industry to abandon early benchmark tests and invent new ones that probe a broader range of skills.

To watch the evolution of one such test, ARC-AGI, is to witness the huge technical leaps that today’s generative-AI models have made in a few short years. Tech CEOs brag about their models’ high scores on ARC-AGI, which is widely considered one of the most distinctive and difficult AI benchmarks in use today.

Rather than testing how well a model can translate an inscription on an ancient Roman tombstone, or offer a diagnosis for a complex medical case, ARC-AGI challenges AI models to analyze abstract geometric puzzles and games without any written instructions. This forces the models to devise solutions to complex multistep problems, rather than regurgitate text from their training data.

The newly launched version, ARC-AGI-3, is essentially a collection of mini games, which the user plays by moving simple shapes around a pixelated game board using directional arrows. By design, the games are easy for humans to figure out after a few minutes of experimentation, but incredibly difficult for computers to solve.

One of the fascinating new features of the latest version is a replay mode that lets human observers read through AI models’ “chain of thought” transcript to see how a model breaks down the problem and attempts a solution.

Humans can play through these games on the project’s website. For now it seems humans don’t have much to worry about.

The most capable state-of-the-art models in the wild haven’t even cracked a score of 1% (out of 100%). The current leaderboard for ARC-AGI-3 shows OpenAI’s GPT-5.4 in the lead at 0.3%, with Anthropic’s Opus 4.6 and Google’s Gemini 3.1 Pro tied for second place. xAI’s Grok 4.20 Reasoning model scored 0%.

The US leads the world in robotaxi deployments

Every day it seems another robotaxi launches somewhere in the world. But most of them are in the US.

Of the 171 active robotaxi deployments globally, 69 (about 40%) are in the US, according to a new report from the Bank of America Institute. China, the next largest market, accounts for 24% of deployments.
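
For readers who want to check the math, the US share can be reproduced in a few lines of Python. This is only a quick sketch using the two figures quoted from the report; the function and variable names are our own, not anything from the Bank of America Institute.

```python
def share_percent(part: int, total: int) -> float:
    """Return `part` as a percentage of `total`."""
    return 100 * part / total


# Figures quoted from the report; the names are illustrative only.
TOTAL_DEPLOYMENTS = 171
US_DEPLOYMENTS = 69

us_share = share_percent(US_DEPLOYMENTS, TOTAL_DEPLOYMENTS)
print(f"US share of deployments: {us_share:.1f}%")
```

Rounded to the nearest whole number, that matches the 40% figure cited in the report.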

Most of those deployments are still in testing or early commercial stages. Only 10 US cities currently have fully commercial robotaxi operations, defined as services that operate on public roads, carry paying passengers, run fully driverless without a safety driver, and function all day in any weather.

For now, that effectively refers to Alphabet’s Waymo, which operates commercially in Atlanta, Austin, Dallas, Houston, Los Angeles, Miami, Orlando, Phoenix, San Antonio, and the San Francisco Bay Area. That definition excludes competitors like Tesla, whose Robotaxi service uses safety monitors, and Amazon’s Zoox, which has yet to charge customers for rides.

Sen. Sanders and Rep. Ocasio-Cortez introduce data center moratorium bill

Tapping into the growing public pushback against the data center construction boom, Sen. Bernie Sanders, I-Vt., and Rep. Alexandria Ocasio-Cortez, D-N.Y., have announced the AI Data Center Moratorium Act of 2026. The bill calls for a halt to new data center construction until federal legislation regulating AI is enacted.

Lawmakers in at least 11 states have proposed pauses on data center construction, according to the National Conference of State Legislatures.

In addition to halting new data center construction, the bill also calls for banning the export of advanced AI chips to countries that lack regulations that protect against harms from AI.

In a press release, Ocasio-Cortez said:

“Congress has a moral obligation to stand with the American people and stop the expansion of these data centers until we have a framework to adequately address the existential harm AI poses to our society. We must choose humanity over profit.”

Heading into the midterm elections, data centers are starting to emerge as a political issue following a growing list of projects that have been scuttled due to community opposition. President Trump has pushed AI companies to voluntarily pledge to “pay their way” for the massive energy requirements of data centers.

RIP ChatGPT-XXX

OpenAI has shelved its planned ChatGPT adult mode indefinitely, according to a report from the Financial Times. The startup is in the midst of an internal effort to eliminate many of its “side quests” to focus on enterprise features like coding and productivity tools.

Earlier this week, the company announced it was killing one of those side quests: its video-generation app, Sora.

Per the report, investors and staff raised concerns that offering an erotica-generating AI model doesn’t exactly align with the company’s stated mission to ensure that artificial general intelligence benefits all of humanity.

As other tech companies deal with the legal consequences of failing to protect minors from their products, an X-rated feature would put OpenAI on a potentially dangerous path. Meanwhile, its main competitor, Anthropic, continues to make inroads among enterprise customers by improving its models’ coding and spreadsheet skills.

SpaceX isn’t just expensive — it’s in a different orbit

At a $1.75 trillion valuation, SpaceX would be valued at roughly 100x its 2025 revenue.

Sherwood Media, LLC produces fresh and unique perspectives on topical financial news and is a fully owned subsidiary of Robinhood Markets, Inc., and any views expressed here do not necessarily reflect the views of any other Robinhood affiliate, including Robinhood Markets, Inc., Robinhood Financial LLC, Robinhood Securities, LLC, Robinhood Crypto, LLC, Robinhood Derivatives, LLC, or Robinhood Money, LLC. Futures and event contracts are offered through Robinhood Derivatives, LLC.