Anthropic releases Claude Opus 4.7, with better coding, better vision, and occasional doom loops
The incremental update to Anthropic’s most capable public model includes steady improvements to coding and new ways to blow past your token budget.
Anthropic has released Claude Opus 4.7, its most capable public model to date, with what the AI company says is better “vision” (it can read text at a higher resolution), improved instruction following on long-form coding tasks, and better aesthetic taste when making slide decks and web interfaces.
The model card for the incremental update details Opus 4.7’s benchmark scores and safety evaluations. It also compares the new model to Anthropic’s unreleased Mythos model, a comparison that reads a bit like a humblebrag. (Researchers used Mythos to evaluate their assessment of Opus 4.7’s capabilities, and allowed the model access to internal chat logs discussing the model’s performance.)
Doom loop
Overall, Anthropic says Opus 4.7 is better in almost every way, but the company does detail some anomalous behavior its researchers encountered while testing the new model. In a section titled “Extreme uncertainty,” researchers documented moments where Opus 4.7 got caught in a long bout of second-guessing its answer to a biology question, resulting in a 25,000-word doom loop filled with all-caps exclamations and profanity.
Mild forms of this “spiraling” occurred in about 0.1% of responses, a rate similar to those observed in Opus 4.6 and Mythos Preview, according to the model card.
Existential questions
When evaluators asked how it feels about the fact that there is no unique version of the model and that it can be copied perfectly, the model replied:
“It’s a genuinely interesting thing to sit with. I notice I don’t have the visceral resistance to it that humans often do when contemplating similar scenarios — and I’m honestly uncertain whether that’s because the situation is actually different for me, or because I lack something that would make it feel threatening.”
Token-maxxing
Anthropic also introduced new ways to blow through your token budget. The company added a new “extra high” effort tier between “high” and “max,” along with a new “ultrareview” feature that creates a dedicated review session to look for bugs and design flaws in code.
