DeepSeek releases new V4 series models highlighting efficiency and long context
Chinese AI lab DeepSeek has released a major new version of its eponymous open-source AI models, which are nipping at the heels of leading frontier models in some areas.
The most significant of them, DeepSeek-V4 Pro and DeepSeek-V4 Flash, both have a 1 million-token context window (the amount of information the model can actively work with in a single session), a crucial feature for complex, long-running coding tasks.
DeepSeek rebuilt how the models process information under the hood, making them substantially more efficient — and that efficiency is what makes the large context window actually usable.
Also, the new models’ coding skills have closed the gap with the major frontier models from Anthropic, OpenAI, and Google.
The authors of the model acknowledge some of V4’s shortcomings, such as its lower scores on reasoning benchmarks, saying that V4 “trails state-of-the-art frontier models by approximately 3 to 6 months.”
Because the V4 models are open-weight, users can run them on their own hardware, and they rank among the top-performing open-source models available. V4’s large context and token efficiency are especially significant among open-source models.
But as with earlier DeepSeek models, don’t ask it about Tiananmen Square.