Chinese AI startup DeepSeek unveils open-source model to rival OpenAI o1

Chinese AI developer DeepSeek has unveiled an open-source version of its reasoning model, DeepSeek-R1, featuring 671 billion parameters and claiming performance superior to OpenAI’s o1 on key benchmarks.

“DeepSeek-R1 achieves a score of 79.8% Pass@1 on AIME 2024, slightly surpassing OpenAI-o1-1217,” the company said in a technical paper. “On MATH-500, it attains an impressive score of 97.3%, performing on par with OpenAI-o1-1217 and significantly outperforming other models.”

On coding-related tasks, DeepSeek-R1 achieved a 2,029 Elo rating on Codeforces and outperformed 96.3% of human participants in the competition, the company added.

“For engineering-related tasks, DeepSeek-R1 performs slightly better than DeepSeek-V3 [another model from the company], which could help developers in real-world tasks,” DeepSeek said.

DeepSeek-R1 is available on the AI development platform Hugging Face under an MIT license, allowing unrestricted commercial use.

The company also offers “distilled” versions of R1, ranging from 1.5 billion to 70 billion parameters, with the smallest capable of running on a laptop. The full-scale R1, which requires more powerful hardware, is available via API at costs up to 95% lower than OpenAI’s o1.

As a reasoning model, R1 checks its own outputs, potentially reducing errors common in other models. Although slower, reasoning models offer increased reliability in fields such as physics, science, and math.

Accelerating the AI arms race

The race to build language models has intensified, especially amid changing geopolitical realities.

“While OpenAI and other US-based firms definitely have the first mover advantage, China has been investing a lot in AI to build its capabilities to become a good second mover,” said Sharath Srinivasamurthy, associate vice president at IDC.

In real-world enterprise applications, DeepSeek-R1’s performance on key metrics translates to improved capabilities in mathematical reasoning, problem-solving, and coding tasks.

“Although this suggests that DeepSeek-R1 could potentially outperform OpenAI’s o1 in practical scenarios requiring these specific competencies, the eventual outcome still depends on various factors within the broader AI ecosystem, such as the AI readiness of data, RAG and agent support, ModelOps and DevOps toolchain integrations, cloud and data infrastructure support, and AI governance,” said Charlie Dai, VP and principal analyst at Forrester.

Moreover, while R1’s claims of superior performance are appealing, its true effectiveness remains uncertain due to a lack of clarity about the data it has been trained on.

“The models are only as good as the data they are trained on,” Srinivasamurthy said. “With restrictive policies in China on data consumption and publication, there is a possibility that the data might be biased or incomplete.”

Srinivasamurthy also noted that the true potential of LLMs lies in handling multiple modalities like text and images. While many models have achieved this, R1 has room to grow to become a comprehensive solution.

Potential for enterprise use

DeepSeek-R1’s MIT license, allowing unrestricted commercial use and customization, along with its lower costs, positions it as an appealing and cost-effective option for enterprise adoption.

However, enterprises may need to factor in additional costs associated with adopting the MIT-licensed model, such as customization, fine-tuning, and adapting it to meet specific business needs for a higher ROI, according to Mansi Gupta, senior analyst at Everest Group.

Businesses outside China may also be reluctant to use their data to train the model or integrate it into their operations due to regulatory challenges affecting AI adoption. “Enterprises must carefully assess the geopolitical risks tied to using R1, particularly for global operations,” Gupta said. “This includes navigating Chinese regulations and conducting thorough compliance assessments and risk analyses. Ultimately, the adoption of R1 will depend on how well enterprises can optimize the trade-off between its potential ROI and these geopolitical and regulatory challenges.”
