HANGZHOU – Chinese startup DeepSeek is shaking up the global AI landscape with the release of its latest AI models, which it claims rival or even surpass the capabilities of industry-leading U.S. models, all while being significantly more cost-effective.
DeepSeek’s AI models, particularly DeepSeek-V3, have made waves in the tech world after the company revealed that training its flagship model cost less than $6 million in computing power using Nvidia H800 chips. This contrasts sharply with the billions of dollars being pledged by U.S. tech giants for similar advancements.
The company’s AI Assistant, powered by DeepSeek-V3, has overtaken ChatGPT to become the top-rated free app on Apple’s U.S. App Store. Its success has cast doubt on the big-spending AI strategies of several American tech companies, and shares of major players such as Nvidia have taken a hit.
⇒Why DeepSeek is causing a stir
The release of OpenAI’s ChatGPT in late 2022 prompted a rush among Chinese tech companies to develop their own AI-driven chatbots, but early efforts, such as Baidu’s ChatGPT equivalent, failed to meet expectations, especially when compared to U.S. advancements.
DeepSeek, however, has changed the narrative. Its models, DeepSeek-V3 and DeepSeek-R1, have been lauded by Silicon Valley executives and engineers for their quality and cost efficiency. DeepSeek claims that its models are on par with the most advanced AI models from OpenAI and Meta but are significantly cheaper to use. For example, the DeepSeek-R1 model is reportedly 20 to 50 times cheaper to use than OpenAI’s o1, depending on the task.
Despite the praise, some have expressed scepticism about DeepSeek’s achievements. Scale AI CEO Alexandr Wang said in a recent CNBC interview that DeepSeek has 50,000 Nvidia H100 chips, which he suggested the company could not disclose because possessing them would violate U.S. export controls. DeepSeek has yet to respond to these allegations.
Additionally, Bernstein analysts have raised questions about DeepSeek’s training costs, suggesting that the reported $5.58 million for the V3 model may not reflect the full expense and noting that the training costs of the R1 model remain undisclosed.
⇒Behind DeepSeek: A look at the founders
DeepSeek is based in Hangzhou and controlled by Liang Wenfeng, co-founder of the quantitative hedge fund High-Flyer. The fund announced in March 2023 that it was shifting its focus beyond trading to create an independent research group aimed at exploring artificial general intelligence (AGI). DeepSeek was founded later that year.
High-Flyer is known to own patents related to chip clusters used for AI model training and has a significant AI unit that operates a cluster of 10,000 A100 chips. However, the exact amount of investment High-Flyer has made in DeepSeek remains unclear.
⇒DeepSeek’s political implications for China
DeepSeek’s growing success has not gone unnoticed by China’s political leadership. On January 20, Liang Wenfeng attended a high-level symposium hosted by Chinese Premier Li Qiang, which was also attended by influential business leaders and experts. This gathering is seen as a sign that DeepSeek’s advancements may play a critical role in Beijing’s broader efforts to achieve technological self-sufficiency and overcome U.S. export controls, particularly in AI.
The company’s progress aligns with China’s long-term strategy to become a global leader in AI, and the attention DeepSeek is receiving in China’s top political circles reflects that priority.