Revelation of Training Cost for AI Model Developed by China's DeepSeek Stirs Tech Sector
Chinese artificial intelligence (AI) developer DeepSeek is at the centre of a controversy over the development of its R1 model. An article on the research behind R1, published in the journal Nature in June, listed DeepSeek founder Liang Wenfeng as one of its co-authors.
The controversy surrounding DeepSeek's R1 model began with allegations of the use of unlawfully acquired AI chips. However, Nvidia, the manufacturer of the chips, clarified to Reuters that DeepSeek had indeed used lawfully acquired H800 chips, not H100s, for the training of the R1 model.
The R1 model was trained for a total of 80 hours on a cluster of 512 H800 chips, following a preparatory phase that used A100 chips. DeepSeek spent $294,000 on training R1, a cost lower than the figures reported for US rivals.
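For a sense of scale, the reported figures can be combined into a rough back-of-the-envelope calculation. The short Python sketch below is illustrative only: it assumes the $294,000 figure covers exactly the 80-hour run on 512 chips, which the reporting does not state, so the implied hourly rate should not be read as an actual rental price.

```python
# Figures as reported in the article.
CHIPS = 512                  # H800 chips in the training cluster
HOURS = 80                   # total training time in hours
TRAINING_COST_USD = 294_000  # reported training cost

# Assumption (not confirmed by the source): the cost covers only this run.
gpu_hours = CHIPS * HOURS
implied_rate = TRAINING_COST_USD / gpu_hours

print(f"GPU-hours: {gpu_hours:,}")                        # GPU-hours: 40,960
print(f"Implied cost per GPU-hour: ${implied_rate:.2f}")  # about $7.18
```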
The use of Nvidia's H800 chips, which were designed for the Chinese market, to train the R1 model has raised questions. Separately, DeepSeek has defended its use of distillation, a technique in which one AI system learns from the outputs of another. The company argues that distillation yields better model performance while being far cheaper, and so enables broader access to AI-powered technologies.
However, DeepSeek's statements about its development costs and the technology it used have been questioned by US companies and officials. The company has not directly responded to assertions that it had deliberately 'distilled' OpenAI's models into its own. OpenAI did not respond immediately to a request for comment regarding DeepSeek's use of its models.
DeepSeek's release of lower-cost AI systems in January prompted global investors to dump tech stocks. DeepSeek said that its V3 model relied on crawled web pages that contained a 'significant number of OpenAI-model-generated answers,' but that this was not intentional.
In October 2022, the US made it illegal for Nvidia to export its more powerful H100 and A100 AI chips to China. DeepSeek acknowledged for the first time that it owns A100 chips and used them in the preparatory stages of developing the R1 model. Despite this admission, the company did not comment on the allegations about how it acquired these chips.
The ongoing controversy surrounding DeepSeek's AI model development and the use of certain technologies is a topic of interest for the global AI community and regulators alike. As the field of AI continues to evolve, so too will the challenges and controversies that come with it.