NVIDIA praises DeepSeek’s AI breakthrough despite its stock plunge

Earlier this week, a little-known Chinese artificial intelligence startup named DeepSeek shook Silicon Valley by claiming that it had built a powerful AI model, R1, at a fraction of the cost of its American competitors (such as OpenAI’s GPT and Google’s Gemini). Major AI stocks plummeted on the news. NVIDIA (NVDA), which makes the expensive chips (GPUs) that power the development of such AI models, saw its stock fall more than 16% on Monday (January 27), wiping out nearly $500 billion of its market value.
Despite the negative financial impact, NVIDIA praised DeepSeek’s breakthrough. A company spokesperson told Observer in a statement: “DeepSeek is an excellent AI advancement and a perfect example of test-time scaling. DeepSeek’s work illustrates how new models can be created using that technique, leveraging widely available models and compute that is fully export control compliant.”
Test-time scaling is a new real-time prediction technique that adjusts an AI model’s computing requirements according to the complexity of the task during live use. Before that, there were two leading approaches to scaling AI models: pre-training and post-training. Pre-training scaling expands the data sets and computing power available to an AI model during its initial training. Post-training scaling, meanwhile, fine-tunes the model and improves its real-world performance.
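To make the idea concrete, here is a minimal Python sketch of one way to spend more computation on harder tasks at inference time: sample more candidate answers when a task is judged harder, then keep the best one. It illustrates the general principle only and is not NVIDIA’s or DeepSeek’s actual method. The names `samples_for` and `best_of_n`, the 0-to-1 difficulty score, and the sample limits are all hypothetical.

```python
def samples_for(difficulty, min_samples=1, max_samples=16):
    """Map a difficulty score in [0, 1] to a number of candidate answers.

    The score and both limits are illustrative assumptions.
    """
    difficulty = min(max(difficulty, 0.0), 1.0)  # clamp to [0, 1]
    return min_samples + round(difficulty * (max_samples - min_samples))


def best_of_n(generate, score, prompt, n):
    """Call the model n times and keep the highest-scoring candidate.

    `generate` and `score` are caller-supplied stand-ins for a model
    and an answer checker; neither refers to a real API.
    """
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)


# An easy task gets a single sample; a hard one gets the full budget.
easy_budget = samples_for(0.0)
hard_budget = samples_for(1.0)
```

In this sketch the extra cost is paid only when a task calls for it, which is the trade-off the paragraph above describes.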
According to its research paper, DeepSeek’s R1 model was trained on 2,048 NVIDIA H800 chips at a total cost of less than $6 million. Some AI companies and industry insiders are skeptical of this claim. For example, Alexandr Wang, the founder and CEO of Scale AI, suspects that DeepSeek may have more NVIDIA chips than export controls allow it to have.
Even with scaling laws applied, AI models usually need a large amount of GPU power during inference. To address this, DeepSeek said it developed R1 as a “distilled AI” model: a smaller model trained to mimic the behavior of a larger AI system. Distilled models consume less computing power and memory while offering higher accuracy in tasks such as reasoning and coding, which makes them an efficient option for resource-limited devices such as smartphones. Microsoft and OpenAI are currently investigating whether DeepSeek used the “distillation” technique to steal OpenAI’s proprietary data through the GPT API in order to build R1’s architecture.
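The distillation idea described above can be sketched in a few lines of Python. This is one common textbook formulation, in which a small “student” model is trained to match the softened output probabilities of a larger “teacher”. It is not a description of how DeepSeek built R1. The function names, the temperature value, and the example scores are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    The loss is zero when the student reproduces the teacher exactly and
    grows as their predictions diverge.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical scores over three possible answers.
teacher = [1.0, 2.0, 3.0]
matching_student = [1.0, 2.0, 3.0]
diverging_student = [3.0, 2.0, 1.0]
loss_match = distillation_loss(teacher, matching_student)
loss_diverge = distillation_loss(teacher, diverging_student)
```

Training the student to minimize this loss pushes its outputs toward the teacher’s, which is how a smaller model can inherit behavior from a larger one.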
“It is important to note that the $6 million figure for R1 does not account for the resources previously invested,” Itamar Friedman, the former director of machine vision at Alibaba, told Observer. “DeepSeek’s relatively low training cost figure may represent only the final training step.” Friedman explained that although scaling and optimization are valuable, there is still a limit to how small an AI model or training process can be while still effectively simulating thinking and learning. He said: “Large-scale, high-cost systems still have a significant advantage.”
DeepSeek, headquartered in Hangzhou, was founded in 2023. It is a spinoff of the hedge fund High-Flyer, led by Liang Wenfeng.