I first posted about DeepSeek in December of 2023 but did not take much note of them. They were largely unknown to me until probably Q3 of 2024, when V2 had already come out. At the time, there was already a lot of rumbling from some people in Silicon Valley about how good DeepSeek was due to their fantastic research. Then when V3 came out in December, it made a lot of headlines for being about as good as GPT-4o while being trained on GPU hours worth about $5.5m. Of course, that number has since been blown up and politicized; DeepSeek has been very transparent about what the cost refers to. Then in January, things really blew up when the R1 reasoning model came out. It is a great model not only for its results but also for its process: the research paper is fantastic, and it was open sourced under an MIT license. It tipped off many people in the AI research community to the secrets behind how reasoning models work. The most addictive part of R1 is the thought process. The thinking tokens generated before the final answer read like normal human thought. Beyond that, they made the extremely smart decision of releasing distilled versions of R1 built on the open-source Llama and Qwen models. This move let people try distilled R1 from their home computers. It also allowed server operators to choose which R1 variant offered the best combination of inference requirements versus performance.
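Those thinking tokens are emitted between `<think>` tags before the final answer, so separating the two is a simple parsing exercise. A minimal sketch (the function name is my own; the exact delimiters can vary by deployment):

```python
import re

def split_r1_output(raw: str) -> tuple[str, str]:
    """Separate an R1-style model's reasoning ("thinking") tokens
    from its final answer. The chain of thought is assumed to sit
    between <think> ... </think> tags before the answer text."""
    match = re.search(r"<think>(.*?)</think>", raw, flags=re.DOTALL)
    if match is None:
        return "", raw.strip()          # no thinking block found
    thinking = match.group(1).strip()
    answer = raw[match.end():].strip()  # everything after the block
    return thinking, answer

# Toy example of what a raw R1 completion might look like
raw = "<think>2 + 2 is basic arithmetic.</think>The answer is 4."
thinking, answer = split_r1_output(raw)
```

Chat front-ends do essentially this to show the reasoning in a collapsible panel above the answer.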
Of course, the political and geopolitical storm since then has really put DeepSeek in the spotlight. News of the big drop in Nvidia's stock valuation was a huge story globally. DeepSeek became the hero of the open source community. Its app became really popular, recently surpassing Gemini to become the second most popular LLM app globally and surpassing Doubao to become the most popular LLM app in China. In fact, it recently reached 30 million DAU (Daily Active Users). Due to the tech war between China and the US, there has been a push by many politicians to ban its app from America and even to block its open-source usage by American developers. The latter part really is just fear mongering, since running DeepSeek models on US-hosted servers poses no real security risk.
All of this prompted me to really write about the profound impact of DeepSeek R1. In some ways, you can already see its impact in the bull run in China's tech stocks. Intuitively, I think people understand that if DeepSeek is doing well, that lifts up the entire Chinese tech sector. Of course, DeepSeek is not the only good Chinese model. ByteDance's Doubao and Alibaba's Qwen models are both really good. Doubao is not rated higher because it has not been open sourced. The Qwen small models are consistently some of the best models out there. Beyond that, there are many great AI models coming out of China, like Moonshot's Kimi, MiniMax, Kuaishou's Kling and Tencent's Hunyuan.
However, DeepSeek has been an Apollo moment for China's AI industry. A week ago, a well-known tech tipster in China had already reported that Huawei, Oppo, Honor, Meizu & Vivo had all added deep integration of DeepSeek-R1 into their AI assistants. Since then, it appears the remaining ones, like Xiaomi, Nubia and Lenovo, have also joined in.
At the same time, I have also kept track of all the automakers that have adopted DeepSeek models in their cockpits. At this point, it appears all major domestic automakers other than Li Auto, XPeng, Nio, Huawei & Xiaomi have joined in.
Beyond that, the popularity of the DeepSeek app has been off the charts. As of a week ago, it was the fastest app ever to reach 30m DAU. Its MAU for web and app reached 100 million after just 2 weeks.
That makes it the second fastest to that milestone, behind only Threads. Of course, it still has huge room for growth globally. I also don't know where it stands now, but Chinese social media mentions of people using DeepSeek for answers have been intense for the past 3 weeks. That is also why all the OEMs are quickly integrating DeepSeek.
This goes beyond just phones and EVs. Other consumer electronics like PCs, TVs and projectors are also integrating DeepSeek. This level of consumer product integration indicates a huge jump in inference demand for DeepSeek models, not just on DeepSeek's own servers but also on all the cloud services supporting the OEMs. A rather recent phenomenon is that various Chinese apps and cloud services are using DeepSeek R1 for deep AI searches. This was most recently seen in WeChat's AI search function.
Since WeChat is the most important app in China, you can see that Tencent Cloud will need a lot of inference power to handle all the search requests coming in.
SiliconFlow, which uses Huawei Cloud/Ascend GPUs, is handling all the inference demand from HarmonyOS NEXT devices that use the Xiaoyi AI assistant. Now, just imagine how large the inference demand will be when all the phone makers push out the software updates that will use DeepSeek R1 across all of their devices.
But it goes beyond consumer electronics and apps. Industries will also be using it. Here is just a small list of cloud services that have added support for DeepSeek.
This list should also include all the remaining Chinese telecom cloud services, Huawei Cloud, JD Cloud, SC.net, ByteDance's AI platform and all the other major AI platforms. Baidu is even integrating DeepSeek into its Ernie Bot app and adding DeepSeek search to its AppBuilder platform. What a defeat for Baidu!
But its success hasn't been limited to the Chinese market. DeepSeek's open nature means the open source community has widely embraced all the innovations and research papers released by the Whale bros. Perplexity even built a Deep Research product around R1. AWS, Azure and Nvidia NIM added support for it very early on. It became the most-liked model on Hugging Face after just a few weeks. Inference platforms like Together, Groq, SambaNova, Hyperbolic and more are all supporting DeepSeek R1 inference.
But the most enduring effect of DeepSeek will be on Chinese society itself. Before DeepSeek came along, GenAI and AI app usage in China was not that high. Doubao and Kimi were the two most popular apps, and OpenAI does not operate in China. As such, Chinese AI developers did not have access to top-end large language models. With the universal adoption of DeepSeek across every AI platform in China, this will be a period of blossoming for China's AI sector, just as GPT-4o prompted many AI startups to start building apps for corporate users. This will drive a very fast period of AI competition and adoption across China over the next few years.
I think DeepSeek will likely be the most popular foundational model, but it won't be the only one. Qwen and Doubao will also be very popular. One big question is which hardware and which cloud services will win out.
Recently, I saw ByteDance and its AI platform Volcengine advertise that it had the highest token throughput and lowest latency for DeepSeek R1 among the major AI platforms, comparing Volcengine against Alibaba, SiliconFlow and Baidu.
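Claims like "highest throughput, lowest latency" usually come down to two streaming metrics: time to first token (how long the user waits before text starts appearing) and tokens per second. A minimal sketch of how they fall out of per-token arrival times (the helper name is my own, not Volcengine's; real benchmarks stream from a live endpoint):

```python
def summarize_stream(timestamps: list[float]) -> dict:
    """Compute the two numbers vendors advertise from a list of
    arrival times in seconds: the first entry is when the request
    was sent, each subsequent entry is when a token arrived."""
    sent, first, last = timestamps[0], timestamps[1], timestamps[-1]
    n_tokens = len(timestamps) - 1
    return {
        "ttft_s": first - sent,                      # time to first token (latency)
        "throughput_tps": n_tokens / (last - sent),  # tokens per second
    }

# e.g. request sent at t=0, 4 tokens arriving at 0.5, 0.6, 0.7, 0.8 s
stats = summarize_stream([0.0, 0.5, 0.6, 0.7, 0.8])
```

Note the two metrics can diverge: a platform can win on throughput while losing on first-token latency, which is why vendors tend to quote whichever flatters them.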
As I have said many times before, ByteDance has far and away the greatest GPU/AI chip demand of all the Chinese tech firms. A lot of that is just to build data centers supporting its many popular apps globally. However, its high inference throughput indicates that it is massively building up inference capacity because it wants to win the race to handle all the AI inference demand. Since Chinese OEMs represent 60% of global phone sales, a huge share of future global AI inference demand will be coming to data centers controlled by Chinese tech/cloud companies. ByteDance wants to be the one hosting and handling those requests. Alibaba, Tencent, Baidu and Huawei will need to step up to compete with ByteDance.
Then there is the question of what hardware they will use. I think most of the compute thus far has been on Nvidia chips, because they are great for both training and inference. However, the future of AI chip demand lies in inference, and there is just no reason for domestic data centers to keep trying to buy chips they cannot reliably access. Nvidia chips have been smuggled into China in large quantities, but that flow will likely shrink going forward. As such, domestic chipmakers like Huawei (Ascend), Cambricon, Biren, MetaX, Kunlun and Moore Threads have all been capitalizing on this, as seen in the list below.
In fact, Ascend's Atlas 800I A2 (with 8 cards) can apparently process 1,911 tokens/s while running the full-size ("full blooded") R1 model.
While this is just half of what an 8-card Nvidia H200 server can achieve, domestic options will continue to improve as Huawei and DeepSeek work together to optimize the CANN stack for inference (an area where CUDA still holds the advantage over everyone else).
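As a quick sanity check on those figures (the H200 number is inferred from the "half" comparison above, not independently measured):

```python
# Back-of-the-envelope math on the quoted throughput figures.
atlas_server_tps = 1911        # Atlas 800I A2 server, 8 Ascend cards
cards_per_server = 8

# Per-card throughput of the Ascend server
atlas_per_card = atlas_server_tps / cards_per_server   # about 239 tokens/s per card

# Implied 8-card H200 server throughput, taking "half" at face value
h200_server_tps = atlas_server_tps * 2                 # 3822 tokens/s

print(f"Ascend per card: {atlas_per_card:.0f} tok/s")
print(f"Implied H200 server: {h200_server_tps} tok/s")
```

So the gap to close is roughly 2x per server, which optimization work on the software stack can plausibly narrow.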
In recent times, I have also talked about China's progress in DRAM/HBM chips, 3D NAND chips and power chips, and how they will be part of China's goal of building a full AI ecosystem across both hardware and software.
Interestingly enough, Baidu AI Cloud's Qianfan (千帆) platform recently came out with AI servers that are ready for DeepSeek R1 inference.
You will notice that they only offer solutions using Ascend 910B and Kunlun P800 chips. There are no Nvidia AI servers. Even Lenovo, which has refused to buy Ascend chips, has built a domestic AI server solution around MetaX chips. That means large Chinese cloud service providers may very well be moving fully toward domestic chip options for inference out of necessity. ByteDance at this point seems like the largest customer for them.
On the whole, I would say the DeepSeek moment is the Apollo moment for China's AI industry. It has put GenAI usage in the minds of the professional crowd in China. As they find DeepSeek models really good to use, they will recommend that others join in. This will have a deep effect on AI demand out of China and globally as Chinese OEMs deploy AI across all their devices. That means Chinese-built data centers (both in China and globally) will be doing inference on a lot of DeepSeek prompts. I think at this point, DeepSeek search is already starting to replace Baidu in China. Going forward, this is a great moment for Chinese consumers, who have been starved of a good search engine. It is also a great moment for the Chinese AI ecosystem, which has not had access to the best reasoning models. And all this demand may finally be fulfilled by Chinese data centers using Chinese chips and electronics.