2 min read

China's Mixed Stacks Are the New Normal


DeepSeek's next model, R2, is delayed. Yesterday, the Financial Times reported that Chinese authorities had encouraged DeepSeek to prioritize Huawei's Ascend processors over Nvidia for training. DeepSeek reverted to Nvidia for training after repeated instability and efficiency problems on Ascend, despite assistance from Huawei engineers.

The company had aimed to release R2 earlier this year, but by late June, reports indicated CEO Liang Wenfeng wasn't satisfied with the model's performance. Competitive pressure inside China has intensified while R2 has been in limbo. Moonshot's Kimi K2 arrived in July with open weights and big claims on coding and complex task handling. Z.ai launched GLM-4.5 as an open model positioned for agentic use cases. And Alibaba has been rolling out Qwen 3 updates, including the Qwen3-Coder release. All three are plausible alternatives courting the same developer mindshare DeepSeek captured with R1.

The backdrop is as political as it is technical. Washington has just opened a narrow path for Nvidia and AMD to sell versions of their AI chips into China under unusual, closely monitored terms. It's far from a blanket return to pre-ban business, but it potentially eases bottlenecks at the margin. Since 2022, U.S. export controls have progressively barred China from top-tier chips, first A100/H100, then even China-specific variants like A800/H800 and, this spring, H20. So, any reopening is partial and contingent.

At the same time, Beijing is pressing domestic firms to favor local alternatives. Chinese officials have recently urged companies to avoid Nvidia's H20 for government-adjacent work and asked major platforms to justify H20 purchases. That tension helps explain why DeepSeek kept Ascend in the inference stack even as it went back to Nvidia for training.

"Train where you must, serve where you're told" is becoming the Chinese pattern. The Ascend episode shows that porting large training jobs off CUDA isn't a quick swap. Inference is easier to migrate, and policy pressure ensures domestic chips stay in the path. Expect more mixed stacks and more divergence between training and serving environments.

Chinese open-weight releases are arriving fast, and they're good. Kimi K2, GLM-4.5, and Qwen 3 are all available to download today, which raises the stakes: R2's schedule risk isn't only a hardware problem, because July's releases have set a high bar for whatever it ships. For now, the Ascend training attempt didn't stick. Nvidia is back in the training loop, with Ascend handling inference.