Huawei has quietly ramped up efforts to break Nvidia’s stranglehold on high-end AI hardware. In a recent Wall Street Journal scoop, Huawei was reported to be testing its next-generation Ascend 910D processor, aimed squarely at exceeding Nvidia’s powerful Hopper H100 GPU. This move comes amid U.S. export bans that have effectively blocked Nvidia’s latest chips (H100, B200, H20, etc.) from China since 2022. As one analyst put it:
“Huawei Technologies is intensifying its challenge to U.S. chip dominance by preparing to test its latest artificial-intelligence processor, the Ascend 910D”.
The Ascend 910D is more than just a new part – it’s a strategic effort to ensure China’s AI sector isn’t left in the dust. By building a domestic rival to Nvidia’s flagship, Huawei hopes to supply the country’s fast-growing AI market without relying on Western silicon. The 910D follows in the footsteps of earlier Ascend chips (910B, 910C) that have seen limited success in China, and is intended to dramatically close the performance gap with Nvidia.

In short, Huawei and Beijing are betting that necessity will spur innovation (wallstreetpit.com), and the Ascend 910D is the latest proof.
Deep Technical Specifications: Huawei Ascend 910D
Very little official detail is available for the 910D, but it is expected to build on Huawei’s Da Vinci architecture (as did its predecessors). The original Ascend 910 (launched in 2019) was a monolithic AI training chip fabricated in TSMC’s 7nm EUV process.
It packed 32 “Da Vinci Max” AI cores in four clusters, yielding 256 teraflops of FP16 performance and 512 TOPS of INT8 throughput, see: en.wikipedia.org. The 910 also featured on-chip 3D-stacked HBM2e memory (4 stacks, ~1.2 TB/s bandwidth) and a huge 1228 mm² die, with a 350 W TDP. In simple terms, the first Ascend 910 was already one of the most powerful AI chips of its time, thanks to its massive compute fabric and wide memory interface.
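Those headline numbers are easy to sanity-check from the core counts. A back-of-the-envelope sketch (the ~1 GHz clock is an assumption for illustration, not an official figure):

```python
def peak_throughput(cores: int, macs_per_core: int, clock_ghz: float) -> float:
    """Peak throughput in TFLOPS (or TOPS): each MAC counts as 2 ops (multiply + add)."""
    ops_per_cycle = cores * macs_per_core * 2
    return ops_per_cycle * clock_ghz * 1e9 / 1e12

# Ascend 910: 32 cores, 4096 FP16 MACs / 8192 INT8 MACs per core
fp16_tflops = peak_throughput(32, 4096, clock_ghz=1.0)  # ~262, vs the quoted 256 TFLOPS
int8_tops = peak_throughput(32, 8192, clock_ghz=1.0)    # ~524, vs the quoted 512 TOPS
print(f"FP16 ~{fp16_tflops:.0f} TFLOPS, INT8 ~{int8_tops:.0f} TOPS")
```

Landing within a few percent of the official 256 TFLOPS / 512 TOPS figures suggests a real clock a shade below 1 GHz.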

The Ascend 910D is expected to be an evolution of this design, but tailored for China’s domestic supply chain. Because U.S. sanctions have cut Huawei off from leading-edge foundry technology, the 910D will likely be manufactured by SMIC on its “N+2” 7nm process (a Chinese equivalent to TSMC’s 7nm) rather than TSMC’s 5nm. Exact core counts and clock rates have not been published, but insiders say Huawei is aiming to surpass the H100’s performance.
Given that the Ascend 910C (a dual-chiplet design) achieved only about 60% of an H100’s inference speed, the 910D will likely need more chiplets or optimized cores to catch up. The chip’s power draw, memory configuration (HBM2e or HBM3), and die size are unknown, but it would not be surprising if Huawei targets a similar or somewhat lower TDP (the original was 350 W) and comparable memory bandwidth.
In other words, the 910D is Huawei’s technical gamble to deliver near-H100 class throughput using China’s own fabs.
Nvidia Hopper H100: The Benchmark AI Accelerator
By comparison, Nvidia’s Hopper H100 GPU is a massive beast. Built on TSMC’s 4N (5nm) process, the H100 packs roughly 80 billion transistors into a single chip with up to 144 Streaming Multiprocessors (SMs). It supports up to 80 GB of ultra-fast HBM3 memory, with an aggregate bandwidth of about 3.0 TB/s – roughly 50% higher than the prior Ampere A100, see: en.wikipedia.org.
In raw compute terms, an H100 can deliver on the order of a few petaflops of AI performance (e.g. ~1400 TFLOPS with sparsity and FP16/Tensor Core math enabled), and it introduces new features like Nvidia’s “Transformer Engine” for fast FP8 matrix math. The SXM5 variant of H100 is rated at 700 W, underscoring its extreme power envelope. Despite its efficiency improvements over Ampere, it is still a very hot chip that requires custom cooling and power delivery.
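Those two headline figures – compute and memory bandwidth – interact. A quick roofline-style sketch using the article’s rough numbers (approximations, not official specs) shows how many FLOPs a kernel must perform per byte fetched from HBM before the chip becomes compute-bound rather than bandwidth-bound:

```python
def compute_bound_threshold(peak_tflops: float, bandwidth_tb_s: float) -> float:
    """Arithmetic intensity (FLOPs per byte) above which a kernel is compute-bound."""
    return (peak_tflops * 1e12) / (bandwidth_tb_s * 1e12)

h100 = compute_bound_threshold(1400, 3.0)       # ~467 FLOPs/byte
ascend_910 = compute_bound_threshold(256, 1.2)  # ~213 FLOPs/byte
print(f"H100 ~{h100:.0f} FLOPs/byte, Ascend 910 ~{ascend_910:.0f} FLOPs/byte")
```

Large training matmuls comfortably exceed these intensities, but small-batch inference often does not – which is why the H100’s 3 TB/s of HBM3 matters as much as its peak TFLOPS.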
Nvidia designed the H100 (codenamed Hopper) specifically for datacenter AI training workloads. It supports features like NVLink interconnects, multi-instance GPU partitioning, and compatibility with CUDA, cuDNN, and the broader Nvidia ecosystem. It is the chip of choice for global AI leaders, but it has been barred from China: U.S. export controls enacted in 2022 banned the sale of A100 and H100 GPUs into the country.
The Chinese market instead got deliberately hobbled variants – first the H800, later the H20 – but subsequent rounds of restrictions caught those as well, leaving a void that Huawei and other Chinese companies are eager to fill.
Ascend 910D vs. Nvidia H100: A Spec Comparison
The table below compares the key hardware specs of Huawei’s Ascend 910 series (including the expected 910D) with Nvidia’s H100. While exact figures for the 910D are not public, this side-by-side highlights the known differences in architecture and manufacturing:
| Specification | Ascend 910 (2019) | Ascend 910D (2025) | Nvidia H100 (2022) |
|---|---|---|---|
| Compute Cores | 32 Da Vinci Max AI cores (4 clusters) | TBD; likely multi-chip design | Up to 144 SMs (Streaming Multiprocessors) |
| MACs per Core | 4096 FP16 + 8192 INT8 per core | Expected similar Da Vinci engines | Varies (FP8/FP16 Tensor Cores) |
| Peak Performance (FP16) | 256 TFLOPS | Target: exceed H100 (~1000+ TFLOPS) | ~1400 TFLOPS (with sparsity) |
| INT8 Performance | 512 TOPS | — | Not typically quoted; folded into Tensor Core TFLOPS |
| Memory Type | 4× HBM2e stacks | Likely HBM2e or HBM3 | 80 GB HBM3 |
| Memory Bandwidth | ~1.2 TB/s | Unknown; likely ≥1.5 TB/s | ~3.0 TB/s |
| Process Node | TSMC 7nm EUV | SMIC 7nm (N+2) | TSMC 4N (5nm-class) |
| Transistors | Not disclosed | Unknown; likely >100B if multi-chip | ~80 billion |
| Die Size | 1228 mm² (compute die + interfaces) | Likely multi-die module; TBD | ~814 mm² |
| Thermal Design Power | 350 W | Estimated 350–450 W per chip | 700 W (SXM5 form factor) |
| Availability | Commercialized (China/elsewhere) | Expected China-only (mid-2025) | Global (limited by U.S. export rules) |
| Software Ecosystem | MindSpore/CANN, Huawei AI stack | MindSpore/CANN, domestic frameworks | CUDA, cuDNN, TensorRT, broad ML ecosystem |
The table highlights the gaps: Nvidia’s H100 uses a more advanced process (5nm-class vs. China’s 7nm) and has far higher memory bandwidth and raw performance. However, Huawei’s Ascend chips have narrowed the gap rapidly. The 910C (a two-chiplet variant) is said to approach H100-class throughput in some Chinese benchmarks, see: en.wikipedia.org – though DeepSeek (a Chinese AI research firm) reports that it achieves only about 60% of the H100’s inference speed.
Huawei is pushing the 910D to exceed that – a bold claim given the technological handicap. Whether it can truly match Nvidia’s FP16/Tensor performance remains to be seen, but the design goals are clear: China needs a homegrown chip that can keep pace.
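A deliberately naive scaling sketch puts the required jump in perspective. If the reported figures hold (the two-chiplet 910C at ~60% of an H100) and chiplet performance added linearly – an optimistic assumption, since multi-die scaling loses efficiency to interconnect overhead – then:

```python
import math

h100_fraction_910c = 0.60  # reported: the two-chiplet 910C reaches ~60% of an H100
chiplets_in_910c = 2

per_chiplet = h100_fraction_910c / chiplets_in_910c  # each chiplet ~0.30 of an H100
chiplets_for_parity = math.ceil(1.0 / per_chiplet)   # 4 chiplets under linear scaling
speedup_needed = 1.0 / h100_fraction_910c            # ~1.67x per-package speedup

print(f"Per chiplet: {per_chiplet:.2f} of an H100; "
      f"parity needs {chiplets_for_parity} chiplets or a {speedup_needed:.2f}x speedup")
```

In other words, Huawei must either roughly double the chiplet count of the 910C or lift per-chiplet performance by about two-thirds – and in practice some mix of both, plus headroom for scaling losses.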
China’s AI and Semiconductor Strategy
The Ascend 910D is not an isolated project but part of a larger national strategy. In the context of the US-China “tech war”, Beijing has emphasized self-reliance in advanced technology. The government’s “dual circulation” and “Made in China 2025” initiatives call for domestic alternatives to foreign chips, especially after the Huawei bans.
As one commentary notes, Huawei’s 910D push is “bolstered by China’s broader drive for technological self-reliance amid escalating U.S.-China trade tensions”. In practice, this means heavy R&D investment, subsidies, and preferential procurement for homegrown semiconductors.
At the same time, the U.S. has steadily tightened export controls on chip manufacturing tools and designs. For example, the Commerce Department in 2023 expanded restrictions that choke off Chinese access to cutting-edge lithography (including pressure on ASML’s DUV sales). These controls force China to develop its own fabs (like SMIC) and tools, slowing progress. Huawei’s reliance on SMIC for 7nm “N+2” production is a direct consequence of these bans.
Without EUV lithography, SMIC’s nodes lag behind TSMC’s, which makes designing a chip to rival H100 inherently difficult. Nevertheless, Beijing sees this as a necessary challenge: any breakthrough in chips like the Ascend 910D would represent a significant win in the quest for technological sovereignty.
Expected Industry Adoption in China
Given the enforced isolation of Nvidia products, Chinese companies and governments are eager to adopt domestic AI accelerators. Even before 910D, Huawei’s Ascend processors have found traction in China’s tech giants. Wall Street Pit reports that firms like ByteDance and Baidu have already deployed the Ascend 910B/910C for AI training tasks, see: wallstreetpit.com. Baidu, for example, not only builds its own Kunlun AI chips, but also partners with Huawei on training large language models in its cloud. ByteDance (the TikTok parent) similarly has massive AI needs and has publicly experimented with Huawei’s hardware.
With the 910D, Huawei will market directly to data centers, cloud providers, and AI labs. Chinese hyperscalers – including Alibaba Cloud, Tencent Cloud, and Huawei Cloud – are expected to offer Ascend-based instances for model training. Cloud vendors already offer InfiniBand interconnects and NVSwitch-like fabrics for Nvidia GPUs; they may deploy similar infrastructure for Ascend cards.
In smart cities and surveillance, the focus is mostly on inference at the edge, but training new computer vision or NLP models can leverage Ascend accelerators in central clusters. Autonomous vehicle developers (e.g. Huawei’s own autonomous-driving unit, or companies like Pony.ai and Xpeng) could also use Ascend chips to simulate driving scenarios and train neural networks.
Beyond private tech firms, the Chinese government and research institutions have strong incentives to use Ascend. National supercomputing centers and AI institutes (e.g. those at the Chinese Academy of Sciences) will likely test the 910D in HPC clusters, especially if imports of Nvidia or Intel chips are restricted.
Universities running AI research programs may standardize on MindSpore and Ascend hardware, aligning with national policy. In short, anywhere high-end AI training is needed in China, the Ascend 910D stands to be a default option. (For perspective, Huawei reportedly expects to sell over 800,000 units of Ascend 910B/C combined in 2025 – a scale usually reserved for smartphone chips.)
Huawei’s AI Software Ecosystem
Hardware is only one piece of the puzzle. Huawei has invested heavily in a software ecosystem tailored to Ascend chips. At the core is the MindSpore framework, a deep learning platform Huawei introduced in 2019 and open-sourced in 2020. MindSpore is designed to be hardware-agnostic (it can run on CPUs, Ascend NPUs, and even GPUs), but it has first-class support for Ascend accelerators.
Under the hood, Huawei provides the Compute Architecture for Neural Networks (CANN), a library of optimized kernels (like Nvidia’s cuDNN) for running neural networks on Ascend. These tools ensure that TensorFlow, PyTorch, or MindSpore code can leverage the Da Vinci cores efficiently.
Each Ascend chip is tightly coupled with this stack. Huawei’s own documentation describes the full-stack portfolio as including “a new AI framework called MindSpore, a platform-as-a-service product called ModelArts, and a lower-level library called Compute Architecture for Neural Networks (CANN)”. In practice, Huawei runs developer conferences, online training, and extensive documentation to get researchers up to speed on this software.
Its public cloud (Huawei Cloud) offers ModelArts, a managed AI development platform that lets users train large models on Ascend clusters without worrying about low-level details. All of this is aimed at building a Chinese AI stack that parallels Nvidia’s, with an emphasis on integration into domestic data centers.
Still, it remains to be seen how quickly global AI developers will adopt this ecosystem. Many pre-trained models and libraries are written with CUDA in mind. Converting or recompiling them for Ascend chips takes effort, and tooling for profiling/debugging is less mature than Nvidia’s. Huawei is aware of this gap. They continue to improve compatibility layers (for example, versions of MindSpore can import PyTorch models), but the onus is on the developer.
Nevertheless, within China’s firewall, MindSpore and Ascend have an edge. The government and industry can collectively invest in optimizing popular AI workloads for Ascend (as they have with Kunpeng and Kylin CPU projects). In the long term, Huawei’s integrated hardware-software approach – sometimes called “hardware-software co-design” – could yield efficiency gains and attract a community of developers accustomed to the Chinese AI stack.

Challenges Ahead
Huawei’s task is daunting. Supply chain constraints are the first hurdle. SMIC’s 7nm “N+2” process is a few generations behind TSMC’s 5nm-class nodes, and it lacks EUV lithography, which typically means lower transistor density and lower achievable clock speeds. Huawei must compensate with more chiplets or clever architecture. The 910D reportedly involves multiple dies (much like the 910C’s two chiplets), which complicates design and hurts yields.
Any yield losses or defect rates will raise costs, and SMIC’s capacity is limited. In short, making enough chips with good performance will not be easy. A Wall Street Pit analysis notes that Huawei “has navigated production challenges, including reliance on China’s SMIC for 7nm fabrication and prior access to TSMC’s manufacturing before tighter restrictions”. That suggests the company knows the risks: they once had access to cutting-edge fabs, and now they must stretch what they have.
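The yield pressure behind the chiplet choice can be illustrated with a textbook Poisson defect model (the defect density here is an assumed figure for illustration, not SMIC data). Because chiplets can be tested individually before packaging, the fraction of usable silicon tracks the per-chiplet yield rather than the much lower monolithic-die yield:

```python
import math

def poisson_yield(area_mm2: float, defects_per_mm2: float) -> float:
    """Fraction of dies with zero killer defects under a Poisson defect model."""
    return math.exp(-area_mm2 * defects_per_mm2)

D0 = 0.002  # assumed killer-defect density (per mm^2) for a maturing 7nm process

monolithic = poisson_yield(1200, D0)  # one ~1200 mm^2 die: roughly 9% good
chiplet = poisson_yield(600, D0)      # each ~600 mm^2 chiplet: roughly 30% good
print(f"Monolithic yield ~{monolithic:.0%}, per-chiplet yield ~{chiplet:.0%}")
```

The trade-off is packaging cost and die-to-die interconnect bandwidth – exactly where multi-die designs like the 910C spend their complexity budget.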
Software and ecosystem maturity are another issue. As noted, most advanced AI research today is done with Nvidia-compatible stacks. Even if Huawei ports popular frameworks, performance tuning for Ascend can lag. Users will need stable drivers, updated compilers, and robust libraries. Nvidia’s CUDA and TensorRT ecosystems have years of development; Huawei’s CANN is newer. Bugs or missing features could slow adoption.
Also, international customers (if any) will be hesitant: will Ascend run publicly available ML benchmarks and models out of the box? These are valid concerns that Huawei is actively working to address.
Furthermore, Huawei’s chip isn’t alone. Chinese competitors are also developing AI accelerators: Alibaba’s Hanguang 800 chip (for inference) and Cambricon’s AI chips for data centers. Even Baidu has Kunlun for training. Huawei must prove that 910D can hold its own not just against Nvidia, but against emerging domestic rivals.
Finally, global geopolitics looms: the U.S. could respond to a successful 910D by tightening controls further (for example, banning EDA software exports or any component needed by SMIC). There’s also the possibility that even the 910D itself, if brought to foreign markets, might get caught in export screening.
Huawei’s Strategic Advantages
Despite the challenges, Huawei has powerful advantages at home. Government backing is immense. China has explicitly encouraged chip self-sufficiency through subsidies, national champions programs, and guaranteed procurement. Huawei – once blacklisted globally but still a national tech icon – is a natural beneficiary. Its projects in telecommunications and smartphones will likely integrate Ascend-based AI features, boosting domestic demand.
In fact, reports suggest the company is preparing to sell hundreds of thousands of Ascend cards in China’s coming AI server boom.
Another advantage is tight alignment with national strategy. Unlike foreign vendors, Huawei is fully within China’s economic plan. It can coordinate with state labs and defense projects that also need AI compute (e.g. simulation or reconnaissance applications). The security apparatus may trust a homegrown chip more than an American one, leading to faster adoption in classified programs.
There are also market incentives: Chinese cloud providers and state enterprises face pressure to use local technology where possible. In many ways, Huawei is competing on home turf with home-field advantage.
Finally, Huawei benefits from lessons learned over the last few years. Its experience with the Ascend 310, 910, 910B/C has built an engineering team that knows how to optimize Da Vinci cores, memory interfaces, and multi-chip modules. The fact that Ascend 910B and 910C already power real AI workloads is proof that the company can iterate and improve.
And with US sanctions being a constant reality, Huawei has been laser-focused on workaround solutions (for example, sourcing alternate parts or designing with wider safety margins). All these factors give Huawei a head start relative to a new entrant.

Global Ramifications if 910D Succeeds
If Huawei truly delivers on the 910D’s lofty goals, the effects would ripple across the tech world. First, Nvidia’s dominance would face a real challenge in Asia, the fastest-growing AI market. Chinese companies – from startups to giants – could train and deploy cutting-edge models without Nvidia’s chips, potentially at lower cost. That might slow down Nvidia’s growth and force it to innovate even faster (weaker pricing power, accelerated R&D on successor chips).
Second, an Ascend 910D that matches or beats the H100 would underscore China’s progress in semiconductors, encouraging more investment and perhaps even international partnerships (non-U.S. firms might explore deals with Huawei if legally allowed). Other countries watching this space (like India or EU nations) could decide to invest in local AI chip projects too, fearing dependence on any single supplier.
However, there are downsides. Technology decoupling could deepen: we might end up with two largely separate AI ecosystems (Nvidia/CUDA world vs. Huawei/MindSpore world). This fragmentation could slow overall AI progress (less sharing of benchmarks, models, and tools globally). On the security front, a more powerful Chinese AI infrastructure could accelerate advances in surveillance, autonomous weapons, or cyber warfare – an aspect that Western governments will monitor closely.
In short, the stakes are high. A successful Ascend 910D would signal that sanctions are not a permanent barrier, and that Chinese tech is resilient. It might even inspire the Chinese public and tech community, much as the Soviet Union’s space achievements were a source of national pride decades ago. But it could also trigger countermeasures, such as even tighter export restrictions or new rules around AI chip classification. The world will be watching every benchmark run and deployment of 910D systems closely.
Conclusion: The Road Ahead
Huawei’s Ascend 910D embodies China’s determination to carve out its own path in the AI era. If it works, it will provide the local computing muscle for China’s ambitious AI plans – from giant language models to advanced autonomous systems – without relying on a U.S. juggernaut. It will also prove the resilience of Huawei and Chinese engineering under pressure. Even if the 910D falls short of Nvidia’s H100 in raw numbers, the effort itself narrows the gap and forces a competitive response.
Looking forward, Huawei will continue iterating on AI chips (Ascend 920 is already announced for late 2025 with performance near Nvidia’s banned H20). Meanwhile, Nvidia will not sit still, and global AI research will push both companies onward. In the end, innovation thrives on competition. For now, Huawei’s journey with the 910D is a high-stakes chapter in the larger story of Chinese technology’s rise – one that will shape not just China’s AI industry, but potentially the global AI landscape.