Nvidia Delays Next-Gen Blackwell GPUs Due To Design Issues: Reports
While HGX server platforms with Nvidia’s B100 and B200 are ‘effectively being canceled outside of some initial lower volumes,’ the chip designer is making its upcoming flagship GB200 Superchip a priority while introducing a new GPU to help satisfy demand, a research firm says.
Nvidia is reportedly delaying the release of its next-generation Blackwell GPUs by three or more months due to technical issues with the underlying architecture.
The news was first reported Friday by tech publication The Information, which cited two unnamed sources, and it was corroborated two days later by a separate report from semiconductor analysis firm SemiAnalysis, which said the first Blackwell GPU design will now arrive in the fourth quarter instead of the third quarter.
[Related: AMD: Instinct AI Chip Sales Exceed Expectations As Microsoft Expands Consumption]
An Nvidia spokesperson didn’t directly comment on the reports but said the company is “on track” to ramp production of Blackwell GPUs in the second half of the year while demand for its current-generation Hopper GPUs, like the H200, is “very strong.”
The reports of delays arrived as Nvidia seeks to defend and grow its dominance of the AI computing space by making its chips, systems, software and service crucial to the development and deployment of AI applications. At the same time, rivals ranging from AMD to Amazon Web Services are trying to whittle away at Nvidia’s influence with their own AI chips.
According to estimates by SemiAnalysis, Nvidia’s B200, which will feature 192 GB of high-bandwidth memory (HBM), will begin shipping at lower volumes than the company originally planned by the end of the fourth quarter.
However, the firm indicated that Nvidia’s HGX server platforms for the B200, along with the B100 design originally slated to launch in the third quarter, are “effectively being canceled outside of some initial lower volumes.”
In their place, Nvidia plans to satisfy demand for “lower-end and mid-range AI systems” with a new Blackwell design called the B200A, according to SemiAnalysis. This design will use the same GPU die that is being used in a product meant to meet U.S. export restrictions in China. Supported by the HGX platform, the B200A will come with 144 GB of HBM3e memory and up to 4 TB/s of memory bandwidth. That puts it slightly above the H200 but far below the B200 in memory capacity, and below the H200 in memory bandwidth.
The GB200 Grace Blackwell Superchip, which is seen as Nvidia’s flagship product for the most demanding generative AI workloads, is now expected to start shipping in the first quarter of 2025 instead of this year’s fourth quarter, according to the firm.
Nvidia adjusted the release schedule and made changes to its Blackwell lineup because it’s “encountering major issues in reaching high-volume production,” SemiAnalysis said.
The main issue is the design of the Blackwell architecture and its reliance on an advanced semiconductor packaging technology called CoWoS-L from Taiwanese foundry giant TSMC, according to the firm. The technology is meant to enable high-speed communication between the compute and memory elements of the GPU, but it’s in limited supply as TSMC works on increasing production capacity, and Nvidia has been hitting obstacles with its implementation of the technology, SemiAnalysis said.
This is resulting in Nvidia getting fewer Blackwell chips than it anticipated, and it has prompted the company to focus its supply on the GB200 Superchips and introduce the B200A, which uses a less sophisticated version of CoWoS, as a way to serve customers who don’t want the power-hungry rack-scale systems that use the GB200, the firm said.