10 Big Nvidia GTC 2025 Announcements: Blackwell Ultra, Rubin Ultra, DGX Spark And More
With Nvidia CEO Jensen Huang asserting that the tech industry needs 100 times more computation than was anticipated a year ago due to the rise of reasoning models, the AI leader used the company’s GTC 2025 event to show how it will fulfill those needs with increasingly powerful AI computing platforms, networking gear and software.
Nvidia CEO Jensen Huang believes the tech industry needs substantially more computation than is currently available to handle the rise of reasoning models like DeepSeek-R1 and the agentic AI workloads they will power.
“The amount of computation we need at this point as a result of agentic AI, as a result of reasoning, is easily 100 times more than we thought we needed this time last year,” Huang said during his Tuesday keynote at the company’s GTC 2025 event in San Jose, Calif., which attracted more than 25,000 attendees across several industries.
[Related: 12 New AI-Focused Technologies For Nvidia AI Ecosystem At Nvidia GTC 2025]
With reasoning models significantly increasing the number of tokens—or words and other kinds of characters—used for queries as well as answers when compared to traditional large language models, Huang presented at GTC how Nvidia will fulfill those needs with increasingly powerful AI computing platforms, networking gear and software.
Those platforms include the newly revealed GB300 NVL72 platform, which will be powered by Nvidia’s next-generation Blackwell Ultra GPU and launch in the second half of this year as the successor to the GB200 NVL72 that started shipping a few months ago.
Huang also provided details about the AI computing platforms that will follow, with the Vera Rubin NVL144 platform hitting the market in the second half of 2026, the Rubin Ultra NVL576 platform arriving in the second half of 2027 and a next-generation platform using the new Feynman GPU coming out sometime in 2028.
But Nvidia didn’t just focus on the fastest, most power-hungry platforms that will go into future AI data centers. Huang also announced the air-cooled, Blackwell Ultra-based B300 NVL16 platform for data centers as well as a new generation of Spectrum-X Ethernet and Quantum-X InfiniBand networking platforms that will use silicon photonics to reduce energy consumption and, as a result, pave the way for larger GPU clusters.
During Huang’s roughly two-hour presentation, he also showed off the upcoming DGX Spark and DGX Station PCs for AI developers as well as the new RTX Pro Blackwell GPUs, which will combine AI and graphics prowess for PCs and servers.
The Santa Clara, Calif.-based company also used GTC to introduce new software and models to improve the way AI applications are developed and deployed. These new offerings included Nvidia Dynamo, an open-source inference software framework that can significantly increase the performance of reasoning models on Nvidia GPUs.
What follows are these and other big announcements Nvidia made at GTC 2025, which also included a reference design for AI-focused data center storage solutions and the 14 winners of the 2025 Americas Nvidia Partner Network awards.
Blackwell Ultra: New GPU, DGX Systems In 2025
One of the main headlines was the reveal of Nvidia’s new Blackwell Ultra GPU architecture, which the company said is built for AI reasoning models like DeepSeek-R1 and can significantly increase the revenue AI providers stand to generate.
Compared to Blackwell, which started shipping in systems a few months ago, Blackwell Ultra increases the maximum HBM3e high-bandwidth memory capacity by 50 percent to 288 GB and boosts 4-bit floating-point (FP4) inference performance by the same margin, Nvidia said.
The company said Blackwell Ultra-based products from technology partners are set to debut in the second half of 2025. These partners include OEMs such as Dell Technologies, Cisco, Hewlett Packard Enterprise, Lenovo and Supermicro as well as cloud service providers like Amazon Web Services, Google Cloud, Microsoft Azure and Oracle Cloud Infrastructure.
With Blackwell Ultra for data centers, Nvidia is introducing two platforms.
The liquid-cooled GB300 NVL72 platform will consist of 72 Blackwell Ultra GPUs and 36 Grace CPUs and feature improved energy efficiency and serviceability, according to Nvidia. The platform achieves 1.1 exaflops of dense FP4 computation, and it comes with 20 TB of high-bandwidth memory as well as 40 TB of fast memory. The platform’s NVLink bandwidth can top out at 130 TBps while networking speeds reach 14.4 TBps.
The air-cooled HGX B300 NVL16 platform comes with 16 Blackwell Ultra GPUs. The company said it provides 11 times faster inference on large language models, seven times more compute and four times larger memory compared to a Hopper-based platform.
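A quick back-of-the-envelope check shows how the GB300 NVL72’s rack-level figures above decompose. The per-GPU numbers in this sketch are derived from Nvidia’s quoted specs, not official figures:

```python
# Sanity check of the GB300 NVL72 figures Nvidia quoted.
# Per-GPU values below are derived, not official.

GPUS_PER_RACK = 72
RACK_FP4_EXAFLOPS = 1.1      # dense FP4, per Nvidia
GPU_HBM_GB = 288             # Blackwell Ultra HBM3e capacity, per Nvidia

fp4_per_gpu_pf = RACK_FP4_EXAFLOPS * 1000 / GPUS_PER_RACK
print(f"Dense FP4 per GPU: ~{fp4_per_gpu_pf:.1f} petaflops")  # ~15.3 PF

total_hbm_tb = GPUS_PER_RACK * GPU_HBM_GB / 1000
print(f"72 x 288 GB = {total_hbm_tb:.1f} TB")  # ~20.7 TB, matching the ~20 TB rack figure
```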
Nvidia plans to provide these platforms in new DGX SuperPod designs, with eight units of the GB300 NVL72 making up one and the HGX B300 NVL16 serving as the basis for another. The company plans to make the B300-based DGX SuperPod design available for the company’s modular MGX server racks and enterprise data centers.
576-GPU Rack-Scale Platform ‘Rubin Ultra’ Set For 2027
Huang said the company plans to release a rack-scale architecture for AI data centers that will connect 576 next-generation GPUs in the second half of 2027.
Huang made the disclosure in a road map update during his GTC keynote on Tuesday after sharing details of Nvidia’s next-generation Blackwell Ultra GPU and the associated GB300 NVL72 and B300 NVL16 platforms that will launch in the second half of 2025.
As Nvidia announced last year, the company plans to follow up Blackwell Ultra with a brand-new GPU architecture in 2026 called Rubin, which will use HBM4 high-bandwidth memory for the first time. This will coincide with several other new chips, including a follow-up to Nvidia’s Arm-based Grace CPU called Vera.
In his keynote this week, Huang provided more details about Rubin, saying it will anchor the liquid-cooled Vera Rubin NVL144 platform. Debuting in the second half of 2026, the platform will use Nvidia’s new, sixth-generation NVLink chip-to-chip interconnect to connect 144 Rubin GPUs with Vera CPUs, which will sport 88 custom Arm cores.
The Vera Rubin NVL144 platform will have the ability to hit 3.6 exaflops of 4-bit floating-point (FP4) inference performance and 1.2 exaflops of 8-bit floating-point (FP8) training performance, which Nvidia said will make it 3.3 times faster than the new GB300 NVL72.
The platform will feature 13 TBps of HBM4 memory bandwidth and 75 TB of fast memory, a 60 percent increase from the GB300 NVL72. The NVLink 6 bandwidth will hit 260 TBps, double that of the GB300 NVL72. The ConnectX-9 SmartNIC will hit 28.8 TBps, also double.
Rubin is made up of two reticle-sized GPU dies, similar to Blackwell, but it will be capable of 50 petaflops of FP4 computation and will feature 288 GB of HBM4 memory, the same capacity as Blackwell Ultra. However, whereas Nvidia counted each dual-die Blackwell package as one GPU, it will count each of the two dies in a Rubin package as an individual GPU.
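That counting convention explains the NVL144 name: the rack keeps the same 72-package layout as the GB300 NVL72, but each package now counts as two GPUs. A quick sketch reconciling the figures (the per-die number is derived, not from Nvidia):

```python
# How the NVL144 name squares with Nvidia's new GPU-counting convention.
# Figures are from Nvidia's keynote; the per-die value is derived.

DIES_PER_PACKAGE = 2
PACKAGES_PER_RACK = 72        # same physical layout as GB300 NVL72
FP4_PF_PER_PACKAGE = 50       # per Nvidia

gpus = DIES_PER_PACKAGE * PACKAGES_PER_RACK
rack_fp4_ef = PACKAGES_PER_RACK * FP4_PF_PER_PACKAGE / 1000

print(f"GPUs (dies) per rack: {gpus}")                # 144 -> 'NVL144'
print(f"Rack FP4 inference: {rack_fp4_ef} exaflops")  # 3.6, matching Nvidia
print(f"FP4 per die: {FP4_PF_PER_PACKAGE / DIES_PER_PACKAGE} petaflops")  # 25 PF (derived)
```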
Nvidia plans to follow up the Vera Rubin NVL144 with the liquid-cooled Rubin Ultra NVL576 in the second half of 2027. While it will keep the Vera CPU, the platform will come with a new GPU package called Rubin Ultra that will expand in size to four reticle-sized GPUs, featuring 1 TB of HBM4e memory and 100 petaflops of FP4 performance.
As the name implies, the Rubin Ultra NVL576 will connect 576 Rubin Ultra GPUs with Vera CPUs using a seventh generation of Nvidia’s NVLink. It will be capable of 15 exaflops of FP4 inference performance and 5 exaflops of FP8 training performance, which Nvidia said will make it 14 times faster than the GB300 NVL72 platform.
The platform will feature 4.6 PBps of HBM4e memory bandwidth and 375 TB of fast memory, eight times more than the GB300 NVL72 offers. The NVLink 7 bandwidth will run 12 times faster at 1.5 PBps, while the ConnectX-9 SmartNIC will hit 115.2 TBps, eight times greater than the GB300 NVL72.
On top of providing those details, Huang disclosed that Nvidia plans to deliver a next-generation platform in 2028 with a new GPU called Feynman, which will feature a new high-bandwidth memory format. The platform will also feature Vera CPUs, a next-generation NVLink interconnect, eighth-generation NVSwitches and ConnectX-10 SmartNICs while using 204-TBps Spectrum 7 Ethernet switches.
The road map reflects the one-year release cadence Nvidia put into action for its data center GPUs and associated platforms last year with the launch of its H200 GPU, which was followed with the Blackwell-based platforms that started shipping a few months ago.
Spectrum-X, Quantum-X Silicon Photonics Switches
Nvidia revealed plans to release new Spectrum-X Ethernet and Quantum-X InfiniBand networking switches that use silicon photonics to lower energy consumption and, as a result, enable larger-scale GPU clusters.
The company said the Quantum-X Photonics InfiniBand switches will launch later this year while the Spectrum-X Photonics Ethernet switches will arrive next year from “leading infrastructure and system vendors.”
By integrating photonics into the switch silicon, the new switches require a quarter as many lasers as traditional pluggable switches that use optical transceivers, according to Nvidia. This cuts energy consumption by 3.5 times and improves signal integrity by 63 times.
Gilad Shainer, senior vice president of networking at Nvidia, said this also results in 10 times greater resilience for the network because of the improved signal integrity as well as the fewer components required.
In addition, getting rid of the need for optical transceivers will result in 30 percent quicker data center buildouts, according to Shainer.
The significant reduction in energy consumption enabled by silicon photonics means that data centers can support three times more GPUs than those relying on traditional pluggable optics at the same power envelope, the executive said.
“We can bring more GPUs under the same power envelope, essentially enabling further scale and increasing compute density,” Shainer added.
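Nvidia did not publish the underlying power math behind those claims, but a rough sketch with assumed per-component wattages illustrates why shrinking the optics’ power draw frees budget for more GPUs. All of the wattages and counts below are assumptions for the illustration, not Nvidia figures:

```python
# Rough illustration of the power argument, NOT Nvidia's published math.
# All per-component wattages and counts below are assumptions.

GPUS = 100_000
TRANSCEIVERS_PER_GPU = 2     # assumption: scale-out links per GPU
WATTS_PER_TRANSCEIVER = 30   # assumption: typical 800G pluggable optic
PHOTONICS_SAVINGS = 3.5      # Nvidia's claimed reduction in optics power

pluggable_mw = GPUS * TRANSCEIVERS_PER_GPU * WATTS_PER_TRANSCEIVER / 1e6
photonics_mw = pluggable_mw / PHOTONICS_SAVINGS

print(f"Pluggable optics: {pluggable_mw:.1f} MW")                  # 6.0 MW
print(f"Co-packaged optics: {photonics_mw:.1f} MW")                # ~1.7 MW
print(f"Freed for compute: {pluggable_mw - photonics_mw:.1f} MW")  # ~4.3 MW
```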
Nvidia Dynamo: Accelerating Reasoning Models
Nvidia announced an open-source inference software framework called Nvidia Dynamo, which it said will enable AI reasoning models to run faster and scale efficiently across large data centers “at the lowest cost and with the highest efficiency.”
Calling Dynamo the successor to its Triton Inference Server software, Nvidia said the new software framework is meant to maximize the amount of revenue AI application providers can earn from tokens generated by reasoning models.
When the software framework is applied to a large cluster of GB200 NVL72 racks running the DeepSeek-R1 model, the number of tokens generated per GPU increases by more than 30 times, according to Nvidia executive Ian Buck.
Dynamo accomplishes this first by “splitting the processing of the input tokens, the query […] and the reasoning tokens that happen in the back end,” Buck said.
The software framework then optimizes the way these partitions are processed in parallel, with the input tokens taking advantage of the GPU’s 4-bit floating-point throughput and the output tokens scaling with the high-speed bandwidth and communication of the NVLink interconnect linking GPUs, he added.
Nvidia said Dynamo can also support disaggregated serving, which consists of assigning the “different computational elements” of large language models to different GPUs in the same data center. This will result in higher throughput and thus faster responses for users querying large language models.
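As a rough illustration of the disaggregation idea, the toy sketch below routes the prefill (input-token) stage and the decode (output-token) stage to separate worker pools so each can be scaled and optimized independently. It mimics the concept only and is not Dynamo’s actual API:

```python
# Toy sketch of disaggregated serving: prefill and decode run on separate
# worker pools, standing in for separate GPU groups. Not Dynamo's real API.

from concurrent.futures import ThreadPoolExecutor

prefill_pool = ThreadPoolExecutor(max_workers=4)  # compute-bound stage
decode_pool = ThreadPoolExecutor(max_workers=8)   # bandwidth-bound stage

def prefill(prompt: str) -> list[str]:
    # Stand-in for processing a query's input tokens on FP4-heavy GPUs.
    return prompt.split()

def decode(kv_cache: list[str]) -> str:
    # Stand-in for generating reasoning/output tokens on NVLink-linked GPUs.
    return f"<answer derived from {len(kv_cache)} input tokens>"

def serve(prompt: str) -> str:
    kv_cache = prefill_pool.submit(prefill, prompt).result()
    return decode_pool.submit(decode, kv_cache).result()

print(serve("Why does disaggregated serving raise throughput?"))
```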
Nvidia plans to make Dynamo available in Nvidia NIM microservices, and it will also be supported by a future release of the Nvidia AI Enterprise platform. Dynamo supports the PyTorch, SGLang, Nvidia TensorRT-LLM and vLLM software libraries.
DGX Spark Mini PC For AI Developers
Nvidia said it plans to launch the DGX Spark mini desktop PC later this year alongside OEM partners who plan to release their own versions.
Intended for AI developers, the company said DGX Spark will feature its GB10 Grace Blackwell Superchip to deliver up to 1,000 trillion operations per second of AI computation for the fine-tuning and inferencing of reasoning models.
The small-form-factor PC will also feature 128 GB of unified coherent system memory, which will support AI models with up to 200 billion parameters.
To support models with up to 405 billion parameters, two DGX Spark systems can be connected with a cable through their ConnectX networking ports.
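Those parameter counts are consistent with model weights quantized to 4 bits, at half a byte per parameter. A quick sketch shows the math; the quantization assumption is ours, not Nvidia’s stated calculation:

```python
# Why 128 GB can hold a ~200B-parameter model: at 4-bit (FP4) weights,
# each parameter takes half a byte. The quantization assumption is ours;
# the memory sizes and parameter counts are Nvidia's.

BYTES_PER_PARAM = 0.5  # assumption: FP4-quantized weights

def model_gb(params_billion: float) -> float:
    return params_billion * 1e9 * BYTES_PER_PARAM / 1e9

print(f"200B model: ~{model_gb(200):.1f} GB vs 128 GB on one DGX Spark")
print(f"405B model: ~{model_gb(405):.1f} GB vs 256 GB on two linked units")
```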
Asus, Dell Technologies, HP Inc. and Lenovo plan to release their own versions of DGX Spark under different names, such as the Dell Pro Max with GB10.
DGX Station Workstation PC With GB300 Superchip
Nvidia revealed a new version of its DGX Station workstation PC that will be powered by a desktop version of its GB300 Grace Blackwell Superchip.
Calling it the “ultimate desktop computer for the AI era,” Nvidia said the DGX Station will be capable of 20 petaflops of AI performance, and it will come out later this year in different versions from several OEMs, including Dell Technologies, HP Inc., Supermicro and Asus.
The DGX Station will also feature 784 GB of unified system memory and the Nvidia ConnectX-8 SuperNIC, which enables networking speeds of up to 800 Gbps for connecting multiple DGX Stations.
RTX Blackwell Pro GPUs For PCs, Servers
Nvidia announced RTX Pro Blackwell GPUs for PCs, laptops and servers with “groundbreaking AI and graphics performance” that will “redefine visualization, simulation and scientific computing for millions of professionals.”
Compared with the Ada Lovelace-based RTX Pro GPUs, the new Blackwell models come with an improved streaming multiprocessor that delivers 50 percent faster throughput and new neural shaders that “integrate AI inside of programmable shaders” for AI-augmented graphics, according to Nvidia.
They also sport fourth-generation RT Cores that can deliver up to double the ray-tracing performance and fifth-generation Tensor Cores that enable up to 4,000 trillion AI operations per second and add support for FP4 precision, as well as larger, faster GDDR7 memory.
The RTX Pro GPUs for laptops, spanning from the high-end 5000 series to the low-end 500 series, will support up to 24 GB of GDDR7 memory with error correction, while the desktop models, spanning from the 5000 series to the 4000 series, will max out at 96 GB. The RTX Pro 6000 series for data centers will also top out at 96 GB.
Nvidia said the RTX Pro 6000 data center GPU will be made available soon in server configurations from OEMs such as Cisco Systems, Dell, Hewlett Packard Enterprise and Lenovo. The GPU will become available in instances from Amazon Web Services, Google Cloud, Microsoft Azure and CoreWeave later this year.
The RTX Pro desktop GPUs, on the other hand, are expected to debut in April with availability from distributors PNY and TD Synnex. They will then arrive in PCs the following month from OEMs and system builders like Boxx, Dell, HP, Lambda and Lenovo.
The RTX Pro laptop GPUs will land later this year from Dell, HP, Lenovo and Razer.
Nvidia AI Data Platform, Storage Certification Program
Nvidia announced a new, customizable reference design for the enterprise storage platforms that data centers use to run AI query agents.
Called the Nvidia AI Data Platform, the reference design is meant for storage providers whose products are approved by the Nvidia-Certified Systems program, which was expanded to include storage solutions for AI infrastructure.
Part of the Nvidia AI Data Platform is the new Nvidia AI-Q Blueprint, which is designed for the development of AI agents and uses Nvidia NeMo Retriever microservices to speed up data extraction and retrieval by up to 15 times on Nvidia GPUs.
The AI-Q Blueprint has the ability to “access large-scale data quickly and process various data types, including structured, semi-structured and unstructured data from multiple sources, including text, PDF, images and video,” according to the company.
Nvidia said it is collaborating with several data platform and storage providers, including DDN, Dell Technologies, Hewlett Packard Enterprise, Hitachi Vantara, IBM, NetApp, Nutanix, Pure Storage, Vast Data and Weka.
Solutions created with the Nvidia AI Data Platform are expected to become available this month from these data platform and storage providers.
Nvidia Releases Open Reasoning Models For Agentic AI
Nvidia launched new Llama Nemotron reasoning models that are designed for advanced AI agents that can work individually or as a team to “solve complex tasks.”
These open models are based on Meta’s Llama models, and they have been enhanced by Nvidia in the post-training process to “improve multistep math, coding, reasoning and complex decision-making,” according to the company.
“This refinement process boosts accuracy of the models by up to 20% compared with the base model and optimizes inference speed by 5x compared with other leading open reasoning models,” Nvidia said. “The improvements in inference performance mean the models can handle more complex reasoning tasks, enhance decision-making capabilities and reduce operational costs for enterprises.”
Companies that plan to use Nvidia’s Llama Nemotron reasoning models include Accenture, CrowdStrike, Deloitte, Microsoft and ServiceNow.
The new models will be made available as Nvidia NIM microservices in Nano, Super and Ultra sizes, which vary in accuracy and throughput. While they will be free for development, testing and research, they will require the Nvidia AI Enterprise commercial software platform when deployed into live production.
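Because NIM microservices expose an OpenAI-compatible API, invoking one of these models could look roughly like the sketch below. The endpoint URL and model identifier here are placeholders, not confirmed identifiers:

```python
# Hypothetical sketch of calling a Llama Nemotron NIM microservice.
# NIM endpoints are OpenAI-API-compatible; the URL and model id below
# are placeholders, not confirmed identifiers.

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumption: locally hosted NIM container
    api_key="not-needed-for-local-nim",
)

response = client.chat.completions.create(
    model="llama-nemotron-super",  # placeholder model id
    messages=[{"role": "user", "content": "Plan a three-step data migration."}],
)
print(response.choices[0].message.content)
```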
Nvidia Names Winners In 2025 Americas Partner Awards
Nvidia named the 14 recipients of its 2025 Americas Nvidia Partner Network awards, and the winners represent solution providers who “deeply understand” the company’s full-stack AI platform, according to the region’s channel chief.
Standing out among Nvidia’s 500 North American channel partners, the recipients include multi-year winners like Mark III Systems and World Wide Technology—both of which won two awards this year—as well as newcomers like Ahead and Advizex.
[Related: Meet Nvidia’s 14 Top Americas Partners Who ‘Deeply Understand’ Its Full-Stack AI Platform]
In an exclusive interview with CRN, Nvidia Americas Channel Chief Craig Weinstein said the company chose the winners based on several criteria: industry and go-to-market alignment, innovation and expertise, customer success, training and certification, marketing and go-to-market activities, ecosystem partnership and sales performance.
“I would say almost all of the award winners have one commonality, which is they’re going to market with Nvidia’s full stack, our platform,” he said.
