Nvidia Seeks To Turbocharge AI PC Development With GeForce RTX 50 GPUs
While Nvidia positioned the Blackwell-based GeForce RTX 50 GPUs as a significant upgrade for PC gamers at CES 2025, the AI computing giant is also hoping to lure AI developers and content creators with a bevy of new hardware and software capabilities.
Nvidia has revealed its much-anticipated GeForce RTX 50 GPUs for desktops and laptops, relying on the same Blackwell architecture at the center of its new AI data center chips to bring forth PC advancements in graphics, content creation and productivity.
At its CES 2025 keynote Monday, Nvidia called the GeForce RTX 50 series the “most powerful” consumer GPUs “ever created,” positioning them as a significant upgrade for gamers by claiming they will offer up to two times faster graphics performance than the previous generation when using the new AI-powered DLSS 4 image upscaling feature.
[Related: Opinion: Why Nvidia, MediaTek May Enter The PC CPU Market Soon]
But the AI computing giant is also hoping to lure AI developers and content creators with a bevy of new hardware and software capabilities. For instance, the GPU family’s new fifth-generation Tensor Cores pack a “massive amount of AI processing horsepower” to run AI models two times faster using less graphics memory when taking advantage of the newly supported 4-bit floating point (FP4) format, according to Nvidia.
On the desktop side, the flagship GPU, the 32-GB GeForce RTX 5090, will cost $1,999 when it becomes available along with the $999, 16-GB GeForce RTX 5080 on Jan. 30. The $749, 16-GB GeForce RTX 5070 Ti and $549, 12-GB GeForce RTX 5070 will become available in February. The GPUs will be available from Nvidia and add-in board partners as well as in desktops from system builders such as Falcon Northwest and Maingear.
Laptops equipped with RTX 5090, RTX 5080 and RTX 5070 Ti GPUs will debut in March while laptops with the RTX 5070 will launch in the following month from several OEMs, including Acer, Asus, Dell Technologies, HP Inc., Lenovo and MSI.
Nvidia Vows ‘Pipeline’ Of Microservices For AI PC Development
To foster development of AI PC applications on the new RTX 50 series and other recent generations of GeForce GPUs, Nvidia said it plans to release a “pipeline” of Nvidia NIM microservices and Nvidia AI Blueprints that use first- and third-party models to enable use cases ranging from PDF extraction, computer vision and speech to image generation, large language models and embedding models for retrieval-augmented generation.
These models include Nvidia’s newly released Llama Nemotron family of models, which are versions of Meta’s Llama models that have been optimized to aid with the development of agentic AI use cases ranging from instruction following and chat to coding and math.
To demonstrate how NIMs can be used to build AI agents and assistants, Nvidia plans to release a vision-enabled PC avatar called Project R2X that can read and summarize documents, fetch information and “assist with desktop apps and video conference calls.” It will also be able to connect with cloud AI services such as OpenAI’s GPT-4o.
Nvidia Highlights Content Creation Enhancements
When it comes to content creation, Nvidia said the new RTX 50 GPUs come with new hardware features to boost video editing and 3-D rendering workloads on top of new software capabilities for image generation as well as voice and video communication.
For video editing, Nvidia said the RTX 50 series comes with new video encoders and decoders that provide a “generational leap” in capabilities with support for the 4:2:2 pro-grade color format, the multi-view extension of HEVC (high-efficiency video coding) for 3-D and virtual reality video as well as the new AV1 Ultra High Quality mode.
For 3-D rendering, Nvidia said the GPUs come with fourth-generation RT Cores, which enable applications to run 40 percent faster. The chips also speed up rendering through DLSS 4, the fourth generation of Nvidia’s AI-based image upscaling technology, which introduces Multi Frame Generation to increase frame rates.
For image generation, Nvidia highlighted how the new FP4 support in RTX 50 GPUs will enable models used for such purposes to take up significantly less VRAM compared to the default 16-bit floating point (FP16) format.
As an example, the company said the FLUX.1 [dev] model from Black Forest Labs will need less than 10 GB of memory with FP4, which means it can run on each of the four new RTX 50 GPUs since they range from 12 GB to 32 GB of VRAM. By contrast, the FLUX.1 [dev] model running with FP16 would require more than 23 GB of memory, which restricts it to the RTX 4090 and professional GPUs from the last generation.
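The memory savings come down to bits per weight. A rough back-of-the-envelope estimate, assuming FLUX.1 [dev]’s published figure of roughly 12 billion parameters and counting only the weights (activations and framework overhead add several more gigabytes, which is why Nvidia’s FP16 figure exceeds 23 GB):

```python
def model_vram_gb(params_billion: float, bits_per_weight: int) -> float:
    """Rough VRAM needed just for a model's weights, in gigabytes."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

# FLUX.1 [dev] has roughly 12 billion parameters.
fp16_gb = model_vram_gb(12, 16)  # ~24 GB: too big for 16-GB and 12-GB cards
fp4_gb = model_vram_gb(12, 4)    # ~6 GB: fits every RTX 50 desktop GPU
```

Quartering the bits per weight quarters the weight footprint, which is how a model that previously demanded a 24-GB flagship card fits under the 10 GB Nvidia cites.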
Nvidia said it plans to offer a NIM microservice based on FLUX.1 [dev], which will be made available in an Nvidia AI Blueprint for 3-D guided image generation next month.
For voice and video communication, Nvidia said it plans to add two new features to the Nvidia Broadcast app for AI-enhanced video and voice effects. The first feature, Studio Voice, will make a user’s microphone sound like a high-quality microphone while the second, Virtual Key Light, can relight a “subject’s face to deliver even coverage.” These will initially require an RTX 4080 or higher when they become available in February.
How The RTX 50 Series Compares To The 40 Series
The flagship GPU in the RTX 50 series, the RTX 5090, is made up of 92 billion transistors, up 21 percent from the 76 billion transistors of its predecessor, the RTX 4090, which debuted in 2022 using Nvidia’s previous-generation Ada Lovelace architecture.
Across the RTX 50 series, the GPUs come with new fourth-gen RT Cores and fifth-gen Tensor Cores as well as a streaming multiprocessor that “has been updated with more processing throughput and a tighter integration with the Tensor Cores in order to optimize the performance of neural shaders,” according to Nvidia.
The RTX 5090 comes with 32 GB of GDDR7 memory and 21,760 CUDA cores, up from the 24 GB of GDDR6X memory and 16,384 CUDA cores of the RTX 4090. The GPU’s boost and base clock frequencies are 2.41GHz and 2.01GHz, respectively, which are lower than the 2.52GHz and 2.23GHz clock speeds of the RTX 4090.
As for performance, the RTX 5090’s Tensor Cores are capable of hitting 3,352 trillion operations per second (TOPS) in AI computing performance while the RT Cores can achieve 318 trillion floating-point operations per second (TFLOPS). These figures are 2.5 times and about 66 percent faster, respectively, than the 1,321 AI TOPS achieved by the RTX 4090’s Tensor Cores and the 191 TFLOPS of the GPU’s RT Cores.
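A quick sanity check on those generational ratios, computed from the spec figures quoted above:

```python
# Spec figures quoted by Nvidia for the two flagship GPUs.
rtx5090_ai_tops, rtx4090_ai_tops = 3352, 1321
rtx5090_rt_tflops, rtx4090_rt_tflops = 318, 191

ai_speedup = rtx5090_ai_tops / rtx4090_ai_tops      # ~2.54x, matching the "2.5 times" claim
rt_speedup = rtx5090_rt_tflops / rtx4090_rt_tflops  # ~1.66x, i.e. about 66 percent faster
```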
Nvidia did not disclose Shader Core performance figures for the RTX 50 series, though it provided them for the previous generation.
The total graphics power required by the RTX 5090 is 575 watts, up about 28 percent from the 450 watts needed for the RTX 4090. The lowest-end GPU, the RTX 5070, requires 250 watts, which is 25 percent higher than the 200 watts needed for the RTX 4070.
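The power deltas are straightforward percentage increases over the prior generation’s figures; a minimal check:

```python
def pct_increase(new_watts: float, old_watts: float) -> float:
    """Percentage increase in total graphics power between generations."""
    return (new_watts - old_watts) / old_watts * 100

flagship_delta = pct_increase(575, 450)  # RTX 5090 vs. RTX 4090: ~27.8%
low_end_delta = pct_increase(250, 200)   # RTX 5070 vs. RTX 4070: 25.0%
```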