Forrest Norrod On How AMD Is Fighting Nvidia With ‘Significant’ AI Investments
In an interview with CRN, AMD executive Forrest Norrod talks about how the company is “dramatically” increasing investments in its Instinct data center GPUs to compete with Nvidia and when it plans to place a greater focus on enabling channel partners to sell Instinct-based systems for AI workloads.
AMD’s top data center executive said the chip designer had “no choice” but to “dramatically” increase its investments in AI and release accelerator chips at a faster cadence in the face of unrelenting generative AI innovation and Nvidia’s aggressive strategy.
Forrest Norrod, executive vice president and general manager of AMD’s data center solutions business unit, made the comments in an interview with CRN a day after the Santa Clara, Calif.-based company announced on June 3 that it will release a new data center GPU every year instead of every two years. This new cadence will begin in the fourth quarter with the 288-GB Instinct MI325X, which offers significantly more high-bandwidth memory than Nvidia’s H200.
[Related: Analysis: As Nvidia Takes AI Victory Lap, AMD Doubles The Trouble For Intel]
What’s driving AMD to move faster is a relentless pace of innovation in generative AI models as well as optimizations in the math operations that are key to their computation.
“This market is moving so fast, there's so much innovation and invention in the models and the math that we've got to maintain a very fast cadence,” said Norrod.
Another major factor behind AMD’s decision to commit what Norrod called “significant” resources to its AI efforts is Nvidia, which last fall became the first to announce a plan to release new data center accelerator chips every year instead of every two years. The rival detailed its expanded road map a day before AMD’s announcement, disclosing new GPU architectures set to debut in 2025, 2026 and 2027 following the expected debut of Blackwell later this year.
Norrod said he believes Nvidia decided to speed up its accelerator chip release cadence in response to increasing competition, particularly from AMD, which released the Instinct MI300X in December as a data center GPU that competes with Nvidia’s H100 and H200 chips.
“Nvidia, quite candidly, stepped on their accelerator pedal, and when they saw that—'holy crap. AMD has got a real part; they're going to be a real competitor’—they very deliberately stepped on the accelerator trying to block us and everybody else out. And so we're responding to that as well,” he said.
With AMD getting support for MI300 chips from major server OEMs and cloud service providers such as Dell Technologies, Hewlett Packard Enterprise, Supermicro and Microsoft Azure, the company has forecast $4 billion in data center GPU revenue this year.
While that represents a fraction of the $19.4 billion Nvidia made from data center compute products in the first quarter alone, Norrod said the company’s new one-year cadence for products will allow it to catch up with the schedule for Nvidia’s flagship GPUs.
“We think we are closing the gap, narrowing the gap between the introduction of Nvidia's part and the introduction of our same generation part,” he said.
But AMD’s strategy to combat Nvidia’s AI computing dominance isn’t just about delivering faster performance or higher bandwidth. Norrod said the company’s strategy also relies on the embrace of open software and open standards, which will prevent customers from getting locked into working with one vendor.
“When you’re locked in and there’s no alternative or you’re not able to avail yourself of an alternative, the costs are whatever the supplier chooses to charge. And I would look at Nvidia's financials, it's been remarkable to see the progress they’ve made on both their revenue as well as the margin. So I think there's that obvious issue,” he said.
While AMD has so far been focused on shipping the Instinct MI300X to its largest customers, including cloud service providers and OEMs, the company plans to place a greater emphasis on enabling commercial channel partners to sell servers equipped with the accelerator chip in the second half of this year, according to Norrod.
“You’ll see us shifting very hard to enable enterprise through multiple channels this year, including the channel,” he said.
C.R. Howdyshell, CEO of Cleveland, Ohio-based systems integrator Advizex, told CRN that he sees a lot of promise in how AMD Instinct accelerator chips can help grow his business, based on a recent commitment his company received from AI-focused cloud service provider Hot Aisle to buy a cluster of Dell PowerEdge XE9680 servers powered by MI300X GPUs.
“I think we’re going to see more opportunities just like this because […] there are customers [who] want to look at these opportunities to differentiate and think about what makes them different from a customer perspective as well as a go-to market. They can get very good, distinct performance from AMD—enough to make a bet on the business,” he said.
Alexey Stolyar, CTO of Northbrook, Ill.-based systems integrator International Computer Concepts, told CRN that while AMD has made the right move in accelerating its data center road map, and customers are looking for alternatives for pricing and availability reasons, demand for the company’s Instinct GPUs isn’t as high as he thinks “it should be.”
Part of the issue is that Nvidia’s current-generation GPUs like the H100 are becoming easier to obtain after several months of shortages, according to Stolyar. But he also sees customers continuing to prefer Nvidia because of its pervasive software stack.
“People are just so familiar with Nvidia tools that they don't see other tools,” he said.
What follows is a transcript of CRN’s in-depth interview with Norrod, who discussed how the company has “dramatically” increased its investments in AI to compete with Nvidia, how it plans to lean more heavily on channel partners to sell Instinct-powered servers later this year and what factors have led early adopters to use the MI300X instead of Nvidia’s GPUs.
The executive also talked about how he sees AMD’s upcoming Instinct MI325X and MI350 chips competing against current and future Nvidia GPUs, how AMD views the viability of hybrid CPU-GPU packages like the MI300A and Nvidia’s Grace Hopper Superchip, and the benefits of AMD’s open standards approach versus Nvidia’s proprietary platform. In addition, Norrod gave his thoughts on Intel’s move to disclose the pricing of its eight-chip Gaudi 3 platform.
The transcript has been lightly edited for clarity and brevity.
Can you talk about the decision for AMD to move to an annual release cadence for data center accelerators? How much of a lift was it for the company to make a commitment like that? And what changes, if any, did that require?
It's a significant commitment. I think that it's really a function of two things. One is this market is moving so fast, and the demands of the customers are so great that continuing innovation on the silicon side is possible and is warranted. Simple things like math formats. Lisa talks about some of the math formats with respect to block float, but there are many other innovations in math formats: handling sparsity, sharding and a couple of other things. And so those require us to continually innovate on the silicon side to support those, and to support those efficiently. And so I think part of the dynamic is just this market is moving so fast, there's so much innovation and invention in the models and the math that we've got to maintain a very fast cadence.
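[To unpack the block float reference: in a block floating-point format, a group of values shares a single exponent while each value keeps its own short mantissa, which cuts storage and bandwidth at a small cost in precision. The Python sketch below is a generic illustration of that idea, not AMD's implementation; the block size, mantissa width and rounding scheme are assumptions chosen for clarity.]

```python
import numpy as np

def block_float_quantize(x, block_size=32, mantissa_bits=8):
    """Quantize-then-dequantize a 1-D array with one shared exponent
    per block of `block_size` values (illustrative, not AMD's format)."""
    pad = (-len(x)) % block_size
    blocks = np.pad(x.astype(np.float64), (0, pad)).reshape(-1, block_size)

    # Shared exponent: smallest power of two covering the block's max magnitude.
    max_mag = np.abs(blocks).max(axis=1, keepdims=True)
    scale = 2.0 ** np.ceil(np.log2(np.maximum(max_mag, 1e-38)))

    # Each value keeps only a signed `mantissa_bits`-bit mantissa.
    qmax = 2 ** (mantissa_bits - 1) - 1
    mant = np.clip(np.round(blocks / scale * qmax), -qmax - 1, qmax)

    # Dequantize so the rounding error is easy to inspect.
    return (mant / qmax * scale).reshape(-1)[: len(x)]

x = np.random.randn(100).astype(np.float32)
print("max abs error:", np.abs(x - block_float_quantize(x)).max())
```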
And then the other dynamic, of course, is just competitively. Nvidia, quite candidly, stepped on their accelerator pedal, and when they saw that—'holy crap. AMD has got a real part; they're going to be a real competitor’—they very deliberately stepped on the accelerator trying to block us and everybody else out. And so we're responding to that as well.
In terms of the difficulty of response, it's a heavy lift. As you know, these are very, very complicated chips. They take multiple years to develop. We were sort of on a two-year cadence, and we really are dedicating ‘more than’ another team to this product line to do the silicon, as well as the resources to do the system designs and the resources to accelerate the software. So it's a significant investment. It's something that we actually embarked on some time ago, or we would not be able to respond. And so I think the opportunity is so great that we felt we had no choice.
Can you say at all how that faster release cadence is impacting the research and development budget? Are there any figures you can provide?
I'll stay away from direct figures, but, clearly—we've said this—we're dramatically increasing R&D investments in data center. We've been steadily ramping them. And then, more to the point, we're really dramatically increasing our investments in AI and have been doing so for some time. So this is not a ‘we're reacting now.’ This is a ‘you're seeing announcements based on a reaction quite some time ago.’
Nvidia had an investor presentation from October last year where the company publicly revealed its plan to release data center GPUs every year instead of every two years. Was that roughly when AMD decided to respond with a faster release cadence, or did you have an inkling before then?
We had an inkling before then. Our AI investments have been ramping substantially over the last few years. But I'd say we really hit the accelerator long before that.
I'd say the second we saw—look, the promise of AI has been out there for some time. And you got to hand it to Nvidia. They saw the scope of the opportunity earlier maybe than most and had conviction around when it would occur. I still think they were surprised as well as the rest of us.
We've maintained a very large investment in our data center GPU technology for at least the last six or seven years when we really split the architectures [between CDNA for data centers and RDNA for PCs]. About six years ago, we really focused. So this has been a long journey for us as well. But even then, we didn't know when [the] AI inflection point was going to come. But I would say that about a couple hours after we started to see what the modern transformers of the GPT-3 realm could do in 2022, that's when we really started to further accelerate.
What's your expectation for how the Instinct MI325X and MI350 chips will overlap with Nvidia's flagship products when they come out, and how do you expect that to impact demand for either of those products?
For both of those, we think we are closing the gap, narrowing the gap between the introduction of Nvidia's part and the introduction of our same generation part. So [MI]325[X], I'd say, handily outdoes H200 and is competitive in many regards with [Nvidia’s upcoming] B100. And obviously, it'll be out a little bit behind H200.
And then [MI]350 [based on the CDNA 4 GPU architecture] I would say is a great part that we think is higher performance than what we see projected for [Nvidia’s] B200. We think B200 is really a 2025 part for any sort of volume, and so is [MI]350.
One thing I noted from Computex was that there was the announcement of the MI350 series and not just an MI350X. Does that indicate that we may see an MI350A announced at some point?
I'd say that it indicates that there's going to be multiple instantiations or multiple SKUs, and we're not going to unpack the details of that at this point.
Nvidia is putting a lot of emphasis on its Grace Hopper and Grace Blackwell Superchips and the benefits of integrating CPU, GPU and memory to provide the best performance. Does AMD see a need for an MI300A-like chip fulfilling the range of workloads Nvidia is targeting with these CPU-GPU Superchips?
We do think optimizing both together is a good thing. I would say that Grace Hopper or Grace Blackwell look a hell of a lot like [EPYC] Trento [CPU] and MI250 [GPU], [the two chips at the heart of the Frontier exascale supercomputer]. We introduced that particular architecture, some of the details are slightly different, but [they were a] CPU and GPU with high-speed coherent interconnection, pooled memory [that] we introduced two years ago. [With the] MI300A [APU], [we] put [the CPU, GPU and memory] together a little bit more tightly. And we do think there's a lot of validity for having the tight coordination between the CPU and GPU.
Now there's different optimization points, right? You may not always want an APU, like a [MI]300A or GB200, because you have more copies of the OS. For some workloads, you don't want that. You want a larger footprint of accelerators per operating system instance. And the operating system instance, by the way, is not the most critical thing. I'm just using that as a proxy for the size of the node. For some workloads, some applications, a one-to-one [CPU-GPU] ratio is great. [MI]300A is fantastic. For others, the one-to-four, one-to-eight ratio is great. That's why we've got the [MI]300X. You will see us support a variety of different ratios going forward. And over time, you'll see all of those different ratios offer some of the same attributes in terms of sharing between the CPU and GPU. It's going to be a journey, but I think you'll see that from us. We'll efficiently support a number of different node architectures depending on the particular workload, and you'll see us support a closer affinity between the CPU and GPU.
You were talking in the beginning about providing a set of solutions that are well optimized for whatever customers are trying to do. There's been a lot of emphasis on the highest possible performance you could get from silicon. In previous generations, AMD has put out lower-performance Instinct products that have lower power requirements, etc. Does AMD plan to release more lower-end Instinct products in the future?
Let me back up and unpack what I meant by that comment a little bit more. I would say that Nvidia's general approach to the market is, ‘We can solve any problem, any AI problem, for you with a GPU of some type.’ They prefer to sell you DGX if they can convince you to buy that, but [they are ultimately trying to sell you] a GPU.
Our solution is more, ‘Look, we've got GPUs, we've got CPUs. They have different roles to play depending on your workload and your environment, and we're not going to try to force-fit you down one path or another.’ Within each one of those dimensions, there are additional scale points at play as well. So you'll see us continue to build out the range of solutions on both CPU and GPU.
To what extent is AMD investing in the enablement of commercial channel partners to sell Instinct-based systems?
We're walking as many steps as fast as we can. Clearly, our initial set of customers were the hyperscalers that, quite candidly, have been investing with us on the software side for several years. And so the critical factor in [MI]300 was, first, get those guys up and into production, make sure that, with them, we've got ROCm 6 [AMD’s open software for GPU programming] up and going, and then secondarily, in parallel with that, get the OEMs up and going.
You'll see us shifting very hard to enable enterprise through multiple channels this year, including the channel. So we'll really be ramping that up quite a bit this year. There's been some engagement, but I'd say it's relatively nascent, but you'll see that shift hard [in the] second half.
For end customers who have ended up buying Instinct-based systems or consuming them through a cloud service provider or AI service, what would you say were the most critical factors in getting those customers to go with AMD as opposed to Nvidia?
A couple. Particularly for some large customers, there is a desire to have choice and not be locked in. And so if there's a credible alternative, that in and of itself is a motivation.
But for most customers, that's a secondary concern, or somebody else's problem to maintain the health of the ecosystem. And for those, they're looking at how this is the better solution. And certainly, the memory advantage of the AMD solutions has been very substantial. Particularly for medium-sized models and below, you can get some pretty dramatic [total cost of ownership] advantages by virtue of being able to run a lot more instances on one GPU or one node than you can with [Nvidia's] Hopper or Blackwell. That's most dramatic in inferencing. And so that's been a real motivator for a lot of customers there.
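[A back-of-the-envelope sketch of the memory argument Norrod is making. All numbers here are illustrative assumptions, not AMD or Nvidia guidance: a model served in 16-bit weights needs roughly 2 bytes per parameter, plus headroom for the KV cache and activations.]

```python
def instances_per_gpu(hbm_gb, params_billions, bytes_per_param=2, overhead=1.3):
    """How many copies of a model fit in one GPU's memory (rough estimate).

    `overhead` is an assumed multiplier for KV cache and activations.
    """
    model_gb = params_billions * bytes_per_param  # weight footprint in GB
    return int(hbm_gb // (model_gb * overhead))

# Hypothetical capacities: a 192-GB GPU vs. an 80-GB GPU, serving a 13B model.
for name, hbm in [("192 GB GPU", 192), ("80 GB GPU", 80)]:
    print(name, "->", instances_per_gpu(hbm, 13), "instances of a 13B model")
```

With these assumptions, the larger-memory part hosts five instances per GPU against two, which is the kind of per-node density gap behind the inference TCO argument.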
Going forward, you will see us continue to add some additional differentiated features. Our strategy, essentially, is to offer customers a low-friction path to an alternative and to have an alternative that is differentiated in multiple ways and offers real advantages to customers who can utilize those features and those points of differentiation.
Those two things may seem at odds with one another: having a low friction path to adoption and being differentiated. The art is in trying to make sure that we're delivering both of those without compromise on either one. So if you look at our strategy around the MI300, it was, look, we're going to build a [universal baseboard] that essentially drops into systems that can also accommodate Nvidia's baseboard [for the H100].
If you look at our software approach, we're really focusing on how do you minimize the friction of anyone using Instinct if you've been writing to frameworks—and that's been delivered, by the way. Multiple customers, like the Stable Diffusion guy the other day, have said it was minimal effort to port to AMD. So we want to maintain that while offering more value. Bigger memory is one. There'll be more over time. I'm not going to pre-announce those. I don't want to Osborne myself.
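[The ‘writing to frameworks’ point is concrete: PyTorch's ROCm builds expose the same torch.cuda API surface as its CUDA builds, so device-agnostic framework code typically runs on either vendor's GPUs without changes. A minimal sketch:]

```python
import torch

# "cuda" also selects AMD GPUs under PyTorch's ROCm builds, which map
# HIP devices onto the torch.cuda namespace.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = torch.nn.Linear(1024, 1024).to(device)
x = torch.randn(8, 1024, device=device)

with torch.no_grad():
    y = model(x)

print(y.shape, "computed on", device)
```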
AMD has talked a lot about the benefits of the open ecosystems and open systems it supports versus things that are proprietary with Nvidia. How does proprietary lock-in materially impact customers? What actual adverse impacts could being locked into a proprietary system have on a customer?
Well, the first is that I think when you're locked in and there's no alternative or you're not able to avail yourself of an alternative, the costs are whatever the supplier chooses to charge. And I would look at Nvidia's financials, it's been remarkable to see the progress they've made on both their revenue as well as the margin. So I think there's that obvious issue.
But the other issue is that for many customers, they do want choice for a variety of different reasons. A lot of the [mega data centers], for example, have internal efforts that they'd like to support as well. If you have a proprietary infrastructure where every element is locked down, their ability to insert solutions if they want a particular piece of their own making or their own optimization is limited. So if they don't like the networking or they don't like the CPU or they don't like the accelerator, they're sort of out of luck if you've got a highly integrated proprietary solution where every element — the CPU, the GPU and the network — is proprietary.
I think our approach is a balancing act. We want to optimize the solution together, but we want to do so in the context of open standards that allow people to make other choices.
So the biggest and hardest part of that is the fabric choices. And so that's one reason we really focused on Ultra Ethernet [UE] and Ultra Accelerator Link [UAL], because without that, you can't build systems that can accommodate different CPUs or GPUs. And so that's the first and most important part of that. And again, if you look at the list of folks that are the promoters of UE and UAL, it's pretty clear, particularly with UAL, everybody on there [Broadcom, Meta, Cisco, Hewlett Packard Enterprise, Microsoft, Intel] is building silicon for CPU, accelerator, networking or multiple [parts], and that includes the end customers on that list as well.
What do you think of Intel's move to publicly disclose the $125,000 list price of its eight-chip Gaudi 3 platform for OEMs? And do you see a need for pricing transparency in the data center accelerator space?
I think list prices are almost a complete waste of time. I don't know what to tell you. The one thing I can tell you is, I would be very surprised if more than 10 percent of Intel's products were sold at a published price [at least for the data center]. And I think the same is probably true here.
You're saying you don't think it's useful.
Why is it useful? Look, at the end of the day, anybody that's going to be buying one of these things—if people are going to talk about systems, you're talking about hundreds of thousands of [U.S.] dollars. What are you going to do? You're going to go off and get quotes from three or four vendors for a couple of different architectures. That's what you're going to do. And not one of those quotes is going to be the price that Intel just said, I think. If history is any guide, that's my guess. But maybe I'm wrong. Maybe Intel will somehow enforce the price. I'd be very surprised, though. Look, every tender for any one of these things could be a multi-source, competitive bid. So what's the point of publishing a price? I don't know.
I should clarify that the $125,000 list price is just for the eight chips with the universal baseboard and not a fully built server.
It's still the same. Anyway, I don't think it's a big deal.