Nvidia ‘Doubling Down’ On Partners With DGX Cloud Service
Nvidia executives Manuvir Das and Craig Weinstein explain to CRN how the GPU giant will rely on channel partners ‘to a significant extent’ for its new DGX Cloud supercomputing service, designed to help enterprises create and run generative AI applications, and how it will open a range of services opportunities.
Nvidia executives told CRN that the GPU giant is “doubling down” on the channel with its new DGX Cloud supercomputing service, which will open a range of services opportunities for partners that will “move at the speed of light” because of the focus on software.
In an interview, Manuvir Das, Nvidia’s vice president of enterprise computing, said the Santa Clara, Calif.-based company will rely on partners within its Nvidia Partner Network “to a significant extent” to sell DGX Cloud, which gives enterprises quick access to the tools and GPU-powered infrastructure to create and run generative AI applications and other kinds of AI workloads.
Das said Nvidia’s go-to-market plan with partners for DGX Cloud builds on the work the company has done in the channel with its DGX systems, which combine Nvidia’s fastest data center GPUs with the Nvidia AI Enterprise software suite and serve as the underlying infrastructure for the new cloud service.
“This is about Nvidia truly growing up as an enterprise company and saying that we’ve learned in the last few years the value of this ecosystem and now we’re doubling down with this ecosystem,” Das said.
DGX Cloud is hosted by cloud service providers, with initial availability on Oracle Cloud Infrastructure. The service is expected to land at Microsoft Azure in the third quarter, and Nvidia said it will “soon expand to Google Cloud and more.”
Das said DGX Cloud will have multi-cloud and hybrid-cloud capabilities thanks to Base Command, Nvidia’s software that manages and monitors training workloads and allows users to right-size the infrastructure for what their applications require.
At launch, each DGX Cloud instance will include eight of Nvidia’s A100 80GB GPUs, which were introduced in late 2020. The monthly cost for an A100-based instance will start at $36,999, with discounts available for long-term commitments. DGX Cloud instances with Nvidia’s newer H100 GPUs will arrive at some point in the future with a different monthly price.
While Nvidia plans to offer an attractive compensation model for DGX Cloud, Nvidia Americas Channel Chief Craig Weinstein said the cloud service will make services offered by partners “even more valuable” because DGX Cloud shifts the opportunity from hardware to software.
“Instead of our partners spending so much time building data centers, the quality of the work is going to happen a lot faster. So the services opportunity for our partners is probably going to move at the speed of light and will increase sequentially over time,” said Weinstein.
An executive at one of Nvidia’s early DGX Cloud partners, cloud-focused MSP SADA Systems in Los Angeles, said there is no vendor better positioned to seize the generative AI opportunity with enterprises than Nvidia, and he already sees big potential with the new cloud service.
“I’m redirecting business and technical development resources from other areas to the relationship with them as a result of DGX [Cloud]. It’s really material for us,” said Miles Ward, CTO at SADA.
What follows is a transcript of CRN’s interview with Das and Weinstein about the “significant” opportunity for partners with DGX Cloud, how the cloud service helps enterprises build generative AI models with proprietary data and what kinds of services opportunities it will create for partners.
To what extent do you plan to rely on the Nvidia Partner Network to sell DGX Cloud?
Manuvir Das: To a significant extent. We built the DGX business over the years with an ecosystem of NPN [partners], who worked with us on all the customer engagements, and we really think of DGX Cloud as just a different form factor and a slightly different experience for people to get the same capability. So, of course, we’re very keen for the NPN partners to come along with us.
So two things. No. 1: It is already the case that we’re selling DGX Cloud today, and there is a model for it to be transacted through the NPN. So that’s already in place.
And No. 2: Of course, these are hosted in the cloud, and I talked about cloud marketplaces. And for a variety of reasons, the public clouds have become more and more vendors to enterprise customers, and so the NPN model really matters to the cloud service providers and, as they keep introducing new ways for NPN [partners] to participate in the ecosystem, we’ll just fully participate as well.
So I think what’s unique about us with the [partners] is that they get two new opportunities from DGX Cloud through Nvidia. One is, independent of the CSP marketplace motion, we have extended our business model so that just like [partners] could transact the regular DGX, they can transact the DGX Cloud. And then two: obviously, with this stuff coming to the cloud marketplaces, that gives another opportunity for the [partners] to participate in the cloud motion by offering the DGX Cloud.
Can you elaborate more on what's the difference between those two new opportunities for Nvidia partners with DGX Cloud?
Das: When you offer a service in the cloud, you have the option of placing it in the marketplace of the CSP. And if you put it in the marketplace of the CSP, then what happens is, customers can [use] their cloud credits when consuming the servers, and there’s a greater tie-in to the sales teams of the CSP. The other kind of model is where you transact directly. You use the cloud to host the offering, but you do the transaction directly independent of the cloud marketplace.
How big of an opportunity is DGX Cloud for partners?
Das: Just as with the DGX it has been a significant business for them, so we see the same thing [with DGX Cloud]. And I would go one step further, which is, in the on-prem world, we’ve had DGX, which is our system, and then we’ve had all our OEM partners build great systems similar to DGX. And our posture is, we’re happy regardless of which system the customer uses: an OEM system or DGX system. Obviously, there are certain additional facets in DGX that make it more of a premium system.
Now, what’s happening here in this cloud offering, DGX Cloud, it’s really not about what’s in the box underneath. The customer doesn’t actually see that. It’s about this whole software platform. So we do think that this is a broader opportunity for the [partners], so we’re excited for that.
But it’s the same customer with the same need. DGX customers choose DGX because they just want to do the data science, and they want a system that just works and all the orchestration and all these engineering things are solved, so that they can just go be data scientists. And that’s exactly what DGX Cloud addresses, so it’s a great opportunity for [partners].
DGX Cloud's starting monthly cost is $36,999. Does that include the cost of the cloud instance itself, or is that just for the DGX Cloud service?
Das: No, it includes everything. It’s all inclusive. So that is a starting price, because that’s the shortest rental period, which is a month. And then as you go to longer rental periods, the price-per-month drops, so it’s on a graduated scale. So that’s No. 1.
No. 2: It’s all inclusive. Firstly, on the infrastructure side, it includes the compute instances with the GPUs. It includes a certain amount of storage. It includes networking egress charges, because in the cloud, you have this concern that they make it free to bring a data set in, but then you have to pay to take the data out. So we bundle in egress as well. So on the one hand, it includes all of that.
On the other hand, it includes all of the software stack. So it includes Nvidia AI Enterprise, which is built into the service, so the Nvidia AI Software [is there] for all your use cases, the frameworks, NeMo for training, etc. And then it includes software called Base Command, which is the console that you use, and this is what makes it a multi-cloud, hybrid-cloud thing.
So this is the experience where as a data science team, you just go in and you submit a job. It could be Python or a Jupyter Notebook or what have you. You point to your input dataset, and basically hit go. And the system underneath does all the orchestration and where to allocate your resources. And if a node fails, then how to put you on another node so that your job can continue—all that kind of stuff. So all of that’s included in this pricing.
Is there any way the price can go up per month?
Das: What will happen over time, of course, is that the service will evolve to a higher capability. For example, new kinds of instances with more powerful GPUs, where each GPU can do more work, and then, of course, that’s a new kind of offering.
Is the $36,999 monthly cost for an A100- or H100-based instance?
Das: It’s for the A100.
So the H100-based instance will have a different price?
Das: Yeah. We don’t have a pricing announced for that yet.
When will the H100 be available in DGX Cloud instances?
Das: So we haven’t announced anything. We are going to bring it into preview very quickly for customers to evaluate. And that will happen very, very quickly. And then we’ll announce expansion of the commercial offering to H100 sort of as we go.
Is it fair to say that at launch DGX Cloud will be A100 only?
Das: Correct. That’s what it is today.
Is there a compensation model for partners with DGX Cloud?
Craig Weinstein: We’re working through it now, but the way in which Manuvir described our partners providing DGX solutions today on-prem will be the same going forward. It will be just a different pricing structure. We will still leverage our distribution relationships that exist across the geographies around the world.
And the most important piece of this is that the valuable services that our partners are creating, as they work hand-in-hand with these customers, become even more valuable, because instead of our partners spending so much time building data centers, the quality of the work is going to happen a lot faster. So the services opportunity for our partners is probably going to move at the speed of light and will increase sequentially over time.
Das: This is such a key point that I will state it again. This is not a situation where we’ve had this model with the distribution network, the [Nvidia Partner Network], etc., and now we’re like, we’re just going to stick a service in the cloud, the customer can work with us directly and we don’t need anybody else. It’s actually the opposite. This is about Nvidia truly growing up as an enterprise company and saying that we’ve learned in the last few years the value of this ecosystem and now we’re doubling down with this ecosystem. And so I really want to emphasize that.
So with the compensation model, it sounds like you're still working that out?
Weinstein: Correct. We’ll provide some details as we go live after GTC.
Will DGX Cloud have an attractive compensation model for partners?
Das: Yes. The answer to that is yes. Absolutely.
What are the different kinds of services opportunities for partners with DGX Cloud and how can that help them with growth and profitability?
Das: So let’s take the first one. Let’s take generative AI [as an example]. We have an offering called Nvidia AI Foundations now with some of these cloud services that can be consumed. To your question, any enterprise company that’s adopting generative AI, they need a whole range of implementation services that a channel partner can provide, because the pieces have to be stitched together.
The biggest one is, “I need the data, so I need to find my data sources. I need to curate my data. I need to set up a system at my company that every day when new data has been generated, it’s being fed into the AI pipeline in the right way.” So what happens when a company starts the AI journey is, on day one, they have to identify some existing data set, and then do a whole bunch of one-off work to reformat it, curate it, all of that, but obviously, they’re generating new data every day, right? And you can’t be sitting there. So they have to change their processes and change that data pipeline. That is a great opportunity for services for a channel provider.
The second opportunity then is, you got to have this pipeline, if you will, where it’s not a one-off thing. You don’t just train this model once and you’re done. You have to start with the model. You have to train it and then you have to keep fine-tuning it. You have to keep customizing in various ways, and that leads to new models. So as a company, you have to manage this library of models that you’re constantly evolving, developing, keeping track of. That’s another great services opportunity.
The third opportunity is what people might call model engineering, where, for a typical enterprise customer, they don’t have the R&D team to do all the tweaks and optimizations and things that you do on your models. For example, none of this stuff is cheap to build and run. And if I can do a certain optimization in my model that makes the model run twice as fast, that means it can serve twice as many requests, which means I basically spend half the money on the infrastructure. That’s another example of a service that the ecosystem can provide.
So the interesting thing about this whole AI workflow, which is now exposed by generative AI, is that it’s, on the one hand, accessible enough that enterprises want to do it. But on the other hand, it’s complex enough that it requires a range of services to actually operationalize [it], and that is where the opportunity lies for the ecosystem.
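The arithmetic behind the model-optimization example in this answer can be sketched in a few lines. This is only an illustration under stated assumptions: the request load and per-instance throughput figures are hypothetical, and using the published $36,999 starting monthly price as the cost of a serving instance is an assumption made for the sake of the example, not something Nvidia stated.

```python
import math


def instances_needed(requests_per_second: float, per_instance_throughput: float) -> int:
    """Number of instances required to serve the load, rounded up."""
    return math.ceil(requests_per_second / per_instance_throughput)


def monthly_cost(requests_per_second: float,
                 per_instance_throughput: float,
                 price_per_instance: float) -> float:
    """Monthly infrastructure cost for a given load and per-instance throughput."""
    return instances_needed(requests_per_second, per_instance_throughput) * price_per_instance


# Hypothetical numbers, for illustration only.
LOAD_RPS = 400               # assumed steady request load
BASELINE_THROUGHPUT = 50     # assumed requests/second one instance can serve
PRICE_PER_INSTANCE = 36_999  # article's starting monthly price, reused as an assumption

baseline = monthly_cost(LOAD_RPS, BASELINE_THROUGHPUT, PRICE_PER_INSTANCE)
# An optimization that doubles throughput halves the instance count.
optimized = monthly_cost(LOAD_RPS, 2 * BASELINE_THROUGHPUT, PRICE_PER_INSTANCE)

print(f"baseline:  ${baseline:,.0f} per month")
print(f"optimized: ${optimized:,.0f} per month")
print(f"ratio:     {optimized / baseline:.2f}")
```

Because instance counts are rounded up, the saving is exactly half only when the load divides evenly by the per-instance throughput; at small scale the ratio can land above 0.5.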
One thing you were talking about was data retrieval, and I know there is a capability called Inform in the new NeMo large language models cloud service, where you can fetch data from proprietary databases and train a custom model on that data. Can you talk about those capabilities?
Das: So there are two approaches. The Inform approach is very specifically a new kind of AI model that is all about ingesting all the data from these kinds of databases and learning from that, from the ground up. So that’s what Inform is, and that’s part of our library.
But then there’s another approach as well, which we enable on any of the other models through a process of customization, where, depending on the question that is asked or the request that is made to the standard GPT [Generative Pre-trained Transformer] model, we have a mechanism by which we can go search the proprietary information, extract enough information and then add it to the GPT-based model, so that it can use that information when generating its answer.
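The second approach Das describes (search the proprietary information, extract the relevant pieces and hand them to the model along with the request) can be sketched roughly as follows. This is a toy illustration, not Nvidia’s mechanism or the NeMo service API: the function names `retrieve` and `build_prompt`, the keyword-overlap scoring and the prompt layout are all assumptions made for this example, and the final call to a language model is left out entirely.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return up to top_k documents sharing the most words with the question.

    Word overlap is a stand-in for a real search index or embedding lookup.
    """
    question_words = tokenize(question)
    scored = [(len(question_words & tokenize(doc)), doc) for doc in documents]
    # Drop documents that share no words with the question.
    scored = [(score, doc) for score, doc in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(question: str, documents: list[str], top_k: int = 2) -> str:
    """Prepend the retrieved passages to the question before it goes to a model."""
    context = retrieve(question, documents, top_k)
    context_block = "\n".join(f"- {passage}" for passage in context)
    return (
        "Answer using only the company information below.\n"
        f"Company information:\n{context_block}\n"
        f"Question: {question}\n"
    )


if __name__ == "__main__":
    # Hypothetical proprietary snippets.
    company_docs = [
        "The return window for store purchases is 30 days.",
        "Our headquarters cafeteria opens at 8 am.",
        "Refunds are issued to the original payment method.",
    ]
    print(build_prompt("What is the return window for purchases?", company_docs))
```

The point of the design is the one Das makes: the pre-trained model supplies the conversational ability, while the retrieved passages supply the company-specific facts at request time, so the model itself does not need to be retrained on that data.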
It sounds like it’s a pretty significant capability.
Das: We do believe it is. And this is a key basis for why we built this service, because there’s a lot of great capabilities out there, obviously with ChatGPT and [the model’s creator], OpenAI, and we’re a big partner of theirs. We power what they do.
The reason we built this platform, it was not to compete with [OpenAI]. It was to cover a different need for the enterprise companies, which is the data that they have, that is specific to them, is actually the most valuable stuff. And the sense in which they want the power of ChatGPT is not necessarily to produce the answers, but to be like an intelligent assistant that knows how to talk. So the value of training on the internet for them is not “tell me how World War I started,” but how do you converse with somebody? And how do you construct a good paragraph’s worth of response? That sort of thing. And so our whole approach is, we’ll give you a pre-trained model, which is like that, where we’ve done all that training one time—you don’t have to do it. But then we give it all your information, so its answers are based on your information, but its intelligent, conversational ability comes from the GPT-style training.
Weinstein: What we’re seeing, too, is the collision of what Manuvir just described and use cases that are defined by industry. So if you take what Manuvir just described, and you apply it to the segment of retail, next thing you know you have the entire experience that Manuvir just shared but now you have it in a brand ambassador who’s an avatar. And that avatar now needs language capabilities and a dialect that needs other AI models for speech that leverages our Riva software.
And so what you’re talking through is an end-to-end architecture, where software is the dominant opportunity for our partners. And what we’re trying to do is make sure our partners are prepared for that, that we’re teaching them and educating them and they’re engaging with the ecosystem correctly. And we think it’s exciting because it’s going to go across all industries.
Das: It’s the complexity. Because you look at other ecosystems—let’s take the database ecosystem. With SQL as the natural way to query databases, it’s one standard. You get educated on that, and then a lot of the complexity goes away. AI is not really like that, right? There’s still a lot of pieces. And those pieces are evolving, too.
So we’ve tried to do two things as Nvidia.
On the one hand, we’ve tried to create as much of a standardized platform as possible. The GPU architectures are all the same with CUDA [the parallel programming platform] on top of that. The software’s all Nvidia AI Enterprise. So, on the one hand, create as much of a platform as possible.
But on the other hand, with all our competencies and our [Deep Learning Institute] trainings and all of that, there’s a lot still that a partner has to learn to create a good implementation, but therein lies the value opportunity, because the end customer cannot just go do this.