Cloudera Teams With Nvidia To Create New AI Inference Service

Updated: The new Cloudera AI Inference Service leverages Nvidia NIM microservices to boost the development of large-scale AI models that can tap into the huge volumes of data stored on the Cloudera Data Platform.

Hybrid data platform provider Cloudera is taking a deeper dive into AI with a new service, powered by Nvidia’s NIM microservices, for deploying and managing large-scale AI models.

The new Cloudera AI Inference service, which will make its formal debut at the Cloudera EVOLVE24 New York event in New York City on Thursday, will make it easier for businesses and organizations to fully leverage their data to advance generative AI projects into full production.

“We focus a large part of our energy on getting to ‘AI ready,’” said Abhas Ricky, Cloudera chief strategy officer, in an interview with CRN in which he outlined his company’s role in the current wave of AI development and adoption and how the new AI Inference service expands the company’s AI efforts.

Cloudera’s flagship system, the Cloudera Data Platform (CDP), provides a number of data management capabilities including operational database, data engineering, data warehouse, data flow, data stream processing and machine learning functions.

The rise of AI and generative AI is creating new demands for trusted data, and Cloudera has taken steps to fill those needs. In June, for example, the company acquired the Verta Operational AI Platform from Verta.ai in a move Cloudera said would deepen its AI technology portfolio and expertise with Verta’s AI and machine learning talent and technology.

Ricky said Cloudera’s AI strategy is based on three pillars: Ensuring that customers can run scaled AI workloads on GPUs on private clouds; ensuring that clients can leverage any open-source or proprietary model; and providing the necessary tooling that allows customers to work with such capabilities as enterprise search, semantic querying, retrieval augmented generation and more.

In March, Cloudera announced an expanded collaboration with Nvidia, dubbed Cloudera Powered by Nvidia, through which the two companies planned to integrate Nvidia NIM microservices (part of the Nvidia AI Enterprise software platform) into Cloudera Machine Learning.

Ricky said Cloudera AI Inference, powered by embedded NIM microservices, streamlines the deployment and management of large-scale AI models and allows organizations to serve data on the Cloudera platform to large language models (LLMs) to advance their GenAI projects.

Using Cloudera AI Inference, developers can build, customize and deploy enterprise-grade LLMs with up to 36x faster performance using Nvidia Tensor Core GPUs and nearly 4x throughput compared to CPU systems, according to the company.

Overcoming Data Security Hurdles

While many businesses and organizations are launching AI projects, including AI-driven chatbots and virtual assistants, concerns about data compliance and governance have slowed many efforts. And Ricky noted that some 70 to 75 percent of all data assets are on private cloud systems.

With Cloudera AI Inference, sensitive data doesn’t need to be pushed to a third-party AI model with all the inherent risks of that data leaking out, according to Ricky. Cloudera is building enterprise-grade security and governance around data and model development, deployment and access.

According to Cloudera’s detailed description of the new service, AI Inference integrates user interfaces and APIs directly with Nvidia NIM microservice containers, eliminating the need for command-line interfaces and separate monitoring systems. The service’s integration with Cloudera’s AI Model Registry enhances security and governance by managing access controls for both model endpoints and operations. Users benefit from a unified platform where all models – whether LLM deployments or traditional models – are seamlessly managed under a single service.

Cloudera AI Inference utilizes Nvidia NIM microservices to optimize open-source LLMs, including Llama and Mistral, according to the company. Workloads can be run on premises or in the cloud with virtual private cloud deployments for enhanced security and regulatory compliance. And users can rely on auto-scaling, high availability and real-time performance tracking to detect and correct issues and maintain efficient resource management.

Of importance to channel partners, Ricky said Cloudera AI Inference gives the company’s systems integrator and ISV partners greater opportunity to build generative AI applications, agents and other software that tap into data in the Cloudera platform.

“Enterprises today need to seamlessly integrate generative AI with their existing data infrastructure to drive business outcomes,” said Kari Briski, vice president of AI software, models and services at Nvidia, in a statement. “By incorporating NVIDIA NIM microservices into Cloudera’s AI Inference platform, we’re empowering developers to easily create trustworthy generative AI applications while fostering a self-sustaining AI data flywheel.”

Cloudera AI Inference has been in tech preview since June and is now generally available.

At the EVOLVE24 New York event Thursday, Cloudera said it has extended its Open Data Lakehouse interoperability to the Snowflake AI Data Cloud, providing joint customers with access to Cloudera’s Data Lakehouse through its Apache Iceberg REST Catalog. (Apache Iceberg is an open-source data table format that’s key to ingesting, preparing and processing data.)

Cloudera also announced the addition of a number of new partners to its Enterprise AI Ecosystem including Google Cloud, Snowflake and Anthropic. Nvidia, Amazon Web Services and Pinecone were among the original ecosystem partners when it debuted last year.

Cloudera also provided a preview of planned capabilities in the Cloudera Data Platform including a single CDP codebase for cloud and on-premises environments, a hybrid control plane for monitoring and managing deployments across any infrastructure, unified data security and governance across hybrid infrastructures, federated data access between multiple infrastructures, and the ability to deploy Cloudera on ARM-based systems such as AWS Graviton.