Alluxio Targets Data Provisioning Hurdles For AI With New Offering

Recognizing that “deep learning” AI/ML applications face different data management and I/O challenges than traditional data analytics, Alluxio has launched a new data management platform to overcome those hurdles.

Alluxio has launched a new data management platform for data-intensive artificial intelligence and machine learning tasks, building on the company’s core data orchestration technology and expertise.

Alluxio Enterprise AI offers the data accessibility and performance needed by the growing number of data-driven applications, including generative AI, large language models, computer vision and natural language processing, that are generating more demanding workloads for IT and data infrastructure.

While the company’s flagship Alluxio Enterprise Data was originally developed for a range of big data tasks, including data analytics and AI/machine learning, product management director Adit Madan told CRN that it became clear a new platform was needed specifically to meet the demands of AI workloads such as deep learning and large-scale model training and deployment.

“This is something that we’ve been incubating and increasingly it became very clear to us that there is a need for a completely different offering,” Madan said in an interview. “For deep learning and, especially with the use of specialized compute hardware such as GPUs, it became very clear to us that [data computation needs] are diverging significantly and that we do need a different product.”

Data accessibility and data volume and complexity are major hurdles to implementing and efficiently operating AI systems, especially for model training and model serving, San Mateo, Calif.-based Alluxio said in its Alluxio Enterprise AI announcement, citing market researcher Gartner.

“What we’ve seen is that the data access problem is much more recognized in the AI, ML, deep learning space,” Madan said. “The demands on the system are very different for deep learning.”

He said that on the analytics side the need for a product like Alluxio doesn’t necessarily become clear to prospective users until their applications begin working with multiple petabytes of data. AI/ML teams, however, are running into data access problems with just hundreds or even tens of terabytes of data.

Alluxio’s original product grew out of the “Tachyon” research project at the University of California, Berkeley’s AMPLab. The cloud-based Alluxio Enterprise Data platform is a virtual distributed data storage system that links distributed data with computational systems, including those for data analytics and machine learning.

Madan emphasized that while Alluxio Enterprise AI incorporates the company’s expertise and core technology, the software is an entirely new product “built from the ground up” and isn’t just a modified edition of the original product.

In addition to remedying I/O inefficiencies, Alluxio Enterprise AI helps remove cost management barriers to AI/ML adoption, Madan said, and provides an alternative to relying on high-performance GPU-based systems that are in limited supply.

Alluxio Enterprise AI provides seamless data access for AI workloads across on-premises and cloud environments, the company says. According to Alluxio, it accelerates machine learning pipelines by up to 20x over commodity storage using an enhanced set of APIs for model training, and it provides extreme concurrency and up to 10x acceleration for model serving when models are delivered from offline training clusters for online inference.

The new platform uses a distributed architecture, called Decentralized Object Repository Architecture, with decentralized metadata to provide high-performance input/output for both analytics and AI/ML. According to the company, the architecture offers unlimited scale using commodity storage, eliminating the need for high-performance computing (HPC) storage systems based on specialized hardware.

The system also uses an intelligent distributed caching feature that enables AI engines to read and write data through a high-performance cache tailored to the I/O patterns of AI engines instead of slow data lake storage, according to the company.
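The general idea behind serving reads from a cache rather than from slow storage can be illustrated with a small sketch. This is not Alluxio's implementation or API: the `ReadThroughCache` class, the `simulate_epochs` helper, the `sample-N` keys and the least-recently-used eviction policy are all hypothetical, chosen only to show how a training job that rereads the same files every epoch interacts with a cache.

```python
from collections import OrderedDict
from typing import Callable, Tuple


class ReadThroughCache:
    """Toy LRU read-through cache in front of a slow backing store (illustrative only)."""

    def __init__(self, fetch: Callable[[str], bytes], capacity: int) -> None:
        self._fetch = fetch            # hypothetical slow read, e.g. from a data lake
        self._capacity = capacity      # maximum number of objects kept in the cache
        self._entries = OrderedDict()  # key -> bytes, ordered oldest to newest use
        self.hits = 0
        self.misses = 0

    def read(self, key: str) -> bytes:
        if key in self._entries:
            # Cache hit: mark the entry as most recently used and return it.
            self._entries.move_to_end(key)
            self.hits += 1
            return self._entries[key]
        # Cache miss: go to the backing store, then keep a copy.
        self.misses += 1
        data = self._fetch(key)
        self._entries[key] = data
        if len(self._entries) > self._capacity:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return data


def simulate_epochs(num_files: int, epochs: int, capacity: int) -> Tuple[int, int, int]:
    """Read every file once per epoch; return (hits, misses, backing-store reads)."""
    backing_reads = []

    def slow_fetch(key: str) -> bytes:
        backing_reads.append(key)      # stand-in for an expensive remote read
        return key.encode()

    cache = ReadThroughCache(slow_fetch, capacity)
    for _ in range(epochs):
        for i in range(num_files):
            cache.read(f"sample-{i}")
    return cache.hits, cache.misses, len(backing_reads)


if __name__ == "__main__":
    print(simulate_epochs(num_files=4, epochs=3, capacity=4))  # cache fits the dataset
    print(simulate_epochs(num_files=4, epochs=3, capacity=2))  # cache is too small
```

In this toy model, a cache large enough to hold the whole dataset touches the backing store only during the first epoch, while an undersized LRU cache scanned in the same order every epoch never gets a hit. That difference is one reason a cache designed around training access patterns can behave very differently from a general-purpose one.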

Alluxio Enterprise AI has been in a beta/early adopter stage for six months, Madan said, and is now generally available. He expects to see adoption both among new customers and among customers who already use the Alluxio Enterprise Data platform but have launched AI/ML projects.

The company will continue to offer its original Alluxio Enterprise Data platform targeted toward high-performance data analytics tasks.