Databricks Data+AI Summit 2024: The Biggest News
Upgrades to Databricks’ Mosaic AI, a deeper collaboration with Nvidia and the open-sourcing of Unity Catalog are among the biggest updates from Summit 2024.
Upgrades to Databricks’ Mosaic AI unified tooling product for artificial intelligence and machine learning. A deeper collaboration with Nvidia. And the open-sourcing of Unity Catalog.
These are some of the biggest news announcements from the San Francisco-based analytics platform vendor’s Data+AI Summit, which runs through Thursday in Databricks’ headquarters city.
Privately held Databricks has sought to project might in the ongoing data and AI vendor wars, disclosing earlier this year that it reached more than $1.6 billion in revenue for the fiscal year ending Jan. 31. That marks 50 percent growth year over year, according to the vendor.
Databricks Data+AI Summit 2024
The vendor has a partner program for consultancies and other partner business models, according to Databricks.
In the lead-up to Summit, Databricks bought Tabular, a San Jose, Calif.-based data management company founded in 2021 by the original creators of Apache Iceberg. The deal closed on Friday, but Databricks announced the acquisition during rival Snowflake’s Data Cloud Summit 2024. (Databricks even had an ad outside the venue for Snowflake’s show.)
Here’s more of what Databricks announced this week during the event.
Mosaic AI Updates
Support for building compound AI systems, ways to improve model quality and more AI governance tools are the themes of Databricks’ investment in its Mosaic AI offering, the vendor revealed during its Data+AI Summit.
The vendor launched public previews of Mosaic AI Agent Framework, Mosaic AI Agent Evaluation, Mosaic AI Model Training, and Mosaic AI Gateway.
The investments aim to help GenAI users struggling with privacy, quality and cost concerns when moving projects from pilot to production, according to Databricks. Compound AI systems are an alternative to relying on one large model, combining multiple models, retrievers, vector databases and tools for evaluation, monitoring, security and governance.
Mosaic AI Agent Framework aims to provide an easy way of building retrieval augmented generation (RAG) applications with foundation models and enterprise data.
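For readers unfamiliar with the RAG pattern, the idea can be sketched in a few lines of Python. This is a minimal illustration only, not the Mosaic AI Agent Framework API: the documents, the word-overlap retriever, and the `echo_model` stand-in for a foundation model are all hypothetical.

```python
import re
from typing import Callable, List

# Hypothetical in-memory stand-in for enterprise data.
DOCUMENTS = [
    "Unity Catalog governs data and AI assets.",
    "Delta Sharing shares live data across clouds.",
    "Photon is a vectorized query engine.",
]


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: List[str], k: int = 1) -> List[str]:
    """Return the k documents sharing the most words with the question."""
    q_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(q_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, context: List[str]) -> str:
    """Augment the question with the retrieved context."""
    joined = "\n".join(context)
    return f"Context:\n{joined}\n\nQuestion: {question}"


def answer(question: str, model: Callable[[str], str]) -> str:
    """Retrieve context, build the augmented prompt, then call the model."""
    context = retrieve(question, DOCUMENTS)
    return model(build_prompt(question, context))


def echo_model(prompt: str) -> str:
    """Hypothetical model stand-in: returns the prompt unchanged."""
    return prompt
```

A production system would typically swap the word-overlap retriever for a vector database lookup and `echo_model` for a real foundation model call, which is the kind of multi-component setup the compound AI approach describes.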
Mosaic AI Agent Evaluation is an AI-assisted evaluation tool that promises to automatically determine if outputs are high quality. It also provides a user interface (UI) for getting stakeholder feedback.
Mosaic AI Model Training is used for fine-tuning open source foundation models with an organization’s private data. The models are fully owned and controlled by the customer and should produce better results for specific use cases. Smaller models fine-tuned this way should also be faster and less expensive, with fewer parameters and less computing power needed.
Mosaic AI Gateway is a way for Databricks users to have a unified interface for querying, managing and deploying models – open source or proprietary – so that users can switch the large language models (LLMs) that power applications without complicated changes to the app code.
Gateway supports usage tracking and guardrails for monitoring model calling and setting spending rate limits. Users can also filter for safety and personally identifiable information (PII) no matter the model they use, according to Databricks.
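The design idea behind a gateway of this kind can be sketched as follows. This is a hypothetical illustration, not the Mosaic AI Gateway API: the `Gateway` class, its method names, the simple call cap standing in for a rate limit, and the email-only PII filter are all assumptions made for the example.

```python
import re
from typing import Callable, Dict

# Assumption: a very rough email pattern stands in for real PII detection.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


class Gateway:
    """Hypothetical unified front end for several interchangeable models."""

    def __init__(self, max_calls: int) -> None:
        self._models: Dict[str, Callable[[str], str]] = {}
        self._max_calls = max_calls  # simple call cap, not a time-window limit
        self.calls = 0               # usage tracking

    def register(self, name: str, model: Callable[[str], str]) -> None:
        """Make a model available under a name the application can reference."""
        self._models[name] = model

    def query(self, name: str, prompt: str) -> str:
        """Enforce the cap, redact emails, then forward to the named model."""
        if self.calls >= self._max_calls:
            raise RuntimeError("call limit reached")
        self.calls += 1
        cleaned = EMAIL.sub("[REDACTED]", prompt)
        return self._models[name](cleaned)


# Usage: the application code changes only the model name to switch models.
gateway = Gateway(max_calls=2)
gateway.register("model_a", lambda p: "A:" + p)
gateway.register("model_b", lambda p: "B:" + p)
```

Because every call goes through `query`, the cap and the filter apply no matter which registered model handles the request.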
Mosaic AI Tools Catalog
One Mosaic AI offering, the Mosaic AI Tools Catalog, is launching in private preview rather than public preview, according to the vendor.
The catalog will allow users to govern, share and register tools with Databricks Unity Catalog to make them more secure and discoverable, according to Databricks.
This should make tool-enabled models usable in secure, governed ways while giving the tools discoverability across an organization.
Unity Catalog Open Sourced
Now available is Unity Catalog OSS, an open-sourced version of a product introduced in 2021 that aims to unify data and AI governance across clouds, data formats and data platforms.
The catalog supports any data format and compute engine, according to Databricks. It can read tables with Delta Lake, Apache Iceberg and more while supporting the Iceberg REST (representational state transfer) Catalog and Hive Metastore (HMS) interface standards.
It also promises to ensure unified governance across tabular data, non-tabular data and AI assets including ML models and generative AI tools. Unity Catalog OSS interoperates with Microsoft Azure, Amazon Web Services, Google Cloud Platform, Salesforce, Apache Spark, Trino, DuckDB, Daft, PuppyGraph, dbt Labs, Confluent, Fivetran, Granica, Immuta, Informatica, LangChain, Tecton and more.
Open-sourcing the catalog should give users more flexibility and control without vendor lock-in, according to Databricks. More than 10,000 enterprises use Unity Catalog.
Nvidia Partnership
Not to be outdone by Snowflake announcing a new collaboration with Nvidia around integrating NeMo Retriever microservices into Snowflake Cortex AI, Databricks has revealed that it has its own expanded collaboration with the semiconductor heavyweight.
As part of the collaboration, Nvidia’s Compute Unified Device Architecture (CUDA) will come to Databricks’ Data Intelligence Platform, with the aim of boosting efficiency, accuracy and performance of AI development pipelines, according to Databricks.
This alliance means native support for Nvidia graphics processing unit (GPU) acceleration on the platform and for Nvidia-accelerated computing in Photon, Databricks’ vectorized query engine, according to Databricks.
Delta Sharing Growth
Just before Databricks’ show, the vendor published new growth milestones for Delta Sharing, which is used for sharing live data across clouds, platforms and regions.
Those stats include:
- More than 16,000 data recipients have used Delta Sharing to receive data and AI assets
- Quadruple year-over-year growth in active Delta Shares between data providers and data recipients
- More than 2,000 listings of datasets, AI models and solution accelerators are on the Databricks Marketplace
- Listings on Databricks Marketplace have more than quadrupled year over year
- 40 percent of Delta Sharing connections are through open connectors to Apache Spark, Microsoft Excel, Salesforce’s Tableau and other non-Databricks platforms