Snowflake Data Cloud Summit 2024: The Biggest News
Snowflake Iceberg Tables, Cortex AI and Horizon received major updates during the annual conference.
Snowflake Iceberg Tables general availability. New artificial intelligence enhancements in Cortex AI. And a private preview of an internal model marketplace in Snowflake Horizon.
These are among the biggest announcements unveiled this week at Snowflake Data Cloud Summit 2024, the data storage and processing vendor’s annual conference. Bozeman, Mont.-based Snowflake runs its summit Monday through Thursday in San Francisco.
As part of the announcements, the vendor reiterated that its Snowflake Copilot offering will become generally available (GA) “soon.” The copilot leverages Mistral Large and Snowflake’s proprietary structured query language (SQL) generation model to accelerate productivity, according to the vendor.
Snowflake Data Cloud Summit 2024
Snowflake is a member of CRN’s 2024 Channel Chiefs.
Ahead of Summit, Snowflake Senior Vice President of Worldwide Alliances and Channels Tyler Prince told CRN about the importance of the vendor’s channel partner strategy.
“Ultimately, the real excitement is in the way we show up for our customers with a very connected ecosystem of SIs working with our app and data and tech partners and working closely with us around our AI strategy,” Prince said. “That’s very powerful and adds a tremendous amount of value to our customers.”
Summit comes on the heels of what was called a “decent” quarter in a report by investment firm Bernstein – but with uncertainty around Snowflake’s future.
“We worry that confidence in growth durability and management’s credibility in estimating the revenue/growth opportunity are tenuous, and expectations may creep too high for next quarter,” according to the report. “More importantly, the company’s long-term strategy remains unclear at best and unattainable at worst.”
The vendor raised its expected revenue for the 2025 fiscal year and showed “strong” growth in remaining performance obligation (RPO). The vendor brought in $828.7 million in revenue for the quarter ended April 30, up 34 percent year over year.
Snowflake CEO Sridhar Ramaswamy shouted out partners during the company’s earnings call May 22.
“Snowflake has a powerful and unique partner ecosystem,” Ramaswamy said. “Part of our success is that we have many partners that amplify the power of our platform. They range from big organizations like EY and Deloitte, but also firms like LTIMindtree and Next Pathway.”
He continued: “These partners bring on entirely new capabilities and unlock new use cases for us and our customers. They also often bring new customers to us. And they really care about how easy it is to build on Snowflake, how reliable Snowflake is and also about how we can go to customers jointly. Partners bring enormous power to our data cloud vision. Their success creates success for us and our customers.”
Read on for more of the biggest announcements made at Snowflake Data Cloud Summit 2024.
Iceberg Tables GA
Snowflake has made Iceberg Tables generally available (GA), giving users of the Apache Iceberg open table format full storage interoperability.
This feature promises an easier way to use, govern and collaborate on Iceberg data stored externally. Users can leverage Iceberg Tables for data lakehouses, data lakes, data meshes and other open, flexible architectural patterns, according to the vendor.
On the latest Snowflake earnings call, CEO Ramaswamy said that Iceberg “is enabling us to play offense and address a larger data footprint.”
“Many of our largest customers have indicated that they will now leverage Snowflake for more workloads as a result of this functionality,” he said. “More than 300 customers are using Iceberg in public preview.”
Greater availability of Iceberg will have a negative effect on Snowflake’s revenue in the second half of its fiscal year due to customers moving data out of Snowflake and into Iceberg, Chief Financial Officer Michael Scarpelli said on the call, according to a transcript.
However, Iceberg adoption opens up greater opportunities for Snowflake and its partners down the road.
“Data Lakes or cloud storage in general for most customers has data that is often 100 times or 200 times the amount of data that is sitting inside Snowflake,” Ramaswamy said on the call, according to a transcript. “And now with Iceberg as a format and our support for it, all of a sudden, you can run workloads with Snowflake directly on top of this data. And we don't have to wait for some future time in order to be able to pitch and win these use cases, whether it's data engineering or whether it is AI, Iceberg becomes a seamless pipe into all of this information that existing customers already have.”
In a report after the call, investment firm William Blair said “that Iceberg tables will soon pressure Snowflake’s storage revenue (11 percent of total revenue in the latest quarter) and, to a lesser extent, its compute revenue (due to a reduced need to load/transform data into the” data warehouse).
The firm is also concerned “that price per query could come under pressure over time as Iceberg storage lowers entry barriers to competing query/compute engines,” according to the William Blair report.
Polaris Catalog
Snowflake’s new Polaris Catalog is a vendor-neutral, fully open catalog implementation for Apache Iceberg, promising cross-engine interoperability to give users more choice, flexibility and control over data.
The catalog should become open sourced in the next 90 days, according to Snowflake. Users can run the catalog hosted in AI Data Cloud or self-host with Docker, Kubernetes and other containers. Snowflake-hosted Polaris Catalog should enter public preview “soon,” according to the vendor.
Polaris Catalog offers a centralized place for engines to find and access Iceberg tables with consistent security and full, open interoperability, according to Snowflake. It uses Iceberg’s open source representational state transfer (REST) protocol and supports Apache Flink, Apache Spark, Dremio, Python, Trino and more.
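For a sense of what a shared REST protocol means in practice, here is a minimal sketch of how any engine might address the same Iceberg table over HTTP. The path layout and the namespace separator reflect our reading of the open Iceberg REST catalog specification. The base URL, the "demo" prefix and the table names are hypothetical placeholders, not real Polaris endpoints, and the sketch only builds URLs without contacting a server.

```python
from urllib.parse import quote

# Hypothetical catalog location for illustration; not a real Polaris endpoint.
BASE_URL = "https://catalog.example.com/api/catalog"

# The Iceberg REST spec joins the levels of a multi-part namespace with the
# unit separator character (0x1F) before URL encoding.
NAMESPACE_SEPARATOR = "\x1f"


def encode_namespace(levels):
    """Join namespace levels and percent-encode them for use in a URL path."""
    return quote(NAMESPACE_SEPARATOR.join(levels), safe="")


def list_tables_url(prefix, namespace_levels):
    """URL an engine would GET to list the tables in a namespace."""
    ns = encode_namespace(namespace_levels)
    return f"{BASE_URL}/v1/{prefix}/namespaces/{ns}/tables"


def load_table_url(prefix, namespace_levels, table):
    """URL an engine would GET to load one table's metadata."""
    return f"{list_tables_url(prefix, namespace_levels)}/{quote(table, safe='')}"


if __name__ == "__main__":
    print(list_tables_url("demo", ["sales", "eu"]))
    print(load_table_url("demo", ["sales", "eu"], "orders"))
```

Because every engine resolves tables through the same paths, a Spark job and a Trino query pointed at one catalog would see the same table definitions, which is the interoperability the vendor is promising.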
Snowflake Horizon Updates
Snowflake has launched a private preview of an internal marketplace in Snowflake Horizon as part of a series of advancements for the built-in governance and discovery product.
The internal marketplace lets users publish and curate models, applications and other data products for groups within an organization. The marketplace also prevents unintended sharing to external parties, according to Snowflake.
The vendor will also “soon” launch private previews for AI model sharing and AI-powered object descriptions. Once available, the object descriptions will automatically generate relevant context and comments for tables and views.
Now GA is universal search, with which users can search AI Data Cloud for content in Snowflake storage, external Iceberg storage and from third-party providers, according to the vendor.
Universal search uses search engine technology from 2023 Snowflake acquisition Neeva and allows for natural language queries.
Cortex AI Enhancements
Entering public preview “soon” are two new chat capabilities for Snowflake Cortex AI, Cortex Analyst and Cortex Search, which give users the ability to develop chatbots in minutes against structured and unstructured data, according to the vendor.
Cortex Analyst was built with Facebook parent Meta’s Llama 3 and Mistral Large models. Analyst should allow businesses to securely build applications on top of analytical data in Snowflake, according to the vendor.
Cortex Search uses Neeva retrieval and ranking technology with Snowflake Arctic embed so that users can build apps against documents and other text-based datasets through enterprise-grade hybrid search-as-a-service. This search leverages vector and text, according to Snowflake.
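To illustrate the general idea of hybrid search, the sketch below blends a keyword-match score with a vector-similarity score to rank documents. It is a toy illustration of the technique only, not Snowflake's implementation: the scoring functions, the `alpha` weight and the two-dimensional sample vectors are all assumptions made for the example.

```python
import math


def keyword_score(query, text):
    """Fraction of query terms that also appear in the document text."""
    q_terms = set(query.lower().split())
    d_terms = set(text.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_rank(query, query_vec, docs, alpha=0.5):
    """Rank (text, vector) pairs by a weighted blend of both scores.

    alpha is a hypothetical weight: 1.0 means vector-only, 0.0 text-only.
    """
    scored = [
        (alpha * cosine(query_vec, vec) + (1 - alpha) * keyword_score(query, text), text)
        for text, vec in docs
    ]
    return [text for _, text in sorted(scored, key=lambda s: s[0], reverse=True)]


if __name__ == "__main__":
    # Made-up documents and embeddings, for illustration only.
    docs = [
        ("holiday party schedule", [0.0, 1.0]),
        ("invoice total due", [1.0, 0.0]),
    ]
    print(hybrid_rank("invoice total", [0.9, 0.1], docs))
```

Blending the two signals lets exact terms such as invoice numbers still match while semantically related text can surface even when the wording differs.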
Becoming GA “soon” is Cortex Guard. Guard uses Meta’s Llama Guard to filter and flag harmful content across data and assets. That content could include violence, hate, self-harm or criminal activities, making sure models are safe and usable, according to Snowflake.
Also GA soon is Document AI, which will allow users to extract invoice amounts, contract terms and other content from documents with the Snowflake Arctic-TILT multimodal large language model (LLM).
On Snowflake’s latest quarterly earnings call, CEO Ramaswamy revealed that more than 750 customers use Cortex AI.
In the Bank of America report following Snowflake’s earnings, the investment firm said that Snowflake’s investments in GPUs, the AI platform and adding TruEra for LLM monitoring will weigh on full fiscal year operating and free cash flow margin, but “we believe these are the right investments.”
“Snowflake is in a position of strength to capture AI workloads as the data warehouse leader,” according to Bank of America. The investments “all indicate that the company is serious about closing the competitive gap for AI workloads.”
More AI, ML Improvements
In private preview is the Snowflake AI & ML Studio, a no-code interactive interface for users to start AI development. Users can turn to the studio product for model testing and evaluation, according to the vendor.
Through the studio, users can access Cortex Fine-Tuning, which is now in public preview. Users can also access this feature through a SQL function.
Fine-Tuning is a serverless customization for a subset of Meta and Mistral AI models. Users can leverage fine-tuned models through a Cortex AI function. Snowflake also allows for role-based access controls.
ML Lineage is in private preview, giving users a way to trace feature, dataset and model use across end-to-end ML life cycles.
Now GA is the Snowflake Model Registry for users looking to govern the access and use of different AI model types, according to Snowflake.
And the Snowflake Feature Store is in public preview. Users can make use of this feature to create, store, manage and serve consistent ML features for model training and inference.
Cloud Footprint Growth
During Data Cloud Summit 2024, Snowflake said it will add an EU-only data boundary for European Union users, keeping data within regional borders with stronger data residency and sovereignty assurances that meet regional regulations.
The vendor will also offer a Department of Defense (DoD) environment with Boundary Cloud Access Point (BCAP) networking integration that meets Impact Level 4 (IL4) security controls, according to Snowflake.
Snowflake has more than 40 supported cloud regions.
Snowflake Native App Framework Integration
In public preview on Amazon Web Services is Snowflake Native App Framework integration with Snowpark Container Services, which aims to give users variety in the apps they build in the AI Data Cloud.
With this integration, users can leverage configurable graphics processing unit (GPU) and central processing unit (CPU) instances to allow for computer vision automation, geospatial data analysis, ML apps for enterprises and other use cases.
Developers can build AI-powered Snowflake Native Apps once and then deploy and distribute them across clouds and regions through Snowflake Marketplace. The marketplace has more than 160 total Snowflake Native Apps available.
Snowflake Trail
During Summit, Snowflake unveiled its Trail set of observability capabilities aimed at more visibility into data quality, pipelines and applications.
Developers can leverage Trail to monitor, troubleshoot and optimize workflows, according to Snowflake.
Trail operates under OpenTelemetry standards, so developers can integrate with Grafana, Metaplane, PagerDuty, Slack and other observability and alert platforms.
More Developer Tool Updates
Snowflake launched a series of public previews for tools aimed at developers, including a public preview for AI Data Cloud of Snowflake Notebooks, which is natively integrated with the full Snowflake platform.
Notebooks promises users a single development interface for Python, SQL and Markdown. Developers can leverage Snowflake Notebooks to experiment and iterate on machine learning (ML) pipelines, employ AI-powered editing features and simplify data engineering workflows, among other use cases, according to Snowflake.
Also in public preview is a Snowpark pandas application programming interface (API) that lets Python developers use pandas syntax for AI and pipeline development.
A database change management feature to help with development operations (DevOps) and an integration with Git are in public preview. Going GA “soon” are Snowflake’s Python API and open source Snowflake command line interface (CLI), which should help with continuous integration and continuous delivery (CI/CD) practices, according to the vendor.
Product Use Updates
Snowflake celebrated growth in Streamlit, touting that the open source community it supports now has more than 275,000 monthly active developers and more than 6 million monthly application views.
The vendor bought Streamlit in 2022 for $800 million and has grown the community more than sixfold since then, according to Snowflake.
For Snowflake’s Dynamic Tables offering, the vendor said that more than 2,900 users run more than 200,000 Dynamic Tables to build and manage production-grade data pipelines.