The Coolest Data Warehouse And Data Lake System Companies Of The 2025 Big Data 100
Part 4 of CRN’s Big Data 100 takes a look at the vendors solution providers should know in the data warehouse and data lake systems space.
A Deep Dive Into Data
Data warehouse systems have been at the center of many big data initiatives going as far back as the 1980s. Today companies from leading cloud hyperscalers such as Amazon Web Services (Redshift) and Google Cloud (BigQuery), established data platform providers like Cloudera and Teradata, and startups like Firebolt continue to fuel the growth in the data warehouse space.
In recent years companies such as Databricks and Starburst have promoted the concept of data lakes as more flexible, more cost-effective alternatives to data warehouses.
As part of the CRN 2025 Big Data 100, we’ve put together the following list of data warehouse and data lake companies—from well-established vendors to those in startup mode—that solution providers should be familiar with.
These vendors offer data warehouse and data lake systems that provide data and related capabilities such as data transformation and data governance needed to support advanced data analysis tasks.
Given the wave of AI development that’s boosting the demand for data, many data warehouse and data lake technology developers are adapting their platforms to go beyond data analytics to provide the high volumes of high-quality data needed for AI and generative AI tasks.
This week CRN is running the 2025 Big Data 100 list in a series of slideshows, organized by technology category, spotlighting vendors of business analytics software, database systems, data warehouse and data lake systems, data management and integration software, data observability tools, and big data systems and cloud platforms.
Some vendors have big data product portfolios that span multiple technology categories. They appear in the slideshow for the technology segment in which they are most prominent.
Cloudera
Top Executive: Charles Sansbury, CEO
Cloudera describes its flagship offering as a hybrid cloud data platform that provides the benefits of both on-premises (private cloud) and public cloud data architecture form factors for data management and data analytics tasks.
The company’s on-premises offerings include Cloudera Data Warehouse, Cloudera Data Engineering and Cloudera AI. Those same services are available in the cloud, plus Cloudera Data Flow, Cloudera Data Hub, Cloudera Operational DB and Cloudera Streaming.
Cloudera Shared Data Experience (SDX) is a critical component of the Cloudera platform that ensures data security and governance across on-premises and cloud deployments.
In November Cloudera acquired Octopai, an Israeli developer of data lineage and data catalog software, in a move to expand Cloudera’s data catalog and metadata management capabilities for data analytics and AI tasks.
Dremio
Top Executive: Sendur Sellakumar, CEO
Dremio markets its Dremio Unified Lakehouse Platform for governed, self-service analytics and AI applications.
The Dremio platform incorporates the Apache Software Foundation’s Arrow framework for developing data analytics applications using columnar data, the Apache Iceberg format for data analytics tables, and a SQL query engine for business intelligence and interactive analytics.
One of Dremio’s target markets is businesses and organizations looking to migrate off legacy Hadoop systems.
Earlier this month Dremio debuted the latest release of the Dremio Unified Lakehouse Platform with new intelligent automation capabilities to accelerate AI and analytics projects and reduce costs.
Firebolt
Top Executive: Eldad Farkash, CEO
Israeli startup Firebolt offers a high-performance, low-latency data warehouse cloud service for AI and analytics tasks.
The company’s data warehouse promises sub-second query times, data ELT scalability, high concurrency and optimized compute capabilities for AI, analytics and data applications.
Founded in 2019, Firebolt officially launched its cloud data warehouse service in September 2024 after five years of development. The service is particularly targeted toward developers and data engineers who require high performance for customer-facing analytics and data-intensive applications.
Firebolt has raised $270 million in funding, including $127 million in Series B funding in 2021 and $100 million in Series C funding in 2022. In February the company hired former Oracle and Confluent executive Hemanth Vedagarbha as president to oversee global go-to-market expansion and customer-facing operations.
Kyvos Insights
Top Executive: Praveen Kankariya, CEO
Kyvos Insights’ key product is its Semantic Lakehouse for business intelligence and AI.
The GenAI-driven, high-speed data analytics platform boasts sub-second querying on massive datasets. The universal semantic layer technology “democratizes” data for all users across an organization, according to the company, enabling self-service analytics.
Other products offered by Los Gatos, Calif.-based Kyvos Insights include Kyvos BI and Kyvos Dialogs.
Ocient
Top Executive: Chris Gladwin, CEO
Ocient handles real-time analytics and other compute-intensive data workloads through its Ocient Hyperscale Data Warehouse.
Working with hyperscale datasets, the system efficiently performs data transformation and load tasks, executes complex queries, and runs AI, machine learning and geospatial analysis jobs.
Ocient emphasizes its vertical data analytics services tailored for government, telecommunications, financial services, advertising, geospatial and operational IT.
In January Ocient announced a collaboration with chip designer AMD to pair Ocient’s software with 4th Gen AMD EPYC processors to deliver a 3.5X increase in processing power.
In March 2024 Ocient, headquartered in Chicago, raised $49.4 million in an extension of its Series B funding.
Onehouse
Top Executive: Vinoth Chandar, CEO
The Onehouse Universal Data Lakehouse data storage offering is a fully managed cloud data lakehouse service that can ingest data from many sources in minutes and supports all data analytics and business intelligence query engines.
The service is built on the Apache Hudi open-source data management framework that brings database and data warehouse capabilities to data lakes. (Onehouse founder and CEO Vinoth Chandar was Hudi’s original developer while he was working at Uber.)
In January Sunnyvale, Calif.-based Onehouse announced the general availability of Onehouse Compute Runtime, which optimizes data workloads across all leading cloud data platforms and query engines, including Amazon Redshift, Databricks, Google BigQuery and Snowflake.
Starburst
Top Executive: Justin Borgman, CEO
Starburst’s data lakehouse platform facilitates analytics and AI workloads by unifying data across hybrid, distributed environments. The system incorporates the open-source Trino query engine and Apache Iceberg data table format.
The Starburst portfolio includes Starburst Enterprise, for on-premises deployments and running on hyperscaler cloud platforms, and the fully managed Starburst Galaxy data lakehouse service.
Starburst’s software is also a key technology within Dell Technologies’ Dell Data Lakehouse system.
In February Starburst, headquartered in Boston, said that it achieved record global sales in fiscal 2025, grew its customer base by 20 percent during the year, and signed the largest deal in the company’s history – a multi-year, eight-figure per year contract with an unnamed global financial institution.
SQream
Top Executive: Liam Galin, CEO
SQream develops next-generation relational database technology that runs on Nvidia GPUs to provide a high-performance platform for data management, data analytics and what the company calls the “AI Factory.”
SQream Blue is the company’s cloud-native, fully managed data preparation and data transformation lakehouse.
Earlier this month SQream named Liam Galin as the company’s new CEO to “spearhead [the] SQream 2.0 era,” positioning the company as the “ultimate AI Factory enabler” with a strategic focus on mass-scale AI enablement leveraging the SQream platform’s ability to rapidly transform massive amounts of data into AI-driven insights.
The New York-based company also signaled that it would advance to a new go-to-market model focused on partner channels.
Teradata
Top Executive: Steve McMillan, President and CEO
Teradata was a pioneer in the data warehouse space when it was founded in 1979 as a collaboration between the California Institute of Technology and Citibank’s advanced technology group.
Today the San Diego-based company offers Teradata VantageCloud, a complete cloud analytics and data platform that includes ClearScape Analytics, an in-database analytics capability for operationalizing AI and machine learning at scale.
In March Teradata launched Teradata Enterprise Vector Store, an in-database solution that the company said creates a single, trusted data repository for AI and supports retrieval-augmented generation. The company said Teradata Enterprise Vector Store brings the speed and power of the Teradata platform to vector data management, a crucial capability for effective AI applications.
Yellowbrick Data
Top Executive: Neil Carson, CEO
Yellowbrick targets its SQL data platform for a range of enterprise data warehouse, business intelligence, ad hoc and streaming analytics, and AI workloads. About two-thirds of Yellowbrick implementations are on-premises appliances, but the company has recently been emphasizing its cloud services built on Kubernetes.
The Mountain View, Calif.-based company recently expanded its partnership with Red Hat to support Red Hat’s OpenShift platform, allowing businesses to run the Yellowbrick Data Platform alongside other applications on OpenShift, providing workload mobility across hybrid cloud environments.
