The 10 Biggest Big Data News Stories Of 2019

From major acquisitions like Salesforce-Tableau, to the exploding use of AI and machine learning technologies, to increased consumer demands for data protection regulations, the headlines in the big data world in 2019 were, well, big.

Big News In Big Data

From blockbuster acquisitions, to controversy over data privacy and protection, to the rapid incorporation of artificial intelligence and machine learning within business analytics tools, 2019 was an eventful year in the Big Data arena.

Acquisitions were rife in the industry this year, from game-changing deals like Salesforce.com’s $15.7 billion purchase of data analytics and visualization software vendor Tableau Software to smaller acquisitions that allowed companies like Sisense, Alteryx and Logi Analytics to expand their technology portfolios.

There were big data companies like Databricks that succeeded and flourished, and companies like MapR Technologies that failed.

And there were the mega-trends of increased use of AI and machine learning in big data software, the growing use of Kubernetes as a big data platform, and the rising focus on unprotected and misused customer data.

Here’s a look at what we think were the 10 biggest news stories in Big Data in 2019.

10. SAS Goes All-In On AI With $1-Billion Spending Plan

Artificial intelligence and machine learning were hot in 2019, especially in the big data and business analytics space. But the industry sat up and took notice when business analytics software leader SAS announced in March that the company would invest $1 billion in AI over the next three years.

Specifically, SAS said it would spend the $1 billion on AI research and development, educational initiatives to address customer needs and help them better understand and benefit from AI, and expert services to optimize customer return on AI projects.

SAS is already working with AI in such areas as advanced analytics, machine learning, deep learning, natural language processing and computer vision. The company is building AI capabilities into the core SAS platform and other applications for data management, customer intelligence, fraud and security intelligence, and risk management, as well as within applications for such industries as financial services, government, health care, manufacturing and retail.

9. Big Data Gets Real (Time)

Real-time big data technology has been gaining steam in recent years as demands grow for the ability to process and analyze streaming data in real time, such as data generated by financial trading systems, operational IT applications or networks of Internet of Things devices.

Market researcher IDC has forecast that by 2025 nearly 30 percent of all generated data will be real-time data.

That’s driving the growing popularity of tools and software for managing and analyzing real-time data, including open-source systems like Apache Flink, Apache Kafka and Apache Storm; software from developers including Confluent and Striim; and cloud-based systems such as Amazon Kinesis and Azure Stream Analytics.
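As a rough illustration of what processing streaming data in real time involves, the sketch below sums events into fixed, non-overlapping time windows as they arrive. It is a minimal, standard-library-only example and not the API of any product named above; the function name `tumbling_window_totals` and the sample events are hypothetical, and the code assumes events arrive in timestamp order.

```python
def tumbling_window_totals(events, window_seconds=60):
    """Sum event values per fixed, non-overlapping time window.

    `events` is an iterable of (timestamp_seconds, value) pairs that is
    assumed to arrive in timestamp order, as in a simple stream.
    Yields (window_start, total) each time a window closes.
    """
    current_start = None
    total = 0.0
    for timestamp, value in events:
        # Map the timestamp to the start of the window it falls in.
        window_start = int(timestamp // window_seconds) * window_seconds
        if current_start is None:
            current_start = window_start
        if window_start != current_start:
            # A new window has begun, so emit the finished one.
            yield current_start, total
            current_start, total = window_start, 0.0
        total += value
    if current_start is not None:
        # Emit the last, still-open window when the stream ends.
        yield current_start, total


# Hypothetical event values keyed by arrival time in seconds.
sample_events = [(0, 10.0), (15, 5.0), (61, 2.5), (119, 1.0), (120, 4.0)]
window_totals = list(tumbling_window_totals(sample_events, window_seconds=60))
```

Production streaming systems also have to handle late or out-of-order events, state recovery and scaling across machines, all of which this sketch leaves out.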

8. The Rise Of Kubernetes As A Big Data Platform

As big data systems continue to sprawl across hybrid-cloud and multi-cloud systems, Kubernetes, the open-source container-orchestration technology, continued to gain momentum in 2019 as the platform of choice for running big data workloads.

In one notable example, Google, which originally developed Kubernetes, has recently been experimenting with the container platform as an alternative to the YARN resource scheduling software for scheduling Apache Spark workloads for the Google Cloud Platform.

Big data developers are writing their software to run on Kubernetes, using the platform both to run big data workloads and to automate application deployment and management. As big data increasingly becomes “Big Data-as-a-Service,” look for this trend to continue and even accelerate in 2020.

7. Business Analytics Vendors Expand Product Capabilities Through Acquisitions

Salesforce.com’s acquisition of Tableau Software and Google’s bid to buy Looker Data Sciences caught everyone’s attention in 2019. But beyond those blockbuster deals there were a significant number of acquisitions in the business analytics space throughout the year. In most cases the acquiring companies were looking to expand the capabilities of their technology portfolio, such as with new analytical functionality, data management software or ETL (extract, transform and load) tools.

In May Sisense, an established player in the analytics space, bought up-and-coming cloud analytics developer Periscope Data to add that company’s advanced data science and analysis capabilities to its product lineup. One month earlier Alteryx, also an established data analysis platform vendor, spent $20 million to acquire ClearStory Data and its data preparation, blending and discovery software.

Qlik, one of the best-known developers of business analysis software, acquired Attunity, a developer of data management and integration software, in a $560 million deal. Data management/secondary storage tech vendor Cohesity bought Imanis Data and its data management and backup software for Hadoop and NoSQL workloads and databases. And Logi Analytics looked to build on its success in embedded analytics with its purchase of business analytics software vendor Zoomdata.

6. The Collapse Of MapR Technologies

MapR Technologies of Santa Clara, Calif., was one of the leading developers of Hadoop-based big data platforms, along with rivals Cloudera and Hortonworks. The once-high-flying company raised $280 million in venture funding and at one time had a market capitalization that exceeded $1 billion.

But Hadoop sales haven’t taken off in the way many people expected. Managing Hadoop and developing applications on it proved to be complex. And cloud-based systems offered by Amazon Web Services and Microsoft Azure cut into the demand for on-premises Hadoop systems using HDFS (Hadoop Distributed File System).

The market changes were a driver behind the merger of Cloudera and Hortonworks. MapR, struggling with poor financial results, warned in May that without additional funding or a buyer it might have to lay off workers and close its headquarters. In August Hewlett Packard Enterprise announced a deal to acquire MapR’s business assets, including its technology and intellectual property, for an undisclosed sum. HPE is combining the technology with other products, including the BlueData container platform, in a move to expand its AI and data analytics business.

5. The Rise, Fall, And Rise Of Cloudera

Cloudera has been one of the leading big data software companies in recent years, a position that was enhanced at the start of 2019 when it completed its merger with rival Hortonworks, seemingly creating an unbeatable big data juggernaut.

But Cloudera, already tasked with the challenge of integrating the Cloudera and Hortonworks product lines, was also facing questions about the demand for Hadoop-based products like Cloudera’s platform. (Those concerns were highlighted mid-year when MapR Technologies, which also sold Hadoop-based systems, closed its doors and sold its technology and business assets to Hewlett Packard Enterprise.)

Things came crashing down in June when Cloudera, while announcing earnings that beat expectations, issued revenue guidance that did not. CEO Tom Reilly, who had led the company for six years, announced that he would step down at the end of July – news that sent the company’s stock plummeting nearly 30 percent.

In September, however, Cloudera launched the Cloudera Data Platform, a new flagship product that the company touted as being much more than a merged version of the older Cloudera and Hortonworks systems. Also in September Cloudera acquired Arcadia Data’s technology assets, including its ArcEngine software that boosts analytic performance. Those moves, combined with a major overhaul of the Cloudera Connect partner program, appeared to position the company for a smoother 2020.

4. Growing Calls For More Rigorous Data Privacy/Protection Regulations

The European Union’s General Data Protection Regulation (GDPR) went into effect on May 25, 2018. In the U.S., meanwhile, there has been a seemingly never-ending stream of big data disasters: hacked customer databases, customer databases left unprotected on the web, and – most ominously – the misuse of user and customer data by online marketers, social media companies and others.

In July, for example, Capital One revealed that a hacker, a former programmer at the financial services company, had gained access to personal information from 106 million credit card customers and applicants.

Also in July, Facebook agreed to pay a $5 billion penalty to settle Federal Trade Commission allegations that the social media giant mishandled user privacy practices and lost control over huge amounts of user data.

All this has led to growing calls for more rigorous data privacy and protection regulations in the U.S. In July, Sen. Ron Wyden renewed his push for data privacy legislation and urged passage of the Consumer Data Protection Act that he introduced in late 2018. And in September 51 top CEOs in the U.S. – including top executives at Amazon, IBM, Dell Technologies and JP Morgan – urged passage of federal data privacy legislation in an open letter to Congress.

California has gone it alone, creating the California Consumer Privacy Act that takes effect Jan. 1, 2020.

3. The Automated ML Explosion

Artificial intelligence and machine learning were hot in 2019, and probably nowhere more so than in the business analytics space. Indeed, it seemed at times that “big data analytics” and “artificial intelligence” had become synonymous.

Machine learning provides IT systems with the ability to learn and improve from experience without being explicitly programmed. Trouble is, building machine learning systems and the models that support them can be a complex task that requires serious data science expertise. That has led to an explosion of automated machine learning tools, frameworks and services – either offered as stand-alone automated ML products or built into existing big data systems.
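To give a feel for what automated ML tools take off a data scientist’s plate, here is a deliberately tiny sketch of one piece of the job: fitting several candidate models and automatically keeping the one that scores best on held-out data. It uses only the Python standard library, and every name and data point in it (`fit_mean`, `fit_line`, `auto_select`, the sample numbers) is hypothetical; it does not reflect how any vendor’s product works.

```python
from statistics import mean


def fit_mean(xs, ys):
    """Baseline candidate: always predict the average training value."""
    avg = mean(ys)
    return lambda x: avg


def fit_line(xs, ys):
    """Candidate: ordinary least-squares straight line through the data."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def mean_squared_error(model, xs, ys):
    """Average squared gap between predictions and actual values."""
    return mean([(model(x) - y) ** 2 for x, y in zip(xs, ys)])


def auto_select(candidates, train, holdout):
    """Fit every candidate on `train`; return the name with lowest holdout error."""
    scores = {}
    for name, fit in candidates.items():
        model = fit(*train)
        scores[name] = mean_squared_error(model, *holdout)
    best_name = min(scores, key=scores.get)
    return best_name, scores


# Hypothetical data: values that grow roughly linearly with x.
train = ([0, 1, 2, 3], [1.0, 3.1, 4.9, 7.0])
holdout = ([4, 5], [9.1, 10.9])
candidates = {"mean": fit_mean, "line": fit_line}
best_name, scores = auto_select(candidates, train, holdout)
```

Real automated ML products search far larger spaces of algorithms, features and tuning parameters, but the loop of fitting, scoring and selecting is the part they automate.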

Companies developing automated ML software include Aible, Big Squid, dotData, DataRobot and H2O.ai. Established big data and business analytics companies offering automated ML capabilities as part of their technology portfolios include Cloudera, Databricks, SAS and Splunk, as well as cloud platform vendors Microsoft (Azure Machine Learning) and Amazon Web Services (Amazon SageMaker).

2. Google Cloud Bids To Acquire Looker For $2.6 Billion

In June Google struck a deal to acquire business analytics software developer Looker Data Sciences for $2.6 billion in a move to expand the business intelligence capabilities of the Google Cloud Platform.

Google said that by adding the Looker technology to GCP, Google could provide customers with a more comprehensive analytics solution, from data ingestion to embedded analytics and data visualization.

But the two companies have yet to wrap up the deal and the acquisition has hit a couple of bumps, notably a deeper review of the deal by the U.S. Department of Justice’s antitrust division and closer scrutiny of the acquisition by the U.K.’s competition watchdog agency.

1. Salesforce Buys Tableau Software In Blockbuster $15.7 Billion Deal

Tableau has generally led the pack of business analytics and data visualization software vendors in recent years, going public in 2013 and reaching $1.16 billion in sales in 2018. So it came as a surprise to many in June when the Seattle-based company agreed to be acquired by cloud application giant Salesforce.com for a whopping $15.7 billion. The acquisition was completed Aug. 1.

The acquisition was seen as evidence of the increasingly data-centric nature of today’s business environment. Salesforce and Tableau executives said the combination of Salesforce CRM applications and Tableau business analytics software would create a digital transformation powerhouse.

Salesforce has promised to continue operating Tableau as an independent subsidiary but has yet to provide details about integrating the two companies’ product lines and how – or if – Tableau’s software will work with Salesforce’s existing Einstein business analytics software. (While some expected to hear more at Tableau’s recent Tableau Conference 2019, a shareholder lawsuit in the U.K. reportedly kept executives from discussing those plans.)