CrowdStrike Has Been Doing Updates This Way ‘For Many Years’: What Went Wrong?

With a potentially lengthy recovery ahead, the defective CrowdStrike update that led to an unprecedented worldwide outage of Microsoft Windows systems will ultimately raise questions about the automatic update process for cybersecurity tools.

It’s actually no exaggeration to say, as John Hammond did Friday, that the IT outage caused by a defective CrowdStrike software update has been “earth-shattering.” If ever a cyber incident has been both severely disruptive and global in its proportions, it’s this one—as Hammond, principal security researcher at cybersecurity specialist Huntress, rightly noted.

The twist, however, is that it was not a cyberattack that caused the widespread disaster. Instead, it was the actions of a cybersecurity vendor, whose whole purpose is to prevent cyber disruptions, which led to the meltdown of potentially millions of Microsoft Windows systems worldwide and impacted much of what the modern world depends on, from air travel to health care to banking.

Without a doubt, “this one took us all by surprise,” Hammond said.

George Kurtz, the co-founder and CEO of CrowdStrike, would seem to be as shocked as anyone. This was not some new process that the cybersecurity vendor was using when it deployed a “content” update to its Falcon software; the updates have been done this same way “for many, many years,” Kurtz said Friday on NBC’s “Today” show.

So: What could have gone wrong?

Trellix CEO Bryan Palma, whose company competes with CrowdStrike on endpoint security, said the time for those questions will certainly come once the recovery has been completed, which may not be anytime soon.

“If you have hundreds of thousands or millions of endpoints, it’s a big deal,” Palma told CRN. “It’s going to take weeks to figure out. And it’s super hard to do at scale. So this isn’t a problem that’s over in 24 to 48 hours.”

Once things have stabilized, CrowdStrike will undoubtedly face tough questions about its automatic update process, which experts say does not appear to have been staggered to ensure that a problematic update would have a limited effect.

“The next set of questions will be, is this acceptable? Who else does it this way? Why was it done this way?” Palma said. “All that will come out over time.”

ThreatLocker CEO Danny Jenkins said it appears that the CrowdStrike update was not staggered because it was not a full software patch, which would have been released in stages.

Instead, Jenkins said, this was an update to CrowdStrike Falcon likely targeted at protecting customers against newly discovered cyberthreats, which is a frequent type of update for an endpoint security tool.

To keep customers protected, CrowdStrike “wants to push those threat updates instantly, to as many people as possible,” he said.

As a result, a high proportion of CrowdStrike customers were likely to be affected by the update, which the company has said contained an unspecified “defect” for the Windows version of Falcon.
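
To make the contrast between a staggered release and an all-at-once push concrete, here is a minimal Python sketch. It is an illustration only and does not describe CrowdStrike's actual update pipeline; the function name `staged_rollout`, the ring fractions, the health check and the 1 percent failure threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class RolloutResult:
    updated: int  # number of hosts that received the update
    halted: bool  # True if the rollout stopped because a ring failed


def staged_rollout(
    hosts: Sequence[str],
    ring_fractions: Sequence[float],
    is_healthy: Callable[[str], bool],
    max_failure_rate: float = 0.01,
) -> RolloutResult:
    """Push an update ring by ring; stop once a ring's failure rate is too high.

    ring_fractions are cumulative shares of the fleet, e.g. (0.01, 0.10, 1.0).
    is_healthy is a hypothetical post-update health check for a single host.
    """
    total = len(hosts)
    updated = 0
    start = 0
    for fraction in ring_fractions:
        # Each ring extends up to the cumulative fraction, with at least one host.
        end = min(total, max(start + 1, round(total * fraction)))
        ring = hosts[start:end]
        if not ring:
            break
        failures = sum(1 for host in ring if not is_healthy(host))
        updated += len(ring)
        start = end
        if failures / len(ring) > max_failure_rate:
            return RolloutResult(updated=updated, halted=True)
    return RolloutResult(updated=updated, halted=False)


fleet = [f"host-{i}" for i in range(10_000)]


def always_crashes(host: str) -> bool:
    """Simulates a defective update: every updated host fails its check."""
    return False


# Staggered: the defect is caught in the first 1 percent ring.
staggered = staged_rollout(fleet, (0.01, 0.10, 0.50, 1.0), always_crashes)

# Instant: a single ring covering the whole fleet.
instant = staged_rollout(fleet, (1.0,), always_crashes)

print(staggered.updated, instant.updated)
```

In this toy model, the staggered push stops after the first ring of hosts, while the single-ring push reaches the entire fleet before the failure is noticed. The trade-off is the one Jenkins describes: each additional ring delays protection for the hosts still waiting.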

Hammond noted that as part of keeping up with hackers, many cybersecurity vendors have adopted similar practices around automated updating. Access to the Windows kernel—which has been implicated in the Microsoft outage—has also been considered crucial in order to provide strong security, he said.

The conglomeration of factors that made the outage possible is really “the nature of the beast” in terms of today’s cybersecurity practices, Hammond said.

Staying Ahead Of Hackers

In his televised comments Friday, Kurtz said the larger context here is that CrowdStrike does things this way because it is “always trying to stay one step ahead of the adversaries.”

“Our systems are always looking for the latest attacks from these adversaries that are out there,” Kurtz said, indicating that this “content update” went out in connection with the changing threat environment.

Ultimately, “when you look at software, it is a very complex world, and there are a lot of interactions. And always staying ahead of the adversary is certainly … a tall task,” he said.

At the same time, effectively pushing cybersecurity-related updates is all about finding the right “balance,” Trellix’s Palma said. “And obviously, their balance was out of whack here.”

In a statement provided to CRN Friday, CrowdStrike said it is “working with all impacted customers to ensure that systems are back up and they can deliver the services their customers are counting on.”

“We understand the gravity of the situation and are deeply sorry for the inconvenience and disruption,” CrowdStrike said in the statement.

Kin Mitra, president and CEO at Mission Critical Systems, a Fort Lauderdale, Fla.-based CrowdStrike partner, said in an email to CRN that most of his company’s customers use CrowdStrike on Windows machines. “We have been in firefighting mode all day,” Mitra said.

While CrowdStrike has certainly “stepped up to the plate to help customers,” he said, it’s unclear why the vendor “did not do better vetting of the update internally before deploying.”

“I think they have learned a tough lesson,” Mitra said.