Microsoft: Windows ‘Clearly’ Needs Better Resilience After CrowdStrike Outage

‘This incident shows clearly that Windows must prioritize change and innovation in the area of end-to-end resilience,’ a Microsoft executive says in a blog.

Microsoft acknowledged that it must “prioritize change and innovation” for Windows following the massive outage of Windows devices caused by a defective CrowdStrike update.

The outage, which began July 19 and had lingering impacts for much of the following week, saw 8.5 million Windows devices suffer the “blue screen of death” and become inoperable until they were fixed manually by an IT professional. The societal impacts were wide-ranging, and estimates suggest the costs to major corporations will reach into the billions of dollars.

In a blog post Friday, Microsoft executive John Cable wrote that the outage “shows clearly that Windows must prioritize change and innovation in the area of end-to-end resilience.”

“These improvements must go hand in hand with ongoing improvements in security and be in close cooperation with our many partners, who also care deeply about the security of the Windows ecosystem,” wrote Cable, vice president of Windows servicing and delivery at Microsoft.

Notably, Cable also touched on third-party access to the Windows kernel, which is seen as a key factor behind the incident. The ability of CrowdStrike’s defective update to affect the Windows kernel is said to have made the outage possible.

In the blog post, Cable pointed to recently announced capabilities that “provide an isolated compute environment that does not require kernel mode drivers to be tamper resistant.” He also mentioned Microsoft’s Azure Attestation service.

“These examples use modern Zero Trust approaches and show what can be done to encourage development practices that do not rely on kernel access,” Cable wrote. “We will continue to develop these capabilities, harden our platform, and do even more to improve the resiliency of the Windows ecosystem, working openly and collaboratively with the broad security community.”

A report from The Verge characterized the blog as Microsoft opening the conversation around preventing security vendors from accessing the Windows kernel.

CRN has reached out to Microsoft and CrowdStrike for comment.

Ultimately, “the recent CrowdStrike incident underscores the need for mission-critical resiliency within every organization, and our unique ability to support the change required,” Cable wrote in the post.

CrowdStrike said that 97 percent of Windows sensors for Falcon were online as of Thursday.

In the vendor’s “Preliminary Post Incident Review” post last week, CrowdStrike specified that the update that led to the outage involved what’s known as “rapid response content,” which is used to perform “behavioral pattern-matching operations” to thwart future cyberattacks.

The defective content in question had been stored within a “proprietary” binary file and was “not code or a kernel driver,” CrowdStrike said.

CrowdStrike disclosed in the preliminary review that a bug in its validation process for security configuration updates to its Falcon platform resulted in the outage.