Microsoft 365 Services Hit By Outage

This is at least the third major outage for Microsoft in July.

Days after a faulty CrowdStrike update downed millions of Windows machines worldwide and a coincidental Microsoft Azure region outage in the United States, Microsoft has reported “access issues and degraded performance” for a variety of services and features under its Microsoft 365 umbrella.

The Redmond, Wash.-based cloud and AI tools vendor posted to X – formerly known as Twitter – at 5:48 a.m. Pacific Time on Tuesday that “we’re currently investigating access issues and degraded performance with multiple Microsoft 365 services and features,” according to the X post. “More information can be found under MO842351 in the admin center. The outage occurred just hours before Microsoft reports earnings for its fourth fiscal quarter.

At 7:51 a.m. Pacific, Microsoft added: “We’ve applied mitigations and rerouted user requests to provide relief. We're monitoring the service to confirm resolution and further information can be found at https://status.cloud.microsoft or under MO842351 in the admin center.” When a CRN reporter tried to access the cloud status website about 10 minutes after the X post, the reporter received a “site can’t be reached” error message.

[RELATED: Microsoft Q4 2024 Earnings Preview: 5 Things To Know]

Microsoft 365 Outage

CRN has reached out to Microsoft for comment.

On a Microsoft Azure Status website, the vendor said that “starting approximately at 11:45 UTC on 30 July 2024, a subset of customers may have experienced issues connecting to Microsoft services globally.”

“We have implemented networking configuration changes and have performed failovers to alternate networking paths to provide relief,” according to the status page. Monitoring telemetry shows improvement in service availability from approximately 14:10 UTC onwards, and we are continuing to monitor to ensure full recovery.

A chart breaking down the issues warned that users of any region in the Americas, Asia Pacific, Europe, the Middle East and Africa (EMEA) could face network infrastructure issues.

Ookla’s Downdetector website showed reports of M365 outages reaching a 24-hour high of 363 Tuesday around 6:22 a.m. Pacific. Azure outage reports reached a 24-hour high of 483 at around the same time.

Users in the “SysAdmin” Subreddit reported experiencing outages in Eastern Europe, the United Kingdom, Canada, and the Eastern U.S.

Some users found humor in the situation. “Why does Team's never go down for me? An afternoon free of calls or messages would be lovely,” one user wrote, referring to Microsoft’s popular communications and collaboration platform.

July 19 Outage

The outage does not appear related to the firestorm that has erupted since a faulty CrowdStrike update interfered with millions of Windows machines on July 19 – with some Windows users including Delta Air Lines still dealing with the effects of the outage days later.

Hours before the CrowdStrike incident, “Microsoft experienced an unrelated (outage) that affected access to various Azure services and customer accounts configured with a single-region service in the Central US region,” according to a July 26 blog post by Cisco subsidiary ThousandEyes.

“This outage occurred around the same time as the CrowdStrike incident, from 9:56 PM (UTC) on July 18 to 12:15 PM (UTC) on July 19,” according to the post. “The close timing of the two incidents may have caused some confusion and led to the larger global IT outage being mistakenly attributed to Microsoft. Although Microsoft systems were affected during the CrowdStrike incident, it was completely unrelated to the Azure incident.”

The outage included “failures of service management operations and connectivity or availability of services,” with ThousandEyes saying that “connectivity into the Central US region appeared impaired, with forwarding loss being observed at the ingress points to the affected region … Among those impacted were Confluent, Elastic Cloud, and Microsoft 365.”

“Microsoft’s status update also identified a configuration change as the underlying cause that impacted the connectivity of backend services, specifically storage clusters and compute resources. This then triggered some automated mitigation with services being restarted repeatedly.”

ThousandEyes also reported on an Azure issue July 13 that led to service interruptions for Grammarly.

“Azure reported that the Azure OpenAI (AOAI) service has an automation system that is implemented regionally but uses a global configuration to manage the lifecycle for certain backend resources,” according to the report. “A change was made to update this configuration to delete unused resources in an AOAI internal subscription. There was a quota on the number of storage accounts on this subscription, which were unused and intended to be cleaned up to prevent storage quota pressure.”