The 10 Biggest Cloud Outages Of 2013
No One Is Safe
No one is safe in the world of cloud outages, and 2013 seemed to make that abundantly clear. From Verizon Terremark to Microsoft Azure to Amazon Web Services and more, cloud companies were taking hits left and right over the past year. While some outages lasted only minutes, others dragged on for days, causing companies to lose millions and customers to become increasingly frustrated. Here's a look back at the biggest cloud outages of the past year.
10. Dropbox
On Jan. 10, Dropbox kicked off 2013 with a cloud outage that took the online file storage service offline for more than 15 hours.
"Creating/joining shared folders, and creating shareable links to files, also affected. We appreciate [your] patience as we resolve this issue," Dropbox said in a Jan. 13 tweet. Later, the company said the outage was caused by a synchronization problem between client software and servers.
The company had another global cloud outage at the end of May, which, fortunately for Dropbox and its users, only lasted for an hour.
9. SCORM Cloud
A planned upgrade to the SCORM Cloud cut service for more than three hours on March 14 when mistakes were made in the rollout. The update was designed to improve performance by changing how the cloud handles caching, but the changes caused a failure in a crucial Amazon server.
To fix the problems, SCORM, part of Franklin, Tenn.-based Rustici Software, had to take the cloud offline and temporarily restrict it to one availability zone.
8. Amazon Web Services
In September, a Friday the 13th outage took out regional Amazon Web Services due to a load balancing issue. Amazon resolved the connectivity issues for the load balancers and increased provisioning times to prevent the issue going forward. While the service was only down for about two hours and affected only a single availability zone in Virginia, it was an important reminder for companies to have backup plans for cloud infrastructure in case of outages, said a report by continuity software company Neverfail.
7. Microsoft Mail Services
A firmware update took down Microsoft's Hotmail, Outlook.com and SkyDrive services on March 13 for approximately 16 hours.
"This is an update that had been done successfully previously, but failed in this specific instance in an unexpected way," Arthur de Haan, vice president of Microsoft, wrote in a blog post at the time. "This failure resulted in a rapid and substantial temperature spike in the datacenter. This spike was significant enough ... that it caused our safeguards to come in to place for a large number of servers in this part of the datacenter."
The safeguards blocked users' access to their email and files, and according to de Haan, there was a "mix of infrastructure software and human intervention that was needed to bring the core infrastructure back online."
6. Apple iCloud
April proved a tough month for Apple, as the company experienced a series of outages. The first hit on April 9 took out iMessage and FaceTime services for some users, and it returned for a second blow on the same services just three days later. A week later, it was reported that the Apple iTunes Store and Game Center had experienced outages. Also part of the outage were reports that a small percentage of users couldn't send emails through icloud.com for up to 27 hours.
And if those outages weren't enough, Apple wrapped up the month with an iCloud-based services outage, which included disruptions to login, email and iTunes services, among others. The outage spanned more than five hours in some areas.
5. Microsoft Azure, Part 1 And 2
Microsoft suffered two notable Azure outages during the year, the first in February and the second in October.
The Feb. 29 Azure outage kicked some users out completely and left others unable to manage applications for more than eight hours. Microsoft said the outage was due to a "cert issue" related to a time calculation problem around the Leap Year day. Later in the year, a partial outage on Oct. 30 left Azure Compute cloud users unable to upload files or manage websites hosted on the Azure servers. Microsoft attributed the outage to a sub-component of the system that took out service worldwide and resolved the issue after more than 20 hours of problems, although partial outages lingered in FTP services.
4. Amazon.com
An approximately 49-minute outage in August cost Amazon a staggering $5 million in potentially lost revenue. Although it wasn't a long outage, it certainly was an expensive one. While other pages of the site appeared to be working, the gateway page of the e-commerce site was down. On the bright side for Amazon, the outage didn't appear to touch Amazon Web Services, which remained running during the site outage.
On Twitter, some groups claimed responsibility for the outage, but reports said Amazon denied it was a DDoS attack.
3. Google
Even if it's just a five-minute outage, when Google goes down, the Internet grinds to a halt. And on Aug. 16, Google.com did just that, with all services down for less than five minutes. In that time, reports said that the volume of global Internet traffic plummeted by around 40 percent.
That wasn't the only time this year that Google services took a hit, however. Google Drive was down intermittently from March 18 to 20 for a grand total of 17 hours due to a glitch in the network control software. The outage impacted around one-third of Google customers. More recently, Gmail experienced a 12-hour outage on Sept. 23 from a dual network failure that hit around a third of its user base.
2. Yahoo Mail
From Dec. 9 through Dec. 13, Yahoo users were unable to access their free email accounts, with little explanation from the Internet giant as to why. The high-profile POP-IMAP outage was caused by a "specific hardware outage" in Yahoo's storage system and proved trickier to resolve than initially thought. The problems marked a step back for the company as it works to reinvent itself under CEO Marissa Mayer.
"This has been a very frustrating week for our users and we are very sorry," Mayer said in a statement on Tumblr. "For many of us, Yahoo Mail is a lifeline to our friends, family members and customers. This week, we experienced a major outage that not only interrupted that connection, but caused many of you a massive inconvenience -- that's unacceptable and it's something we're taking very seriously. Unfortunately, the outage was much more complex than it seemed at first, which is why it's taking us several days to resolve the compounding issues."
1. Verizon Terremark
A Verizon Terremark outage on Oct. 27 dealt yet another blow to the Affordable Care Act site, which has been struggling since its launch at the beginning of October. Verizon Terremark was in charge of hosting the site's database hub. Verizon vaguely attributed the outage to a "failure in a networking component," and while that component was being fixed, a problem during regular maintenance took the site down. Other sites were also hit by the Terremark outages, Verizon said, but HealthCare.gov was the highest-profile site hit by the downtime. At the time of the outages, Verizon did not respond to requests for comment, but solution providers told CRN that there should have been redundancies and diversity built into the system to prevent the problem.
A month later, reports said that the Department of Health and Human Services decided not to renew its contract with Verizon and instead signed a $38 million contract with HP's Enterprise Services unit for hosting going forward.