AWS outage: Downtime incident blights users of one of Amazon’s major US datacentre regions

Public cloud giant suffers prolonged service outage due to fault originating in one of its major US datacentre regions

Caroline Donnelly, Senior Editor, UK

Published: 26 Nov 2020 10:57

Amazon Web Services (AWS) users are awaiting a full explanation from the public cloud giant about the cause of a prolonged outage at one of its major US datacentre regions that began on Wednesday 25 November 2020, US time.

The downtime incident is known to have originated within the company’s US-East-1 datacentre region, and to have been caused by a defect in the application programming interface (API) of its real-time data-streaming service, Kinesis Data Streams (KDS).

The issue is known to have blighted the usability of a number of high-profile internet services that rely on KDS, many of which used the social networking site Twitter to confirm they were affected by the downtime. One said:

“An Amazon AWS outage is currently impacting Adobe Spark, so you may be having issues accessing/editing your projects. We are actively working with AWS and will report when the issue has subsided. https://t.co/uoHPf44HjL for current Spark status. We apologize for any inconvenience! – Adobe Spark (@AdobeSpark) November 25, 2020.”

The outage has also served to highlight the interdependencies that exist within the wider AWS portfolio, as the issues encountered by the KDS API are known to have negatively affected the performance of a number of other AWS services that rely on it to work.

The company’s cloud service status page makes reference to other “dependent services” being affected by the outage, which AWS first acknowledged at around 2am GMT on Thursday 26 November.

For example, respondents to the AWS Support Twitter feed reported issues with its code building and test offering, CodePipeline, and its infrastructure monitoring service, Amazon CloudWatch. At one point during the outage, the service status page was also unavailable.

At the time of writing, the AWS service status dashboard confirmed that the company had resolved the issue and restored service to all of the affected parts of the AWS portfolio, but no further details have been given about the circumstances that led to the outage.

“We have identified the root cause of the Kinesis Data Streams event, and have completed immediate actions to prevent recurrence. Kinesis and CloudWatch are operating normally,” said a statement on the AWS Service Status page, published just after 9am GMT today.

