IBM Cloud users are still awaiting a full explanation from Big Blue about why they were left unable to access the firm’s off-premise services during a multi-hour outage on Tuesday 9 June 2020.
Users across the globe, including the US, Australia and Japan, were left unable to access a number of core services within the IBM Cloud portfolio – including the firm’s own status page – during the outage, which is known to have begun at around 2.30pm Pacific Time.
The firm’s handling of the outage came under fire during the incident on social networking site Twitter, as users flooded the IBM Cloud account for clarification on whether its services were experiencing technical difficulties, and – if so – when they were likely to be restored.
However, it was not until several hours into the outage that the IBM Cloud Twitter account acknowledged the issue by running a message stating: “We are aware of the reported IBM Cloud outages, we are investigating and service will be restored as soon as possible.”
This was followed around an hour later, at about 6.30pm Pacific Time, by a brief message confirming that its services were back up and running, although it stopped short of revealing the root cause of the problem.
This omission, in turn, prompted a flood of responses from disgruntled IBM Cloud users demanding the firm conduct and publish a thorough post-mortem into the outage.
The now-restored IBM Cloud status page provides some additional detail about what may have caused the outage. Several of its entries for incidents that occurred on the same day and at the same time as the outage refer to an issue being introduced to IBM’s network operations by a third-party provider.
A separate status page for the IBM-owned Aspera service, which is used by enterprises that need to move large files and big data-related repositories, goes into a little more detail, before going on to confirm that a root cause analysis of the incident will be published imminently.
“A third-party network provider was advertising routes which resulted in our [worldwide] traffic becoming severely impeded,” the IBM Aspera status page states.
“This led to IBM Cloud clients being unable to log-in to their accounts, greatly limited internet [datacentre] connectivity and other significant network route-related impacts. Network Specialist have made adjustments to route policies to restore network access, and alleviate the impacts.”
Computer Weekly contacted IBM for further comment and to see when further clarification on the cause of the outage would be forthcoming, but had not received a response at the time of publication.