Visa reveals 'rare' datacentre switch fault as root cause of June 2018 outage

Visa has offered a retrospective analysis of what went wrong in its datacentre during its UK-wide outage on Friday 1 June, in response to a request from the Treasury Select Committee for more detail about the downtime

Caroline Donnelly, Senior Editor, UK

Published: 19 Jun 2018 15:30

Visa has revealed that a “rare defect” in a datacentre switch stopped millions of credit card transactions from being carried out during its UK-wide outage on Friday 1 June, in a letter to the Treasury Select Committee.

The Committee is understood to have contacted the credit card payments firm seeking both clarification of the cause of the outage and assurances about what action Visa is taking to prevent a repeat.

In the 11-page letter, Visa expands on its previous explanation that a “hardware failure” caused the 10-hour outage, laying the blame on a defective switch in its primary UK datacentre, which in turn delayed its secondary datacentre from taking over the load.

The primary and secondary datacentres are set up so that either one has sufficient redundant capacity to process all the Visa transactions that take place across Europe should a fault occur, and the systems are tightly synchronised so that this can happen at a moment’s notice.

“Each datacentre includes two core switches – a primary switch and a secondary switch. If the primary switch fails, in normal operation the backup switch would take over,” the letter reads.

“In this instance, a component within a switch in our primary data centre suffered a very rare partial failure which prevented the backup switch from activating.”

This, in turn, meant it took longer than intended to isolate the primary datacentre and activate the backup systems that should allow its secondary site to assume responsibility for handling all of the credit card transactions taking place at that time.
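The letter, as quoted here, does not explain how a partial failure stopped the backup switch from activating. As a purely illustrative sketch, and not a description of Visa's systems, the Python below shows one generic way this class of fault can arise: if failover is triggered only by a missed heartbeat, a switch that still answers liveness probes but no longer forwards traffic is never replaced. All names (`Switch`, `SwitchState`, `active_switch_heartbeat_only`, `active_switch_traffic_check`) are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class SwitchState(Enum):
    HEALTHY = "healthy"
    PARTIAL_FAILURE = "partial_failure"  # assumed: answers heartbeats, drops traffic
    FAILED = "failed"


@dataclass
class Switch:
    name: str
    state: SwitchState = SwitchState.HEALTHY

    def heartbeat_ok(self) -> bool:
        # Assumption: only a fully failed switch stops answering liveness probes.
        return self.state is not SwitchState.FAILED

    def forwards_traffic(self) -> bool:
        # Only a healthy switch actually carries transactions.
        return self.state is SwitchState.HEALTHY


def active_switch_heartbeat_only(primary: Switch, backup: Switch) -> Switch:
    """Fail over only when the primary stops answering heartbeats."""
    return primary if primary.heartbeat_ok() else backup


def active_switch_traffic_check(primary: Switch, backup: Switch) -> Switch:
    """Fail over whenever the primary is not forwarding traffic."""
    return primary if primary.forwards_traffic() else backup


if __name__ == "__main__":
    primary = Switch("primary", SwitchState.PARTIAL_FAILURE)
    backup = Switch("backup")
    print(active_switch_heartbeat_only(primary, backup).name)
    print(active_switch_traffic_check(primary, backup).name)
```

In this toy model, the heartbeat-only policy keeps the partially failed primary in service, while a policy that checks whether traffic is actually being forwarded hands over to the backup. Real switch pairs use vendor-specific redundancy mechanisms, which the article does not detail.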

Failed transactions

In total, 51.2m Visa transactions were initiated during the outage, and 5.2m failed to go through.
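To put those figures in proportion, the short calculation below derives the share of transactions that failed. It uses only the two numbers Visa reported; the function name `failure_rate` is just an illustrative label.

```python
def failure_rate(initiated: float, failed: float) -> float:
    """Return the fraction of initiated transactions that failed."""
    if initiated <= 0:
        raise ValueError("initiated must be positive")
    return failed / initiated


if __name__ == "__main__":
    initiated_millions = 51.2  # transactions initiated during the outage
    failed_millions = 5.2      # transactions that failed to go through

    rate = failure_rate(initiated_millions, failed_millions)
    succeeded_millions = initiated_millions - failed_millions

    print(f"Failed: {rate:.1%} of transactions")
    print(f"Succeeded: {succeeded_millions:.1f}m transactions")
```

On those figures, roughly one in 10 transactions attempted during the outage (about 10.2%) failed, while around 46m went through.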

Since the outage was resolved, Visa said it has focused its efforts on preventing a repeat of the events of 1 June, but admitted it is still unclear why the offending switch failed when it did.

“We removed components of the switch that malfunctioned and replaced them with new components provided to us by the manufacturer,” the company said.

It is also working with its hardware manufacturer to conduct a “forensics analysis” of the faulty switch, Visa added, and undertaking a “rigorous” internal review of its processes.

“We are working internally to develop and install other new capabilities that would allow us to isolate and remove a failing component from the processing environment in a more automated and timely manner,” it said.

The company added that it is also “bringing in an independent third party to ensure we fully understand and embrace lessons to be learned from this incident.”
