
Business resilience needs comprehensive approach
The cyber attack on Marks & Spencer showed the vulnerability of even very established companies. But business applications resilience goes beyond cyber security basics
In Britain, describing Marks & Spencer (M&S) as a high-profile retailer is akin to describing King Charles III as a well-known monarch. Founded in the reign of Queen Victoria, M&S is one of a handful of FTSE-100 listed retailers and is unique in being known equally for its groceries and its clothing.
Nearly half of Britons shop there each year and even those who don’t will typically have an opinion on its quality, often with separate assessments of its groceries and its clothing. Rather than being dragged down by having more than 1,000 physical shops, it uses them to support its online sales by allowing customers to choose to pay online and then collect items from stores.
But M&S had to pause this on 22 April after being hit by a ransomware cyber attack. It suspended online orders and contactless payments in shops, reported that some customer data had been stolen and experienced gaps on its food shelves due to distribution problems.
In its full year results published on 21 May 2025, it said it expected a gross £300m hit to its profits as a result of the incident, including through having to revert to manual processes, although this figure could be reduced through insurance pay-outs and other actions.
While it restored contactless payments within days of the report, it took M&S until 9 June to restart online orders, albeit for a limited range of clothes.
“Thankfully, this tragedy has a redemptive arc,” said Times fashion editor Harriet Walker the next day.
Planning to fail
The retailer is one of several to experience a successful attack on its business software applications in recent months. While prevention remains the ideal, the impact on M&S shows the value of resilience when an attack gets through.
It is much better to think about such resilience before rather than after an attack. Consultancy BML has helped to review resilience after significant data breaches, with chief operating officer Jaco Vermeulen recalling one caused by human error and poor security controls.
“They went into a mode of ‘we need to do absolutely everything’, went overboard and became constrictive rather than enabling because they wanted to protect themselves in all forms and fashions,” he says. “You need to find a pragmatic balance.”
Most of Vermeulen’s work is linked to mergers and acquisitions such as due diligence assessments of technology risks, particularly as resulting organisations can have ‘mixed estate’ problems with a range of systems that are not adequately integrated.
“Everyone looks at the shiny,” he says, meaning the new systems that improve efficiency in one area. “What they never look at is foundations.” This includes organisational abilities to integrate systems; identity, access and privilege management; and centralised, replicated data management designed to work for all parts of the business.
“Focus on the boring but important things first,” he says. This is often neglected during an acquisition process, and a growing, and probably misguided, belief that artificial intelligence (AI) can solve such problems is likely to make this worse in the future.
Vermeulen says that secondary business continuity systems, designed to step in if primary systems are compromised, can be worthwhile but need to be weighed against the cost of a successful attack, adding: “Business continuity is directly tied to value preservation.”
He says he worked with a healthcare provider using a specialist piece of cancer care equipment worth around £10m and which was crucial in supporting patients’ care. The provider decided to pay for secondary systems and network redundancy costing hundreds of thousands a year, given the high costs of having the specialist equipment unavailable.
However, when helping a warehouse management and distribution company move from paper to digital, the consultancy advised it against paying a lot more for a secondary system. Instead, the company kept its paper system as a fallback, augmenting this by getting staff to take pictures of labels on mobile devices. When the new digital system was restored, they could fill in gaps with what was collected on paper, partly automated using barcode, QR code and optical character recognition of the pictures.
Penetration testing – getting security experts to find weaknesses – is a common way to test the resilience of technology including business applications.
“I don’t think I’ve seen a test ever where we don’t find holes,” says Alex Woodward, a senior vice-president for cyber security at consultancy CGI.
Common problems include poor security hygiene, including outdated security patching, excessive user permissions and poor management of assets – the last typically involving a small proportion of non-standard hardware and software applications which are not managed to the level of the majority.
Woodward says that organisations focus on critical and high-level vulnerabilities with core systems: “There is typically a backlog of low-end vulnerabilities, lows and mediums in the categorisation system, that are left untreated because they are perceived to be less important.”
Such weaknesses can provide ways in, with chronically underfunded local authorities particularly vulnerable given they run a lot of applications to support their numerous public functions.
These risks can be reduced by giving fewer people access to non-standard software, such as by requiring a reason to use it rather than allowing a free choice. This lets those who need to use a browser-specific extension to extract data from the enterprise resource management system do so, while reducing overall risk.
Having enough lifeboats
A good business applications resilience plan assumes failure at some point. “The philosophy these days really does need to be, ‘You’ll get got at some point, somebody is going to get in’,” says Woodward.
A specific crisis plan for complete loss of standard business applications, including email, online meetings and chat, could consist of printed instructions, including phone numbers for team members, and the use of “lifeboat systems”. For example, a Microsoft-dependent organisation could hold a small number of Google Workspace licences for those who will directly respond to an incident.
Woodward says that relying on third-party services such as WhatsApp is another option, but notes that they are outside the organisation’s control. While public communication is essential, he advises talking about likelihoods rather than making definitive statements such as “no customer data has been stolen” until the organisation is completely sure.
Davey McGlade, global head of cyber security at technology services company Version 1, adds that internal communications are also important. It can work well to have common chat channels for both management and incident response. Otherwise, communications are likely to take place only between individuals.
He adds that it is worth taking an engineering perspective – for example, testing that a secondary system can work at the scale of the production system it is meant to replace, as tests often take place only on a small scale. Similarly, users need to be comfortable using a secondary system, either because it works in the same way as the production one or because they have been specifically trained on using it. Established organisations can have an advantage in that they can fall back on old processes, whether digital or paper based.
“The challenge for a digital-facing business is you get budget for one application,” says McGlade, meaning such organisations may need to pay extra to build resilience through extra capacity or stand-by services. One option is to design systems for graceful degradation, where they suspend less important functions to keep critical ones going.
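Graceful degradation of the kind McGlade describes can be sketched in a few lines of Python. This is an illustration only: the feature names, priorities and capacity figures are invented for the example and do not come from any organisation in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    priority: int  # 1 = critical; higher numbers are shed first
    cost: int      # hypothetical capacity units the feature consumes


def features_to_keep(features, capacity):
    """Walk features in priority order, keeping each one that still fits.

    Features that do not fit in the remaining capacity are skipped,
    so the least important functions are the first to be suspended.
    """
    kept = []
    remaining = capacity
    for feature in sorted(features, key=lambda f: f.priority):
        if feature.cost <= remaining:
            kept.append(feature.name)
            remaining -= feature.cost
    return kept


# Hypothetical retail site: checkout and stock lookup are critical.
FEATURES = [
    Feature("checkout", 1, 40),
    Feature("stock_lookup", 1, 30),
    Feature("recommendations", 3, 30),
    Feature("wishlist", 2, 20),
]

normal = features_to_keep(FEATURES, 120)   # full capacity available
degraded = features_to_keep(FEATURES, 80)  # capacity cut during an incident
print(normal)
print(degraded)
```

With full capacity every feature runs. When capacity drops, the sketch keeps checkout and stock lookup and suspends the wishlist and recommendations.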
It makes sense to consider who could be brought in to respond to the incident. James Blake, vice-president of global cyber resiliency strategy at data security and management provider Cohesity, says a large hospital in North America, which was trialling its systems, was hit by an attack that compromised and encrypted its main data store.
It did not have a large in-house technology staff but did hold cyber security insurance, and its insurer sent in a response team when contacted. But the team repeatedly restored systems from the hospital’s main back-up service, only for them to be reinfected within a few minutes, leading the hospital to believe the team was seeking evidence to invalidate the insurance policy rather than perform a recovery.
“If you are using a third-party incident responder, who are they working for?” asks Blake.
The hospital then paid for its own incident response team which as well as recovering systems also investigated and remediated. It used the backup provided by Cohesity, which at that point covered some critical data rather than everything, to find the root cause by looking at when as well as what happened.
It found that the original attack involved adding a group policy object (GPO) to the hospital’s Active Directory that would push malware to devices in the same way it deployed new versions of applications, a “living off the land” attack. This led the organisation to improve its preparedness, including planning who it would bring in when it suffered another attack, as well as better monitoring of “east-west” lateral communications between devices on its own network (as opposed to “north-south” traffic to and from the internet).
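The distinction between east-west and north-south traffic can be made concrete with a short Python sketch. It is not a description of the hospital’s tooling: the internal address range, the sample flows and the fan-out threshold are all assumptions chosen for illustration, and real monitoring would draw on far richer data.

```python
import ipaddress
from collections import defaultdict

# Hypothetical internal address space; a real network would list its own ranges.
INTERNAL_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]


def is_internal(addr):
    """True if the address falls inside one of the internal networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in INTERNAL_NETWORKS)


def classify_flow(src, dst):
    """Label a flow by whether it stays inside the network or crosses its edge."""
    src_in, dst_in = is_internal(src), is_internal(dst)
    if src_in and dst_in:
        return "east-west"
    if src_in or dst_in:
        return "north-south"
    return "external"


def lateral_fanout(flows, threshold):
    """Return internal hosts that contacted at least `threshold` internal peers."""
    peers = defaultdict(set)
    for src, dst in flows:
        if classify_flow(src, dst) == "east-west":
            peers[src].add(dst)
    return sorted(src for src, dsts in peers.items() if len(dsts) >= threshold)


# Invented sample flows as (source, destination) pairs.
flows = [
    ("10.0.0.5", "10.0.0.6"),
    ("10.0.0.5", "10.0.0.7"),
    ("10.0.0.5", "10.0.0.8"),
    ("10.0.0.9", "8.8.8.8"),
    ("10.0.0.9", "10.0.0.6"),
]
suspects = lateral_fanout(flows, threshold=3)
print(suspects)
```

A host that suddenly talks to many internal peers is not proof of compromise, but it is the sort of signal that east-west monitoring is meant to surface.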
Blake says that relying on business continuity and disaster recovery alone, rather than investigating and solving the problem, is all too common. In some cases, this is because the IT department doesn’t report a ransomware attack to the chief information security officer (CISO), who therefore doesn’t trigger a response.
“The CISO is holding the toilet chain because it’s a cyber incident but all the plumbing for recovery is provided by IT,” he says.
Building back stronger
In some cases, organisations will strengthen their resilience thoughtfully after an attack. This is certainly how Marks & Spencer is presenting its approach, with the cyber incident section of its May trading statement saying: “We are seeking to make the most of the opportunity to accelerate the pace of improvement of our technology transformation and have found new and innovative ways of working.”
Daryl Flack, a partner at managed security service provider Avella, says it worked with a large NHS provider on rebuilding after a ransomware attack, adding: “They had to rebuild everything from scratch.”
Although it made interim arrangements, the healthcare provider worked for a year with Avella on moving many systems to a cloud environment, replacing on-premise systems that were attacked. Flack says it has taken time and effort to move many specialist and, in some cases, ageing applications to cloud hosting in a secure fashion, but the move should significantly increase the provider’s resilience. It now segments different applications and devices.
“If they were to get compromised, the limit of that compromise would be that device,” he says. “It’s microsegmentation – it means that the blast radius of an infection stops at a single host, it doesn’t get spread throughout your own environment.”
The ransomware attack that led to the work succeeded through the attackers finding a weak entry point and then moving laterally in the organisation’s systems to gain more privileges – something the segmented cloud-first approach aims to make much harder. Flack adds that the provider has also improved its security monitoring, back-up services and planning for future attacks.
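The effect of segmentation on lateral movement can be illustrated with a toy model in Python. The host names and connection rules below are hypothetical, and the sketch treats every permitted connection as a route an attacker could use, which is a deliberate simplification.

```python
from collections import deque


def blast_radius(allowed, start):
    """Return every host reachable from `start` over permitted connections.

    `allowed` maps each host to the hosts it may connect to. The search is
    breadth-first, mimicking an attacker hopping from one device to the next.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        host = queue.popleft()
        for peer in allowed.get(host, ()):
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    return seen


# Hypothetical hosts in a healthcare provider's environment.
HOSTS = ["reception-pc", "records-db", "imaging", "pharmacy"]

# Flat network: every host may reach every other host.
flat = {h: [p for p in HOSTS if p != h] for h in HOSTS}

# Microsegmented network: no host-to-host connections are permitted.
segmented = {h: [] for h in HOSTS}

print(sorted(blast_radius(flat, "reception-pc")))
print(sorted(blast_radius(segmented, "reception-pc")))
```

In the flat model a compromise of one device reaches all four hosts. In the segmented model it stops at the device where it started, which is the outcome Flack describes.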
Like his fellow experts, Flack says that organisations need to assume that cyber attackers will get in and plan accordingly. “You can’t always defeat the bad guys,” he says. “You have just got to make sure you can bounce back quickly if something bad happens.”