Surviving IT failures: A Computer Weekly Downtime Upload podcast

We speak to Junade Ali, author of How to Protect Yourself from Killer Computers

Computer expert Junade Ali’s new book, How to Protect Yourself from Killer Computers, explores why IT systems fail and why there is a culture of denial among those responsible, both on the engineering side and at higher levels of management.

He says: “In the topics and case studies covered in the book, there are examples of things like the Post Office Horizon scandal which have led to catastrophic outcomes.” 

The book’s title is evocative, and conjures up Hollywood-esque Terminator and Hal 9000 scenarios. But in the real world, Ali says there have been many instances where computers have been blamed for causing someone’s death. “We see examples of patients being given fatal radiation doses because of software and an aircraft entering a death dive because of a rogue automation, which calculated [incorrectly] that the human pilot was going to crash the plane.”

The archetypal software engineering problem leading to a catastrophic computer failure often starts with a failure in the requirements engineering process. Ali says that when this early-stage concern is not addressed, it snowballs through the development, testing and roll-out phases. The issues are then covered up, and people on the team may be afraid to speak out.

The book looks at the phenomenon of escalation of commitment, which, for Ali, is the grey area of ethics that can occur in software development. “Over time, people get pushed into doing things which are more and more extreme.” He points to the recent example of the director of engineering at cryptocurrency exchange FTX, who knew about the fraudulent activities of his former boss, Sam Bankman-Fried.

“We have an intrinsic bias towards wanting to protect ourselves against loss, which can often put us in positions where we'll do more and more ethically questionable things to cover up something,” he adds.

Ali has researched whistleblowing in software development, where team members feel unable to voice their concerns. He says: “This ties in with an idea called psychological safety, where people don't feel they have the ability to speak up and raise the alarm when things are going wrong.”

The research, conducted with Survation, found that 75% of software engineers who reported wrongdoing at work faced retaliation as a result. “This really is an industry-wide challenge,” Ali warns.

Another problem area in software development that Ali identifies in the book is normalcy bias. He says: “Sometimes people will behave as if they aren't in a crisis at all.” 

Instead of considering a computer system as pure technology to tackle a particular task, Ali believes software engineers should consider it as a sociotechnical system. “You need to consider the human as part of the system,” he says. “Humans play a critical role in making sure that the system is resilient. Humans need to be part of the safety net and they should be given psychological safety to be able to speak up and raise the alarm when they see issues coming up.”
