Global Risks – Is Software The Vlieg In De Soep*?

Friday, 18 September 2020
By Patricia Lustig & Gill Ringland

* Fly in the soup - a translation of a Dutch phrase meaning the fly in the ointment

Both of us – the co-authors – spent several years designing, coding and testing software. We have a visceral understanding of what can go wrong, even before the human element – the user – is added to the mix. And then we have the nightmare of finding that the data the system relied on was not available, faulty, or incomplete.

Any system depends on having an algorithm which is appropriate and complete; an accurate implementation; and reliable data.

An example is a system for making soup. The algorithm is the recipe: hopefully it has been tested. The implementation could be by an experienced cook or a robot (though if a robot is to make the soup, a much more detailed set of instructions is needed). The type of soup being made will depend on the ingredients – and how it tastes depends on the type and quality of the ingredients.

Software is the same. An algorithm is designed to fulfil the system’s purpose. The algorithm may be imperfect because it fails to cover some of the conditions or does not reflect the real world. It is executed in code, which may be an imperfect implementation. The outputs are based on the data used by the software. Problems with the quality of data are so widespread that the term “GIGO” – garbage in, garbage out – is well known.
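The point about data can be shown in a few lines of Python. This is only an illustrative sketch: the function name and the temperature readings are invented for the example and are not taken from any real system. The algorithm (an average) is appropriate and the implementation is correct, yet one faulty reading is enough to make the output misleading.

```python
from statistics import mean


def average_temperature(readings):
    """Algorithm: the arithmetic mean of the readings.

    The algorithm is appropriate and the implementation is correct;
    the only thing that varies below is the quality of the data.
    """
    return mean(readings)


# Reliable data: five plausible readings in degrees Celsius (made-up values).
good_data = [36.5, 36.7, 36.6, 36.8, 36.4]

# Faulty data: the same readings, but a sensor fault was recorded as 0.0.
bad_data = [36.5, 36.7, 0.0, 36.8, 36.4]

good_result = average_temperature(good_data)
bad_result = average_temperature(bad_data)

print(f"reliable data: {good_result:.2f}")
print(f"faulty data:   {bad_result:.2f}")
```

Nothing in the code signals that the second answer is wrong. Only someone who knows what a plausible result looks like would notice the garbage coming out.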

Can this illuminate the utility of a software model to influence government policy during early stages of the Covid-19 pandemic? The algorithms were based on assumptions about the rate and type of infection, and the threat to human life from the virus. These turned out to be false. The implementation in code was suspect according to a number of observers. The available data input was incomplete and misleading: during early stages of the pandemic, those with suspected symptoms were told not to call their GPs or the NHS helpline unless they needed an ambulance. As a result, for instance, the official statistics showed 3 cases in West Berkshire while one of us was personally aware of 8 people with all the symptoms. Hence, the government scientists and politicians were depending on misleading projections which were based on inadequate algorithms, code implementation and data.
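A rough sketch shows how an undercount at the input carries through to a projection. The only figures taken from the text are the 3 official cases and the 8 people known to have symptoms. The doubling time, the time horizon and the simple doubling formula are assumptions chosen purely for illustration; this is not the model that was used to advise government.

```python
def project_cases(initial_cases, doubling_time_days, horizon_days):
    """Project case numbers assuming simple exponential doubling."""
    return initial_cases * 2 ** (horizon_days / doubling_time_days)


# Hypothetical parameters, for illustration only.
DOUBLING_TIME_DAYS = 3
HORIZON_DAYS = 15

# 3 = official figure quoted in the text; 8 = people known to have symptoms.
from_official_count = project_cases(3, DOUBLING_TIME_DAYS, HORIZON_DAYS)
from_observed_count = project_cases(8, DOUBLING_TIME_DAYS, HORIZON_DAYS)

print(f"projection from official count: {from_official_count:.0f}")
print(f"projection from observed count: {from_observed_count:.0f}")
print(f"shortfall: {from_observed_count - from_official_count:.0f}")
```

Under these assumptions a gap of 5 cases in the input becomes a gap of 160 in the projection, even though the formula and the code are the same in both runs.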

As our dependence on IT increases – accelerated by Covid-19 – so does the number of stories of software failures harming people and costing lives. For instance:

  • Hundreds of Post Office workers were recently ‘vindicated’ by a High Court ruling over a faulty IT system that left them bankrupt and in prison. ‘Bugs, errors and defects’ caused shortfalls in branch accounts, concluded the judge as he approved a £58m settlement between the Post Office and more than 550 claimants at a hearing in London. This problem was only found because some of those affected were able to bring documentary evidence to prove the sources of the errors. They had kept personal hard copy files.
  • The Boeing 737 MAX was developed with new engines. Unfortunately, the new engines changed the air flow and made the aircraft unstable. But hardware changes to fix the airflow require more extensive scrutiny from the FAA than software changes. So Boeing loaded a routine called the Maneuvering Characteristics Augmentation System (MCAS). It would run in the background, waiting for the airplane to enter a high-angle climb. Then it would act, rotating the airplane’s horizontal stabilizer to counteract the changing aerodynamic forces. MCAS was approved with minimal review. But Boeing’s software had a serious problem. Under certain circumstances, it activated erroneously, sending the airplane into an infinite loop of nose-dives. Unless the pilots could – in under four seconds – correctly diagnose the error, throw a specific emergency switch, and start recovery manoeuvres, they would lose control of the airplane and crash. This is exactly what happened in the case of Lion Air Flight 610 and Ethiopian Airlines Flight 302. This problem was only recognised because of the similarities of the two crashes.
  • CareFusion is a medical equipment manufacturer that has experienced several emergency recalls in recent years. In 2015, CareFusion’s Alaris Pump was recalled over a software error that caused the pump, designed to automatically deliver medicine and fluids to hospital patients, to delay an infusion. The consequences, which can range anywhere from medicine being withheld at critical points to accidental over-dosing, can be deadly. Just four days later CareFusion issued a Class I recall over a separate line of ventilators, citing a software flaw that could cause the patient to suffocate. These problems were only found because there were medical staff on hand to recognise malfunctions.

These and countless other examples from airlines, banks, Facebook, and across the commercial and government spectrum, suggest that software engineering standards for development (design and coding) and testing are widely ignored (where they even exist). Software is a new discipline – as the British Computer Society describes, published standards cover parts of the software life cycle, but not all.

It could be argued that lack of application of software engineering standards will be remedied over time through market forces, such as higher insurance rates for organisations with a bad track record of software failure. Failures are often what drives innovation. Successful innovation builds on an understanding of both how to improve the design and implementation and what causes failure.

In the cases above, people were able to recognise and remedy the errors caused – albeit after deaths or financial loss. The next generation of software systems is likely to outstrip the ability of humans to check the results in detail. Many AI-assisted systems are currently subject to human verification – for instance, the auto-correct feature on text messages, teaching assistants, customer service robots and autonomous vehicles.

However, AI systems are increasingly being used in circumstances where there is no human ability to check the logic or to query the outcomes. For instance,

  • Evaluation of CVs as part of a recruitment process
  • Criteria for getting credit or a mortgage
  • Court sentences.

In these cases, the built-in weaknesses of the software cannot be checked*. We described above three examples of errors of the type which persist in many released software systems: errors of algorithm, implementation and data. These must also be expected in AI systems.

Many experts are also concerned that society’s growing reliance on algorithms (many of which we only vaguely understand) is problematic. One of the fundamental challenges of machine learning is that the models depend on data supplied by humans. Unfortunately, this data is likely to have been selected according to biases. It is not just the data. Algorithms are developed and implemented by people, and people have in-built biases. For example, we need to think about how and where the machine learning algorithms used today in healthcare, education, and criminal justice are making biased judgements, and doing so without a mechanism for querying the outcomes.

One of us has been reading Robert Harris’ The Second Sleep, about an England several centuries after the collapse of our current society. An ancient artifact is discovered, written ten years before the collapse, identifying “six possible scenarios that fundamentally threaten the existence of our advanced science-based way of life”: these are climate change; nuclear exchange; super-volcano eruption, leading to rapidly accelerated climate change; asteroid strike, also causing accelerated climate change; general failure of computer technology due to either cyber warfare, an uncontrollable virus, or solar activity; pandemic resistant to antibiotics.

The list is reasonably prescient, although the recent pandemic has focused attention on virus-led pandemics, rather than antibiotic resistance. But we think that – right up there – we need to include the risk of software malfunction.

Summary

So, yes, we do think that there is a fly in the soup. In fact, it is a huge and dangerous insect. We think that software is a problem flying just under the radar, ready to fall into the soup, leaving devastation in its wake. It could crash our planet. As we continue to depend on systems that are faulty, that we do not “understand”, or that are based on data or assumptions that are incomplete or faulty, the danger increases.

Patricia Lustig and Gill Ringland, Fellow, British Computer Society, September 2020

www.global-megatrends.com

* This issue is discussed in depth in Bob McDowell's contribution earlier this year:

https://www.longfinance.net/news/pamphleteers/explaining-inexplicable-explaining-decisions-using-artificial-intelligence-machine-learning/