3 reasons why you should care about Causal AI
As the world increasingly turns to AI to solve major problems, finding approaches that provide the WHY and not just the WHAT will be critical for decision-making and responsible AI.
‘Predictive AI’ is the current state of AI that we have all become familiar with. It is the approach taken by modern machine learning (ML) models: consuming copious amounts of data, learning the patterns within them, and predicting the next pattern. The launch of predictive AI onto the world stage through GenAI tools has understandably created immense excitement about where AI could be applied to solve pressing issues.
As the world increasingly turns to AI to solve major problems, it would seem sensible to examine the most appropriate applications and use cases of these AI approaches (see table below).
To date, there is evidence that enterprises are reaping benefits from applying predictive AI in areas such as voice assistants and chatbots, predictive maintenance and optimisation of systems and equipment, and customer service operations and personalisation (ref).
But… yes, you knew one was coming…
…as the excitement settles, an obvious question arises from these predictions, which deliver an output of ‘what’ will likely occur. That obvious question is ‘why’. Why are these outcomes expected to happen? Surely knowing this would place more trust in the prediction and inform the right course of decision-making.
Image from Freepik (ref)
Along comes ‘Causal AI’, also referred to as deterministic AI. In contrast to prediction (pattern recognition), which only uses data, causal inference is based on counterfactual prediction and uses a combination of data and causal models of the world (ref).
According to the 2022 Gartner Hype Cycle (ref) it will take 5 to 10 years for causal AI to reach mainstream adoption. The business benefits of causal AI are expected to be high and result in increased revenue and/or cost savings.
Causal AI will:
Bridge the gap between prediction and decision-making
Avoid potential harms of applying only predictive AI
Enable better AI governance, regulation and policy
Ongoing discovery and work on causal AI has significant implications for furthering AI’s ability to create value and return on investment. Predictions are far more powerful when they lead to action: to accurate, robust decision-making that preserves public trust and brand reputation and, importantly, avoids undue harm to individuals and to society.
Causal AI bridges the gap between prediction and decision-making
Decision-making can have deep downstream consequences. One needs to understand not only why a system makes a decision, but also the effects of that decision. This understanding allows decision-making to be improved and better outcomes to be achieved (ref).
The impetus for causal AI arose from the need to augment human decisions on tasks where there is a need to understand the actual causes behind an outcome.
Causal AI is an emerging area that essentially works in two steps. First, it collects data and discovers the cause-and-effect relationships within it. Then, it uses those relationships to inform the output of AI models, explaining outcomes with a causal model devised from the collected data (ref).
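These two steps can be sketched in miniature. The toy simulation below (all variable names and coefficients are invented for illustration) assumes the discovery step has already produced the causal model: a confounder Z drives both X and Y, and X has a true causal effect of 2.0 on Y. The second step then estimates that effect from data, adjusting for Z rather than naively regressing Y on X:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Assumed causal model (output of the discovery step):
# Z -> X, Z -> Y, and X -> Y with a true causal effect of 2.0.
z = rng.normal(size=n)
x = 1.5 * z + rng.normal(size=n)
y = 2.0 * x + 3.0 * z + rng.normal(size=n)

# A naive regression of Y on X alone conflates the confounder Z,
# inflating the estimated effect well above 2.0.
naive = np.polyfit(x, y, 1)[0]

# Step two: adjust for Z (multiple regression including the confounder),
# which recovers the true causal effect of roughly 2.0.
X = np.column_stack([x, z, np.ones(n)])
adjusted = np.linalg.lstsq(X, y, rcond=None)[0][0]

print(round(naive, 2), round(adjusted, 2))
```

The point of the sketch is that the same data yield very different answers depending on whether the causal structure is taken into account: the estimation step is only as good as the causal model supplied to it.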
Essential to understanding causal AI is causal inference, so what exactly is causal inference? Causal inference is “the process of determining the actual and independent effect of a particular event that is part of a larger system” (ref).
The foundations of Causal AI were established in the area of causal inference - an interdisciplinary field with contributions from computer science, econometrics, epidemiology, philosophy, statistics, and other disciplines (ref).
Important plug-in here: as a former epidemiologist who spent the first decade of my career immersed in causal inference studies and statistics, I could not agree more with Cassie Kozyrkov’s comments below and her reference to Alfred Spector’s paper entitled “Gaining Benefit from Artificial Intelligence and Data Science: A Three-Part Framework” (ref).
“The field of AI has traditionally been among the most standoffish to folks from non-STEM* backgrounds (and often even standoffish to folks without the classical AI training). This is a major problem, because as the nuts and bolts of getting an AI system up and running become easier as a natural consequence of better tools, we will find that the STEM bits are the easiest part. We'll see that the true beating heart of an AI system is the WHY, not the HOW. We'll find that the "right" solution depends on WHO the system is designed to serve - in both the individual sense and in the societal sense - and we'll see the AI and data science disciplines begging the humanities and social scientists to take their seat at the table” (ref).
Causal inference differs from predictive AI, which uses the machine learning process to solve a problem (rather than to test a hypothesis) and which results in a “product” that “does something” (rather than a “finding” that “says something”) (ref).
Image taken from an excellent article by Gaurav Shekhar
Causal AI is gaining widespread recognition for its ability to provide more accurate insights and decision-making capabilities. While still an emerging field of AI, at the 2023 World Economic Forum, examples of where causal AI is being applied were given in health, finance and retail (ref).
Causal AI avoids potential harms of applying only predictive AI
In discussing causal AI there are two important concepts to understand: correlation and the counterfactual.
Correlation is important because, regardless of how sophisticated predictive algorithms are, they risk falling into the trap of equating correlation with causation: in other words, of assuming that because event X precedes event Y, X must be the cause of Y.
Correlation is a relationship between two variables, whereas causation is where one variable causes an outcome. The difference between correlation and causation is depicted in the diagram below.
Image taken from Simply Psychology article (ref)
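The trap can be made concrete with a small simulation (a minimal sketch; the variables and numbers are invented for the example). A hidden confounder Z drives both X and Y, so X and Y are strongly correlated even though X has no causal effect on Y. Intervening on X, i.e. setting it by fiat rather than letting Z determine it, makes the apparent relationship vanish:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hidden confounder Z drives both X and Y; X has NO causal effect on Y.
z = rng.normal(size=n)
x = z + rng.normal(scale=0.1, size=n)
y = z + rng.normal(scale=0.1, size=n)

# Observationally, X and Y are strongly correlated...
print(np.corrcoef(x, y)[0, 1])  # close to 1

# ...but under an intervention on X (X no longer depends on Z),
# the correlation with Y disappears:
x_do = rng.normal(size=n)
y_do = z + rng.normal(scale=0.1, size=n)
print(np.corrcoef(x_do, y_do)[0, 1])  # close to 0
```

A purely predictive model trained on the observational data would happily use X to forecast Y, yet any decision that manipulates X based on that pattern would fail, which is exactly the distinction the diagram above illustrates.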
For epidemiologists, causal inference is central and fundamental to the discipline’s goal of identifying the causes of disease (both modifiable and nonmodifiable) so that the disease or its consequences might be prevented (ref). Epidemiologists go to great lengths to design studies that can examine the relationships within a complex web of variables (e.g. age, sex, ethnicity, diet), to measure those relationships using statistics, and to define variables as predictors, outcomes, mediators, moderators, confounders, and so on. The gold standard for establishing causation over correlation in epidemiology is the randomised controlled trial. However, the rigour of randomised controlled trials is available only for a limited range of questions.
I’ll give you an example to highlight the point, taken from a fantastic article in the Stanford Social Innovation Review written by Charles et al., titled “The Case for Causal AI” (ref):
“In 2016, 2.3 million American adults, or one in 111, were in prison, housed at great cost to federal and state governments. Courts throughout the United States have introduced “recidivism scores” in an attempt to lower incarceration costs by reducing the number of inmates without increasing crime. The recidivism score is a single number reached through a predictive algorithm that estimates the likelihood that a person convicted of a crime will reoffend. In theory, the score makes it possible for a judge to focus on incarcerating those more likely to commit additional crimes, and it should even help to remove potential bias in sentencing. But recidivism scores are inherently faulty because they are based on risk-assessment tools that pick up statistical correlations rather than causations. For example, low income is correlated with crime, but that does not mean it causes crime. Yet people from low-income households may automatically be assigned a high recidivism score, and as a result they are more likely to be sentenced to prison. Fixing the criminal justice system requires a focus on understanding the causes of crime, not merely its correlates.”
The second concept is the counterfactual, which is important for the explainability of AI and is discussed in the following section.
Counterfactual problems are concerned with “how a model’s inputs would need to change in order to yield an output of a specific kind” (ref).
Using causal AI algorithms, one can ask what-if questions. Consider implementing a training program to improve teacher performance: the counterfactual question could be ‘by how much should we expect student math test scores to improve?’ Simulating scenarios to evaluate and compare the potential effect of an intervention (or group of interventions) on an outcome avoids the time and expense of lengthy trials (ref).
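A toy sketch of such a what-if simulation follows; the structural model, the score scale, and the hypothetical 5-point training effect are entirely invented for illustration. The same simulated students are generated under the factual world (no training) and the counterfactual world (training switched on), and the two worlds are compared:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 50_000

# Invented structural model: test scores depend on prior ability,
# and the (hypothetical) training program adds 5 points on average.
ability = rng.normal(70, 10, size=n)
effect_of_training = 5.0

# Factual world: no training program.
scores_without = ability + rng.normal(0, 3, size=n)

# Counterfactual world: the SAME students, with training switched on.
scores_with = ability + effect_of_training + rng.normal(0, 3, size=n)

# Simulated answer to "by how much should scores improve?"
ate = scores_with.mean() - scores_without.mean()
print(round(ate, 1))  # ≈ 5.0
```

Because the effect is baked into the model here, the simulation simply recovers it; in practice the causal model and effect sizes would themselves be estimated from data, and the value of the approach lies in comparing many candidate interventions cheaply before committing to one.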
Causal AI enables better AI governance, regulation and policy
Applications of predictive AI are being rolled out as decision-making tools in areas such as medical diagnosis, loan decisions and judgements about criminal recidivism. A long-standing issue with these models is that they are opaque: no one understands why the models return the outputs that they do. The opacity of a predictive model makes it difficult to tell whether its performance will be replicated in a novel situation and undermines external validity. “Of critical importance, in cases where the model is used to inform decisions, and those decisions are subject to ethical scrutiny, opacity creates a gap in the decision-maker’s justification for the decision” (ref).
Opacity, also known as the ‘black box’, of predictive AI models has led to a long-standing discussion around how to provide “explainable AI (XAI) - the ability to provide people with an understanding of why machine learning models yield specific outputs” (ref).
Causal AI is inherently explainable because the approach requires human-guided model construction. The design of causal AI models embodies the oft-stated goal of AI taking a human-based approach, or keeping humans in the loop. Causal AI models can be interrogated for explanations of why a particular output was reached, and can readily be assessed for fairness and bias. The ability to guide what goes into the ‘black box’ provides “accountability, governance, contestability and redress” (ref).
Summary
Predictive models can provide powerful and often accurate information, which holds great value for forecasting certain events such as equipment failure or a customer’s potential to churn. In contrast, causal AI helps to identify the underlying web of causes of a behaviour or event, which may lead to effective interventions and, consequently, positive outcomes. Moreover, causal AI doesn’t operate within a black box, allowing the model’s reasoning to be checked and reducing the risk of biases and potential harms (ref).
Thank you for reading!