|
1 | | -# Introduction {#intro} |
| 1 | +# Introduction {#sec-intro} |
2 | 2 |
|
3 | 3 | Recent advances in Artificial Intelligence (AI) have propelled its adoption in scientific domains outside of Computer Science including Healthcare, Bioinformatics, Genetics and the Social Sciences. While this has in many cases brought benefits in terms of efficiency, state-of-the-art models like Deep Neural Networks (DNN) have also given rise to a new type of principal-agent problem in the context of data-driven decision-making. It involves a group of **principals** - i.e. human stakeholders - that fail to understand the behaviour of their **agent** - i.e. the model used for automated decision-making [@borch2022machine]. |
4 | 4 |
|
5 | 5 | Models or algorithms that fall into this category are typically referred to as **black-box** models. Despite their shortcomings, black-box models have grown in popularity in recent years and have at times created undesirable societal outcomes [@o2016weapons]. The scientific community has tackled this issue from two different angles: while some have appealed for a strict focus on inherently interpretable models [@rudin2019stop], others have investigated different ways to explain the behaviour of black-box models. These two sub-domains can be broadly referred to as **interpretable AI** and **explainable AI** (XAI), respectively. |
6 | 6 |
|
7 | | -Among the approaches to XAI that have recently grown in popularity are **Counterfactual Explanations** (CE). They explain how inputs into a model need to change for it to produce different outputs. Counterfactual Explanations that involve realistic and actionable changes can be used for the purpose of **Algorithmic Recourse** (AR) to help individuals who face adverse outcomes. An example relevant to the Social Sciences is consumer credit: in this context AR can be used to guide individuals in improving their creditworthiness, should they have previously been denied access to credit based on an automated decision-making system. |
| 7 | +Among the approaches to XAI that have recently grown in popularity are **Counterfactual Explanations** (CE). They explain how inputs into a model need to change for it to produce different outputs. Counterfactual Explanations that involve realistic and actionable changes can be used for the purpose of **Algorithmic Recourse** (AR) to help individuals who face adverse outcomes. An example relevant to the Social Sciences is consumer credit: in this context AR can be used to guide individuals in improving their creditworthiness, should they have previously been denied access to credit based on an automated decision-making system. |
8 | 8 |
|
9 | | -Existing work on CE and AR has largely been limited to the static setting: given some classifier $M: \mathcal{X} \mapsto \mathcal{Y}$ we are interested in finding close [@wachter2017counterfactual], actionable [@ustun2019actionable], realistic [@joshi2019towards, @schut2021generating], sparse, diverse [@mothilal2020explaining] and ideally causally founded counterfactual explanations [@karimi2021algorithmic] for some individual $x$. The ability of counterfactual explanations to handle dynamics like data and model shifts remains a largely unexplored research challenge at this point [@verma2020counterfactual]. Only one recent work considers the implications of **exogenous** domain and model shifts [@upadhyay2021towards]. The authors propose a simple minimax objective, that minimizes the counterfactual loss function for a maximal model shift. They show that their approach yields more robust counterfactuals in this context than existing approaches. |
| 9 | +Existing work in this field has largely focused on a static setting: various approaches have been proposed to generate counterfactuals for a given individual who is subject to some pre-trained model. More recent work has compared different approaches within this static setting [@pawelczyk2021carla]. In this work we go one step further and ask ourselves: what happens if recourse is provided and implemented repeatedly? What types of dynamics are introduced and how do different counterfactual generators compare in this context? |
10 | 10 |
|
11 | | -This project investigates **endogenous** domain and model shifts, that is shifts that occur when AR is actually implemented by a proportion of individuals and the classifier is updated in response. @fig-dynamics illustrates this idea for a binary problem involving a probabilistic classifier and the counterfactual generator proposed by @wachter2017counterfactual: the implementation of AR for a subset of individuals leads to a domain shift (b), which in turn triggers a model shift (c). As this game of implementing AR and updating the classifier is repeated, the decision boundary moves away from training samples that were originally in the target class (d). |
| 11 | +@fig-dynamics illustrates this idea for a binary problem involving a probabilistic classifier and the counterfactual generator proposed by @wachter2017counterfactual: the implementation of AR for a subset of individuals leads to a domain shift (b), which in turn triggers a model shift (c). As this game of implementing AR and updating the classifier is repeated, the decision boundary moves away from training samples that were originally in the target class (d). We refer to these types of dynamics as **endogenous** because they are induced by the implementation of recourse itself. |
12 | 12 |
|
13 | | -These dynamics may be problematic. As the decision boundary moves in the direction of the non-target class, counterfactual paths become shorter: in the consumer credit example, individuals that previously would have been denied credit based on their input features are suddenly considered as creditworthy. Average default risk across all borrowers can therefore be expected to increase. Conversely, lenders that anticipate such dynamics may choose to deny credit to individuals that have implemented AR, thereby compromising the validity of AR. |
| 13 | +{#fig-dynamics fig.pos="h" width=45%} |
14 | 14 |
|
15 | | -To the best of our knowledge this is the first work investigating endogenous dynamics in AR. Our contributions to the state of knowledge are two-fold. Firstly, we introduce an experimental framework built on top of CARLA, which allows for benchmarking different counterfactual generators in terms of the endogenous domain and model shifts they introduce. To this end we propose a number of novel evaluation metrix. Secondly, we use this framework to provide the first in-depth analysis of the dynamics of recourse induced by various diverse approaches to CE. In particular we focus on the following methodologies: introduced in @wachter2017counterfactual, @joshi2019towards, @mothilal2020explaining and @antoran2020getting. |
| 15 | +We find reasons to believe that these types of endogenous dynamics may be problematic. In @fig-dynamics, as the decision boundary moves in the direction of the non-target class, counterfactual paths become shorter: in the consumer credit example, individuals who would previously have been denied credit based on their input features are suddenly considered creditworthy. Average default risk across all borrowers can therefore be expected to increase. Conversely, lenders that anticipate such dynamics may choose to deny credit to individuals who have implemented AR, thereby compromising the validity of AR. |
| 16 | + |
| 17 | +To the best of our knowledge this is the first work investigating endogenous dynamics in AR. Our contributions to the state of knowledge are two-fold. Firstly, we introduce an experimental framework extending previous work by @pawelczyk2021carla, which allows for benchmarking different counterfactual generators in terms of the endogenous domain and model shifts they introduce. To this end we propose a number of novel evaluation metrics. Secondly, we use this framework to provide the first in-depth analysis of endogenous recourse dynamics induced by various popular counterfactual generators including @wachter2017counterfactual, @joshi2019towards, @mothilal2020explaining and @antoran2020getting. |
| 18 | + |
| 19 | +The paper is structured as follows: @sec-related ... . |
16 | 20 |
|
17 | | -{#fig-dynamics fig.pos="h" width=45%} |
|