Commit 9ebef92

setting up to have separate paper branch that will serve as static artifact of code at time of publication to ensure replicability
1 parent f908f7d commit 9ebef92

11 files changed

Lines changed: 146 additions & 98 deletions

CITATION.bib

Lines changed: 0 additions & 8 deletions
This file was deleted.

Project.toml

Lines changed: 0 additions & 9 deletions
@@ -4,38 +4,29 @@ authors = ["Patrick Altmeyer"]
 version = "0.1.0"
 
 [deps]
-BSON = "fbb218c0-5317-5bc6-957e-2ee96dd4b1f0"
 CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
-CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"
 CounterfactualExplanations = "2f13d31b-18db-44c1-bc43-ebaf2cff0be0"
 DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
 Distances = "b4f34e82-e78d-54a5-968a-f98e89d6e8f7"
-Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
 FileIO = "5789e2e9-d7fb-5bc7-8068-2c6fae9b9549"
 Flux = "587475ba-b771-5e3f-ad9e-33799f191a9c"
-Gadfly = "c91e804a-d5a3-530f-b6f0-dfbca275c004"
 Images = "916415d5-f1e6-5110-898d-aaa5f9f070e0"
 KernelFunctions = "ec8451be-7e33-11e9-00cf-bbf324bd1392"
 LaplaceRedux = "c52c1a26-f7c5-402b-80be-ba1e638ad478"
 LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
 Logging = "56ddb016-857b-54e1-b83d-db4d58db5568"
-MLDataUtils = "cc2ba9b6-d476-5e6d-8eaf-a92d5412d41d"
-MLDatasets = "eb30cadb-4394-5ae3-aed4-317e484a6458"
 MLJ = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7"
 MLUtils = "f1d291b0-491e-4a28-83b9-f70985020b54"
-NNlib = "872c559c-99b0-510c-b3b7-b6c96a88d5cd"
 Parameters = "d96e819e-fc66-5662-9728-84c9c7592b0a"
 Pkg = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
 PlotThemes = "ccf2f8ad-2431-5c83-bf29-c5338b663b6a"
 Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
 ProgressMeter = "92933f4c-e287-5a05-a399-4b506db050ca"
 RCall = "6f49c342-dc21-5d91-9882-a32aef131414"
-RDatasets = "ce6b1742-4840-55fa-b093-852dadbb1d8b"
 Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
 Serialization = "9e88b42a-f829-5b0c-bbe9-9e923198166b"
 Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
 StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
-Turing = "fce5fe82-541a-59a6-adf8-730c64b5f9a0"
 Zygote = "e88e6eb3-aa80-5325-afca-941959d7151f"
 
 [compat]

README.md

Lines changed: 14 additions & 17 deletions
@@ -1,23 +1,20 @@
 
-# Algorithmic Recourse
+# AlgorithmicRecourseDynamics.jl - Research Paper 📝
 
-## Abstract
+**Note** ⚠: You are on the `#original-paper` branch of `AlgorithmicRecourseDynamics.jl`. This branch is a static artifact corresponding to the state of the package at the time the paper was first published. It can be used to replicate the original findings of the paper. For an up-to-date version of the package, please switch to the [`#main`](https://github.com/pat-alt/AlgorithmicRecourseDynamics.jl) branch.
 
-Existing work on Counterfactual Explanations (CE) and Algorithmic
-Recourse (AR) has largely been limited to the static setting: given some
-classifier we are interested in finding close, actionable, realistic,
-sparse, diverse and ideally causally founded counterfactuals. The
-ability of CE to handle dynamics like data and model drift remains a
-largely unexplored research challenge at this point. Only one recent
-work considers the implications of exogenous domain and model shifts.
-This project instead focuses on endogenous dynamics, that is shifts that
-occur when AR is actually implemented by a proportion of individuals.
-Early findings suggest that the involved shifts may be large with
-important implications on the validity of AR and the overall
-characteristics of the sample population.
+## At a Glance
 
-![](www/poc.png)
+The paper titled **Endogenous Macrodynamics in Algorithmic Recourse** is currently under review and not yet published. You can find a preprint along with other resources right here on this branch of the repository:
 
-## Code
+- [Paper](paper/paper.pdf)
+- [Notebooks](dev/notebooks/)
+- [Artifacts](https://github.com/pat-alt/AlgorithmicRecourseDynamics.jl/releases/tag/artifacts) (including data and experimental results)
 
-
+In this work we investigate what happens if Algorithmic Recourse is actually implemented by a large number of individuals. The chart below illustrates what we mean by Endogenous Macrodynamics in Algorithmic Recourse: (a) we have a simple linear classifier trained for binary classification where samples from the negative class ($y=0$) are marked in blue and samples of the positive class ($y=1$) are marked in orange; (b) the implementation of AR for a random subset of individuals leads to a noticeable domain shift; (c) as the classifier is retrained we observe a corresponding model shift; (d) as this process is repeated, the decision boundary moves away from the target class.
+
+![](paper/www/poc.png)
+
+## Paper Abstract
+
+Existing work on Counterfactual Explanations (CE) and Algorithmic Recourse (AR) has largely been limited to the static setting and focused on single individuals: given some estimated model the goal is to find valid counterfactuals for individual instances that fulfill various desiderata. The ability of such counterfactuals to handle dynamics like data and model drift remains a largely unexplored research challenge at this point. There has also been surprisingly little work on the related question of how the actual implementation of recourse by one individual may affect other individuals. Through this work we aim to close that gap by systematizing and extending existing knowledge. We first show that many of the existing methodologies can be collectively described by a generalized framework. We then argue that the existing framework fails to account for a hidden external cost of recourse, that only reveals itself when studying the endogenous dynamics of recourse at the group level. Through simulation experiments involving various state-of-the-art counterfactual generators and several benchmark datasets, we generate large numbers of counterfactuals and study the resulting domain and model shifts. We find that the induced shifts are substantial enough to likely impede the applicability of Algorithmic Recourse in situations that involve competition for scarce resources. Fortunately, we find various potential mitigation strategies that can be used in combination with existing approaches. Our simulation framework for studying recourse dynamics is fast and open-sourced.
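
Editorial sketch (not part of the commit): the four steps (a) to (d) that the new README describes can be illustrated with a toy simulation. The block below is a minimal sketch under stated assumptions, not the package's implementation: it is written in plain Python rather than Julia, it does not use any `AlgorithmicRecourseDynamics.jl` API, and the one-dimensional Gaussian data, the `margin` parameter, and the function names `fit_logistic` and `simulate` are all hypothetical choices made for illustration. Recourse is modelled in the simplest possible way, by moving an individual just across the current decision boundary and flipping the label.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, lr=0.5, steps=1000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by full-batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def simulate(rounds=5, n_per_class=50, n_recourse=5, margin=0.5, seed=0):
    """Toy recourse dynamics; returns the boundary per round and final labels."""
    rng = random.Random(seed)
    # (a) Assumed synthetic data: negatives around -2, positives around +2.
    xs = [rng.gauss(-2.0, 1.0) for _ in range(n_per_class)]
    xs += [rng.gauss(2.0, 1.0) for _ in range(n_per_class)]
    ys = [0] * n_per_class + [1] * n_per_class

    w, b = fit_logistic(xs, ys)
    boundaries = [-b / w]  # the point where p(y=1|x) = 0.5
    for _ in range(rounds):
        # (b) A random subset of negative individuals implements recourse:
        # each moves just past the current boundary (assumed minimal-cost
        # counterfactual for a linear model) and now receives the target label.
        negatives = [i for i, y in enumerate(ys) if y == 0]
        for i in rng.sample(negatives, min(n_recourse, len(negatives))):
            xs[i] = boundaries[-1] + margin
            ys[i] = 1
        # (c) The classifier is retrained on the shifted data.
        w, b = fit_logistic(xs, ys)
        boundaries.append(-b / w)
    # (d) Repeating (b) and (c) traces out the path of the boundary.
    return boundaries, ys


if __name__ == "__main__":
    path, _ = simulate()
    for round_idx, x_star in enumerate(path):
        print(f"round {round_idx}: decision boundary at x = {x_star:.3f}")
```

In this toy setup the boundary drifts toward the negative class, that is, away from the bulk of the target class, which mirrors panel (d). The size of the drift depends entirely on the assumed `margin`, `n_recourse`, and data, so it says nothing about the magnitudes reported in the paper.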

README.qmd

Lines changed: 16 additions & 20 deletions
@@ -1,32 +1,28 @@
 ---
-format: gfm
+format:
+  gfm:
+    wrap: none
 ---
 
-# AlgorithmicRecourseDynamics.jl
+# AlgorithmicRecourseDynamics.jl - Research Paper 📝
 
-## Abstract
+**Note** ⚠: You are on the `#original-paper` branch of `AlgorithmicRecourseDynamics.jl`. This branch is a static artifact corresponding to the state of the package at the time the paper was first published. It can be used to replicate the original findings of the paper. For an up-to-date version of the package, please switch to the [`#main`](https://github.com/pat-alt/AlgorithmicRecourseDynamics.jl) branch.
 
-{{< include paper/sections/abstract.qmd >}}
+## At a Glance
 
-![](www/poc.png)
+The paper titled **Endogenous Macrodynamics in Algorithmic Recourse** is currently under review and not yet published. You can find
+a preprint along with other resources right here on this branch of the
+repository:
 
-## Code
+- [Paper](paper/paper.pdf)
+- [Notebooks](dev/notebooks/)
+- [Artifacts](https://github.com/pat-alt/AlgorithmicRecourseDynamics.jl/releases/tag/artifacts) (including data and experimental results)
 
-### Python
+In this work we investigate what happens if Algorithmic Recourse is actually implemented by a large number of individuals. The chart below illustrates what we mean by Endogenous Macrodynamics in Algorithmic Recourse: (a) we have a simple linear classifier trained for binary classification where samples from the negative class ($y=0$) are marked in blue and samples of the positive class ($y=1$) are marked in orange; (b) the implementation of AR for a random subset of individuals leads to a noticeable domain shift; (c) as the classifier is retrained we observe a corresponding model shift; (d) as this process is repeated, the decision boundary moves away from the target class.
 
-The bulk of the work for this project was done in Python. The relevant source code is under `/src/python`. Use the command below to install the necessary dependencies.
+![](paper/www/poc.png)
 
-```{zsh}
-pip install -r src/python/requirements.txt
-```
+## Paper Abstract
 
-#### Trouble-shooting
+Existing work on Counterfactual Explanations (CE) and Algorithmic Recourse (AR) has largely been limited to the static setting and focused on single individuals: given some estimated model the goal is to find valid counterfactuals for individual instances that fulfill various desiderata. The ability of such counterfactuals to handle dynamics like data and model drift remains a largely unexplored research challenge at this point. There has also been surprisingly little work on the related question of how the actual implementation of recourse by one individual may affect other individuals. Through this work we aim to close that gap by systematizing and extending existing knowledge. We first show that many of the existing methodologies can be collectively described by a generalized framework. We then argue that the existing framework fails to account for a hidden external cost of recourse, that only reveals itself when studying the endogenous dynamics of recourse at the group level. Through simulation experiments involving various state-of-the-art counterfactual generators and several benchmark datasets, we generate large numbers of counterfactuals and study the resulting domain and model shifts. We find that the induced shifts are substantial enough to likely impede the applicability of Algorithmic Recourse in situations that involve competition for scarce resources. Fortunately, we find various potential mitigation strategies that can be used in combination with existing approaches. Our simulation framework for studying recourse dynamics is fast and open-sourced.
 
-The CARLA library used for benchmarking was built in Python 3.7, but we are using other dependencies that require Python >=3.8. For the work done here we have used
-
-
-### Julia
-
-Some exploratory work was initially done in Julia. The relevant source code is under `/src/julia`.
-
-...

paper/.Rhistory

Lines changed: 113 additions & 0 deletions
@@ -51,3 +51,116 @@ tab, booktabs = TRUE,
 caption = 'VAE architecture and training parameters.',
 col.names = c("Hidden Dim.","Epochs")
 )
+tab <- data.frame(
+"Hidden Dim." = c(32,32),
+"Hidden Layers." = c(1,2),
+"Batch" = c("-",50),
+"Dropout" = c("-",0.25)
+)
+rownames(tab) <- c("Synthetic", "Real-World")
+knitr::kable(
+tab, booktabs = TRUE,
+caption = 'MLP architecture and training parameters.',
+col.names = c("Hidden Dim.","Hidden Layers", "Batch", "Dropout")
+)
+tab <- data.frame(
+"Model" = c("MLP","MLP","VAE","VAE"),
+"Data" = c("Synthetic", "Real-World", "Synthetic", "Real-World"),
+"Hidden Dim." = c(32,32,32,32),
+"Latent Dim." = c("-","-",2,8),
+"Hidden Layers." = c(1,2,1,1),
+"Batch" = c("-",50),
+"Dropout" = c("-",0.25)
+)
+tab
+tab <- data.frame(
+"Model" = c("MLP","MLP","VAE","VAE"),
+"Data" = c("Synthetic", "Real-World", "Synthetic", "Real-World"),
+"Hidden Dim." = c(32,32,32,32),
+"Latent Dim." = c("-","-",2,8),
+"Hidden Layers." = c(1,2,1,1),
+"Batch" = c("-",50,"-","-"),
+"Dropout" = c("-",0.25,"-","-")
+)
+tab
+tab <- data.frame(
+"Model" = c("MLP","MLP","VAE","VAE"),
+"Data" = c("Synthetic", "Real-World", "Synthetic", "Real-World"),
+"Hidden Dim." = c(32,32,32,32),
+"Latent Dim." = c("-","-",2,8),
+"Hidden Layers." = c(1,2,1,1),
+"Batch" = c("-",50,"-","-"),
+"Dropout" = c("-",0.25,"-","-"),
+"Epochs" = c(100,100,100,250)
+)
+tab
+library(kableExtra)
+tab <- data.frame(
+"Model" = c("MLP","MLP","VAE","VAE"),
+"Data" = c("Synthetic", "Real-World", "Synthetic", "Real-World"),
+"Hidden Dim." = c(32,32,32,32),
+"Latent Dim." = c("-","-",2,8),
+"Hidden Layers." = c(1,2,1,1),
+"Batch" = c("-",50,"-","-"),
+"Dropout" = c("-",0.25,"-","-"),
+"Epochs" = c(100,100,100,250)
+)
+library(kableExtra)
+kbl(
+tab, booktabs = TRUE,
+caption = 'MLP architecture and training parameters.',
+col.names = c("Hidden Dim.","Latent Dim.","Hidden Layers", "Batch", "Dropout", "Epochs"),
+collapse_rows = c(1,2)
+)
+tab <- data.frame(
+"Model" = c("MLP","MLP","VAE","VAE"),
+"Data" = c("Synthetic", "Real-World", "Synthetic", "Real-World"),
+"Hidden Dim." = c(32,32,32,32),
+"Latent Dim." = c("-","-",2,8),
+"Hidden Layers." = c(1,2,1,1),
+"Batch" = c("-",50,"-","-"),
+"Dropout" = c("-",0.25,"-","-"),
+"Epochs" = c(100,100,100,250)
+)
+library(kableExtra)
+kbl(
+tab, booktabs = TRUE,
+caption = 'MLP architecture and training parameters.',
+col.names = c("Model","Data","Hidden Dim.","Latent Dim.","Hidden Layers", "Batch", "Dropout", "Epochs"),
+collapse_rows = c(1,2)
+)
+tab <- data.frame(
+"Model" = c("MLP","MLP","VAE","VAE"),
+"Data" = c("Synthetic", "Real-World", "Synthetic", "Real-World"),
+"Hidden Dim." = c(32,32,32,32),
+"Latent Dim." = c("-","-",2,8),
+"Hidden Layers." = c(1,2,1,1),
+"Batch" = c("-",50,"-","-"),
+"Dropout" = c("-",0.25,"-","-"),
+"Epochs" = c(100,100,100,250)
+)
+library(kableExtra)
+kbl(
+tab, booktabs = TRUE,
+caption = 'MLP architecture and training parameters.',
+col.names = c("Model","Data","Hidden Dim.","Latent Dim.","Hidden Layers", "Batch", "Dropout", "Epochs")
+) |>
+collapse_rows(1:2, row_group_label_position = 'stack')
+tab <- data.frame(
+"Model" = c("MLP","MLP","VAE","VAE"),
+"Data" = c("Synthetic", "Real-World", "Synthetic", "Real-World"),
+"Hidden" = c(32,32,32,32),
+"Latent" = c("-","-",2,8),
+"Layers" = c(1,2,1,1),
+"Batch" = c("-",50,"-","-"),
+"Dropout" = c("-",0.25,"-","-"),
+"Epochs" = c(100,100,100,250)
+)
+library(kableExtra)
+kbl(
+tab, booktabs = TRUE,
+caption = 'MLP architecture and training parameters.',
+col.names = c("Model","Data","Hidden Dim.","Latent Dim.","Hidden Layers", "Batch", "Dropout", "Epochs")
+) |>
+collapse_rows(1:2, row_group_label_position = 'stack') |>
+kable_styling(latex_options = c("scale_down"))

paper/paper.Rmd

Lines changed: 2 additions & 2 deletions
@@ -61,9 +61,9 @@ affiliation:
 # email: eldon@starfleet-academy.star
 # - name: Roy Batty
 # email: roy@replicant.offworld
-keywords: ["keyword 1", "keyword 2"]
+keywords: ["Algorithmic Recourse", "Counterfactual Explanations", "Explainable AI", "Dynamic Systems"]
 abstract: |
-  Existing work on Counterfactual Explanations (CE) and Algorithmic Recourse (AR) has largely been limited to the static setting and focused on single individuals: given some estimated model the goal is to find valid counterfactuals for individual instances that fulfill various desiderata. The ability of such counterfactuals to handle dynamics like data and model drift remains a largely unexplored research challenge at this point. There has also been surprisingly little work on the related question of how the actual implementation of recourse by one individual may affect other individuals. Through this work we aim to close that gap by systematizing and extending existing knowledge. We first show that many of the existing methodologies can be collectively described by a generalized framework. We then argue that the existing framework fails to account for a hidden external cost of recourse, that only reveals itself when studying the endogenous dynamics of recourse at the group level. Through simulation experiments involving various state-of-the-art counterfactual generators and several benchmark datasets, we generate a total of XX million counterfactuals and study the resulting domain and model shifts. We find that the induced shifts are substantial enough to likely impede the applicability of Algorithmic Recourse in situations that involve competition for scarce resources. Fortunately, we find various potential mitigation strategies that can be used in combination with existing approaches. Our simulation framework for studying recourse dynamics is fast and open-sourced.
+  Existing work on Counterfactual Explanations (CE) and Algorithmic Recourse (AR) has largely been limited to the static setting and focused on single individuals: given some estimated model the goal is to find valid counterfactuals for individual instances that fulfill various desiderata. The ability of such counterfactuals to handle dynamics like data and model drift remains a largely unexplored research challenge at this point. There has also been surprisingly little work on the related question of how the actual implementation of recourse by one individual may affect other individuals. Through this work we aim to close that gap by systematizing and extending existing knowledge. We first show that many of the existing methodologies can be collectively described by a generalized framework. We then argue that the existing framework fails to account for a hidden external cost of recourse, that only reveals itself when studying the endogenous dynamics of recourse at the group level. Through simulation experiments involving various state-of-the-art counterfactual generators and several benchmark datasets, we generate large numbers of counterfactuals and study the resulting domain and model shifts. We find that the induced shifts are substantial enough to likely impede the applicability of Algorithmic Recourse in situations that involve competition for scarce resources. Fortunately, we find various potential mitigation strategies that can be used in combination with existing approaches. Our simulation framework for studying recourse dynamics is fast and open-sourced.
 
 # use some specific Tex packages if needed.
 # with_ifpdf: true

paper/paper.qmd

Lines changed: 0 additions & 41 deletions
This file was deleted.

src/base.jl

Lines changed: 1 addition & 1 deletion
@@ -227,7 +227,7 @@ struct ExperimentResults
     experiment::Experiment
 end
 
-using DataFrames, CSV, BSON
+using DataFrames, CSV
 """
     run_experiment(
         experiment::Experiment; evaluate_every::Int=2,

www/cat.png

-16.6 KB
Binary file not shown.

www/dog.png

-16.8 KB
Binary file not shown.
