## Endogenous Macrodynamics

## Potential Mitigation Strategies {#sec-empirical-2-mitigate}

In the following, we explore several strategies that may help to mitigate the types of endogenous recourse dynamics we observed in the previous subsection. All of them essentially boil down to one simple principle: to avoid substantial domain and model shifts, the generated counterfactuals should comply as much as possible with the true data-generating process. This principle is at the core of Latent Space generators, so it is not surprising that we found these types of generators to perform comparatively well in the previous subsection. But as we have mentioned earlier, generators that rely on separate generative models carry an additional computational cost and, perhaps more importantly, their performance hinges on the performance of said generative models. Fortunately, it turns out that we can use a number of other, much simpler strategies, which we discuss now.
|
13 | | -For this simple mitigating strategy underlying the Gravitational generator to work as expected, one needs to ensure that counterfactual search continues, even after a predetermined threshold probability $\gamma$ has potentially already been reached. @fig-convergence illustrates this distinction: if one chooses to terminate search once the desired threshold is reached (left panel) the gravitational pull towards $\bar{\mathbf{x}}$ is never actually satisfied (compare to right panel). More generally, if convergence is defined simply in terms of flipping the predicted label with some desired degree of confidence, this corresponds to essentially ignoring any parts of the counterfactual search objective that do not involve $\ell(M(f(s_k^\prime)),t)$ beyond that point. While this may be appropriate for some applications, in general this seems like an odd convention. Since we nonetheless seen convergence specified simply in terms of reaching the threshold probability in some places^[@joshi2019towards define convergence of Algorithm 1 in this way. The implementation of @wachter2017counterfactual in CARLA is also defined in this way.], we thought it worth making this distinction explicit. |
| 11 | +### More Conservative Decision Thresholds |
14 | 12 |
|

The most obvious and trivial mitigation strategy is to simply choose a higher decision threshold $\gamma$. Under $\gamma=0.5$, counterfactuals end up near the decision boundary by construction. Since this is the region of maximal aleatoric uncertainty, the classifier is bound to be thrown off. By setting a more conservative decision threshold, we can avoid this issue to some extent. A potential drawback of this approach is that a classifier with high decisiveness may classify samples with high confidence even far away from the training data.
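
To make the role of the decision threshold concrete, the following toy sketch (not the paper's implementation; the one-dimensional classifier and all names are hypothetical, and the distance penalty is omitted for simplicity) runs a bare-bones Wachter-style gradient search that only terminates once the predicted probability of the target class exceeds $\gamma$:

```julia
# Toy illustration (hypothetical, one-dimensional): gradient-based search that
# terminates only once the predicted target-class probability exceeds γ.
sigmoid(z) = 1 / (1 + exp(-z))

# Pre-trained toy classifier: predicts the target class for x > 0.
w, b = 2.0, 0.0
prob(x) = sigmoid(w * x + b)

# Gradient descent on the cross-entropy loss -log(prob(x)) until prob(x) ≥ γ.
function counterfactual(x0; γ=0.5, η=0.05, max_iter=10_000)
    x = x0
    for _ in 1:max_iter
        prob(x) ≥ γ && break
        x += η * w * (1 - prob(x))  # -∂/∂x of -log(prob(x)) is w * (1 - prob(x))
    end
    return x
end

x_05 = counterfactual(-1.0; γ=0.5)  # ends up right at the decision boundary
x_09 = counterfactual(-1.0; γ=0.9)  # pushed further into the target domain
```

With $\gamma=0.5$ the search halts as soon as the boundary is crossed, whereas $\gamma=0.9$ forces the counterfactual well inside the target class.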

### Classifier Preserving ROAR

Another potential strategy draws inspiration from ROAR [@upadhyay2021towards]: to preserve the classifier, we propose explicitly penalizing the loss it incurs when evaluated on the counterfactual $x^\prime$ at given parameter values. Recall that $g(\cdot)$ denotes what we defined as the external cost in @eq-collective. Formally, we then let

$$
\begin{aligned}
g(f(s_k^\prime)) = l(M(f(s_k^\prime)),y^\prime)
\end{aligned}
$$ {#eq-clap}

for each counterfactual $k$, where $l$ denotes the loss function used to train $M$. This approach, which we shall refer to as **ClapROAR**, is based on the intuition that (endogenous) model shifts are triggered by counterfactuals that increase the classifier loss. It is closely linked to the idea of choosing a higher decision threshold, but likely better at avoiding the potential pitfalls associated with highly decisive classifiers. It also makes the trade-off between private and external costs more explicit and hence manageable.
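
To illustrate the intuition behind @eq-clap, the toy sketch below (hypothetical names and model, not the paper's implementation) computes the external cost as the binary cross-entropy the fixed classifier incurs on a counterfactual, showing that a candidate on the decision boundary is more costly than one deep inside the target class:

```julia
# Toy illustration (hypothetical): the ClapROAR external cost is the training
# loss the (fixed) classifier incurs on the counterfactual.
sigmoid(z) = 1 / (1 + exp(-z))
w, b = 2.0, 0.0
prob(x) = sigmoid(w * x + b)

# Binary cross-entropy of the prediction against the target label y′:
bce(p, y) = -(y * log(p) + (1 - y) * log(1 - p))
g_clap(x_cf, y_target) = bce(prob(x_cf), y_target)

g_clap(0.0, 1.0)  # counterfactual on the boundary: loss = log(2) ≈ 0.69
g_clap(2.0, 1.0)  # counterfactual deep in the target class: loss ≈ 0.02
```

Penalizing $g$ therefore pushes counterfactuals away from the high-loss boundary region, much like a higher threshold would, but directly in terms of the classifier's own loss.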

### Gravitational Counterfactual Explanations

Yet another strategy simply extends the baseline approach by @wachter2017counterfactual: instead of only penalizing the distance of the individual's counterfactual to its factual, we propose penalizing its distance to some sensible point in the target domain, for example the sample average $\bar{x}$:

$$
\begin{aligned}
g(f(s_k^\prime)) = \text{dist}(f(s_k^\prime),\bar{x})
\end{aligned}
$$ {#eq-grav}

Putting this once again in the context of @eq-collective, the former penalty can be thought of as the private cost incurred by the individual, while the latter reflects the external cost incurred by other individuals. Higher choices of $\lambda_2$ relative to $\lambda_1$ lead counterfactuals to gravitate towards the specified point $\bar{x}$ in the target domain. In the remainder of this paper we will therefore refer to this approach as the **Gravitational** generator when we investigate its potential usefulness for mitigating endogenous macrodynamics^[Note that despite the naming convention, our goal here is not to provide yet another counterfactual generator, but merely to investigate the simplest penalty we can think of with respect to its effectiveness.].
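
A minimal sketch of the resulting trade-off (toy values; `gravitational_cost` is a hypothetical name, not part of any package): the distance to the factual acts as the private cost and the distance to $\bar{x}$ as the external cost, so with $\lambda_2 \gg \lambda_1$ a candidate near $\bar{x}$ is cheaper overall than one near the decision boundary:

```julia
# Toy illustration (hypothetical): combined Gravitational objective with a
# private cost (distance to factual x) and an external cost (distance to x̄).
using LinearAlgebra

dist(a, b) = norm(a - b)

gravitational_cost(x_cf, x, x̄; λ₁=0.1, λ₂=1.0) =
    λ₁ * dist(x_cf, x) + λ₂ * dist(x_cf, x̄)

x  = [-1.0, 0.0]  # factual
x̄ = [3.0, 3.0]   # sample average of the target class
near_boundary = [0.1, 0.1]
near_average  = [2.8, 2.9]
gravitational_cost(near_average, x, x̄)   # small: dominated by cheap external cost
gravitational_cost(near_boundary, x, x̄)  # large: far from x̄
```

Shrinking $\lambda_2$ towards zero recovers the plain Wachter objective, so the gravitational pull can be dialled in as needed.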

@fig-mitigation shows an illustrative example that demonstrates the differences in counterfactual outcomes when using the various mitigation strategies compared to the baseline approach, that is, Wachter with $\gamma=0.5$: choosing a higher decision threshold pushes the counterfactual a little further into the target domain; this effect is even stronger for ClapROAR; finally, using the Gravitational generator the counterfactual ends up all the way inside the target domain in the neighbourhood of $\bar{x}$^[In order for the Gravitational generator and ClapROAR to work as expected, one needs to ensure that counterfactual search continues independent of the threshold probability $\gamma$. Defining convergence simply in terms of reaching $\gamma$, as in @joshi2019towards, for example, would leave the additional penalties unsatisfied.].

```{julia}
#| eval: false
using Random
Random.seed!(2022)

# Data:
using MLJ
N = 1000
X, ys = make_blobs(N, 2; centers=2, as_table=false, center_box=(-5 => 5), cluster_std=0.5)
ys .= ys .== 2
X = X'

# Counterfactual data:
using CounterfactualExplanations
counterfactual_data = CounterfactualData(X, ys')

# Models:
using AlgorithmicRecourseDynamics
M = AlgorithmicRecourseDynamics.Models.FluxModel(counterfactual_data)
M = AlgorithmicRecourseDynamics.Models.train(M, counterfactual_data)

# Generators:
generators = Dict(
    "Generic (γ=0.5)" => GenericGenerator(decision_threshold=0.5),
    "Generic (γ=0.9)" => GenericGenerator(decision_threshold=0.9),
    "Gravitational" => GravitationalGenerator(),
    "ClapROAR" => ClapROARGenerator()
)

# Counterfactuals:
x = select_factual(counterfactual_data, rand(1:size(X)[2]))
y = round(probs(M, x)[1])
target = ifelse(y == 1.0, 0.0, 1.0) # opposite label as target
T = 50
counterfactuals = Dict([name => generate_counterfactual(x, target, counterfactual_data, M, gen; T=T, latent_space=false) for (name, gen) in generators])

# Plots:
using Plots
plts = []
for (name, ce) ∈ counterfactuals
    plt = plot(ce; title=name, colorbar=false, ticks=false, legend=false)
    plts = vcat(plts..., plt)
end
plt = plot(plts..., size=(750, 200), layout=(1, 4))
savefig(plt, "dev/paper/www/mitigation.png")
```

![Counterfactual outcomes for the various mitigation strategies compared to the baseline approach (Wachter with $\gamma=0.5$).](www/mitigation.png){#fig-mitigation fig.pos="h" width=45%}

Our findings ...