JuliaTrustworthyAI
diff --git a/CITATION.bib b/CITATION.bib
Lines changed: 0 additions & 8 deletions
diff --git a/Project.toml b/Project.toml
Lines changed: 0 additions & 9 deletions
diff --git a/README.md b/README.md
Lines changed: 14 additions & 17 deletions
diff --git a/README.qmd b/README.qmd
Lines changed: 16 additions & 20 deletions
diff --git a/paper/.Rhistory b/paper/.Rhistory
Lines changed: 113 additions & 0 deletions
diff --git a/paper/paper.Rmd b/paper/paper.Rmd
Lines changed: 2 additions & 2 deletions
diff --git a/paper/paper.qmd b/paper/paper.qmd
Lines changed: 0 additions & 41 deletions
diff --git a/src/base.jl b/src/base.jl
Lines changed: 1 addition & 1 deletion
diff --git a/www/cat.png b/www/cat.png
-16.6 KB
diff --git a/www/dog.png b/www/dog.png
-16.8 KB
@@ -4,38 +4,29 @@ authors = ["Patrick Altmeyer"]
 version = "0.1.0"
 
 [deps]
-BSON = "fbb218c0-5317-5bc6-957e-2ee96dd4b1f0"
 CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
-CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"
 CounterfactualExplanations = "2f13d31b-18db-44c1-bc43-ebaf2cff0be0"
 DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
 Distances = "b4f34e82-e78d-54a5-968a-f98e89d6e8f7"
-Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
 FileIO = "5789e2e9-d7fb-5bc7-8068-2c6fae9b9549"
 Flux = "587475ba-b771-5e3f-ad9e-33799f191a9c"
-Gadfly = "c91e804a-d5a3-530f-b6f0-dfbca275c004"
 Images = "916415d5-f1e6-5110-898d-aaa5f9f070e0"
 KernelFunctions = "ec8451be-7e33-11e9-00cf-bbf324bd1392"
 LaplaceRedux = "c52c1a26-f7c5-402b-80be-ba1e638ad478"
 LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
 Logging = "56ddb016-857b-54e1-b83d-db4d58db5568"
-MLDataUtils = "cc2ba9b6-d476-5e6d-8eaf-a92d5412d41d"
-MLDatasets = "eb30cadb-4394-5ae3-aed4-317e484a6458"
 MLJ = "add582a8-e3ab-11e8-2d5e-e98b27df1bc7"
 MLUtils = "f1d291b0-491e-4a28-83b9-f70985020b54"
-NNlib = "872c559c-99b0-510c-b3b7-b6c96a88d5cd"
 Parameters = "d96e819e-fc66-5662-9728-84c9c7592b0a"
 Pkg = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
 PlotThemes = "ccf2f8ad-2431-5c83-bf29-c5338b663b6a"
 Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
 ProgressMeter = "92933f4c-e287-5a05-a399-4b506db050ca"
 RCall = "6f49c342-dc21-5d91-9882-a32aef131414"
-RDatasets = "ce6b1742-4840-55fa-b093-852dadbb1d8b"
 Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
 Serialization = "9e88b42a-f829-5b0c-bbe9-9e923198166b"
 Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
 StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
-Turing = "fce5fe82-541a-59a6-adf8-730c64b5f9a0"
 Zygote = "e88e6eb3-aa80-5325-afca-941959d7151f"
 
 [compat]
 
@@ -1,23 +1,20 @@
 
-# Algorithmic Recourse
+# AlgorithmicRecourseDynamics.jl - Research Paper 📝
 
-## Abstract
+**Note** ⚠: You are on the `#original-paper` branch of `AlgorithmicRecourseDynamics.jl`. This branch is a static artifact corresponding to the state of the package at the time the paper was first published. It can be used to replicate the original findings of the paper. For an up-to-date version of the package, please switch to the [`#main`](https://github.com/pat-alt/AlgorithmicRecourseDynamics.jl) branch.
 
-Existing work on Counterfactual Explanations (CE) and Algorithmic
-Recourse (AR) has largely been limited to the static setting: given some
-classifier we are interested in finding close, actionable, realistic,
-sparse, diverse and ideally causally founded counterfactuals. The
-ability of CE to handle dynamics like data and model drift remains a
-largely unexplored research challenge at this point. Only one recent
-work considers the implications of exogenous domain and model shifts.
-This project instead focuses on endogenous dynamics, that is shifts that
-occur when AR is actually implemented by a proportion of individuals.
-Early findings suggest that the involved shifts may be large with
-important implications on the validity of AR and the overall
-characteristics of the sample population.
+## At a Glance
 
-![](www/poc.png)
+The paper titled **Endogenous Macrodynamics in Algorithmic Recourse** is currently under review and not yet published. You can find a preprint along with other resources right here on this branch of the repository:
 
-## Code
+-   [Paper](paper/paper.pdf)
+-   [Notebooks](dev/notebooks/)
+-   [Artifacts](https://github.com/pat-alt/AlgorithmicRecourseDynamics.jl/releases/tag/artifacts) (including data and experimental results)
 
-…
+In this work we investigate what happens if Algorithmic Recourse is actually implemented by a large number of individuals. The chart below illustrates what we mean by Endogenous Macrodynamics in Algorithmic Recourse: (a) we have a simple linear classifier trained for binary classification where samples from the negative class ($y=0$) are marked in blue and samples of the positive class ($y=1$) are marked in orange; (b) the implementation of AR for a random subset of individuals leads to a noticeable domain shift; (c) as the classifier is retrained we observe a corresponding model shift; (d) as this process is repeated, the decision boundary moves away from the target class.
+
+![](paper/www/poc.png)
+
+## Paper Abstract
+
+Existing work on Counterfactual Explanations (CE) and Algorithmic Recourse (AR) has largely been limited to the static setting and focused on single individuals: given some estimated model, the goal is to find valid counterfactuals for individual instances that fulfill various desiderata. The ability of such counterfactuals to handle dynamics like data and model drift remains a largely unexplored research challenge at this point. There has also been surprisingly little work on the related question of how the actual implementation of recourse by one individual may affect other individuals. Through this work we aim to close that gap by systematizing and extending existing knowledge. We first show that many of the existing methodologies can be collectively described by a generalized framework. We then argue that the existing framework fails to account for a hidden external cost of recourse that only reveals itself when studying the endogenous dynamics of recourse at the group level. Through simulation experiments involving various state-of-the-art counterfactual generators and several benchmark datasets, we generate large numbers of counterfactuals and study the resulting domain and model shifts. We find that the induced shifts are substantial enough to likely impede the applicability of Algorithmic Recourse in situations that involve competition for scarce resources. Fortunately, we find various potential mitigation strategies that can be used in combination with existing approaches. Our simulation framework for studying recourse dynamics is fast and open-sourced.
@@ -1,32 +1,28 @@
 ---
-format: gfm
+format: 
+    gfm:
+        wrap: none
 ---
 
-# AlgorithmicRecourseDynamics.jl
+# AlgorithmicRecourseDynamics.jl - Research Paper 📝
 
-## Abstract
+**Note** ⚠: You are on the `#original-paper` branch of `AlgorithmicRecourseDynamics.jl`. This branch is a static artifact corresponding to the state of the package at the time the paper was first published. It can be used to replicate the original findings of the paper. For an up-to-date version of the package, please switch to the [`#main`](https://github.com/pat-alt/AlgorithmicRecourseDynamics.jl) branch.
 
-{{< include paper/sections/abstract.qmd >}}
+## At a Glance
 
-![](www/poc.png)
+The paper titled **Endogenous Macrodynamics in Algorithmic Recourse** is currently under review and not yet published. You can find
+a preprint along with other resources right here on this branch of the
+repository:
 
-## Code 
+- [Paper](paper/paper.pdf)
+- [Notebooks](dev/notebooks/)
+- [Artifacts](https://github.com/pat-alt/AlgorithmicRecourseDynamics.jl/releases/tag/artifacts) (including data and experimental results)
 
-### Python
+In this work we investigate what happens if Algorithmic Recourse is actually implemented by a large number of individuals. The chart below illustrates what we mean by Endogenous Macrodynamics in Algorithmic Recourse: (a) we have a simple linear classifier trained for binary classification where samples from the negative class ($y=0$) are marked in blue and samples of the positive class ($y=1$) are marked in orange; (b) the implementation of AR for a random subset of individuals leads to a noticeable domain shift; (c) as the classifier is retrained we observe a corresponding model shift; (d) as this process is repeated, the decision boundary moves away from the target class.
 
-The bulk of the work for this project was done in Python. The relevant source code is under `/src/python`. Use the command below to install the necessary dependencies.
+![](paper/www/poc.png)
 
-```{zsh}
-pip install -r src/python/requirements.txt
-```
+## Paper Abstract
 
-#### Trouble-shooting
+Existing work on Counterfactual Explanations (CE) and Algorithmic Recourse (AR) has largely been limited to the static setting and focused on single individuals: given some estimated model, the goal is to find valid counterfactuals for individual instances that fulfill various desiderata. The ability of such counterfactuals to handle dynamics like data and model drift remains a largely unexplored research challenge at this point. There has also been surprisingly little work on the related question of how the actual implementation of recourse by one individual may affect other individuals. Through this work we aim to close that gap by systematizing and extending existing knowledge. We first show that many of the existing methodologies can be collectively described by a generalized framework. We then argue that the existing framework fails to account for a hidden external cost of recourse that only reveals itself when studying the endogenous dynamics of recourse at the group level. Through simulation experiments involving various state-of-the-art counterfactual generators and several benchmark datasets, we generate large numbers of counterfactuals and study the resulting domain and model shifts. We find that the induced shifts are substantial enough to likely impede the applicability of Algorithmic Recourse in situations that involve competition for scarce resources. Fortunately, we find various potential mitigation strategies that can be used in combination with existing approaches. Our simulation framework for studying recourse dynamics is fast and open-sourced.
 
-The CARLA library used for benchmarking was built in Python 3.7, but we are using other dependencies that require Python >=3.8. For the work done here we have used
-
-
-### Julia
-
-Some exploratory work was initially done in Julia. The relevant source code is under `/src/julia`.
-
-...
@@ -51,3 +51,116 @@ tab, booktabs = TRUE,
 caption = 'VAE architecture and training parameters.',
 col.names = c("Hidden Dim.","Epochs")
 )
+tab <- data.frame(
+"Hidden Dim." = c(32,32),
+"Hidden Layers." = c(1,2),
+"Batch" = c("-",50),
+"Dropout" = c("-",0.25)
+)
+rownames(tab) <- c("Synthetic", "Real-World")
+knitr::kable(
+tab, booktabs = TRUE,
+caption = 'MLP architecture and training parameters.',
+col.names = c("Hidden Dim.","Hidden Layers", "Batch", "Dropout")
+)
+tab <- data.frame(
+"Model" = c("MLP","MLP","VAE","VAE"),
+"Data" = c("Synthetic", "Real-World", "Synthetic", "Real-World"),
+"Hidden Dim." = c(32,32,32,32),
+"Latent Dim." = c("-","-",2,8),
+"Hidden Layers." = c(1,2,1,1),
+"Batch" = c("-",50),
+"Dropout" = c("-",0.25)
+)
+tab
+tab <- data.frame(
+"Model" = c("MLP","MLP","VAE","VAE"),
+"Data" = c("Synthetic", "Real-World", "Synthetic", "Real-World"),
+"Hidden Dim." = c(32,32,32,32),
+"Latent Dim." = c("-","-",2,8),
+"Hidden Layers." = c(1,2,1,1),
+"Batch" = c("-",50,"-","-"),
+"Dropout" = c("-",0.25,"-","-")
+)
+tab
+tab <- data.frame(
+"Model" = c("MLP","MLP","VAE","VAE"),
+"Data" = c("Synthetic", "Real-World", "Synthetic", "Real-World"),
+"Hidden Dim." = c(32,32,32,32),
+"Latent Dim." = c("-","-",2,8),
+"Hidden Layers." = c(1,2,1,1),
+"Batch" = c("-",50,"-","-"),
+"Dropout" = c("-",0.25,"-","-"),
+"Epochs" = c(100,100,100,250)
+)
+tab
+library(kableExtra)
+tab <- data.frame(
+"Model" = c("MLP","MLP","VAE","VAE"),
+"Data" = c("Synthetic", "Real-World", "Synthetic", "Real-World"),
+"Hidden Dim." = c(32,32,32,32),
+"Latent Dim." = c("-","-",2,8),
+"Hidden Layers." = c(1,2,1,1),
+"Batch" = c("-",50,"-","-"),
+"Dropout" = c("-",0.25,"-","-"),
+"Epochs" = c(100,100,100,250)
+)
+library(kableExtra)
+kbl(
+tab, booktabs = TRUE,
+caption = 'MLP architecture and training parameters.',
+col.names = c("Hidden Dim.","Latent Dim.","Hidden Layers", "Batch", "Dropout", "Epochs"),
+collapse_rows = c(1,2)
+)
+tab <- data.frame(
+"Model" = c("MLP","MLP","VAE","VAE"),
+"Data" = c("Synthetic", "Real-World", "Synthetic", "Real-World"),
+"Hidden Dim." = c(32,32,32,32),
+"Latent Dim." = c("-","-",2,8),
+"Hidden Layers." = c(1,2,1,1),
+"Batch" = c("-",50,"-","-"),
+"Dropout" = c("-",0.25,"-","-"),
+"Epochs" = c(100,100,100,250)
+)
+library(kableExtra)
+kbl(
+tab, booktabs = TRUE,
+caption = 'MLP architecture and training parameters.',
+col.names = c("Model","Data","Hidden Dim.","Latent Dim.","Hidden Layers", "Batch", "Dropout", "Epochs"),
+collapse_rows = c(1,2)
+)
+tab <- data.frame(
+"Model" = c("MLP","MLP","VAE","VAE"),
+"Data" = c("Synthetic", "Real-World", "Synthetic", "Real-World"),
+"Hidden Dim." = c(32,32,32,32),
+"Latent Dim." = c("-","-",2,8),
+"Hidden Layers." = c(1,2,1,1),
+"Batch" = c("-",50,"-","-"),
+"Dropout" = c("-",0.25,"-","-"),
+"Epochs" = c(100,100,100,250)
+)
+library(kableExtra)
+kbl(
+tab, booktabs = TRUE,
+caption = 'MLP architecture and training parameters.',
+col.names = c("Model","Data","Hidden Dim.","Latent Dim.","Hidden Layers", "Batch", "Dropout", "Epochs")
+) |>
+collapse_rows(1:2, row_group_label_position = 'stack')
+tab <- data.frame(
+"Model" = c("MLP","MLP","VAE","VAE"),
+"Data" = c("Synthetic", "Real-World", "Synthetic", "Real-World"),
+"Hidden" = c(32,32,32,32),
+"Latent" = c("-","-",2,8),
+"Layers" = c(1,2,1,1),
+"Batch" = c("-",50,"-","-"),
+"Dropout" = c("-",0.25,"-","-"),
+"Epochs" = c(100,100,100,250)
+)
+library(kableExtra)
+kbl(
+tab, booktabs = TRUE,
+caption = 'MLP architecture and training parameters.',
+col.names = c("Model","Data","Hidden Dim.","Latent Dim.","Hidden Layers", "Batch", "Dropout", "Epochs")
+) |>
+collapse_rows(1:2, row_group_label_position = 'stack') |>
+kable_styling(latex_options = c("scale_down"))
@@ -61,9 +61,9 @@ affiliation:
   #         email: eldon@starfleet-academy.star
   #       - name: Roy Batty
   #         email: roy@replicant.offworld
-keywords: ["keyword 1", "keyword 2"]
+keywords: ["Algorithmic Recourse", "Counterfactual Explanations", "Explainable AI", "Dynamic Systems"]
 abstract: |
-  Existing work on Counterfactual Explanations (CE) and Algorithmic Recourse (AR) has largely been limited to the static setting and focused on single individuals: given some estimated model the goal is to find valid counterfactuals for individual instance that fulfill various desiderata. The ability of such counterfactuals to handle dynamics like data and model drift remains a largely unexplored research challenge at this point. There has also been surprisingly little work on the related question of how the actual implementation of recourse by one individual may affect other individuals. Through this work we aim to close that gap by systematizing and extending existing knowledge. We first show that many of the existing methodologies can be collectively described by a generalized framework. We then argue that the existing framework fails to account for a hidden external cost of recourse, that only reveals itself when studying the endogenous dynamics of recourse at the group level. Through simulation experiments involving various state-of-the-art counterfactual generators and several benchmark datasets, we generate a total of XX million counterfactuals and study the resulting domain and model shifts. We find that the induced shifts are substantial enough to likely impede the applicability Algorithmic Recourse in situations that involve competition for scarce resources. Fortunately, we find various potential mitigation strategies that can be used in combination with existing approaches. Our simulation framework for studying recourse dynamics is fast and open-sourced. 
+  Existing work on Counterfactual Explanations (CE) and Algorithmic Recourse (AR) has largely been limited to the static setting and focused on single individuals: given some estimated model, the goal is to find valid counterfactuals for individual instances that fulfill various desiderata. The ability of such counterfactuals to handle dynamics like data and model drift remains a largely unexplored research challenge at this point. There has also been surprisingly little work on the related question of how the actual implementation of recourse by one individual may affect other individuals. Through this work we aim to close that gap by systematizing and extending existing knowledge. We first show that many of the existing methodologies can be collectively described by a generalized framework. We then argue that the existing framework fails to account for a hidden external cost of recourse that only reveals itself when studying the endogenous dynamics of recourse at the group level. Through simulation experiments involving various state-of-the-art counterfactual generators and several benchmark datasets, we generate large numbers of counterfactuals and study the resulting domain and model shifts. We find that the induced shifts are substantial enough to likely impede the applicability of Algorithmic Recourse in situations that involve competition for scarce resources. Fortunately, we find various potential mitigation strategies that can be used in combination with existing approaches. Our simulation framework for studying recourse dynamics is fast and open-sourced.
 
 # use some specific Tex packages if needed. 
 # with_ifpdf: true
 
@@ -227,7 +227,7 @@ struct ExperimentResults
     experiment::Experiment
 end
 
-using DataFrames, CSV, BSON
+using DataFrames, CSV
 """
     run_experiment(
         experiment::Experiment; evaluate_every::Int=2,