Commit 9e63624

work on data
1 parent 158dcf2 commit 9e63624

4 files changed

Lines changed: 15 additions & 13 deletions

docs/src/paper/Project.toml

Lines changed: 3 additions & 0 deletions
@@ -1,11 +1,14 @@
 [deps]
 AlgorithmicRecourseDynamics = "3d1ede72-abb8-4340-bf8e-2ae06849b5ec"
+CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
 CounterfactualExplanations = "2f13d31b-18db-44c1-bc43-ebaf2cff0be0"
 DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
 Flux = "587475ba-b771-5e3f-ad9e-33799f191a9c"
 Images = "916415d5-f1e6-5110-898d-aaa5f9f070e0"
 LaplaceRedux = "c52c1a26-f7c5-402b-80be-ba1e638ad478"
 MLJBase = "a7f614a8-145f-11e9-1d2a-a57a1082229d"
+MLUtils = "f1d291b0-491e-4a28-83b9-f70985020b54"
 Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
 RCall = "6f49c342-dc21-5d91-9882-a32aef131414"
 Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
+StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"

docs/src/paper/data_preprocessing/real_world_data.qmd renamed to docs/src/paper/data_preprocessing/_real_world_data.qmd

Lines changed: 9 additions & 13 deletions
@@ -1,43 +1,39 @@
 ---
 title: Preprocessing Real-World Data
-jupyter: julia-1.7
 ---
 
 ```{julia}
-using Pkg; Pkg.activate("dev")
-```
+#| echo: false
 
-```{julia}
-include("dev/utils.jl")
-using AlgorithmicRecourseDynamics
-using CounterfactualExplanations, Flux, Plots, PlotThemes, Random, LaplaceRedux, LinearAlgebra
-theme(:wong)
+include("docs/src/paper/setup.jl")
+eval(setup)
 output_path = output_dir("real_world")
 www_path = www_dir("real_world")
 data_path = data_dir("real_world")
 ```
 
 ## California Housing Data
 
-Fetching the data using Python's `sklearn`:
+Fetching the data using Python's `sklearn` (run this in the Python REPL):
 
-```{python}
+```python
 from sklearn.datasets import fetch_california_housing
 df, y = fetch_california_housing(return_X_y=True, as_frame=True)
 df["target"] = y.values
-data_path = "../../artifacts/upload/data/real_world"
+data_path = "dev/artifacts/upload/data/real_world"
 import os
+if not os.path.isdir(os.path.join(data_path,"raw")):
+    os.makedirs(os.path.join(data_path,"raw"))
 df.to_csv(os.path.join(data_path,"raw/cal_housing.csv"), index=False)
 ```
 
 Loading the data into Julia session:
 
 ```{julia}
-using CSV, DataFrames, Statistics, StatsBase
 df = CSV.read(joinpath(data_path, "raw/cal_housing.csv"), DataFrame)
 # Features:
 X = Matrix(df[:,Not(:target)])
-dt = fit(ZScoreTransform, X, dims=1)
+dt = StatsBase.fit(ZScoreTransform, X, dims=1)
 StatsBase.transform!(dt, X)
 # Target:
 y = df.target
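
The two data steps touched in this file, writing the raw CSV into a folder that may not exist yet and standardizing the feature columns, can be sketched in plain Python as below. This is a minimal illustration, not the project's code: the helper names `save_rows` and `zscore_columns` and the toy rows are hypothetical, `os.makedirs(..., exist_ok=True)` is swapped in for the explicit `os.path.isdir` check, and the z-score is assumed to use the sample (n - 1) standard deviation per column.

```python
import csv
import os
import statistics
import tempfile


def save_rows(rows, header, data_path, filename="cal_housing.csv"):
    """Write rows to <data_path>/raw/<filename>, creating the folder if needed."""
    raw_dir = os.path.join(data_path, "raw")
    os.makedirs(raw_dir, exist_ok=True)  # no error if the folder already exists
    out_file = os.path.join(raw_dir, filename)
    with open(out_file, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return out_file


def zscore_columns(rows):
    """Standardize each column to mean 0 and sample standard deviation 1."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]  # n - 1 denominator (assumption)
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Toy stand-in data; these numbers are made up and are not the housing data.
header = ["feature_a", "feature_b", "target"]
rows = [[1.0, 10.0, 0.5], [2.0, 20.0, 1.5], [3.0, 60.0, 2.5]]

with tempfile.TemporaryDirectory() as tmp:
    path = save_rows(rows, header, tmp)
    save_rows(rows, header, tmp)  # second call is safe: the folder already exists
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        loaded = [[float(v) for v in row] for row in reader]

features = [row[:-1] for row in loaded]  # drop the target column
standardized = zscore_columns(features)
```

Calling `save_rows` twice shows why `exist_ok=True` is convenient: the write can be rerun without guarding the directory creation by hand.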
File renamed without changes.

docs/src/paper/setup.jl

Lines changed: 3 additions & 0 deletions
@@ -10,16 +10,19 @@ setup = quote
 using AlgorithmicRecourseDynamics.Models: model_evaluation
 using CounterfactualExplanations
 using CounterfactualExplanations: counterfactual, counterfactual_label
+using CSV
 using DataFrames
 using Flux
 using Images
 using LaplaceRedux
 using Markdown
 using MLJBase
+using MLUtils
 using Plots
 using Random
 using RCall
 using Serialization
+using StatsBase
 
 # Setup
 Random.seed!(2023) # global seed to allow for reproducibility
