Interactive dashboards and web applications to learn basic concepts of statistics.
Live site: https://quarcs-lab.github.io/apps4statistics/
| App | What it teaches |
|---|---|
| Log Transformations | Why a log transformation tames right-skewed distributions, reduces the influence of outliers, and turns hockey-stick growth into straight lines. |
More apps coming soon.
The first dashboard teaches why log transformations are one of the most common tools in data analysis. It is organized into three interactive sections, each answering one question:
Section 1 — Right-skewed distributions. Simulates income from a lognormal distribution (adjustable μ, σ, and sample size). A strip plot shows every household as a dot, with mean and median lines. A box plot below summarizes the same data. On the raw scale the distribution is visibly right-skewed (long upper tail, mean above median, asymmetric whiskers); on the log scale it becomes a symmetric bell. Live readouts update skewness and the mean–median gap in real time as the user drags sliders.
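The idea behind Section 1 can be sketched in a few lines of plain JavaScript. This is a minimal illustration, not the dashboard's actual code: the helper names (`mulberry32`, `standardNormal`, `sampleLognormal`, `skewness`), the seed, and the parameter values μ = 10, σ = 0.8, n = 1000 are all assumptions chosen for the example. It draws lognormal "incomes" with a Box-Muller transform, then compares skewness and the mean/median relationship on the raw and log scales.

```javascript
// Small seedable PRNG (mulberry32) so the sketch is reproducible.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// One standard normal draw via the Box-Muller transform.
function standardNormal(rand) {
  const u1 = 1 - rand(); // shift to (0, 1] so Math.log never sees 0
  const u2 = rand();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

// If Z is standard normal, exp(mu + sigma * Z) is lognormal.
function sampleLognormal(n, mu, sigma, rand) {
  return Array.from({ length: n }, () =>
    Math.exp(mu + sigma * standardNormal(rand))
  );
}

function mean(xs) {
  return xs.reduce((acc, x) => acc + x, 0) / xs.length;
}

function median(xs) {
  const s = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 === 0 ? (s[mid - 1] + s[mid]) / 2 : s[mid];
}

// Moment-based skewness: third central moment over variance^(3/2).
function skewness(xs) {
  const m = mean(xs);
  const m2 = mean(xs.map((x) => (x - m) ** 2));
  const m3 = mean(xs.map((x) => (x - m) ** 3));
  return m3 / Math.pow(m2, 1.5);
}

// Hypothetical parameter values, chosen only for illustration.
const rand = mulberry32(42);
const income = sampleLognormal(1000, 10, 0.8, rand);
const logIncome = income.map((x) => Math.log(x));

console.log("raw skewness:", skewness(income));
console.log("log skewness:", skewness(logIncome));
console.log("raw mean / median:", mean(income) / median(income));
```

On the raw scale the skewness should be clearly positive and the mean should sit above the median; on the log scale the skewness should be close to zero, because the log of a lognormal variable is normal.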
Section 2 — Outliers and the mean–median gap. Starts with a tight symmetric bulk and lets the user inject 0–20 outliers at a controllable magnitude (up to 1000× the median). Three complementary views show the same data side by side on the raw and log scales: a strip plot with mean/median lines, a color-coded outlier-rule visualization (blue = inside Tukey's typical range, red = outside), and a classic box plot. The pedagogical chain: outliers pull the mean above the median on the raw scale; the log transformation compresses them and brings mean and median back together.
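The chain described in Section 2 can be traced with a small deterministic example. The sketch below is an illustration under stated assumptions, not the dashboard's code: the data (a bulk of 21 values from 90 to 110 plus three outliers at 10,000), the linear-interpolation quantile, and the function names are all made up for the example. It applies Tukey's rule (points beyond 1.5 × IQR from the quartiles are flagged) and compares the mean to the median on both scales.

```javascript
function mean(xs) {
  return xs.reduce((acc, x) => acc + x, 0) / xs.length;
}

function median(xs) {
  const s = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 === 0 ? (s[mid - 1] + s[mid]) / 2 : s[mid];
}

// Quantile of an already sorted array, using linear interpolation.
// (Other quantile conventions exist and give slightly different fences.)
function quantile(sorted, p) {
  const idx = (sorted.length - 1) * p;
  const lo = Math.floor(idx);
  const hi = Math.ceil(idx);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (idx - lo);
}

// Tukey's fences: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR.
function tukeyFences(xs) {
  const s = [...xs].sort((a, b) => a - b);
  const q1 = quantile(s, 0.25);
  const q3 = quantile(s, 0.75);
  const iqr = q3 - q1;
  return { lower: q1 - 1.5 * iqr, upper: q3 + 1.5 * iqr };
}

function countOutside(xs) {
  const { lower, upper } = tukeyFences(xs);
  return xs.filter((x) => x < lower || x > upper).length;
}

// Hypothetical data: a tight symmetric bulk plus three injected outliers.
const bulk = Array.from({ length: 21 }, (_, i) => 90 + i); // 90, 91, ..., 110
const data = [...bulk, 10000, 10000, 10000];
const logData = data.map((x) => Math.log(x));

console.log("flagged on raw scale:", countOutside(data)); // 3
console.log("raw mean:", mean(data), "raw median:", median(data)); // 1337.5 and 101.5

// On the log scale the mean corresponds to the geometric mean of the raw data.
// Exponentiating the log-scale gap expresses it as a ratio to the median.
const logGapAsRatio = Math.exp(mean(logData) - median(logData));
console.log("geometric mean / median:", logGapAsRatio); // about 1.75
```

With these numbers the arithmetic mean is about 13 times the median, while the log-scale mean, converted back, is less than twice the median. The outliers are still there, but they no longer dominate the summary.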
Section 3 — The hockey stick. Plots real GDP per capita (Maddison Project Database 2020) for the United States, with optional comparisons to Argentina and Japan. On a linear y-axis the series looks like a hockey stick; toggling to a log y-axis reveals a roughly straight line whose slope is the annualized growth rate. Optional overlays include a fitted log-linear trend (with R²) and a rolling growth-rate panel. The user can see that a constant slope means a stable growth rate — and that bends in the line mean the rate is changing.
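To make the link between slope and growth rate concrete, here is a minimal sketch of a log-linear trend fit and a rolling growth rate. It is not the dashboard's implementation: `fitLogLinear` and `rollingGrowth` are hypothetical names, and the series is synthetic (a constant 2% annual growth from a made-up starting level of 5,000), not Maddison data. The fit is ordinary least squares of ln(value) on year, so the slope is the continuously compounded growth rate and exp(slope) − 1 is the annual percentage rate.

```javascript
// Ordinary least squares of ln(value) on year.
function fitLogLinear(years, values) {
  const logs = values.map((v) => Math.log(v));
  const n = years.length;
  const xBar = years.reduce((a, x) => a + x, 0) / n;
  const yBar = logs.reduce((a, y) => a + y, 0) / n;

  let sxx = 0;
  let sxy = 0;
  let syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = years[i] - xBar;
    const dy = logs[i] - yBar;
    sxx += dx * dx;
    sxy += dx * dy;
    syy += dy * dy;
  }

  const slope = sxy / sxx; // log points per year
  const intercept = yBar - slope * xBar;
  const r2 = (sxy * sxy) / (sxx * syy); // squared correlation
  return { slope, intercept, r2, annualGrowth: Math.exp(slope) - 1 };
}

// Average log growth per year over a trailing window of `window` years.
function rollingGrowth(values, window) {
  const out = [];
  for (let t = window; t < values.length; t++) {
    out.push((Math.log(values[t]) - Math.log(values[t - window])) / window);
  }
  return out;
}

// Synthetic series (an assumption for illustration): 2% growth, 1900 to 2000.
const years = Array.from({ length: 101 }, (_, i) => 1900 + i);
const gdp = years.map((_, i) => 5000 * Math.pow(1.02, i));

const fit = fitLogLinear(years, gdp);
const rolling = rollingGrowth(gdp, 10);

console.log("annual growth:", fit.annualGrowth);
console.log("R squared:", fit.r2);
console.log("first rolling rate:", rolling[0]);
```

Because the synthetic series grows at a constant rate, the fitted annual growth should come back as 2%, R² should be essentially 1, and every rolling rate should equal ln(1.02). With real data, bends in the log-scale line show up as movement in the rolling rates.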
The simulated visualizations default to a sample size of 100 so every individual data point is clearly visible. Users can increase the sample with a slider to see how larger samples smooth out noise.
Each dashboard is a small, self-contained static site:
- Vanilla HTML, CSS, and JavaScript — no build step.
- Plotly.js loaded from a pinned CDN (basic bundle, ~800 KB).
- All data either generated client-side or embedded directly in the source so nothing depends on an external service.
The whole project is served as a GitHub Pages site from the root of the `main` branch. A `.nojekyll` file at the root disables Jekyll processing, so files are published exactly as committed.
From the repo root, run `python3 -m http.server 8000`, then open http://localhost:8000/.
- Create a sibling folder, e.g. `central-limit-theorem/`.
- Inside it, add `index.html`, `styles.css`, `app.js`, and any data files. Link the shared site stylesheet with `<link rel="stylesheet" href="../assets/styles.css">`.
- In the root `index.html`, add a new `<li>` to the `.card-grid` pointing to your new folder.
- Commit and push; GitHub Pages redeploys in about a minute.
GDP per capita data: Bolt, Jutta and Jan Luiten van Zanden, "Maddison style estimates of the evolution of the world economy. A new 2020 update", Maddison Project Database 2020, University of Groningen.
MIT