Commit 4ddf98f (parent 03a4251): docs: add Decision Tree algorithm explanation

1 file changed: docs/decision_tree.md (92 additions, 0 deletions)
# Decision Tree Algorithm

## Overview
A **Decision Tree** is a supervised machine learning algorithm used for both classification and regression tasks.
It works by recursively splitting the dataset into smaller subsets based on feature values until a stopping criterion is met.

---

## Mathematical Concepts

Decision Trees use measures like **Entropy**, **Information Gain**, and **Gini Impurity** to decide where to split.
### 1. Entropy
Entropy measures the amount of uncertainty or impurity in the dataset:

H(S) = - Σ p(x) log₂ p(x)

Where:
- `p(x)` = probability of class `x`
- Lower entropy = purer dataset
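To make the formula concrete, here is a minimal pure-Python sketch of the entropy calculation; the `entropy` helper name is illustrative, not part of this repository's API:

```python
from collections import Counter
from math import log2

def entropy(labels: list) -> float:
    """Shannon entropy H(S) of a list of class labels."""
    total = len(labels)
    # p(x) = count of class x / total samples
    return -sum((count / total) * log2(count / total)
                for count in Counter(labels).values())

print(entropy([0, 0, 1, 1]))  # 1.0 (maximally impure for two classes)
print(entropy([0, 0, 0, 0]))  # -0.0, i.e. zero: the dataset is pure
```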
---

### 2. Information Gain
Information Gain measures the reduction in entropy after splitting on an attribute:

IG(S, A) = H(S) - Σ ( |Sᵥ| / |S| ) * H(Sᵥ)

Where:
- `S` = dataset
- `A` = attribute (feature)
- `Sᵥ` = subset of `S` in which `A` takes value `v` (the sum runs over every value `v` of `A`)

The split with the **highest information gain** is chosen.
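Continuing the sketch above (and reusing its hypothetical `entropy` helper), information gain is the parent entropy minus the size-weighted entropy of the subsets a split produces:

```python
def information_gain(labels: list, subsets: list) -> float:
    """IG(S, A): entropy of S minus the weighted entropy of the
    subsets of S obtained by splitting on attribute A."""
    total = len(labels)
    weighted = sum(len(s) / total * entropy(s) for s in subsets)
    return entropy(labels) - weighted

# Splitting [0, 0, 1, 1] into two pure subsets removes all uncertainty:
print(information_gain([0, 0, 1, 1], [[0, 0], [1, 1]]))  # 1.0
```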
---

### 3. Gini Index
An alternative to entropy for measuring impurity:

Gini(S) = 1 - Σ p(i)²

Where:
- `p(i)` = probability of class `i` in dataset `S`

A pure dataset has Gini = 0.
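A matching sketch for the Gini index, again with an illustrative helper name rather than repository API:

```python
from collections import Counter

def gini(labels: list) -> float:
    """Gini impurity: 1 minus the sum of squared class probabilities."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2
                     for count in Counter(labels).values())

print(gini([0, 0, 1, 1]))  # 0.5 (maximum for two balanced classes)
print(gini([0, 0, 0, 0]))  # 0.0 (pure dataset)
```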
---

## Practical Use Cases
- **Business**: Predicting customer churn
- **Finance**: Credit scoring / loan approval
- **Healthcare**: Diagnosing diseases based on symptoms
- **Cybersecurity**: Spam / phishing detection
---

## Advantages
- Simple to understand and visualize
- Handles both numerical and categorical data
- Requires little preprocessing (no normalization or scaling)
---

## Limitations
- Prone to overfitting (mitigated by pruning or by ensembles such as Random Forests)
- Small changes in the data can lead to very different trees (instability)

---
## Example Usage

```python
from machine_learning.decision_tree import DecisionTree

# Sample dataset
X = [[1], [2], [3], [4], [5]]
y = [0, 0, 1, 1, 1]

# Train model
tree = DecisionTree(max_depth=2)
tree.fit(X, y)

# Prediction
print(tree.predict([[2]]))  # Output: [0]
```
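For comparison, here is a sketch of the same toy problem using scikit-learn's `DecisionTreeClassifier` (assuming scikit-learn is installed; it is not part of this repository). The `criterion="entropy"` argument selects the entropy / information-gain splitting rule described above:

```python
from sklearn.tree import DecisionTreeClassifier

X = [[1], [2], [3], [4], [5]]
y = [0, 0, 1, 1, 1]

clf = DecisionTreeClassifier(max_depth=2, criterion="entropy")
clf.fit(X, y)
print(clf.predict([[2]]))  # [0]
```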
