# Decision Tree Overview

## What is a Decision Tree?

A **Decision Tree** is a machine learning algorithm that recursively splits a dataset on feature values, forming a binary tree. Predictions are made by traversing from the root node down to a leaf node. Key characteristics include:

- **Splitting Mechanism:** The dataset is recursively split based on feature values.
- **Optimization Criteria:** Uses metrics like entropy or information gain to determine the optimal split. The quality of these splits is fundamental to the algorithm's effectiveness.
- **Versatility:** Applicable to both classification and regression tasks.
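
As a quick illustration, here is a minimal sketch using scikit-learn (an assumption; the text does not prescribe a library), with a toy dataset invented for the example:

```python
from sklearn.tree import DecisionTreeClassifier

# Toy dataset: two features per sample, binary labels
X = [[2.0, 1.0], [3.0, 1.5], [10.0, 8.0], [11.0, 9.5]]
y = [0, 0, 1, 1]

# criterion="entropy" selects splits by entropy, as described below
clf = DecisionTreeClassifier(criterion="entropy", max_depth=3)
clf.fit(X, y)

print(clf.predict([[2.5, 1.2], [10.5, 9.0]]))  # [0 1]
```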

## The Art of Splits

### Entropy

At a high level, **entropy** quantifies the amount of uncertainty or impurity in a dataset.

#### Formula for Entropy

To calculate the entropy of a particular split:

1. **Calculate Probabilities:** Determine the probability of each class in the split.
2. **Compute Logarithms:** Take the logarithm of each probability.
3. **Multiply:** Multiply each probability by its corresponding logarithm.
4. **Sum and Negate:** Perform the above steps for every class in the leaf, sum the results, and negate the sum to obtain the entropy of that leaf.

**Formula:**
\[
\text{Entropy} = -\sum_{i=1}^{n} p_i \log(p_i)
\]
where \( p_i \) is the probability of class \( i \).
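
To make the steps concrete, here is a minimal sketch in Python (the function name `entropy` and the choice of base-2 logarithms are assumptions for this example; the formula above leaves the base unspecified):

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels, following the steps above."""
    n = len(labels)
    probs = [count / n for count in Counter(labels).values()]  # step 1
    return sum(-p * math.log2(p) for p in probs)               # steps 2-4

print(entropy(["A", "A", "B", "B"]))  # 1.0 -- a perfectly mixed leaf
print(entropy(["A", "A", "A", "A"]))  # 0.0 -- a pure leaf
```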

#### Why Use Logarithms of Probabilities?

- **Weighting Rare Events:** An event with a probability of 0.01 is rarer, and therefore more surprising, than an event with a probability of 0.9. Logarithms scale these probabilities non-linearly, which is beneficial for emphasizing rare events.
- **Mathematical Properties:** The base-10 logarithm is defined as follows:
  - \( \log(x) = y \) is equivalent to \( 10^y = x \).
  - As \( x \) grows exponentially, \( y \) increases very slowly.
- **Examples:**
  - \( \log(10) = 1 \)
  - \( \log(100) = 2 \)
  - \( \log(1000) = 3 \)
  - \( \log(10^{16}) = 16 \)

This property ensures that large changes in \( x \) result in only modest changes in \( y \), preventing any single class from disproportionately influencing the entropy.
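
A quick numerical check of this weighting (a sketch; base-2 logarithms are used here, but any base only rescales the values):

```python
import math

# Surprise (self-information) of an event with probability p: -log(p)
for p in (0.9, 0.5, 0.01):
    print(f"p = {p:>4}: surprise = {-math.log2(p):.2f} bits")
# p =  0.9: surprise = 0.15 bits
# p =  0.5: surprise = 1.00 bits
# p = 0.01: surprise = 6.64 bits
```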

### Other Splitting Criteria

There are several other methods for determining the optimal splits in a decision tree:

- **Information Gain:** Measures the reduction in entropy achieved by partitioning the data.
- **Gini Impurity:** Evaluates the likelihood of incorrect classification of a randomly chosen element.
- **Gradient-Based Splitting:** Utilizes gradient information for more advanced splitting strategies.

These methods will be covered in detail in subsequent sections.
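
As a brief preview, here is a minimal sketch of the first two criteria (the `entropy` helper is repeated from the earlier example so the snippet stands alone; the function names are illustrative):

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity: chance of misclassifying a randomly chosen element."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, left, right):
    """Reduction in entropy achieved by splitting `parent` into two leaves."""
    n = len(parent)
    children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - children

labels = ["A", "A", "B", "B"]
print(gini(labels))                                      # 0.5
print(information_gain(labels, ["A", "A"], ["B", "B"]))  # 1.0
```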

## Workflow of Decision Tree Construction

The process of building a decision tree involves the following steps (a code sketch follows the list):

1. **Iterate Through Features:**
   - Loop through each feature in the dataset.

2. **Determine Feature Range:**
   - Identify the minimum and maximum values of the current feature.

3. **Generate Split Points:**
   - Sort the feature values and take the mean of each pair of consecutive values to create potential split points.

4. **Split the Dataset:**
   - For each mean value, divide the dataset into two subsets:
     - **Left Leaf:** Instances with feature values less than or equal to the mean.
     - **Right Leaf:** Instances with feature values greater than the mean.

5. **Calculate Entropy:**
   - For each resulting leaf, determine the class distribution and compute the entropy.
   - Weight the entropy of each leaf by the fraction of instances it contains.
   - Sum the weighted entropies to obtain the overall entropy of the split.

6. **Evaluate Split Quality:**
   - Compare the calculated entropy with the current best entropy.
   - If the new entropy is lower, update the best split to this one.

7. **Recursive Splitting:**
   - Apply the above steps recursively to each leaf node.
   - Continue until a predefined depth limit or another stopping criterion is reached.

This greedy, recursive process partitions the data to minimize entropy at each split, enhancing prediction accuracy.
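
Here is a minimal, self-contained sketch of steps 1-6 in Python (the function name `best_split` and the list-of-rows data layout are assumptions for this example):

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(X, y):
    """Return the (feature index, threshold, weighted entropy) of the best split.

    X is a list of feature rows; y is the list of class labels.
    """
    best = (None, None, float("inf"))
    n = len(y)
    for f in range(len(X[0])):                    # step 1: each feature
        values = sorted({row[f] for row in X})    # steps 2-3: sorted values
        for a, b in zip(values, values[1:]):
            t = (a + b) / 2                       # step 3: midpoint split point
            left = [y[i] for i in range(n) if X[i][f] <= t]   # step 4
            right = [y[i] for i in range(n) if X[i][f] > t]
            # step 5: leaf entropies, weighted by each leaf's share of instances
            e = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
            if e < best[2]:                       # step 6: keep the lowest entropy
                best = (f, t, e)
    return best

X = [[2.0], [3.0], [10.0], [11.0]]
y = ["A", "A", "B", "B"]
print(best_split(X, y))  # (0, 6.5, 0.0) -- a perfect split
```

Step 7 would then call this routine recursively on the left and right subsets until a stopping criterion (such as a depth limit or a pure leaf) is reached.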