1. Input Data Points
The algorithm starts with your training data (X, y), where X holds the input feature values and y holds the numeric target for each point. Together, these points represent the relationship we want to model.
2. Find Best Split
The algorithm tries candidate thresholds on the feature (X). Each threshold splits the data into two groups, and the algorithm keeps the threshold that gives the lowest combined variance of the target (y) across the two groups.
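The threshold search can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the names `variance` and `best_split` are hypothetical, candidate thresholds are assumed to be the midpoints between consecutive distinct X values, and points with X at or below the threshold are assumed to go left.

```python
def variance(values):
    """Population variance of a list of numbers (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def best_split(X, y):
    """Hypothetical helper: return (threshold, weighted_variance) of the best split.

    Returns (None, variance(y)) when no threshold improves on the unsplit data.
    """
    best_threshold, best_score = None, variance(y)
    xs = sorted(set(X))
    # Assumption: candidates are midpoints between consecutive distinct X values.
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2
        left = [yi for xi, yi in zip(X, y) if xi <= threshold]
        right = [yi for xi, yi in zip(X, y) if xi > threshold]
        # Weight each group's variance by its share of the points.
        score = (len(left) * variance(left) + len(right) * variance(right)) / len(y)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Toy data: two clusters with clearly different target levels.
X = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
y = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
threshold, score = best_split(X, y)
print(threshold, score)  # the chosen threshold is 6.5, between the two clusters
```

Scoring by size-weighted variance is equivalent to minimizing the total squared error of predicting each group's mean, which is why the gap between the two clusters wins here.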
3. Split Data
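Data is split into left and right groups based on the chosen threshold (by the usual convention, points with X at or below the threshold go left and the rest go right). Taken together, the two groups are more homogeneous than the data before the split: their size-weighted variance is no higher than the original variance, although an individual group can still be noisier than its parent.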
4. Recursively Build Subtrees
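The algorithm repeats the splitting process on each group until a stopping condition is met, such as reaching the maximum depth or a group's variance falling below a minimum threshold. Each group that is no longer split becomes a leaf, and the result is a tree structure.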
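The recursion might look roughly like the sketch below. It assumes a single feature, represents the tree as nested dictionaries, and uses the hypothetical names `build_tree`, `max_depth`, and `min_variance`; none of these come from a specific library.

```python
def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared deviations from the mean (variance times group size)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(X, y, depth=0, max_depth=2, min_variance=1e-6):
    """Hypothetical sketch: recursively build a regression tree as nested dicts."""
    # Stop when the depth limit is reached or the group is already nearly constant.
    if depth >= max_depth or sse(y) / len(y) <= min_variance:
        return {"value": mean(y)}

    best = None
    xs = sorted(set(X))
    # Assumption: candidates are midpoints between consecutive distinct X values.
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [(xi, yi) for xi, yi in zip(X, y) if xi <= t]
        right = [(xi, yi) for xi, yi in zip(X, y) if xi > t]
        score = sse([v for _, v in left]) + sse([v for _, v in right])
        if best is None or score < best[0]:
            best = (score, t, left, right)

    # No candidate threshold exists when every X value is identical.
    if best is None:
        return {"value": mean(y)}

    _, t, left, right = best
    return {
        "threshold": t,
        "left": build_tree([x for x, _ in left], [v for _, v in left],
                           depth + 1, max_depth, min_variance),
        "right": build_tree([x for x, _ in right], [v for _, v in right],
                            depth + 1, max_depth, min_variance),
    }


# Toy data: two flat clusters, so one split is enough.
tree = build_tree([1, 2, 3, 10, 11, 12], [1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
print(tree)
```

On this toy data the root splits at 6.5 and both children become leaves immediately, because each group's variance is already zero.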
5. Make Predictions
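To predict a new point, the tree is traversed from root to leaf by comparing the input to each node's threshold and following the matching branch. The prediction is the leaf's value: the average of the training targets (y) that ended up in that leaf.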
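Traversal is the simplest part to sketch. The example below assumes a tree stored as nested dictionaries, where internal nodes carry a "threshold" with "left" and "right" children and leaves carry a "value". The tree is hand-written for illustration, and the `predict` name and the dictionary layout are assumptions rather than a fixed format.

```python
def predict(node, x):
    """Walk from the root to a leaf and return the leaf's stored average."""
    while "value" not in node:
        # Assumption: inputs at or below the threshold follow the left branch.
        node = node["left"] if x <= node["threshold"] else node["right"]
    return node["value"]


# Hand-built example tree (illustrative values, not fitted to real data).
tree = {
    "threshold": 6.5,
    "left": {
        "threshold": 2.5,
        "left": {"value": 1.1},
        "right": {"value": 0.8},
    },
    "right": {"value": 5.0},
}

print(predict(tree, 2.0))   # 1.1
print(predict(tree, 3.0))   # 0.8
print(predict(tree, 11.0))  # 5.0
```

Because every input in a leaf's range gets the same stored average, the model's output is a step function of X.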