I like how you talk about Conditional Inference. My thesis is supposed to overcome the brute-force exhaustive search for best splits that random forest does (I use Dr. Loh's GUIDE trees) by using statistical methods instead.
> Many implementations of random forest default to 1/3 of your predictor variables.
This is interesting. I had heard it was sqrt(total number of predictors).
> Ensemble methods combine many individual trees to create one better, more stable model.
I think "stable" could be clarified as having good training accuracy and low generalization error (the error rate on unseen data) compared to an individual tree. This is what Dr. Ho talks about with forests.
But other than that I think it's an awesome tutorial.
One thing I've seen other tree and forest methods do for better generalization on unseen data is pruning with cross-validation, choosing a cutoff of 0.5 to 1.0 standard errors from the best score. That may be worth covering if you're interested.
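The pruning idea above can be sketched with scikit-learn's cost-complexity pruning (an assumption on my part; the comment doesn't name a library). The sketch cross-validates over candidate pruning strengths and then applies a 1-standard-error rule: keep the simplest tree whose CV score is within one standard error of the best.

```python
# Hedged sketch, not any particular author's method: cost-complexity
# pruning with cross-validation and a 1-SE cutoff.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Candidate pruning strengths come from the cost-complexity path of a
# fully grown tree; drop the last alpha, which prunes down to the root.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X, y)
alphas = path.ccp_alphas[:-1]

# Cross-validated accuracy (mean and standard error) for each alpha.
means, sems = [], []
for a in alphas:
    scores = cross_val_score(
        DecisionTreeClassifier(ccp_alpha=a, random_state=0), X, y, cv=5)
    means.append(scores.mean())
    sems.append(scores.std(ddof=1) / np.sqrt(len(scores)))

means, sems = np.array(means), np.array(sems)
best = means.argmax()

# 1-SE rule: the largest alpha (i.e., the simplest tree) whose score is
# still within one standard error of the best cross-validated score.
within_1se = means >= means[best] - sems[best]
alpha_1se = alphas[within_1se].max()
print(f"best alpha={alphas[best]:.5f}, 1-SE alpha={alpha_1se:.5f}")
```

A 0.5-SE cutoff, as mentioned above, would just replace `sems[best]` with `0.5 * sems[best]`; the tradeoff is a slightly less simple tree in exchange for a score closer to the CV optimum.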
Thank you for the useful feedback! I'll have to look up GUIDE trees.
> This is interesting. I had heard it was sqrt(total number of predictors).
I was probably looking at the randomForest R package documentation [1], which says:
> mtry Number of variables randomly sampled as candidates at each split. Note that the default values are different for classification (sqrt(p) where p is number of variables in x) and regression (p/3)
I checked the H2O implementation of random forest [2] and they use the same defaults.
I'll add a note about the one third default being specific to regression since that seems like an important distinction.
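For reference, the same split-candidate defaults can be set explicitly in scikit-learn (a sketch of mine, not something from the thread; scikit-learn's own defaults differ from the R package's), mirroring randomForest's documented sqrt(p) for classification and p/3 for regression:

```python
# Mirror the randomForest R defaults via max_features in scikit-learn.
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

Xc, yc = make_classification(n_samples=200, n_features=9, random_state=0)
# Classification: sqrt(p) candidate predictors per split.
clf = RandomForestClassifier(max_features="sqrt", random_state=0).fit(Xc, yc)

Xr, yr = make_regression(n_samples=200, n_features=9, random_state=0)
# Regression: max_features accepts a fraction, so 1/3 of the predictors.
reg = RandomForestRegressor(max_features=1 / 3, random_state=0).fit(Xr, yr)

# With p = 9 predictors, both end up considering 3 candidates per split.
print(clf.max_features, reg.max_features)
```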