May 26'23
For a random forest, let p be the total number of features and m be the number of features selected at each split.
Determine which of the following statements is/are true.
- I. When [math]m = p[/math], random forest and bagging are the same procedure.
- II. [math]\frac{p-m}{p}[/math] is the probability a split will not consider the strongest predictor.
- III. The typical choice of [math]m[/math] is [math]\frac{p}{2}[/math].
- (A) None
- (B) I and II only
- (C) I and III only
- (D) II and III only
- (E) The correct answer is not given by (A), (B), (C), or (D).
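The expression [math]\frac{p-m}{p}[/math] in the exercise above can be checked numerically. The following is a minimal sketch, assuming the [math]m[/math] candidate features at a split are drawn uniformly at random without replacement from the [math]p[/math] features. The function names, the values [math]p = 12[/math] and [math]m = 4[/math], and the index used for the "strongest" predictor are all hypothetical choices made for illustration.

```python
from fractions import Fraction
from math import comb
import random


def prob_strongest_excluded(p: int, m: int) -> Fraction:
    """Exact probability that a uniformly drawn m-subset of p features
    omits one fixed feature (here, the strongest predictor)."""
    # Subsets that avoid the fixed feature choose all m from the other p - 1.
    return Fraction(comb(p - 1, m), comb(p, m))


def simulate_excluded(p: int, m: int, n_splits: int = 20_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    strongest = 0  # hypothetical index of the strongest predictor
    misses = sum(
        strongest not in rng.sample(range(p), m) for _ in range(n_splits)
    )
    return misses / n_splits


p, m = 12, 4  # illustrative values, not taken from the exercise
exact = prob_strongest_excluded(p, m)
estimate = simulate_excluded(p, m)
print("exact:", exact, "simulated:", round(estimate, 3))
```

Under the uniform-sampling assumption, the counting argument gives [math]\binom{p-1}{m} / \binom{p}{m} = \frac{p-m}{p}[/math], and the simulated frequency should land close to the exact value.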
May 26'23
Determine which of the following statements about random forests is/are true.
- I. If the number of predictors used at each split is equal to the total number of available predictors, the result is the same as using bagging.
- II. When building a specific tree, the same subset of predictor variables is used at each split.
- III. Random forests are an improvement over bagging because the trees are decorrelated.
- (A) None
- (B) I and II only
- (C) I and III only
- (D) II and III only
- (E) The correct answer is not given by (A), (B), (C), or (D).
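How the candidate predictors are chosen at each split can be sketched in a few lines. This is an illustration only, assuming the standard description of a random forest in which a fresh random subset of [math]m[/math] predictors is drawn at every split; it does not build actual trees. The helper `candidates_per_split` and the values used are hypothetical.

```python
import random


def candidates_per_split(p: int, m: int, n_splits: int, seed: int = 0) -> list[list[int]]:
    """Draw a fresh random subset of m predictor indices for each split."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(p), m)) for _ in range(n_splits)]


p = 9  # illustrative number of predictors

# m < p: each split sees its own randomly drawn subset of predictors.
forest_like = candidates_per_split(p, m=3, n_splits=5)

# m = p: every split sees all predictors, as in bagging.
bagging_like = candidates_per_split(p, m=p, n_splits=5)

for i, subset in enumerate(forest_like, start=1):
    print(f"split {i}: candidates {subset}")
print("with m = p:", bagging_like[0])
```

With [math]m = p[/math] the random draw has no effect, because the only possible subset is the full set of predictors.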
May 26'23
Determine which of the following statements regarding statistical learning methods is/are true.
- I. Methods that are highly interpretable are more likely to be highly flexible.
- II. When inference is the goal, there are clear advantages to using a lasso method versus a bagging method.
- III. Using a more flexible method will produce a more accurate prediction against unseen data.
- (A) I only
- (B) II only
- (C) III only
- (D) I, II and III
- (E) The correct answer is not given by (A), (B), (C), or (D).
May 26'23
You are given a dataset with two variables, which is graphed below. You want to predict y using x.
Determine which statement regarding using a generalized linear model (GLM) or a random forest is true.
- A random forest is appropriate because the dataset contains only quantitative variables.
- A random forest is appropriate because the data does not follow a straight line.
- A GLM is not appropriate because the variance of y given x is not constant.
- A random forest is appropriate because there is a clear relationship between y and x.
- A GLM is appropriate because it can accommodate polynomial relationships.
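The option about polynomial relationships rests on the fact that a GLM is linear in its coefficients, not necessarily in [math]x[/math], so powers of [math]x[/math] can enter as predictors. The sketch below illustrates this for the simplest special case, a Gaussian response with identity link (ordinary least squares), fitting [math]y = b_0 + b_1 x + b_2 x^2[/math]. The data are made up and noiseless, since the graph from the exercise is not available here, and `fit_quadratic` is a hypothetical helper.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    # Design-matrix rows [1, x, x^2]: the model is linear in b0, b1, b2.
    rows = [[1.0, x, x * x] for x in xs]
    k = 3
    # Augmented normal equations (X^T X | X^T y).
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Made-up, noiseless data generated from y = 1 + 2x + 3x^2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1.0 + 2.0 * x + 3.0 * x * x for x in xs]

coeffs = fit_quadratic(xs, ys)
print([round(c, 6) for c in coeffs])
```

Whether a GLM or a random forest suits the dataset in the exercise still depends on the graph; the sketch only shows that a curved relationship does not by itself rule out a linear-in-coefficients model.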
Apr 24'24
Determine which of the following statements is true.
- (A) Linear regression is a flexible approach.
- (B) Lasso is more flexible than a linear regression approach.
- (C) Bagging is a low flexibility approach.
- (D) There are methods that have high flexibility and are also easy to interpret.
- (E) None of (A), (B), (C), or (D) are true.