May 26'23

For a random forest, let [math]p[/math] be the total number of features and [math]m[/math] be the number of features selected at each split.

Determine which of the following statements is/are true.

  • I. When [math]m = p[/math], random forest and bagging are the same procedure.
  • II. [math]\frac{p-m}{p}[/math] is the probability that a split will not consider the strongest predictor.
  • III. The typical choice of [math]m[/math] is [math]\frac{p}{2}[/math].
  • (A) None
  • (B) I and II only
  • (C) I and III only
  • (D) II and III only
  • (E) The correct answer is not given by (A), (B), (C), or (D).

Copyright 2023. The Society of Actuaries, Schaumburg, Illinois. Reproduced with permission.
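
The quantity [math]\frac{p-m}{p}[/math] in the second statement can be checked by counting. The sketch below is not part of the original exercise. It assumes that each split draws its [math]m[/math] candidate features uniformly at random without replacement from the [math]p[/math] available features; the function names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def prob_feature_excluded(p: int, m: int) -> Fraction:
    """Probability that a uniformly random size-m subset of p features
    omits one fixed feature (assumed sampling scheme, see lead-in)."""
    if not 0 < m <= p:
        raise ValueError("need 0 < m <= p")
    # Subsets avoiding the fixed feature: choose all m from the other p - 1.
    return Fraction(comb(p - 1, m), comb(p, m))


def prob_by_enumeration(p: int, m: int, target: int = 0) -> Fraction:
    """Same probability, obtained by listing every size-m subset."""
    subsets = list(combinations(range(p), m))
    excluded = sum(1 for s in subsets if target not in s)
    return Fraction(excluded, len(subsets))


if __name__ == "__main__":
    for p, m in [(9, 3), (6, 2), (10, 10)]:
        # The closed form and the (p - m) / p expression are printed side by side.
        print(p, m, prob_feature_excluded(p, m), Fraction(p - m, p))
```

Under this sampling assumption the counting argument gives [math]\binom{p-1}{m} / \binom{p}{m} = \frac{p-m}{p}[/math], which is zero when [math]m = p[/math] because every feature is then a candidate at every split.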

May 26'23

Determine which of the following statements about random forests is/are true.

  • I. If the number of predictors used at each split is equal to the total number of available predictors, the result is the same as using bagging.
  • II. When building a specific tree, the same subset of predictor variables is used at each split.
  • III. Random forests are an improvement over bagging because the trees are decorrelated.
  • (A) None
  • (B) I and II only
  • (C) I and III only
  • (D) II and III only
  • (E) The correct answer is not given by (A), (B), (C), or (D).

Copyright 2023. The Society of Actuaries, Schaumburg, Illinois. Reproduced with permission.

May 26'23

Determine which of the following statements regarding statistical learning methods is/are true.

  • I. Methods that are highly interpretable are more likely to be highly flexible.
  • II. When inference is the goal, there are clear advantages to using a lasso method versus a bagging method.
  • III. Using a more flexible method will produce a more accurate prediction against unseen data.
  • (A) I only
  • (B) II only
  • (C) III only
  • (D) I, II and III
  • (E) The correct answer is not given by (A), (B), (C), or (D).

Copyright 2023. The Society of Actuaries, Schaumburg, Illinois. Reproduced with permission.

May 26'23

You are given a dataset with two variables, which is graphed below. You want to predict y using x.

Determine which statement regarding using a generalized linear model (GLM) or a random forest is true.

  • (A) A random forest is appropriate because the dataset contains only quantitative variables.
  • (B) A random forest is appropriate because the data does not follow a straight line.
  • (C) A GLM is not appropriate because the variance of y given x is not constant.
  • (D) A random forest is appropriate because there is a clear relationship between y and x.
  • (E) A GLM is appropriate because it can accommodate polynomial relationships.

Copyright 2023. The Society of Actuaries, Schaumburg, Illinois. Reproduced with permission.
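
The last answer choice mentions polynomial relationships in a GLM. As background only, the sketch below shows how a polynomial in x can be fitted by a model that is still linear in its coefficients. It is not part of the original exercise: the data are synthetic and noise-free, invented here because the exercise's graph is not reproduced, and `fit_polynomial_ls` is a hypothetical helper name. It uses ordinary least squares, which corresponds to a GLM with a normal response and identity link.

```python
import numpy as np


def fit_polynomial_ls(x, y, degree):
    """Least-squares fit of y on the columns 1, x, x^2, ..., x^degree.

    The model is linear in the coefficients even though it is
    polynomial in x. Returns coefficients in increasing order of power.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with columns x^0, x^1, ..., x^degree.
    design = np.vander(x, degree + 1, increasing=True)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


if __name__ == "__main__":
    # Synthetic data (an assumption for illustration, not the exercise's dataset).
    x = np.linspace(-3.0, 3.0, 25)
    y = 1.0 + 2.0 * x - 0.5 * x**2
    print(fit_polynomial_ls(x, y, degree=2))
```

The design choice here is to add powers of x as extra columns of the design matrix, which leaves the fitting procedure unchanged.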

Apr 24'24

Determine which of the following statements is true.

  • (A) Linear regression is a flexible approach.
  • (B) Lasso is more flexible than a linear regression approach.
  • (C) Bagging is a low flexibility approach.
  • (D) There are methods that have high flexibility and are also easy to interpret.
  • (E) None of (A), (B), (C), or (D) are true.

Copyright 2023. The Society of Actuaries, Schaumburg, Illinois. Reproduced with permission.