May 26'23

Exercise

For a random forest, let p be the total number of features and m be the number of features selected at each split.

Determine which of the following statements is/are true.

  • I. When [math]m = p[/math], random forest and bagging are the same procedure.
  • II. [math]\frac{p-m}{p}[/math] is the probability that a split will not consider the strongest predictor.
  • III. The typical choice of [math]m[/math] is [math]\frac{p}{2}[/math].

  • (A) None
  • (B) I and II only
  • (C) I and III only
  • (D) II and III only
  • (E) The correct answer is not given by (A), (B), (C), or (D).

Copyright 2023. The Society of Actuaries, Schaumburg, Illinois. Reproduced with permission.

Key: B

I is true. A random forest differs from bagging only in restricting each split to m < p randomly selected features, so when m = p the two procedures coincide.

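II is true. Each split considers m of the p features, selected at random, so p − m features are left out. The strongest predictor is among those left out with probability (p − m)/p.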
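
The probability in statement II can be checked by counting subsets. The following is a minimal sketch, assuming the m features at each split are drawn uniformly at random without replacement; the function names and the choice of index 0 for the strongest predictor are hypothetical, introduced only for illustration.

```python
from fractions import Fraction
from math import comb
import random


def prob_excluded_exact(p: int, m: int) -> Fraction:
    """Exact probability that one fixed feature is absent from a
    uniformly random m-subset of p features."""
    # Subsets that avoid the fixed feature draw all m from the other p - 1.
    return Fraction(comb(p - 1, m), comb(p, m))


def prob_excluded_simulated(p: int, m: int, trials: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    strongest = 0  # hypothetical index of the strongest predictor
    misses = sum(strongest not in rng.sample(range(p), m) for _ in range(trials))
    return misses / trials


if __name__ == "__main__":
    p, m = 10, 3
    print(prob_excluded_exact(p, m))      # exact value as a fraction
    print(Fraction(p - m, p))             # (p - m) / p for comparison
    print(prob_excluded_simulated(p, m))  # simulation estimate
```

The counting argument gives C(p − 1, m) / C(p, m), which simplifies to (p − m)/p. With m = p it returns 0, matching statement I: the strongest predictor is always available, as in bagging.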

III is false. Typical choices of m are the square root of p or p/3, not p/2.
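
To see how far these choices sit from p/2, the sketch below tabulates all three for a given p. It is illustrative only: the function name is hypothetical, and rounding down to an integer (with a floor of 1) is an assumption, since the solution does not state a rounding convention. The square root of p is commonly associated with classification and p/3 with regression.

```python
from math import isqrt


def candidate_m(p: int) -> dict:
    """Candidate values of m for p features, rounded down (assumed convention)."""
    return {
        "sqrt_p": max(1, isqrt(p)),    # square root of p
        "p_over_3": max(1, p // 3),    # p / 3
        "p_over_2": max(1, p // 2),    # the value claimed in statement III
    }


if __name__ == "__main__":
    for p in (9, 16, 100):
        print(p, candidate_m(p))
```

For p = 100 this gives 10 and 33 for the typical choices against 50 for p/2, so statement III overstates the usual value of m.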

