By Admin
May 25'23

Exercise

Determine which of the following statements about selecting the optimal number of clusters in K-means clustering is/are true.

  I. K should be set equal to n, the number of observations.
  II. Choose K such that the total within-cluster variation is minimized.
  III. The determination of K is subjective and there does not exist one method to determine the optimal number of clusters.

  (A) I only
  (B) II only
  (C) III only
  (D) I, II and III
  (E) The correct answer is not given by (A), (B), (C), or (D).

Copyright 2023. The Society of Actuaries, Schaumburg, Illinois. Reproduced with permission.

By Admin
May 26'23

Key: C

I is false. Setting K = n puts every observation in its own cluster, which will almost certainly overfit, as there are unlikely to be that many true clusters.

II is false. Total within-cluster variation is minimized (at zero) when K = n, since each observation is then its own cluster; as noted above, that choice is unlikely to be optimal.

III is true. There is no exact method for determining the optimal value of K.
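
The reasoning for statements I and II can be checked numerically. The following is a minimal sketch, not part of the original solution: it assumes a plain Lloyd's-algorithm K-means written with the standard library only, a hypothetical helper named `kmeans_wcss`, and synthetic data drawn around three made-up centers. It measures variation as the sum of squared distances from each point to its nearest centroid, which is proportional to the usual within-cluster variation. The value shrinks as K grows and hits zero at K = n, which is why minimizing it cannot be used to pick K.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_wcss(points, k, n_iter=50, seed=0):
    """Run Lloyd's algorithm and return the sum of squared distances
    from each point to its nearest final centroid (illustrative helper)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k observations as starting centroids
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(c) / len(members) for c in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


# Synthetic data (an assumption for illustration): 10 points around each
# of three made-up centers, so the "true" number of clusters is 3.
data_rng = random.Random(42)
centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
points = [
    (cx + data_rng.gauss(0, 0.5), cy + data_rng.gauss(0, 0.5))
    for cx, cy in centers
    for _ in range(10)
]
n = len(points)

wcss = {k: kmeans_wcss(points, k) for k in (1, 2, 3, 5, 10, n)}
for k, value in wcss.items():
    print(f"K = {k:2d}  within-cluster sum of squares = {value:.3f}")
```

In practice, heuristics such as looking for an "elbow" in this curve are used to choose K, but they require judgment, which is consistent with statement III.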

