By Admin
May 25'23
Exercise
You apply 2-means clustering to a set of five observations with two features. You are given the following initial cluster assignments:
Observation | X1 | X2 | Initial cluster |
---|---|---|---|
1 | 1 | 3 | 1 |
2 | 0 | 4 | 1 |
3 | 6 | 2 | 1 |
4 | 5 | 2 | 2 |
5 | 1 | 6 | 2 |
Calculate the total within-cluster variation of the initial cluster assignments, based on the Euclidean distance measure.
- (A) 32.0
- (B) 70.3
- (C) 77.3
- (D) 118.3
- (E) 141.0
By Admin
May 26'23
Key: C
The means for cluster 1 are (1 + 0 + 6)/3 = 2.3333 for X1 and (3 + 4 + 2)/3 = 3 for X2 and the variation is
$(1 - 2.3333)^2 + (3 - 3)^2 + (0 - 2.3333)^2 + (4 - 3)^2 + (6 - 2.3333)^2 + (2 - 3)^2 = 22.6667$.
The means for cluster 2 are (5 + 1)/2 = 3 for X1 and (2 + 6)/2 = 4 for X2 and the variation is
$(5 - 3)^2 + (2 - 4)^2 + (1 - 3)^2 + (6 - 4)^2 = 16$.
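By equation (10.12) in the first edition of ISLR, the within-cluster variation of a cluster (the sum of pairwise squared Euclidean distances divided by the cluster size) equals twice the sum of squared deviations from the cluster means, so the total within-cluster variation is
2(22.6667 + 16) = 77.33, which matches the third answer choice (77.3).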
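
The arithmetic can be checked numerically. The following is a minimal Python sketch, not part of the original solution. It computes the total within-cluster variation two ways: as twice the sum of squared deviations from the cluster means, and directly from the pairwise definition. The names `clusters`, `centroid_ss`, and `pairwise_variation` are hypothetical and chosen only for illustration.

```python
# Observations (X1, X2) grouped by their initial cluster, from the exercise table.
clusters = {
    1: [(1, 3), (0, 4), (6, 2)],
    2: [(5, 2), (1, 6)],
}


def centroid_ss(points):
    """Sum of squared deviations of each point from the cluster mean."""
    n = len(points)
    dims = range(len(points[0]))
    means = [sum(p[j] for p in points) / n for j in dims]
    return sum((p[j] - means[j]) ** 2 for p in points for j in dims)


def pairwise_variation(points):
    """Sum of squared Euclidean distances over all ordered pairs,
    divided by the number of points in the cluster."""
    n = len(points)
    dims = range(len(points[0]))
    total = sum((a[j] - b[j]) ** 2 for a in points for b in points for j in dims)
    return total / n


# Twice the centroid-based sums of squares, summed over clusters.
total_centroid = 2 * sum(centroid_ss(pts) for pts in clusters.values())

# The same quantity computed directly from pairwise distances.
total_pairwise = sum(pairwise_variation(pts) for pts in clusters.values())

print(round(total_centroid, 2), round(total_pairwise, 2))
```

Both forms should agree up to floating-point rounding, which illustrates why the factor of 2 appears in the final step.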