May 25'23
Exercise
You are given a set of n observations, each with p features. Determine which of the following statements is/are true with respect to clustering methods.
- I. The n observations can be clustered on the basis of the p features to identify subgroups among the observations.
- II. The p features can be clustered on the basis of the n observations to identify subgroups among the features.
- III. Clustering is an unsupervised learning method and is often performed as part of an exploratory data analysis.

- (A) None
- (B) I and II only
- (C) I and III only
- (D) II and III only
- (E) The correct answer is not given by (A), (B), (C), or (D).
May 26'23
Key: E
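I and II are both true because the roles of rows and columns can be reversed: the same clustering algorithm can be applied to the rows of the n × p data matrix (the observations) or, after transposing it, to the columns (the features). (See Section 10.3 of An Introduction to Statistical Learning.)
III is true. Clustering is unsupervised learning because there is no dependent (target) variable. It can be used in exploratory data analysis to learn about relationships between observations or features.
Because all three statements are true, none of (A) through (D) applies, so the answer is (E).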
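
To make the row/column symmetry in statements I and II concrete, here is a minimal sketch in plain Python. It is not taken from the exercise or the textbook: the toy matrix `X`, the helper name `agglomerate`, the choice of single-linkage hierarchical clustering, and the use of Euclidean distance for both rows and columns are all assumptions made for illustration. The same function clusters the observations when given the rows and the features when given the transposed matrix, and no target variable is involved at any point.

```python
from itertools import combinations
from math import dist


def agglomerate(points, k):
    """Single-linkage agglomerative clustering (illustrative helper).

    Repeatedly merges the two closest clusters until k remain,
    then returns one integer label per input point.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        # Pick the pair of clusters whose closest members are nearest.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: min(
                dist(points[i], points[j])
                for i in clusters[ab[0]]
                for j in clusters[ab[1]]
            ),
        )
        merged = clusters.pop(b)  # b > a, so index a is still valid
        clusters[a].extend(merged)
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Hypothetical toy data: n = 6 observations (rows), p = 4 features (columns).
X = [
    [1.0, 1.1, 5.0, 5.2],
    [1.2, 0.9, 5.1, 4.9],
    [0.9, 1.0, 4.8, 5.0],
    [8.0, 8.1, 2.0, 2.1],
    [8.2, 7.9, 2.2, 1.9],
    [7.9, 8.0, 1.9, 2.0],
]

# Statement I: cluster the n observations using their p feature values.
row_labels = agglomerate(X, k=2)

# Statement II: transpose, then cluster the p features using their n values.
X_transposed = [list(column) for column in zip(*X)]
feature_labels = agglomerate(X_transposed, k=2)

print(row_labels)      # [0, 0, 0, 1, 1, 1]
print(feature_labels)  # [0, 0, 1, 1]
```

In practice the features are often standardized first, or compared with a correlation-based dissimilarity instead of Euclidean distance; the sketch skips that step to keep the transposition idea in focus.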