Exercise
By Admin
May 26'23
Answer
Key: E
The total Gini index for Split 1 is
2[20(12/20)(8/20) + 80(18/80)(62/80)]/100 = 0.375
and for Split 2 is
2[10(8/10)(2/10) + 90(22/90)(68/90)]/100 = 0.3644.
Smaller is better, so Split 2 is preferred. The factor of 2 arises because, with only two classes, a node with class proportions p and 1 - p has Gini index p(1 - p) + (1 - p)p = 2p(1 - p), the sum of two identical terms.
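The Gini arithmetic above can be checked with a short Python sketch. The function name `weighted_gini` and the representation of each split as a list of per-node class counts are illustrative assumptions, not part of the original solution; the counts themselves are taken from the formulas above.

```python
def weighted_gini(nodes):
    """Size-weighted Gini index of a split.

    `nodes` is a list of per-node class counts (hypothetical layout),
    e.g. [(12, 8), (18, 62)] for Split 1.
    """
    total = sum(sum(counts) for counts in nodes)
    result = 0.0
    for counts in nodes:
        n = sum(counts)
        # Gini impurity of one node: 1 - sum of squared class proportions.
        # With two classes this equals 2 * p * (1 - p).
        impurity = 1.0 - sum((c / n) ** 2 for c in counts)
        result += (n / total) * impurity
    return result


split_1 = [(12, 8), (18, 62)]   # nodes of size 20 and 80
split_2 = [(8, 2), (22, 68)]    # nodes of size 10 and 90

gini_1 = weighted_gini(split_1)
gini_2 = weighted_gini(split_2)
print(f"{gini_1:.4f} {gini_2:.4f}")  # prints "0.3750 0.3644"
```

Writing the impurity as 1 minus the sum of squared proportions keeps the sketch valid for more than two classes, where the factor-of-2 shortcut no longer applies.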
The total entropy for Split 1 is
-[20(12/20)ln(12/20) + 20(8/20)ln(8/20) + 80(18/80)ln(18/80) + 80(62/80)ln(62/80)]/100 = 0.5611
and for Split 2 is
-[10(8/10)ln(8/10) + 10(2/10)ln(2/10) + 90(22/90)ln(22/90) + 90(68/90)ln(68/90)]/100 = 0.5506.
Smaller is better, so Split 2 is preferred.
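As a check on the entropy figures, here is a minimal Python sketch using natural logarithms, as in the formulas above. The name `weighted_entropy` and the per-node count tuples are illustrative assumptions rather than anything defined in the original solution.

```python
import math


def weighted_entropy(nodes):
    """Size-weighted entropy (natural log) of a split.

    `nodes` is a list of per-node class counts (hypothetical layout).
    """
    total = sum(sum(counts) for counts in nodes)
    result = 0.0
    for counts in nodes:
        n = sum(counts)
        # Entropy of one node: -sum p * ln(p); empty classes contribute 0.
        node_entropy = -sum((c / n) * math.log(c / n) for c in counts if c > 0)
        result += (n / total) * node_entropy
    return result


entropy_1 = weighted_entropy([(12, 8), (18, 62)])  # Split 1
entropy_2 = weighted_entropy([(8, 2), (22, 68)])   # Split 2
print(f"{entropy_1:.4f} {entropy_2:.4f}")  # prints "0.5611 0.5506"
```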
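For classification error, each node predicts its majority class, so the minority observations are misclassified. For Split 1, there are 8 + 18 = 26 errors and for Split 2 there are 2 + 22 = 24 errors. With fewer errors, Split 2 is preferred.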
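The error counts can be reproduced the same way. This sketch assumes each node predicts its majority class, so every other observation in the node counts as an error; the helper name `misclassified` is hypothetical.

```python
def misclassified(nodes):
    """Total errors when each node predicts its majority class.

    `nodes` is a list of per-node class counts (hypothetical layout).
    """
    # In each node, everything outside the largest class is an error.
    return sum(sum(counts) - max(counts) for counts in nodes)


errors_1 = misclassified([(12, 8), (18, 62)])  # Split 1: 8 + 18
errors_2 = misclassified([(8, 2), (22, 68)])   # Split 2: 2 + 22
print(errors_1, errors_2)  # prints "26 24"
```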