
Exercise


By Admin
May 26, 2023

Answer

Key: E

The total Gini index for Split 1 is

2[20(12/20)(8/20) + 80(18/80)(62/80)]/100 = 0.375 

and for Split 2 is

2[10(8/10)(2/10) + 90(22/90)(68/90)]/100 = 0.3644. 

Smaller is better, so Split 2 is preferred. The factor of 2 arises because, with only two classes, each node's Gini index is the sum of two identical terms, p(1 – p) and (1 – p)p.
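The Gini arithmetic above can be checked with a short script. This is a minimal sketch, not part of the original solution: the function name `weighted_gini` is hypothetical, and the class counts per node, (12, 8) and (18, 62) for Split 1 and (8, 2) and (22, 68) for Split 2, are read off the formulas above. It uses the general form 1 minus the sum of squared class proportions, which reduces to 2p(1 – p) when there are two classes.

```python
def weighted_gini(nodes):
    """Size-weighted Gini index over the child nodes of a split.

    `nodes` is a list of tuples of class counts, one tuple per child node.
    """
    total = sum(sum(counts) for counts in nodes)
    result = 0.0
    for counts in nodes:
        n = sum(counts)
        # Gini index of one node: 1 - sum of squared class proportions.
        impurity = 1.0 - sum((c / n) ** 2 for c in counts)
        # Weight each node by its number of observations.
        result += n * impurity
    return result / total


# Class counts per node, taken from the formulas in the answer.
split1 = [(12, 8), (18, 62)]
split2 = [(8, 2), (22, 68)]

gini1 = weighted_gini(split1)
gini2 = weighted_gini(split2)
print(round(gini1, 4), round(gini2, 4))
```

Rounded to four decimals, the two values should agree with the 0.375 and 0.3644 quoted above.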

The total entropy for Split 1 is

 –[20(12/20)ln(12/20) + 20(8/20)ln(8/20) + 80(18/80)ln(18/80) + 80(62/80)ln(62/80)]/100 = 0.5611

and for Split 2 is

 –[10(8/10)ln(8/10) + 10(2/10)ln(2/10) + 90(22/90)ln(22/90) + 90(68/90)ln(68/90)]/100 = 0.5506.

Smaller is better, so Split 2 is preferred.
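The entropy totals can be verified the same way. The sketch below assumes natural logarithms, as in the formulas above, and reuses the class counts implied by those formulas; the function name `weighted_entropy` is hypothetical.

```python
import math


def weighted_entropy(nodes):
    """Size-weighted entropy (natural log) over the child nodes of a split.

    `nodes` is a list of tuples of class counts, one tuple per child node.
    """
    total = sum(sum(counts) for counts in nodes)
    result = 0.0
    for counts in nodes:
        n = sum(counts)
        # Entropy of one node: -sum of p * ln(p) over the classes present.
        # Classes with zero count are skipped, since p * ln(p) tends to 0.
        node_entropy = -sum((c / n) * math.log(c / n) for c in counts if c > 0)
        # Weight each node by its number of observations.
        result += n * node_entropy
    return result / total


# Class counts per node, taken from the formulas in the answer.
split1 = [(12, 8), (18, 62)]
split2 = [(8, 2), (22, 68)]

entropy1 = weighted_entropy(split1)
entropy2 = weighted_entropy(split2)
print(round(entropy1, 4), round(entropy2, 4))
```

Rounded to four decimals, the results should match the 0.5611 and 0.5506 given above.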

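For classification error, each node predicts its majority class, so the minority-class observations are the errors. For Split 1, there are 8 + 18 = 26 errors and for Split 2 there are 2 + 22 = 24 errors. With fewer errors, Split 2 is preferred.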
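The error counts can also be reproduced in a few lines. This sketch assumes that each node predicts its majority class, so a node's errors are its size minus its largest class count; the function name `misclassified` is hypothetical and the counts are the ones used in the formulas above.

```python
def misclassified(nodes):
    """Total misclassified observations when each node predicts its majority class.

    `nodes` is a list of tuples of class counts, one tuple per child node.
    """
    # Errors in a node = node size minus the count of its most common class.
    return sum(sum(counts) - max(counts) for counts in nodes)


# Class counts per node, taken from the formulas in the answer.
split1 = [(12, 8), (18, 62)]
split2 = [(8, 2), (22, 68)]

errors1 = misclassified(split1)
errors2 = misclassified(split2)
print(errors1, errors2)
```

Both splits contain 100 observations, so comparing error counts is equivalent to comparing error rates.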

Copyright 2023. The Society of Actuaries, Schaumburg, Illinois. Reproduced with permission.
