exercise:21ff64a0e9: Difference between revisions

Latest revision as of 00:48, 14 June 2024

Write a program to compare the strategies play-the-winner and play-the-best-machine for the two-armed bandit problem of Example 4.17. Have your program determine the initial payoff probabilities for each machine by choosing a pair of random numbers between 0 and 1. Have your program carry out 20 plays and keep track of the number of wins for each of the two strategies. Finally, have your program make 1000 repetitions of the 20 plays and compute the average winnings per 20 plays. Which strategy seems to be better? Repeat these simulations with 20 replaced by 100. Does your answer to the above question change?
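
One possible starting point for the simulation is sketched below in Python. It is not a reference solution, and it rests on several assumptions the exercise leaves open: play-the-winner is taken to mean "stay on a machine after a win, switch after a loss"; play-the-best-machine is taken to mean "play the machine with the larger posterior mean payoff probability under a uniform prior, (wins + 1) / (plays + 2)"; ties and the first play are resolved at random; and a fresh pair of payoff probabilities is drawn for every repetition. The function names `play_the_winner`, `play_the_best_machine`, and `simulate` are illustrative only.

```python
import random


def play_the_winner(probs, n_plays, rng):
    """Assumed rule: stay on a machine after a win, switch after a loss."""
    machine = rng.randrange(2)  # assumption: first machine chosen at random
    wins = 0
    for _ in range(n_plays):
        if rng.random() < probs[machine]:
            wins += 1
        else:
            machine = 1 - machine  # switch after a loss
    return wins


def play_the_best_machine(probs, n_plays, rng):
    """Assumed rule: play the machine with the larger posterior mean,
    (wins + 1) / (plays + 2), i.e. a uniform prior on each payoff probability."""
    wins = [0, 0]
    plays = [0, 0]
    total = 0
    for _ in range(n_plays):
        # Compare the two posterior means by cross-multiplying integers,
        # which avoids floating-point issues when detecting ties.
        left = (wins[0] + 1) * (plays[1] + 2)
        right = (wins[1] + 1) * (plays[0] + 2)
        if left == right:
            machine = rng.randrange(2)  # assumption: break ties at random
        else:
            machine = 0 if left > right else 1
        plays[machine] += 1
        if rng.random() < probs[machine]:
            wins[machine] += 1
            total += 1
    return total


def simulate(n_plays=20, n_reps=1000, seed=None):
    """Return the average wins per n_plays as
    (play-the-winner, play-the-best-machine)."""
    rng = random.Random(seed)
    totals = [0, 0]
    for _ in range(n_reps):
        # Assumption: new payoff probabilities are drawn for each repetition.
        probs = (rng.random(), rng.random())
        totals[0] += play_the_winner(probs, n_plays, rng)
        totals[1] += play_the_best_machine(probs, n_plays, rng)
    return totals[0] / n_reps, totals[1] / n_reps


if __name__ == "__main__":
    for n in (20, 100):
        winner_avg, best_avg = simulate(n_plays=n)
        print(f"{n} plays: play-the-winner {winner_avg:.2f}, "
              f"play-the-best-machine {best_avg:.2f}")
```

Running the script prints the two averages for 20 and for 100 plays, which can then be compared to answer the questions in the exercise. Passing a `seed` to `simulate` makes a run repeatable.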

@@ Line 1: / Line 1: @@
-<div class="d-none"><math>
+Write a program to allow you to compare the strategies play-the-winner and play-the-best-machine for the two-armed bandit problem of [[guide:E05b0a84f3#exam 4.17|Example]].  Have your program determine the initial payoff probabilities for
-\newcommand{\NA}{{\rm NA}}
-\newcommand{\mat}[1]{{\bf#1}}
-\newcommand{\exref}[1]{\ref{##1}}
-\newcommand{\secstoprocess}{\all}
-\newcommand{\NA}{{\rm NA}}
-\newcommand{\mathds}{\mathbb}</math></div> Write a program to allow you to compare the strategies play-the-winner
-and play-the-best-machine for the two-armed bandit problem of Example \ref{exam
-4.17}.  Have your program determine the initial payoff probabilities for
 each machine by choosing a pair of random numbers between 0 and 1.  Have your
 program carry out 20 plays and keep track of the number of wins for each of the