exercise:21ff64a0e9: Difference between revisions

From Stochiki
 

Latest revision as of 23:48, 13 June 2024

Write a program to allow you to compare the strategies play-the-winner and play-the-best-machine for the two-armed bandit problem of Example 4.17. Have your program determine the initial payoff probabilities for each machine by choosing a pair of random numbers between 0 and 1. Have your program carry out 20 plays and keep track of the number of wins for each of the two strategies. Finally, have your program make 1000 repetitions of the 20 plays and compute the average winnings per 20 plays for each strategy. Which strategy seems to be better? Repeat these simulations with 20 replaced by 100. Does your answer to the above question change?
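One possible starting point is sketched below in Python. It rests on several assumptions that the exercise statement does not fix: play-the-winner stays on a machine after a win and switches after a loss, starting from a randomly chosen machine; play-the-best-machine starts from a uniform prior on each payoff probability and plays the machine with the larger posterior mean (wins + 1)/(plays + 2), breaking ties at random; and a fresh pair of payoff probabilities is drawn for every repetition. The function names and the `seed` parameter are illustrative only.

```python
import random


def play_the_winner(p, n_plays, rng):
    """Stay on the current machine after a win, switch after a loss."""
    machine = rng.randrange(2)  # assumption: the first machine is picked at random
    wins = 0
    for _ in range(n_plays):
        if rng.random() < p[machine]:
            wins += 1
        else:
            machine = 1 - machine
    return wins


def play_the_best_machine(p, n_plays, rng):
    """Play the machine with the larger posterior mean under a uniform prior."""
    wins = [0, 0]
    losses = [0, 0]
    total = 0
    for _ in range(n_plays):
        # Posterior mean of each payoff probability: (wins + 1) / (plays + 2).
        est = [(wins[i] + 1) / (wins[i] + losses[i] + 2) for i in range(2)]
        if est[0] == est[1]:
            machine = rng.randrange(2)  # assumption: ties are broken at random
        else:
            machine = 0 if est[0] > est[1] else 1
        if rng.random() < p[machine]:
            wins[machine] += 1
            total += 1
        else:
            losses[machine] += 1
    return total


def compare(n_plays=20, n_reps=1000, seed=None):
    """Return the average winnings per n_plays for each strategy."""
    rng = random.Random(seed)
    totals = [0, 0]
    for _ in range(n_reps):
        # Assumption: new payoff probabilities are drawn for every repetition.
        p = (rng.random(), rng.random())
        totals[0] += play_the_winner(p, n_plays, rng)
        totals[1] += play_the_best_machine(p, n_plays, rng)
    return totals[0] / n_reps, totals[1] / n_reps


if __name__ == "__main__":
    for n in (20, 100):
        ptw, ptb = compare(n_plays=n)
        print(f"{n} plays: play-the-winner {ptw:.2f}, "
              f"play-the-best-machine {ptb:.2f}")
```

Both strategies face the same pair of payoff probabilities within a repetition, so the two averages are directly comparable. If the intended reading is a single fixed pair of probabilities for all 1000 repetitions, the draw of `p` can be moved outside the loop in `compare`.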