exercise:21ff64a0e9: Difference between revisions
Latest revision as of 23:48, 13 June 2024
Write a program to compare the strategies play-the-winner and play-the-best-machine for the two-armed bandit problem of Example 4.17. Have your program determine the initial payoff probabilities for each machine by choosing a pair of random numbers between 0 and 1. Have your program carry out 20 plays and keep track of the number of wins for each of the two strategies. Finally, have your program make 1000 repetitions of the 20 plays and compute the average winnings per 20 plays. Which strategy seems to be the better one? Repeat these simulations with 20 replaced by 100. Does your answer to the above question change?
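One possible starting point is sketched below in Python. It is not a reference solution and it rests on several assumptions. The exercise does not restate the strategies, so the sketch takes play-the-winner to mean "stay on a machine after a win, switch after a loss, starting on a randomly chosen machine", and play-the-best-machine to mean "play the machine with the higher posterior probability of being the better one, starting from independent uniform priors on the two payoff probabilities". The function names (`prob_first_is_best`, `play_the_winner`, `play_the_best_machine`, `compare`), the tie-breaking rule, and the choice to draw a fresh pair of payoff probabilities for every repetition are all illustrative choices, not taken from the text.

```python
import math
import random


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def prob_first_is_best(w0, l0, w1, l1):
    """P(p0 > p1) given wins/losses on each machine.

    Assumes independent uniform priors, so the posteriors are
    Beta(w0 + 1, l0 + 1) and Beta(w1 + 1, l1 + 1). Uses a closed-form
    finite sum that is valid for integer Beta parameters.
    """
    a0, b0 = w0 + 1, l0 + 1
    a1, b1 = w1 + 1, l1 + 1
    return sum(
        math.exp(
            log_beta(a1 + i, b1 + b0)
            - math.log(b0 + i)
            - log_beta(1 + i, b0)
            - log_beta(a1, b1)
        )
        for i in range(a0)
    )


def play_the_winner(probs, plays, rng):
    """Stay after a win, switch after a loss (assumed definition)."""
    machine = rng.randrange(2)  # assumption: random first machine
    wins = 0
    for _ in range(plays):
        if rng.random() < probs[machine]:
            wins += 1
        else:
            machine = 1 - machine
    return wins


def play_the_best_machine(probs, plays, rng):
    """Play the machine more likely to be the better one (assumed definition)."""
    wins = [0, 0]
    losses = [0, 0]
    for _ in range(plays):
        p = prob_first_is_best(wins[0], losses[0], wins[1], losses[1])
        machine = 0 if p >= 0.5 else 1  # assumption: ties go to machine 0
        if rng.random() < probs[machine]:
            wins[machine] += 1
        else:
            losses[machine] += 1
    return sum(wins)


def compare(plays=20, repetitions=1000, seed=None):
    """Return average wins per `plays` plays: (play-the-winner, play-the-best-machine)."""
    rng = random.Random(seed)
    total_winner = 0
    total_best = 0
    for _ in range(repetitions):
        # Assumption: new payoff probabilities for every repetition.
        probs = (rng.random(), rng.random())
        total_winner += play_the_winner(probs, plays, rng)
        total_best += play_the_best_machine(probs, plays, rng)
    return total_winner / repetitions, total_best / repetitions
```

Calling `compare(plays=20)` and then `compare(plays=100)` gives the two pairs of averages the exercise asks about. Both strategies face the same payoff probabilities within each repetition, which should reduce the noise in the comparison.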