exercise:C67e5a9a0b: Difference between revisions
Latest revision as of 23:44, 13 June 2024
Consider the two-armed bandit problem of Example. Bruce Barnes proposed the following strategy, which is a variation on the play-the-best-machine strategy. The machine with the greatest probability of winning is played unless the following two conditions hold: (a) the difference in the probabilities for winning is less than .08, and (b) the ratio of the number of times played on the more often played machine to the number of times played on the less often played machine is greater than 1.4. If the above two conditions hold, then the machine with the smaller probability of winning is played. Write a program to simulate this strategy. Have your program choose the initial payoff probabilities at random from the unit interval [math][0,1][/math], make 20 plays, and keep track of the number of wins. Repeat this experiment 1000 times and obtain the average number of wins per 20 plays. Implement a second strategy (for example, play-the-best-machine or one of your own choice) and see how this second strategy compares with Bruce's on average wins.
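One possible starting point for the simulation is sketched below in Python. It is not a reference solution, and it rests on several assumptions that the exercise does not state. The "probability of winning" for a machine is taken to be the posterior mean [math](w+1)/(n+2)[/math] after [math]w[/math] wins in [math]n[/math] plays, which corresponds to a uniform prior on the payoff probability. Ties between the two estimates are broken at random. Condition (b) is tested as `more > 1.4 * less`, so it counts as satisfied when one machine has been played and the other has not. The function names (`estimate`, `choose_best`, `choose_barnes`, `play_round`, `average_wins`) are illustrative only.

```python
import random


def estimate(wins, plays):
    # Assumption: posterior mean of the payoff probability under a uniform prior.
    return (wins + 1) / (plays + 2)


def choose_best(wins, plays, rng):
    # Play-the-best-machine: pick the arm with the larger estimate.
    e0 = estimate(wins[0], plays[0])
    e1 = estimate(wins[1], plays[1])
    if e0 == e1:
        return rng.randrange(2)  # assumption: break ties at random
    return 0 if e0 > e1 else 1


def choose_barnes(wins, plays, rng):
    # Barnes's variation: switch to the weaker arm when both conditions hold.
    best = choose_best(wins, plays, rng)
    other = 1 - best
    gap = estimate(wins[best], plays[best]) - estimate(wins[other], plays[other])
    close = gap < 0.08                      # condition (a)
    more, less = max(plays), min(plays)
    lopsided = more > 1.4 * less            # condition (b), avoids dividing by zero
    return other if (close and lopsided) else best


def play_round(choose, n_plays, rng):
    # One experiment: draw payoff probabilities, make n_plays plays, count wins.
    p = [rng.random(), rng.random()]
    wins = [0, 0]
    plays = [0, 0]
    for _ in range(n_plays):
        arm = choose(wins, plays, rng)
        plays[arm] += 1
        if rng.random() < p[arm]:
            wins[arm] += 1
    return sum(wins)


def average_wins(choose, n_rounds=1000, n_plays=20, seed=0):
    # Average number of wins per n_plays plays over n_rounds experiments.
    rng = random.Random(seed)
    total = sum(play_round(choose, n_plays, rng) for _ in range(n_rounds))
    return total / n_rounds


if __name__ == "__main__":
    print("Barnes:", average_wins(choose_barnes))
    print("Best machine:", average_wins(choose_best))
```

Passing the strategy as a function keeps the experiment loop identical for both strategies, so any difference in the averages comes from the choice rule alone. Using the same `seed` for both calls does not give the two strategies identical random draws, since they consume random numbers differently; averaging over more rounds or several seeds gives a fairer comparison.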