• In turn, players roll a single die as many times as they desire.
– If a player stops before rolling a 1, the player adds the total of the numbers rolled in sequence to their cumulative score.
– If a player rolls a 1, the player receives no score.
• The goal is to be the first player to reach a score of 100.
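The rules above can be sketched as a short simulation. This is a minimal sketch, not part of the original material: the "hold at a fixed turn total" policy, the `hold_at` parameter, the function names, and the solo-play loop are all illustrative assumptions.

```python
import random


def play_turn(hold_at, rng):
    """Play one turn: keep rolling until the turn total reaches hold_at
    (hypothetical policy) or a 1 is rolled."""
    turn_total = 0
    while turn_total < hold_at:
        roll = rng.randint(1, 6)
        if roll == 1:
            return 0  # rolling a 1 ends the turn with no score
        turn_total += roll
    return turn_total  # player stops and banks the turn total


def play_solo_game(hold_at, rng, goal=100):
    """Count how many turns a single player needs to reach the goal score."""
    score = 0
    turns = 0
    while score < goal:
        score += play_turn(hold_at, rng)
        turns += 1
    return turns


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so that runs are repeatable
    print("turns to reach 100:", play_solo_game(hold_at=20, rng=rng))
```

With `hold_at=20`, a turn scores either 0 or a value from 20 to 25, since the turn total is at most 19 before the final roll.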
• $Q_n(s,a) \leftarrow (1-\alpha_n)\,Q_{n-1}(s,a) + \alpha_n\,[\,r + \gamma \max_{a'} Q_{n-1}(s',a')\,]$
• $\alpha_n = 1/(1 + \mathrm{visits}_n(s,a))$
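The update rule and the visit-based learning rate can be sketched as a small tabular learner. This is a hedged illustration: the `QTable` class, its method names, and the `terminal` flag are hypothetical. It also assumes that the visit count includes the current visit, so the first update uses a learning rate of 1/2. The source does not say which convention is intended.

```python
from collections import defaultdict


class QTable:
    """Tabular Q-learning with a learning rate that decays per (state, action)."""

    def __init__(self, actions, gamma=1.0):
        self.actions = list(actions)
        self.gamma = gamma               # discount factor
        self.q = defaultdict(float)      # Q estimates, default 0.0
        self.visits = defaultdict(int)   # visit counts per (state, action)

    def update(self, s, a, r, s_next, terminal=False):
        # Assumption: count the current visit before computing alpha.
        self.visits[(s, a)] += 1
        alpha = 1.0 / (1 + self.visits[(s, a)])
        # The value of a terminal successor is taken to be 0.
        best_next = 0.0 if terminal else max(
            self.q[(s_next, b)] for b in self.actions
        )
        target = r + self.gamma * best_next
        self.q[(s, a)] = (1 - alpha) * self.q[(s, a)] + alpha * target
        return self.q[(s, a)]


if __name__ == "__main__":
    table = QTable(actions=["roll", "hold"], gamma=0.9)
    print(table.update("s0", "roll", 1.0, "s1"))
```

Because the learning rate shrinks as a pair is revisited, later observations move the estimate less than earlier ones, which averages out the randomness of the die rolls.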
• What are the ramifications of one's choice for the discount factor $\gamma$?
• How can one best speed convergence when observing real game experience?