Example: Simplified Blackjack
Read the homework description of
simplified Blackjack.
What would you try for γ and why?
What would be the states and actions of
your nondeterministic MDP?
Do you expect your optimal policy to match
the strategy described in the homework?