CS 371: Introduction to Artificial Intelligence
Game-Playing
Introduction
Minimax
Alpha-beta pruning
Expectiminimax
Games as Search Problems
Previously, we've looked at search problems with static environments. (One agent affects the environment.)
Now, we generalize just a bit and allow two agents to affect the environment in turn. → dynamic environment
Previously, we've looked for a sequence of actions to a goal state.
Now, we're looking for a sequence of actions which maximizes some utility measure regardless of how an adversarial agent acts.
Example of Tree Search
Search: Peg solitaire
Jump a peg over another to empty space, removing jumped peg
Initial state: only one space empty
Goal state: only one space occupied
Find sequence of jumps from initial state to goal
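As a minimal sketch of this kind of single-agent search, the depth-first search below solves a one-dimensional peg solitaire. The 1-D board, the tuple encoding (1 = peg, 0 = empty), and the function name `solve` are assumptions made for brevity and are not from the slides; the usual game is played on a 2-D board.

```python
def solve(board):
    """Depth-first search for a jump sequence that leaves exactly one peg.

    `board` is a tuple of 0/1 (hypothetical 1-D encoding).
    Returns a list of (from_index, to_index) jumps, or None if no solution exists.
    """
    if sum(board) == 1:              # goal state: only one space occupied
        return []
    n = len(board)
    for i in range(n):
        if not board[i]:
            continue                 # no peg here to move
        for step in (1, -1):         # jump right or left
            over, dest = i + step, i + 2 * step
            if 0 <= dest < n and board[over] and not board[dest]:
                nxt = list(board)
                nxt[i], nxt[over], nxt[dest] = 0, 0, 1   # move peg, remove jumped peg
                rest = solve(tuple(nxt))
                if rest is not None:
                    return [(i, dest)] + rest
    return None                      # dead end: backtrack


# Initial state with only one space empty.
print(solve((1, 1, 0, 1)))  # [(0, 2), (3, 1)]
```

Only one agent acts here, so any path to the goal is a solution; no opponent can interfere with the plan.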
Example of Game-Tree Search
Search: Tic-Tac-Toe
Players place X and O in turn.
Initial state: empty 3×3 grid
Goal state: three of a player's symbol in a row
Count win = +1, draw = 0, loss = -1
Find a sequence of moves that maximizes utility regardless of adversarial play
Example of Game-Tree Search
Suppose you construct the complete tree of possible plays.
Evaluate terminal states as (+1,0,-1)
Evaluate non-terminal states as the maximum (X to move) or minimum (O to move) of their children's evaluations.
This propagation of evaluations is called minimax.
Consider minimax on a subtree of possible tic-tac-toe plays…
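The propagation described above can be sketched in a few lines of Python. The board encoding (a 9-character string read row by row) and the names `LINES`, `winner`, and `minimax` are illustrative assumptions, not part of the slides; values are from X's point of view (+1 win, 0 draw, -1 loss).

```python
from functools import lru_cache

# Index triples for the 3 rows, 3 columns, and 2 diagonals.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def minimax(board, player):
    """Minimax value of `board` for X, with `player` ('X' or 'O') to move."""
    w = winner(board)
    if w == 'X':
        return 1
    if w == 'O':
        return -1
    if ' ' not in board:
        return 0                                  # full board, no winner: draw
    other = 'O' if player == 'X' else 'X'
    values = [minimax(board[:i] + player + board[i + 1:], other)
              for i, cell in enumerate(board) if cell == ' ']
    # X picks the maximum of the children, O the minimum.
    return max(values) if player == 'X' else min(values)


print(minimax(' ' * 9, 'X'))  # 0: with perfect play from the empty board, the game is a draw
```

Caching on `(board, player)` is a convenience so that positions reached by different move orders are evaluated once; plain minimax does not require it.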
Slides 7-16: figures (minimax on a subtree of tic-tac-toe plays)
Another Minimax Example
Minimax Decision-Making
Perfect Decisions in Two-Player Games
Problem definition: initial state, operators, terminal test, utility (or payoff) function
Given whole game tree, minimax yields perfect decisions*
Minimax: minimum of the maximum of the minimum of the maximum of the…
*assuming adversary acts according to minimax → importance of player modeling
Can't search whole game tree, so…
Imperfect Decisions in Two-Player Games
Evaluate states passing cutoff-test according to heuristic evaluation function
Consider Chess
  enormous state space
  can't possibly search whole tree with current computational limitations
Must
  limit depth of search
  evaluate non-terminal nodes at limit
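A depth cutoff with a heuristic evaluation might look like the sketch below. Chess is too large for a short example, so this assumes a toy subtraction game (take 1 to 3 stones per turn; whoever takes the last stone wins). The game and the names `cutoff_minimax`, `crude_eval`, and `informed_eval` are hypothetical and not from the slides.

```python
def cutoff_minimax(stones, max_to_move, depth, evaluate):
    """Depth-limited minimax; values are from the maximizing player's view."""
    if stones == 0:                      # terminal test: previous mover took the last stone
        return -1 if max_to_move else 1
    if depth == 0:                       # cutoff test: stop searching, estimate instead
        return evaluate(stones, max_to_move)
    values = [cutoff_minimax(stones - k, not max_to_move, depth - 1, evaluate)
              for k in (1, 2, 3) if k <= stones]
    return max(values) if max_to_move else min(values)


def crude_eval(stones, max_to_move):
    """Knows nothing: scores every non-terminal position as even."""
    return 0


def informed_eval(stones, max_to_move):
    """Uses a known property of this toy game: the player to move
    loses exactly when the pile is a multiple of 4."""
    mover_wins = stones % 4 != 0
    return 1 if mover_wins == max_to_move else -1


print(cutoff_minimax(8, True, 2, crude_eval))     # 0: too shallow to see the loss
print(cutoff_minimax(8, True, 8, crude_eval))     # -1: a full search finds it
print(cutoff_minimax(8, True, 2, informed_eval))  # -1: better evaluation, same shallow depth
```

The three calls contrast the two levers available once the whole tree is out of reach: searching deeper, or evaluating the cutoff nodes better.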
"How would you
evaluate these..."
How would you evaluate these positions?
Material advantage isn't the whole story.
Heuristic Game-Play
A good evaluation function
  returns actual value at terminal states,
  approximates actual value at non-terminal nodes, and
  isn't too computationally intensive
Most attribute recent game-playing success to better speed ("brute force") rather than better evaluation (knowledge base)
Still, most minimax search is pointless…
Pruning Example
Pruning Guarantees
While we search the tree, we can keep track of guaranteed maximum/minimum utilities if play proceeds to each node.
When we see a contradiction in guarantees, we can prune remaining children from further consideration, because we've proven a rational player will never reach that node.
Assuming Rationality? Ha!
What if other player isn't rational?
If evaluation is perfect, then rational play always does at least as well against an irrational player as against a rational one.
Alpha-Beta Pruning
Let α, β be local lower, upper bound guarantees
"If play proceeds here, root will score at least α."
"If play proceeds here, root will score at most β."
Pruning thus according to α and β is called alpha-beta pruning.
Minimax search with alpha-beta pruning is sometimes called alpha-beta search.
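A minimal sketch of alpha-beta search follows, assuming the game tree is given as nested Python lists whose leaves are terminal utilities. That representation, the function name `alphabeta`, and the `visited` bookkeeping list are assumptions added for illustration.

```python
import math


def alphabeta(node, maximizing=True, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of `node` with alpha-beta pruning.

    alpha: lower bound the maximizer is already guaranteed on this path.
    beta:  upper bound the minimizer is already guaranteed on this path.
    visited: optional list that records each leaf actually evaluated.
    """
    if not isinstance(node, list):           # leaf: terminal utility
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:                # guarantees contradict: prune remaining children
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if alpha >= beta:                    # guarantees contradict: prune remaining children
            break
    return value


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]   # max root, three min nodes
seen = []
print(alphabeta(tree, visited=seen))         # 3
print(seen)                                  # [3, 12, 8, 2, 14, 5, 2]
```

In this example the leaves 4 and 6 are never examined: once the second min node is known to be worth at most 2, it cannot beat the 3 already guaranteed at the root.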
Alpha-Beta Pruning Algorithm
Alpha-Beta Pruning Exercise
Alpha-Beta Pruning Exercise (cont.)
Games of Chance
Chance Nodes
Expectiminimax
A chance node is evaluated as follows:
  the value of each child is multiplied by the probability of reaching that child
  these products are then summed.
Disadvantages to this approach:
  branching factor of chance nodes can be large!
  plain alpha-beta pruning does not apply at chance nodes (pruning there needs known bounds on utility values)
  evaluation functions are hard!…
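The chance-node rule can be sketched as follows. The tree encoding is an assumption for illustration: a leaf is a number, a decision node is `('max', [children])` or `('min', [children])`, and a chance node is `('chance', [(probability, child), ...])`. The function name `expectiminimax` follows the slide title; the encoding is not from the slides.

```python
def expectiminimax(node):
    """Value of a game tree containing max, min, and chance nodes."""
    if isinstance(node, (int, float)):       # leaf: utility value
        return node
    kind, children = node
    if kind == 'max':
        return max(expectiminimax(c) for c in children)
    if kind == 'min':
        return min(expectiminimax(c) for c in children)
    if kind == 'chance':
        # Multiply each child's value by its probability, then sum the products.
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node kind: {kind!r}")


# Max chooses between two moves, each followed by a chance event (e.g. a coin flip or die roll).
tree = ('max', [
    ('chance', [(0.5, ('min', [2, 5])), (0.5, ('min', [4, 6]))]),   # 0.5*2 + 0.5*4 = 3.0
    ('chance', [(0.9, 1), (0.1, 10)]),                              # 0.9*1 + 0.1*10, about 1.9
])
print(expectiminimax(tree))  # 3.0
```

Every child of a chance node is visited here, which is why a large chance branching factor is costly.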
Expectiminimax Evaluations
Why Minimax Isn't Always Appropriate