CS 391 Selected Topics: Game AI
Homework #4


Due at the beginning of class on Tuesday 3/6. (You have more time for this assignment, and it carries a larger weight.)

Note: This work is to be done in groups of 2.  Each group will submit one assignment.  Although you may divide the work, both team members should be able to present/describe their partner's work upon request. 

0. HW4 Preparation: Save and compile the HW4 starter code that will be emailed through the class mailing list.  Run the OptimalPlayer code once to compute and export an optimal policy data file.

1. Function Approximation for Optimal Fowl Play:  Use two different function approximation techniques to create a low-memory means of computing approximately optimal play for the 2-player game Fowl Play.  You must implement at least one of the techniques yourself.  For the other, you are free to use (and credit) any open-source implementations you find online.

Fowl Play is a simple jeopardy card game.  In the 2-player game, the first player to score 50 points wins.  The deck contains 42 chicken cards and 6 wolf cards.

Problem motivation:  Having computed optimal play for Fowl Play, we would like to create an implementation suitable for a small, memory-limited device (e.g. Lego Mindstorms robot, Arduino microcontroller, or smartphone app).  Memory limits are assumed to preclude the direct use of the computed policy.  Given the policy, we would like to use forms of function approximation to effectively "compress" the policy information while still retaining high-quality play.

Representation: The game state may be described by 5 variables: current player score (i), opponent score (j), turn total (k), wolves drawn since the last shuffle (w), and chickens drawn since the last shuffle (c).  The actions are simply DRAW (1) and HOLD (0).
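
For concreteness, here is one way a player's state-action interface might look in Java.  The names below (FowlPlayPolicy, action) are illustrative assumptions and need not match the starter code's API:

    // Illustrative sketch only; names are assumptions, not the starter code's API.
    public interface FowlPlayPolicy {
        int HOLD = 0;
        int DRAW = 1;

        // i = current player score, j = opponent score, k = turn total,
        // w = wolves drawn since the last shuffle,
        // c = chickens drawn since the last shuffle
        int action(int i, int j, int k, int w, int c);
    }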

So which function(s) should I approximate?  That is your decision.  There are a number of straightforward possibilities:

In the first two value function scenarios, there is the added advantage that a softmax action selection policy can be applied to offer different player skill levels according to the chosen temperature parameter τ (see the sketch below).  You have access to all of these optimal play functions through the supplied OptimalPlayer class.
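
As a minimal sketch of softmax action selection over the two actions, assuming you already have approximate action values qDraw and qHold from your function approximator (all names here are hypothetical):

    import java.util.Random;

    // Minimal softmax action selection sketch (illustrative).
    // qDraw/qHold are approximate action values; tau is the temperature.
    public class SoftmaxSelect {
        private static final Random RNG = new Random();

        public static int chooseAction(double qDraw, double qHold, double tau) {
            // Subtract the max for numerical stability before exponentiating.
            double max = Math.max(qDraw, qHold);
            double eDraw = Math.exp((qDraw - max) / tau);
            double eHold = Math.exp((qHold - max) / tau);
            double pDraw = eDraw / (eDraw + eHold);
            return (RNG.nextDouble() < pDraw) ? 1 : 0;  // 1 = DRAW, 0 = HOLD
        }
    }

As τ approaches 0, selection approaches greedy (strongest) play; larger τ yields more random, weaker play.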

So which function approximation techniques am I permitted to apply?

This too is your decision.  There are again a number of possibilities:

We will discuss a few of these in class, but there are ample online resources for learning about each.

What's supplied in the starter code?

How cool is that!?  I get to choose my learning goals and be creative with an open engineering challenge!

That's right, it's very cool.  Assignments judged superior may lead to joint publication and/or fun prototyping on one of the platforms mentioned.

One caveat: I recommend trying something simple first, and soon.  Remember the KISS principle.  You'll want to get something that outperforms the MaxScorePlayer.  Of course, you will not get anything to outperform the OptimalPlayer.

How close can you get to optimal play while keeping the code that computes the action short?  For example, consider an approach where you use an open-source neural network package (e.g. Neuroph) or write your own neural network.  Once you've trained a small network on one of the functions, the relatively few weights can be hard-coded in 2D arrays, and simple feed-forward computation can produce the intended play action.  Computing the function approximator itself (e.g. training the network) may use all the memory and time that you wish.  The actual implementation of the player using the function approximator should (1) be relatively simple, (2) be fast, (3) not require file I/O, and (4) exceed the performance of the MaxScorePlayer.
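
To make that idea concrete, here is a minimal sketch of such a hard-coded player in Java.  Everything here (the class name, the network size of 5 inputs, 2 hidden units, and 1 output, and the weight values) is a placeholder assumption; a real player would paste in the weights produced by your own offline training:

    // Sketch of a player using a hard-coded feed-forward sigmoid network.
    // All weights below are placeholders standing in for trained values.
    public class ApproxPlayer {
        private static final double[][] W1 = { { 0.1, -0.2,  0.3, -0.4,  0.5 },
                                               {-0.5,  0.4, -0.3,  0.2, -0.1 } }; // hidden weights (placeholder)
        private static final double[] B1 = { 0.0, 0.0 };   // hidden biases (placeholder)
        private static final double[] W2 = { 0.7, -0.7 };  // output weights (placeholder)
        private static final double B2 = 0.0;              // output bias (placeholder)

        private static double sigmoid(double x) {
            return 1.0 / (1.0 + Math.exp(-x));
        }

        // Returns DRAW (1) or HOLD (0) from the 5 state variables,
        // with no file I/O and only a few dozen multiply-adds.
        public static int action(int i, int j, int k, int w, int c) {
            double[] x = { i, j, k, w, c };  // apply the same input scaling used in training
            double out = B2;
            for (int h = 0; h < W1.length; h++) {
                double z = B1[h];
                for (int n = 0; n < x.length; n++) {
                    z += W1[h][n] * x[n];
                }
                out += W2[h] * sigmoid(z);
            }
            return sigmoid(out) >= 0.5 ? 1 : 0;  // threshold the output unit
        }
    }

Note the trade-off: more hidden units generally mean a better fit to the target function but more weights to store, so part of the challenge is finding how small a network can be while still satisfying requirement (4).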