Machine Learning Laboratory Experiences for Introducing
Undergraduates to Artificial Intelligence

OVERVIEW 
The
jeopardy dice
game Pig is very simple to describe, yet the
optimal policy
for play is far from trivial. Using the computation of the
optimal solution as a central challenge problem, we introduce dynamic
programming and value iteration methods, applying them to similar problems
using the Java language.

OBJECTIVES 
The object of this project is
to give the student a deep, experiential understanding of dynamic
programming and value iteration through explanation,
implementation examples, and implementation exercises

PREREQUISITES 
The student should understand
basic probability and algebraic systems of equations (although knowledge
of linear algebraic solution techniques are not required). The
student should also understand the syntax of Java. Students who
program in C++ should be able to follow most of the Java examples. Before starting the project, the student will want to understand the definition of a Markov decision process, perhaps by covering the recommended background reading below. 
BACKGROUND 
To understand the terminology
of Markov decision processes and general application of value iteration,
we recommend that students read sections 17.1 (Sequential Decision
Problems) and 17.2 (Value Iteration) of:
Students are encouraged to play the game of Pig in order to have a better
understanding of the problem. An
applet with an optimal computer
player is available at the
Game of Pig
page. 
DESCRIPTION 
The project is described in the
file pig.pdf. You will need the
free Adobe
Acrobat Reader to view this file. Example code described in the project is available here: This code was generated from the same source file that generated pig.pdf using the literate programming tool noweb. We recommend having students do at least one dynamic programming exercise and one value iteration exercise. Students seeking
more extended challenges will also find descriptions of related advanced
projects for dynamic programming and value iteration. 
SYLLABUS 
Here is a sample syllabus used in CS 371  Introduction to AI at Gettysburg College: First class: Tuesday 9/28/04 The following additional coverage is not included in this project, but
may provide the instructor additional ideas: Our approach to this problem is to simply apply value iteration, discretizing the state space into a 400by400 grid of cells, and approximating the dynamics of each time step as a mapping from one cell to another. In other words, overlay the 2D state space (position, velocity) with a 400by400 grid. Simulate one timestep from each grid intersection and note which is the nearest intersection to the resulting state. This defines a mapping to discretely approximate the system dynamics. Then apply value iteration with a reward of 1 for each step that the system is not at the rightmost position. Cambridge, Massachusetts, 1998. 