CS 391 - Selected Topics: Game Artificial Intelligence
Readings
Each reading assignment should be completed before the class on the date
indicated. These readings are subject to change; check here for updates.
If a reading assigned in class does not match the reading assignment here,
the reading assigned in class takes precedence.
1/31: P. Auer, N. Cesa-Bianchi, P. Fischer.
Finite-time Analysis of the Multiarmed Bandit Problem. (See also
here.) Sections 1 (skipping the passage "In their classical paper ... mild
assumptions."), 2 (with special attention to the algorithms), 4, and 5.
Topics: regret, upper confidence bounds (UCBs), UCB action selection
algorithms.
2/2: R. S. Sutton, A. G. Barto.
Reinforcement Learning: An Introduction.
Chapter 3: introduction through 3.3, and 3.6-3.8. Topics: reinforcement learning
problem, agent, environment, state, action, reward, returns, Markov decision
processes (MDPs), state-value function V, action-value function Q, Bellman
equation, optimal policy, optimal value-functions, Bellman optimality
equation, backup diagrams.
4/3: Essentials of Game Theory, Ch. 1, 2, Definition 3.2.1, Section 3.5;
Handout: "Introduction to Counterfactual Regret Minimization", Section 1;
[Optional: S. Hart, A. Mas-Colell. A Simple Adaptive Procedure Leading to
Correlated Equilibrium. (G'burg library) Sections 2-3; H. Kuhn. Simplified
Two-Person Poker.]