CS 391 - Special Topic: Machine Learning
Project


Timeline

Written Proposal

Write a proposal for a reinforcement learning project that can be completed with 3 weeks of programming.  Your description of the problem should be complete and unambiguous.  Use the example and exercise descriptions in the text as a model for your problem description.  Then describe your goal and the technique(s) you plan to apply to achieve it.

Keep it simple!...  If you find your project too simple, you can always move on to more difficult problems.  It's much more frustrating to bite off more than you can chew and not complete it successfully.  My advice is to pick a challenge problem that interests you (e.g. poker, magnetic levitation control, Lego robotics, etc.), and boil the problem down to its essential characteristics.  Remove constraints, make simplifying assumptions, and do everything you can to reduce the size of the state space without changing the nature of the problem.  For example, Piglet is a much simpler dice game than Farkle (a.k.a. Ten Thousand), but they're both very similar jeopardy games at heart.  Piglet (if it weren't already done as an assignment) would make a much better goal than Farkle for this assignment.

... but not too simple.  Your problem should be an associative task, preferably one with a possibility of repeated states within a single trial.  The n-armed bandit would not be a good choice of problem unless (in an exceptional case) one wanted to do fundamental research on various update rules as we did in Chapter 2.

Making wise choices up front is perhaps the most important step for a successful project.  Give this good, sustained thought, and feel free to stop by to discuss the merits of various ideas you have.   

Problem statement draft

Your problem statement draft should be a complete specification of the machine learning problem you are addressing.  That is, you should fully specify the states, actions, and rewards of your ML system.  If you're using a function approximator, you should give full specifications of your planned use.  For instance, if you're using an artificial neural network, completely specify the input units, the number of hidden units, and the output unit(s).  These specifications are non-binding; you may adjust them if you find other specifications are more sensible.  At this stage, you should already be implementing and thinking in terms of implementation details.
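As an illustration of what "fully specify" can look like, here is a minimal Python sketch of a specification for a hypothetical single-player jeopardy coin game.  The rules, the goal of 10 points, the reward of -1 per completed turn, and all names (State, all_states, outcomes) are assumptions made up for this example; they are not the rules of any assigned problem.

```python
from dataclasses import dataclass

GOAL = 10                    # hypothetical target score
ACTIONS = ("flip", "hold")   # the complete action set


@dataclass(frozen=True)
class State:
    score: int       # points already banked, 0 .. GOAL-1
    turn_total: int  # points at risk this turn, 0 .. GOAL-1-score


def all_states():
    """List every non-terminal state so the state-space size is explicit."""
    return [State(s, t) for s in range(GOAL) for t in range(GOAL - s)]


def outcomes(state, action):
    """Return a list of (probability, reward, next_state) triples.

    next_state is None when the game is over.  The reward is -1 for each
    turn that ends, so maximizing return minimizes the number of turns.
    """
    if action == "hold":
        # Bank the turn total; the turn ends.
        return [(1.0, -1.0, State(state.score + state.turn_total, 0))]
    # "flip": tails loses the turn total and ends the turn.
    results = [(0.5, -1.0, State(state.score, 0))]
    if state.score + state.turn_total + 1 >= GOAL:
        # Heads reaches the goal: the game ends and the final turn is counted.
        results.append((0.5, -1.0, None))
    else:
        # Heads adds one point to the turn total; the turn continues.
        results.append((0.5, 0.0, State(state.score, state.turn_total + 1)))
    return results
```

Writing the specification this way makes the size of the state space (len(all_states())) and every transition probability something a reader can check directly.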

Total system complete

Submit your work using "submit cs391 proj1".  You should have a completely implemented system.  Unless I've said otherwise in reviewing your proposal, your agent need not make smart decisions.  You should be able, however, to demonstrate that the entire simulation is fully functional.  Thus, your states, actions, rewards, environment, agent, and simulation need to be completely and correctly implemented.  Be prepared to give a brief, informal demonstration.
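A rough sketch of what "fully functional, but not yet smart" might look like in Python follows.  It wires an environment, an agent that acts at random, and a simulation loop together.  The coin game (flip to add a point at the risk of losing the turn total, or hold to bank it; first to 10 points ends the game; -1 reward per completed turn) and the names CoinGameEnv, RandomAgent, and run_episode are all hypothetical choices for this example.

```python
import random

GOAL = 10                    # hypothetical target score
ACTIONS = ("flip", "hold")


class CoinGameEnv:
    """Hypothetical single-player coin game used only for illustration."""

    def __init__(self, rng):
        self.rng = rng
        self.reset()

    def reset(self):
        self.score, self.turn_total = 0, 0
        return (self.score, self.turn_total)

    def step(self, action):
        """Apply an action and return (next_state, reward, done)."""
        if action == "hold":
            self.score += self.turn_total
            self.turn_total = 0
            return (self.score, 0), -1.0, False
        if self.rng.random() < 0.5:          # tails: lose the turn total
            self.turn_total = 0
            return (self.score, 0), -1.0, False
        self.turn_total += 1                 # heads: one more point at risk
        if self.score + self.turn_total >= GOAL:
            return (GOAL, 0), -1.0, True     # goal reached; game over
        return (self.score, self.turn_total), 0.0, False


class RandomAgent:
    """Placeholder agent: picks actions uniformly and learns nothing."""

    def __init__(self, rng):
        self.rng = rng

    def act(self, state):
        return self.rng.choice(ACTIONS)

    def learn(self, state, action, reward, next_state, done):
        pass  # a learning agent would update its estimates here


def run_episode(env, agent, max_steps=10_000):
    """Run one episode; return (total reward, number of steps taken)."""
    state = env.reset()
    total, steps = 0.0, 0
    for _ in range(max_steps):
        action = agent.act(state)
        next_state, reward, done = env.step(action)
        agent.learn(state, action, reward, next_state, done)
        total += reward
        steps += 1
        state = next_state
        if done:
            break
    return total, steps
```

Because the agent only needs act and learn, a learning agent can later replace RandomAgent without touching the environment or the loop.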

Project milestone complete

At this point, you should be able to show some evidence of simple learning.  Pick an easily achievable milestone.  Having achieved it, you can build on it in the coming week.  In the worst case, the following week can be used to achieve a modest success. Submit your work using "submit cs391 proj2". 
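One possible form of "evidence of simple learning" is a record of episode lengths that shrinks as training proceeds.  The Python sketch below assumes a hypothetical five-cell corridor task and uses tabular Q-learning with epsilon-greedy exploration purely as an illustration; the task, the parameter values, and the function names are assumptions for this example, not requirements.

```python
import random

N_STATES = 5                         # corridor cells 0..4; cell 4 is terminal
ACTIONS = (0, 1)                     # 0 = left, 1 = right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1   # illustrative values only


def step(state, action):
    """Deterministic corridor dynamics; reward 1 on reaching the last cell."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def choose(q, state, rng, epsilon):
    """Epsilon-greedy action selection with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[state])
    return rng.choice([a for a in ACTIONS if q[state][a] == best])


def train(episodes=200, seed=0, max_steps=1000):
    """Run Q-learning; return the Q-table and the length of each episode."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    steps_per_episode = []
    for _ in range(episodes):
        state, steps, done = 0, 0, False
        while not done and steps < max_steps:
            action = choose(q, state, rng, EPSILON)
            nxt, reward, done = step(state, action)
            # Q-learning target: no bootstrapping from a terminal state.
            target = reward if done else reward + GAMMA * max(q[nxt])
            q[state][action] += ALPHA * (target - q[state][action])
            state, steps = nxt, steps + 1
        steps_per_episode.append(steps)
    return q, steps_per_episode
```

Plotting steps_per_episode against the episode number gives a simple learning curve, and checking that "right" has the higher value in every non-terminal cell confirms the learned greedy policy.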

Project complete

All project work and associated documentation should be complete at this point.  Your code should be clearly commented, and your README file should indicate how to operate your program and collect your data.  Submit your work using "submit cs391 proj3".

Abstract

For our symposium, we'll have a handout with all names, titles, and abstracts for presentations.  E-mail this information directly to me.  An abstract is a 1-2 paragraph summary of what you'll be presenting.  

Project presentation

Plan to briefly present (1) a clear, concise problem description, (2) a high-level description of the technique(s) you applied to the problem (names and their general characteristics; no algorithms), and (3) the results you observed (e.g. graph(s), demo).  You'll have ~5 minutes, but do your best to plan for 4.  We'd like to leave time for one audience question after each overview.

Complete paper draft

Submit (hardcopy or e-mail) a complete draft of your paper for review on 4/30.  Plan to stop by my office on 4/30 or 5/1 before 4PM to go over my comments.  The main purpose of this review is to ensure that your content is complete.  That is, I want to ensure that all information necessary to reproduce your results is included.   For instance, people often forget to include important parameters such as learning rates or reward discount rates.
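One way to avoid forgetting such parameters is to keep them in a single record that is saved alongside every set of results.  The Python sketch below shows the idea; the field names and default values are hypothetical placeholders, not recommended settings.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ExperimentConfig:
    """Every setting needed to reproduce a run (placeholder values)."""
    algorithm: str = "q-learning"
    learning_rate: float = 0.1
    discount: float = 0.9
    epsilon: float = 0.1
    episodes: int = 1000
    seed: int = 0


def config_to_json(config):
    """Serialize the settings so they can be stored next to the results."""
    return json.dumps(asdict(config), indent=2, sort_keys=True)


def config_from_json(text):
    """Rebuild the settings from a saved JSON string."""
    return ExperimentConfig(**json.loads(text))
```

Copying the saved values into a parameter table in the paper then becomes a mechanical step rather than something reconstructed from memory.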

Here's a suggested structure for your final paper:

Final paper

After making sure you have all the necessary content, polish your paper's presentation.  Your code, its documentation, and this paper will serve as a resource to future students of this material.  Future projects may seek to build on your results.  Please make sure you do a quality, professional job of informing the reader of your work.

(Look at the papers posted in the Glatfelter 2nd floor hallway as examples.)