CS 371 - Introduction to Artificial Intelligence
Homework #7
Due: Monday, 12/3
Note: Work is to be done individually.
Data Mining Survey
Each student will:
- choose and get Prof. Neller's approval for a data mining (or
supervised/unsupervised machine learning) algorithm,
- teach the algorithm during one class period from 11/2-12/3, and
- submit experimental results applying the algorithm to an approved
classification/function approximation benchmark problem (e.g. hand-written
zip code classification).
Each student should meet with Prof. Neller during office hours to discuss
possible topics, topic scope, helpful resources, possibilities for in-class
student exercises, suitable benchmark problems, etc. The objectives here are
to (1) give students diverse exposure to different data mining techniques,
(2) allow each student to pick a technique of interest and go into sufficient
depth for a high-quality, interactive, 50-minute presentation, and (3) expose
students to common, specific benchmark problems.
Topics will be assigned first-come, first-served. With approval, students with
closely related topics may divide the material between them and present each
half in successive class periods, enabling complementary, continuous
presentations. Grading will be based 75% on the presentation
and 25% on the submitted benchmark problem experimentation write-up (submitted
in a .zip file on Moodle).
Presentation grading considerations include:
- Does the presenter have evident understanding of the algorithm?
- Is this understanding clearly communicated through simple examples and
concise explanations?
- Has the presenter chosen a feasible amount of knowledge to digest in the
50-minute class period?
- Are there sufficient visualizations and hands-on exercises (e.g. Weka
application, online apps, provided code) to move beyond traditional lecture
to more engaged experiential learning?
- Has the presenter clearly defined the domain of best applicability,
including the pros/cons of the algorithm with respect to other algorithms?
- Basically: Is the algorithm well taught?
Benchmark problem experimentation considerations include:
- Is the benchmark problem suitable?
- Has the student performed an adequate evaluation of the algorithm's
performance (e.g. n-fold cross validation; see the sketch below)?
- Has the student clearly summarized the results?
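As an illustration of n-fold cross validation, here is a minimal sketch
assuming Python with scikit-learn; the digits dataset, classifier, and fold
count are illustrative placeholders, not requirements of the assignment:

    from sklearn.datasets import load_digits
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    # The bundled digits dataset stands in for a zip-code-style benchmark.
    digits = load_digits()                       # 8x8 handwritten digit images
    model = KNeighborsClassifier(n_neighbors=3)  # placeholder classifier

    # 10-fold cross validation: train on 9 folds, test on the held-out
    # fold, rotate through all 10 folds, and report the accuracy spread.
    scores = cross_val_score(model, digits.data, digits.target, cv=10)
    print("Per-fold accuracy:", scores)
    print("Mean accuracy: %.3f (+/- %.3f)" % (scores.mean(), scores.std()))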