CS 391 - Selected Topics: Kaggle Competition
Course Information

Course Overview

Data Science has become one of the most important emerging topics in Computer Science curricula. Kaggle.com is one of the most important sites for fostering community learning and advancement of Data Science.

While not the only Data Science competition website, Kaggle became the best-known soon after it began offering Machine Learning competitions in 2010. It quickly grew into a hub of communication, learning, and practice for the Data Science community, which can partly be attributed to the incentives for community recognition offered on the site. Recognition is given in the form of virtual medals which contribute towards Kaggle Progression System ranks of Novice, Contributor, Expert, Master, and Grandmaster in each of four areas: Competitions, Notebooks, Datasets, and Discussion. Competition medals are earned through strong performance in Kaggle competitions, whereas notebook, dataset, and discussion contributions are recognized by community upvotes. Further, competitors are often required to share and explain their competition work for others to learn from. This recognition system thus gives those seeking to distinguish themselves (e.g. to potential employers/clients) an incentive to make helpful contributions to the community in exchange for public recognition and third-party validation of expertise.

The purpose of this Selected Topics course is to provide students with time, incentive, and support to learn Data Science at their own pace through engagement with Kaggle competitions. To this end, we will form flexible teams to work on current Kaggle competitions, and we will have two Kaggle InClass exam competitions to assess student learning. Students will conscientiously log expected work outside of class, and exercise the lifelong learning skills they will need beyond college.

Learning Objectives

Text

There is no text for this selected topics course. Rather, students will read from a selection of recommended and found resources online. As students come to this course with a variety of background preparations, they will be encouraged to read so as to fill the greatest gaps in their Data Science understanding, and to focus on what is most relevant to the Kaggle competitions they engage in.

Instructor

Todd Neller
Lecture: T,Th 1:10-2:25AM, Online.
Office Hours: via Zoom (link distributed via email) 
E-mail: 

Grading

60% Work Logs
30% Exams
10% Class Attendance / Participation
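
As an illustration of how the weights above combine, here is a minimal sketch in Python. The function name, the dictionary keys, and the 0-100 score scale are assumptions made for this example and are not part of the course policy.

```python
# Weights in percent, taken from the grading breakdown above.
WEIGHTS = {"work_logs": 60, "exams": 30, "attendance": 10}


def course_grade(scores):
    """Return the weighted course grade for component scores on a 0-100 scale.

    `scores` is a hypothetical dict with the same keys as WEIGHTS.
    """
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    example = {"work_logs": 90, "exams": 80, "attendance": 100}
    print(course_grade(example))
```

With these example scores, work logs contribute 54 points, exams 24, and attendance 10.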

Attendance

Class attendance and participation are required. If you attend all classes and are willing to participate, you'll get 100% for this part of your grade. Even if you know enough to give a particular lecture, please consider the value of helping your peers during in-class exercises.

Woody Allen is quoted as saying "80% of success is just showing up." While our class attendance/participation grade is not 80% of the final grade, it is critical that late arrivals and unexcused absences not be excessive. Missing more than half of a class meeting unexcused counts as a full absence. An unexcused late arrival counts as half an absence. If the total number of absences counted this way exceeds 20% of class meetings, i.e. 6 absences or more, the student will have failed the course.
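
The absence counting described above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the sketch uses the stated "6 absences or more" figure directly rather than computing 20% of an assumed number of class meetings.

```python
# Threshold stated in the attendance policy: "6 absences or more".
FAIL_THRESHOLD = 6


def absence_total(unexcused_absences, unexcused_lates):
    """Count each unexcused absence as 1 and each unexcused late arrival as 0.5."""
    return unexcused_absences + 0.5 * unexcused_lates


def fails_on_attendance(unexcused_absences, unexcused_lates):
    """Return True when the counted absences reach the failing threshold."""
    return absence_total(unexcused_absences, unexcused_lates) >= FAIL_THRESHOLD


if __name__ == "__main__":
    # 4 unexcused absences plus 3 unexcused late arrivals count as 5.5 absences.
    print(absence_total(4, 3), fails_on_attendance(4, 3))
    # 5 unexcused absences plus 2 unexcused late arrivals count as 6 absences.
    print(absence_total(5, 2), fails_on_attendance(5, 2))
```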

Work Expectations

You are expected to work an average of 9 hours per week beyond class time. Gettysburg College policy, in accordance with federal and state standards, equates 1 credit unit with an average of 12 hours of work per week, with each 50 minutes of class counting as 1 full hour of work. Our two 75-minute class meetings per week thus count as 3 hours, leaving 9. During these remaining 9 hours beyond class, a student is expected to learn from assigned readings, complete exercises related to such readings, attend required colloquia, and complete assignments.

Think of your college studies as a more-than-full-time job, and engage in it with passion.  After all, you get out of it what you put into it, and it is my hope that you'll gain much from your investment in this course.  If you'd like to learn more about how to better track tasks and manage time as a student, consider watching my short tutorial on getting things done.

Honor Code

Honesty, Integrity, Honor. These are more important than anything we will teach in this class. Students may, and are encouraged to, help each other understand course concepts, but all graded work must be done independently unless otherwise specified (e.g. group work). Submitted work should be created by those submitting it. Submission of plagiarized code or design work is a violation of the Honor Code, which I strictly enforce. For detailed information about the Honor Code, see http://www.gettysburg.edu/about/offices/provost/advising/honor_code/index.dot.

What is permitted:

What is not permitted:

Put simply, students may discuss assignments at an abstract level (e.g. specifications, algorithm pseudocode), but must actually implement solutions independently or in permitted groups. Credit should be given where credit is due. Let your conscience be your guide. Do not merely focus on the result; learn from the process.