COSC-288: Introduction to Machine Learning

Fall 2018

Announcements

Where, When, Who

Class Time: TR 2:00 – 3:15 PM
Classroom: REI 284 REI 559
   
Instructor: Mark Maloof
Office: 325 St. Mary's Hall
Mailbox: 329A St. Mary's Hall
Office Hours: In-person (325 STM): TR 11:00 AM–12:00 PM; online: M 10:30–11:30 AM and W 3:00–4:00 PM; or by appointment. Send me an email to get the Zoom link for online office hours.

Description

This undergraduate course surveys the major research areas of machine learning, focusing on classification. Through traditional lectures and programming projects, students learn (1) to understand the foundations of machine learning, (2) to design and implement methods of machine learning, (3) to evaluate methods of machine learning, and (4) to conduct empirical evaluations of multiple methods of machine learning. The course compares and contrasts machine learning with related endeavors, such as statistical learning, pattern classification, data mining, and information retrieval. Topics include instance-based approaches, naive Bayes, decision trees, rule induction, linear classifiers, support vector machines, neural networks, ensemble methods, evaluation, and applications. Students complete five programming projects using Java. There are midterm and final exams.

Prerequisites: Advanced Programming (COSC-150) and Data Structures (COSC-160).

Primary Text:

Learning Goals

By the end of the semester, students will be able to:

Policies

My course policies are designed to supplement the University's Undergraduate Honor System and the CS Department's Honor Policy. Unless stated otherwise when I distribute an assignment, the following is the default for all assignments for this course. I've developed my policies from past teaching experiences and from the CS Department's Honor Code at George Mason University.

I am obligated to refer all suspected cases of academic dishonesty by undergraduate students to Georgetown's Honor Council. If you have any questions about these policies or how they apply, please discuss such concerns with me during class, during office hours, or by e-mail.

In my experience, students at Georgetown do honest work. The small percentage of students who have submitted someone else's work as their own did so because they did not manage their time wisely.

Students must follow proper scholarly practice for all submitted work, whether graded or ungraded and whether a draft or final version of a proposal, paper, or program. We must acknowledge our reliance on the work of others through citation.

Students may be quite adept at and knowledgeable about citing and quoting material from traditional sources, such as books and articles. Typically, we do not have to cite facts, common math formulae, or expressions of our own ideas, observations, interpretations, and analyses. However, students new to computer science may not realize that formulae, theorems, proofs, algorithms, and programs can require the same treatment as any other form of expression.

For convenience, you do not need to cite the course materials, conversations with me, or information you obtain from class lectures and discussions. If you are unsure about what requires citation or what constitutes proper scholarly practice, please ask me during class, during office hours, or by e-mail.

I design my courses and assignments so students have what they need to complete the assignments individually without consulting outside resources. I determine the size of and credit for assignments based on the assumption that the work for them is the result of individual effort using only the course resources and materials. Students who use outside resources to complete assignments may not be eligible for full credit. Students who do not acknowledge their use of outside resources to complete assignments may be in violation of my course policies and the university's policies on academic integrity.

The following list details acceptable and unacceptable practices:

Policies dealing with logistics:

Schedule

  1. Introduction: Definitions, Areas, History, Paradigms
  2. Instance-based learning: k-NN
  3. Probabilistic learning: MLE, Bayes' Theorem, naive Bayes
  4. Evaluation: Train/Test Methodologies, Measures, ROC Analysis
  5. Decision Trees: ID3, C4.5, Stumps
  6. Rule Learning: Ripper, OneR
  7. Midterm Exam
  8. Neural Networks: Linear classifiers, Perceptron
  9. Neural Networks: Multilayer networks, Back-propagation
  10. Support Vector Machines: Perceptron, Dual representation
  11. Support Vector Machines: Margins, Kernels, Training, SMO
  12. Ensemble Methods: Bagging, Boosting
  13. Ensemble Methods: Random Forests, Voting, Weighting

Assignments and Grading

String Grades::getLetterGrade()
{
  if (grade >= 94)
    return "A";
  else if (grade >= 90)
    return "A-";
  else if (grade >= 87)
    return "B+";
  else if (grade >= 84)
    return "B";
  else if (grade >= 80)
    return "B-";
  else if (grade >= 77)
    return "C+";
  else if (grade >= 74)
    return "C";
  else if (grade >= 70)
    return "C-";
  else if (grade >= 67)
    return "D+";
  else if (grade >= 64)
    return "D";
  else
    return "F";
} // Grades::getLetterGrade
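
For students who want to check a borderline score, the same scale can be sketched as a small Java lookup. This is only an illustration under stated assumptions: the class name GradeScale and the method letterFor are hypothetical, the grade is assumed to be a percentage stored as a double, and no rounding is applied before the comparison, so 93.9 falls below the cutoff for an A.

```java
// Hypothetical helper; the names GradeScale and letterFor are assumptions
// made for illustration and are not part of the course software.
class GradeScale {
  // Lower cutoffs in percent, highest first, copied from the scale above.
  private static final double[] CUTOFFS = {94, 90, 87, 84, 80, 77, 74, 70, 67, 64};
  // Letter grade earned at or above the cutoff with the same index.
  private static final String[] LETTERS =
      {"A", "A-", "B+", "B", "B-", "C+", "C", "C-", "D+", "D"};

  // Returns the letter for the first cutoff the grade meets, or "F" if none.
  // Assumes no rounding: the raw value is compared directly.
  static String letterFor(double grade) {
    for (int i = 0; i < CUTOFFS.length; i++) {
      if (grade >= CUTOFFS[i]) {
        return LETTERS[i];
      }
    }
    return "F";
  }

  public static void main(String[] args) {
    System.out.println(letterFor(94.0)); // prints A
    System.out.println(letterFor(93.9)); // prints A-
    System.out.println(letterFor(64.0)); // prints D
    System.out.println(letterFor(63.9)); // prints F
  }
}
```

Keeping the cutoffs in one array makes the boundaries easy to compare against the scale above; whether scores are rounded before the comparison is not stated here, so ask if a grade sits on a boundary.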

I use automatic grading routines to assign an initial grade for the projects. It is important to emphasize that the grade you obtain from Autolab is an initial grade and may not be your final grade. There are many important aspects of a program that are difficult or impossible to assess using automatic grading routines. For example, automatic grading routines cannot determine if you have written proper documentation. They cannot easily assess if an implementation of an operation is optimally efficient. As a consequence, I start with the initial grade you obtain from Autolab and take further deductions if necessary.

For complete implementations, I use the following distribution as a guide:

Notice that an implementation consisting entirely of method stubs would obtain an initial grade of 60%. Such an implementation is incomplete and would be subject to further deductions based on the effort required to implement the required operations.

In addition to the above, the following deductions may be taken if applicable:

Materials: Readings, Videos, and Links

Other Interesting Links

Copyright © 2019 Mark Maloof. All Rights Reserved. This material may not be published, broadcast, rewritten, or redistributed.