COSC-288: Introduction to Machine Learning

Fall 2018

Contents

Announcements

Where, When, Who

Class Time: TR 2:00 – 3:15 PM
Classroom: REI 284
   
Instructor: Mark Maloof
Office: 325 St. Mary's Hall
Mailbox: 329A St. Mary's Hall
Office Hours: M 12:00–1:30 PM, R 2:30–4:00 PM (or by appointment).

Description

This undergraduate course surveys the major research areas of machine learning. Through traditional lectures and programming projects, students learn (1) to understand the foundations of machine learning, (2) to implement methods of machine learning in a high-level programming language, (3) to comprehend papers from the primary literature, and (4) to design and conduct their own studies. The course compares and contrasts machine learning with related endeavors, such as statistical learning, pattern classification, data mining, and information retrieval. Topics include instance-based approaches, naive Bayes, decision trees, rule induction, linear classifiers, neural networks, support vector machines, ensemble methods, evaluation, and applications.

Prerequisites: Advanced Programming (COSC-150) and Data Structures (COSC-160).

Primary Text:

Learning Goals

By the end of the semester, students will be able to:

Policies

My course policies are designed to supplement the University's Undergraduate Honor System and the CS Department's Honor Policy. Unless stated otherwise when I distribute an assignment, the following is the default for all assignments for this course. I've developed my policies from past teaching experiences and from the CS Department's Honor Code at George Mason University.

I am obligated to refer all suspected cases of academic dishonesty by undergraduate students to Georgetown's Honor Council. If you have any questions about these policies or how they apply, please discuss such concerns with me during class, during office hours, or by e-mail.

In my experience, students at Georgetown do honest work. The small percentage of students who have submitted someone else's work as their own did so because they did not manage their time wisely.

Students must follow proper scholarly practice for all submitted work, whether graded or ungraded and whether a draft or final version of a proposal, paper, or program. We must acknowledge our reliance on the work of others through citation.

Students may be quite adept at and knowledgeable about citing and quoting material from traditional sources, such as books and articles. Typically, we do not have to cite facts, common math formulae, or expressions of our own ideas, observations, interpretations, and analyses. However, students new to computer science may not realize that formulae, theorems, proofs, algorithms, and programs can require the same treatment as any other form of expression.

For convenience, you do not need to cite the course materials, conversations with me, or information you obtain from class lectures and discussions. If you are unsure about what requires citation or what constitutes proper scholarly practice, please ask me during class, during office hours, or by e-mail.

I design my courses and assignments so students have what they need to complete the assignments individually without consulting outside resources. I determine the size of and credit for assignments based on the assumption that the work for them is the result of individual effort using only the course resources and materials. Students who use outside resources to complete assignments may not be eligible for full credit. Students who do not acknowledge their use of outside resources to complete assignments may be in violation of my course policies and the university's policies on academic integrity.

The following list details acceptable and unacceptable practices:

Policies dealing with logistics:

Assignments and Grading

String Grades::getLetterGrade()
{
  if (grade >= 94)
    return "A";
  else if (grade >= 90)
    return "A-";
  else if (grade >= 87)
    return "B+";
  else if (grade >= 84)
    return "B";
  else if (grade >= 80)
    return "B-";
  else if (grade >= 77)
    return "C+";
  else if (grade >= 74)
    return "C";
  else if (grade >= 70)
    return "C-";
  else if (grade >= 67)
    return "D+";
  else if (grade >= 64)
    return "D";
  else
    return "F";
} // Grades::getLetterGrade
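
For anyone who wants to check where a particular score falls, the cutoffs above can be exercised with a small standalone sketch. It makes a few assumptions that are not in the original: the function is a free function named letterGrade (hypothetical) rather than a member of the Grades class, whose definition is not shown; the numeric grade is a percentage passed in as a double; and std::string stands in for String. Like the original, it does no rounding.

```cpp
#include <string>

// Hypothetical standalone version of Grades::getLetterGrade().
// Assumption: the numeric grade is a percentage passed as a double.
// Each cutoff is inclusive, and the value is compared without rounding.
std::string letterGrade(double grade)
{
  if (grade >= 94)
    return "A";
  else if (grade >= 90)
    return "A-";
  else if (grade >= 87)
    return "B+";
  else if (grade >= 84)
    return "B";
  else if (grade >= 80)
    return "B-";
  else if (grade >= 77)
    return "C+";
  else if (grade >= 74)
    return "C";
  else if (grade >= 70)
    return "C-";
  else if (grade >= 67)
    return "D+";
  else if (grade >= 64)
    return "D";
  else
    return "F";
} // letterGrade
```

Under these assumptions, letterGrade(93.9) returns "A-" and letterGrade(64) returns "D", because each cutoff is inclusive and nothing is rounded up.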

Materials: Readings, Videos, and Links

Schedule

  1. Introduction: Definitions, Areas, History, Paradigms
  2. Instance-based learning: k-NN
  3. Probabilistic learning: MLE, Bayes' Theorem, naive Bayes
  4. Evaluation: Train/Test Methodologies, Measures, ROC Analysis
  5. Decision Trees: ID3, C4.5, Stumps
  6. Rule Learning: Ripper, OneR
  7. Midterm Exam
  8. Neural Networks: Linear classifiers, Perceptron
  9. Neural Networks: Multilayer networks, Back-propagation
  10. Support Vector Machines: Perceptron, Dual representation
  11. Support Vector Machines: Margins, Kernels, Training, SMO
  12. Ensemble Methods: Bagging, Boosting
  13. Ensemble Methods: Random Forests, Voting, Weighting

Other Interesting Links