Fall 2018

- Announcements
- Where, When, Who
- Description
- Learning Goals
- Policies
- Schedule
- Assignments and Grading
- Materials: Readings, Videos, and Links
- Other Interesting Links

- 12/13/18: Posted room for final exam: Car Barn 203.
- 11/20/18: Posted p5.
- 11/6/18: Posted p4.
- 10/15/18: Posted p3.
- 10/9/18: Changed the due date for p2 to 10/17.
- 9/26/18: Posted p2.
- 9/4/18: Updated the classroom to WAL 491
- 9/4/18: Updated the classroom to ICC 108
- 8/30/18: Posted p1.
- 8/29/18: Updated the classroom to WAL 498
- 8/26/18: Set exam and assignment dates
- 4/2/18: Created this Web page.

Class Time: TR 9:30–10:45 AM

Classroom:

Instructor: Mark Maloof

Office: 325 St. Mary's Hall

Mailbox: 329A St. Mary's Hall

Office Hours: In-person (325 STM): TR 11:00 AM–12:00 PM; online: M 10:30–11:30 AM and W 3:00–4:00 PM; or by appointment. Send me an email to get the Zoom link for online office hours.

This graduate lecture surveys the major research areas of machine learning focusing on classification. Through traditional lectures and programming projects, students learn (1) to understand the foundations of machine learning, (2) to design and implement methods of machine learning, (3) to evaluate methods of machine learning, and (4) to conduct empirical evaluations of multiple methods of machine learning. The course compares and contrasts machine learning with related endeavors, such as statistical learning, pattern classification, data mining, and information retrieval. Topics include Bayesian decision theory, instance-based approaches, Bayesian methods, decision trees, rule induction, density estimation, linear classifiers, support vector machines, neural networks, ensemble methods, learning theory, evaluation, and applications. Students complete five programming projects using Java. There are midterm and final exams.

Prerequisites: Students should have taken undergraduate courses in computer science through data structures; at the very least, students must be able to implement trees and graphs in a high-level object-oriented programming language. Students should have also taken undergraduate courses in mathematics, such as calculus, linear algebra, and probability and statistics. Students will use Java to complete the projects for this course.

Primary Text:

*Machine Learning: The Art and Science of Algorithms that Make Sense of Data*, by Peter Flach [ WWW | CUP | Amazon ]

By the end of the semester, students will be able to:

- explain the main foundations of machine learning
- understand and design object-oriented systems for machine learning
- implement methods of machine learning using a high-level programming language
- conduct performance evaluations of methods of machine learning
- design and conduct empirical studies

My course policies are designed to supplement the CS Department's Honor Policy. Unless stated otherwise when I distribute an assignment, the following is the default for all assignments for this course. I've developed my policies from past teaching experiences and from the CS Department's Honor Code at George Mason University.

I am obligated to refer all suspected cases of academic dishonesty by master's students to the Honor Council. I am obligated to refer all suspected cases of academic dishonesty by doctoral students to the dean of the Graduate School. If you have any questions about these policies or how they apply, please discuss such concerns with me during class, during office hours, or by e-mail.

In my experience, students at Georgetown do honest work. The small percentage of students who have submitted someone else's work as their own did so because they did not manage their time wisely.

Students must follow proper scholarly practice for all submitted work, whether graded or ungraded and whether a draft or final version of a proposal, paper, or program. We must acknowledge our reliance on the work of others through citation.

Students may be quite adept at and knowledgeable about citing and quoting material from traditional sources, such as books and articles. Typically, we do not have to cite facts, common math formulae, or expressions of our own ideas, observations, interpretations, and analyses. However, new graduate students in computer science may not realize that formulae, theorems, proofs, algorithms, and programs can require the same treatment as any other form of expression.

For convenience, you do not need to cite the course materials, conversations with me, or information you obtain from class lectures and discussions. If you are unsure about what requires citation or what constitutes proper scholarly practice, please ask me during class, during office hours, or by e-mail.

I design my courses and assignments so students have what they need to complete the assignments individually without consulting outside resources. I determine the size of and credit for assignments based on the assumption that the work for them is the result of individual effort using only the course resources and materials. Students who use outside resources to complete assignments may not be eligible for full credit. Students who do not acknowledge their use of outside resources to complete assignments may be in violation of my course policies and the university's policies on academic integrity.

The following list details acceptable and unacceptable practices:

- You can:
- obtain assistance in understanding course materials (textbooks, lecture notes, assignments);
- obtain assistance in learning to use the computing facilities;
- obtain assistance in learning to use special features of a programming language's implementation;
- obtain assistance in determining the syntactic correctness of a particular programming language statement or construct;
- obtain an explanation of a particular syntactic error;
- obtain explanations of compilation or run-time error messages.

- You can obtain assistance only from me and the teaching assistants:
- in designing the data structures and algorithms used in your solution;
- in modifying the design of an algorithm or data structure determined to be faulty;
- in implementing your algorithm or data structure in a programming language;
- in correcting a faulty implementation of your algorithm or data structure;
- in determining the semantic correctness of your program;
- in designing an experimental study and interpreting its results.

- You cannot:
- show or give a copy of your work in any amount or form to another student;
- see or receive a copy of someone else's work in any amount or form;
- attempt to gain access to files other than your own or those that I designate and authorize;
- attempt to reverse engineer routines used for automatic grading;
- inspect or retain in your possession another student's work, whether another student gave it to you, you found it after another student discarded it, or it accidentally came into your possession;
- collaborate in any way with someone else in the design, implementation, or logical revision of an algorithm;
- use or present as your own any algorithm, data structure, or implementation that is not your own or my design, or that is not part of the course's required reading. If you modify any procedure presented in the course's texts that is not specifically mentioned in class or covered in reading assignments, you must provide a citation with a page number;
- incorporate code written by others (such as can be found on the Internet).

Policies dealing with logistics:

- You have permission to occasionally take digital photos of the material on the board for your personal use, but you may not post these photos on the Internet. It is important to understand that all of the course material is covered by copyright, either mine or someone else's.
- You should submit all assignments on time. For late projects, there will be a 1% deduction for each minute after the deadline. In the real world, it won't be your grade that decreases. It'll be your stock price.
- I grant extensions only to students who have documentation for a medical issue, a family emergency, or an accommodation from the Academic Resource Center. In the cases of a medical issue or a family emergency, it would be best to coordinate with your advising dean since these situations often affect your work in all of your classes.
- If you use your laptop for development, you should keep a backup of your projects on a university or department machine (e.g., cs-class) so I can verify that you completed the work for an assignment before the deadline.
- You must take the final exam with the section and during the period designated by the Registrar.
- It is my job to maintain a constructive learning environment for everyone. Please silence your cell phone and other electronic devices.
- If you must miss class, be sure to get the lecture notes from a classmate.
- It is fine if you must leave class early, arrive late, or leave the room to answer your phone, but you should do so in a manner that does not disturb your fellow students.
- In the case of inclement weather that results in the university's closure, we will meet virtually during normal class times using Zoom.

- Introduction: Definitions, Areas, History, Paradigms
- Bayesian Decision Theory
- Instance-based learning: *k*-NN, *kd*-trees
- Probabilistic learning: MLE, Bayes' Theorem, MAP, naive Bayes, Bayesian naive Bayes
- Density estimation: Parametric, Non-parametric, Bayesian
- Evaluation: Train/Test Methodologies, Measures, ROC Analysis
- Decision Trees: ID3, C4.5, Stumps, VFDT
- Rule Learning: Ripper, OneR
- Midterm Exam
- Neural Networks: Linear classifiers, Perceptron
- Neural Networks: Multilayer networks, Back-propagation
- Support Vector Machines: Perceptron, Dual representation
- Support Vector Machines: Margins, Kernels, Training, SMO
- Ensemble Methods: Bagging, Boosting
- Ensemble Methods: Random Forests, Voting, Weighting
- Hidden Variables: *k*-means, Expectation-Maximization

- Programming Projects, 50%
- Project 1, assigned R 8/30, due R 9/27 @ 5 PM, 10 points
- Project 2, assigned R 9/27, due ~~M 10/15~~ W 10/17 @ 5 PM, 10 points
- Project 3, assigned M 10/15, due T 11/6 @ 5 PM, 10 points
- Project 4, assigned T 11/6, due W 11/21 @ 11:59 PM, 10 points
- Project 5, assigned W 11/21, due M 12/10 @ 11:59 PM, 10 points
- Midterm Exam, R 10/18, 20%
- Final Exam, R 12/20 9–11am in CBN 203, 30%

```java
String getLetterGrade() {
    if (grade >= 94) return "A";
    else if (grade >= 90) return "A-";
    else if (grade >= 87) return "B+";
    else if (grade >= 84) return "B";
    else if (grade >= 80) return "B-";
    else if (grade >= 67) return "C";
    else return "F";
} // Grades.getLetterGrade
```

I use automatic grading routines to assign an initial grade for the projects. It is important to emphasize that the grade you obtain from Autolab is an initial grade and may not be your final grade. There are many important aspects of a program that are difficult or impossible to assess using automatic grading routines. For example, automatic grading routines cannot determine whether you have written proper documentation. They cannot easily assess whether an implementation of an operation is optimally efficient. As a consequence, I start with the initial grade you obtain from Autolab and take further deductions if necessary.

For complete implementations, I use the following distribution as a guide:

- 40%: Compiles against the autograder
- 20%: Executes without failure using the autograder
- 40%: Autograder unit tests
- 10%: Internal documentation, if required
- Purpose of classes and methods explained in documentation comments
- Purpose, range, and meaning of identifiers explained, where needed
- Complex flow of control explained
- 10%: Style and formatting
- Nested indentation for loops and conditionals
- All class, method, and function headers emphasized
- Comments set off from code
- Vertical alignment of comments, where appropriate
- White space between blocks of code (and comments)
- Mnemonic identifier names
- 80%: Algorithm and Implementation
- Correct implementation of the operations
- Proper object-oriented design and implementation
- Appropriate error checking and diagnostic messages
- Appropriate data and object types
- Correct, clean, organized output format

Notice that an implementation consisting entirely of method stubs would obtain an initial grade of 60%. Such an implementation is incomplete and would be subject to further deductions based on the effort required to implement the required operations.

In addition to the above, the following deductions may be taken if applicable:

- 20%: Does not compile on `cs-class`
- 1–10%: Incomplete or improper submission
- 1–5%: My effort for fixing any minor issue
- 1–5%: Inefficiently implemented routine
- 1% per minute: Late deduction
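To illustrate how the distribution and deductions above combine, here is a rough sketch of the arithmetic in Java. The class, method, and parameter names are my own for illustration, not the actual autograder's, and flooring the grade at zero is an assumption:

```java
// Hypothetical sketch of the initial-grade arithmetic: 40% for compiling,
// 20% for executing without failure, 40% scaled by unit tests passed,
// minus 1% per minute late. Names and the zero floor are assumptions.
public class ProjectGrade {

    public static double initialGrade(boolean compiles, boolean executes,
                                      int testsPassed, int testsTotal,
                                      int minutesLate) {
        double score = (compiles ? 40.0 : 0.0)
                     + (executes ? 20.0 : 0.0)
                     + 40.0 * testsPassed / testsTotal;
        score -= minutesLate;          // 1% deduction per minute late
        return Math.max(score, 0.0);   // assumed floor at zero
    }

    public static void main(String[] args) {
        // A submission of method stubs compiles and runs but passes no tests.
        System.out.println(initialGrade(true, true, 0, 10, 0));   // 60.0
    }
}
```

Under this sketch, a stub-only submission earns the 60% initial grade mentioned above, before any further deductions.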

- Autolab, for submitting projects
- Piazza, for online discussion
- Canvas, for document distribution and for submitting projects if Autolab is down
- Perez-Hernandez, D. (28 March 2014). Taking notes by hand benefits recall, researchers find. *The Chronicle of Higher Education*. (Read if interested.)
- Mueller, P. A. and Oppenheimer, D. M. (2014). The pen is mightier than the keyboard: Advantages of longhand over laptop note taking. *Psychological Science*, 25(6):1159–1168. (Read if interested.)
- Flach, P. (2012). *Machine learning: The art and science of algorithms that make sense of data*. Cambridge University Press, Cambridge.
- Mitchell, T. M. (1997). *Machine learning*. McGraw-Hill, New York, NY.
- Murphy, K. P. (2012). *Machine learning: A probabilistic perspective* [electronic resource]. MIT Press, Cambridge, MA.
- Gomes, L. (20 Oct 2014). Machine-learning maestro Michael Jordan on the delusions of Big Data and other huge engineering efforts. *IEEE Spectrum*.
- Domingos, P. (2012). A few useful things to know about machine learning. *Communications of the ACM*, 55(10):78–87.
- Duda, R. O., Hart, P. E., and Stork, D. G. (2000). *Pattern classification*. John Wiley & Sons, New York, NY.
- Slides: Bayesian Decision Theory.
- JASON (2017). Perspectives on Research in Artificial Intelligence and Artificial General Intelligence Relevant to DoD. Technical Report JSR-16-Task-003. The MITRE Corporation, 7515 Colshire Drive, McLean, VA 22102-7508.
- Provost, F., Fawcett, T., and Kohavi, R. (1998). The case against accuracy estimation for comparing induction algorithms. In *Proceedings of the Fifteenth International Conference on Machine Learning*, 445–453. Morgan Kaufmann, San Francisco, CA.
- Fawcett, T. (2006). An introduction to ROC analysis. *Pattern Recognition Letters*, 27(8):861–874.
- Slides: Kernel-Density Estimation.
- Domingos, P. and Hulten, G. (2000). Mining high-speed data streams. In *Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining*, 71–80. ACM Press, New York, NY.
- Cohen, W. W. (1995). Fast effective rule induction. In *Proceedings of the Twelfth International Conference on Machine Learning*, 115–123. Morgan Kaufmann, San Francisco, CA.
- Goodfellow, I., Bengio, Y., and Courville, A. (2017). Deep feedforward networks. In *Deep Learning*. MIT Press, Cambridge, MA.
- Rojas, R. (1996). The backpropagation algorithm. In *Neural Networks*. Springer, Berlin-Heidelberg.
- Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In *Advances in Neural Information Processing Systems 25*, 1097–1105. Curran Associates, Inc., Red Hook, NY.
- Montavon, G., Orr, G. B., and Müller, K.-R. (2012). *Neural networks: Tricks of the trade*, 2nd edition. Lecture Notes in Computer Science, Volume 7700. Springer, Berlin-Heidelberg.
- LeCun, Y. A., Bottou, L., Orr, G. B., and Müller, K.-R. (2012). Efficient BackProp. In *Neural networks: Tricks of the trade*, 9–48. Lecture Notes in Computer Science, Volume 7700. Springer, Berlin-Heidelberg.
- Hearst, M. A., et al. (1998). Support vector machines. *IEEE Intelligent Systems and their Applications*, 13(4):19–28.
- Müller, K.-R., et al. (2001). An introduction to kernel-based learning algorithms. *IEEE Transactions on Neural Networks*, 12(2):181–201.

- Google Scholar
- Blog post: Practical Advice for Building Deep Neural Networks
- Article: The Great AI Awakening
- Article: Meet Cepheus, the virtually unbeatable poker-playing computer
- Article: Heads-up limit hold'em poker is solved
- Survey: Research Leaders on Data Mining, Data Science, and Big Data key trends, top papers

*Copyright © 2019 Mark Maloof. All Rights Reserved. This material may not be published, broadcast, rewritten, or redistributed.*