Fall 2022
Class Time: | TR 9:30 AM – 10:45 AM ET |
Classroom: | WAL 398 |
Instructor: | Mark Maloof |
Office: | 325 St. Mary's Hall |
Mailbox: | 329A St. Mary's Hall |
Office Hours: | None for 24–25 academic year. |
This undergraduate course surveys the major areas of machine learning focusing on classification. Through traditional lectures and programming projects, students learn (1) to understand the foundations of machine learning, (2) to design and implement methods of machine learning, (3) to evaluate methods of machine learning, and (4) to conduct empirical evaluations of multiple methods of machine learning. The course compares and contrasts machine learning with related endeavors, such as statistical learning, pattern classification, data mining, and information retrieval. Topics include instance-based approaches, naive Bayes, decision trees, rule induction, linear classifiers, support vector machines, neural networks, ensemble methods, evaluation, and applications. Students complete five programming projects using Java. There are midterm and final exams.
Prerequisites: Advanced Programming (COSC-150) and Data Structures (COSC-160).
Primary Text:
By the end of the semester, students will be able to:
My course policies are designed to supplement the University's Undergraduate Honor System and the CS Department's Honor Policy. Unless stated otherwise when I distribute an assignment, the following is the default for all assignments for this course. I have developed my policies from past teaching experiences and from the CS Department's Honor Code at George Mason University.
I am obligated to refer all suspected cases of academic dishonesty by undergraduate and master's students to Georgetown's Honor Council. I am obligated to refer all suspected cases of academic dishonesty by doctoral students to the dean of the Graduate School. If you have any questions about these policies or how they apply, please discuss such concerns with me during class, during office hours, by e-mail, or using the class's discussion board.
In my experience, students at Georgetown do honest work. The very small percentage of students who have submitted someone else's work as their own did so because they did not manage their time wisely.
Students must always follow proper scholarly practice for all submitted work, whether graded or ungraded and whether a draft or final version of a proposal, paper, program, or problem set. As scholars, we must acknowledge our reliance on the work of others through citation. You can never submit someone else's work as your own without proper attribution. The University assumes that students learned how to properly cite material at their previous institutions. If this is not the case, please let me know.
Indeed, students may be quite adept at and knowledgeable about citing and quoting material from traditional sources, such as books and articles. Typically, we do not have to cite facts, common math formulae, or expressions of our own ideas, observations, interpretations, and analyses. However, students new to computer science may not realize that theorems, proofs, algorithms, programs, and code fragments may require the same treatment as any other form of expression.
For convenience, you do not need to cite the course materials, which include the syllabus, sources linked from the syllabus, the course textbook, the class lectures and discussion, posts on the discussion board, and conversations with me and course assistants. You must, however, cite the use of any resource that is not part of the course materials. Note that “the textbook” does not extend to the textbook's Web site, its contents, lecture slides, solution manuals, code repositories, or any other material related to the textbook. The syllabus links to any such material pertinent for the class. If you are unsure about what requires citation or what constitutes proper scholarly practice, please ask me during class, during office hours, by e-mail, or using the discussion board.
I design my courses and assignments so students have what they need to complete the assignments individually without consulting outside resources. I determine the size of and credit for assignments based on the assumption that the work for them is the result of individual effort using only the course resources and materials. Students who use outside resources to complete assignments may not be eligible for full credit. Students who do not acknowledge their use of outside resources to complete assignments may be in violation of my course policies and the university's policies on academic integrity.
The materials that I create and use for my courses (“Course Materials”) are my intellectual property. You may not disseminate or reproduce them in any form for public distribution (e.g., sale, exchange, etc.) without my written permission. Course Materials include all written or electronic documents and materials that I provide, including but not limited to syllabi, current and past assessments and their solutions (e.g., exams, homeworks, projects, problem sets, etc.), and presentations such as lectures, videos, slides, etc. Course Materials may only be used by students enrolled in the course for academic (i.e., course-related) purposes. Furthermore, your solutions to assessments are derivative works of my copyrighted material and are therefore subject to that protection because they necessarily incorporate my protected expression. Consequently, you may not further disseminate or reproduce in any form for distribution (e.g., uploading to websites, sale, exchange, etc.) your solutions to assessments.
Published course readings (book chapters, articles, reports, etc.) available in Canvas are copyrighted material. I make these works available to students through licensed databases or fair use. They are protected by copyright law, and may not be further disseminated or reproduced in any form for distribution (e.g., uploading to websites, sale, exchange, etc.) without permission of the copyright owner.
You can find more information about intellectual property and copyright here: https://www.library.georgetown.edu/copyright. You can find more information about computer acceptable use policy and intellectual property here: https://security.georgetown.edu/it-policies-procedures/computer-systems-aup.
Copyright issues aside, if you post your solutions to assessments on the Internet, it is possible that students in future classes will find your solutions and submit them as their own work without attribution. Naturally, students who do so violate my course policies, and the Honor Council will with high probability find them in violation and sanction them. The Honor Council may also find you in violation because you facilitated cheating and violated copyright law and the Web site's terms of service.
I understand that students often need to provide prospective employers with a portfolio of their work to secure a job or an internship. I recommend that students devise a scheme for doing so that does not violate copyright law, does not violate the terms of service of the site on which they post material protected by copyright, and does not facilitate cheating.
The following list details acceptable and unacceptable practices:
Policies dealing with logistics:
Absences will be excused for participation in university-sponsored varsity athletics (with confirmation from the appropriate contact) and for religious holidays. For a list of religious holidays that are excused absences, please visit https://campusministry.georgetown.edu/religious_holy_days. You are responsible for any material covered during your absence.
Absences for minor illnesses do not count as excused, but rather count against your three allowed absences. If you will be missing more than two classes because of a serious illness or because you have been asked to quarantine, you should contact your advising dean, who will help you work with all of your professors to formulate a recovery plan. The same goes for other types of emergencies, such as family illnesses. We will deal with such absences in a compassionate way, but since they affect all of your courses, you will work with your advising dean to make a plan for getting back on track.
Finally, if you must miss class for any reason, be sure to get the lecture notes from a classmate.
String Grades::getLetterGrade()
{
  if (grade >= 94) return "A";
  else if (grade >= 90) return "A-";
  else if (grade >= 87) return "B+";
  else if (grade >= 84) return "B";
  else if (grade >= 80) return "B-";
  else if (grade >= 77) return "C+";
  else if (grade >= 74) return "C";
  else if (grade >= 70) return "C-";
  else if (grade >= 67) return "D+";
  else if (grade >= 64) return "D";
  else return "F";
} // Grades::getLetterGrade
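The cutoffs above can also be written as a small, self-contained Java sketch, since Java is the language used for the course projects. This is only an illustration under stated assumptions: the class name `LetterGradeSketch`, the `double` parameter, and the static method are hypothetical choices (the snippet above reads a `grade` member instead), and the grade is assumed to be a percentage that is compared against the cutoffs without rounding.

```java
// Hypothetical restatement of the syllabus's letter-grade cutoffs.
// The class name and the method signature are assumptions for illustration.
final class LetterGradeSketch {

    // Maps a numeric grade (assumed to be a percentage) to a letter grade
    // using the same cutoffs, in the same order, as the syllabus snippet.
    static String getLetterGrade(double grade) {
        if (grade >= 94) return "A";
        else if (grade >= 90) return "A-";
        else if (grade >= 87) return "B+";
        else if (grade >= 84) return "B";
        else if (grade >= 80) return "B-";
        else if (grade >= 77) return "C+";
        else if (grade >= 74) return "C";
        else if (grade >= 70) return "C-";
        else if (grade >= 67) return "D+";
        else if (grade >= 64) return "D";
        else return "F";
    }

    // Prints the letter grade for a few sample values near the cutoffs.
    public static void main(String[] args) {
        double[] samples = {95.0, 89.9, 64.0, 63.9};
        for (double g : samples) {
            System.out.println(g + " -> " + getLetterGrade(g));
        }
    }
}
```

Because this sketch does no rounding, a grade of 89.9 maps to "B+" rather than "A-". Whether grades are rounded before the cutoffs are applied is not stated in the syllabus.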
I use automatic grading routines to assign an initial grade for the projects. It is important to emphasize that the grade you obtain from Autolab is an initial grade and may not be your final grade. There are many important aspects of a program that are difficult or impossible to assess using automatic grading routines. For example, automatic grading routines cannot determine if you have written proper documentation. They cannot easily assess if an implementation of an operation is optimally efficient. As a consequence, I start with the initial grade you obtain from Autolab and take further deductions if necessary. I grade the last submission for practical and pedagogical reasons.
For complete implementations, I use the following distribution as a guide:
Notice that an implementation consisting entirely of method stubs would obtain an initial grade of 60%. Such an implementation is incomplete and would be subject to further deductions based on the effort required to implement the required operations.
In addition to the above, the following deductions may be taken if applicable:
Copyright © 2022 Mark Maloof. All Rights Reserved. This material may not be published, broadcast, rewritten, or redistributed.