COSC-575: Machine Learning

Project 5
Fall 2018

Due: M 12/10 @ 11:59 P.M.
10 points

In this project, you will implement online learners and reproduce some of the experimental results in Carvalho and Cohen's (2006) paper "Single-pass online learning: Performance, voting schemes and online feature selection." Implement Balanced Winnow as the class BW and Modified Balanced Winnow as the class MBW. Implement voting for both learners, and use the switch -v to turn on voting.
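As a rough orientation only, the mistake-driven update at the heart of Balanced Winnow can be sketched as follows. This is a minimal sketch, not the required BW class: the class name BalancedWinnowSketch, its method names, and every default parameter value are illustrative assumptions, so check the update rule and parameter settings against the paper. It also omits voting, the margin-based changes of Modified Balanced Winnow, and any preprocessing of the examples.

```python
from typing import List, Sequence


class BalancedWinnowSketch:
    """Hypothetical sketch of a Balanced Winnow learner for labels +1 and -1."""

    def __init__(self, n_features: int, alpha: float = 1.5, beta: float = 0.5,
                 threshold: float = 1.0, u0: float = 2.0, v0: float = 1.0) -> None:
        # All default values are assumptions for illustration only.
        self.alpha = alpha              # promotion factor, alpha > 1
        self.beta = beta                # demotion factor, 0 < beta < 1
        self.threshold = threshold      # decision threshold
        self.u: List[float] = [u0] * n_features  # positive model
        self.v: List[float] = [v0] * n_features  # negative model

    def score(self, x: Sequence[float]) -> float:
        # <x, u> - <x, v> - threshold
        total = sum((u - v) * xj for u, v, xj in zip(self.u, self.v, x))
        return total - self.threshold

    def predict(self, x: Sequence[float]) -> int:
        return 1 if self.score(x) > 0 else -1

    def update(self, x: Sequence[float], y: int) -> bool:
        """Process one example; return True if the weights changed."""
        if self.predict(x) == y:
            return False  # no mistake, so no update
        # On a mistake, promote one model and demote the other,
        # but only for the features that are active in x.
        up, down = (self.alpha, self.beta) if y > 0 else (self.beta, self.alpha)
        for j, xj in enumerate(x):
            if xj > 0:
                self.u[j] *= up
                self.v[j] *= down
        return True
```

A single pass over a training set would call update once per example, in order. Only the weights of active features change, which is why the second feature is untouched when the example is [1.0, 0.0].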

Replicate the experiment described in the paper and reproduce the results presented in Table 3 for the methods above and the data sets Nursery, Wisc (breast-cancer.mff), Congress (votes.mff), and Adult, which you can retrieve from the UCI Machine Learning Repository. Implement the method Performance.getAvgF1, which returns the average F1 measure. You can assume that the autograder will pass to these learners the modified data sets with the original nominal features, as described in the paper. You do not have to use a two-tailed t-test to determine if the results are statistically significant.
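For reference, the F1 measure is the harmonic mean of precision and recall, which simplifies to 2TP / (2TP + FP + FN). The sketch below shows one way to compute it and to average it over several evaluations. The function names f1_score and average_f1 are hypothetical stand-ins, not the required Performance.getAvgF1, and the choice to average per fold and to return 0.0 when the denominator is zero are assumptions.

```python
from typing import Iterable, Tuple


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    # Assumption: report 0.0 when there are no positives of any kind.
    return 2 * tp / denom if denom else 0.0


def average_f1(counts: Iterable[Tuple[int, int, int]]) -> float:
    """Mean F1 over a sequence of (tp, fp, fn) triples, e.g. one per fold."""
    scores = [f1_score(tp, fp, fn) for tp, fp, fn in counts]
    return sum(scores) / len(scores) if scores else 0.0
```

For example, a fold with 8 true positives, 2 false positives, and 2 false negatives has F1 = 16/20 = 0.8.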

In one page or less, write a report that compares your experimental results with those reported in Carvalho and Cohen (2006). Include a PDF version of your report with your submission. Please name it report.pdf.

Other resources that may be helpful:

The implementations must follow sound principles of object-oriented design. Implement each learner as a single executable. No windows. No menus. No prompts. Just do it.

The logic of the implementation should be the same as that for the previous implementations. If the user runs a learner and specifies only a training set, then the program should evaluate using 10-fold cross-validation and output the results. Naturally, the user can use the -x switch to change the default. If the user provides the -p switch and a proportion, then the program conducts an evaluation using the hold-out method. Otherwise, if the user specifies both a training and testing set, then the program should build a model from the training set, evaluate it on the testing set, and output the results.
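The selection logic described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name choose_evaluation and its parameter names are hypothetical, the values are assumed to have been parsed from the command line already, and the order of the checks (-p first, then a testing set, then cross-validation) follows the wording of the paragraph rather than any confirmed specification.

```python
from typing import Optional, Tuple, Union


def choose_evaluation(train_path: str,
                      test_path: Optional[str] = None,
                      folds: Optional[int] = None,
                      proportion: Optional[float] = None
                      ) -> Tuple[str, Union[int, float, str]]:
    """Return the evaluation method and its main parameter.

    train_path is listed only to show that a training set is always given.
    """
    if proportion is not None:
        # -p with a proportion: hold-out evaluation.
        if not 0.0 < proportion < 1.0:
            raise ValueError("proportion must be strictly between 0 and 1")
        return ("hold-out", proportion)
    if test_path is not None:
        # Training and testing sets: build on one, evaluate on the other.
        return ("train-test", test_path)
    # Only a training set: k-fold cross-validation, 10 folds unless -x is given.
    return ("cross-validation", 10 if folds is None else folds)
```

Keeping this decision in one place, separate from the learners, lets BW and MBW share the same evaluation code.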

Instructions for Submission

In a file named HONOR, please include the statement:
In accordance with the class policies and Georgetown's Honor Code,
I certify that, with the exceptions of the class resources and those
items noted below, I have neither given nor received any assistance
on this project.
Include this file in your zip file submit.zip.

Submit p5 exactly like you submitted p4. Make sure you remove all debugging output before submitting.

Plan B

If Autolab is down, upload your zip file to Canvas.

Copyright © 2019 Mark Maloof. All Rights Reserved. This material may not be published, broadcast, rewritten, or redistributed.