COSC-288: Machine Learning

Project 3
Fall 2022

Due: F 11/4 @ 5:00 P.M.
9 points

Implement Flach's tree learner for nominal attributes with the modifications we discussed in lecture: Use gain ratio to determine the best attribute for splitting. Implement pre-pruning for nodes with three or fewer examples.
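
As a rough illustration of the two modifications, the sketch below computes gain ratio in its usual form (information gain divided by split information) and checks the pre-pruning condition. It is a sketch under assumptions, not the required design: the language (Python), the representation of examples as dictionaries, and the names `entropy`, `gain_ratio`, `best_attribute`, and `is_prunable` are all hypothetical, and the definition presented in lecture takes precedence if it differs.

```python
from collections import Counter
from math import log2

MIN_EXAMPLES = 3  # pre-pruning threshold taken from the project statement


def entropy(values):
    """Shannon entropy (in bits) of a sequence of nominal values."""
    n = len(values)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(examples, attribute, class_attr):
    """Gain ratio of splitting `examples` (a list of dicts) on `attribute`."""
    n = len(examples)
    # Group the class labels by the value of the candidate attribute.
    partitions = {}
    for ex in examples:
        partitions.setdefault(ex[attribute], []).append(ex[class_attr])
    remainder = sum(len(p) / n * entropy(p) for p in partitions.values())
    gain = entropy([ex[class_attr] for ex in examples]) - remainder
    # Split information is the entropy of the attribute's own values.
    split_info = entropy([ex[attribute] for ex in examples])
    # Assumption: an attribute with one observed value gets a ratio of 0.
    return gain / split_info if split_info > 0 else 0.0


def best_attribute(examples, attributes, class_attr):
    """Return the attribute with the highest gain ratio."""
    return max(attributes, key=lambda a: gain_ratio(examples, a, class_attr))


def is_prunable(examples):
    """Pre-pruning test: a node with three or fewer examples becomes a leaf."""
    return len(examples) <= MIN_EXAMPLES
```

A learner built this way would call `is_prunable` before attempting a split and fall back to a leaf labeled with the node's majority class when it returns true.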

Extend the Evaluator class from p2 to perform evaluation using the hold-out method. Use the -p switch to let users specify the proportion of the examples to use as training data. You do not need to worry about users entering conflicting options.
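
The hold-out method can be sketched as a shuffle followed by a single cut. The function below is a minimal illustration, not the Evaluator class from p2: the name `holdout_split`, the `seed` parameter, and the choice to round the cut point are assumptions made for the example.

```python
import random


def holdout_split(examples, proportion, seed=None):
    """Return (train, test), with about `proportion` of the examples in train."""
    if not 0.0 < proportion < 1.0:
        raise ValueError("proportion must be strictly between 0 and 1")
    shuffled = list(examples)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)
    cut = int(round(proportion * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]
```

The value given with -p would be passed as `proportion`; the learner is then trained on the first list and evaluated on the second.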

Finally, implement ZeroRule, a classifier that always predicts the majority class of the training examples, regardless of the observation. This classifier is called ZeroRule presumably because it is similar to an if-then rule with zero conditions. In fact, it is equivalent to the rule true → majority class. It is also equivalent to a decision rule that always predicts the class with the highest prior probability. It is good practice to include ZeroRule in an evaluation of learning methods because it provides a baseline: a classifier's performance is informative only in terms of how much it improves over ZeroRule.
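
A minimal sketch of the idea follows. The class and method names here (`train`, `classify`) are hypothetical and are not the declarations provided in p3.zip; ties between classes are broken by whichever class `Counter` lists first, which is also an assumption.

```python
from collections import Counter


class ZeroRule:
    """Baseline classifier: always predicts the training set's majority class."""

    def __init__(self):
        self.majority = None

    def train(self, labels):
        """Record the most frequent class label among the training examples."""
        if not labels:
            raise ValueError("cannot train on an empty set of labels")
        self.majority = Counter(labels).most_common(1)[0][0]
        return self

    def classify(self, observation=None):
        """Ignore the observation and return the stored majority class."""
        return self.majority
```

Because `classify` never looks at its argument, the accuracy of this classifier on a test set equals the fraction of test examples that belong to the training set's majority class.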

  • To help you get started, I put some class and method declarations on cs-class in p3.zip, which you can retrieve using the commands:
    cs-class-1% cd
    cs-class-1% cp ~maloofm/cosc288/p3.zip ./
    cs-class-1% unzip p3.zip
    

    The implementation must be general. It must follow sound object-oriented design principles. Implement the learner as a single executable. No windows. No menus. No prompts. Just do it.

    The logic of the implementation should be as follows. The user must provide a training set (using the -t switch). By default, the program should evaluate the method on the training set using 10-fold cross-validation and output the results. Naturally, the user can use the -x switch to change the default number of folds. If the user uses the -p switch, then the program should use the hold-out method to evaluate the method and print the results. Finally, if the user uses the -T switch, then the program should use the examples in the testing file to evaluate the method and print the results. You can assume that users will not give conflicting switches for evaluation.
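
The decision logic above can be sketched with Python's standard `argparse` module. This is an illustration under assumptions rather than the required structure: the function names, the `dest` names, and the reading of -x as the number of folds are hypothetical, and the order of the checks is arbitrary because conflicting switches are assumed not to occur.

```python
import argparse


def build_parser():
    """Build a parser for the four switches named in the project statement."""
    parser = argparse.ArgumentParser(description="Sketch of the p3 option logic")
    parser.add_argument("-t", dest="train_file", required=True,
                        help="training set (required)")
    parser.add_argument("-T", dest="test_file",
                        help="testing set; evaluate on these examples")
    parser.add_argument("-x", dest="folds", type=int, default=10,
                        help="number of folds (assumed meaning of -x)")
    parser.add_argument("-p", dest="proportion", type=float,
                        help="proportion of examples used for training")
    return parser


def evaluation_mode(args):
    """Map parsed switches to one of the three evaluation methods."""
    if args.test_file is not None:
        return "test-set"
    if args.proportion is not None:
        return "hold-out"
    return "cross-validation"  # default: k-fold on the training set
```

For example, parsing only `-t` with a file name selects cross-validation with 10 folds, while adding `-p 0.75` selects the hold-out method.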

    Instructions for Submission

    In a file named HONOR, please include the statement:
    In accordance with the class policies and Georgetown's Honor System,
    I certify that, with the exceptions of the class resources and those
    items noted below, I have neither given nor received any assistance
    on this project.
    
    Name
    NetID
    
    Include this file in your zip file submit.zip.

    Submit p3 exactly like you submitted p2. Make sure you remove all debugging output before submitting.

    Plan B

    If Autolab is down, upload your zip file to Canvas.

    Copyright © 2019 Mark Maloof. All Rights Reserved. This material may not be published, broadcast, rewritten, or redistributed.