COSC 288: Introduction to Machine Learning

Project 3
Spring 2016

Due: Wed, Mar 23 @ 11:59 P.M.
10 points

Building upon your implementations for p1 and p2, implement ID3. Also implement routines for hold-out evaluation and for evaluation on a separate testing set.

Tasks:

  1. Implement ID3.

  2. Extend the Evaluator class from p2 to perform evaluation using the hold-out method. Use the -p switch to let users specify the proportion of the examples to use as training data.

  3. Extend the Evaluator class from p2 to perform evaluation on a set of testing examples that a user provides using the -T switch.
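
As an illustration of the hold-out method in task 2, here is a minimal Python sketch. The function name holdout_split, the seed parameter, and the choice to shuffle before splitting are assumptions made for this example; they are not part of the required Evaluator interface.

```python
import random


def holdout_split(examples, p, seed=None):
    """Split examples into (train, test) for hold-out evaluation.

    p is the proportion of examples used for training (what -p supplies).
    The seed parameter is a hypothetical convenience for repeatable runs.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    shuffled = list(examples)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)   # shuffle to avoid ordering effects
    cut = round(p * len(shuffled))          # number of training examples
    return shuffled[:cut], shuffled[cut:]
```

With p = 0.7 and ten examples, this sketch yields seven training examples and three testing examples. Evaluation on a user-supplied testing file (task 3) needs no split at all: train on everything from -t and test on everything from -T.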

The implementation must be general in the sense that it must work for all data sets with nominal attributes and nominal class labels. Our convention is that the last attribute of the attribute declarations is the class label.
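
To make the convention concrete, the following Python sketch shows the information-gain calculation that ID3 uses to choose an attribute, written for examples stored as tuples of nominal values with the class label in the last position. The tuple representation and the function names are assumptions for illustration only; your p1 data structures will differ, and this is not a complete ID3.

```python
import math
from collections import Counter


def entropy(examples):
    """Entropy (in bits) of the class labels; the label is the last value."""
    counts = Counter(ex[-1] for ex in examples)
    n = len(examples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(examples, attr_index):
    """Reduction in entropy from partitioning on one nominal attribute."""
    n = len(examples)
    partitions = {}
    for ex in examples:
        partitions.setdefault(ex[attr_index], []).append(ex)
    remainder = sum(len(part) / n * entropy(part)
                    for part in partitions.values())
    return entropy(examples) - remainder


def best_attribute(examples, attr_indices):
    """Index of the candidate attribute with the highest information gain."""
    return max(attr_indices, key=lambda i: information_gain(examples, i))
```

Because nothing here depends on attribute names or on the number of values an attribute takes, the same code applies to any data set with nominal attributes and nominal class labels.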

Here's a new data set: soybean.mff. The task is to diagnose a soybean disease based on environmental factors and the condition of the plant.

Implement ID3 as a single executable. No windows. No menus. No prompts. Just do it.

The logic of each implementation should be as follows. The user must provide a training set (using the -t switch). By default, the program should evaluate the method on the training set using 10-fold cross-validation and output the results. Naturally, the user can use the -x switch to change the default. The output should consist only of the average accuracy and some measure of dispersion, such as variance, standard deviation, standard error, or a 95% confidence interval. If the user uses the switch -p, then the program should use the hold-out method to evaluate the method and print the accuracy. Finally, if the user uses the switch -T, then the program should use the examples in the testing file to evaluate the method and print the accuracy. You can assume that users will not give conflicting switches for evaluation.
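
The control flow described above might be sketched in Python as follows. The switches -t, -x, -p, and -T come from this handout; everything else is an assumption for illustration, including the function names, the reading of -x as a fold count, and the choice of standard deviation as the measure of dispersion.

```python
import argparse
import statistics


def parse_switches(argv):
    """Parse the evaluation switches; -t is required, the rest are optional."""
    parser = argparse.ArgumentParser(description="Evaluate a learner")
    parser.add_argument("-t", dest="train_file", required=True)
    # Assumption: -x takes the number of folds, defaulting to 10.
    parser.add_argument("-x", dest="folds", type=int, default=10)
    parser.add_argument("-p", dest="proportion", type=float, default=None)
    parser.add_argument("-T", dest="test_file", default=None)
    return parser.parse_args(argv)


def evaluation_mode(args):
    """Choose the evaluation method; conflicting switches are not expected."""
    if args.test_file is not None:
        return "test-set"
    if args.proportion is not None:
        return "hold-out"
    return "cross-validation"


def summarize(fold_accuracies):
    """Return (mean accuracy, sample standard deviation) over the folds."""
    mean = statistics.mean(fold_accuracies)
    stdev = (statistics.stdev(fold_accuracies)
             if len(fold_accuracies) > 1 else 0.0)
    return mean, stdev
```

For example, parse_switches(["-t", "soybean.mff"]) selects 10-fold cross-validation, while adding "-p", "0.7" selects hold-out. Only cross-validation needs summarize; hold-out and test-set evaluation each print a single accuracy.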

Make sure your implementations of k-NN and naive Bayes from p2 work with the changes to Evaluator. They should just work. Include these implementations with your submission for this project. Yes. Your Makefile must build all three executables.

Instructions for Submission

In the header comments in at least the main file of your project, provide the following information:
//
// Name
// E-mail Address
// Platform: Windows, MacOS, Linux, Solaris, etc.
// Language/Environment: gcc, g++, java, g77, ruby, python.
//
// In accordance with the class policies and Georgetown's Honor Code,
// I certify that, with the exceptions of the class resources and those
// items noted below, I have neither given nor received any assistance
// on this project.
//

When you are ready to submit your program for grading, create a zip file of the directory containing only your project's source and build instructions, and upload it to Blackboard.

Copyright © 2019 Mark Maloof. All Rights Reserved. This material may not be published, broadcast, rewritten, or redistributed.