COSC 688: Machine Learning

Project 2
Fall 2007

Due: Oct 30 @ 5 P.M.
15 points

Implement ID3, the forerunner of C4.5 (i.e., symbolic attributes, gain ratio for attribute selection, and no pruning). Conduct an evaluation of ID3, naive Bayes, and k-NN using the 1984 Congressional Voting Record. Compute average measures of performance and measures of dispersion.
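As a rough illustration of the "average measures of performance and measures of dispersion" requested above, the sketch below summarizes a list of per-run accuracies by their mean and sample standard deviation. It assumes that each learner is evaluated over several runs or folds, which this handout does not prescribe. The function name `summarize` and the accuracy values are hypothetical and are not real results.

```python
import statistics

def summarize(scores):
    """Return (mean, sample standard deviation) of per-run scores."""
    mean = statistics.mean(scores)
    # Sample standard deviation (n - 1 denominator); needs at least two scores.
    spread = statistics.stdev(scores) if len(scores) > 1 else 0.0
    return mean, spread

# Hypothetical per-fold accuracies, invented for illustration only.
fold_accuracies = [0.90, 0.95, 0.85, 0.90]
mean, spread = summarize(fold_accuracies)
print(f"accuracy: {mean:.3f} +/- {spread:.3f}")
```

The same summary could be applied to other performance measures, reported separately for ID3, naive Bayes, and k-NN.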

The implementation of ID3 must be general, meaning that it must work with any data set with symbolic attributes.
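One way to keep the attribute-selection step independent of any particular data set is to compute gain ratio from nothing more than rows of symbols and a list of class labels. The following is a minimal sketch under that assumption, not a full ID3 implementation. The names `entropy` and `gain_ratio` and the toy data are hypothetical. It treats every distinct value, including any missing-value marker, as just another symbol, which is one simple choice among several.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy, in bits, of a list of symbolic values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())

def gain_ratio(rows, labels, attr):
    """Gain ratio of attribute index `attr`; rows are lists of symbols."""
    total = len(rows)
    # Partition the class labels by the attribute's observed values.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    # Expected entropy of the class labels after splitting on the attribute.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy(labels) - remainder
    # Split information is the entropy of the attribute's own values.
    split_info = entropy([row[attr] for row in rows])
    # An attribute with a single observed value carries no information.
    return gain / split_info if split_info > 0 else 0.0

# Toy data, invented for illustration; not the voting records.
rows = [["y", "n"], ["y", "y"], ["n", "n"], ["n", "y"]]
labels = ["democrat", "democrat", "republican", "republican"]
best = max(range(len(rows[0])), key=lambda a: gain_ratio(rows, labels, a))
```

Because the functions never inspect attribute names or the number of distinct values, the same code applies unchanged to any data set represented this way.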

The implementations must be general, meaning that they must work for any similarly represented data set with numeric and symbolic attributes and symbolic class labels. Finally, if you consult sources outside the class materials, you must cite those sources. As I said in class, I will grade you on the quality of your sources. You may not consult with anyone other than me about the project, and you may not consult or use someone else's source code or implementation. Feel free to contact me with questions, but it would be best to discuss any issues in class so everyone can contribute and benefit.

Instructions for Submission: In the header comments, provide the following information:

/**
 * Name:
 * E-mail Address:
 * Platform: Windows, OS X, Linux (seva), Solaris, bsd, etc.
 * Language/Environment: gcc, g++, java, python, ruby, clisp, g77, g95, etc.
 *
 * In accordance with the class policies and Georgetown's Honor Code,
 * I certify that, with the exceptions of the class resources and those
 * items noted below, I have neither given nor received any assistance
 * on this project.
 */

When you are ready to submit your program for grading, create a compressed archive of a directory containing only your project's source, and send it to me by e-mail as an attachment. The directory's name should be the same as your net ID.

For example, assume your net ID is ab123. If the directory p2 contains your project, then rename the directory to ab123.

To make the archive smaller, remove any object files, such as .class, a.out, and .o files.

Use zip, tar, or jar to create an archive:

% zip -r ab123.zip ab123
% tar -cf ab123.tar ab123
% jar -cf ab123.jar ab123

Use jar only for Java projects. If you use jar or tar, then compress the archive by typing

% gzip ab123.tar
% gzip ab123.jar

These commands create the files ab123.tar.gz and ab123.jar.gz, respectively.

N.B. If you use zip, then you need to change the extension of your file to something other than .zip, as UIS strips .zip attachments. The extension .piz works pretty well. So you'd rename ab123.zip to ab123.piz.

Attach the file containing your project to an e-mail and send it to me.

Make sure you send a carbon copy of your project to yourself, so you'll have a record of when you submitted your project. Ideally, also keep a copy on a university or department machine. However, make sure that your archive, directory, or files are not readable by others.

Submit your project before 5:00 P.M. on the due date.