COSC-270: Artificial Intelligence

Project 4
Spring 2022

Due: W 4/13 @ 5 PM ET
9 points

Implement IREP as discussed in Section 2.2 of Cohen's paper, "Fast Effective Rule Induction."
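As one small illustration of the pieces involved, Cohen's paper scores a candidate rule during pruning by (p + (N - n)) / (P + N), where P and N are the numbers of positive and negative examples in the pruning set and p and n are the numbers of those examples the rule covers. The sketch below assumes the four counts have already been computed. The function name prune_value is hypothetical, the project does not require Python, and the heuristics covered in lecture take precedence over this sketch.

```python
def prune_value(p: int, n: int, P: int, N: int) -> float:
    """Pruning value of a rule on the pruning set.

    p, n: positive and negative pruning examples covered by the rule
    P, N: total positive and negative examples in the pruning set
    """
    total = P + N
    if total == 0:
        # Assumption: an empty pruning set scores 0.0 rather than raising.
        return 0.0
    return (p + (N - n)) / total
```

A rule that covers 3 of 4 positives and 1 of 4 negatives scores (3 + 3) / 8 under this metric.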

We will talk about rule-selection heuristics in lecture. To make the project tractable, your implementation does not need to handle numeric attributes or missing values, and it needs to work only with the 1984 Congressional Voting Record data set. See also the names file. I created a version of this data set in a simplified format: votes-comments.dta. I also put this file on cs-class, where you can retrieve it using the command:

cs-class% cp ~maloofm/cosc270/votes-comments.dta ./

Feel free to remove the comments and read the data set into your program, or you can hard-code the data set into your program. Use the democrat class as the positive class. You can remove comments and reorganize the data, but you cannot modify the data in any other way.
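Because the attributes are all nominal and democrat is the positive class, one possible in-memory representation is a rule as a conjunction of attribute-value tests and a rule set as a disjunction of such rules, with republican as the default. This is a minimal sketch under those assumptions. The type aliases, the function names, and the attribute names in the usage note are hypothetical and are not taken from votes-comments.dta.

```python
from typing import Dict, List

Example = Dict[str, str]  # attribute name -> nominal value
Rule = Dict[str, str]     # conjunction of attribute == value tests


def covers(rule: Rule, example: Example) -> bool:
    """Return True if every test in the rule matches the example."""
    return all(example.get(attr) == value for attr, value in rule.items())


def predict(rules: List[Rule], example: Example) -> str:
    """Label an example democrat if any rule covers it, else republican."""
    if any(covers(rule, example) for rule in rules):
        return "democrat"
    return "republican"
```

For example, with the hypothetical rule {"a1": "y", "a2": "n"}, predict labels an example democrat only when both tests match.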

To evaluate the learned rules, implement the hold-out method, which involves selecting a random set of the original examples as a training set and using the remaining examples as a testing set. Use 75% of the original examples as the training set. The training set serves as input to IREP. Once the program produces a set of rules, it evaluates them on the examples of the testing set. Seed the random number generator with the system clock so you will get a different selection of examples and a different set of rules each time you run IREP.
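The hold-out split described above might be sketched as follows. The function name holdout_split and its parameters are hypothetical, and rounding the 75% cut point to the nearest integer is an assumption, since the handout does not say how to handle a fractional count.

```python
import random
import time


def holdout_split(examples, train_fraction=0.75, rng=None):
    """Randomly partition examples into (training set, testing set)."""
    if rng is None:
        # Seed from the system clock so that each run differs.
        rng = random.Random(time.time_ns())
    shuffled = list(examples)  # copy so the caller's data is not reordered
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]
```

Passing an explicitly seeded random.Random as rng makes a split repeatable while debugging. Leaving it as None gives the clock-seeded behavior the project asks for.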

IREP.main should produce twenty-five sets of rules, each from a different training and testing set selected using the hold-out method. For each rule set produced, main should print the accuracy of the rules on the testing set for both classes (i.e., the overall accuracy) and for each class (i.e., the true-positive and false-positive rates). It should print the last set of rules produced. Finally, it should print the average accuracy of the twenty-five rule sets as the JSON string {"Average Accuracy": <accuracy>}, where <accuracy> is a floating-point number.
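The reporting step might look like the sketch below, which computes the overall accuracy, true-positive rate, and false-positive rate from predicted and actual labels and then formats the final JSON string. The function names are hypothetical, and returning 0.0 for a rate with a zero denominator is an assumption, not something the handout specifies.

```python
import json


def evaluate(predicted, actual, positive="democrat"):
    """Return (accuracy, true-positive rate, false-positive rate)."""
    tp = fp = tn = fn = 0
    for p, a in zip(predicted, actual):
        if a == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    tp_rate = tp / (tp + fn) if (tp + fn) else 0.0
    fp_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, tp_rate, fp_rate


def average_accuracy_json(accuracies):
    """Format the mean of the per-run accuracies as the required JSON string."""
    mean = sum(accuracies) / len(accuracies)
    return json.dumps({"Average Accuracy": mean})
```

Using json.dumps rather than hand-built string formatting keeps the key quoting and number formatting valid JSON.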

Instructions for Electronic Submission

In a file named HONOR, provide the following information:
Name
NetID

In accordance with the class policies and Georgetown's Honor Code,
I certify that, with the exceptions of the course materials and those
items noted below, I have neither given nor received any assistance
on this project.

When you are ready to submit your project for grading, put your source files, Makefile, and honor statement in a zip file named submit.zip. Upload the zip file to Autolab using the assignment p4. Make sure you remove all debugging output before submitting.

Plan B

If Autolab is down, upload your zip file to Canvas.

Copyright © 2022 Mark Maloof. All Rights Reserved. This material may not be published, broadcast, rewritten, or redistributed.