COSC 387: Artificial Intelligence

Project 5
Spring 2008

Due: Mon, Apr 28 @ 5 P.M.
13 points

Using a language of your choosing, implement a version of IREP as discussed in Cohen's paper entitled Fast Effective Rule Induction. You can use only the standard libraries included with the programming language. We will talk about rule-selection heuristics in lecture.
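As a rough starting point, here is a minimal Python sketch of the grow-and-cover loop without pruning, for nominal attributes only. It uses FOIL's information gain, the growing heuristic described in Cohen's paper; the heuristic we settle on in lecture may differ. The names covers, foil_gain, grow_rule, and irep_no_prune, the representation of a rule as a list of (attribute index, value) tests, and the choice to stop when no test has positive gain are all assumptions made for illustration.

```python
import math

def covers(rule, example):
    """A rule is a list of (attribute_index, value) tests; all must hold."""
    return all(example[i] == v for i, v in rule)

def foil_gain(p0, n0, p1, n1):
    """FOIL's information gain for narrowing coverage from (p0, n0) to (p1, n1)."""
    if p1 == 0:
        return float("-inf")
    return p1 * (math.log2(p1 / (p1 + n1)) - math.log2(p0 / (p0 + n0)))

def grow_rule(pos, neg):
    """Greedily add tests until no negatives are covered or no test helps."""
    rule, used = [], set()
    while neg:
        p0, n0 = len(pos), len(neg)
        best, best_gain = None, 0.0
        for i in range(len(pos[0])):
            if i in used:
                continue
            for v in sorted({e[i] for e in pos}):
                p1 = sum(1 for e in pos if e[i] == v)
                n1 = sum(1 for e in neg if e[i] == v)
                gain = foil_gain(p0, n0, p1, n1)
                if gain > best_gain:
                    best, best_gain = (i, v), gain
        if best is None:  # assumption: stop when no test has positive gain
            break
        rule.append(best)
        used.add(best[0])
        i, v = best
        pos = [e for e in pos if e[i] == v]
        neg = [e for e in neg if e[i] == v]
    return rule

def irep_no_prune(pos, neg):
    """Learn rules for the positive class, removing covered examples each round."""
    pos, neg = list(pos), list(neg)
    rules = []
    while pos:
        rule = grow_rule(pos, neg)
        if not rule:  # assumption: give up if no useful rule can be grown
            break
        rules.append(rule)
        pos = [e for e in pos if not covers(rule, e)]
        neg = [e for e in neg if not covers(rule, e)]
    return rules
```

Here pos and neg are lists of attribute tuples for the two classes. Missing votes are treated as just another attribute value, which is one simple option among several.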

To make the project tractable, your implementation does not have to handle numeric attributes or prune (see below), and it needs to work only with the 1984 Congressional Voting Record data set. See also the names file. I put a version of this data set in a simplified format on seva, which you can retrieve using the command:

seva% cp ~maloofm/cosc387/votes.dta ./
Feel free to remove the comments and read the data set into your program, or hard-code the data set into your program.
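If you choose to read the file, a loader can be quite small. The sketch below is only an illustration: it assumes each record is one comma-separated line with the class label first, and that comment lines begin with a fixed prefix. Both are guesses, so check votes.dta and adjust. The function name parse_votes and the comment_prefix parameter are hypothetical.

```python
def parse_votes(lines, comment_prefix="#"):
    """Return a list of (label, votes) pairs from an iterable of text lines.

    Assumed format (verify against the actual file): comma-separated fields,
    class label first, comment lines starting with comment_prefix.
    """
    examples = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(comment_prefix):
            continue  # skip blank lines and comments
        label, *votes = [field.strip() for field in line.split(",")]
        examples.append((label, tuple(votes)))
    return examples
```

With a file on disk, you would call it as parse_votes(open("votes.dta")).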

To evaluate the learned rules, implement the hold-out method.
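The hold-out method amounts to shuffling the examples once and cutting the list in two. A minimal sketch, assuming a two-thirds training split and a fixed seed for repeatability (both are my choices for illustration, as is the name holdout_split):

```python
import random

def holdout_split(examples, train_fraction=2 / 3, seed=0):
    """Shuffle a copy of the examples and split it into (train, test)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]
```

Learn rules from the first list only, and use the second list only to measure accuracy.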

The implementation should print the rules it learns from the training examples and the accuracy of those rules on the testing set.
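For the output, something along these lines would do. This sketch assumes the rules describe one class (the positive class) and that any example matched by no rule gets a default class. It also assumes the test set is a list of (label, votes) pairs and that a rule is a list of (attribute index, value) tests. The function names and the printed rule format are hypothetical, not a required layout.

```python
def covers(rule, example):
    """True if every (attribute_index, value) test in the rule holds."""
    return all(example[i] == v for i, v in rule)

def predict(rules, example, positive, default):
    """Predict the positive class if any rule fires, else the default class."""
    return positive if any(covers(r, example) for r in rules) else default

def accuracy(rules, test_set, positive, default):
    """Fraction of (label, example) pairs in test_set classified correctly."""
    correct = sum(1 for label, example in test_set
                  if predict(rules, example, positive, default) == label)
    return correct / len(test_set)

def format_rule(rule, positive, names=None):
    """Render a rule as text; attributes are shown as a0, a1, ... if unnamed."""
    def attr(i):
        return names[i] if names else "a" + str(i)
    conditions = " and ".join(attr(i) + " = " + str(v) for i, v in rule)
    return "if " + (conditions or "true") + " then " + positive
```

A program would then print format_rule(r, ...) for each learned rule, followed by the value returned by accuracy.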

Extra Credit

Implement pruning (10%).
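For those attempting the extra credit, here is a minimal sketch of a pruning step in the style Cohen describes for IREP: consider deleting any final sequence of conditions and keep the version that maximizes v = (p + (N - n)) / (P + N) on a separate pruning set, where P and N are the numbers of positive and negative pruning examples and p and n are the numbers the rule covers. The function names, the decision to keep at least one condition, and the tie-breaking toward the longer rule are assumptions.

```python
def covers(rule, example):
    """True if every (attribute_index, value) test in the rule holds."""
    return all(example[i] == v for i, v in rule)

def prune_value(rule, prune_pos, prune_neg):
    """v = (p + (N - n)) / (P + N) on the pruning set (assumed non-empty)."""
    p = sum(1 for e in prune_pos if covers(rule, e))
    n = sum(1 for e in prune_neg if covers(rule, e))
    return (p + (len(prune_neg) - n)) / (len(prune_pos) + len(prune_neg))

def prune_rule(rule, prune_pos, prune_neg):
    """Return the prefix of the rule with the highest prune_value.

    Prefixes are tried from longest to shortest, so ties keep the longer rule.
    Assumption: at least one condition is always kept.
    """
    candidates = [rule[:k] for k in range(len(rule), 0, -1)]
    return max(candidates, key=lambda r: prune_value(r, prune_pos, prune_neg))
```

To use it, split the training examples into a growing set and a pruning set before growing each rule, and grow on the growing set only.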

Instructions for Electronic Submission

In the header comments of the primary file, provide the following information:
;;;;
;;;; COSC 387 Project 5
;;;; Name
;;;; E-mail Address
;;;; Platform: Windows, Linux (seva), etc.
;;;; Language: Lisp, C, Java, C++, Python, Ruby, PL/I, ...
;;;;
;;;; In accordance with the class policies and Georgetown's Honor Code,
;;;; I certify that, with the exceptions of the course materials and those
;;;; items noted below, I have neither given nor received any assistance
;;;; on this project.
;;;;

If you need to submit a single file named, for example, p5.lisp, type

seva% java -jar submit.jar -a p5 -f p5.lisp
If you need to submit multiple files, place all of your code in a subdirectory named p5, if you haven't already. To create this subdirectory, type
seva% mkdir p5
To descend into the directory, type
seva% cd p5
All of the files for your project should be in this directory. The submit program should be in the parent directory:
seva% ls ..
p5/ submit.jar

If you need to include a message to me about your submission, then place the message in a file named README. Place the README file in the project's directory.

To move up from the p5 directory, type

seva% cd ..
You should now be in the directory that contains p5:
seva% ls
p5/ submit.jar

(Additional useful Unix commands)

When you're ready to submit, change the name of the directory to your netid. For example, if your netid is maloofm, then rename the directory p5 by typing

seva% mv p5 maloofm
Create a zip file of the directory and its contents by typing
seva% zip -r p5.zip maloofm/*
This command creates a zip file named p5.zip by recursively (-r) copying all of the files (*) from the directory maloofm/.

To submit the zip file, type

seva% java -jar submit.jar -a p5 -f p5.zip
p5 is the name of the assignment (-a) and p5.zip is the file (-f) to be submitted for that assignment.

If the program submits the file successfully, you will receive a receipt by e-mail at the address <netid>@georgetown.edu.

Submit your project only once.

Once you've submitted your project, it is important to keep an electronic copy on a university machine (e.g., seva) that preserves the modification date and time. If we lose your project or the submission system breaks, then we will need to look at the modification date and time of your project to ensure that you submitted it before it was due.

You can also change the directory's name back to the original name. For example,

seva% mv maloofm p5
Note that changing the name of the directory does not change the dates of the files in the directory. You can also remove the zip file from your directory:
seva% rm p5.zip

You must submit your project before 5 P.M. on the due date.

Plan B

If something goes wrong with submit, then e-mail your project to me as an attachment. Remember that UIS strips zip files from e-mails, so rename p5.zip to use a different extension, such as p5.piz.

Copyright © 2019 Mark Maloof. All Rights Reserved. This material may not be published, broadcast, rewritten, or redistributed.