COSC 288: Introduction to Machine Learning

Project 5
Spring 2016

Due: Mon, May 2 @ 11:59 P.M.
10 points

Student's choice:

  1. Implement AdaBoost. You can also implement Flach's Algorithm 11.3. You will need to select one of your implementations from a previous assignment and modify it to handle weighted examples. Naive Bayes is one choice, but ID3 is probably a better one. You might also consider implementing a decision stump (a one-level decision tree) based on your ID3 implementation. Students who boost backprop will be transferred to one of their safe schools.

    It is fine to hard-code the base method into the boosted method, but I must be able to run the base method as a separate executable for the purpose of comparison.

    @article{freund.jjsai.99,
      author = "Freund, Y. and Schapire, R. E.",
      title = "A short introduction to boosting",
      journal = "Journal of Japanese Society for Artificial Intelligence",
      year = 1999,
      volume = 14,
      number = 5,
      pages = "771--780" }
    
  2. Implement Forest-RI. Include your implementation of ID3 with your submission so I can run this base method separately.

    @article{breiman.ml.01,
      author = "Breiman, L.",
      title = "Random forests",
      journal = "Machine Learning",
      year = 2001,
      volume = 45,
      number = 1,
      pages = "5--32" }
    
  3. Implement Stacking.

    @article{wolpert.nn.92,
      author = "Wolpert, D. H.",
      title = "Stacked generalization",
      journal = "Neural Networks",
      year = 1992,
      volume = 5,
      number = 2,
      pages = "241--259" }
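
For the first option, the change that matters most is getting the base learner to respect example weights. The following is a minimal sketch of that idea, not a reference solution. It assumes binary class labels coded as +1 and -1 and nominal attributes stored as tuples, and it follows the weight update described in Freund and Schapire's introduction. The names WeightedStump, adaboost, and classify are hypothetical, chosen only for illustration.

```python
import math
from collections import defaultdict


class WeightedStump:
    """One-level decision tree over nominal attributes, trained on weighted examples."""

    def fit(self, X, y, w):
        best_err = float("inf")
        for a in range(len(X[0])):
            # Signed, weighted vote for each observed value of attribute a.
            votes = defaultdict(float)
            for xi, yi, wi in zip(X, y, w):
                votes[xi[a]] += wi * yi
            rule = {v: (1 if s >= 0 else -1) for v, s in votes.items()}
            # Weighted error: total weight of the examples this rule gets wrong.
            err = sum(wi for xi, yi, wi in zip(X, y, w) if rule[xi[a]] != yi)
            if err < best_err:
                best_err, self.attr, self.rule = err, a, rule
        # Assumption: unseen attribute values fall back to the weighted majority label.
        self.default = 1 if sum(wi * yi for yi, wi in zip(y, w)) >= 0 else -1
        return self

    def predict(self, x):
        return self.rule.get(x[self.attr], self.default)


def adaboost(X, y, rounds=10):
    """Return a list of (alpha, stump) pairs; labels in y must be +1 or -1."""
    m = len(X)
    w = [1.0 / m] * m  # start from the uniform distribution
    ensemble = []
    for _ in range(rounds):
        h = WeightedStump().fit(X, y, w)
        preds = [h.predict(xi) for xi in X]
        eps = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
        if eps >= 0.5:
            break  # no better than chance on the current weights
        if eps < 1e-12:
            # Assumption: a perfect stump gets a fixed vote of 1.0 and ends boosting.
            ensemble.append((1.0, h))
            break
        alpha = 0.5 * math.log((1.0 - eps) / eps)
        ensemble.append((alpha, h))
        # Raise the weight of mistakes, lower the rest, then renormalize.
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, preds)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble


def classify(ensemble, x):
    """Sign of the alpha-weighted vote of the stumps."""
    score = sum(alpha * h.predict(x) for alpha, h in ensemble)
    return 1 if score >= 0 else -1
```

On the eight examples of a three-attribute majority function, a single stump trained with uniform weights classifies six correctly, while three rounds of boosting classify all eight. Handling more than two classes, numeric attributes, or missing values would need more than this sketch provides.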
    

Here are some new data sets to play with:

The implementations must follow sound principles of object-oriented design and implementation. Implement each learner as a single executable. No windows. No menus. No prompts. Just do it.

The logic of the implementation should be the same as that for the previous implementations. If the user runs a learner and specifies only a training set, then the program should evaluate using 10-fold cross-validation and output the results. Naturally, the user can use the -x switch to change the default number of folds. If the user specifies a proportion with -p, then the program should use hold-out to evaluate the learning method. Otherwise, if the user specifies both a training and testing set, then the program should build a model from the training set, evaluate it on the testing set, and output the results.
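
The selection of an evaluation method described above can be sketched as follows. This is only an illustration under stated assumptions: the -x and -p switches come from the text, but the -t and -T switches for naming the training and testing sets, the function name choose_evaluation, and the file names in the usage note are hypothetical. The order of the checks follows the order of the paragraph, so -p takes precedence if the user also supplies a testing set.

```python
import argparse


def choose_evaluation(argv):
    """Map command-line switches to (method, training file, parameter)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-t", dest="train", required=True)  # hypothetical switch
    parser.add_argument("-T", dest="test")                  # hypothetical switch
    parser.add_argument("-x", dest="folds", type=int, default=10)
    parser.add_argument("-p", dest="proportion", type=float)
    args = parser.parse_args(argv)

    if args.proportion is not None:
        # Hold-out: train on the given proportion, test on the remainder.
        return ("hold-out", args.train, args.proportion)
    if args.test is not None:
        # Separate testing set: build on the training set, evaluate on the testing set.
        return ("train-test", args.train, args.test)
    # Only a training set: k-fold cross-validation, with k = 10 unless -x is given.
    return ("cross-validation", args.train, args.folds)
```

For example, choose_evaluation(["-t", "train.arff", "-x", "5"]) selects five-fold cross-validation. The main program would then dispatch on the first element of the returned tuple.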

Instructions for Submission

In the header comments in at least the main file of your project, provide the following information:
//
// Name
// E-mail Address
// Platform: Windows, MacOS, Linux, Solaris, etc.
// Language/Environment: gcc, g++, java, g77, ruby, python.
//
// In accordance with the class policies and Georgetown's Honor Code,
// I certify that, with the exceptions of the class resources and those
// items noted below, I have neither given nor received any assistance
// on this project.
//

Same submission instructions: When you are ready to submit your program for grading, create a zip file of the directory containing only your project's source and build instructions, and upload it to Blackboard.

Copyright © 2019 Mark Maloof. All Rights Reserved. This material may not be published, broadcast, rewritten, or redistributed.