Project 5 - Forensic Hash Analysis, Linked List Style
	  Assigned: November 28th, 2005
	  Program source code due: December 7th, 2005
 
Forensic Hash Analysis, Linked List Style
	Computer geeks are lazy, but in a good way - they
	generally try to avoid unnecessary work. In computer forensics,
	they do this by using an automated process to find files that
	they already know about. One mechanism commonly used to do this
	is called hash analysis.
 
	  Simply put, a hash is a function that takes some data in
	  the form of bits and returns a fixed-length string that is
	  dependent on the data. Just to confuse things a little, the
	  output of a hash function is also commonly called
	  a hash. The cool thing about hash functions is that it is
	  incredibly rare for two different bunches of bits to have the
	  same hash. It is also insanely difficult to find data that will
	  match a given hash output. There are some ways to find two
	  different pieces of data that do produce the same hash output,
	  though.  
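To make the idea concrete, here is a minimal sketch of a hash function. It is a toy: it uses the 32-bit FNV-1a algorithm and prints 8 hex digits, whereas the hashes in the sample files are 32 hex digits long and come from a much stronger function. The name toy_hash is my own invention for illustration.

```cpp
#include <cstdint>
#include <string>

// Toy hash for illustration only (32-bit FNV-1a). It is NOT the hash
// function that produced the 32-digit values in the sample files.
// toy_hash is a hypothetical name, not part of the project code.
std::string toy_hash(const std::string &data) {
  std::uint32_t h = 2166136261u;   // FNV-1a offset basis
  for (unsigned char c : data) {
    h ^= c;                        // mix in the next byte
    h *= 16777619u;                // multiply by the FNV prime (wraps mod 2^32)
  }
  // Render the 32-bit result as a fixed-length string of 8 hex digits.
  const char digits[] = "0123456789abcdef";
  std::string out(8, '0');
  for (int i = 7; i >= 0; --i) {
    out[i] = digits[h & 0xF];
    h >>= 4;
  }
  return out;
}
```

Whatever the length of the input, the result always has the same length, and changing a single character of the input changes the result.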
	  For a hash analysis, the examiner will develop a library of
	  interesting hashes, or download databases of known hashes, such
	  as the one created
	  by NIST. They will then
	  compute the hash of some interesting or suspect files from the
	  case they are working on, and compare those with the known hashes
	  to identify ones that match.  
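The comparison step can be sketched roughly as follows. This version assumes the known hashes sit in a std::set and the suspect hashes in a vector; the function name matching_hashes is hypothetical, and this is not how the project asks you to store things.

```cpp
#include <set>
#include <string>
#include <vector>

// Return the suspect hashes that also appear in the known set,
// in the order they were given. matching_hashes is a hypothetical
// helper for illustration, not part of the project skeleton.
std::vector<std::string> matching_hashes(
    const std::set<std::string> &known,
    const std::vector<std::string> &suspect) {
  std::vector<std::string> matches;
  for (const std::string &h : suspect) {
    if (known.count(h) > 0) {      // count() is 1 if h is in the set
      matches.push_back(h);
    }
  }
  return matches;
}
```

With the two sample files shown in this handout, the only hash that appears in both is 8eb379c256416aa5c12a72ba39162101.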
	  Your job for this project will be to write a program to help
	  perform hash analyses. Because we have some specific stuff to
	  learn, namely using linked lists, I am
	  going to make you do a few specific things in your program, as
	  described below.
	  For this project, the files we are working with will have a
	  specific format. The hash files will start with three lines of
	  comments describing what the hashes are; each of these lines
	  starts with a # sign. All following lines of
	  the file will have a hash, one to each line. An example is
	  shown below. 
	  
	   
 #This file contains hashes of hacking tools
 #Last updated 10/21/2005
 #Clay Shields
 f4d5d0c0671be202bc241807c243e80b
 9a8ad92c50cae39aa2c5604fd0ab6d8c
 8eb379c256416aa5c12a72ba39162101
 60b725f10c9c85c70d97880dfe8191b3
 b0dcfcd427768ddc863f4cd34fb01de9
 9ffbf43126e33be52cd2bf7e01d627f9
 9e0b5b3061054e88eb8dd0053a00d245
 
	  The other type of files we will be working with will be a file
	  that contains a list of file names and hashes. These are the
	  files we want to compare to our hash set. They also have 3
	  lines of comments preceding them. A small example of this is:
 
 # These are hashes of the
 # ~clay/classes/f05/071/projects/p3 directory
 # 10/21/2005 Clay Shields
 a.out 6cf502e4a3b2a92334b43c7ebbd5adec
 data_file 1938010c6305308c0ca8d8b0d8dc4969
 hashes 8eb379c256416aa5c12a72ba39162101
 hashes.cc 256ae342b7a3c47e44f42b86c06ff39c
 known_hashes 0972956ce73314bec610b69efc9870f0
 p3.html  28ae7f13bb9c0148dc746be63bfc9b23
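Reading this format might look something like the sketch below. It assumes exactly three comment lines followed by whitespace-separated name and hash pairs, and it stores the pairs in a vector purely for illustration; the names parsed_case and parse_case are hypothetical and are not part of the project skeleton.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical holder for the parsed contents of a case file.
struct parsed_case {
  std::vector<std::string> comments;
  std::vector<std::pair<std::string, std::string> > entries;  // name, hash
};

// Read three comment lines, then name/hash pairs until input runs out.
parsed_case parse_case(std::istream &in) {
  parsed_case result;
  std::string line;
  for (int i = 0; i < 3 && std::getline(in, line); ++i) {
    result.comments.push_back(line);   // keep each comment line whole
  }
  std::string name, hash;
  while (in >> name >> hash) {         // >> skips spaces and newlines
    result.entries.push_back(std::make_pair(name, hash));
  }
  return result;
}
```

Because it takes any istream, the same function can be tried on an ifstream opened on the sample case file or on an istringstream holding a few test lines.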
 
 Program Requirements
	  Wow, the same project again. I normally don't do this, but
	  because time is short, it is the best way to get you to
	  learn some new things without breaking your will to live in
	  the process. Once again we are doing the same project, only
	  this time using linked lists for keeping track of individual
	  files and hashes within the case_file class.  Again, this
	  means you have much less code to write, because I am going
	  to give you the outline of what the objects will look like;
	  you will just need to fill in the correct methods. You don't
	  even need a new main! To help you out, I have placed a .cc
	  file in my account that you can copy over and use. It
	  contains all the text that appears below, so that you can
	  just use and edit it instead of typing everything over
	  again. To get this file into your account, type:
	  
	  cp ~clay/list-hashes-blank.cc ./
 
	  Just like last time, I am going to give you a series of
	  steps to get to the end. I know that sometimes it is
	  frustrating to get stuck and that it is tempting to move on,
	  but it can be much worse to debug lots of code all at
	  once. I really recommend doing it step-by-step and getting
	  help at each step if needed. 
	  Step 1 
	  First, we are going to build an object that will hold the
	  file information. This will be a class
	  called file. All it really contains is a name, a
	  hash, and a pointer to the next file. 
	  
	  
	    
	      
class file{
			     
  // add friend function reading and writing
  friend ostream &operator<<(ostream&, file);
  friend istream &operator>>(istream&, file &);
public:
  file();
  file(string, string);
  void set_name(string);
  string get_name();
  void set_hash(string);
  string get_hash();
  void set_next_file(file *);
  file * get_next_file();
  ~file();
    
private:
  string filename;
  string hash;
  file * next_file;
};
			     
// Default constructor for the file class
file::file(){
}
// Constructor to allow setting initial values for name 
// and hash
file::file(string initial_name, string initial_hash){
}
// Set the name of the file
void file::set_name(string new_name){
}
// Find out what the name of the file is
string file::get_name(){
}
// Set the value of the hash
void file::set_hash(string new_hash){
}
// Find the value of the hash
string file::get_hash(){
}
// Set the next file
void file::set_next_file(file * new_file){
}
// Get the next file
file * file::get_next_file(){
}
// Destructor - nothing really needed here
file::~file(){
}
// overload the output operator
ostream &operator<<(ostream &output, file f) {
  
}
// overload the input operator
istream &operator>>(istream &input, file &f){
}
	      
	    Your first task is to complete the file class and to make sure that it
	  works. You should use the main function below to do so.
///////////////////////////////                                                 
//                                                                              
// Main for testing the file class                                              
//                                                                              
                                                                                
int main (){                                                                    
                                                                                
  file * test = NULL;                                                           
                                                                                
  test = new file();                                                            
  test->set_name("Test");                                                       
  cout << "File name should be 'Test' and it is: " << test->get_name() << endl; 
  test->set_hash("9eb8c6d611097c8fba484d399d7d9e97");                           
  cout << "Hash should be 9eb8c6d611097c8fba484d399d7d9e97 and it is: "         
       << test->get_hash() << endl;                                             
                                                                                
  // Make sure the friend functions work                                        
  cout << "The next two lines should be the same: " << endl;                    
  cout << "Test 9eb8c6d611097c8fba484d399d7d9e97" << endl;                      
  cout << *test << endl;                                                        
                                                                                
  cout << "Type the two words 'foo bar' and hit enter:";                        
  cin >> *test;                                                                 
  cout << "The next line should say 'foo bar':" << endl;                        
  cout << *test << endl;                                                        
                                                                                
  // Make sure pointer operations work                                          
  file * foo = new file();                                                      
  test->set_next_file(foo);                                                     
  cout << "The next file value should be " << foo << " and it is: " <<          
    test->get_next_file() << endl;                                              
                                                                                
}                                                                               
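If overloading the stream operators is still fuzzy, the general pattern looks something like this sketch. It uses a plain struct called record, a hypothetical stand-in rather than the file class, so the operators do not need to be friends here.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-in with public members; not the file class.
struct record {
  std::string name;
  std::string hash;
};

// Write the two fields separated by a single space.
std::ostream &operator<<(std::ostream &out, const record &r) {
  out << r.name << " " << r.hash;
  return out;   // returning the stream is what lets << be chained
}

// Read two whitespace-separated words into the two fields.
std::istream &operator>>(std::istream &in, record &r) {
  in >> r.name >> r.hash;
  return in;
}
```

Both operators return the stream they were given, which is why a line such as cout << r << endl works.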
                                                                                
			     Step 2 
	  
	  Now that the file class is working, we can rewrite our
	  case_file class to use it. The prime thing to remember is
	  that the two vectors for the file names and hashes will be
	  gone, replaced with the single linked list of file
	  objects. The class definition changes a little, and now
	  looks like this:
          
class case_file{
  
public:
  case_file();
  void load_case_file();
  void print_file_names();
  void print_file_hashes();
  void print_comments();
  void print_case_file();
  int number_of_files();
  void add_file(string, string);
  int find_hash_matches(hash_set);
private:
  file * file_list;
  void add_file_to_list(file *);
  vector<string> comments;
};
// Constructor for the case file. Needed to make sure things
// are properly initialized. Woe happens if pointers are random.
case_file::case_file(){
}
// This method asks for a file name, then opens the file, reads the
// comments from the first three lines and the file names and hashes
// after that. It stores the comment results in order in the comments
// vector, and the hashes in a linked list of file objects
void case_file::load_case_file(){
};
// This method prints the comments from the file neatly
// on the screen
void case_file::print_comments(){
};
// This method prints the names of the files neatly
// on the screen
void case_file::print_file_names(){
  
};
// This method prints the hashes of the files neatly
// on the screen
void case_file::print_file_hashes(){
};
// This method prints the entire case file on the screen:
// comments first, then it traverses the file list and uses cout
// to print each file object
void case_file::print_case_file(){
};
// this method returns the number of files in the case file
int case_file::number_of_files(){
  
};
// This method takes a file name and a file hash as parameters
// and then creates a new file object with those values and
// adds it to the file list.
void case_file::add_file(string new_name, string new_hash){
  
};
// This method takes a hash_set containing the known hashes as
// a parameter. It then finds all case hashes that match, and prints those
// to the screen. It returns the total number of matches.
int case_file::find_hash_matches(hash_set known_hashes){
};
// This is a private function that will add a pointer to a file to the
// file list. I add it to the head of the list
void case_file::add_file_to_list(file * new_file){
}
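Adding to the head of a list and walking it to count entries can be sketched as below. The node struct and the function names are hypothetical and chosen for illustration; the project's own list is built from file objects and their get and set methods.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical node type for illustration; not the file class.
struct node {
  std::string value;
  node *next;
};

// Insert at the head: the new node points at the old head,
// then the head pointer is moved to the new node.
void push_front(node *&head, node *n) {
  assert(n != NULL);
  n->next = head;
  head = n;
}

// Walk the list from the head, counting nodes until NULL is reached.
int length(const node *head) {
  int count = 0;
  for (const node *p = head; p != NULL; p = p->next) {
    ++count;
  }
  return count;
}

// Free every node, saving the next pointer before each delete.
void destroy(node *head) {
  while (head != NULL) {
    node *next = head->next;
    delete head;
    head = next;
  }
}
```

Note that the head pointer has to start out as NULL. If it starts with a random value, the traversal has no reliable place to stop.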
Since the public interface of the class hasn't changed, we can use the
same main from Project 4 to test it, and the same one for menuing. They
are included in the file you can copy over.

 Resources
	  As with the last projects, you can copy my solution to your
	  gusun account and play with it as needed. To do this,
	  type:
	   cp ~clay/list-hashes ./  
	  You can also copy over the small sample case
	    file or the  sample hash set by typing
	  the following two commands: 
	  cp ~clay/case ./
	  cp ~clay/hash_set ./
 
 What to turn in
	  Include the following header in your source code. 
	   
	    //
	    // Project 5
	    // Name: <your name>
	    // E-mail: <your e-mail address>
	    // COSC 071
	    //
	    // In accordance with the class policies and Georgetown's
	    // Honor Code, I certify that I have neither given nor
	    // received any assistance  on this project with the
	    // exceptions of the lecture notes and those  items noted
	    // below.
	    //
	    //
	    // Description: <Describe your program>
	    //
	   
	  You will submit your source code using the submit
	  program. This is the .cc file. Do not submit the compiled version! I
	  don't speak binary very well.
	 
	  To submit your program, make sure there is a copy of the source code
	  on your account on gusun. You may name your program what you
	  like - let's assume that it is called listhashes.cc. To submit
	  your program electronically, use the submit program like we
	  did in Homework 2 and Project 1, 2, and 3, but with the command:
	 
	  submit -a p5 -f listhashes.cc
	 