Project 5 - Forensic Hash Analysis, Linked List Style
Assigned: November 28th, 2005
Program source code due: December 7th, 2005
Forensic Hash Analysis, Linked List Style
Computer geeks are lazy, but in a good way - they
generally try to avoid unnecessary work. In computer forensics,
they do this by using an automated process to find files that
they already know about. One mechanism commonly used to do this
is called hash analysis.
Simply put, a hash is a function that takes some data in
the form of bits and returns a fixed-length string that is
dependent on the data. Just to confuse things a little, the
output of a hash function is also commonly called
a hash. The cool thing about hash functions is that it is
incredibly rare for two different bunches of bits to have the
same hash. It is also insanely difficult to find data that will
match a given hash output. There are some ways to find two
different pieces of data that do produce the same hash output,
though.
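To make "fixed-length string that depends on the data" concrete, here is a minimal sketch of a toy hash function. It assumes the 32-bit FNV-1a algorithm as a stand-in for the hash used to build the sample files, and the name toy_hash is made up for illustration. It is nowhere near strong enough for real forensic work.

```cpp
#include <cstdint>
#include <string>

// Toy 32-bit FNV-1a hash, returned as 8 lowercase hex digits.
// Assumption: this only illustrates the idea of a hash function.
// It is NOT collision-resistant and is not the hash used in the
// sample files for this project.
std::string toy_hash(const std::string &data) {
    std::uint32_t h = 2166136261u;      // FNV offset basis
    for (unsigned char c : data) {
        h ^= c;                         // mix in one byte of input
        h *= 16777619u;                 // multiply by the FNV prime
    }
    const char digits[] = "0123456789abcdef";
    std::string out(8, '0');
    for (int i = 7; i >= 0; --i) {      // fill the digits right to left
        out[i] = digits[h & 0xF];
        h >>= 4;
    }
    return out;
}
```

No matter how long the input is, the result is always 8 characters, and changing even one character of the input changes the result.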
For a hash analysis, the examiner will develop a library of
interesting hashes, or download databases of known hashes, such
as the one created
by NIST. They will then
compute the hash of some interesting or suspect files from the
case they are working on, and compare those with the known hashes
to identify ones that match.
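The comparison step itself is simple. Here is a rough sketch, assuming the known hashes sit in a std::set and the case hashes in a std::vector. The function name matching_hashes is hypothetical, and this is not the layout the project asks you to use.

```cpp
#include <set>
#include <string>
#include <vector>

// Return every case hash that also appears in the set of known hashes.
// Assumption: hashes are compared as exact, case-sensitive strings.
std::vector<std::string> matching_hashes(
        const std::vector<std::string> &case_hashes,
        const std::set<std::string> &known) {
    std::vector<std::string> matches;
    for (const std::string &h : case_hashes) {
        if (known.count(h) > 0) {       // is this hash in the known set?
            matches.push_back(h);
        }
    }
    return matches;
}
```

A set is used here only because lookups in it are quick; a plain loop over two lists gives the same answer, just more slowly.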
Your job for this project will be to write a program to help
perform hash analyses. Because we have some specific stuff to
learn, namely using pointers and linked lists, I am
going to make you do a few specific things in your program, as
described below.
For this project, the files we are working with will have a
specific format. The hash files will start with three lines of
comments describing what the hashes are; each of these lines
starts with a # sign. All following lines of
the file will have a hash, one to each line. An example is
shown below.
#This file contains hashes of hacking tools
#Last updated 10/21/2005
#Clay Shields
f4d5d0c0671be202bc241807c243e80b
9a8ad92c50cae39aa2c5604fd0ab6d8c
8eb379c256416aa5c12a72ba39162101
60b725f10c9c85c70d97880dfe8191b3
b0dcfcd427768ddc863f4cd34fb01de9
9ffbf43126e33be52cd2bf7e01d627f9
9e0b5b3061054e88eb8dd0053a00d245
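Reading a file in this format might look something like the sketch below. The struct parsed_hash_set and the function read_hash_set are made-up names for illustration, and the sketch assumes the file really does start with exactly three comment lines.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the two parts of a hash file.
struct parsed_hash_set {
    std::vector<std::string> comments;  // the three leading # lines
    std::vector<std::string> hashes;    // one hash per remaining line
};

// Read three comment lines, then one hash per line until input ends.
// Assumption: no blank or malformed lines need special handling.
parsed_hash_set read_hash_set(std::istream &in) {
    parsed_hash_set result;
    std::string line;
    for (int i = 0; i < 3 && std::getline(in, line); ++i) {
        result.comments.push_back(line);
    }
    while (in >> line) {                // >> skips whitespace and newlines
        result.hashes.push_back(line);
    }
    return result;
}
```

Because the function takes any std::istream, you can hand it an open std::ifstream for a real file or a std::istringstream when testing.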
The other type of file we will be working with contains a
list of file names and hashes. These are the
files we want to compare to our hash set. They also have 3
lines of comments preceding them. A small example of this is:
# These are hashes of the
# ~clay/classes/f05/071/projects/p3 directory
# 10/21/2005 Clay Shields
a.out 6cf502e4a3b2a92334b43c7ebbd5adec
data_file 1938010c6305308c0ca8d8b0d8dc4969
hashes 8eb379c256416aa5c12a72ba39162101
hashes.cc 256ae342b7a3c47e44f42b86c06ff39c
known_hashes 0972956ce73314bec610b69efc9870f0
p3.html 28ae7f13bb9c0148dc746be63bfc9b23
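Each data line here is a name followed by a hash, so the two can be read as a pair. The sketch below assumes file names never contain spaces; case_entry and read_case_entries are hypothetical names, not part of the project code.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical pairing of one file name with its hash.
struct case_entry {
    std::string name;
    std::string hash;
};

// Skip the three comment lines, then read "name hash" pairs.
// Assumption: names contain no whitespace, so >> can split the line.
std::vector<case_entry> read_case_entries(std::istream &in) {
    std::string comment;
    for (int i = 0; i < 3 && std::getline(in, comment); ++i) {
        // comments are ignored in this sketch
    }
    std::vector<case_entry> entries;
    case_entry e;
    while (in >> e.name >> e.hash) {    // stops at end of input
        entries.push_back(e);
    }
    return entries;
}
```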
Program Requirements
Wow, the same project again. I normally don't do this, but
because time is short, it is the best way to get you to
learn some new things without breaking your will to live in
the process. Once again we are doing the same project, only
this time using linked lists for keeping track of individual
files and hashes within the case_file class. Again, this
means you have much less code to write, because I am going
to give you the outline of what the objects will look like;
you will just need to fill in the correct methods. You don't
even need a new main! To help you out, I have placed a .cc
file in my account that you can copy over and use. It
contains all the text that appears below, so that you can
just use and edit it instead of typing everything over
again. To get this file into your account, type:
cp ~clay/list-hashes-blank.cc ./
Just like last time, I am going to give you a series of
steps to get to the end. I know that sometimes it is
frustrating to get stuck and that it is tempting to move on,
but it can be much worse to debug lots of code all at
once. I really recommend doing it step-by-step and getting
help at each step if needed.
Step 1
First, we are going to build an object that will hold the
file information. This will be a class
called file. All it really contains is a name, a
hash, and a pointer to the next file.
class file{
    // add friend function reading and writing
    friend ostream &operator<<(ostream&, file);
    friend istream &operator>>(istream&, file &);
public:
    file();
    file(string, string);
    void set_name(string);
    string get_name();
    void set_hash(string);
    string get_hash();
    void set_next_file(file *);
    file * get_next_file();
    ~file();
private:
    string filename;
    string hash;
    file * next_file;
};
// Default constructor for the file class
file::file(){
}
// Constructor to allow setting initial values for name
// and hash
file::file(string initial_name, string initial_hash){
}
// Set the name of the file
void file::set_name(string new_name){
}
// Find out what the name of the file is
string file::get_name(){
}
// Set the value of the hash
void file::set_hash(string new_hash){
}
// Find the value of the hash
string file::get_hash(){
}
// Set the next file
void file::set_next_file(file * new_file){
}
// Get the next file
file * file::get_next_file(){
}
// Destructor - nothing really needed here
file::~file(){
}
// overload the output operator
ostream &operator<<(ostream &output, file f) {
}
// overload the input operator
istream &operator>>(istream &input, file &f){
}
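If the stream operators are new to you, here is a minimal sketch of the pattern on a made-up struct called record (a stand-in, not the file class itself). The important part is that each operator hands back the stream it was given, which is what lets you chain things like cout << x << endl.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-in with public members, used only to show the
// operator pattern. The real file class keeps its members private,
// which is why its operators are declared as friends.
struct record {
    std::string name;
    std::string hash;
};

// Write "name hash" and return the stream so calls can be chained.
std::ostream &operator<<(std::ostream &output, const record &r) {
    output << r.name << " " << r.hash;
    return output;
}

// Read two whitespace-separated words into the record.
std::istream &operator>>(std::istream &input, record &r) {
    input >> r.name >> r.hash;
    return input;
}
```

Note that the input operator takes its record by reference. If it took a copy, the values read in would be thrown away when the function returned.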
Your first task is to complete the file class and to make sure that it
works. You should use the main function below to do so.
///////////////////////////////
//
// Main for testing the file class
//
int main (){
    file * test = NULL;
    test = new file();
    test->set_name("Test");
    cout << "File name should be 'Test' and it is: " << test->get_name() << endl;
    test->set_hash("9eb8c6d611097c8fba484d399d7d9e97");
    cout << "Hash should be 9eb8c6d611097c8fba484d399d7d9e97 and it is: "
         << test->get_hash() << endl;
    // Make sure the friend functions work
    cout << "The next two lines should be the same: " << endl;
    cout << "Test 9eb8c6d611097c8fba484d399d7d9e97" << endl;
    cout << *test << endl;
    cout << "Type the two words 'foo bar' and hit enter:";
    cin >> *test;
    cout << "The next line should say 'foo bar':" << endl;
    cout << *test << endl;
    // Make sure pointer operations work
    file * foo = new file();
    test->set_next_file(foo);
    cout << "The next file value should be " << foo << " and it is: " <<
        test->get_next_file() << endl;
}
Step 2
Now that the file class is working, we can rewrite our
case_file class to use it. The prime thing to remember is
that the two vectors for the file names and hashes will be
gone, replaced with the single linked list of file
objects. The class definition changes a little, and now
looks like this:
class case_file{
public:
    case_file();
    void load_case_file();
    void print_file_names();
    void print_file_hashes();
    void print_comments();
    void print_case_file();
    int number_of_files();
    void add_file(string, string);
    int find_hash_matches(hash_set);
private:
    file * file_list;
    void add_file_to_list(file *);
    vector<string> comments;
};
// Constructor for the case file. Needed to make sure things
// are properly initialized. Woe happens if pointers are random.
case_file::case_file(){
}
// This method asks for a file name, then opens the file, reads the
// comments from the first three lines and the file names and hashes
// after that. It stores the comment results in order in the comments
// vector, and the hashes in a linked list of file objects
void case_file::load_case_file(){
};
// This method prints the comments from the file neatly
// on the screen
void case_file::print_comments(){
};
// This method prints the names of the files neatly
// on the screen
void case_file::print_file_names(){
};
// This method prints the hashes of the files neatly
// on the screen
void case_file::print_file_hashes(){
};
// This method prints the entire case file on the screen,
// comments first, and then traverses the file list and uses cout
// to print each file object
void case_file::print_case_file(){
};
// this method returns the number of files in the case file
int case_file::number_of_files(){
};
// This method takes a file name and a file hash as parameters
// and then creates a new file object with those values and
// adds it to the file list.
void case_file::add_file(string new_name, string new_hash){
};
// This method takes a hash_set object containing the known hashes as
// a parameter. It then finds all case hashes that match, and prints those
// to the screen. It returns the total number of matches.
int case_file::find_hash_matches(hash_set known_hashes){
};
// This is a private function that will add a pointer to a file to the
// file list. I add it to the head of the list
void case_file::add_file_to_list(file * new_file){
}
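The linked list mechanics are the new part of this project, so here is a rough sketch of the three basic moves: starting with an empty list, adding at the head, and walking the list. It uses a made-up node struct and free functions (push_front, count_nodes, free_list are hypothetical names) rather than the file and case_file classes.

```cpp
#include <cstddef>
#include <string>

// Hypothetical list node: one value plus a pointer to the next node.
struct node {
    std::string value;
    node *next;
};

// Add a new node at the head and return the new head of the list.
// An empty list is represented by a null head pointer.
node *push_front(node *head, const std::string &value) {
    node *n = new node;
    n->value = value;
    n->next = head;                     // old head becomes second node
    return n;
}

// Walk the list from head to the null pointer at the end, counting nodes.
int count_nodes(const node *head) {
    int count = 0;
    for (const node *cur = head; cur != nullptr; cur = cur->next) {
        ++count;
    }
    return count;
}

// Delete every node, saving the next pointer before each delete.
void free_list(node *head) {
    while (head != nullptr) {
        node *next = head->next;
        delete head;
        head = next;
    }
}
```

Everything depends on the head pointer starting out null. If it starts with a random value, the traversal loop has no way to know where the list ends.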
Since the public interface of the class hasn't changed, we can use the
same main from Project 4 to test it, and the same one for the menu. They
are included in the file you can copy over.
Resources
As with the previous projects, you can copy my solution to your
gusun account and play with it as needed. To do this,
type:
cp ~clay/list-hashes ./
You can also copy over the small sample case
file or the sample hash set by typing
the following two commands:
cp ~clay/case ./
cp ~clay/hash_set ./
What to turn in
Include the following header in your source code.
//
// Project 5
// Name: <your name>
// E-mail: <your e-mail address>
// COSC 071
//
// In accordance with the class policies and Georgetown's
// Honor Code, I certify that I have neither given nor
// received any assistance on this project with the
// exceptions of the lecture notes and those items noted
// below.
//
//
// Description: <Describe your program>
//
You will submit your source code using the submit
program. This is the .cc file. Do not submit the compiled version! I
don't speak binary very well.
To submit your program, make sure there is a copy of the source code
on your account on gusun. You may name your program what you
like - let's assume that it is called listhashes.cc. To submit
your program electronically, use the submit program like we
did in Homework 2 and Projects 1, 2, and 3, but with the command:
submit -a p5 -f listhashes.cc