COSC 071: Computer Science I

Project 4
Spring 2002

Due: Mon., Apr 15 @ 5 PM
9 points

In this project, we are going to develop a tool to help doctors store information about and diagnose patients with breast cancer. We will be using information from the Wisconsin Breast Cancer database.

Each record in this database corresponds to a woman who has been diagnosed with either a benign or a malignant tumor. There are 697 such records. In addition to the diagnosis and a seven-digit identifier, there are nine numeric diagnostic measures for cell size, cell shape, and the like. Each numeric measure is defined on the range [1, 10].

The senior software engineers of the team have already defined the requirements for this project. The program should guide the doctors through the steps of (1) entering a case, (2) finding the most similar case in the database, (3) updating the diagnosis, and (4) adding the case to the database, if desired. They've provided a rough flowchart of how the system should work. (Be thankful it's not on the back of a napkin.) They have also provided a high-level design for the classes and a sample run. Finally, here's the data set: wbc.dta. For the interested, here are the complete details of the data set: wbc.names.

When your program is working properly, enter the following two cases and find the most similar case for each. In the header comments of your program, type whether the most similar case was benign or malignant.

1017122 8 10 10 8 7 10 9 7 1 unknown
1276091 5 1 1 3 4 1 3 2 1 unknown
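Each case above is a single whitespace-separated line: an identifier, nine measures, and a diagnosis. As a rough sketch of how such a line could be read, the following uses a hypothetical Record struct and parseRecord function. Neither name comes from the provided class design, which your Patient class must follow instead.

```cpp
#include <sstream>
#include <string>

// Hypothetical layout, for illustration only. The real Patient class
// must follow the class design provided for the project.
struct Record {
  std::string id;         // seven-digit identifier
  int measures[9];        // nine diagnostic measures
  std::string diagnosis;  // e.g., "unknown"
};

// Reads one whitespace-separated record from a line of text.
// Returns true only if every field was read successfully.
bool parseRecord(const std::string& line, Record& r)
{
  std::istringstream in(line);
  in >> r.id;
  for (int i = 0; i < 9; i++)
    in >> r.measures[i];
  in >> r.diagnosis;
  return !in.fail();
}
```

The same sequence of extractions works on any input stream, so the idea carries over to reading from cin or from a file.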

You can assume that the user is omniscient and benevolent, meaning that you don't have to worry much about error checking. Check for files, but don't worry about the measures being between 1 and 10.
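Checking for files can be as simple as testing the stream right after opening it. Here is a minimal sketch. The function name openForReading and the wording of the message are assumptions made for illustration, not part of the specification.

```cpp
#include <fstream>
#include <iostream>

// Tries to open the named file for reading. Prints a message to
// standard error and returns false if the file cannot be opened.
bool openForReading(std::ifstream& fin, const char* filename)
{
  fin.open(filename);
  if (!fin) {
    std::cerr << "Cannot open " << filename << std::endl;
    return false;
  }
  return true;
}
```

A caller might pass "wbc.dta" as the file name and stop, or ask for another name, when the function returns false.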

Hints for Development

When I post the assignment, we won't have covered the C++ vector class, which is required to implement the PatientList class. However, you should have everything you need to start working on the Patient class. Start by implementing the methods of this class, and write main functions to test your class as you go. This will help build your intuitions about how the class works and how it fits into the PatientList class once you start its implementation.

I would implement the default constructor and the print method of the Patient class first. Then write a short main function to test it:

int main()
{
  Patient p;
  p.print();
  return 0;
} // main
What should this print? Something like:
ID:
Clump thickness: 0
Uniformity of cell size: 0
Uniformity of cell shape: 0
Marginal adhesion: 0
Single epithelial cell size: 0
Bare nuclei: 0
Bland chromatin: 0
Normal nucleoli: 0
Mitoses: 0
Diagnosis:
Next, implement the enter method and test it using the main function:
int main()
{
  Patient p;
  p.enter();
  p.print();
  return 0;
} // main
Continue in this fashion until you've implemented the rest of the methods. Be sure to test as you go. The last test should involve your stream insertion and extraction operators and your similarity function. Create a data file with two patient records. Calculate the measure of similarity using pen and paper (and calculator). Then execute the following program. You should get the contents of the file printed to the console and the same similarity measure:
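#include <fstream>
#include <iostream>
using namespace std;
// (also include the declaration of your Patient class)

int main()
{
  Patient p1, p2;
  ifstream fin("test.dta");
  if (!fin) {  // check that the file opened
    cerr << "Cannot open test.dta" << endl;
    return 1;
  }
  fin >> p1 >> p2;  // read both records
  cout << p1 << endl;
  cout << p2 << endl;
  cout << p1.similarity(p2) << endl;
  return 0;
} // main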
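This page does not define the similarity measure itself, so use the one given in the class design. Purely to illustrate the pen-and-paper check, the sketch below assumes Euclidean distance over the nine measures, where a smaller value means a more similar case. The function name euclideanDistance is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// ASSUMPTION: Euclidean distance stands in for the similarity measure.
// Replace it with whatever measure the class design specifies.
double euclideanDistance(const std::vector<int>& a, const std::vector<int>& b)
{
  assert(a.size() == b.size());  // both cases need the same number of measures
  double sum = 0.0;
  for (std::size_t i = 0; i < a.size(); i++) {
    double diff = a[i] - b[i];
    sum += diff * diff;  // accumulate the squared difference
  }
  return std::sqrt(sum);
}
```

Under this assumption, two records that differ by 3 in one measure and by 4 in another, and agree everywhere else, are at distance 5, which is easy to confirm by hand before running the program.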
Once the Patient class is finished and tested, then start working on the PatientList class in the same manner. By then, you'll know about the C++ vector class.
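As a preview of the vector class, here is a small sketch of the linear scan that a "most similar case" search might use: compute one score per stored case, then keep the index of the best one. It assumes that smaller scores mean more similar cases. The function name indexOfSmallest is hypothetical and is not part of the PatientList design.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index of the smallest value in a non-empty vector.
// When several values tie, the earliest index wins.
std::size_t indexOfSmallest(const std::vector<double>& values)
{
  assert(!values.empty());
  std::size_t best = 0;
  for (std::size_t i = 1; i < values.size(); i++)
    if (values[i] < values[best])
      best = i;
  return best;
}
```

A vector grows as needed with push_back, reports its length with size, and is indexed like an array, which is all this scan relies on.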

Although you may add private methods to the classes, you cannot add functions or public methods. You also cannot change the interfaces of any of the public methods (because others are coding to this specification).

Instructions for Electronic Submission: At the top of the file containing your source code (i.e., the file containing the C++ instructions), place the following header comment, with the appropriate modifications:

//
// Project 4
// Name: <your name>
// E-mail: <e-mail address>
// Instructor: Maloof
// TA: <TA's name>
// COSC 071
//
// In accordance with the class policies and Georgetown's Honor Code,
// I certify that, with the exceptions of the lecture notes and those
// items noted below, I have neither given nor received any assistance
// on this project.
//
// Description: <Describe your program>
//

Although you may use any C++ compiler to develop your program, it must run under UNIX and must compile using GNU g++. When you are ready to submit your program for grading, if necessary, use ws-FTP to transfer your source file and data file from your PC to gusun. Use SSH to log on to gusun, and use pine to e-mail your source code to your TA. Use your netid followed by the suffix ".cc" as the subject.

gusun% pine

When the menu appears, select the item for composing e-mail. Assume that your netid is ab123, the name of your source file is proj1.cpp, and your TA's e-mail address is "imagoodtamaloof@cs".

Type your TA's e-mail address in the To field, and type your netid with the .cc suffix in the Subject field (no spaces before or after). Move the cursor down into the MESSAGE TEXT screen, and type the ^R command. Pine will ask for a file name (e.g., proj1.cpp), which it will then load as your message text. At this point, using the example values above, the To field should show imagoodtamaloof@cs, the Subject field should show ab123.cc, and the message text should contain your source code.

Finally, type ^X to send the e-mail to your TA.

IMPORTANT: Do not send your source code as an attachment. Do not use a mail client other than pine.

If you need to include a message to your TA about your submission, then type the message as a comment in the program.

Once you've submitted your project, it is important to keep an electronic copy on a university machine (e.g., gusun or cssun) that preserves the modification date and time. If we lose your project or the e-mail system breaks, then we will need to look at the modification date and time of your project to ensure that you submitted it before it was due.

The TAs who will be grading your projects this semester are listed on the main page. You must e-mail your project before 5 PM on the due date.