Spring 2010

Clay Shields


Project 2 - Paging Algorithm Performance

Due before class April 26th, 2010

As we have discussed in class, the choice of a page replacement algorithm can greatly affect the performance of a computer system. In this project, you will simulate several different algorithms and test their performance on memory trace files of programs running on a Linux system.

The Simulator

Your first task will be to write a simulation of a single-level page table that runs on your account on spoof. Assume that the page size is 4 KB and that the processor is 32-bit.
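With 4 KB pages on a 32-bit processor, the low 12 bits of an address are the offset within the page and the high 20 bits are the page number, so a single-level table has 2^20 entries. Here is a minimal sketch of that arithmetic in C. The names pte_t, page_number, and page_offset, and the fields of the entry, are illustrative assumptions rather than a required layout.

```c
#include <stdint.h>

#define PAGE_SHIFT 12u                          /* 4 KB pages: 2^12 bytes   */
#define PAGE_SIZE  (1u << PAGE_SHIFT)           /* 4096 bytes per page      */
#define NUM_PAGES  (1u << (32u - PAGE_SHIFT))   /* 2^20 virtual pages       */

/* Hypothetical page table entry; the fields are one possible choice. */
typedef struct {
    int32_t frame;       /* index of the physical frame, or -1 if not resident */
    uint8_t dirty;       /* set when the page has been written since loading   */
    uint8_t referenced;  /* set on every access; useful for clock-style sweeps */
} pte_t;

/* High 20 bits of the address select the page. */
static inline uint32_t page_number(uint32_t addr)
{
    return addr >> PAGE_SHIFT;
}

/* Low 12 bits of the address are the offset inside the page. */
static inline uint32_t page_offset(uint32_t addr)
{
    return addr & (PAGE_SIZE - 1u);
}
```

A flat array of NUM_PAGES entries indexed by page_number(addr) gives constant-time lookups, which helps with the timing requirement.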

Program speed matters, so I suggest C or C++. Java is acceptable but not preferred. Interpreted languages such as Perl or Ruby are right out.

Running the program

The program should be contained in a single file named with your netid. For example, my simulation would be named clay.c. Your program should accept the following arguments:

<yournetid.c> <number of frames> <algorithm to use> [verbose]

where:

number of frames is the number of physical frames of RAM available to the process.

algorithm to use is one of the following (see below for more information about which to implement):

FIFO: First In, First Out
GC: Global Clock
LFU: Least Frequently Used
LRU: Least Recently Used
MFU: Most Frequently Used
OPT: Belady's Optimal Algorithm
RAND: Random

verbose is an optional flag that will provide additional output to show how your simulator is running.
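One way to turn the algorithm argument into something the simulator can switch on is sketched below. The enum, its constant names, and parse_algorithm are hypothetical. Matching case-insensitively is an assumption, made because the algorithm names are listed in upper case while the sample run passes opt in lower case. strcasecmp is a POSIX function declared in strings.h.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical identifiers for the algorithms named in the assignment. */
typedef enum {
    ALG_FIFO, ALG_GC, ALG_LFU, ALG_LRU, ALG_MFU, ALG_OPT, ALG_RAND,
    ALG_UNKNOWN
} algorithm_t;

/* Map a command-line string to an algorithm, ignoring case.
 * Returns ALG_UNKNOWN when the name is not recognized. */
static algorithm_t parse_algorithm(const char *name)
{
    /* Order must match the enum above. */
    static const char *const names[] = {
        "FIFO", "GC", "LFU", "LRU", "MFU", "OPT", "RAND"
    };
    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++) {
        if (strcasecmp(name, names[i]) == 0)
            return (algorithm_t)i;
    }
    return ALG_UNKNOWN;
}
```

In main, the frame count could be read from argv[1] with strtoul, the algorithm from argv[2] with parse_algorithm, and verbose mode enabled whenever a fourth argument is present.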

Your simulator should read input lines from standard input of the form:

0x40000c35 W
0x40000c36 W
0x40011ac7 R
0x40011aca R
0x40000c44 W

where the first hex value is the address being accessed, and the second value indicates a read or write at that memory address. We are using standard input because the trace files are very large when uncompressed. We won't uncompress them. Instead, we will use gzcat to spew the uncompressed trace file, and pipe the output into our simulations. For example, a run might look like:

gzcat gpp.gz | clay 10 opt
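Reading the trace comes down to parsing one address and one R or W per line from standard input. The sketch below shows one way to do that with sscanf. The function name parse_trace_line and its return convention are assumptions, and it relies on unsigned int being at least 32 bits wide.

```c
#include <stdint.h>
#include <stdio.h>

/* Parse one trace line such as "0x40000c35 W".
 * On success stores the address and whether the access is a write,
 * and returns 1. Returns 0 if the line is malformed. */
static int parse_trace_line(const char *line, uint32_t *addr, int *is_write)
{
    unsigned int a;
    char op;

    /* %x accepts an optional 0x prefix; the space skips whitespace. */
    if (sscanf(line, "%x %c", &a, &op) != 2)
        return 0;
    if (op != 'R' && op != 'W')
        return 0;

    *addr = (uint32_t)a;
    *is_write = (op == 'W');
    return 1;
}
```

A typical driver would call fgets on stdin in a loop and hand each line to parse_trace_line. Note that OPT needs to know future references, so a simulator for it would have to buffer the trace (or at least the page numbers) before replaying it.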

Algorithms to Implement

You will have to implement four different page replacement algorithms. While this sounds awful given it is nice out and you are about to graduate (assuming you pass this class), with careful planning of how you store and update your page table it isn't really very difficult to implement additional algorithms. You can choose which to implement from this menu:

At least three from this list:

FIFO: First In, First Out
GC: Global Clock
LRU: Least Recently Used
OPT: Belady's Optimal Algorithm

and one of the following:

LFU: Least Frequently Used
MFU: Most Frequently Used
RAND: Random

and one algorithm of your own design.
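As an illustration of how small a replacement policy can be once the frame bookkeeping is in place, here is a sketch of the classic clock (second-chance) sweep, assuming that is what GC refers to. The frame_t layout and the name clock_pick_victim are hypothetical, and the caller must pass nframes greater than zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-frame record kept by the simulator. */
typedef struct {
    uint32_t page;        /* virtual page currently held in this frame */
    uint8_t  referenced;  /* set on each access to the page            */
    uint8_t  dirty;       /* set on each write to the page             */
} frame_t;

/* Clock sweep: starting at *hand, clear reference bits until a frame
 * whose bit is already clear is found. Returns that frame's index and
 * leaves *hand pointing at the frame after it. Requires nframes > 0.
 * Terminates within two sweeps, since the first sweep clears every bit. */
static size_t clock_pick_victim(frame_t *frames, size_t nframes, size_t *hand)
{
    for (;;) {
        size_t i = *hand;
        *hand = (*hand + 1) % nframes;
        if (frames[i].referenced)
            frames[i].referenced = 0;   /* give the page a second chance */
        else
            return i;                   /* not used since the last sweep */
    }
}
```

FIFO can reuse the same structure by ignoring the reference bit, and LRU, LFU, and MFU differ mainly in which per-frame counter or timestamp is compared when choosing the victim.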

Output

Your program should simulate the algorithm specified using the number of frames specified. When complete, it should output statistics in the following format:

Number of memory accesses:
Number of disk reads:
Number of disk writes:

Additionally, if the verbose flag is set, it should output a message every time a page is replaced. This message should indicate the hex address of the page being replaced, the page replacing it, and whether the replaced page is overwritten or written to disk first. For example:

Page 60970 overwritten by 60929
or
Page 60970 swapped, replaced by 60929
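The two messages correspond to whether the evicted page is clean or dirty. The sketch below ties the verbose message to the disk-write counter. It assumes that evicting a dirty page costs exactly one disk write and that page numbers are printed in hex without a prefix, as in the examples above. The names stats_t, record_eviction, and print_stats are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical counters for the final report. */
typedef struct {
    unsigned long accesses;
    unsigned long disk_reads;
    unsigned long disk_writes;
} stats_t;

/* Account for replacing old_page with new_page and format the verbose
 * message into buf. A dirty page is assumed to cost one disk write. */
static void record_eviction(stats_t *st, uint32_t old_page, int old_dirty,
                            uint32_t new_page, char *buf, size_t buflen)
{
    if (old_dirty) {
        st->disk_writes++;
        snprintf(buf, buflen, "Page %x swapped, replaced by %x",
                 (unsigned)old_page, (unsigned)new_page);
    } else {
        snprintf(buf, buflen, "Page %x overwritten by %x",
                 (unsigned)old_page, (unsigned)new_page);
    }
}

/* Print the end-of-run statistics in the required format. */
static void print_stats(const stats_t *st)
{
    printf("Number of memory accesses: %lu\n", st->accesses);
    printf("Number of disk reads: %lu\n", st->disk_reads);
    printf("Number of disk writes: %lu\n", st->disk_writes);
}
```

In verbose mode the caller would print buf after each replacement. For timed runs the message can be skipped entirely, but the counter still has to be updated.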

Trace Files

These trace files were gathered on a Linux system using pin. Because full traces are prohibitively large, they have been reduced in the number of references. They are listed below in order of number of memory accesses.

g++ compilation of an 071 final project, ~320K references: gpp.gz
A portion of an emacs run, ~2M references: emacs.sm.gz
A few seconds of the top utility running, ~5M references: top.gz
A run of the 071 infamous starfish solution, ~7M references: stars.gz
A longer part of an emacs run, ~10M references: emacs.gz
LaTeX run of a single-page letter of recommendation, ~31M references: latex.gz

Rather than downloading everything to your own directory (though you can if you want), you can reach each of these files on spoof using the path: ~clay/

The need for speed

Speed counts. We want our systems to be fast. We will therefore be using the command /usr/bin/time to get a measure of how fast our programs are. Make sure that you time your program, not gzcat. Do not use the verbose flag for timed runs.

You can time your programs like this:

gzcat emacs.sm.gz | /usr/bin/time clay 10 opt

You will get additional output that looks like this:

real 5.8
user 4.3
sys 0.7

The time we are most interested in is the user time, since that is how much CPU time your program took to run on that trace. There will be a bonus for fast programs, and a deduction for really slow ones.

What to turn in:

You will e-mail me the following prior to class:

The code of your simulator, with a spoof command line on how to compile it included in the header comments.
A nicely formatted (meaning printable) report that shows the results of each of your implemented algorithms on each trace file, including timing information. Use 50 frames for this analysis. A table showing all this would be perfect.
A description of your page replacement algorithm; why you think it should work well, and how it met your expectations.