======================================
   Lec-1-HW-5-C-conventions.html
======================================

What to turn in:
A cover sheet w/ the usual info (name, course, HW title, date). Also
include notes about what you worked on, and where the files are in
your branch. Attached to the coversheet or on it, include your 
answers to the questions below.

======================================

------------------------
-- Step 1. Get code.
------------------------
In src/ there is a directory, src/os-C-sources/. It contains
some program skeletons for a C-based implementation of our OS:

   os.c              (base C code for the OS)
   utils.asm         (OS low-level utilities)
   lib.asm           (user library)
   Makefile          (commands for assembling, compiling, and linking)

Some of the assembly routines in lib.asm and utils.asm are called by 
C code and need to have C prototypes, which are defined in .h files.
We will explore the above source code and the compiler's output.
You probably already have a copy of os-C-sources in your src2/. Otherwise,
copy it and add it to your branch. If you already have it, you will
want to copy updated files from src/. If you have src/ checked out, do
   svn up
in src/ to get the newer files. Otherwise, copy using a web browser.

There are also some link objects (.asm for lcc), load objects (.obj), 
and associated symbol tables (.sym). The C compiler was run on os.c 
and it linked in the .asm files, producing a.asm. That was edited
to fix its .ORIG directive*, resulting in os.asm. Assembling os.asm
produced os.obj and os.sym. If you want to do this processing 
yourself, you will need to build lcc. See src/Makefile.

* lcc always uses .ORIG x3000 when it produces a.asm. We want
it to have .ORIG x0200 (see src2/Makefile).

------------------------
-- Step 2. Inspect the code.
------------------------

Let's begin by looking at os.c. It is a skeleton of a potential OS. 
It should all look familiar from our discussions of our OS.

We next take a quick look at our assembly source code. You will see 
some unfamiliar notation in these files. These are link directives 
for lcc's linker. Linking in lcc is done at the assembly level, and 
these files are actually written as lcc link objects.

First, take a look at utils.asm. These are our standard OS functions,
extended a bit by adding in lcc linking directives and stack-protocol
based parameter passing. Obviously, the operations in utils.asm cannot 
be expressed in C nor called directly by C because (1) they operate
directly on the hardware, (2) they do not obey stack protocol, and 
(3) they are meant to be called via the TRAP instruction. (The exception 
is setVTentry, which is C callable and is there as a convenience to 
make simulating booting the OS easier.)

Next, look at lib.asm. These library routines are called via C function 
calls, and are simply there to make TRAP calls, which are impossible to 
express in C. They also handle converting from stack-based parameter
passing to register-based parameter passing that TRAP calls expect. 
In real systems, the lib.asm library routines would be available to 
users, the OS utility code would not.

The lcc compiler first processes os.c, producing an .asm link object.
It then links in lib.asm and utils.asm, producing a.asm. The Makefile
shows how we edit a.asm to fix the .ORIG. You will see some funny 
looking code in os.asm: lcc is a stupid compiler; so, it produces some 
stupid code. It gets the job done, but in some weird ways. The
compiler then runs lc3as on os.asm to get os.obj and os.sym.

Also linked in is library code for printf, which can be found in 
lcc's library in bin/lcc-1.3/. Scan through os.asm to see
how it is laid out. Go all the way to the end of the file. You
will see many labels that were generated by lcc. When you get to 
the Global Data Table, you will see assembly directives like this,
    .STRINGZ "hi"
The result is equivalent to this,
    .FILL x0068     ;-- ASCII 'h'
    .FILL x0069     ;-- ASCII 'i'
    .FILL x0000     ;-- ASCII NUL
which is a "C-style, null-terminated string".

------------------------
-- Step 3. Trace execution and answer questions.
------------------------

Put a copy of PennSim.jar in your src2/os-C-sources/, start it
running, and load os.obj. Use "Step" to execute the code one instruction 
at a time. Step through the program up to, but not including, execution of
the first JSRR. You are stepping through the PREAMBLE.

Q. What is the stack pointer, SP aka R6, initially set to by the PREAMBLE?
What is the base pointer, BP aka R5, set to? What is the Global Data
Pointer, GDP aka R4, set to?

Now, execute ("Step") that JSRR.

Q. What function did it jump to? Where is the source code for that function?
At what address does this function begin? When the function finishes and
does RET (aka JMP R7), where will it return to? What will happen
immediately after that and what is its purpose?

The next 7 instructions are lcc's C protocol for entering a function
and setting up its call frame. R7 is pushed, and BP/R5 is pushed.
After that, space is allocated for local variables on the stack.
The BP is set to the first local variable and SP/R6 is set to the 
last local variable (which might be the same location if there are none
or only one local variable: one word is always allocated). You will find
it informative to look at os.asm at this point as it includes comments,
which you cannot see in PennSim, of course.

Q. Draw a picture of the stack. Show the bottom of the stack, its
address, and all the words that have been allocated so far.
Show the current words pointed to by the SP and BP. What addresses do 
those words contain? What are those addresses? How many local variables 
were allocated? Show the values that were pushed. You can 
use PennSim's "l" command to see the memory content of the stack.
Just "l addr" where "addr" is the 16-bit memory address you want to
see. Or, you can simply scroll through memory to see the stack. To
get back to where you were, edit the PC's value (say, subtract 1),
click any other PennSim area, then restore the PC's old value. You
can hop around in memory this way, i.e., by altering the PC value.
Just remember what it was so you can restore it before you do your
next execution "Step".

The next two instructions fetch a value from the Global Data table 
(GDT) addressed via R4 (the Global Data Pointer, GDP). The first 
instruction just gets something into R7. "Step" it. 

Q. What address is in R7? What label is associated with that address?
You can figure this out by looking at the address in R4 and the offset 
that was added to it. Scroll down in PennSim's memory display and look 
at what label is found there. NOTE: in a real machine, no symbolic 
information is available, only machine code. PennSim uses os.sym to figure
out the labels associated with addresses. The C compiler generates most of
the labels you see.

Q. Because we are looking at the assembled code, labels that were in 
os.asm might not be available or used by PennSim. Take a look at 
os.asm. Find that label you just found. There is a second label
associated with the same address. What is it?  What is the last label of 
the Global Data Table (GDT)? What value is in the last GDT entry?

The next instruction actually does the data fetch using R7's content
as an address and storing what is fetched into R7. "Step" that
instruction.

Q. That value is associated with some C statement in os.c. Which statement?
Recall which function we are currently in. Hint: this compiler does not
produce executable code for a C variable declaration. Is there a variable
referenced or a constant value in this statement? The value in R7 is
either a constant value or the value of a variable. Which is it?

The next 11 instructions use R4 to fetch something into R3. The first
10 instructions load R3 with the value of R4 plus a constant. R3
then has an address in it. The way the constant gets added looks 
really stupid: it takes 10 instructions to get the constant (#150 or
x96) added. "Step" through those 10 instructions.

Q. What address is now in R3? What value is in that location?

Q. R3 now contains an address. Look at that address in PennSim's memory
and find the associated label. What label? Look in os.asm to find that
label. Is this part of the GDT? How many words are in the GDT according
to os.asm's content? Is the constant added into R3 to form an address
bigger than the number of words in the GDT in os.asm? Hint: the
strings roughly average 12 chars each. What is your estimate of the
size of the GDT?

Q. The next 3 instructions perform an operation using R7 and R3. What is
the operation? This is immediately followed by a BR instruction. What
is being tested here? Hint: look at os.c, is there an "if" statement?
According to the BR instruction, if we do not take the BR, what condition
is true? Would that be the TRUE or FALSE (THEN or ELSE) part of the
if() statement? Note the 3 instructions immediately following the BR.
Look in the GDT to see what address the RET (JMP R7) will jump to
if the BR is not taken. Set a breakpoint on that instruction (click 
the box to the left of where the address for that instruction is
displayed; it will turn red).

The 17 instructions below the RET prepare the stack for a function call.
(Actually, below that are also several more function calls.)

Q. As you "Step" through these 17 instructions, notice what values are 
being pushed on the stack. Draw the stack's contents as it is after you
have done that, showing where the SP is pointing.

The next 3 instructions prepare to jump to a subroutine: R0 is loaded
from the GDT and then "JSRR R0" makes the jump. Step through the first
two, but do not execute the "LDR R0, R0, #0" instruction.

Q. R0 gets an address of an entry in the GDT. What is the address
of the entry? Look at the content of that entry in the GDT. What
value is stored there? What label is associated with that entry?
Look at the memory location addressed by that entry's value. What
label do you find? Now "Step" the next instruction. It loads R0
with what content? Now "Step" the JSRR instruction. What function
have we entered? In which source file is this function defined?
Note: we have just dereferenced a function pointer variable. As we
noted in class, array names and function names are pointer variables
in C.

We have jumped into a function. The function call in os.c shows two
arguments. The first is a function pointer, the second is a memory
address. Two arguments were pushed to the stack just before
we jumped to this function. The first C argument was pushed last. What
was pushed for that argument was the address of a function pointer
variable. The next few instructions will save R0 and R1 on the stack,
then get the arguments into R0 and R1. Watch R0 as it gets loaded.

Q. First R0 gets the pointer variable's address. Then what does it get?
Look in memory at that location. What label is associated with the address
in R0? 

We are executing setVTentry. If you look at the comments in utils.asm 
for this function, you will see a C prototype declaration. The first 
argument is a pointer to a function whose type is "void f( void )". 
A function pointer in an argument list of a C routine would look like,

    int foo (  void (*f_ptr)( void )  ) { ... body of foo ... }

and from that we would assume the formal argument, f_ptr, is a pointer
variable that, when dereferenced, provides the address of a function. We
would invoke or jump to that function using this syntax in the body of
foo:

    (*f_ptr)();

which says, find the memory location associated with the variable f_ptr, 
get its content, and use that as a jump address.

Stepping execution a bit more, the content of R0 is written to memory.

Q. What memory location is written into? What value is written to that
location? What have we just accomplished?

Continue stepping until you hit a RET, and Step it. We are back
in main(). There are 3 more setVTentry() function calls, but we will
not step through them. Instead, "Continue" execution. This will run
until it hits the breakpoint we set earlier. Now Step until we make
a function call.

Q. What function have we jumped into? In which source file is it
defined?

Keep Stepping until you reach the next JSRR. As you do this, watch
what gets pushed to the stack. Notice that we have just entered a
function; so, the function's startup code executes, setting up its
call frame. Some space is allocated on the stack for a local variable.
The stack handling by lcc is a bit buggy; so, don't worry if it
doesn't look quite right.

It so happens that our stack is located in PennSim's memory-mapped
video VRAM area. You can see the pixels in PennSim's Devices area have
changed color as non-zero values were pushed. This is a programming
error, given PennSim's architecture model. But, the standard LC3
does not have VRAM.

We are about to jump to do_printc_trap(). This is a library routine
that executes TRAP x21. The argument was pushed to the stack. Before
that, the return address was pushed to the stack. 

Q. Show the stack as it is now. How many call frames are on the 
stack? What is the current return address on the stack (i.e., the
address that do_printc_trap() will use to exit and return)?
The argument on the stack has what value? What does that value
represent?

Step into do_printc_trap(). It saves a couple of registers,
gets its argument off the stack, puts it into R0, then does TRAP x21.
Step through that and make the TRAP call. Step through the TRAP
routine, watching how it saves registers, gets its argument, and
makes its return.

Q. Does TRAP x21 (aka printc_trap in utils.asm) use the stack at
all? Did the LC3 display change (see PennSim's Devices)? How does
it make its return and where did it get its return address?

Continue Stepping until you exit do_printc_trap(), watching how
it gets its return address just before exiting.

Q. Does do_printc_trap() get its return address the same way
that printc_trap() did?

The lcc library includes printf(), which uses TRAP x21 to output
characters. If you Step far enough, you will enter the PRINTF
lcc library routine.

Q. Does PRINTF use do_printc_trap()? What are the arguments to
the first call to PRINTF? How does PRINTF get its arguments?
How many times will this first call to PRINTF call TRAP x21?

Running this code long enough will call SCANF to get keyboard input.
If you click on the text output area of PennSim's Devices, you can
then type and the keystrokes will be read by TRAP x20.

Q. Does SCANF ever jump into code in utils.asm?

Q. (Extra Credit) How does SCANF return its return value to the 
calling code?