======================================
Lec-1-HW-5-C-conventions.html
======================================

What to turn in:

A cover sheet w/ the usual info (name, course, HW title, date). Also include notes about what you worked on, and where the files are in your branch. Attached to the coversheet or on it, include your answers to the questions below.

======================================

------------------------
-- Step 1. Get code.
------------------------

In src/ there is a directory, src/os-C-sources/. It contains some program skeletons for a C-based implementation of our OS:

    os.c       (base C code for the OS)
    utils.asm  (OS low-level utilities)
    lib.asm    (user library)
    Makefile   (commands for assembling, compiling, and linking)

Some of the assembly routines in lib.asm and utils.asm are called by C code and need to have C prototypes, which are defined in .h files. We will explore the above source code and the compiler's output.

You probably already have a copy of os-C-sources in your src2/. Otherwise, copy it and add it to your branch. If you already have it, you will want to copy updated files from src/. If you have src/ checked out, do svn up in src/ to get the newer files. Otherwise, copy using a web browser.

There are also some link objects (.asm for lcc), load objects (.obj), and associated symbol tables (.sym). The C compiler was run on os.c and it linked in the .asm files, producing a.asm. That was edited to fix its .ORIG directive*, resulting in os.asm. Assembling os.asm produced os.obj and os.sym. If you want to do this processing yourself, you will need to build lcc. See src/Makefile.

* lcc always uses .ORIG x3000 when it produces a.asm. We want it to have .ORIG x0200 (see src2/Makefile).

------------------------
-- Step 2. Inspect the code.
------------------------

Let's begin by looking at os.c. It is a skeleton of a potential OS. It should all look familiar from our discussions of our OS.

We next take a quick look at our assembly source code.
You will see some unfamiliar notation in these files. These are link directives for lcc's linker. Linking in lcc is done at the assembly level, and these files are actually written as lcc link objects.

First, take a look at utils.asm. These are our standard OS functions, extended a bit by adding in lcc linking directives and stack-protocol based parameter passing. The operations in utils.asm can neither be expressed in C nor called directly from C because (1) they operate directly on the hardware, (2) they do not obey stack protocol, and (3) they are meant to be called via the TRAP instruction. (The exception is setVTentry, which is C callable and is there as a convenience to make simulating booting the OS easier.)

Next, look at lib.asm. These library routines are called via C function calls, and are simply there to make TRAP calls, which are impossible to express in C. They also handle converting from stack-based parameter passing to the register-based parameter passing that TRAP calls expect. In real systems, the lib.asm library routines would be available to users; the OS utility code would not.

The lcc compiler first processes os.c, producing an .asm link object. It then links in lib.asm and utils.asm, producing a.asm. The Makefile shows how we edit a.asm to fix the .ORIG. You will see some funny looking code in os.asm: lcc is a stupid compiler, so it produces some stupid code. It gets the job done, but in some weird ways. The compiler then runs lc3as on os.asm to get os.obj and os.sym. Also linked in is library code for printf, which can be found in lcc's library in bin/lcc-1.3/.

Scan through os.asm to see how it is laid out. Go all the way to the end of the file. You will see many labels that were generated by lcc.
When you get to the Global Data Table, you will see assembly directives like this,

    .STRINGZ "hi"

The result is equivalent to this,

    .FILL x0068 ;-- ASCII 'h'
    .FILL x0069 ;-- ASCII 'i'
    .FILL x0000 ;-- ASCII NUL

which is a "C-style, null-terminated string".

-------------------------
-- Step 3. Trace execution and answer questions.
-------------------------

Put a copy of PennSim.jar in your src2/os-C-sources/, start it running, and load os.obj. Use "Step" to execute the code one instruction at a time. Step through the program up to, but not including, execution of the first JSRR. You are stepping through the PREAMBLE.

Q. What is the stack pointer, SP aka R6, initially set to by the PREAMBLE? What is the base pointer, BP aka R5, set to? What is the Global Data Pointer, GDP aka R4, set to?

Now, execute ("Step") that JSRR.

Q. What function did it jump to? Where is the source code for that function? At what address does this function begin? When the function finishes and does RET (aka JMP R7), where will it return to? What will happen immediately after that and what is its purpose?

The next 7 instructions are lcc's C protocol for entering a function and setting up its call frame. R7 is pushed, and BP/R5 is pushed. After that, space is allocated for local variables on the stack. The BP is set to the first local variable and SP/R6 is set to the last local variable (which might be the same location if there are none or only one local variable: one word is always allocated). You will find it informative to look at os.asm at this point, as it includes comments, which you cannot see in PennSim, of course.

Q. Draw a picture of the stack. Show the bottom of the stack, its address, and all the words that have been allocated so far. Show the current words pointed to by the SP and BP. What addresses do those words contain? What are those addresses? How many local variables were allocated? Show the values that were pushed.
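To check your stack picture against the entry protocol described above, it can help to model that protocol in a few lines of C. The following is only a sketch under stated assumptions: the Machine struct, push(), and enter_function() are hypothetical names, and it models only the steps named in the text (push R7, push R5, reserve at least one word, set BP and SP), not the exact seven instructions lcc emits. Any addresses you feed it are made up; read the real ones from PennSim.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the state touched on function entry.
 * mem is word-addressed, like LC3 memory. */
typedef struct {
    uint16_t r5;            /* BP: base pointer   */
    uint16_t r6;            /* SP: stack pointer  */
    uint16_t r7;            /* return address     */
    uint16_t mem[1 << 16];  /* 64K words          */
} Machine;

/* Push one word; the stack grows toward lower addresses. */
void push(Machine *m, uint16_t value)
{
    m->r6 = (uint16_t)(m->r6 - 1);
    m->mem[m->r6] = value;
}

/* Entry protocol as described in the text (an assumption-level model):
 * push R7, push R5, reserve at least one word for locals,
 * leave BP on the first local and SP on the last local. */
void enter_function(Machine *m, int nlocals)
{
    assert(nlocals >= 0);
    int words = (nlocals < 1) ? 1 : nlocals;  /* one word is always allocated */

    push(m, m->r7);                           /* saved return address */
    push(m, m->r5);                           /* saved caller's BP    */
    m->r5 = (uint16_t)(m->r6 - 1);            /* BP -> first local    */
    m->r6 = (uint16_t)(m->r6 - words);        /* SP -> last local     */
}
```

With zero or one local, BP and SP end up at the same word, which matches the remark above that one word is always allocated.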
You can use PennSim's "l" command to see the memory content of the stack. Just type "l addr", where "addr" is the 16-bit memory address you want to see. Or, you can simply scroll through memory to see the stack. To get back to where you were, edit the PC's value (for example, subtract 1), click any other PennSim area, and then restore the PC's old value. You can hop around in memory this way, i.e., by altering the PC value. Just remember what it was so you can restore it before you do your next execution "Step".

The next two instructions fetch a value from the Global Data Table (GDT) addressed via R4 (the Global Data Pointer, GDP). The first instruction just gets something into R7. "Step" it.

Q. What address is in R7? What label is associated with that address? You can figure this out by looking at the address in R4 and the offset that was added to it. Scroll down in PennSim's memory display and look at what label is found there.

NOTE: in a real machine, no symbolic information is available, only machine code. PennSim uses os.sym to figure out the labels associated with addresses. The C compiler generates most of the labels you see.

Q. Because we are looking at the assembled code, labels that were in os.asm might not be available or used by PennSim. Take a look at os.asm. Find that label you just found. There is a second label associated with the same address. What is it? What is the last label of the Global Data Table (GDT)? What value is in the last GDT entry?

The next instruction actually does the data fetch, using R7's content as an address and storing what is fetched into R7. "Step" that instruction.

Q. That value is associated with some C statement in os.c. Which statement? Recall which function we are currently in. Hint: this compiler does not produce executable code for a C variable declaration. Is there a variable referenced or a constant value in this statement? The value in R7 is either a constant value or the value of a variable. Which is it?
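The two-instruction fetch just described (form an address from R4 plus an offset, then load through that address) can be written out in C as a small sketch. The function names gdt_entry_address() and load_word() are hypothetical, and the offset and memory contents are whatever you observe in PennSim; nothing here is taken from os.asm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Step 1: form the address of a Global Data Table entry.
 * Models "R7 = R4 + offset"; 16-bit arithmetic wraps like the LC3's. */
uint16_t gdt_entry_address(uint16_t gdp, int16_t offset)
{
    return (uint16_t)(gdp + offset);
}

/* Step 2: fetch the word stored at that address.
 * Models "R7 = mem[R7]"; mem must hold all 65536 words. */
uint16_t load_word(const uint16_t *mem, uint16_t address)
{
    assert(mem != NULL);
    return mem[address];
}
```

The point of the split is that after step 1 the register holds an address, and only after step 2 does it hold data, which is the distinction the questions above ask you to notice.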
The next 11 instructions use R4 to fetch something into R3. The first 10 instructions load R3 with the value of R4 plus a constant. R3 then has an address in it. The way the constant gets added looks really stupid: it takes 10 instructions to get the constant (#150 or x96) added. "Step" through those 10 instructions.

Q. What address is now in R3? What value is in that location?

Q. R3 now contains an address. Look at that address in PennSim's memory and find the associated label. What label? Look in os.asm to find that label. Is this part of the GDT? How many words are in the GDT according to os.asm's content? Is the constant added into R3 to form an address bigger than the number of words in the GDT in os.asm? Hint: the strings roughly average 12 chars each. What is your estimate of the size of the GDT?

Q. The next 3 instructions perform an operation using R7 and R3. What is the operation? This is immediately followed by a BR instruction. What is being tested here? Hint: look at os.c, is there an "if" statement? According to the BR instruction, if we do not take the BR, what condition is true? Would that be the TRUE or FALSE (THEN or ELSE) part of the if() statement?

Note the 3 instructions immediately following the BR. Look in the GDT to see what address the RET (JMP R7) will jump to if the BR is not taken. Set a breakpoint on that instruction (click the box to the left of where the address for that instruction is displayed; it will turn red).

The 17 instructions below the RET prepare the stack for a function call. (Actually, below that are also several more function calls.)

Q. As you "Step" through these 17 instructions, notice what values are being pushed on the stack. Draw the stack's contents as it is after you have done that, showing where the SP is pointing.

The next 3 instructions prepare to jump to a subroutine: R0 is loaded from the GDT and then "JSRR R0" makes the jump. Step through the first two, but do not execute the "LDR R0, R0, #0" instruction.
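A note on the 10-instruction constant addition seen earlier in this step: the immediate operand of the LC3 ADD instruction is a 5-bit two's-complement field, so a single ADD can add at most 15. The sketch below shows why #150 would cost ten ADDs if the compiler simply repeats the largest immediate (10 x 15 = 150). That strategy is an assumption that happens to fit the count given above; check os.asm for what lcc actually emitted. The name add_constant_imm5() is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Range of the LC3 ADD immediate field (5-bit two's complement). */
enum { IMM5_MIN = -16, IMM5_MAX = 15 };

/* Add a non-negative constant to base using only steps that fit in imm5.
 * Stores the number of ADD instructions this would take in *add_count.
 * Assumed strategy: always add the largest step that still fits. */
uint16_t add_constant_imm5(uint16_t base, int constant, int *add_count)
{
    assert(constant >= 0);       /* this sketch handles non-negative constants only */
    assert(add_count != 0);

    uint16_t result = base;
    int remaining = constant;
    int count = 0;

    do {
        int step = (remaining > IMM5_MAX) ? IMM5_MAX : remaining;
        result = (uint16_t)(result + step);  /* one "ADD Rd, Rs, #step" */
        remaining -= step;
        count++;
    } while (remaining > 0);

    *add_count = count;
    return result;
}
```

For a constant of 150 this takes ten steps of 15 each, and the result is the base plus x96.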
Q. R0 gets an address of an entry in the GDT. What is the address of the entry? Look at the content of that entry in the GDT. What value is stored there? What label is associated with that entry? Look at the memory location addressed by that entry's value. What label do you find? Now "Step" the next instruction. It loads R0 with what content? Now "Step" the JSRR instruction. What function have we entered? In which source file is this function defined?

Note: we have just dereferenced a function pointer variable. As we noted in class, array names and function names are pointer variables in C.

We have jumped into a function. The function call in os.c shows two arguments. The first is a function pointer, the second is a memory address. Two arguments were pushed to the stack just before we jumped to this function. The first C argument was pushed last. What was pushed for that argument was the address of a function pointer variable. The next few instructions will save R0 and R1 on the stack, then get the arguments into R0 and R1. Watch R0 as it gets loaded.

Q. First R0 gets the pointer variable's address. Then what does it get? Look in memory at that location. What label is associated with the address in R0?

We are executing setVTentry. If you look at the comments in utils.asm for this function, you will see a C prototype declaration. The first argument is a pointer to a function whose type is "void f( void )". A function pointer in an argument list of a C routine would look like,

    int foo ( void (*f_ptr)( void ) ) {
        ... body of foo ...
    }

and from that we would assume the formal argument, f_ptr, is a pointer variable that when dereferenced provides the address of a function. We would invoke or jump to that function using this syntax in the body of foo:

    (*f_ptr)();

which says, find the memory location associated with the variable f_ptr, get its content, and use that as a jump address.

Stepping execution a bit more, the content of R0 is written to memory.

Q. What memory location is written into? What value is written to that location? What have we just accomplished?

Continue stepping until you hit a RET, and Step it. We are back in main(). There are 3 more setVTentry() function calls, but we will not step through them. Instead, "Continue" execution. This will run until it hits the breakpoint we set earlier. Now Step until we make a function call.

Q. What function have we jumped into? In which source file is it defined?

Keep Stepping until you reach the next JSRR. As you do this, watch what gets pushed to the stack. Notice that we have just entered a function, so the function's startup code executes, setting up its call frame. Some space is allocated on the stack for a local variable. The stack handling by lcc is a bit buggy, so don't worry if it doesn't look quite right.

It so happens that our stack is located in PennSim's memory-mapped video VRAM area. You can see that the pixels in PennSim's Devices area have changed color as non-zero values were pushed. This is a programming error, given PennSim's architecture model. However, the standard LC3 does not have VRAM.

We are about to jump to do_printc_trap(). This is a library routine that executes TRAP x21. The argument was pushed to the stack. Before that, the return address was pushed to the stack.

Q. Show the stack as it is now. How many call frames are on the stack? What is the current return address on the stack (i.e., the address that do_printc_trap() will use to exit and return)? The argument on the stack has what value? What does that value represent?

Step into do_printc_trap(). It saves a couple of registers, gets its argument off the stack, puts it into R0, then does TRAP x21. Step through that and make the TRAP call. Step through the TRAP routine, watching how it saves registers, gets its argument, and makes its return.

Q. Does TRAP x21 (aka printc_trap in utils.asm) use the stack at all? Did the LC3 display change (see PennSim's Devices)?
How does it make its return and where did it get its return address?

Continue Stepping until you exit do_printc_trap(), watching how it gets its return address just before exiting.

Q. Does do_printc_trap() get its return address the same way that printc_trap() did?

The lcc library includes printf(), which uses TRAP x21 to output characters. If you Step far enough, you will enter the PRINTF lcc library routine.

Q. Does PRINTF use do_printc_trap()? What are the arguments to the first call to PRINTF? How does PRINTF get its arguments? How many times will this first call to PRINTF call TRAP x21?

Running this code long enough will call SCANF to get keyboard input. If you click on the text output area of PennSim's Devices, you can then type and the keystrokes will be read by TRAP x20.

Q. Does SCANF ever jump into code in utils.asm?

Q. (Extra Credit) How does SCANF return its return value to the calling code?
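If you want to experiment with the function-pointer syntax from the setVTentry discussion outside of PennSim, the following standalone C sketch may help. It reuses the foo() prototype shape from the text; sample_handler() and handler_calls are hypothetical names added only for illustration, and nothing here comes from os.c or utils.asm.

```c
#include <assert.h>

/* Hypothetical handler: counts how many times it has been invoked. */
int handler_calls = 0;

void sample_handler(void)
{
    handler_calls++;
}

/* Same shape as the foo() prototype in the text:
 * f_ptr is a variable that holds the address of a void f(void) function. */
int foo(void (*f_ptr)(void))
{
    assert(f_ptr != 0);
    (*f_ptr)();            /* get f_ptr's content and use it as a jump address */
    return handler_calls;  /* report how many calls have happened so far */
}
```

Calling foo(sample_handler) passes the function's address as the argument; compiling a program like this with lcc and reading the resulting .asm is one way to compare against what you traced above.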