===========================================
Lec-1-HW-2-Linking.html
Load objects, executable objects, low-level tools
===========================================

This is about object files, C conventions, and linking. The first part
has you look at a link object "f.o" and a load object "f.out" produced
by gcc.

What to turn in:

1. A cover sheet w/ the usual info (name, course, date). Also include
   notes about what you worked on, and where the files are in your
   branch. Attached to the cover sheet or on it, include your answers
   to any questions.

1. Create this C program (save as "f.c"):

     int main( void ){
       return(0);
     }

2. Compile it with these commands:

     gcc -Wall -Wextra -c f.c
     gcc -o f.out f.o
     gcc -S f.c

3. Using "ls -l", find the sizes of the files f.c, f.o, f.out, and f.s.
   Q. What size is the link object (f.o)?
   Q. What size is the load object (executable, f.out)?
      (Hint: you may need to "man ls" or "info ls" to understand its
      output.)
   Q. Do "cat f.s". How many instructions did gcc produce from f.c?
   Q. Aside from instructions, what else is there in f.s?

4. Dump the bytes of f.c as ASCII, "od -t a f.c", and as hex bytes,
   "od -t x1 f.c".
   NB--In "-t a" mode, non-printable chars are shown by their ASCII
   name: e.g., the newline character (ASCII code x0a) is shown as "nl"
   and the control character "cancel" (ASCII code x18) is shown as
   "can". Other 8-bit values that do not represent any character are
   shown as 2-digit hex numbers or "?".
   NB--Lines that duplicate the previous line are not printed; a "*" is
   shown in place of each block of such lines. Use the "-v" flag to see
   all the lines, i.e., every byte.
   Q. What is the hex code for ASCII space, and at what byte does it
      first appear?
   NB--Use the "-A d" flag to see the byte offsets in decimal.

5. Dump f.o as hex-coded bytes and ASCII chars:

     od -t x1 f.o
     od -t a f.o

   Q. Find the first ASCII space. Is that in a string, or is it just
      coincidence that some byte in the file matches that code?
   Q. Do you see any interesting character strings? In which output?
      What do you think they refer to?
   Q. What do you think follows them?

6. Dump f.o as hex-coded 4B words: "od -t x4 f.o"
   Q. Do you see any interesting 4B hex numbers?

7. Dump f.out as chars: "od -t c f.out" and "od -t a f.out".
   Q. See anything interesting? Any guess what you've found?
   Q. How many sections do you guess the f.out format has?
   NB--For more on object files, see
   http://en.wikipedia.org/wiki/Object_file_format.

8. Dump the load object's text (executable code) segment:

     on Unix/Cygwin: objdump -j .text -d f.out
     on OS X:        otool -tV f.out

   Although the machine instructions in the text segment have been
   disassembled into Intel assembly language (on an Intel Mac or PC) or
   PowerPC assembly (on a non-Intel Mac), the language is not too
   difficult to read. Register references may look a little odd, but
   you can hunt for things like "sp", "bp", "add", "mov", "call", and so
   forth, and guess their meaning.

   Intel and AT&T assembly syntax differ in whether the destination is
   on the left (Intel) or the right (AT&T). You can tell which one you
   are looking at by finding instructions with immediate data: an
   immediate is never a destination. The machine code itself is the
   same whether it is displayed as Intel or as AT&T assembly language,
   provided it is compiled for the same ISA.

   Q. How many function calls can you see?
   Q. How many instructions in "main"?
   Q. Identify the instruction that jumps back to the OS, as best you
      can guess.
   Q. Labels containing "dyld" appear. What do you think these sections
      of code have to do with?

9. Dump the load object's data and relocation sections:

     otool -dv f.out    or    objdump -s -j .data f.out
     otool -rv f.out    or    objdump -r f.out

   Q. Per our discussion of how memory is laid out at runtime, what do
      you think should be in the data section for this program?
   Q. Are there any relocatable items? If so, what are they? If not,
      why aren't there any?
      "Relocatable" means "needs its address fixed at load time".
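
Before reading the od dumps in steps 4 through 7, it may help to try the same flags on input whose bytes you already know. The sketch below is not part of the assignment: the sample strings are made up for illustration, and it assumes a POSIX shell with the standard "printf" and "od" utilities. It shows the "-t a" character names, the matching "-t x1" hex bytes, and how "-v" turns off the "*" line folding.

```shell
# Hypothetical sample input (not from the assignment): 'A', tab, 'B', newline.
sample='A\tB\n'

# "-t a" prints named characters: the tab appears as "ht", the newline as "nl".
named=$(printf "$sample" | od -A d -t a)

# "-t x1" prints the same four bytes as hex, one byte per column.
hexed=$(printf "$sample" | od -A d -t x1)

# 32 identical bytes (the digit '0' repeated) fill two identical 16-byte
# lines. Without -v, od replaces the repeated line with "*".
folded=$(printf '%032d' 0 | od -A d -t x1)

# With -v, every line is printed, so every byte is visible.
full=$(printf '%032d' 0 | od -A d -t x1 -v)

printf '%s\n\n%s\n\n%s\n\n%s\n' "$named" "$hexed" "$folded" "$full"
```

The leftmost column of each dump is the decimal byte offset requested by "-A d", and the last line of each dump is the total byte count. Column spacing differs between GNU and BSD versions of od, so compare the values rather than the exact layout.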