by _dosearf arf arf
/* first.c */
#include <stdio.h>
#include <unistd.h>     /* sleep() */

int function2();        /* defined below, called from function1() */

int function1() {
    printf("\tIn function 1.\n");
    sleep(2);
    printf("\tCalling function 2.\n");
    function2();
    printf("\tReturned from funtion 2.\n");
    sleep(2);
    printf("\tReturning from function 1.\n");
    return(0);
}

int function2() {
    printf("\t\tIn function 2.\n");
    sleep(2);
    printf("\t\tReturning from function 2.\n");
    sleep(2);
    return(0);
}

int main() {
    printf("In main()..\n");
    printf("Calling function 1.\n");
    function1();
    printf("Returned from function 1.\n");
    sleep(2);
    return(0);
}
Why this silly and pointless program? Well, for illustration. Compile it
the usual way and dasm it. We'll also start using gdb, and usually you'll
want to compile your code with '-g' so the compiler adds debugging
symbols to your object. However, we like to make life difficult for
ourselves, so skip that.
Possible reference to string:
"In main().."
080484d5 <main+0x9> push $0x80485ff
Reference to function :
printf@@GLIBC_2.0
080484da <main+0xe> call 08048340 <_init+0x84>
080484df <main+0x13> add $0x10,%esp
Now obviously, the number being push'd at main+0x9 is not the string
"In main().." itself. It's the address where the string is stored. And the
call at main+0xe hooks to the printf function in libc. Look at the dynamic
symbol table in the dasm file...
DYNAMIC SYMBOL TABLE:
08048300 w DF *UND* 0000007d GLIBC_2.0 __register_frame_info
08048310 w DF *UND* 000000a9 GLIBC_2.0 __deregister_frame_info
08048320 DF *UND* 0000016e GLIBC_2.0 sleep
08048330 DF *UND* 00000118 GLIBC_2.0 __libc_start_main
08048340 DF *UND* 0000002f GLIBC_2.0 printf
0804856c g DO .rodata 00000004 Base _IO_stdin_used
00000000 w D *UND* 00000000 __gmon_start__
This shows us the 'off-shore' routines called from within our program. If
we compiled the program as static, the code for the sleep() and printf()
routines would be linked into the program (and they would no longer appear
in the dynamic symbol table).
$ gdb ./first
(no debugging symbols found)...
(gdb) br *0x080484d5
Breakpoint 1 at 0x80484d5
(gdb) run
Starting program: /home/dose/work/start/first
(no debugging symbols found)...
(gdb) disassemble 0x80484d5 0x80484df
Dump of assembler code from 0x80484d5 to 0x80484df:
0x80484d5 <main+9>: pushl $0x80485ff
0x80484da <main+14>: call 0x8048340 <printf>
End of assembler dump.
(gdb) x/s 0x80485ff
0x80485ff <_IO_stdin_used+147>: "In main()..\n"
Here we've set a breakpoint at address 0x80484d5. When run, the program
breaks when the instruction flow reaches this address and we get a gdb
prompt again. Disassembling the next two instructions shows us the same as
in the dasm dump. The command x/s address literally means, 'examine the
string at address'. We can use a repeat count to show other strings.
(gdb) x/5s 0x80485ff
0x80485ff <_IO_stdin_used+147>: "In main()..\n"
0x804860c <_IO_stdin_used+160>: "Calling function 1.\n"
0x8048621 <_IO_stdin_used+181>: "Returned from function 1.\n"
0x804863c: ""
0x804863d: ""
(gdb)
The strings are all NUL-terminated, which we can see like this..
(gdb) x/14xb 0x80485ff
0x80485ff <_IO_stdin_used+147>: 0x49 0x6e 0x20 0x6d 0x61 0x69 0x6e 0x28
0x8048607 <_IO_stdin_used+155>: 0x29 0x2e 0x2e 0x0a 0x00 0x43
(gdb)
x/14xb 0x80485ff means 'examine 14 bytes in hex from address 0x80485ff'.
The string "In main()..\n" is 12 characters long ('\n' is one character),
so at position 13 we expect a 0x00 (NUL) and there it is. At position 14
we see the first character of the second string, whose address is
0x804860c. (Yes, 0x43 is ASCII for 'C'.) Try 'help x' to see what else the
command can do.
Next, we'll have a look at the stack by setting a few more breakpoints.
We can also set breakpoints by simply using the function name instead of
using the address. If the program was compiled with debugging symbols, you
can also break on source line number. We're still in the same gdb session,
by the way..
(gdb) br function1
Breakpoint 2 at 0x8048416
(gdb) br function2
Breakpoint 3 at 0x804848a
(gdb) stop
(gdb) info breakpoints
Num Type Disp Enb Address What
1 breakpoint keep y 0x080484d5 <main+9>
breakpoint already hit 1 time
2 breakpoint keep y 0x08048416 <function1+6>
3 breakpoint keep y 0x0804848a <function2+6>
(gdb) run
Starting program: /home/dose/work/start/first
Breakpoint 1, 0x80484d5 in main ()
(gdb) bt
#0 0x80484d5 in main ()
#1 0x40031a12 in ()
'br' is short for 'break'. As you can see, the breakpoint on a function is
set straight after the function prolog, which is 3 instructions or 6 bytes
long. (See 'function+6' in the breakpoint list).
'bt' is short for 'backtrace'. It shows the current stack frame as '0', the
previous stack frame as '1', etc. The function name, where known, is also
shown, and the address shown is the Instruction Pointer of that stack frame
(i.e. the instruction right after the 'call' for most previous stack
frames). In this case it's the address of the breakpoint we've set. We can
verify this by x'ing the address displayed, or by x'ing $eip.
(gdb) x/2i 0x80484d5
0x80484d5 <main+9>: pushl $0x80485ff
0x80484da <main+14>: call 0x8048340 <printf>
(gdb) c
Continuing.
In main()..
Calling function 1.
Breakpoint 2, 0x8048416 in function1 ()
(gdb) bt
#0 0x8048416 in function1 ()
#1 0x80484f7 in main ()
#2 0x40031a12 in ()
(gdb) x/2i 0x80484f7
0x80484f7 <main+43>: addl $0xfffffff4,%esp
0x80484fa <main+46>: pushl $0x8048621
(gdb) x $eip
0x8048416 <function1+6>: addl $0xfffffff4,%esp
If you remember some of the stack building stuff from the previous
document, you'll know that the %esp register holds the Stack Pointer, which
points to the current top of the stack. The %ebp register holds the Base
Pointer or Frame Pointer. It's also been called the Frame Base Pointer. I
use all of these terms. The frame pointer holds the address of the
beginning of the current stack frame. It is primarily used to reference
local variables (for which room is made on the stack first) and arguments
to the function (which are push'd onto the stack before the call). The
compiler could of course reference each of these relative to the current
stack pointer, but that would involve a lot of bookkeeping, as the stack
pointer keeps changing. Is this for real, you may ask. Well, see for
yourself..
(gdb) bt
#0 0x8048416 in function1 ()
#1 0x80484f7 in main ()
#2 0x40031a12 in ()
(gdb) x/a $ebp+4
0xbffffd00: 0x80484f7 <main+43>
As you can see, we're in function1(), and the word 4 bytes above the
address held in our Frame Base Pointer (that is, at %ebp+4) contains our
return address, where execution will continue once we return from
function1() back into main(). Of course, we're not that far yet. First
function1() is going to call function2(), which will break after the
function prolog.
(gdb) c
Continuing.
In function 1.
Calling function 2.
Breakpoint 3, 0x804848a in function2 ()
(gdb) bt
#0 0x804848a in function2 ()
#1 0x8048448 in function1 ()
#2 0x80484f7 in main ()
#3 0x40031a12 in ()
(gdb) x/a $ebp+4
0xbffffcf0: 0x8048448 <function1+56>
This shouldn't surprise you. The same result as at breakpoint 2, only now
we're in yet another stack frame - the one belonging to function2(). If a
function is recursive, a separate stack frame is created for each instance
of that function, by the way. After this, the program will exit without
breaking any more. Even though we set 2 breakpoints on functions, these
breakpoints are set by gdb on exact memory addresses (right after the
prolog...) and these addresses won't be reached again.
(gdb) c
Continuing.
In function 2.
Returning from function 2.
Returned from funtion 2.
Returning from function 1.
Returned from function 1.
Program exited normally.
(gdb)
finale
Well, there you have it. You did need gdb for this one. Nothing very
special was discussed - but I warned you about that in the first paragraph.
Hope you enjoyed it and a big 'Hi there!' to the same people as in part I.
_dose
02/2000