by _dose
now you see it
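/* anti1.c */
int main()
{
    /* One string literal per line, each ending in \n: newer versions of
       gcc reject a string literal that spans several source lines. */
    __asm__(
        "jmp antidebug1 + 2\n"      /* skip the two junk bytes below */
        "antidebug1:\n"
        ".short 0xc606\n"           /* emitted as the bytes 0x06 0xc6 */
        "movl $0x0, %eax\n"
        "xorl %eax, %eax\n"
    );
    return(0);
}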
As you can see, we jump straight to the third byte of antidebug1
(antidebug1 + 2), skipping the two bytes emitted by '.short 0xc606'. A
disassembler, however, starts decoding at the first byte (0x06), so its
output is no longer aligned with what is actually executed. The two
instructions that set eax to zero are there only as a reference point.
$ gcc -o anti1 anti1.c
$ ./anti1;echo $?
0
$ objdump -d --show-raw-insn ./anti1 > ./anti1.dump
In the dump file we find the following.
080483b0 <main>:
80483b0: 55 push %ebp
80483b1: 89 e5 mov %esp,%ebp
80483b3: eb 02 jmp 80483b7 <antidebug1+0x2>
080483b5 <antidebug1>:
80483b5: 06 push %es
80483b6: c6 b8 00 00 00 00 31 movb $0x31,0x0(%eax)
80483bd: c0 (bad)
80483be: 31 c0 xor %eax,%eax
80483c0: eb 00 jmp 80483c2 <antidebug1+0xd>
80483c2: c9 leave
80483c3: c3 ret
80483c4: 90 nop
The listing shows the flow jumping from 0x80483b3 to 0x80483b7, yet
0x80483b7 is not the start of any instruction in the disassembly dump.
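To make the misalignment concrete, here is a minimal sketch of a linear sweep over the bytes of antidebug1 as they appear in the dump. It is a toy, not a real disassembler: the length table only covers the opcodes seen in this listing, SIB handling is reduced to one extra byte, and any unknown byte is counted as a one-byte instruction, mirroring what objdump printed as (bad). The names toy_insn_len and linear_sweep are my own.

```c
#include <stddef.h>

/* Bytes from 0x80483b5 (antidebug1) up to the ret, copied from the dump. */
const unsigned char anti1_code[] = {
    0x06, 0xc6,                    /* .short 0xc606, little-endian      */
    0xb8, 0x00, 0x00, 0x00, 0x00,  /* movl $0x0, %eax                   */
    0x31, 0xc0,                    /* xorl %eax, %eax                   */
    0x31, 0xc0,                    /* xorl %eax, %eax (from return(0))  */
    0xeb, 0x00, 0xc9, 0xc3         /* jmp; leave; ret                   */
};

/* Size of a ModRM byte plus its displacement, 32-bit addressing.
   Simplification: a SIB byte is counted as exactly one extra byte. */
size_t modrm_len(unsigned char modrm)
{
    unsigned mod = modrm >> 6, rm = modrm & 7;
    size_t len = 1;
    if (mod == 3) return len;            /* register operand           */
    if (rm == 4) len += 1;               /* SIB byte follows           */
    if (mod == 0 && rm == 5) len += 4;   /* bare disp32                */
    else if (mod == 1) len += 1;         /* disp8                      */
    else if (mod == 2) len += 4;         /* disp32                     */
    return len;
}

/* Length of the instruction at p; avail is the number of bytes left. */
size_t toy_insn_len(const unsigned char *p, size_t avail)
{
    switch (p[0]) {
    case 0xb8: return 5;                 /* mov $imm32, %eax           */
    case 0xeb: return 2;                 /* jmp rel8                   */
    case 0x31:                           /* xor with a ModRM operand   */
        return avail < 2 ? 1 : 1 + modrm_len(p[1]);
    case 0xc6:                           /* movb $imm8, r/m8           */
        return avail < 2 ? 1 : 1 + modrm_len(p[1]) + 1;
    default:   return 1;                 /* one-byte opcode or "(bad)" */
    }
}

/* Record instruction start offsets found by decoding linearly from start.
   Returns how many offsets were stored in starts[]. */
size_t linear_sweep(const unsigned char *code, size_t n, size_t start,
                    size_t *starts, size_t max_starts)
{
    size_t count = 0, off = start;
    while (off < n && count < max_starts) {
        starts[count++] = off;
        off += toy_insn_len(code + off, n - off);
    }
    return count;
}
```

Sweeping from offset 0 (what objdump does) yields the starts 0, 1, 8, 9, 11, 13, 14, while sweeping from offset 2 (what the CPU does) yields 2, 7, 9, 11, 13, 14. The two only fall back in step at offset 9, which is 0x80483be in the dump.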
Next we verify the code that starts at 0x80483b7. There are several ways to
do that. One of them is to dump the bytes starting at 0x80483b7 to a file
and then do a raw disassembly of that file. I'm using ndisasm (part of
nasm) here, so watch out: nasm and ndisasm use Intel syntax.
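Dumping the bytes means turning the virtual address into an offset inside the file first. A minimal sketch of that arithmetic follows. The struct is my own and simply mirrors the p_vaddr, p_offset and p_filesz fields of an ELF program header; the concrete values used below (a segment loaded at 0x08048000 from file offset 0, 0x1000 bytes long) are assumptions, so check the program headers of your own binary with 'objdump -p'.

```c
#include <stdint.h>

/* One loadable segment: where it lives in memory and in the file. */
struct segment {
    uint32_t vaddr;   /* virtual address of the first byte */
    uint32_t offset;  /* file offset of the first byte     */
    uint32_t filesz;  /* number of bytes held in the file  */
};

/* File offset that backs vaddr, or -1 if the segment does not cover it. */
long vaddr_to_offset(const struct segment *seg, uint32_t vaddr)
{
    if (vaddr < seg->vaddr || vaddr - seg->vaddr >= seg->filesz)
        return -1;
    return (long)(seg->offset + (vaddr - seg->vaddr));
}
```

With the assumed segment, 0x80483b7 maps to file offset 0x3b7, which is where the bytes for the raw file would be read from.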
$ hexdump raw.hex
0000000 00b8 0000 3100 31c0 ebc0 c900 90c3 9090
0000010 9090 9090 0a90
$ ndisasm -u raw.hex
00000000 B800000000 mov eax,0x0
00000005 31C0 xor eax,eax
00000007 31C0 xor eax,eax
00000009 EB00 jmp short 0xb
0000000B C9 leave
0000000C C3 ret
0000000D 90 nop
0000000E 90 nop
Now this looks more familiar. We can recognize our two original
instructions, followed by the compiler's own code for return(0): eax is set
to 0 once more (this code clearly isn't optimized..) and the function
exits.
080483b5 <antidebug1>:
80483b5: 06 push %es
80483b6: c6 b8 00 00 00 00 31 movb $0x31,0x0(%eax)
80483bd: c0 (bad)
Now the instruction at 0x80483b5 (0x06, push %es) is a one-byte opcode and
is never executed. The byte at 0x80483b6 (0xc6) is the beginning of an
instruction that is longer than one byte; it swallows the first bytes of
our real code and therefore breaks the disassembly. If we replace these two
bytes with the 'nop' (No OPeration) instruction (0x90), which is one byte
long, we should get a correct disassembly. So, fire up the hex editor and
change the two bytes to 0x90. Disassemble the result with objdump, same as
before, and look at position 0x80483b7.
080483b5 <antidebug1>:
80483b5: 90 nop
80483b6: 90 nop
80483b7: b8 00 00 00 00 mov $0x0,%eax
80483bc: 31 c0 xor %eax,%eax
Et voila, a correctly aligned disassembly.
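If you would rather script the patch than use a hex editor, the same change can be sketched in a few lines of C operating on a buffer that holds the file contents. This is only an illustration; patch_nops is a hypothetical helper, and reading the file into the buffer and writing it back are left out.

```c
#include <stddef.h>
#include <string.h>

#define NOP 0x90  /* one-byte x86 'nop' */

/* Overwrite count bytes at offset off with nop.
   Returns 0 on success, -1 if the range does not fit in the buffer. */
int patch_nops(unsigned char *buf, size_t len, size_t off, size_t count)
{
    if (off > len || count > len - off)
        return -1;
    memset(buf + off, NOP, count);
    return 0;
}
```

For anti1 the call would patch two bytes at the file offset of antidebug1, leaving the 'mov $0x0,%eax' that follows untouched.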
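/* anti2.c */
int main()
{
    /* One string literal per line, each ending in \n: newer versions of
       gcc reject a string literal that spans several source lines. */
    __asm__(
        "movl $(antidebug2+2), %eax\n"  /* target address in a register */
        "jmp *%eax\n"                   /* indirect jump through %eax   */
        "antidebug2:\n"
        ".short 0xc606\n"               /* emitted as bytes 0x06 0xc6   */
        "movl $0x0, %eax\n"
        "xorl %eax, %eax\n"
    );
    return(0);
}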
This one is almost the same as anti1.c. However, it uses a nasty trick
called 'indirect jumping' (the 'jmp *%eax' instruction) and the disassembly
is broken in a similar manner. Here's a partial dump.
80483b3: b8 bc 83 04 08 mov $0x80483bc,%eax
80483b8: ff e0 jmp *%eax
080483ba <antidebug2>:
80483ba: 06 push %es
80483bb: c6 b8 00 00 00 00 31 movb $0x31,0x0(%eax)
In this example it is of course easy to see what %eax contains at
0x80483b8. In many cases the address of the jump target will be built up
at run time, over the course of a procedure, making it a lot harder to
find. So we use GDB to find out what's in the register.
Notice the difference between the disassembly created by the 'disas'
command (which even if called with 'disas 0x80483bc' will disassemble the
antidebug2 routine starting at 0x80483ba...) and the examination of data at
location 0x80483bc as decoded instructions.
(gdb) disas main
Dump of assembler code for function main:
0x80483b0 <main>: pushl %ebp
0x80483b1 <main+1>: movl %esp,%ebp
0x80483b3 <main+3>: movl $0x80483bc,%eax
0x80483b8 <main+8>: jmp *%eax
End of assembler dump.
(gdb) br *0x80483b8
Breakpoint 1 at 0x80483b8
(gdb) r
Starting program: /home/dose/work/anti/a2/anti2
(no debugging symbols found)...
Breakpoint 1, 0x80483b8 in main ()
(gdb) x/i $eax
0x80483bc <antidebug2+2>: movl $0x0,%eax
(gdb) disas antidebug2
Dump of assembler code for function antidebug2:
0x80483ba <antidebug2>: pushl %es
0x80483bb <antidebug2+1>: movb $0x31,0x0(%eax)
0x80483c2 <antidebug2+8>: (bad)
[stuff deleted]
(gdb) x/6i $eax
0x80483bc <antidebug2+2>: movl $0x0,%eax
0x80483c1 <antidebug2+7>: xorl %eax,%eax
0x80483c3 <antidebug2+9>: xorl %eax,%eax
0x80483c5 <antidebug2+11>: jmp 0x80483c7 <antidebug2+13>
0x80483c7 <antidebug2+13>: leave
0x80483c8 <antidebug2+14>: ret
Update: mammon_ pointed out to me that the indirect jumping is actually a
pretty common construction in object oriented code. Being lazy, I'll just
quote him.

"The register trick brings up an interesting point: OOP code. Many oop
handlers use 'call [eax]' to call object functions, often with the 'this'
pointer passed in EDI [ESI?] or on the stack. In relation to your essay,
this should not be confused with an anti-debugging trick; the value in the
register will be different with every call."

So that's work for later. Thanks again, _m.
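To see why such calls are ordinary, here is a small sketch of object-style dispatch in plain C. The shape types are invented for the illustration. A call through a function pointer held in a table, as in shape_area below, is the kind of source that a compiler will typically turn into an indirect call through a register or memory operand, with the target depending on which object is passed in.

```c
struct shape;

/* A one-entry "method table", in the spirit of a C++ vtable. */
struct shape_ops {
    int (*area)(const struct shape *self);
};

struct shape {
    const struct shape_ops *ops;  /* which implementation to use */
    int w, h;
};

int rect_area(const struct shape *self) { return self->w * self->h; }
int tri_area(const struct shape *self)  { return self->w * self->h / 2; }

const struct shape_ops rect_ops = { rect_area };
const struct shape_ops tri_ops  = { tri_area };

/* The call target is only known at run time: it is read from the object. */
int shape_area(const struct shape *s)
{
    return s->ops->area(s);
}
```

Calling shape_area on a 3 by 4 rectangle and on a 3 by 4 triangle reaches two different functions through the same call site, which is exactly the "different with every call" behaviour mentioned in the quote.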
How do you find such a jump in code you haven't written yourself? Well,
there are two ways, as far as I can see. One is to look for jmp and call
instructions that act on addresses contained in a register. Use grep to
locate them in the dead listing (lines that match jmp or call and also
contain \*\%), set breakpoints on these locations, then examine the
registers each time a breakpoint is hit and compare the address in there
to the instruction offsets in the dead listing.
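The grep filter can also be written as a small C predicate, shown here as a sketch. It assumes AT&T-syntax lines as produced by objdump and, like the pattern above, only looks for a jmp or call mnemonic followed by "*%"; is_indirect_branch and has_indirect are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Nonzero if mnemonic occurs in line and is followed by "*%". */
int has_indirect(const char *line, const char *mnemonic)
{
    const char *m = strstr(line, mnemonic);
    return m != NULL && strstr(m, "*%") != NULL;
}

/* Nonzero for lines such as "80483b8: ff e0 jmp *%eax". */
int is_indirect_branch(const char *line)
{
    return has_indirect(line, "jmp") || has_indirect(line, "call");
}
```

Feeding it the anti2 listing line by line would flag the line at 0x80483b8 as a place to set a breakpoint, while the direct 'jmp 80483b7' of anti1 is ignored.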
If there are too many of these 'call/jmp *%register' constructions, the
alternative is to interrupt the program at various times (when control is
not inside a library function), then do a backtrace and compare $eip with
the instruction offsets in the dead listing. Once they are out of sync, it
becomes a process of narrowing down the possible code, maybe combined with
the grep listing of the first approach.
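The comparison of $eip with the dead listing boils down to a membership test, sketched below. The table holds the instruction start addresses that objdump printed for anti1 and has to be sorted in ascending order; eip_in_listing and cmp_addr are my own names.

```c
#include <stdint.h>
#include <stdlib.h>

/* Instruction start addresses from the objdump listing of anti1. */
const uint32_t anti1_starts[] = {
    0x80483b0, 0x80483b1, 0x80483b3, 0x80483b5, 0x80483b6, 0x80483bd,
    0x80483be, 0x80483c0, 0x80483c2, 0x80483c3, 0x80483c4
};

/* Three-way comparison of two addresses, for bsearch. */
int cmp_addr(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;
    return (x > y) - (x < y);
}

/* Nonzero if eip is the start of an instruction in the sorted listing. */
int eip_in_listing(const uint32_t *starts, size_t n, uint32_t eip)
{
    return bsearch(&eip, starts, n, sizeof *starts, cmp_addr) != NULL;
}
```

An $eip of 0x80483b3 is found in the table, whereas 0x80483b7 is not, which is the signal that execution and listing have drifted apart.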
_dose 03/2000