MOV Before You JMP

by Vuk Ivanovic (vuk.ivanovic9000@gmail.com)

Get it?  Never mind.  A long time ago, when I started my journey into hacking, one of the best pointers I read on one of the many hacking-related websites was about the importance of learning to code in order to understand how programs work and how they can be made to do things they weren't initially meant to do.

I followed the advice, with some missteps: I started with QBasic and Visual Basic.  And, as some may know, QBasic/Visual Basic and hacking have nothing in common (mostly).  If one wants to learn about discovering vulnerabilities and developing exploits, those two programming languages are not the way to go - at all.  On the other hand, coding in those (especially Visual Basic) got my creative juices flowing, until I realized that every program I wanted to make already existed.  And then I remembered why I started coding in the first place.  It was all about understanding what makes various things tick: clocks, computers, video games, TV sets, humans, and so on.  In order to better understand that aspect, I decided to go back to the well and dig deeper.  What I learned was that I had started with the wrong programming languages.  The right path was, and still is, C and assembly, especially assembly (for exploits, shellcode, and pretty much everything else).

This isn't a crash course in any programming language; this is about the importance of assembly and of knowing the building blocks of whatever big picture you happen to be interested in.  To better demonstrate what I mean by the title, here are some really rough and really basic examples of code that compares two numbers:

In Assembly (32-bit, Linux):     In C:                In PHP:

mov eax, 1                       int x = 1;           $x = 1;
mov ebx, 2                       int y = 2;           $y = 2;
cmp eax, ebx                     if (x != y) {        if ($x<>$y) {
jne not_equal_function             goto not_equal;      not_equal_func();
                                 }                    }
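
For anyone who wants to compile the C column, here is a minimal sketch of it filled out into a complete function.  The name compare_numbers and its return values are placeholders made up for illustration, C's not-equal operator is written as !=, and the goto is kept only because it mirrors the jne.

```c
#include <stdio.h>

/* Returns 0 when the numbers match and 1 when they differ.
   The name and the return convention are illustrative assumptions. */
int compare_numbers(int x, int y)
{
    if (x != y) {          /* cmp eax, ebx     */
        goto not_equal;    /* jne not_equal    */
    }
    puts("equal");
    return 0;

not_equal:
    puts("not equal");
    return 1;
}
```

Calling compare_numbers(1, 2) takes the jump, just as the assembly version would with 1 in eax and 2 in ebx.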

For some, perhaps many, who are into coding, all of the examples above make perfect sense (except for using goto, but it's just an example).  For others who haven't dealt with assembly before, the assembly example may be confusing.  Even if it's not, take this into consideration: assembly coding is different on Windows, and note the 32-bit part, because 64-bit code differs from 32-bit.  Furthermore, unlike in PHP, even a simple output of a string requires, on top of loading the string's address into ecx and its length into edx, the following three lines of assembly:

mov ebx, 1   ; file descriptor 1 is STDOUT
mov eax, 4   ; system call number 4 is sys_write
int 80h      ; hand control to the kernel
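
For comparison, the same system call is reachable from C through the POSIX write() wrapper.  The sketch below assumes Linux or another POSIX system; say_hello and the message text are made-up names for illustration.  Each argument corresponds to a register in the 32-bit int 80h convention: the file descriptor goes in ebx, the buffer address in ecx, and the byte count in edx.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes a fixed message to STDOUT and returns
   the number of bytes written, or -1 on error. */
ssize_t say_hello(void)
{
    const char msg[] = "Hello, 2600!\n";

    /* write(fd, buf, count): fd 1 is STDOUT (ebx), msg is the buffer
       address (ecx), and strlen(msg) is the length in bytes (edx). */
    return write(STDOUT_FILENO, msg, strlen(msg));
}
```

One line of C stands in for the whole register setup, which is exactly why it helps to have seen the assembly first.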

There's also the thing about how words, sentences, and numbers are defined in assembly.  It's somewhat easier in C and pretty much a joke in PHP.  And, to be honest, after understanding the logic of assembly and the somewhat similar approach in C, every other high-level programming language is easier to understand by just looking at the code and what it does when compiled/executed.
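
To make the C side of that concrete: in assembly you typically lay out the bytes of a string yourself and work out its length, while in C the compiler does both and quietly appends a terminating zero byte.  A small sketch, with placeholder names of my own choosing:

```c
#include <stddef.h>
#include <string.h>

/* A "sentence": seven characters plus the hidden '\0' the compiler adds. */
static const char sentence[] = "Hack on";

/* A number: the compiler picks the storage size from the type. */
static const int answer = 42;

/* Bytes the string occupies in memory, terminator included. */
size_t sentence_storage(void) { return sizeof sentence; }

/* Characters before the terminator, which is what strlen() counts. */
size_t sentence_length(void) { return strlen(sentence); }

/* Bytes the number occupies; this depends on the platform's int. */
size_t answer_storage(void) { return sizeof answer; }
```

The one-byte difference between sentence_storage() and sentence_length() is the terminator, something an assembly programmer has to place by hand.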

When I started getting deeper into vulnerability research and exploit development, I had to learn about fuzzers.  Many of the most popular, and yet easy-to-use, fuzzers are written in Python.  Here's the catch: I hadn't read a single "hello world" example in Python, let alone messed around with sockets and networking, and yet just by reading the Python code it all made perfect sense.

In truth, I did have to look up how to specify different types of network sockets (UDP instead of TCP), but that was it.  And, yes, I did get confused a couple of times when I got errors regarding indentation - that's how little I was aware of anything Python (other than Monty Python, Ni!).  Since then, I have managed to go through PHP, JavaScript, Ruby, the MEAN stack, and probably whatever comes up next.  Granted, the MEAN stack and any framework-based coding do require looking into tutorials because of how the various files/modules/views/whatever are organized, but the coding part is pretty much as logical as it has always been.
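
The TCP versus UDP choice comes down to a single argument, and it is the same argument in Python's socket module (SOCK_STREAM or SOCK_DGRAM) as in the C sockets API underneath it.  A sketch in C, assuming a POSIX system with IPv4 available; open_socket and socket_type are hypothetical helper names:

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Opens an IPv4 socket: UDP when use_udp is non-zero, TCP otherwise.
   Returns the file descriptor, or -1 on failure. */
int open_socket(int use_udp)
{
    return socket(AF_INET, use_udp ? SOCK_DGRAM : SOCK_STREAM, 0);
}

/* Asks the kernel which type a socket is (SOCK_DGRAM, SOCK_STREAM, ...).
   Returns -1 if the query fails. */
int socket_type(int fd)
{
    int type = -1;
    socklen_t len = sizeof type;

    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) != 0)
        return -1;
    return type;
}
```

Nothing is sent here; the point is only that switching protocols means changing one constant.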

Now, to turn it all toward hacking: while it's true that programming syntax is constantly evolving, with the goal of making coding easy for anyone, the most important and most fun programs/services to exploit are still coded in C (most recent: OpenSSL = Heartbleed and Bash = Shellshock, and whatever comes up by the time this issue gets out).
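
Heartbleed shows why C matters here: at its core, the bug was a length field supplied by the other side that was trusted without being checked against the amount of data actually received.  The sketch below is not OpenSSL's code; it is a deliberately simplified, hypothetical echo routine showing the kind of check whose absence makes this class of bug possible.

```c
#include <stddef.h>
#include <string.h>

/* Copies claimed_len bytes of payload into reply.
   Returns the number of bytes copied, or -1 if the request is rejected.
   All names here are illustrative assumptions. */
long echo_payload(const unsigned char *payload, size_t actual_len,
                  size_t claimed_len,
                  unsigned char *reply, size_t reply_cap)
{
    /* The bounds check: without it, memcpy would read past the end of
       payload and hand back whatever happens to sit next to it. */
    if (claimed_len > actual_len || claimed_len > reply_cap)
        return -1;

    memcpy(reply, payload, claimed_len);
    return (long)claimed_len;
}
```

A request that sends four bytes but claims 500 is refused instead of being answered with 496 bytes of someone else's memory.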

In order to find a vulnerability and write an exploit, one needs to know assembly (at least the basics of it), and then there are times when one needs to know more about it (when it comes to shellcode size, because size matters a lot in exploit development).  While it's true that there are ways to get around assembly, in the long run it's invaluable to know at least some of it.
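
Size is a good example of where the basics pay off.  On 32-bit x86, "mov eax, 0" assembles to five bytes (B8 00 00 00 00), four of them zero, while "xor eax, eax" does the same job in two (31 C0) with no zero bytes at all, which matters when the payload has to survive a string copy that stops at the first NUL.  The helper below, whose name and interface are invented for illustration, checks a candidate payload against a size budget and scans it for NUL bytes.

```c
#include <stddef.h>

/* Returns 1 if the payload fits in max_len bytes and contains no NUL
   byte, otherwise 0.  Name and interface are illustrative assumptions. */
int payload_is_usable(const unsigned char *payload, size_t len, size_t max_len)
{
    if (len > max_len)
        return 0;

    for (size_t i = 0; i < len; i++) {
        if (payload[i] == 0x00)
            return 0;
    }
    return 1;
}

/* Two encodings of "set eax to zero" on 32-bit x86. */
static const unsigned char mov_eax_0[]   = { 0xB8, 0x00, 0x00, 0x00, 0x00 };
static const unsigned char xor_eax_eax[] = { 0x31, 0xC0 };
```

With a four-byte budget, xor_eax_eax passes and mov_eax_0 fails on both counts: it is too long and it is full of zeros.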
