Tuesday, October 24, 2006

Defeating Dean Edwards' Javascript Packer

Today a friend passed me some obfuscated javascript and asked if I would help him decode it. I had a quick look at it and saw the following code fragment:

eval(function(p,a,c,k,e,d){ ...

This made me smile. p,a,c,k,e,d... Cute. After a little research, I found this to be the signature function of Dean Edwards' JavaScript packer. I found 2 very quick ways to deobfuscate code packed with this packer, and I have decided to share them with you. My intention in presenting this is NOT to insult Dean's work in any way; when it comes to JavaScript, and a few other topics, Dean is lightyears ahead of me. I present this to provide the community with a few methodologies they might use to approach these scenarios in the future, should it be necessary.

Here's a link to the packer:


The first method does not attack the obfuscation technique, but a feature of the packer interface. Built RIGHT IN to the interface is a decode button. This has been designed to only allow decoding of your own code. If you look at the interface, you can't paste into the top textarea (because of the readonly attribute) and the decode button is disabled until you've actually packed something. This design is inherently flawed, as it depends on the browser to enforce the restriction.

I wrote a bookmarklet called reEnable that will remove the readonly attribute from textfields and the disabled attribute from other html elements.

reEnable: Bookmark this :)

Using reEnable you can paste obfuscated code into the interface and decode it with a click.

The second method attacks a weakness in the obfuscation technique itself. Here is some code:


document.write('Hello World!');

and packed:

return c};if(!''.replace(/^/,String)){while
p}('3.0(\'1 2!\');',4,4,'write|Hello|World

(I had to force multiple lines to avoid it breaking my layout lol)

It looks like a lot of randomness, but the weakness is pretty easy to spot if you know what you're looking for.

Notice near the bottom all the code is there, just rearranged with the syntax punctuation removed? If you look even closer, the syntax punctuation is there as well, only separate from the strings. If we looked at this long enough, we might be able to write a utility to unpack it... but there's an even easier way. Notice how everything is wrapped in an eval()? Well, we don't want the code to be evaluated, we want to SEE it... If you replace the opening eval( with document.write( ... all the code is dumped to the screen. If some of it is being interpreted as HTML rather than being displayed, consider forcing your document.write to write between <textarea></textarea> tags.
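For the curious, the "write a utility to unpack it" idea can be sketched in a few lines of Python. This is only a minimal sketch and not Dean's decoder: it assumes you have already copied the payload string, the radix and the '|'-separated keyword list out of the packed call by hand, and that the radix is 36 or less. The names unpack and decode_index are made up for this example, and so is the sample input.

```python
import re

DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def decode_index(token, radix):
    """Read token as a base-`radix` number; return None if it is not one."""
    value = 0
    for ch in token:
        digit = DIGITS.find(ch)
        if digit < 0 or digit >= radix:
            return None
        value = value * radix + digit
    return value


def unpack(payload, radix, keywords):
    """Swap every word-like token in payload for the keyword it indexes."""
    words = keywords.split("|")

    def restore(match):
        token = match.group(0)
        index = decode_index(token, radix)
        # Tokens with no keyword (or an empty one) stand for themselves.
        if index is None or index >= len(words) or not words[index]:
            return token
        return words[index]

    return re.sub(r"\b\w+\b", restore, payload)


# Made-up sample: payload, radix and keyword list as they would appear
# in the arguments of a packed call.
print(unpack("2.1('0');", 3, "hi|log|console"))  # console.log('hi');
```

The punctuation survives untouched because only the word-like tokens are looked up; everything else in the payload is copied through as-is.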


Friday, September 15, 2006

Chasing Wild Geese? ...Keep Chasing.

I'm BACK! Sorry for the hiatus, I was preparing for and starting school. Now that things have gotten into a bit of a groove, I can get back on the HACK. The title for today's post is sort of tongue-in-cheek. I spent a lot of time chasing my tail on this particular project, and what fun would I be if I couldn't have a laugh at my own expense? The main point is that, even though I got stopped up a few times... I didn't give up, and in the end I got positive results.

My friend Dustin passed me the source for a C program he developed. Its purpose is to encrypt files. After looking it over, I told him I could defeat his encryption and he told me to give it a try. Here's how the project played out. (Keep in mind, I'm not a cryptanalyst... and I might very well be missing something or even be dead wrong, so I'd LOVE to hear from anyone with advice or constructive criticism.)

The program is shush.c, and this is how it works. You provide the program with two pieces of information: the name of the file to encrypt and an integer in the range 0 to 3435973836. This integer becomes your KEY. Shush then encrypts the file using an XOR encryption scheme. If you don't know what XOR is, or you don't have a very good grasp of XOR, read Wikipedia's entry for truth tables. Right away, you're probably seeing the #1 problem with this encryption scheme: too small a keyspace. In other words, there are so few possible keys that this can be easily brute forced.

Let's examine: there are approximately 3.4 billion possible keys (0-3435973836). Compare this to a simple 8 character password using only characters from the set [a-z]. The keyspace here is 26^8 or 208827064576, which is over 208 billion possible keys. How long would it take your password cracker to incrementally crack a password that was known to be 8 characters in the set [a-z]? Not very long, and that password's keyspace is still about 60 times larger than the keyspace of this program.
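The arithmetic is easy to check for yourself. Here is a small sketch that does nothing more than count both keyspaces and divide; the only input taken from shush is the upper bound shown in its prompt, and the variable names are my own.

```python
SHUSH_MAX_KEY = 3435973836        # upper bound printed by shush's key prompt

shush_keys = SHUSH_MAX_KEY + 1    # keys 0..3435973836 inclusive
password_keys = 26 ** 8           # 8 characters drawn from [a-z]

ratio = password_keys / shush_keys

print(f"shush keys:    {shush_keys:,}")
print(f"password keys: {password_keys:,}")
print(f"ratio:         {ratio:.1f}x")
```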

Now here's the difference between hacking and ...cracking or whatever other awkward label you use. I could stop here, write a little program to implement the brute forcing and be done with it, or I could keep hacking at it to see if there's an even more efficient way. You know me, of course I'll keep going... And here's where all the trouble started ;)

Let's take a look at how shush uses the key to encrypt the file (the shush.c source is available below this description, use it as a reference). When you give shush your integer, shush passes it to a function called srand(). This function uses the integer as an initial value for generating "random" numbers with the rand() function. This concept is called seeding. To be as basic as I possibly can... The SEED is a number that the random number generator does weird math stuff with. Once it's done its weird math stuff, the result is the random number that's been generated, and that result ALSO becomes the new seed. Here's some pseudocode to represent this:

seed = 31337

Generate random number:
random_number = seed + 1234 * 4321 / 7
seed = random_number

Now this is VERY arbitrary. The "weird math stuff" is actually quite technical, and far outside the scope of this post. For more information on this stuff, research linear congruential generators.

The point to note here is that this "weird math stuff" is an algorithm, and if you think about it... there is nothing truly "random" about mathematical algorithms. That's why we call this algorithmic randomness "pseudorandom". And we call this generator a "pseudorandom number generator" (hereafter PRNG).
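To make the "nothing truly random" point concrete, here is a toy linear congruential generator. It is a sketch only: the multiplier and increment are commonly quoted textbook constants that I picked for illustration, not the ones any particular rand() uses, and ToyLCG and first_values are made-up names.

```python
class ToyLCG:
    """Linear congruential generator: state = (a * state + c) mod m."""

    def __init__(self, seed, a=1664525, c=1013904223, m=2 ** 32):
        self.state = seed % m
        self.a, self.c, self.m = a, c, m

    def next(self):
        # The output of one step is also the seed for the next step.
        self.state = (self.a * self.state + self.c) % self.m
        return self.state


def first_values(seed, count):
    """Return the first `count` outputs produced from `seed`."""
    gen = ToyLCG(seed)
    return [gen.next() for _ in range(count)]


same = first_values(31337, 3) == first_values(31337, 3)
different = first_values(31337, 3) != first_values(31338, 3)
print(same, different)  # True True
```

Same seed in, same sequence out, every single time. That property is what the rest of this post leans on.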

It might seem like I'm on a tangent, but trust me, it's relevant.

So, basically, you give shush an integer, and this integer becomes a SEED. Shush then gets into a loop. This loop reads in one byte and performs an XOR operation on that byte against the result of a call to the PRNG. This PRNG is actually the function rand(). Keep that pseudocode in mind: every time rand() is called, the seed changes, so the next time it's called the result is different. The result of this XOR operation becomes the value for the encrypted byte, and so is written out to the encrypted file.

Here is shush.c

#include <stdio.h>
#include <stdlib.h>

void XOR(char filename[], unsigned key)
{
    FILE * in, * out;
    unsigned ascii;
    char tempfilename[9999];
    if ((in = fopen(filename, "rb")) == NULL) printf("Error: Could not open: %s\n",filename);
    else if ((out = fopen(tempfilename, "wb")) == NULL) printf("Error: Could not open: %s\n",tempfilename);
    while((ascii = fgetc(in)) != EOF) fputc(ascii^rand(),out);
}

void main(unsigned argc, char * argv[])
{
    printf("XOR File Encryption v1.0\nProgrammed by xxx\nEmail: xxx\n\n");
    if (argc==1) printf("Error: No file(s) supplied.\n");
    unsigned key, counter=1;
    while (counter<argc)
        printf("[%u] %s\n",counter,argv[counter]);
    printf("\nKey [0:3435973836]: ");
    scanf("%u", &key);
    printf("\nOne moment please. Processing...\nTo cancel, press ctrl + c\n\nStatus:\n");
    while (counter<argc)
        printf("[%u] %s...",counter,argv[counter]);
}

If you're following me, this will make sense. Based on the fact that rand() uses a static algorithm, the sequence of numbers it returns will always be the same sequence for a particular seed. With a symmetrical stream cipher like this XOR scheme, this is useful. I'll demonstrate.

Let's look at this sequence of numbers: 1, 2, 3, 4. We'll use them to "XOR encrypt" these 4 values: 5, 5, 5, 5. In a real-world example, these numbers would be numerical representations of the bytes from the file.

As XOR is a bitwise operator, let's see these in binary form:

00000101 00000101 00000101 00000101 (5, 5, 5, 5: unencrypted)
00000001 00000010 00000011 00000100 (1, 2, 3, 4: KEY)
00000100 00000111 00000110 00000001 (4, 7, 6, 1: encrypted)

This is what happens when you run shush on a file... Now, you send that file to someone and they run shush with the same key, and this happens:

00000100 00000111 00000110 00000001 (4, 7, 6, 1: encrypted)
00000001 00000010 00000011 00000100 (1, 2, 3, 4: KEY)
00000101 00000101 00000101 00000101 (5, 5, 5, 5: decrypted!)

This is the concept of a symmetrical encryption scheme. Now let's review what we know. The user supplies an integer. This integer seeds rand(). rand() generates a sequence of numbers against which each byte in the file is XOR'd, and the results make up the encrypted file.
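The round trip in the binary tables can be reproduced in a couple of lines. This is just a sketch of the XOR step with the same toy numbers, not of shush itself; xor_bytes is a made-up helper name.

```python
def xor_bytes(data, keystream):
    """XOR each data byte with the matching keystream byte."""
    return bytes(d ^ k for d, k in zip(data, keystream))


plaintext = bytes([5, 5, 5, 5])
keystream = bytes([1, 2, 3, 4])

encrypted = xor_bytes(plaintext, keystream)
decrypted = xor_bytes(encrypted, keystream)

print(list(encrypted))  # [4, 7, 6, 1]
print(list(decrypted))  # [5, 5, 5, 5]
```

The same function encrypts and decrypts, which is all "symmetrical" means here.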

Now that we understand shush more thoroughly, we can design an attack. My theory was this:

The user supplies a key that seeds rand(), and rand() then returns a predefined algorithmic series of numbers. Since this is algorithmic, given any number in the series, I will be able to predict the next number, or even the previous number in the series. Therefore if ANY of those numbers are known to me, I can reverse the algorithm to get the key. With that key, I can decrypt the file. This is pretty standard. The question is, how can I find any of these numbers? Well, really I shouldn't be able to... but a weakness in the implementation gives me a leg up. When a file is encrypted with shush, it keeps its filename. This means that, for a lot of file formats, SOME of the plaintext characters will be known. For instance, every valid JPG starts with the same 3 bytes, FF D8 FF (have a look if you want to verify). These are file headers. The nature of XOR allows us to use these known values to extract part of the key!

(OKAY WAIT. Here is where we're going to get mixed up with our terminology... when I say "key" I'm talking about the integer the user gave to shush. When I say "part of the key" I'm not talking about part of that number, I'm talking about one value returned by rand()... This might be confusing, but assume the integer supplied by the user to be 'the same thing' as the series of numbers returned by rand().)

Let's look at how this works. Let's say we're looking at the first four bytes of the encrypted file (same values as the last examples) and in this case we KNOW (based on file format specifications) that these bytes were originally 5, 5, 5, 5 before being encrypted.

00000100 00000111 00000110 00000001 (4, 7, 6, 1: encrypted)
00000101 00000101 00000101 00000101 (5, 5, 5, 5: unencrypted)
00000001 00000010 00000011 00000100 (1, 2, 3, 4: KEY!)

Now that we've extracted some of the numbers returned by rand(), all we need to do is reverse the algorithm in rand() to determine what key was given to srand(), and the encryption is defeated. This is A LOT faster than brute forcing... Everything looks good, right? Hmm....
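Here is the same extraction against something closer to a real file. It's a sketch under a couple of assumptions: the keystream bytes are invented for the demo, every keystream value is taken to fit in one byte (the theory at this point in the post), and the JPEG signature FF D8 FF stands in for the known plaintext.

```python
JPEG_MAGIC = bytes([0xFF, 0xD8, 0xFF])   # first bytes of a JPEG file


def recover_keystream(ciphertext, known_plaintext):
    """XOR ciphertext with known plaintext to expose the keystream bytes."""
    return bytes(c ^ p for c, p in zip(ciphertext, known_plaintext))


# Invented keystream, standing in for whatever the generator produced.
secret_keystream = bytes([0x41, 0x9C, 0x07])

# What the start of the encrypted file would look like.
encrypted_start = bytes(p ^ k for p, k in zip(JPEG_MAGIC, secret_keystream))

recovered = recover_keystream(encrypted_start, JPEG_MAGIC)
print(recovered == secret_keystream)  # True
```

The attacker never needs the key for this step, only the file type.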

If anyone sees a problem here, you get props. I didn't see a problem until I started experimenting. My theory seems pretty solid, but (my C++ prof loves to call me on this) I'm making an assumption. XOR is bitwise. Shush takes 1 byte from the file, and XORs it against 1 integer returned by rand(). Until now, we've been assuming that both of these pieces of data were the same size. That's assuming that rand() can only generate 8-bit integers, making the maximum value 255 (unsigned). This isn't the case: after experimenting with rand() for a bit, it's clear that rand() returns values outside that range. So let's see what happens when we XOR two values with different bit lengths. We'll use the same values as before, but we'll change one number from the key:

00000101 00000101 00000101 000000101 (5, 5, 5, 5 : unencrypted)
00000001 00000010 00000011 110010000 (1, 2, 3, 400: KEY)
00000100 00000111 00000110 110010101 (4, 7, 6, 405: encrypted)

Notice that the fourth value returned by rand() this time is 400, which makes the bit length 9 instead of 8. Since the key changed in bit length, a 0 was padded to the beginning of the value it is being XOR'd against, and the encrypted value ends up being 9 bits as well. Since we're writing byte values out to a file, only 8 of those bits will actually be used; 1 will be discarded. Lost forever. This complicates our methodology. Watch (remember the encrypted value was 110010101, or 405, but when the most significant bit is discarded, it becomes 10010101, or 149):

10010101 (149: encrypted)
00000101 (5 : Known unencrypted value)
10010000 (144: key? ...)

Flawed logic gave us a false positive. The key is 400, not 144... So if rand() returns a value greater than 255, our attack fails. Right? Well... to an extent. Think of it on a bit level... The value of the actual key and the value of the key we got are actually pretty close when we look at it this way:

110010000 (400: actual key)
010010000 (144: key we recovered)

Basically, we still know the last 8 bits of the key no matter what. So we could build a set of potentials based on how many values rand() can return that end in our 8 bits. In order to do this we need to know the largest possible value rand() can give us. This value changes from compiler to compiler (remember this, I'll come back to it) and it's set in cstdlib/stdlib.h as RAND_MAX. In the compiler I use (mingw), it's set to 0x7fff, or 32767, which is 15 bits. So how many potential values exist when the last 8 bits are known? 2^7 == 128 possible values. That's a pretty small keyspace, compared to the 3.4 billion we were dealing with before...
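The truncation and the list of 128 potentials can both be worked through in code. This sketch reuses the 5 and 400 from the example and the RAND_MAX of 0x7fff reported for mingw; encrypt_byte and candidates are made-up names.

```python
RAND_MAX = 0x7FFF  # value reported above for mingw


def encrypt_byte(plain_byte, rand_value):
    """XOR, then keep only the low 8 bits, as writing one byte to a file does."""
    return (plain_byte ^ rand_value) & 0xFF


def candidates(low_byte, rand_max=RAND_MAX):
    """Every value up to rand_max whose low 8 bits equal low_byte."""
    return list(range(low_byte, rand_max + 1, 256))


cipher_byte = encrypt_byte(5, 400)
low_byte = cipher_byte ^ 5

print(cipher_byte)                    # 149
print(low_byte)                       # 144
print(low_byte == (400 & 0xFF))       # True
print(len(candidates(low_byte)))      # 128
print(400 in candidates(low_byte))    # True
```

Stepping by 256 keeps the low byte fixed while walking through every possible combination of the 7 unknown high bits.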

In case anyone's lost, quick review:

We XOR a byte from the encrypted file against what we KNOW was the original byte before it was encrypted. The result gives us the final 8 bits of the value that byte was originally XOR'd with by shush. This gives us a set of 128 potential values. Now we can reverse the algorithm in rand(), and determine 128 potential keys, one of which was originally supplied to shush by the user.

If anyone sees a problem HERE, again... props. There's something I didn't take into account. We've been assuming that rand() is a symmetrical algorithm that can be reversed. I think the reason I thought this was that most descriptions of rand() talk about algorithmic randomness, and in pure mathematics algorithms are symmetrical, but in programming, not necessarily. Here is what rand() does (this is the implementation used by mingw; it also changes between compilers. I've also changed how it looks a bit for simplicity, but it still functions the same):

return((seed = seed * 214013L + 2531011L) >> 16);

As you can see, it takes the seed (your key) and multiplies it by 214013, and the product of that is added to 2531011. If this was all, it could be trivially reversed. We'd take the result, subtract 2531011 and divide the difference by 214013. But there's more. >> 16 indicates that 16 bits are dropped off the end of the value... And again, lost forever. So this introduces a whole new set of potentials... How many? Well... remember the maximum allowed seed (as indicated by shush) is 3435973836, which is a 32-bit value. 16 of those bits are discarded from the seed. This means that for every number from our list of 128 potentials there are 2^16 or 65536 additional potentials. This gives us 8388608 potentials in all. Still beats brute-forcing.

There's a question to ask yourself at this point though. A modern computer can crunch 3.4 billion numbers in seconds. So does the difference in efficiency and execution time justify the complexity of the code required to accomplish it? Probably not to the average user. If this was a professional developer project, your boss would kick you in the ass for wasting so much time. But it was a lot of fun anyways.
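For comparison, here is roughly what the "boring" search looks like. Treat it as a sketch under stated assumptions: it emulates the multiply, add and shift from the line of code above, and additionally assumes 32-bit wraparound, that srand() stores the key directly as the first seed, and a final mask to 15 bits so results stay within a RAND_MAX of 0x7fff. The function names are mine, and the seed window is kept tiny so the demo finishes instantly.

```python
def rand_stream(seed, count):
    """Emulate srand(seed) followed by `count` calls to rand()."""
    state = seed & 0xFFFFFFFF
    values = []
    for _ in range(count):
        state = (state * 214013 + 2531011) & 0xFFFFFFFF  # assumed 32-bit wrap
        values.append((state >> 16) & 0x7FFF)            # assumed 15-bit mask
    return values


def matching_seeds(known_low_bytes, seeds):
    """Return the seeds whose first outputs end in the known low bytes."""
    hits = []
    for seed in seeds:
        outputs = rand_stream(seed, len(known_low_bytes))
        if all((o & 0xFF) == b for o, b in zip(outputs, known_low_bytes)):
            hits.append(seed)
    return hits


# Pretend 31337 was the key and that 4 header bytes of plaintext are known,
# which leaks the low byte of the first 4 rand() results.
secret_seed = 31337
leaked = [v & 0xFF for v in rand_stream(secret_seed, 4)]

# Search a small window only; a real run would walk the whole key range.
print(secret_seed in matching_seeds(leaked, range(30000, 33000)))  # True
```

A search like this can return more than one seed, so any hit still has to be confirmed by decrypting the file and checking that the result makes sense.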

One side note is that shush depends on rand(). Through my research, I learned that the C standard doesn't specify the algorithm behind rand(). It's implemented differently on different platforms and compilers, and can even change from version to version. This program would be mostly useless if it was released open source: if two users compiled it with different compilers, chances are a file encrypted by one could not be decrypted by the other.

Anyways, hope SOMEONE had fun reading this lol

Monday, August 07, 2006

Authentication bypass.

This is an example of a far too common problem. Developers have a tendency to assume that client applications will always act how they were designed to act. This is fine if you're depending on them for functionality, but NOT if you're depending on them for security.

Recently I was asked to take a peek at a content management system currently in development. A lot of it seemed relatively stable, except for this one short snippet of code. Whenever a protected page is loaded, one that you have to be logged in to view, a function containing this code is called:

if(!isset($_SESSION['session']["privLvl"])) {
    header("Location: login.php");
}

It checks for a variable called 'privLvl' registered to the user's session. If the user is not logged in, this variable is unset and the browser is redirected to the login page. So then, what is the problem? The problem exists because our developer is putting his trust in a function that depends on the browser's response. header("Location: login.php"); will make a browser redirect to login.php, but only because that is how a browser is programmed to react. Watch what happens when I load this page in IE, then in netcat:

First, here is the code for admin.php:


if(!isset($_SESSION['session']["privLvl"])) {
    header("Location: login.php");
}

echo "BIG SECRET!";


Here is an image of what loads in IE:

That form is what is stored in login.php. We've been happily redirected, as intended. And now, in netcat:

Whoops. Netcat has no idea what to do with the Location: header; it's not a browser. A secure way to implement this would be as follows:


if(!isset($_SESSION['session']["privLvl"])) {
    header("Location: login.php");
    exit;
}

echo "BIG SECRET!";


Here we force the script to exit after the redirect, in case the client doesn't listen.
There are plenty of other vulnerabilities based on this same flawed thinking. Using JavaScript to sanitize input is a common one. As a developer you should always be conscious of what your functions depend on, and whether or not those dependencies are under your control.
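If you'd like to reproduce the netcat result without PHP, here is a self-contained sketch. A throwaway local server stands in for the vulnerable admin.php (it sends the redirect header and then keeps writing the page), and a client that doesn't follow redirects plays the part of netcat. LeakyHandler, fetch_without_following and demo are names made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class LeakyHandler(BaseHTTPRequestHandler):
    """Stand-in for admin.php: redirects, but still sends the page body."""

    def do_GET(self):
        body = b"BIG SECRET!"
        self.send_response(302)
        self.send_header("Location", "login.php")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)  # nothing stopped the "script" after the header

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_without_following(port):
    """Request the page and return it as-is, ignoring the Location header."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("GET", "/admin.php")
    resp = conn.getresponse()
    result = (resp.status, resp.getheader("Location"), resp.read())
    conn.close()
    return result


def demo():
    server = HTTPServer(("127.0.0.1", 0), LeakyHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch_without_following(server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()


status, location, body = demo()
print(status, location, body)
```

The client is told where to go, and gets handed the secret anyway. The redirect is a suggestion; only the server can decide what gets sent.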

Thursday, August 03, 2006

New Bookmarklets

I developed a few new web app pentesting bookmarklets this afternoon. If anyone has any requests, or bookmarklets of their own to share, please leave me a comment.

Here are the new ones:

Password2text: for quickly viewing what's in a password input.
Form Report: Detailed report of all forms on the current page.

Here is a running list of all my security marklets so far:

modcookie: For on-the-fly cookie modification.
methodToggle: for toggling the method of a form, to test method strictness
noMax: removes the maxlength property of text inputs.
hidden2text: displays hidden inputs.

I'll be looking forward to requests, and other security marklets from all of you ;)

Sunday, July 16, 2006

Weekend Pentest

A friend asked me to have a look at some PHP he was working on yesterday and I found a few interesting little security weaknesses I'd like to discuss.

First of all, maybe someone could help me out here: I'm not sure why this practice caught on. A lot of modular PHP projects use the file extension '.inc' for external modules, rather than '.php'. This is problematic. If a .inc file is requested directly, it usually won't be processed by the PHP interpreter, so the source code will be sent to the client rather than the expected output. Some will undoubtedly argue that these .inc modules are hidden from end users and therefore will never be seen. This is a classic case of security through obscurity, and a very BAD stance. Here is an example of why:

.inc Google Dork. <- this is a google search for every .inc file containing mysql_connect. You might be surprised to see how many sql usernames/passwords show up.

Robust security is hardening, not hiding.

The other weakness I found sort of made me smirk. This is a good example of why secure code will only allow good input, rather than disallowing bad input. My friend was reusing some old code he had written to allow a user to upload files. He had a list of extensions that users were not allowed to upload. These included: php, php2, php3, php4, phtml. When the code was written, .php5 was not a valid extension, but since then he had updated to PHP 5. I simply wrote a PHP passthru() shell and uploaded it as x.php5. This is a classic case of a developer allowing the environment to take care of security, rather than building security into the app he's developing. The same thing happens when a developer trusts PHP settings like magic quotes or register globals to take care of security. If the application is migrated, or the configuration is changed, the code might become a gaping hole.

The point I'm trying to make is, write secure code. Don't trust your environment to be secure.
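Here is the "allow good input" idea as a minimal sketch. It is written in Python rather than PHP, the list of allowed extensions is purely hypothetical (use whatever your application actually needs), and a real upload handler would want more checks than the extension alone.

```python
import os

# Hypothetical whitelist: anything not listed here is refused.
ALLOWED_EXTENSIONS = {"jpg", "png", "gif", "txt"}


def is_upload_allowed(filename):
    """Accept a file only if its final extension is on the whitelist."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    return extension in ALLOWED_EXTENSIONS


print(is_upload_allowed("holiday.JPG"))  # True
print(is_upload_allowed("x.php5"))       # False
print(is_upload_allowed("x.php6"))       # False
```

Note that x.php5 is refused without the code ever having heard of PHP 5, and so is any extension invented after the code was written.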

Wednesday, July 12, 2006

Packed Executables

I have to apologize again for the lack of recent updates. I've been spending a lot of time looking for work and haven't had a lot of time to spend on this blog. Over the last little while I've been getting a lot more traffic, so I'm vowing to start paying more attention to my readers.

The other day Didier Stevens posted on his blog about viewing strings in executables. It's an excellent explanation of how printable data is stored in binary files, and how an end user can find that data, and even modify it. In his post he mentions that this technique will almost always fail against malware, as malicious executables are often 'packed or encrypted'. This post will elaborate on reverse engineering 'packed' executables.

    "BTW, this technique to dump files will almost always fail when analyzing malware, because these files are often packed or encrypted. A simple trick to view the strings of such malware code is done with Process Explorer by Sysinternals. When the malware is running (you’ll want to run it on an isolated machine, like a virtual machine), start Process Explorer and display the properties of the running malware. The Strings tab will show you the strings in the file (Image) and in memory (Memory). Since the malware as unpacked/decoded itself when it started, you’ll be able to view the strings in memory."

Packing an executable file is a way of compressing executable code, firstly to minimize file sizes, but often it is also used to complicate the reverse engineering process. Now, packing is quite different from standard compression like zip or rar. Zipped files exist in archives, from which they must be decompressed. Packed executables are standalone files that can be executed while still compressed. A packer will use standard compression techniques on the file. Of course, these modifications make the binary code unrecognizable to the OS, so the packer also prepends an unpacking routine to the executable. When the file is run, the unpacking routine 'unpacks' the executable code, and loads it into RAM in its original state.

Didier mentions using Mark Russinovich's Process Explorer to dump strings from the process as it's running. This will work because the program has been unpacked into RAM. This is an excellent method, but in case you're only planning on running a preliminary/pre-execution analysis and you don't have an isolated testbed machine/virtual machine, I'll show you a few other ways.

The first thing to know is that not ALL the data/code in a packed executable is compressed. Some of it, namely the unpacking routine, remains unchanged. There are many public packers available on the internet, and most of them leave a very recognizable signature in the unpacking routine. This makes it trivial to determine what packer was used to pack the code.

Following are some examples:

C:\utils\strings>strings strings.exe | more

Strings v2.2
Copyright (C) 1999-2005 Mark Russinovich
Sysinternals - www.sysinternals.com

((((( H
!This program cannot be run in DOS mode.

Here I've run strings against itself as an example of an unpacked executable. You'll notice near the top, under a string from the DOS stub (!This program cannot be run in DOS mode.) and Rich8, a list of the various segments of the binary file: .text, .rdata, .data, etc. Here is the same exe packed using UPX, a very common packer:

C:\utils\strings>strings strings.exe | more

Strings v2.2
Copyright (C) 1999-2005 Mark Russinovich
Sysinternals - www.sysinternals.com

!This program cannot be run in DOS mode.

As was mentioned, the packer has left a recognizable signature in the unpacking routine. This would be a clear indication that we should look into unpacking UPX. You'll find that the UPX packer comes with an unpacking feature built right in. Other packers might not be so nice, but you'll likely find unpackers for most packers available online.

If a simple strings analysis isn't enough to determine the packer used on the file, you might try a PE analysis tool such as PEiD, available from peid.has.it. This application will recognize MOST packers.
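The strings-then-eyeball approach is simple enough to sketch. The code below pulls printable ASCII runs out of a blob of bytes and looks for markers that UPX-packed files typically carry (the section names UPX0 and UPX1 and the UPX! tag). The signature table is a tiny hypothetical one of my own, nothing like the database a tool such as PEiD uses, and the "executable" in the demo is a few hand-made bytes.

```python
import re

# Hypothetical, tiny signature table for illustration only.
PACKER_MARKERS = {
    "UPX": (b"UPX0", b"UPX1", b"UPX!"),
}


def printable_strings(data, min_length=4):
    """Return runs of printable ASCII that are at least min_length long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return re.findall(pattern, data)


def guess_packer(data):
    """Name the first packer whose marker shows up in the extracted strings."""
    found = printable_strings(data)
    for name, markers in PACKER_MARKERS.items():
        if any(marker in text for text in found for marker in markers):
            return name
    return None


# Hand-made bytes that only imitate the start of a packed file.
fake_exe = (
    b"MZ\x90\x00"
    + b"!This program cannot be run in DOS mode.\r\r\n$"
    + b"\x00" * 8
    + b"UPX0\x00\x00\x00\x00UPX1\x00\x00\x00\x00"
    + b"\x01\x02UPX!\x0c"
)

print(guess_packer(fake_exe))                       # UPX
print(guess_packer(b"\x00plain old program\x00"))   # None
```

To try it on a real file, read the file in binary mode and pass the bytes to guess_packer.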

Hope this helps someone somewhere.

Wednesday, June 14, 2006

A Study in Reverse Code Engineering (RCE)

For quite a while I've wanted to learn RCE. The ability to learn the subtleties of how an executable operates through runtime disassembly is an art and an incredible challenge. I've learned a few things that I think will prove useful to other novice reversers, and as such I will be documenting them here. This is not a tutorial. It's just an explanation of a reverse engineering task I've been hacking at. We're going to try to cheat at poker!

One point I want to emphasize is that I know very little assembly. I'm learning as I go. If you spot any mistakes, or incorrect assumptions I'd love for you to point them out. I could use the help! :)

I think one of the coolest things I learned in the process of this RCE project is how easily certain parts of the program can be recognized in the disassembler even with very little knowledge of or experience with x86 assembly.

Okay, the program I reversed is Net Poker by netintellgames (http://www.netintellgames.com/poker.htm). The only tool I used is a shareware disassembler/debugger called OllyDbg (http://www.ollydbg.de/).

The first step to any task is to define the task. You should always approach a problem with a solid understanding of what you're trying to do. Netpoker is a 2 player poker game. You can play either against the computer, or against an opponent. If you're playing against an opponent, one host acts as the server and the other connects over the network. This is interesting to me. If there are only 2 parties, in order to keep the deck of cards true at least one player will have to know what cards remain in the deck after some have been dealt to the other player. This means that, theoretically, at least one host will be able to determine which cards have been dealt to both hands. Another interesting thing to note is that playing against the computer is functionally identical to playing against a human. When playing against the computer, a second program executes invisibly and connects to you as a 'netpoker bot'. Our task, then, will be to try to determine which cards are in the opposing player's hand. (Also, I may have been able to figure all this out quicker by monitoring network communications, but that's not what I'm trying to learn :P.)

Okay, so I open up netpoker in olly. The first problem for me is always trying to get into the meat of the program with the debugger. There are probably a lot of tricks for doing this, but here's what I came up with for this project. Various sounds are played throughout the game. An easy way for applications to play audio like this is to make a call to the function sndPlaySoundA in the winmm.dll module. This is relevant because every time a card is dealt, a particular sound is played. So, locating calls to this function within netpoker might be a good way to find the function for dealing cards within the code listing.

I right click in the disassembler panel in olly and select "search for>intermodular calls". A list of all intermodular calls is displayed and I locate all calls to winmm.sndplaysounda and use F2 to set breakpoints on these calls. Then I execute the program by clicking play. A few dialog windows pop up asking me for my name and whether I want to serve or connect to a server. I choose to be the server and get to the main game screen. The status bar suggests I should press F2 to play against the computer. I do so, and a ringing sound is played, triggering my first breakpoint. This is what the code looks like:

00403C8E PUSH poker.00430144 ; ASCII "start.wav"
00403C93 CALL DWORD PTR DS:[<&WINMM.sndPlaySoundA>; WINMM.sndPlaySoundA

The value "start.wav" is pushed onto the stack as a parameter for the function sndPlaySoundA which is called as the next instruction. This is not the sound we're looking for, we're looking for the sound played when a card is dealt, so we press the play button again to resume execution. Oddly, all the cards are dealt, the sound is played for each card, but no breakpoints are triggered. After all the cards are dealt another sound is played and here a breakpoint is triggered.

00407806 PUSH poker.00430240 ; ASCII "money.wav"
0040780B CALL DWORD PTR DS:[<&WINMM.sndPlaySoundA>; WINMM.sndPlaySoundA

Alright, so at this point I had no idea how the card dealing sound could be playing without triggering my breakpoints, but one thing was for certain, the code I was looking for was between the two breakpoints that were triggering. So I restart the program and play till the start.wav breakpoint is triggered. I start stepping through the program here, but this gives me a bit of trouble. During execution millions of instructions are processed per second. I'm trying to walk through them one at a time. That could take forever, considering I don't really know how many instructions there are between the opening sound and the actual cards being dealt. So I try a different approach. I take a closer look at the two wav files being called:

00403C8E PUSH poker.00430144 ; ASCII "start.wav"
00407806 PUSH poker.00430240 ; ASCII "money.wav"

The addresses that the filenames are being called from (00430144/00430240) are relatively close in proximity. Could all the wav filenames be there?

00430240 6D 6F 6E 65 79 2E 77 61 money.wa
00430248 76 00 00 00 63 61 72 64 v...card
00430250 2E 77 61 76 00 00 00 00 .wav....
00430258 3A 00 00 00 64 72 61 77 :...draw
00430260 2E 77 61 76 00 00 00 00 .wav....
00430268 6C 6F 73 65 2E 77 61 76 lose.wav
00430270 00 00 00 00 77 69 6E 2E ....win.
00430278 77 61 76 wav

NOW we're getting somewhere. Card.wav looks like it might be the sound we're looking for. This is easily verified. All the wavs are in the game folder.

Directory of C:\Program Files\NetIntellGames\Net Poker 4

06/13/2006 12:40 AM .
06/13/2006 12:40 AM ..
08/01/1992 07:26 AM 676 card.wav
06/12/1992 08:46 AM 22,688 draw.wav

So I search the code for references to card.wav and come up with two:

004078B4 PUSH poker.0043024C ; ASCII "card.wav"
004078B9 CALL DWORD PTR DS:[<&WINMM.sndPlaySoundA>; WINMM.sndPlaySoundA

00407915 PUSH poker.0043024C ; ASCII "card.wav"
0040791A CALL DWORD PTR DS:[<&WINMM.sndPlaySoundA>; WINMM.sndPlaySoundA

Unfortunately, both of these already have breakpoints, from when I BP'd all calls to sndplaysounda. It's starting to look grim, but I get another idea. poker.0043024c is the address where the card.wav filename is located. So I search the code for references to THIS and voila, GOT one: 0040633E PUSH poker.0043024C. It turns out that it was just a problem with olly. When using "search for>all referenced text strings" card.wav is only found twice, but when actually searching for references to 0043024C, it's found three times.
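As a sanity check, the address 0043024C can be worked out straight from the memory dump shown earlier. The sketch below feeds in the dumped bytes exactly as listed, walks through them looking for NUL-terminated printable runs, and prints the address each one starts at. The helper name located_strings is made up for this example.

```python
BASE_ADDRESS = 0x00430240  # address of the first dumped byte

# The bytes from the dump, one row of the listing per line.
DUMP = bytes.fromhex(
    "6D 6F 6E 65 79 2E 77 61 "
    "76 00 00 00 63 61 72 64 "
    "2E 77 61 76 00 00 00 00 "
    "3A 00 00 00 64 72 61 77 "
    "2E 77 61 76 00 00 00 00 "
    "6C 6F 73 65 2E 77 61 76 "
    "00 00 00 00 77 69 6E 2E "
    "77 61 76"
)


def located_strings(data, base, min_length=4):
    """Return (address, text) for each printable run of min_length or more."""
    results = []
    start = None
    # A trailing NUL makes sure a string at the very end is flushed too.
    for offset, byte in enumerate(data + b"\x00"):
        if 0x20 <= byte <= 0x7E:
            if start is None:
                start = offset
        else:
            if start is not None and offset - start >= min_length:
                text = data[start:offset].decode("ascii")
                results.append((base + start, text))
            start = None
    return results


for address, text in located_strings(DUMP, BASE_ADDRESS):
    print(f"{address:08X} {text}")
```

card.wav comes out at 0043024C, which matches the operand of the PUSH instructions.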

I also now know why I didn't find the call to winmm.sndPlaySoundA. It's not calling it directly. Here is what I mean:

00406282 MOV EBP,DWORD PTR DS:[<&WINMM.sndPlaySou>; WINMM.sndPlaySoundA
0040633E PUSH poker.0043024C ; ASCII "card.wav"
00406343 CALL EBP

The address for winmm.sndPlaySoundA is loaded into EBP, and then EBP is called. trixy ;).

Now, after adding a breakpoint to the CALL EBP that actually plays the sound, the program pauses after every card dealt. This happens 10 times before the program continues execution. This is definitely where we want to be in the code. Now here is what I was talking about when I said it's cool how little assembler you have to know to get a feel for what's going on in a program. I start using F8 to 'step' through the program one instruction at a time. I quickly notice that I'm in a loop that flows something like this: the loop iterates 10 times, executing one series of instructions the first time, another series the second time, the first series the third time, the second series the fourth, and so on. The only reason I spotted this is just by watching how the flow of execution moved on the screen. Every time I hit F8, the interface would highlight the next instruction... and it was moving through this loop, switching back and forth between the two subsections of instructions. If you're having trouble following that, try it for yourself. Open up netpoker in olly, find 00406343 and add a breakpoint to it with F2, now press play. When it breaks just start pressing F8 over and over and over, you'll see what I mean by recognizing the loop without recognizing the commands.

I figure what's happening here is the first time through, a card is being dealt to one hand, the second time through the card is being dealt to another hand, etc, etc. The code follows, I've separated it with dashed lines.

[This is the beginning of the loop]------------
00406291 MOV EAX,DWORD PTR DS:[433EE4]
00406296 AND EAX,0FF
0040629B SUB EAX,0
0040629E JE SHORT poker.004062D5
004062A0 DEC EAX
004062A1 JNZ SHORT poker.0040630A
[this is executed the first time through]------
004062A3 XOR EAX,EAX
004062A5 MOV AL,BYTE PTR DS:[433EE5]
004062B4 XOR EAX,EAX
004062B6 MOV BYTE PTR DS:[ECX*4+433D3D],DL
004062BD MOV AL,BYTE PTR DS:[433EE5]
004062C5 MOV BYTE PTR DS:[EAX*4+433D3C],BL
004062CC MOV BYTE PTR DS:[433EE4],0
004062D3 JMP SHORT poker.0040630A
[this is executed the second time through]-----
004062D5 XOR EAX,EAX
004062D7 MOV AL,BYTE PTR DS:[433EE5]
004062DC MOV DL,BYTE PTR DS:[EAX*2+433DC5]
004062E6 XOR EAX,EAX
004062E8 MOV BYTE PTR DS:[ECX*4+433CAD],DL
004062EF MOV AL,BYTE PTR DS:[433EE5]
004062F7 MOV BYTE PTR DS:[EAX*4+433CAC],BL
004062FE MOV AL,BYTE PTR DS:[433EE4]
00406303 INC AL
00406305 MOV BYTE PTR DS:[433EE4],AL
[the rest is executed everytime]---------------
0040630A MOV CL,BYTE PTR DS:[433EE4]
00406310 MOV AL,BYTE PTR DS:[433EE7]
00406315 CMP CL,AL
00406317 JNZ SHORT poker.0040631F
00406319 INC BYTE PTR DS:[433EE5]
0040631F MOV CL,BYTE PTR DS:[433EE6]
00406325 INC CL
00406327 MOV BYTE PTR DS:[433EE6],CL
0040632D MOV ECX,EDI
0040632F CALL poker.00405520
00406334 CMP DWORD PTR DS:[EDI+368],EBX
0040633A JNZ SHORT poker.00406345
0040633C PUSH 1
0040633E PUSH poker.0043024C
00406343 CALL EBP
00406345 MOV EDX,DWORD PTR DS:[EDI+358]
0040634B MOV ECX,poker.00433CA0
00406350 PUSH EDX
00406351 CALL poker.00402380
00406356 CMP BYTE PTR DS:[433EE6],0A
0040635D JNZ poker.00406291

Now, I understand a bit of the code in here. For instance the various jumps (jmp) are controlling the flow of this loop, sometimes based on conditions (je, jnz). But to be honest, I couldn't just look at this code and tell you what it was doing. As I approach this next piece of the puzzle, I do so with the assumption that the section of code executed the first time through the loop represents dealing a card to the computer's hand, and the section of code executed the second time through represents dealing a card to my hand. Since I can see my hand to begin with, I focus on the section that deals me a card.
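For anyone who finds the listing hard to follow, here is a rough JavaScript sketch of the control flow as I read it. It models only which branch runs on each pass, not the real memory writes. The names (`dealSketch`, `turn`, `slot`) are invented, the starting value of `turn` is a guess, and which hand each branch fills is the same assumption made above.

```javascript
// "turn" stands in for the byte at 433EE4 and "dealt" for the byte at 433EE6,
// which is compared against 0A at the bottom of the loop.
function dealSketch(totalCards = 10) {
  const log = [];
  let turn = 1; // assumed starting value, so the 004062A3 block runs first
  for (let dealt = 0; dealt < totalCards; dealt++) {
    // Assumption: each hand moves on to its next slot every second pass.
    const slot = Math.floor(dealt / 2);
    if (turn === 1) {
      log.push({ iteration: dealt + 1, block: "004062A3", hand: "computer", slot });
      turn = 0; // MOV BYTE PTR DS:[433EE4],0
    } else {
      log.push({ iteration: dealt + 1, block: "004062D5", hand: "player", slot });
      turn = 1; // INC AL, then the value is stored back to 433EE4
    }
  }
  return log;
}

console.log(dealSketch());
```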

004062A3 XOR EAX,EAX
004062A5 MOV AL,BYTE PTR DS:[433EE5]

I don't really know what the xor's purpose is, but the next two instructions are moving values into the low ends of EAX and EDX. So, my next step is to run through the loop a few times watching these registers in the registers panel in olly.

Iteration  EAX  EDX  (EDX in Decimal)
1          0    29   (41)
2          0    0C   (12)
3          1    31   (49)
4          1    2F   (47)
5          2    02   (02)
6          2    1A   (26)
7          3    27   (39)
8          3    31   (49)
9          4    11   (17)
10         4    16   (22)

It looks like EAX is acting as a counter for the loop, which isn't surprising as this register is also called the Accumulator. EDX, however, is a little more interesting. I've included the values in decimal to illustrate a point. All these numbers are between 1-52. Coincidence? Doubtful. Assuming those numbers DO represent the cards in both hands, our next step is to determine HOW they represent cards. Sort of a miniature reversing of a dataset's schema. In order to do this, we will look at the values we have, and what we are assuming they represent:

My hand            EDX in Decimal
4 of hearts        12
King of diamonds   47
8 of clubs         26
Ace of spades      49
7 of clubs         22

Luckily there is enough data in this one hand to map out the format of the entire dataset. The following table represents this format

          10          20
1234 5678 9012 3456 7890 1234 5678
2222 3333 4444 5555 6666 7777 8888
scdh scdh scdh scdh scdh scdh scdh

30          40           50
9012 3456 7890 1234 5678 9012
9999 0000 jjjj qqqq kkkk aaaa
scdh scdh scdh scdh scdh scdh

There we go. This means that by monitoring the value stored in EDX we can effectively determine the cards as they're dealt. PHEW. Now I need a rest :)
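The mapping in the table boils down to a little integer arithmetic: ranks run from 2 up to ace in groups of four, and within each group the suits go spades, clubs, diamonds, hearts. A minimal sketch, assuming that layout holds for all 52 values (`decodeCard` is just a name I picked):

```javascript
const RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "Jack", "Queen", "King", "Ace"];
const SUITS = ["spades", "clubs", "diamonds", "hearts"];

// Turn a value in the range 1-52 (as seen in EDX) into a card name.
function decodeCard(value) {
  if (!Number.isInteger(value) || value < 1 || value > 52) {
    throw new RangeError("card value must be an integer from 1 to 52");
  }
  const index = value - 1;
  return `${RANKS[Math.floor(index / 4)]} of ${SUITS[index % 4]}`;
}

// The five values observed for my hand, in the order they were dealt.
const myHand = [0x0c, 0x2f, 0x1a, 0x31, 0x16].map(decodeCard);
console.log(myHand);
```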

Friday, June 02, 2006

MSN Handwriting Interception.

First, sorry for the lapse in posting, I've been busy with LIFE and a few very time consuming projects.

Recently I was talking with a friend over msn messenger. They had some sensitive information to relay, but they were a little nervous about doing so from their wireless lan. This made me smile a little, always happy to see security conscious minds, and I was about to tell them to call me when they sent the information using the MSN Handwriting feature. Their justification was that text could easily be sniffed on the wire, but not an image. I knew this was a false sense of security so I set out to find an easy way to intercept MSN Handwriting.

Having no prior knowledge regarding the technologies involved, I decided to start by having a look at the packet traffic involved when sending a handwritten msg. An example payload follows:

MSG REMOVED@hotmail.com sp00kz 580

MIME-Version: 1.0

Content-Type: application/x-ms-ink


Okay, first thing to note is the mime type: application/x-ms-ink. Googling this doesn't give me a lot of information. There are barely any references to the mime-type at all. The last piece of data from the packet is the most interesting, this is our data stream, base64 encoded for safe transmission. First we decode it in order to see if we can make heads or tails of the format. To do this I used b64dec.exe (http://www.4mhz.de/b64dec.html). A very handy tool: we cut and paste the encoded data into a text field, and it allows us to write the decoded binary data to a file.
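If you don't have b64dec.exe handy, the same decoding step can be sketched in a few lines of JavaScript under Node. This is not the tool used above, just an equivalent. `decodeInkPayload` is a made-up name, and the sample string is only the first six bytes of the dump below re-encoded by hand, since the real payload isn't reproduced here.

```javascript
// Decode a base64 payload copied out of a packet capture into raw bytes.
function decodeInkPayload(base64Text) {
  // A pasted payload may carry line breaks or spaces; strip them first.
  const cleaned = base64Text.replace(/\s+/g, "");
  return Buffer.from(cleaned, "base64");
}

// Stand-in sample: base64 for the bytes 00 FF 02 1C 03 80.
const sample = "AP8C\r\nHAOA";
const decoded = decodeInkPayload(sample);
console.log(decoded.toString("hex"));

// To write the result to disk for inspection:
// require("fs").writeFileSync("msnhw.out", decoded);
```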

At this point I'm hoping to see some image format I recognize. Could msn messenger be forwarding the data as BMP?

00000000: 00 FF 02 1C-03 80 80 04-1D 04 B0 02-EE 02 02 0D ☻∟♥ÇÇ♦↔♦░☻ε☻☻♪
00000010: 04 48 11 45-64 07 48 11-44 FF 01 45-64 19 14 32 ♦H◄Ed•H◄D ☺Ed↓¶2
00000020: 08 00 80 32-02 1C C7 B1-42 33 08 00-E0 12 02 1C ◘ Ç2☻∟╟▒B3◘ α↕☻∟
00000030: C7 31 42 15-AB AA D3 41-AB AA D3 41-00 58 D5 3E ╟1B§½¬╙A½¬╙A X╒>
00000040: 00 80 95 3E-1E 07 06 82-FC 71 F8 DC-00 09 01 0A Çò>▲•♠éⁿq°▄ ○☺◙
00000050: 77 BC 01 82-FE 00 F3 F8-03 DB 66 AD-22 25 C5 4B w╝☺é■ ≤°♥█f¡"%┼K
00000060: 12 C0 24 B2-CB 01 72 A0-95 36 6C B6-50 00 00 10 ↕└$▓╦☺ráò6l╢P ►
00000070: 2E 53 2C 96-4B 96 CD 28-54 D4 AB 65-2A 4E E1 52 .S,ûKû═(T╘½e*NßR
00000080: A2 C2 58 17-36 02 58 B1-60 00 82 FE-04 0B F8 10 ó┬X↨6☻X▒` é■♦♂°►
00000090: 3B 56 D9 D9-9B 17 35 2D-91 12 94 8B-88 45 91 60 ;V┘┘¢↨5-æ↕öïêEæ`
000000A0: C5 82 C5 8B-12 85 96 4D-96 6C B5 65-2C 59 28 95 ┼é┼ï↕àûMûl╡e,Y(ò
000000B0: 10 21 96 2E-55 83 34 4D-66 99 DC B2-B3 52 CA 58 ►!û.Uâ4MfÖ▄▓│R╩X
000000C0: 5C D9 0D 85-45 4A 96 C5-09 00 0A 2C-3D 82 FD C9 \┘♪àEJû┼○ ◙,=é²╔
000000D0: FB 94 00 01-15 2A 58 9B-28 06 C2 04-94 16 00 82 √ö ☺§*X¢(♠┬♦ö▬ é
000000E0: FE 00 8B F8-02 39 64 A9-B9 74 2A 5C-A2 54 B2 59 ■ ï°☻9d⌐╣t*\óT▓Y
000000F0: 62 C2 51 65-96 4A 2A 00-0A 26 2B 82-FE 00 23 F8 b┬QeûJ* ◙&+é■ #°
00000100: 00 94 58 B2-51 6A 2C 2C-59 52 96 25-92 80 82 FE öX▓Qj,,YRû%ÆÇé■
00000110: 00 C3 F8 03-19 65 03 62-43 36 06 C2-8A 95 2C 00 ├°♥↓e♥bC6♠┬èò,
00000120: 0A 25 2E 82-FE 00 DB F8-03 99 66 E5-80 49 4A 4A ◙%.é■ █°♥ÖfσÇIJJ
00000130: 00 4B 26 E5-2C 82 FE 00-C3 F8 03 10-40 14 54 A5 K&σ,é■ ├°♥►@¶TÑ
00000140: 93 2C D9 65-4A 94 00 0A-21 25 82 FE-01 5B F8 05 ô,┘eJö ◙!%é■☺[°♣
00000150: 70 01 00 25-4B 2A 68 00-82 FE 00 BB-F8 02 D9 49 p☺ %K*h é■ ╗°☻┘I
00000160: 57 62 CB 80-65 26 E6 E5-59 A0 0A 16-12 82 FD D9 Wb╦Çe&µσYá◙▬↕é²┘
00000170: FB AC 92 7A-55 DB 6B A0-82 FE 01 93-F8 06 5B 16 √¼ÆzU█káé■☺ô°♠[▬
00000180: 00 A4 - - - ñ

Unfortunately, this is not data I recognize. I run searches for common image file format signatures found here: http://www.garykessler.net/library/file_sigs.html and here: http://www.wotsit.org. But still, this yields nothing.
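Checking for a known signature amounts to comparing the first few bytes of the file against a table of magic numbers. A small sketch with a handful of common image formats (the table is deliberately tiny and `identify` is a hypothetical helper, not part of any tool mentioned here):

```javascript
// A few well-known image signatures, matched at offset 0.
const SIGNATURES = [
  { name: "PNG", magic: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { name: "GIF", magic: [0x47, 0x49, 0x46, 0x38] },
  { name: "JPEG", magic: [0xff, 0xd8, 0xff] },
  { name: "BMP", magic: [0x42, 0x4d] },
];

// Return the name of the first matching format, or null if none match.
function identify(bytes) {
  const match = SIGNATURES.find((sig) => sig.magic.every((b, i) => bytes[i] === b));
  return match ? match.name : null;
}

// The first bytes of the decoded handwriting data match nothing in the table.
console.log(identify([0x00, 0xff, 0x02, 0x1c, 0x03, 0x80, 0x80, 0x04]));
```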

At this point I notice something interesting. I click the handwritten msg my friend sent me and drag it into my handwriting space. The eraser tool is different than most eraser tools. Try it out right now, you'll see what I mean. It remembers each 'stroke' and erases them the way they were drawn. The interesting thing I notice is that MY msn messenger client remembers the 'strokes' that my friend used when writing THEIR msg. That means this information is sent in that base64 encoded data stream. Interesting, what type of data would require that information? Well, I know that 'graffiti' on my palm needs to know the order of strokes in a letter in order to determine effectively which letter was being written...

A little research tells me that Microsoft's XP Tablet PC edition has built in handwriting recognition. The format they use is called Ink Serialized Format, or ISF. I download a freeware ISF viewer from Agilix.com, rename my msnhw.out to msnhw.isf, and attempt to open it...

BINGO. So, by sniffing the wire, or reading packet traffic, it's trivial to view MSN handwriting. Also, this format is used for handwriting recognition, so it would likely be very easy for someone who understands the format to change this data to text, possibly adding word recognition to traffic monitoring systems. I assume this also foreshadows future features of the msn messaging client.

Happy Hacking :)

Sunday, May 14, 2006

Javascript "Encryption"

Last night a friend pointed me at a forum where someone was 'daring' people to 'crack his encryption'. I absolutely love these challenges and always give them a try. In this case, the challenge wasn't so much 'cracking encryption' as making sense of some semi obfuscated javascript code. I've decided to explain some of the techniques I use to do this, in case anyone ever finds a good use for them :P

Alright, so the code is long and mungy, so rather than pasting it here I'll explain some of it and paste the main function so I can walk you through it.

First, two arrays are declared. A standard array called asciiasc and an associative array called asciichr. These are used to create an alphabet table in the following way:

asciichr["A"] = 65; asciichr["B"] = 66; asciichr["C"] = 67; asciichr["D"] = 68; ...
asciiasc[65] = "A"; asciiasc[66] = "B"; asciiasc[67] = "C"; asciiasc[68] = "D"; ...

Each of these arrays contains 255 values.

Following this is a javascript function, I've added indentations for readability:

function synvb(text) {
    var allchr, myfinal = "";
    var textasc, allasc = 0;
    var passasc = 137;
    var i, j = 0;
    var rep2 = "";
    rep2 = /\\r/g;
    text = text.replace(rep2,"\r");
    rep2 = /\\n/g; text = text.replace(rep2,"\n");
    for (i = 0; i < text.length; i++) {
        textasc = asciichr[text.charAt(i)];
        allasc = textasc - passasc;
        if (allasc < 1) {
            j = allasc + 255;
            allchr = asciiasc[j];
        } else {
            allchr = asciiasc[allasc];
        }
        if (!allchr) {
            allchr = " ";
        }
        myfinal = myfinal + allchr;
    }
    return myfinal;
}

First some variables are declared. The two replace()'s are done to unescape carriage returns (\r) and newlines (\n). Following this we are passed into a for loop.

for (i = 0; i < text.length; i++) {

A for loop's parameters can often contain enough information to allow us to make a preliminary hypothesis of what happens within the loop construction. In this case, the loop's condition is based on the length of the string 'text', which is the parameter passed to this function. Having seen the webpage I know that text contains an 'encrypted' string. Knowing this, we can hypothesize that this loop will iterate over every character, and is therefore likely the 'decryption' routine.

textasc = asciichr[text.charAt(i)];
allasc = textasc - passasc;

In the first iteration of the loop we get the value for the 1st ascii character in 'text', and subtract 137 from that value. Ah, so this is a basic shift cipher! The conditional statements following this are used to 'wrap' the shift to accommodate values of allasc less than 1. Finally we take our values and concatenate them in 'myfinal'.
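To make the shift concrete, here is a standalone sketch of the same idea. It is not the challenge code: it uses charCodeAt and String.fromCharCode in place of the two lookup tables, and `shiftEncode` is my guess at what the matching 'encryption' side would look like. Both function names are invented.

```javascript
const SHIFT = 137;      // passasc in the original function
const TABLE_SIZE = 255; // the lookup tables hold 255 values

// Undo the shift: subtract 137 and wrap anything below 1 back up by 255.
function shiftDecode(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    let code = text.charCodeAt(i) - SHIFT;
    if (code < 1) code += TABLE_SIZE;
    out += String.fromCharCode(code);
  }
  return out;
}

// Assumed inverse: add 137 and wrap anything above 255 back down.
function shiftEncode(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    let code = text.charCodeAt(i) + SHIFT;
    if (code > TABLE_SIZE) code -= TABLE_SIZE;
    out += String.fromCharCode(code);
  }
  return out;
}

const secret = shiftEncode("Hello World!");
console.log(shiftDecode(secret));
```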

Now that we know that this cipher shifts the alphabet by -137 it would be trivial to code a decryption function. Rather than doing this, however, we'll use the code already at our disposal. We can add a line here:

myfinal = myfinal + allchr;
> document.write(myfinal);

However, in the instance of the code I was looking at, this will not work. After running through it, nothing was displayed. But why? What kind of data could be passed to document.write and not be displayed in the content of the document? It must be code! Maybe the value is something like: <!-- this is the decrypted text -->. This would not be displayed. So how can we have it displayed? The <textarea> tag is a perfect fit for this problem. Anything following this tag is treated as data (not code) and placed inside a textarea box. So, the first time through our loop we want to document.write('<textarea>').

function synvb(text) {
    var allchr, myfinal = "";
    var textasc, allasc = 0;
    var passasc = 137;
    var i, j = 0;
    var rep2 = "";
    rep2 = /\\r/g;
    text = text.replace(rep2,"\r");
    rep2 = /\\n/g; text = text.replace(rep2,"\n");
    for (i = 0; i < text.length; i++) {
        textasc = asciichr[text.charAt(i)];
        allasc = textasc - passasc;
        if (allasc < 1) {
            j = allasc + 255;
            allchr = asciiasc[j];
        } else {
            allchr = asciiasc[allasc];
        }
        if (!allchr) {
            allchr = " ";
        }
-->     if(!myfinal){document.write('<textarea>')}
        myfinal = myfinal + allchr;
    }
    return myfinal;
}
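Another way to get the same effect, if you'd rather not rely on the textarea, is to escape the markup characters before writing the string out. This is a different technique from the one used above, and `escapeHtml` is a hypothetical helper, not something from the challenge code:

```javascript
// Replace the characters a browser would interpret as markup with entities,
// so the decoded string is shown as text instead of being parsed.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;") // must run first so later entities are not double-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

console.log(escapeHtml("<!-- this is the decrypted text -->"));
// In the page itself this would be used as: document.write(escapeHtml(myfinal));
```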

Hope you like the technique :).

Dispelling Myths

Alright, so here's my beef. Someone claimed recently that they had developed a bandwidth consumption DoS. This, to me, is not a bad thing. Exploit development can be positive and progressive. What bothers me is when 'developing a bandwidth consumption DoS' consists of nothing more than coming up with a simple idea, not testing it, not understanding the technologies involved, and propagating a rumour that it works. This is fear mongering. After asking around I found a good number of people who had heard of the DoS, and had just taken for granted that it was a big deal.

Basically, his theory is that by spoofing the IP address of a victim, and sending an HTTP GET request to a webserver for a file larger than the request, he is achieving amplification. This is illustrated in the following diagram, where A(C) represents the attacker (host A) spoofing the victim's IP (host C) and requesting an image from the webserver (host B).

Assuming the file big.jpg is bigger than the HTTP request packet, his transmission has been amplified, and since his IP has been spoofed big.jpg will be sent to the victim. A standard http get request from a browser can be anywhere from 100-500 bytes, but a lot of this information, the useragent, referrer, etc, is unnecessary. All we really need to send in this case is the request method, URI and HTTP version.

GET /big.jpg HTTP/1.1

This will end up being around 50 bytes. If the image we're requesting is 5k we've amplified our attack by 100x. Having a look at google's image search gives indication that images over 500k are not uncommon. Looks pretty dangerous, right? An attacker can send out 50 bytes of data and have a victim receive 500000 bytes. Or can they?
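The claimed amplification is just the ratio of response size to request size. Spelled out with the round numbers above (`amplification` is only an illustrative helper):

```javascript
// How many bytes reach the victim for every byte the attacker sends,
// according to the theory being examined.
function amplification(requestBytes, responseBytes) {
  return responseBytes / requestBytes;
}

console.log(amplification(50, 5000));   // the 5k image
console.log(amplification(50, 500000)); // the 500k image
```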

The problem is that the designer of this DoS doesn't understand the technology he's attempting to exploit.

Problem 1: Fragmentation.

Even if he could get this exploit to work (he can't, but I'll get to that) his metrics are off. Webservers don't just send out 500000 byte packets. The response is split into more manageable pieces (strictly speaking this is TCP segmentation rather than IP fragmentation, but the effect on the numbers is the same). The standard maximum size for a packet is 1500 bytes. Each 1500 byte piece has to be acknowledged by the recipient, but since it isn't expected by the victim host it is discarded and an RST packet is sent back to the webserver, resetting the connection. That chokes our amplification to a maximum of 1500 bytes for each request (1500 is the STANDARD MTU, but in certain circumstances it can be modified). This minimizes the risk significantly.

Problem 2: TCP isn't stupid.

This attack tries to take advantage of the fact that an HTTP GET request is relatively small, but has the potential to force large responses from a server. He's got the right idea, but he's looking at the wrong transmission protocol. TCP is connection based, and as such any TCP service requires a full two-way connection. Spoofing packets is easy, but spoofing a full connection is not. In order to have the webserver accept and respond to the GET request for this image, the attacker has to first complete the tcp initiation process (or threeway handshake) posing as the victim like this:

A(C) >-------SYN-------> B
B >-----SYN/ACK-----> C
A(C) >-------ACK-------> B

Those who know TCP will know that this is not an easy thing to do. The attacker is faced with a number of problems, but first and foremost is sequence numbers. I'll illustrate:

A(C) >-------SYN------[S:1111111111 / A:0000000000]------> B
B >-----SYN/ACK----[S:2222222222 / A:1111111112]------> C
A(C) >-------ACK------[S:1111111112 / A:2222222223]------> B

This is what needs to happen, not what will happen. Here's what will happen.

A(C) >-------SYN------[S:1111111111 / A:0000000000]------> B
B >-----SYN/ACK----[S:2222222222 / A:1111111112]------> C
A(C) >-------ACK------[S:1111111112 / A:uhmmmmmmmm]------> B
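The sequence arithmetic in these diagrams can be written out as a small sketch. It only models the numbers, not real packets. `handshake` is an invented name, and the point to notice is that the last packet needs `serverIsn`, which only ever travels to host C.

```javascript
const SEQ_SPACE = 2 ** 32; // TCP sequence numbers are 32 bits and wrap around

// Build the three handshake packets for a given pair of initial sequence numbers.
function handshake(clientIsn, serverIsn) {
  return [
    { from: "A(C)", to: "B", flags: "SYN", seq: clientIsn, ack: 0 },
    // The server acknowledges the client's ISN + 1 and picks its own ISN.
    { from: "B", to: "C", flags: "SYN/ACK", seq: serverIsn, ack: (clientIsn + 1) % SEQ_SPACE },
    // The final ACK goes to the webserver and must acknowledge the server's
    // ISN + 1, a value the spoofing attacker never saw.
    { from: "A(C)", to: "B", flags: "ACK", seq: (clientIsn + 1) % SEQ_SPACE, ack: (serverIsn + 1) % SEQ_SPACE },
  ];
}

console.log(handshake(1111111111, 2222222222));
```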

Our attacker will have no way of knowing the sequence number generated by the webserver as it was sent to the victim, not him. This, of course, is debatable. Okay, so let's debate it.

Michal Zalewski did some excellent research on ISN (Initial Sequence Number) randomness in various operating systems. His research shows that there are a number of TCP implementations using weak, predictable ISNs. If the ISN can be predicted, then the TCP connection can be spoofed! But what does this mean for the DoS? Nothing. Weak ISNs can be predicted but very rarely to absolute precision. This means that multiple guesses have to be made.

A(C) >-------SYN------[S:1111111111 / A:0000000000]------> B
B >-----SYN/ACK----[S:2222222222 / A:1111111112]------> C
A(C) >-------ACK------[S:1111111112 / A:1111111110]------> B
A(C) >-------ACK------[S:1111111112 / A:1111111111]------> B
A(C) >-------ACK------[S:1111111112 / A:1111111112]------> B
A(C) >-------ACK------[S:1111111112 / A:1111111113]------> B

Some of the weakest ISNs still require around 5000 guesses, meaning 5000 packets. ACK packets are usually about 50 bytes. That's 250000 bytes of data being sent for every 1500 bytes reflected.
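Put as arithmetic, using the rough figures above (`guessingCost` is only an illustrative helper):

```javascript
// Compare what the attacker has to send while guessing the ISN
// with what actually gets reflected at the victim.
function guessingCost(guesses, ackBytes, reflectedBytes) {
  const sent = guesses * ackBytes;
  return { sent, reflected: reflectedBytes, ratio: reflectedBytes / sent };
}

const cost = guessingCost(5000, 50, 1500);
console.log(cost.sent, cost.reflected, cost.ratio);
```

A ratio below 1 means the attacker is spending more bandwidth than the victim receives, which is the opposite of amplification.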

Now, not to add insult to injury, but let's say our attacker CAN guess the sequence number without having to send more than 1500 bytes. I mean, I suppose it's feasible that he finds a webserver on a network generating absolutely predictable ISNs (barely, but we're scientists). There's something ELSE he's missing. When the webserver (host B) sends the SYN/ACK to the victim (host C), the victim's machine knows it didn't request a connection, assumes the webserver has malfunctioned and sends an RST back. This tips off the webserver, causing it to not send the 1500 byte fragment of our image.

A(C) >-------SYN-------> B
B >-----SYN/ACK-----> C
C >-------RST-------> B

The only way to effectively spoof a full connection is if you can "gag" the host whose address you're spoofing, so that it can't reply. So, really... you have to DoS the victim in order to use this DoS against them! The attack is entirely superfluous.

Now, if you're looking for this kind of reflective amplification DoS, try looking at UDP services with the same small request/large reply type features.


Wednesday, May 10, 2006

a few more Bookmarklets

So, I decided to write a few more bookmarklets to help with some basic web application auditing. Here they are:

methodToggle for toggling the method of a form.

noMax for ditching the maxlength restriction on text fields

hidden2text for changing hidden inputs to text inputs. After trying to get this one to work for about an hour I started googling to see if anyone else had done it. Thanks to Jesse from Squarefree.com for this, I modified it slightly so that the text inputs it creates to replace the hidden ones actually submit regularly with the form :)
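For anyone curious what bookmarklets like these boil down to, here is a minimal sketch of the three ideas. To be clear, this is NOT the code behind the bookmarklets above, just an illustration with function names borrowed from them. Each function takes the document as a parameter (my own choice, so the logic can be tried outside a browser), and setting `type` directly is an assumption that may not hold in older browsers.

```javascript
// Remove the maxlength restriction from every input that has one.
function noMax(doc) {
  doc.querySelectorAll("input[maxlength]").forEach((el) => {
    el.removeAttribute("maxlength");
  });
}

// Flip every form between GET and POST.
function methodToggle(doc) {
  doc.querySelectorAll("form").forEach((form) => {
    form.method = String(form.method).toLowerCase() === "post" ? "get" : "post";
  });
}

// Turn hidden inputs into visible, editable text inputs.
function hidden2text(doc) {
  doc.querySelectorAll('input[type="hidden"]').forEach((el) => {
    el.type = "text";
  });
}
```

As a bookmarklet, each of these would be wrapped up as something like javascript:(function(){ ... })() and run against the page's own document.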

Tuesday, May 09, 2006

Cookie Hacking Bookmarklet

A little while back Chris Shiflett posted on his blog about a quick and dirty way to modify cookies on-the-fly for pen testing web apps.

He talked about having to manually escape the data for it to work, and soon after Mike Willbanks commented about using a javascript prompt to simplify the method.

Here's a little bookmarklet I wrote for this:


Happy Hacking!

Monday, May 08, 2006

Weekend Pen Test.

As I mentioned on Friday a friend came to visit this weekend. While he was here we were asked to do some basic penetration testing on a website. We found a few interesting things that I'd like to discuss here.

First of all, when can web statistics applications become an attack vector? If your web statistics application lists accessed URLs or referers, there is a potential risk of information disclosure when passing sensitive data through GET variables. The worst case scenario would be a login form that passes the username and password as GET variables, which are recorded as accessed URLs in the web statistics output. Most of the really popular examples (Webalizer, AWStats, Modlogan) do not record GET variables as part of the URLs, but even this does not completely eliminate the problem. In certain circumstances the use of mod_rewrite can cause GET variables to mask themselves as paths. In this case, they would show in your stats. This has been confirmed with the three examples of stats software given.

Next, if you're escaping characters manually DON'T FORGET TO ESCAPE THE BACKSLASH. Now, I know at first it doesn't seem like there's much that can be done with a backslash but here's the only reason you'll ever need.

after escaping:
' becomes \'
effectively removing any syntactical relevance from the singlequote for any back end processing. BUT:
\' becomes \\'
Now, the first backslash escapes the second backslash and the singlequote is left alone.
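The difference is easy to see in a couple of lines. This is a sketch with made-up function names: the first version escapes only the single quote, the second escapes the backslash first and then the quote.

```javascript
// Escapes the single quote only; a backslash supplied by the user slips through.
function naiveEscape(input) {
  return input.replace(/'/g, "\\'");
}

// Escapes the backslash first, then the quote, so neither keeps its meaning.
function saferEscape(input) {
  return input.replace(/\\/g, "\\\\").replace(/'/g, "\\'");
}

const attack = "\\'"; // two characters: a backslash followed by a single quote
console.log(naiveEscape(attack)); // backslash, backslash, quote: the quote is live again
console.log(saferEscape(attack)); // backslash, backslash, backslash, quote: still escaped
```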

Last but not least, NEVER EVEN LOOK at a file whose name comes from a user defined variable unless you've verified the filename is good. Often developers will check if the file exists first, and only then check its relevance before opening the file.

if (file_exists($filename)) {
    switch ($filename) {
        case "1.txt":
            fopen($filename, 'r');
            break;
        case "2.txt":
            fopen($filename, 'r');
            break;
        default:
            echo "invalid file, back off hacker.";
    }
} else {
    echo "file doesnt exist";
}

This gives an attacker the ability to verify the existence of any file on the server (within the context of the webservers permissions.)
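The fix is to reverse the order: check the name against a list of known good names first, and only then touch the filesystem. A sketch of that ordering follows, in JavaScript rather than PHP. `resolveRequestedFile` and the injected `exists` callback are my own inventions for illustration (under Node you might pass fs.existsSync). Both failure cases deliberately return the same message so the response can't be used to probe for files.

```javascript
const ALLOWED_FILES = new Set(["1.txt", "2.txt"]);

// Validate the requested name BEFORE asking the filesystem anything about it.
function resolveRequestedFile(filename, exists) {
  if (!ALLOWED_FILES.has(filename)) {
    return { ok: false, message: "invalid file" };
  }
  if (!exists(filename)) {
    // Same message as above, so a missing file is indistinguishable from a bad name.
    return { ok: false, message: "invalid file" };
  }
  return { ok: true, path: filename };
}

console.log(resolveRequestedFile("../../etc/passwd", () => true));
console.log(resolveRequestedFile("1.txt", () => true));
```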

Seasonally Exclusive Environmental Incompatibility

Hay fever.

Late April and early May seem to be the worst for me. I've read that in April and May the offending pollen is from trees, from May to July it's from grass, and after that it's mostly ragweed pollen.

I'm considering trying an old homeopathic remedy. Apparently, chewing honeycomb from local beehives helps build an immunity to local pollen. I'll post an update on how it works :)

Friday, May 05, 2006

Geeks Love Geek Company.

One of my oldest, most haxerish friends is coming for a visit. He'll be here in a few hours.
I'm QUITE excited.

Thursday, May 04, 2006

01:02:03 04.05.06

Thanks to Mikko from F-Secure for this one; I might never have noticed otherwise:

An hour and two minutes after midnight tonight the clock will tick: 01:02:03 04.05.06

This won't happen again for another hundred years.


MSN Display Pic Recovery

Have you ever noticed that once you use an image as your display pic in MSN Messenger, it stays in the display pic list even after the image has been deleted? Well it does, and this entry will explain how to recover them. (FYI, these images are modified to fit msn's display pic window.)

Today I decided to put a picture of myself in my blogger profile. Unfortunately the one I wanted to use had been inexplicably deleted. The only place that picture still existed was in MSN Messenger's list of recently used display pictures. So I fired up Mark Russinovich's brilliant tool filemon (sysinternals.com) and had a quick look at where these pictures were being retrieved from.

As illustrated, the images are being called from :

C:\Documents and Settings\USERNAME\Application Data\Microsoft\MSN Messenger\1385319040\UserTile

The files in that directory are as follows:

04/04/2006 01:31 AM 7,652 TFR2C4.dat
04/04/2006 01:12 PM 11,357 TFR2D9.dat
04/04/2006 04:16 PM 7,238 TFR2F0.dat
04/04/2006 04:17 PM 10,316 TFR2F2.dat
03/31/2006 04:46 PM 15,663 TFR38.dat
04/05/2006 02:28 PM 12,713 TFR48.dat
04/05/2006 02:29 PM 12,631 TFR4A.dat
03/31/2006 05:07 PM 16,388 TFR51.dat
03/31/2006 10:28 PM 17,312 TFR69.dat
04/29/2006 06:09 PM 20,672 TFRAA.dat
25 File(s) 369,659 bytes
2 Dir(s) 20,108,455,936 bytes free

There are 25 files in this directory, 24 of which are in the format TFR[2]xx.dat where xx is a hexadecimal number, and 1 called map.dat. This corresponds nicely to the fact that there are 24 images in my display pics list. In hopes that these files might just be backups of the profile images, I run a strings analysis against them.

Strings v2.2
Copyright (C) 1999-2005 Mark Russinovich
Sysinternals - www.sysinternals.com



Note the first recognizable ascii string in the file is PNG, which is a well-known image format.
Renaming the file to image.png and opening it in a PNG compatible image viewer confirms that these ARE in fact PNG files, as it properly loads one of my profile pics.

In order to see what the number 1385319040 represents, I search the registry for any reference to it:

the string is found here:


This path holds keys representing settings for my particular msn account. One of these keys, called "MessageLogPath" actually contains my msn username. This is useful. Using this information we can recover images.

We open regedit, navigate to:


and run a search for our msn name. If it locates our name within that path, we will have the number corresponding to our passport profile. We can then go to

C:\Documents and Settings\USERNAME\Application Data\Microsoft\MSN Messenger\thatnumbergoeshere\usertile

and start renaming tfr*.dat files to .png files, there appears to be no order to how they are displayed in msn.
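Renaming two dozen files by hand gets old fast, so here is a rough Node sketch that does the same thing in bulk. It copies rather than renames, so the originals are left alone, and it only picks up files that actually start with the PNG signature. `recoverTiles` and the output folder are my own inventions; the directory you pass in is whatever UserTile path you found above.

```javascript
const fs = require("fs");
const path = require("path");

// The 8-byte signature every PNG file starts with.
const PNG_MAGIC = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Copy every TFR*.dat file that is really a PNG into outDir with a .png extension.
function recoverTiles(tileDir, outDir) {
  const recovered = [];
  fs.mkdirSync(outDir, { recursive: true });
  for (const name of fs.readdirSync(tileDir)) {
    if (!/^tfr.*\.dat$/i.test(name)) continue; // skips map.dat and anything else
    const source = path.join(tileDir, name);
    const head = fs.readFileSync(source).subarray(0, PNG_MAGIC.length);
    if (!head.equals(PNG_MAGIC)) continue; // not a PNG, leave it alone
    const target = path.join(outDir, path.basename(name, path.extname(name)) + ".png");
    fs.copyFileSync(source, target);
    recovered.push(target);
  }
  return recovered;
}

// Example call (both paths are placeholders):
// recoverTiles("C:\\path\\to\\usertile", "C:\\recovered-tiles");
```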

Automated Script Injection, Additional Applications.

I've been thinking a lot lately about samy's myspace worm. A few weeks before samy's worm started making headlines a friend and I had designed something similar for a browser based sci fi mmorpg. In the game you role-play the part of a merchant ship captain, traveling from planet to planet buying and selling commodities. When docked at a planet the browser based interface displays, among other things, a list of other ships also docked there. After a little bit of hacking around we realized that the HTTP POST form that allows a player to change the name of his ship was handled insecurely.

  • A) the handling for the form was not POST strict, meaning that POST and GET variables are accepted interchangeably. This is not in and of itself a vulnerability, however it can provide an attacker with unnecessary flexibility.
  • B) the developer doesn't understand how to securely cleanse user input. Rather than stripping out or escaping non-alphanumeric characters (which are really unnecessary for naming a ship) the developer attempts to detect and modify 'bad input'. For instance, the string <script> is detected and neutered, becoming script>. This system is flawed. Due to the nature of how the modification is processed the string <<script> will be modified to: <script>. If you don't get why, read it a couple of times. The real problem though is that 'bad input' refers to an ever-changing array of variable length. We will never know every possible 'bad input' and therefore cannot effectively detect it. As a rule, force good input rather than detecting bad input.
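Point B is easier to see with the two approaches side by side. This is a sketch only: the game's actual filter isn't shown here, so `naiveFilter` is my reconstruction of the behaviour described above, and `allowlistName` is one possible way of forcing good input.

```javascript
// Reconstruction of the flawed approach: find <script> and neuter it in a single pass.
function naiveFilter(name) {
  return name.replace(/<script>/gi, "script>");
}

// Forcing good input instead: keep letters, digits and spaces, drop everything else.
function allowlistName(name) {
  return name.replace(/[^A-Za-z0-9 ]/g, "");
}

console.log(naiveFilter("<script>"));    // neutered as intended
console.log(naiveFilter("<<script>"));   // the leftover < rebuilds the tag
console.log(allowlistName("<<script>")); // nothing dangerous survives
```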

We modified our ship's name to include javascript code that, when executed, would append 'assimilated by borg' plus the javascript itself to the end of the shipname of the user viewing it. Because the name is displayed in the browser of every user docked at the same planet as you, it spread virally across the virtual galaxy.

Though samy's methodology was superior to mine in its technical hackery, the concept is similar. Both he and I used features of the system we were attacking (myspace for samy, the sci fi rpg for me) to automate an attack through persistent script injection vectors. It would have been easy for samy to use his code maliciously. Harvest email addresses, ruin reputations, even systematically delete accounts. Just as I could have used my code to cheat at the game (that's right, could have.)

These are all good reasons to pay more attention to xss, but I've been thinking about another application for automated script injection. Considering the growing popularity of social networking communities such as myspace and livejournal (there are plenty more) it would be trivial for marketing agencies to use viral xss for very fast, very large scale market research projects. As an example, when we were playing with the sci-fi mmorpg we decided to try and get an idea of how many users were using IE vs mozilla or safari. Along with our code we injected an <img> with the src pointing to a php script we had written which grabbed the user agent and stored it in an sql db. It wouldn't be difficult to chart where the users were located geographically based on hostnames. This is just scratching the surface, but you see where I'm going. An interesting point to take note of is that the system is being attacked rather than the users, making it impossible or at least very difficult for users to protect themselves. The nature of these community sites makes the propagation of these xss worms very fast. I would be surprised if someone doesn't take advantage of this for monetary gain soon.

Monday, May 01, 2006

Pressure Crack.

I know, I know... a really late start for the blog fad.

I held off for as long as I could, but a lot of people were suggesting I keep a journal of all my nerd work. So I signed up for blogger. I'm actually pretty interested to see what neat little css tricks I can play with the layout template I picked. I'm expecting this to be a fun little archive of trixy little hacker bits. Here's hoping!