Wednesday, June 14, 2006

A Study in Reverse Code Engineering (RE)

For quite a while I've wanted to learn RCE. The ability to learn the subtleties of how an executable operates through runtime disassembly is an art and an incredible challenge. I've learned a few things that I think will prove useful to other novice reversers, and as such I will be documenting them here. This is not a tutorial. Its just an explanation of a reverse engineering task I've been hacking at. We're going to try to cheat at poker!

One point I want to emphasize is that I know very little assembly. I'm learning as I go. If you spot any mistakes, or incorrect assumptions I'd love for you to point them out. I could use the help! :)

I think one of the coolest things I learned in the process of this RCE project is how easily certain parts of the program can be recognized in the disassembler even with very little knowledge of or experience with x86 assembly.

Okay, the program I reversed is Net Poker by netintellgames. (http://www.netintellgames.com/poker.htm) The only tool I used is a shareware disassembler/debugger called Ollydbg (http://www.ollydbg.de/).

The first step to any task is to define the task. You should always approach a problem with a solid understanding of what you're trying to do. Netpoker is a 2 player poker game. You can play either against the computer, or against an opponent. If you're playing against an opponent one host acts as the server, the other connects over the network. This is interesting to me. If there are only 2 parties, in order to keep the deck of cards true at least one player will have to know what cards remain in the deck after some have been dealt to the other player. This means that, theoretically, at least one host will be able to determine which cards have been dealt to both hands. Another interesting thing to note is that playing against the computer is functionally identical to playing against a human. When playing against the computer a second program executes invisibly and connects to you as a 'netpoker bot'. Our task, then, will be to try to determine which cards are in the opposing players hand. (Also, I may have been able to figure all this out quicker by monitoring network communications but thats not what I'm trying to learn :P.)

Okay, so I open up netpoker in olly. The first problem for me is always trying to get into the meat of the program with the debugger. There's probably a lot of tricks for doing this, but here's what I came up with for this project. Various sounds are played throughout the game. An easy way for applications to play audio like this is to make a call to the function sndPlaySoundA in the winmm.dll module. This is relevant because every time a card is dealt, a particular sound is played. So, locating calls to this module within netpoker might be a good way to find the function for dealing cards within the code listing.

I right click in the dissassembler panel in olly and select "search for>intermodular calls". A list of all intermodule calls is displayed and I locate all calls to winmm.sndplaysounda and use f2 to set breakpoints on these calls. Then I execute the program by clicking play. A few dialog windows pop up asking me for my name and whether I want to serve or connect to a server. I choose to be the server and get to the main game screen. The status bar suggests I should press f2 to play against the computer. I do so, and a ringing sound is played triggering my first breakpoint. This is what the code looks like:

00403C8E PUSH poker.00430144                      ;  ASCII "start.wav"
00403C93 CALL DWORD PTR DS:[<&WINMM.sndPlaySoundA>;  WINMM.sndPlaySoundA


The value "start.wav" is pushed onto the stack as a parameter for the function sndPlaySoundA which is called as the next instruction. This is not the sound we're looking for, we're looking for the sound played when a card is dealt, so we press the play button again to resume execution. Oddly, all the cards are dealt, the sound is played for each card, but no breakpoints are triggered. After all the cards are dealt another sound is played and here a breakpoint is triggered.

00407806 PUSH poker.00430240                      ;  ASCII "money.wav"
0040780B CALL DWORD PTR DS:[<&WINMM.sndPlaySoundA>;  WINMM.sndPlaySoundA


Alright, so at this point I had no idea how the card dealing sound could be playing without triggering my breakpoints, but one thing was for certain, the code I was looking for was between the two breakpoints that were triggering. So I restart the program and play till the start.wav breakpoint is triggered. I start stepping through the program here, but this gives me a bit of trouble. During execution millions of instructions are processed per second. I'm trying to walk through them one at a time. That could take forever, considering I don't really know how many instructions there are between the opening sound and the actual cards being dealt. So I try a different approach. I take a closer look at the two wav files being called:

00403C8E PUSH poker.00430144                      ;  ASCII "start.wav"
00407806 PUSH poker.00430240                      ;  ASCII "money.wav"


The addresses that the filenames are being called from (00430144/00430240) are relatively close in proximity. Could all the wav filenames be there?

00430240  6D 6F 6E 65 79 2E 77 61  money.wa
00430248  76 00 00 00 63 61 72 64  v...card
00430250  2E 77 61 76 00 00 00 00  .wav....
00430258  3A 00 00 00 64 72 61 77  :...draw
00430260  2E 77 61 76 00 00 00 00  .wav....
00430268  6C 6F 73 65 2E 77 61 76  lose.wav
00430270  00 00 00 00 77 69 6E 2E  ....win.
00430278  77 61 76                 wav


NOW we're getting somewhere. Card.wav looks like it might be the sound we're looking for. This is easily verified. All the wav's are in the game folder.

Directory of C:\Program Files\NetIntellGames\Net Poker 4

06/13/2006  12:40 AM              .
06/13/2006  12:40 AM              ..
08/01/1992  07:26 AM               676 card.wav
06/12/1992  08:46 AM            22,688 draw.wav
...


So I search the code for references to card.wav and come up with two:

004078B4 PUSH poker.0043024C                      ;  ASCII "card.wav"
004078B9 CALL DWORD PTR DS:[<&WINMM.sndPlaySoundA>;  WINMM.sndPlaySoundA

00407915 PUSH poker.0043024C                      ;  ASCII "card.wav"
0040791A CALL DWORD PTR DS:[<&WINMM.sndPlaySoundA>;  WINMM.sndPlaySoundA


Unfortunately, both of these already have breakpoints, from when I BP'd all calls to sndplaysounda. Its starting to look grim, but I get another idea. poker.0043024c is the address where the card.wav filename is located. So I search the code for references to THIS and voila, GOT one: 0040633E PUSH poker.0043024C. It turns out that it was just a problem with olly. When using "search for>all referenced text strings" card.wav is only found twice, but when actually searching for references to 0043024C, its found three times.

I also now know why I didn't find the call to winmm.sndPlaySoundA. Its not calling it directly. Here is what I mean:

00406282 MOV EBP,DWORD PTR DS:[<&WINMM.sndPlaySou>;  WINMM.sndPlaySoundA
...
0040633E PUSH poker.0043024C                      ;  ASCII "card.wav"
00406343 CALL EBP


The address for winmm.sndPlaySoundA is loaded into EBP, and then EBP is called. trixy ;).

Now, after adding a breakpoint to the CALL EBP that actually plays the sound, the program pauses after every card dealt. this happens 10 times before the the program continues execution. This is definately where we want to be in the code. Now here is what I was talking about when I said its cool how little assembler you have to know to get a feel for whats going on in a program. I start using F8 to 'step' through the program one instruction at a time. I quickly notice that i'm in a loop that flows something like this: The loop would iterate 10 times executing one series of instructions the first time, another series of instructions the second time, the first series the third time, the second series the fourth and so on. The only reason I spotted this is just by watching how the flow of execution moved on the screen. everytime I hit f8, the interface would highlight the next instruction... And it was moving through this loop switching back and fourth between each subsection of instructions. If you're having trouble following that, try it for yourself. Open up netpoker in olly, find 00406343 and add a breakpoint to it with F2, now press play. When it breaks just start pressing F8 over and over and over, you'll see what I mean by recognizing the loop without recognizing the commands.

I figure whats happening here is the first time through, a card is being dealt to one hand, the second time through the card is being dealt to another hand, etc, etc. The code follows, I've seperated it with dashed lines.

[This is the beginning of the loop]------------
00406291  MOV EAX,DWORD PTR DS:[433EE4]
00406296  AND EAX,0FF
0040629B  SUB EAX,0                               
0040629E  JE SHORT poker.004062D5
004062A0  DEC EAX
004062A1  JNZ SHORT poker.0040630A
[this is executed the first time through]------
004062A3  XOR EAX,EAX                          
004062A5  MOV AL,BYTE PTR DS:[433EE5]
004062AA  MOV DL,BYTE PTR DS:[EAX*2+433DCF]
004062B1  LEA ECX,DWORD PTR DS:[EAX+EAX*2]
004062B4  XOR EAX,EAX
004062B6  MOV BYTE PTR DS:[ECX*4+433D3D],DL
004062BD  MOV AL,BYTE PTR DS:[433EE5]
004062C2  LEA EAX,DWORD PTR DS:[EAX+EAX*2]
004062C5  MOV BYTE PTR DS:[EAX*4+433D3C],BL
004062CC  MOV BYTE PTR DS:[433EE4],0
004062D3  JMP SHORT poker.0040630A
[this is executed the second time through]-----
004062D5  XOR EAX,EAX                           
004062D7  MOV AL,BYTE PTR DS:[433EE5]
004062DC  MOV DL,BYTE PTR DS:[EAX*2+433DC5]
004062E3  LEA ECX,DWORD PTR DS:[EAX+EAX*2]
004062E6  XOR EAX,EAX
004062E8  MOV BYTE PTR DS:[ECX*4+433CAD],DL
004062EF  MOV AL,BYTE PTR DS:[433EE5]
004062F4  LEA EAX,DWORD PTR DS:[EAX+EAX*2]
004062F7  MOV BYTE PTR DS:[EAX*4+433CAC],BL
004062FE  MOV AL,BYTE PTR DS:[433EE4]
00406303  INC AL
00406305  MOV BYTE PTR DS:[433EE4],AL
[the rest is executed everytime]---------------
0040630A  MOV CL,BYTE PTR DS:[433EE4]            
00406310  MOV AL,BYTE PTR DS:[433EE7]
00406315  CMP CL,AL
00406317  JNZ SHORT poker.0040631F
00406319  INC BYTE PTR DS:[433EE5]
0040631F  MOV CL,BYTE PTR DS:[433EE6]
00406325  INC CL
00406327  MOV BYTE PTR DS:[433EE6],CL
0040632D  MOV ECX,EDI
0040632F  CALL poker.00405520
00406334  CMP DWORD PTR DS:[EDI+368],EBX
0040633A  JNZ SHORT poker.00406345
0040633C  PUSH 1
0040633E  PUSH poker.0043024C                    
00406343  CALL EBP
00406345  MOV EDX,DWORD PTR DS:[EDI+358]
0040634B  MOV ECX,poker.00433CA0
00406350  PUSH EDX                               
00406351  CALL poker.00402380                    
00406356  CMP BYTE PTR DS:[433EE6],0A
0040635D  JNZ poker.00406291


Now, I understand a bit of the code in here. For instance the various jumps (jmp) are controlling the flow of this loop sometimes based on conditions (je, jnz). But to be honest, I couldn't just look at this code and tell you what it was doing. As I approach this next piece of the puzzle, I do so with the assumption that the section of code executed the first time through the loop represents dealing a card to the computers hand, and the section of code executed the second time through represents dealing a card to my hand. Since I can see my hand to begin with, I focus on the section that deals me a card.

004062A3  XOR EAX,EAX                          
004062A5  MOV AL,BYTE PTR DS:[433EE5]
004062AA  MOV DL,BYTE PTR DS:[EAX*2+433DCF]


I don't really know what the xor's purpose is, but the next two instructions are moving values into the low ends of EAX and EDX. So, my next step is to run through the loop a few times watching these registers in the registers panel in olly.

Iteration EAX EDX   (EDX in Decimal)
-----------------------------------------------
1         0   29    (41)
2         0   0C    (12)
3         1   31    (49)
4         1   2F    (47)
5         2   02    (02)
6         2   1A    (26)
7         3   27    (39)
8         3   31    (49)
9         4   11    (17)
10        4   16    (22)


It looks like EAX is acting as a counter for the loop, which isn't surprising as this register is also called the Accumulator. EDX, however, is a little more interesting. I've included the values in decimal to illustrate a point. All these numbers are between 1-52. Coincidence? Doubtful. Assuming those numbers DO represent the cards in both hands, our next step is to determine HOW they represent cards. Sort of a miniature reversing of a dataset's schema. In order to do this, we will look at the values we have, and what we are assuming they represent:

My hand           EDX in Decimal
------------------------------------------
4    of hearts    12
King of diamonds  47
8    of clubs     26
Ace  of spades    49
7    of clubs     22


Luckily there is enough data in this one hand to map out the format of the entire dataset. The following table represents this format

           10          20           
1234 5678 9012 3456 7890 1234 5678 
2222 3333 4444 5555 6666 7777 8888 
scdh scdh scdh scdh scdh scdh scdh 

 30          40           50 
9012 3456 7890 1234 5678 9012
9999 0000 jjjj qqqq kkkk aaaa
scdh scdh scdh scdh scdh scdh 


There we go, This means that by monitoring the value stored in EDX we can effectively determine the cards as they're dealt. PHEW. Now I need a rest :)

1 comment:

Anonymous said...

Interesting post. I'm also reversing code, mostly malware.

XOR EAX, EAX sets the EAX register to 0. XOR REG, REG is often used to set registers to 0, as it takes only 2 bytes (33 C0) instead of 5 bytes for MOV EAX, 0 (B8 00 00 00 00).

I just started a blog, reversing will be one of the topics: http://didierstevens.com

Have you heard about the IDA Pro disassembler: http://www.datarescue.com/idabase/
There's also a freeware version: http://www.programmersheaven.com/zone5/cat460/37637.htm

Didier