Unknown processor reverse engineering in a single program







TL; DR: we reverse-engineered a program written for a completely unknown CPU architecture without any documentation on the CPU (without an emulator, without ISA, without everything) in just ten hours. From the article you will learn how we did it ...



Last weekend, the CMU PPP team and I participated in the Dragon Sector Teaser CTF 2019 team to relax and move away from the hard deadline of CHI 2020 . Dragon Sector is a respected Polish team with a history of interesting CTFs , so I was wondering what they had in stock.



Having solved “ummmfpu”, a task that included reverse engineering of bytecode for the Micromega uM-FPU floating point coprocessor, I decided to participate in a competition to solve the CPU Adventure problem, which at that time was not solved by any of the teams (as a result we were the only ones who completed the task).



Here is a description of the CPU Adventure task:



My grandfather in the 60s was engaged in the development of computers. Putting things in order in his attic, I found a strange car. Next to the machine was a stack of punch cards marked “Dragon Adventure Game”. After some time, I managed to connect it to modern equipment, but the game is too complicated and I can not get to the end without cheating. Can you help me? I am enclosing a transcription of punch cards used in the machine. It is claimed that the machine has 4 general-purpose registers, 1 kibibyte of data memory, and 32 kibibyte of command memory. To play the game, connect to the server as follows: socat tcp4-connect:cpuadventure.hackable.software:1234 fd:0,rawer



Hint: the processor of the machine is unique, do not try to google information on it.



game.bin


Having connected to the server, we received the following information:



THERE IS A TAVERN HERE. INSIDE THE TAVERN, YOU SEE VALIS.



SELECT AN OPTION:



- GO (N)ORTH

- GO (E)AST

- (T)ALK TO VALIS

- (D)RINK

- SHOW (I)NVENTORY



YOUR CHOICE:






Fine. This is an old-school adventure game. After playing it a little, we realized that you can fight enemies and get a flag from this Valis character if we please him:



YOUR CHOICE: T



YOU ENTER THE TAVERN AND APPROACH VALIS.



- HEY, I WAS WONDERING IF YOU COULD HELP ME FIND THE FLAG?

- THE FLAG? MAYBE, BUT FIRST, I NEED A REDBULL.

- I... I DON'T HAVE A REDBULL.

- WELL THEN, MAKE YOURSELF USEFUL AND FIND ONE.



THERE IS A TAVERN HERE. INSIDE THE TAVERN, YOU SEE VALIS.



SELECT AN OPTION:



- GO (N)ORTH

- GO (E)AST

- (T)ALK TO VALIS

- (D)RINK

- SHOW (I)NVENTORY



YOUR CHOICE:






First steps



I didn’t play the game for very long, realizing that it is most likely more important to reverse engineer the game.bin



file. I opened it in a hex editor, expecting to see binary values. Imagine my surprise when I saw this:



  110011111101000000111100110010001110000011001101000000000000110010011101010000001101001111100001111111001100111000000011 ... 




This is literally a binary file - a text file that contains nothing but the ASCII characters 1 and 0. We know that this is most likely the machine code of the processor, but in addition to having 4 registers, 1 kbyte of data memory and 32 kibibyte of instruction memory, nothing is known about it. Therefore, our first serious task will be to determine the unit size of this binary file (for example, does it have an 8-bit dimension? Or, perhaps, does it have a 12-bit or 18-bit dimension, as in some ancient architectures ?)



To find out the size of an unknown file, I used an old trick - I resized the text box until the line wrapping length became aligned. This method is great for things like multiple XOR-encrypted text, unknown (uncompressed) file formats, and code from an unknown CPU:









Change the size of the text box



From this quick check, I found out that the unit size of this file should be a divisor of 20 (the width of the aligned window). To find out the exact size, I quickly wrote a script looking for long repeating lines (assuming that any code has repeating stereotyped sequences). The longest repeating line was the following 425-bit block, found at 43625 and 44510:



  10000011111110000001010100011111110100000101100010111000001001000101000100001000100001010001011000101000000001111111111100010000011110010100100001010100111100000110000010100000101000101000011110001111001101111001010100001010000111110100001010000110010011011110011111000000111011101000000001100000110000111101011010111011000100100010100000111000100011100011000000000101010101100010111000001010000001101010010000000011000001100 


Since the distance between repetitions is 885, we came to the conclusion that the dimension should be 5 bits, i.e. unknown CPU must have 5-bit "bytes". Progress!



We searched for 5-bit punch card encodings and quickly settled on the old encoding with the Baudot code . And in fact - when we used the online decoder to decode some fragments, we got readable text!



⇩DRAGON⇩HERE⇧; ⇧⇧⇩SHE⇩APPEARS⇩TO⇩BE⇩GUARDING⇩ SOME⇩KIND⇩OF⇩A⇩BOTTLE⇧; &. &. ⇩␀THERE⇩IS⇩A⇩B



Baudot ITA 1 425-bit extended with LSB


Then we tried to decode the entire file using Bodo code, but in about the first 20 thousand bits we got garbage, after which there was a perfectly readable text. This made it clear to us that the first part of the file belongs to the “code” section, followed by the “data” section, which contains constant lines. We hypothesized that the machine probably uses Bodo code for I / O, and therefore stores constant lines in Bodo encoding in memory as well. To make the code segment more readable, I decided to encode it using base-32 encoding, similar to hexadecimal encoding, but expanding it with the alphabet 0-9a-v. Here is the game.bin file with the base-32 encoded first part and the second part decoded by Bodo (the full file is posted here: game.b32 ):



pv83pi70pk00p7a0qfgvpjg3f0kf13f28p5f3pv10pk40pn60f0sf1sf24p5f3r9c11qad0f0sf1df26p5f39c21qad0f05f1ff26p5f39c41qad0f08f1df26p5f39c81qad0f0hf1ef26p5f3r1c00qaq15c20qcl0f01f1of27p5f3p3g3psf35c10qal0f02f1nf27p5f3p3g3psf3rf0hf1nf27p5f3f05f16f27p5f3rf84f95101311fl0f510f84907qa40b518447qa40b514f84f95k9m0k9m0k9m0907qa40b511447qa40b512ruougf10f20g0i9g0i910931b320u2u1u0ro9f0o9f0ojh0o9f0o9f0o9f0olj0o9f0o9f0o9f0o9f0o9f0o9f0o9f0o9k1onp0o9f0o9f0o9f0o9f0onf0ot82odi0o9f0o9f0o9f0o9f0o9f0o9f0o9f0olg0o9f0f0gf1df24p5f3r9c11qa835548755

[...]

93e9n59ka8fo87r85g8ui8ml8ed87b9h89u291u82333333333456789abcdb01234567892)%c3BOTTLE OF SOPLICA PIGWOWAENEMY HEALTH:



YOU ATTACK



YOU APPROACH REDFORD.







YOU ENTER THE TAVERN AND APPROACH VALIS.

[...]






For simplicity, I will call five-bit units “bytes” below. Between each other in the team, we came up with no other names - I called them kibbles, and Zak - hacks.



Unknown processor architecture reverse engineering



Now we are getting down to the difficult part - reverse engineering 4000 bytes of code that runs on a completely unknown unique CPU architecture. From the code, it’s obvious enough that this should be a set of instructions of variable length , because it is impossible to find a noticeable stable repeating pattern in it. I spent several hours on this, later my team member Zachary Wade (@ zwad3) began to help me. First of all, I started repeating pieces of code, suggesting that they would often use a small number of instructions. I began to split the code into shorter repeating sequences that were more convenient for analysis. I would like to say that it was a rigorous process, but in fact, the following vague algorithm was mainly used:





For example, one of the patterns I discovered was “0f0.f”, where “.” Stands for an unknown character. I broke the string on this pattern and got the following:



pv83pi70pk00p7a0qfgvpjg3f0kf13f28p5f3pv10pk40pn60f0sf

1sf24p5f3r9c11qad0f0sf

1df26p5f39c21qad0f05f

1ff26p5f39c41qad0f08f

1df26p5f39c81qad0f0hf

1ef26p5f3r1c00qaq15c20qcl0f01f

1of27p5f3p3g3psf35c10qal0f02f






Very useful! From the second and third lines we see that there is “... p5f3r9c ...” and “... p5f39c ...”, and this hints to us that “r” is a single-byte opcode, that is, “... 5f3” is the end of one opcode, and “ 9c .. ”is the beginning of another. In the last two lines we see “p5f3r1c ...” and this means that “1c ..” is another opcode, and “p3 ...” is also another opcode.



I continued to split instructions over and over in a similar fashion, using the similarities and differences of similar blocks to find likely instructions. In the end, I got something like this:



pv83

pi70

pk00

p7a0

qfgv

pjg3

f0k

f13

f28

p5f3

pv10

pk40

pn60

f0s

f1s

f24

p5f3

r

9c11

qad0

f0s

f1d

f26

p5f3

9c21

qad0

f05

f1f






I assumed that “p” and “q” were instructions with three bytes of operands, “f0”, “f1” and “f2” were instructions with one operand, and “9c” was an instruction with two operands. However, I did not know what each of these instructions is.



Therefore, I turned to the catalog of all the “p” instructions that I highlighted, and found that at the moment the most frequent instructions with “p” were “p5f3”. Moreover, looking at the places where it occurs, I found that it is always preceded by the instructions “f0”, “f1” and “f2”. Looking at all the operands “f0”, “f1” and “f2”, I noticed that the operands f2 are always in the range 4-8. Remembering that the CPU has 32 kb of program memory for addressing which requires 15 bits, I assumed that “f0”, “f1” and “f2” loaded some address with f2 as the high byte. When I connected some of these addresses together, I found that they all point exactly to the beginning of the constant lines in the data section. I found the print



function! It immediately followed that “p5f3” is actually some kind of instruction to print a line or call; If you take into account the three-byte operand, then most likely it is a “call”. Again, looking at the “p” instructions, I realized that the three bytes of the operand indicate the address in direct order (little-endian) , that is, the last byte of the operand is the highest byte of the address.



It was a huge breakthrough! We have identified our first instruction. Having seen that “f0” and “f1” are used in some other places, I assumed that they do not load addresses, but one of four registers (for example, f0 loads register 0) with 5-bit constants with direct addressing. This is logical for p5f3 — it loads three register arguments for the 3f5 function (“print_string”).



I started writing a disassembler that recognizes the “print” idiom (f0x, f1x, f2x, p5f3), putting the substitution of the printed line in the disassembled code. Due to the large number of lines in the program, the disassembled code quickly became very readable, and it became easier to find out the location of the function blocks (the full disassembled code is here ):



0: call 38v

4: call 7i

8: call k

c: call a7

g: q vgf



k: call 3gj

o: print 83k # 'SELECT AN OPTION\x0e:\r\n\r\n\x0f\x00'

15: call 1v

19: call 4k

1d: call 6n

1h: print 4ss # '\r\nYOUR CHOICE\x0e: \x0f\x00'

1u: ret



1v: unk 9

20: unk c

21: unk 1

22: unk 1

23: q 0da

27: print 6ds # '\x0e- \x0fGO \x0e(\x0fS\x0e)\x0fOUTH\r\n\x00'

2k: unk 9

2l: unk c

2m: unk 2

2n: unk 1

2o: q 0da

2s: print 6f5 # '\x0e- \x0fGO \x0e(\x0fN\x0e)\x0fORTH\r\n\x00'

39: unk 9

3a: unk c

3b: unk 4

3c: unk 1

3d: q 0da

3h: print 6d8 # '\x0e- \x0fGO \x0e(\x0fE\x0e)\x0fAST\r\n\x00'

3u: unk 9

3v: unk c

40: unk 8

41: unk 1

42: q 0da

46: print 6eh # '\x0e- \x0fGO \x0e(\x0fW\x0e)\x0fEST\r\n\x00'

4j: ret






From this small piece of code, I managed to figure out several aspects: the “q0” instruction should indicate some conditional branching (because it is used to skip printing unnecessary directions in function 1v), and the instructions “9c11”, “9c21”, “9c41”, “ 9c81 ”should indicate some kind of AND instruction - they check the set bits to see if these directions are allowed (this is clearly hinted at by the use of“ 1 ”,“ 2 ”,“ 4 ”and“ 8 ”in these instructions).



Over the next two hours, Zachary Wade (@ zwad3) and I sorted out the various instructions, making and refining our assumptions about what they were doing. Having many readable print statements made our job a lot easier. We decided that each of us individually would write his own disassembler so that we could examine the instructions at our own pace and share our findings.



Code reverse engineering



A few hours later, I began to make great progress in disassembling. After examining the code that worked with the user's inventory (more specifically, the “drink” function and each handler associated with it), we found instructions for saving and loading from memory (do not forget that the CPU has 1 kibibyte of data memory). Then we found out that some arithmetic / logical (ALU) instructions take memory operands (for example, “9c41” means “execute AND with value at data address 1 with value 4”). From this, we were able to recreate the variables in the data memory, for example, in [0] the NPC identifier is stored in the current location, and in [6,7] the player’s current health is stored (the lower 5 bits in [6], the oldest 5 bits in [7 ]). At this point, I moved from reverse engineering instructions to annotating my disassembled code and reverse engineering the program itself. Below is my funny notation for 5-bit values:



_start:

call init



L4:

call check_moves

call print_menu

call handle_command

br 4



print_menu:

call print_itemname

print 83k # 'SELECT AN OPTION\x0e:\r\n\r\n\x0f\x00'

call print_moves

call print_npcmenu

call print_itemmenu

print 4ss # '\r\nYOUR CHOICE\x0e: \x0f\x00'

ret



print_moves:

and 0y1, [1]

brz 2k

print 6ds # '\x0e- \x0fGO \x0e(\x0fS\x0e)\x0fOUTH\r\n\x00'

2k: and 0y2, [1]

brz 39

print 6f5 # '\x0e- \x0fGO \x0e(\x0fN\x0e)\x0fORTH\r\n\x00'

39: and 0y4, [1]

brz 3u

print 6d8 # '\x0e- \x0fGO \x0e(\x0fE\x0e)\x0fAST\r\n\x00'

3u: and 0y8, [1]

brz 4j

print 6eh # '\x0e- \x0fGO \x0e(\x0fW\x0e)\x0fEST\r\n\x00'

4j: ret



print_npcmenu:

add 0y0, [0]

brz 6m

sub 0y2, [0]

br<c> 5p

print 7o1 # '\x0e- (\x0fT\x0e)\x0fALK TO \x00'

call print_npcname

call print_crlf

5p: sub 0y1, [0]

brz 6m

print 7n2 # '\x0e- (\x0fF\x0e)\x0fIGHT \x00'

call print_npcname

call print_crlf

6m: ret



print_itemmenu:

print 7nh # '\x0e- (\x0fD\x0e)\x0fRINK\r\n\x00'

print 765 # '\x0e- \x0fSHOW \x0e(\x0fI\x0e)\x0fNVENTORY\r\n\x00'

ret






We still have many unknown opcodes, for example, although we found out that “qa” is a conditional branching on zero (branch-on-zero, brz), we were not able to understand what “qc” is (indicated above as br <c>). But that was enough to begin to understand the logic of the program.



In fact, the game allows the player to move around an 8 × 8 map on which NPCs (dragons, red bulls, and humans) are randomly placed. You can fight with any NPC (even Valis, despite the lack of a corresponding menu item). During the battle, you can attack the enemy, causing a random amount of damage or miss, after which the enemy attacks the player, also causing a random amount of damage or miss. You can also select a shield lock, thanks to which the enemy either misses or hits the shield without causing damage. Finally, you can cheat by raising your health to 1000, but in this case, the hidden variable (“cheated”, address 10) is set to 1. If the player successfully kills the enemy, an object falls out of him, usually a bottle of some alcohol (obviously, this game is not for children).



The main NPC Valis, from which the player must receive the flag, has a state machine in which he asks the player for several items - a bunch of red bull drinks (obviously obtained when defeating red bull enemies), various mixed drinks (for example, gin and tonic - to get them, you need to defeat the blue dragon and the gray dragon, and then mix the objects dropped out of them), and the power strip, which can be obtained if you defeat another NPC person in the game (Redford) or help him. If you fulfill all this long series of requests, he will give you the flag, but only if the variable “cheated” is not equal to 1. That is, our task is to win the game without cheating. Since we start with just 100 HP, like all enemies, with the usual passage it will be impossible to defeat all enemies (to get all the necessary things you need to defeat about 20 opponents). It is necessary to modify the RNG in some way so that the enemy always misses.



Random numbers are generated by a function that is similar to some kind of PRNG (address 37a), but it uses unique instructions that are not used anywhere else, so we cannot reverse engineer them. However, we noticed that it loads its state vector from three addresses in memory ([11], [12] and [13]), that is, its full state takes only 15 bits. This means that the RNG must have a short period - no more than 2 ^ 15 = 32768 in length.



While we (unsuccessfully) tried to reverse the RNG implementation, Jay Bosamia (@ jay_f0xtr0t) and Matthew Savage (@thebluepichu) implemented an exploit. By simply sending the “shield” command 100,000 times in a row, we were able to get a sequence of enemy “hits” and “misses” corresponding to the bit output from the RNG. We made sure that this sequence repeats with a period of 32767. Thanks to this, we were able to assemble our main exploit - when fighting with the first enemy we met, we closed our shields 40 times to recreate the sequence of hits and misses, looked for this sequence in a large periodic sequence, and then found out when to shield and when to attack so that the enemy always misses. Then we just went around the whole map in murderhobo mode, killing everyone and collecting their loot. At the end, we returned to Valis and politely asked for our flag, which we received:



DrgnS{m4kin9-v4lis-happy-w1th-n4t1ve-b4ud0t-cpu}





Fuh! Indeed, a real adventure. I still can’t fully believe that we came from a binary string and a complete lack of processor documentation to two almost complete disassemblers and a clean disassembled code in less than 10 hours! All the code is available on GitHub: my disassembler , the Zachary disassembler , my raw disassembled code , my annotated disassembled code , Matt's Exploit client .



All Articles