Binary Leetness Walkthrough (DefCon CTF 2007 Qualifiers)
This category was like Pwnage-lite. You had to do a good bit of digging around in assembly, but with an eye towards intended function instead of UN-intended function. This gave it a nice flavor very different from the Pwnage challenges, which tend to focus more on breaking than working with the binary.
100: Do you follow the news?
Based on the "HD-DVD" clue, we didn't even disassemble this binary, we just started out by feeding it the HD-DVD key "09F911029D74E35BD84156C5635688C0". This produced the output "oh nine eff nine", and we dorked around for a little while before realizing that that was actually the answer itself.
$ ./reversing100-5dcea7d1ae5a060472ef5b4155ce77fb
USAGE: ./reversing100-5dcea7d1ae5a060472ef5b4155ce77fb <password>
$ ./reversing100-5dcea7d1ae5a060472ef5b4155ce77fb 09F911029D74E35BD84156C5635688C0
oh nine eff nine
200: Backtrace where?
This binary does a handful of mmap calls, copies itself into the mmapped locations, jumps to them in order, and uses the ud2 opcode to generate an undefined opcode exception. Running it under gdb, we can't help but notice that all four bytes of %eip are in the ascii range. The "bt" command in gdb doesn't give us a valid backtrace, but we can do it manually by inspecting %ebp...
(gdb) run
Starting program: ./reversing200-f742f0774c1ff2a3ec5a0f4b2c9d9019 
5...4...3...2...1...
The key is ready!

Program received signal SIGILL, Illegal instruction.
0x54686520 in ?? ()
(gdb) x/2x $ebp
0xbff7fc80:     0xbff7fca0      0x6b657920
(gdb) set var $ebp = 0xbff7fca0
(gdb) x/2x $ebp
0xbff7fca0:     0xbff7fcc0      0x69733a22
(gdb) set var $ebp = 0xbff7fcc0
(gdb) x/2x $ebp
0xbff7fcc0:     0xbff7fce0      0x73617665
(gdb) set var $ebp = 0xbff7fce0
(gdb) x/2x $ebp
0xbff7fce0:     0xbff7fd00      0x64206569
(gdb) set var $ebp = 0xbff7fd00
(gdb) x/2x $ebp
0xbff7fd00:     0xbff7fd20      0x702c2066
(gdb) set var $ebp = 0xbff7fd20
(gdb) x/2x $ebp
0xbff7fd20:     0xbff7fd28      0x74772122
(gdb)
By taking the %eip from each stack frame that we've walked back and printing them out as an ascii string, we end up with the key.
$ perl -e 'print unpack("Z*", pack("N", hex($_))) for @ARGV' 0x54686520 0x6b657920 0x69733a22 0x73617665 0x64206569 0x702c2066 0x74772122
The key is:"saved eip, ftw!"
We can also using gdb's internal scripting language to manually walk up the stack and print out %eip as ascii:
define print_ascii_eip_frames
  set $a = (long)$eip
  set $b = $ebp
  set $i = 0
  while ($i < ($arg0))
    printf "%c%c%c%c", $a>>24, ($a>>16) & 0xff, ($a>>8) & 0xff, $a & 0xff
    set $a = *((long *)($b+4))
    set $b = *((long *)$b)
    set $i++
  end
  printf "\n"
end
Which give us...
(gdb) run
Starting program: /home/underminers/ctf07/reversing200-f742f0774c1ff2a3ec5a0f4b2c9d9019 
5...4...3...2...1...
The key is ready!

Program received signal SIGILL, Illegal instruction.
0x54686520 in ?? ()
(gdb) print_ascii_eip_frames 7
The key is:"saved eip, ftw!"
(gdb)
300: Know your encryption
Running "strings" shows a chunk of text that appears to be from an in-binary copy of libbzip2:
$ strings reversing300-35d40ed39d529041a3b5e9dedf1e6390
...
bzip2/libbzip2: internal error number %d.
This is a bug in bzip2/libbzip2, %s.
...
So, we assume one of the algorithms is going to be "bzip2". We also guessed that another algorithm will likely be encryption. Since it needs to be reversible and work on arbitrary length data, making it likely to be a symmetrical stream cipher.
After taking a close look at the disassembly of the "main" routine (0x8048dcb), we see two calls being made with what appear to be the contents of the file. The first is 0x804a630, and the second is 0x80491e8.
Running data through 0x804a630, we find that the output is identical to other bzip2 implementations:
...
(gdb) x/50xb $ebp - 65536 + 53108
0xbfba98fc:     0x42    0x5a    0x68    0x39    0x31    0x41    0x59    0x26
0xbfba9904:     0x53    0x59    0x40    0xc8    0xcb    0xc1    0x00    0x00
0xbfba990c:     0x02    0xc8    0x00    0x04    0x10    0x20    0x00    0x20
(gdb) quit
$ python
>>> import bz2
>>> hexText(bz2.compress('11111111\n',9))
'\\x42\\x5a\\x68\\x39\\x31\\x41\\x59\\x26\\x53\\x59\\x40\\xc8\\xcb\\xc1\\x00\\x00\\x02\\xc8\\x00\\x04\\x10\\x20\\x00\\x20\\x00\\x21\\x98\\x19\\x84\\x3b\\x0b\\xb9\\x22\\x9c\\x28\\x48\\x20\\x64\\x65\\xe0\\x80'
This confirms that 0x804a630 is the bzip2 call, and that we only have one more algorithm to discover.
At the start of 0x80491e8, we see that the length argument from "main" is being rounded up to the next whole multiple of 8:
LC080491fd:     movl    16(%ebp), %eax
                and     $0x07, %eax
This seems like an indicator of a 8-byte block cipher. Working from a good list of ciphers we find that blowfish matches (8-byte block, symmetric). Trying "blowfishbzip2", we find that we have the correct answer.
For a more precise description of how to discover blowfish, sk3wlm4st3r sent this:
Following the first function called by main after it parses its options at 08048F66, it calls a second function (sub_8048CCE) at 08048B37. In sub_8048CCE, a decoding loop is run that xors a block of bytes with the value 0x75757575. If you decode the bytes, the first word decoded is 0x243F6A88. If you google the number, most of the hits reference blowfish. Following the code a bit further runs you right into the blowfish key setup which I was lucky enough to be familiar with.
400: TBD
Unfinished! Please send us a write-up if you've got one. :)
500: sk3wlm4st3r is seriously Ninja
During the Quals, 1@stPlace didn't finished BinLeet 400, much less BinLeet 500. It is with great thanks that we present the BinLeet 500 walk-through from sk3wl0fr00t's own sk3wlm4st3r:
The provided binary was reported by file as:
reversing500: ELF 32-bit LSB executable, Intel 80386, version 1 (FreeBSD), dynamically linked (uses shared libs), stripped
Intersting strings: from "strings -a"
   Congratulations. Your decrypted key is: %s
   The Netwide Assembler 0.98.39
The first string hints that the binary will print the key for us. The second string hints that some nasm generated code has been linked in
Firing up IDA on the binary, we locate "main" based on the behavior of "start". main (0x804849c) will be the first function called after the call to __init in this case.
main quickly copies pointers to 6 functions into a stack allocated array (from the table at 0x8048b20). We dubbed these function func_0, func_1, ... func_5 and refered to them collectively as the stage functions.
The program assumes that argv[1] exists, and computes its length which must be 24, it then does is own version of strdup to create a copy of argv[1]. As a simplifying assumption, we assumed that argv[1] was made up of ascii printable characters. We also assume that the output was ascii printable based on the format string above.
main loops three times calling two of the stage functions per pass. The functions are called in pairs, 0/1, 2/3, and 4/5. Each of the functions takes a length and an array of bytes. The length is always 24 and the array of bytes is always the copy that was made of argv[1]. The functions all return a boolean and if any return false, the program terminates.
For understanding, we reversed the entire program back to C which paid off later when we elected to build a brute forcer to solve the challenge. Our source is available in re500.c and func4.asm. Both contain many variations from the original kenshoto code but function essentially the same.
The basic transformation of data through the program is as follows:
     argv[1] -> func_0 -> func_2 -> func_4 -> build_key -> key
                       ^         ^         ^       ^
                       |         |         |       |
FILTERS:             func_1    func_3    func_5  bad_hash
None of the filter functions modify data, they simply verify that certain conditions hold at the completion of various phases.
We studied all of the functions to understand how variations in the input data could affect the output data. It turns out that a given output byte n is dependent on input bytes n, n + 1, and n + 2, and to a lesser extent all other input bytes from 0 through n-1. Dependency on n + 1 comes from func_0, n + 2 from func_2, and 0..n-1 from func_4. The relationship between the input bytes is fairly easy to follow through func_2 where for a byte at index i f2(i) = (f0[i] ^ f0[(i % 24) + 1]) + f2_val[i] and f0[i] = in[i] + in[(i % 24) + 1] + f0_val[i]. All of this makes values out of func_2 easy to compute. The first filter in func_3 is useful here as it requires that f2(i) must be prime when i is prime. We therefore have an opportunity to discard many potential input candidates once we can start computing f2(2), f2(3), f2(5) etc. This part was not difficult, but did not significantly cut the number of valid candidates that made it through func_3.
The func_4 presented the greatest challenge as it contained two stages of self modifying code, and used the outputs from func_2 to index into and select transformation functions from a table of 32 possibilities. These functions performed a wide variety of operations including float to int conversions, trigonometric functions, and byte swapping operations. Many of the functions took additional input in the form of the current loop count, and many of these functions carried results forward from previous iterations. This is why outputs from func_4 become dependent on all of the previous outputs. As a side twist, two of the func_4 subfunctions resulted in the execution of an illegal instruction which would cause the program to terminate. When developing the brute force attack we had to insert logic to prevent these functions from being called.
In order to brute force the key, we needed to be able to run values through func_4 an get the same results as Kenshoto did. To achieve this we elected to strip all func_4 and its related code out of the binary and link it in to our brute forcer. Before we could extract func_4, it needed to be deobfuscated which we did with the IDA script below:
//IDC Script to decode the func_4 stuff
//this implements the decoder loop at loc_8049D01
auto i, addr, val;
addr = 0x08049D2E;
for (i = 0; i < 377; i++) {
   val = Byte(addr);
   val = val ^ 0x4B;
   PatchByte(addr, val);
   addr++;
}
The extracted func_4 then needed to be modified to preserve its exact behavior, even after multiple calls. The modified version can be found in func4.asm.
Once we could push values through func_4, we could utilize the checks in func_5 to further filter down input candidates. As before, the number of candidates that were ascii on the input side and yielded ascii on the output side were huge.
As a result we started working backwards from the location that the key is displayed to the user. We learned that the output from stage 4 gets hashed and must hash to a specific value. If the hash matches a predetermined value, the buffer was transformed one additional time to yield the actual key which was displayed to the user.
We could not used the hash function as a filter while growing input candidates because the hash function required 24 bytes of input. At this point we chose to attack the hash function which was a simple shift and xor type hash that yielded a 4 byte value. By unrolling the hash loop we determined that the final hash was actually dependent on only a few bytes out ouf func_4. This is because the shifting portion of the hash causes all of the other bytes to "fall off" the right side. The final dependencies are shown below:
input                   11
bytes               15  14
from            19  18  17
func_4      23  22  21  20
            
final hash: C6  68  35  9B
As can be seen, the most significant byte of the hash is determined solely by byte f4(23), thus f4(23) was known to yield C6. Furthermore C6 subtracted from key_bytes[23] must yeild the output byte at position 23 ('!'). Knowing f4(23) immediately yields f4(2), f4(5), f4(8), f4(11), f4(14), f4(17), and f4(20) which in turn yielded output bytes 2, 5, 8, 11, 14, 17, and 20. At this point, we knew the output to look something like this:
char output[] = "..t..n..e..y..i.. ..r..!";
where dots represent unknown bytes. This was enough additional information to cut down the number of candidate input values significantly, enough to conduct a practical brute force on the input space. Our brute force grew to a maximum of 580 million input candidates by the time we had grown the input to 10 characters. Fortunately growing beyond 10 started to bring more of the filters from func_1, and func_3 in to play and the number of candidates tailed off significantly as we continued to grow the input. As the number of candidates started to shrink we finally reached a point where there were few enough for us to start looking for plain language candidates either on the input side or output side. By the time the inputs had grown to the mid teens, it was apparent that the output was going to be something along the lines of "rotten decryptin....." which was obtained with input in the form D0ct3rK3NpwNzALLM1W so we figured we were on the right track.
In the end out final solution was ugly and used 22 programs, each responsible for growing the input by a single byte. The 22 programs could be chained together with pipes to create a brute forcer that would solve to problem in about 5-6 hours and looked something like this on the command line:
b0 | b1 | b2 | b3 | b4 | b4 | b6 | ..... | b20 | b21
Stage b0 required brute forcing 3 bytes of input to yeild a single byte of output. After that each additional stage appends a single input byte to generate an additional output byte. At stages b2, b5, b8, ... we have the benefit of know what the output byte in those positions must be which cuts down the number on input candidates significantly at each of those stages and keeps the brute force space to a reasonable size.
By far the biggest challenge in the process was the extraction of func_4 which was self modifying, contained position dependent transformations, and which did not honor the gcc requirements to preserve ebx, edi, and esi across function calls (hence the heavily modified version of func_4 that we incorporated into our brute_forcer). The second challenge of incorporating func_4 into a brute forcer was that func_4 contained a subfunction that was essentially a flip-flop toggled by a specific input value. We had to incorporate code that ensured the flip-flop function was reset prior to each brute force attempt that we made.
props to sk3wl member r00tbat for all of his contributions to our solution.
--sk3wlm4st3r

ctf 2007 quals