Sunday, April 11, 2010

Reverse Engineering

To start off, I'd like to point to a fellow student who is doing some really good work: John.

Read about his Universal Decompiler project here

Get the source code here


AND for more RCOS projects be sure to visit this link


He's also been working on a different kind of reverse engineering.

Years ago I saw a presentation at Hope 2008 by Karsten Nohl on reversing chips. Using some space alien technologies such as "Sandpaper" and "MATLAB" they were able to break proprietary RFID cryptography. Wow! Most hackers would have just called up a secretary or something.

Read Karsten Nohl, Starbug, Henryk Plötz, and David Evans's fascinating USENIX paper here.

Another nice presentation was given at CanSecWest.

While the skilled might be able to read www.siliconzoo.org like a coloring book, I cannot. I'm sure it's a breeze for VLSI CAD researchers and others, but a blog post over at Flylogic left me equally confused. I needed something easier for a newbie like me; I kept getting my parallel vs. series transistors pretty darn wrong. So I emailed Dr. Nohl, who kindly responded.

So once you see it, it's pretty magical. Sort of. I haven't worked my way through all of the zoo just yet.

Enjoy this educational visual inspection of 2-input CMOS NOR and NAND gates.

Whoaaa

Hello from one year in the future.

I have not worked on RCOS-Binstat since last April. When we left off, I realized I did not have enough formal background. I had implemented incomplete disassemblers for x86 and MIPS with basic value propagation for constant folding, some simple SSA, and rudimentary analysis for things such as resolving function arguments. That alone was actually useful for finding real-life bugs.
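The constant folding mentioned here is conceptually tiny: track registers whose values are provably constant and evaluate any operation whose operands are all known. A minimal sketch of the idea, using an illustrative tuple form for operations (not the actual rcos-binstat structures):

```python
# Minimal constant-folding pass over operation tuples of the form
# (dest, '=', lhs, op, rhs). Registers are strings like '$4';
# anything else is treated as a literal constant.
def fold_constants(ops):
    known = {}   # register -> constant value, when provable
    folded = []
    for dest, _, lhs, op, rhs in ops:
        lhs = known.get(lhs, lhs)
        rhs = known.get(rhs, rhs)
        if isinstance(lhs, int) and isinstance(rhs, int):
            value = {'+': lhs + rhs,
                     '-': lhs - rhs,
                     '<<': lhs << rhs,
                     '|': lhs | rhs}[op]
            known[dest] = value
            folded.append((dest, '=', value))
        else:
            known.pop(dest, None)  # dest is no longer a known constant
            folded.append((dest, '=', lhs, op, rhs))
    return folded

ops = [('$2', '=', 4, '<<', 2),       # $2 = 16
       ('$3', '=', '$2', '+', 8),     # $3 = 24
       ('$4', '=', '$3', '|', '$5')]  # $5 unknown, cannot fold
print(fold_constants(ops))
# [('$2', '=', 16), ('$3', '=', 24), ('$4', '=', 24, '|', '$5')]
```

Even this much, run over straight-line blocks, is enough to resolve the GOT and string offsets seen in the sample output below.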

That code is still available here: http://github.com/adc/rcos-binstat

It sucks.

Hopefully I will get back to it soon. Currently I'm searching around for a language with good features and no bad ones. Something painfully simple and powerful with a low learning curve.

The next thing is to survey the tremendous number of projects that have advanced, been created, and been released to the public in this past year alone.

So what have I been doing in all my free-time? I spent the summer with some very good friends in the land of pirates.

Then stuff happened when I got back to school. My computer engineering classes have picked up, landing me an infinite well of excuses. And just when I thought it got better, the club started teaching a class: www.cs.rpi.edu/academics/courses/spring10/csci4971/.

At least last Fall I finally learned what a transistor is. After all, the university I attend once digested and emitted Teddy Hoff, the father of the microprocessor and Intel employee #12.

I also got some significant appreciation for Shannon's work on boolean logic. A Jean-Claude vs Claude deathmatch will obviously show Shannon as one mighty dude.

Someone should report back on Shannon's work with genetic computing.

Friday, May 1, 2009

rcos ppt

I was asked to make a presentation on current progress. Here it is: http://rpisec.net/attachments/download/21/go.pdf

Thursday, March 12, 2009

The Intermediate Representation

In this post I will show an example of the Intermediate Language previously described. 

As previously mentioned there are more or less four basic operations.

  1. operation()      <- for math; takes operands and operators
  2. load $register   <- retrieves data from the memory address computed by the previous operation() and puts it into $register
  3. store $register  <- stores $register into the address computed by the previous operation()
  4. branch_true      <- branch if the previous operation's result is != 0, otherwise fall through

For now call and ret operations are also used to make function detection more manageable. In the future these will be removed, as they can be derived from combinations of load/branch. 
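To make the implicit-result convention concrete, here is a hand-written illustration (not translator output) of incrementing a stack slot in this IR, using the same tuple notation as the listing below:

```
($29, '+', 16)            <- compute address $sp+16
LOAD $8                   <- $8 = mem[$29+16]
($8, '=', $8, '+', 1)
($29, '+', 16)            <- recompute the address for the store
STORE $8                  <- mem[$29+16] = $8
```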

Here is some sample output from the MIPS translator on the print2lpr binary. In addition some basic analysis has been performed to resolve GOT entries and strings. This is fully automated but register propagation is very basic at the moment. Be sure to check out the code at http://rpisec.net/repositories/show/rcosbinstat. The majority of that code is in the libtransform function in mips_translator.py 

0x4023ec ($28, '+', -32460)     
0x4023ec LOAD $25     ### getenv
0x4023f0 ($4, '=', $4, '+', 1588)     %%%% "LPUSER"
0x4023f4 ($0, '=', $0, '<<', $0)     
0x4023f8 ($31, '=', $32, '+', 4)     
0x4023f8 CALL $25     
0x4023fc ($29, '+', 48)     
0x4023fc LOAD $28     
0x402400 ($6, '=', $2, '|', $0)     
0x402404 ($2, '==', $0)     
0x402404 BRANCH loc_0x402428     
---end of block---

--- block 402408 -> 402424:13--
parents:  []
branches:  0x402428 0x0
0x402408 ($28, '+', -32692)     
0x402408 LOAD $5     %% (10000000)
0x40240c ($28, '+', -32556)     
0x40240c LOAD $25     ### sprintf
0x402410 ($4, '=', $16, '|', $0)     
0x402414 ($5, '=', $5, '+', 1596)     %%%% "P%s
0x402418 ($31, '=', $32, '+', 4)     
0x402418 CALL $25     
0x40241c ($29, '+', 48)     
0x40241c LOAD $28     
0x402420 ($16, '=', $16, '+', $2)     

On MIPS, arguments are passed in registers $4-$7 ($a0-$a3) and then on the stack. It would be nice to automate that translation, but I haven't had a chance to. Anyway, let's look at how the sprintf function is being called.

Note that register $16 points to a global write buffer in the BSS. sprintf takes two fixed arguments followed by a variable number of arguments:  

sprintf(dest, fmt, .... ) 
$4 is the destination <- $16 = global write buffer
$5 is the format string <- "P%s"
$6 is $2. 

Return values on MIPS are placed in $v0/$v1, which are registers $2/$3. Looking up at the previous code block, the last call was to getenv("LPUSER"); 

Our code is then

write_buf += sprintf(write_buf, "P%s", getenv("LPUSER")); 

We are still quite a ways from automating that higher level translation, but having automated string and library resolution is certainly nice.
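The string resolution is mostly arithmetic: MIPS reaches read-only data through $gp-relative offsets, so once the translator knows $gp it can compute the final address and read the bytes straight out of the binary image. A rough sketch of the idea (the function name, offsets, and toy image here are illustrative; the real logic lives in mips_translator.py):

```python
def resolve_string(image, base, gp, gp_offset, add_offset):
    """Resolve a two-step MIPS address: ($gp + gp_offset) + add_offset,
    then read a NUL-terminated string out of the binary image.
    image: bytes of the mapped segment; base: its load address."""
    addr = gp + gp_offset + add_offset
    start = addr - base
    end = image.index(b'\x00', start)  # find the terminating NUL
    return image[start:end].decode('ascii')

# Toy image: pretend a segment loaded at 0x400000 contains "LPUSER".
image = b'\x00' * 0x10 + b'LPUSER\x00'
print(resolve_string(image, 0x400000, 0x400500, -0x4f8, 0x8))
# LPUSER
```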

By the way, this particular code segment was not exploitable as there is nothing good after the global BSS buffer. If you're really interested in Irix bugs, get a life, or contact me :-).

Back to the topic at hand. The IR looks painfully simple, and that is deliberate: it keeps analysis easy in the long term. Forcing this simplicity also makes translation from other architectures more feasible, so analysis tools stay useful as platforms change and so on. 

Wednesday, February 18, 2009

Why Pink?

Because there are no other static analysis blogs decorated with pink. Roughly 20 days have passed and this project is finally rolling. Please check out the project in progress and play with the Wiki and forums.

Alright. It's 5 minutes to go, you're me, and you need to show your boss all the work you've done. Close the lolcat tabs. Oh gosh, I really meant to save that one, thank you oh mighty and omnipotent firefox recently closed tabs history menu.


On your sketch pad you write down:


While surfing the chaos I have built up the following:
  • A notion of a higher level IR that no longer resembles asm
  • An almost complete MIPS translator into this IR
  • Really basic procedure detection and inter- and intra-procedure flow graphs on MIPS-generated IR

Now this one is the easiest but the most important. The Intermediate Representation is the basis into which code from all of the other architectures is translated. Originally something along the lines of U-Code was envisioned. Now, U-Code works well for writing a compiler: it has a concept of a stack, a heap, global variables, and so on. It's nifty! But it's overly complicated for what I think is needed. We need something super simple. Here it is:


There are 4 staple operations in the IR:
  1. operation(...) (... = list of operands and operators)
  2. load [destination]
  3. store [source]
  4. branch_true [destination]

#1 covers all of the bit twiddling and math instructions. #2/#3/#4 use an implicit value which is the result of the previous operation(). For example, load is implicitly given a source memory address.
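For instance, a single MIPS `lw` splits into an operation() that computes the address and a load that implicitly consumes it. A toy translation in that spirit (the tuple shapes are illustrative, not mips_translator.py's actual representation):

```python
def translate_lw(rt, base, offset):
    """Translate MIPS 'lw rt, offset(base)' into two IR statements.
    The load's source address is implicitly the result of the
    operation() emitted just before it."""
    return [
        ('operation', (f'${base}', '+', offset)),
        ('load', f'${rt}'),
    ]

print(translate_lw(8, 29, 16))
# [('operation', ('$29', '+', 16)), ('load', '$8')]
```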


In terms of analysis #4 covers many contexts, so this one has actually been re-complicated back into jump, call, and branch_true. And, call also has a corresponding ret instruction. For what purpose? Procedure detection. What's the difference between a call and a jump? A return address. A jump and a branch_true? It's philosophical but I consider it to be locality.
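With call split out from jump, procedure detection can start as bluntly as harvesting direct call targets as entry points. A minimal sketch of that idea (illustrative structures, not the project's actual pass):

```python
def find_procedures(ir):
    """Collect procedure entry points: every statically known target
    of a 'call', plus the program's first instruction."""
    entries = set()
    first = None
    for addr, op, arg in ir:
        if first is None:
            first = addr
            entries.add(addr)      # program entry is a procedure too
        if op == 'call' and isinstance(arg, int):
            entries.add(arg)       # direct call target is an entry point
    return sorted(entries)

ir = [
    (0x400000, 'operation', None),
    (0x400004, 'call', 0x400100),  # direct call
    (0x400008, 'call', '$25'),     # indirect call: target unknown statically
    (0x400100, 'operation', None),
]
print(find_procedures(ir))
```

Indirect calls (through $25, as in the MIPS output above) need the register propagation pass before they can contribute targets.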


The next blog post will elaborate more on the IR, how the operands work, and other instruction abstractions that are created to ease analysis. If you can't wait, I encourage you to glance at the code.




Upcoming Milestones
  • x86 translator!
  • integer underflow and overflow hot-spot detection
  • unchecked return values, double frees

What's needed after those
  • A GUI is needed to play with these graphs. Processing? Any suggestions? Readers, please help
  • PE file format support
  • IR Transformation gadgets -- an interface to the IR for applying easy transformations for detecting external library calls, function prologues and epilogues, specialized integer operations, and ....