A. Simon and J. Kranz. The GDSL toolkit: Generating Frontends for the Analysis of Machine Code. Program Protection and Reverse Engineering Workshop, PPREW '14, San Diego, California, USA, January 2014. ACM.

Any inspection, analysis or reverse engineering of binaries requires a translation of the program text into an intermediate representation (IR) that conveys the semantics of the program. To this end, we propose a domain-specific language called GDSL (Generic Decoder Specification Language) that facilitates the translation from byte streams to instructions and from there to other intermediate representations. We present the GDSL toolkit, containing a compiler from GDSL to C, instruction decoders (currently for Intel x86 and Atmel AVR), translations to semantics, and optimizations of the semantics. Other processors, semantics and optimizations can be added, thereby providing a common platform for building front-ends for the analysis of binaries. The emitted C code is human-readable and outperforms hand-written code such as the XED decoder shipped with the Intel Pin toolkit.
