Recovering Meaningful Variable Names in Decompiled Code
Software Engineering Institute
In this project, we propose the Decompiled Identifier Renaming Engine (DIRE), a novel probabilistic technique for variable name recovery that uses both lexical and structural information. We also present a technique for generating corpora suitable for training and evaluating models of decompiled code renaming, which we use to create a corpus of 164,632 unique x86-64 binaries generated from C projects mined from GitHub. Our results show that, on this corpus, DIRE can predict variable names identical to the names in the original source code up to 74.3% of the time.
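To make the reported metric concrete, the following is a minimal sketch of how an exact-match score of this kind could be computed: a prediction counts as correct only when it is identical to the name in the original source. The function name `exact_match_accuracy` and the sample names are hypothetical illustrations, not part of DIRE or its evaluation code, and the actual evaluation may differ in details such as how variables are aligned or deduplicated.

```python
from typing import Sequence


def exact_match_accuracy(predicted: Sequence[str], original: Sequence[str]) -> float:
    """Return the fraction of predicted names identical to the original names.

    Hypothetical helper for illustration; it assumes both sequences are
    aligned so that position i refers to the same decompiled variable.
    """
    if len(predicted) != len(original):
        raise ValueError("predicted and original must have the same length")
    if not original:
        return 0.0
    # A prediction is correct only on an exact string match.
    matches = sum(p == o for p, o in zip(predicted, original))
    return matches / len(original)


# Made-up example: names from the original source vs. names a model proposed.
original_names = ["buf", "len", "i", "ret"]
predicted_names = ["buf", "size", "i", "ret"]

accuracy = exact_match_accuracy(predicted_names, original_names)
print(f"{accuracy:.1%}")  # prints "75.0%"
```

Exact match is a strict criterion: in the made-up data above, `size` is a plausible name for a variable originally called `len`, yet it is counted as an error.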