LLM-Assisted Reverse Engineering of Binaries

Michał Jurzak

A reverse-engineering assistant that pairs a large language model with the radare2 analysis framework to decompile and explain compiled C/C++ binaries. Rather than treating the model as a one-shot decompiler, the tool gives it a toolbox and lets it investigate the binary the way an analyst would.

How it works

The LLM acts as a tool-using agent over radare2. Given a binary, it can:

run full program analysis (aaa),
list functions, strings, and imports/exports,
inspect memory at specific addresses, and
request high-quality decompilation of selected functions.

From these observations it produces commented, high-level C/C++ source. The decompiled output can then be recompiled and compared against the original binary, giving concrete metrics on how faithful the reconstruction is. On success, the tool also auto-generates documentation and a README for the recovered code.

Explainability

Two features make the agent’s behaviour legible. An “explain in detail” mode returns a structured breakdown of the decompilation, and a tool-usage graph visualises the exact sequence of radare2 calls the model made during its analysis, useful both for trust and for debugging the agent’s strategy.

Engineering

The interface is a Streamlit application with real-time tool logging and save/load/delete of whole analysis sessions. Models are pluggable: hosted OpenAI models or local ones via Ollama. The whole stack, including radare2, ships as a self-contained Docker image for one-command setup.

Source

This was built for the project “LLM: code translation and reverse engineering”. Code and Docker instructions are in the source repository.