---
license: mit
---

SK²Decompile: LLM-based Two-Phase Binary Decompilation from Skeleton to Skin

SK²Decompile is a novel two-phase framework for binary decompilation using Large Language Models (LLMs). Our approach decomposes the complex decompilation task into two manageable phases:

Phase 1, Structure Recovery (Skeleton): Transforms binary/pseudo-code into an obfuscated intermediate representation, i.e. code whose structure is recovered but whose identifiers are still placeholders (this model).

Phase 2, Identifier Naming (Skin): Generates human-readable source code with meaningful identifiers. 🤗 [HF Link](https://huggingface.co/LLM4Binary/sk2decompile-ident-6.7)

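To make the skeleton-to-skin split concrete, here is a toy sketch of how the two phases compose. It is not the models' actual behavior: `two_phase_decompile`, `toy_recover_structure`, `toy_name_identifiers`, the sample pseudo-code, and the placeholder names `func1`/`var1` are all hypothetical stand-ins chosen for illustration.

```python
import re
from typing import Callable


def two_phase_decompile(
    pseudo_code: str,
    recover_structure: Callable[[str], str],
    name_identifiers: Callable[[str], str],
) -> str:
    """Chain the two phases: skeleton first, then skin."""
    skeleton = recover_structure(pseudo_code)  # Phase 1 output
    return name_identifiers(skeleton)          # Phase 2 output


def toy_recover_structure(pseudo_code: str) -> str:
    # Hypothetical stand-in for the structure model: it ignores its input
    # and returns a hard-coded skeleton with placeholder identifiers.
    return "int func1(int var1, int var2) { return var1 + var2; }"


def toy_name_identifiers(skeleton: str) -> str:
    # Hypothetical stand-in for the identifier model: a fixed rename table.
    names = {"func1": "add", "var1": "a", "var2": "b"}
    return re.sub(
        r"\b(func\d+|var\d+)\b",
        lambda m: names.get(m.group(0), m.group(0)),
        skeleton,
    )


# Made-up decompiler-style pseudo-code, used only as sample input.
pseudo = "undefined4 FUN_00101139(int param_1,int param_2) { return param_1 + param_2; }"
result = two_phase_decompile(pseudo, toy_recover_structure, toy_name_identifiers)
print(result)
```

The point of the split is that Phase 2 only has to choose names: the control flow and types it receives are already fixed by Phase 1.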
Usage:

```
# Install dependencies
pip install vllm
apt install clang-format
# Put your data into the reverse_sample.json format, then normalize the pseudo-code
# (the input file is overwritten in place)
python normalize_pseudo.py --input_json reverse_sample.json --output_json reverse_sample.json
# Run decompilation with the structure model and the identifier model
python sk2decompile.py --dataset_path reverse_sample.json --model_path LLM4Binary/sk2decompile-struct-6.7b --recover_model_path LLM4Binary/sk2decompile-ident-6.7
```
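If you want to script the two commands above, a minimal Python wrapper could look like the sketch below. The script names, flags, and model IDs are copied from the usage block; the helper names `build_commands` and `run_pipeline` are hypothetical, and the sketch assumes `normalize_pseudo.py` and `sk2decompile.py` sit in the current working directory.

```python
import subprocess
from typing import List

# Model IDs exactly as given in the usage block.
STRUCT_MODEL = "LLM4Binary/sk2decompile-struct-6.7b"
IDENT_MODEL = "LLM4Binary/sk2decompile-ident-6.7"


def build_commands(
    dataset_path: str,
    struct_model: str = STRUCT_MODEL,
    ident_model: str = IDENT_MODEL,
) -> List[List[str]]:
    """Return the argv lists for the normalize and decompile steps."""
    normalize = [
        "python", "normalize_pseudo.py",
        "--input_json", dataset_path,
        "--output_json", dataset_path,  # in place, as in the usage block
    ]
    decompile = [
        "python", "sk2decompile.py",
        "--dataset_path", dataset_path,
        "--model_path", struct_model,
        "--recover_model_path", ident_model,
    ]
    return [normalize, decompile]


def run_pipeline(dataset_path: str) -> None:
    """Run both steps in order, stopping at the first failure."""
    for cmd in build_commands(dataset_path):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline("reverse_sample.json")` from the repository directory would then run normalization followed by decompilation. Passing argument lists rather than a shell string avoids quoting problems if the dataset path contains spaces.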