3 months ago

Can We Generate Shellcodes via Natural Language? An Empirical Study

Pietro Liguori, Erfan Al-Hossami, Domenico Cotroneo, Roberto Natella, Bojan Cukic, Samira Shaikh

Abstract

Writing software exploits is an important practice for offensive security analysts to investigate and prevent attacks. Shellcodes in particular are time-consuming and technically challenging to write, because they are written in assembly language. In this work, we address the task of automatically generating shellcodes starting purely from natural language descriptions, proposing an approach based on Neural Machine Translation (NMT). We then present an empirical study using a novel dataset (Shellcode_IA32), which consists of 3,200 assembly code snippets of real Linux/x86 shellcodes from public databases, annotated using natural language. Moreover, we propose novel metrics to evaluate the accuracy of NMT at generating shellcodes. The empirical analysis shows that NMT can generate assembly code snippets from natural language with high accuracy and that, in many cases, it can generate entire shellcodes with no errors.

Benchmarks

| Benchmark | Methodology | BLEU-4 | Exact Match Accuracy |
| code-generation-on-shellcode-ia32 | Seq2Seq with Attention | 90.03 | 82.92 |
| code-generation-on-shellcode-ia32 | CodeBERT | 91.70 | 89.75 |
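
To make the two metrics reported above concrete, here is a minimal sketch of how exact match accuracy and corpus-level BLEU-4 could be computed over pairs of predicted and reference assembly snippets. This is not the authors' evaluation code: the whitespace tokenizer, the lowercasing, the lack of smoothing, and the function names (`normalize`, `exact_match_accuracy`, `corpus_bleu4`) are all assumptions made for illustration, and the sample snippets are hypothetical rather than taken from Shellcode_IA32.

```python
import math
from collections import Counter


def normalize(snippet):
    """Lowercase and split on whitespace (an assumed, simplistic tokenizer)."""
    return snippet.lower().split()


def exact_match_accuracy(predictions, references):
    """Percentage of predictions whose tokens equal the reference's tokens."""
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu4(predictions, references):
    """Corpus-level BLEU-4, one reference per prediction, no smoothing."""
    matched = [0] * 4
    total = [0] * 4
    pred_len = ref_len = 0
    for p, r in zip(predictions, references):
        p_tok, r_tok = normalize(p), normalize(r)
        pred_len += len(p_tok)
        ref_len += len(r_tok)
        for n in range(1, 5):
            p_counts, r_counts = ngrams(p_tok, n), ngrams(r_tok, n)
            # Clip each predicted n-gram count by its count in the reference.
            matched[n - 1] += sum(min(c, r_counts[g]) for g, c in p_counts.items())
            total[n - 1] += sum(p_counts.values())
    if min(matched) == 0:
        # Without smoothing, any zero n-gram precision gives a score of zero.
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / 4
    brevity = 1.0 if pred_len >= ref_len else math.exp(1 - ref_len / pred_len)
    return 100.0 * brevity * math.exp(log_precision)


if __name__ == "__main__":
    # Hypothetical reference and predicted snippets, for illustration only.
    refs = ["xor eax , eax", "push eax", "mov ebx , esp", "int 0x80"]
    preds = ["xor eax , eax", "push eax", "mov ebx , esp", "int 0x81"]
    print("Exact match accuracy:", exact_match_accuracy(preds, refs))
    print("BLEU-4:", corpus_bleu4(preds, refs))
```

Exact match counts a prediction only when every token agrees with the reference, while BLEU-4 gives partial credit for overlapping n-grams, which is one reason the two columns in the table can differ. Because single assembly instructions are short, this unsmoothed variant returns zero whenever the corpus contains no matching 4-gram.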
