Synesthesia: Modern Shellcode Synthesis under Arbitrary Encoding Restrictions

Author(s): Rolf Rolles
Category: Exploit Development
Duration: 90
Summary: The problems of shellcode generation and of memory
corruption exploit development share a birthday. In brief, memory
corruption exploits must trick a program into executing machine code
("shellcode") provided as input. Each individual exploit scenario may
place constraints upon the allowable machine code bytes: NULL bytes (or
any arbitrary bytes) may be disallowed; the input may be constrained to
be alphanumeric; all ASCII characters may be required to be uppercase;
certain characters may be filtered; the input may be transformed in
arbitrary ways; the input may be required to lie within UTF-8 or UTF-16;
and so on.

Historically, the security community has dealt with these problems on a
case-by-case basis. Many papers were written regarding various processor
architectures and some common encoding restriction schema. Generally,
these publications describe patterns for performing common operations
(setting registers to constants, obtaining the program counter, etc.)
within the given encoding restriction. From these publications came
shellcode encoding; rather than writing the entire shellcode within the
encoding restriction, we encode a shellcode blob within the encoding,
and generate a machine code program within that encoding to decode the
blob and execute it. Shellcode encoders are useful, but they suffer from
a number of issues. They expand the size of the shellcode blob, which
can render an exploit unworkable. They often contain common sequences of
machine code, for which IDS detections are readily available. They are
not guaranteed to find an encoding and a decoder, even if one exists. In
short, shellcode generation is still a real-world problem, despite the
existence of shellcode encoders.

In this publication, we provide a novel technique based on program
synthesis for creating machine code programs that fall within a given
encoding. Essentially, Synesthesia is a compiler whose inputs are a
specification of the desired functionality, together with a
specification of the allowable encodings. In another mode, Synesthesia
can create equivalent code sequences to ones provided as input, where
the input does not conform to the encoding requirement but the output
does. More experimentally, given an existing shellcode, Synesthesia can
encode it and create a full-fledged decoder loop conforming to an
encoding restriction.

Synesthesia enjoys a number of nice theoretical properties: it is
guaranteed to find an encoding for the desired functionality if one
exists among the modelled instructions; the user can search for the
shortest program in terms of byte length or number of instructions; it
does not rely upon pattern databases of any sort, so each run can
potentially produce an entirely unique output; and it can produce
self-modifying code. The ideas behind Synesthesia are not tied to any
specific processor architecture, and do not require any dynamic analysis
or brute-forcing.

This presentation will discuss the context wherein Synesthesia exists,
the concepts behind its design, case studies on more than one assembly
language, a performance evaluation, and a discussion of the theoretical
limitations (i.e., permanent issues) and practical ones (i.e.,
limitations of contemporary SMT solvers). Synesthesia shall be made
available as open source.

Likes: 1