MACE: Model-inference-Assisted Concolic Exploration for Protocol and
Vulnerability Discovery, C. Y. Cho, et al., USENIX Security 2011.

Synopsis by Andrew Chi, 2013 December 1


Problem addressed:
==================

Symbolic execution can get stuck in small local state-subspaces (e.g.,
loops in the code).  In certain classes of applications such as
network protocols, there exists a finite state machine (FSM) that is a
good model for the overall behavior of the application.  In such
cases, the protocol FSM can be used to deepen the coverage of symbolic
execution techniques by forcing them to skip to different parts of the
state space.  The trick, as shown in this paper, is to do this
(almost) automatically.


Basic approach:
===============

0. Pick implementations of network protocols (here, RFB and SMB) where
   the overall model (FSM) has a good chance of being discoverable
   from external observation.  Manually create equivalence classes of
   outputs, named "abstract outputs" in the paper.

1. Begin with some seed input/output pairs, such as from a regression
   test suite.  Use L* to build a minimal model of the protocol FSM.

2. Run the DART (?) concolic execution engine, exploring the
   "vicinity" of the abstract states in the FSM, creating more
   input/output pairs.  Filter out pairs where the output (when mapped
   to abstract output) has already been seen.

3. Iterate steps 1 and 2 until the FSM model converges.
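
The iteration in steps 1-3 can be sketched roughly as follows.  This
is only my reading of the loop, not the authors' implementation:
`learn_model` stands in for L*, `explore` stands in for the concolic
engine, and `abstract` is the manually written output abstraction
from step 0.  All three names, and the toy protocol at the bottom,
are hypothetical.

```python
from typing import Callable, Iterable, List, Sequence, Set, Tuple

Pair = Tuple[str, bytes]  # (input message name, raw server output)


def refine_model(
    seed_pairs: Sequence[Pair],
    learn_model: Callable[[List[str]], object],   # stand-in for L*
    explore: Callable[[object], Iterable[Pair]],  # stand-in for the concolic engine
    abstract: Callable[[bytes], str],             # manual output abstraction (step 0)
    max_rounds: int = 10,
):
    # Start from the seed inputs and the abstract outputs they produce.
    alphabet: List[str] = []
    for msg, _ in seed_pairs:
        if msg not in alphabet:
            alphabet.append(msg)
    seen_outputs: Set[str] = {abstract(raw) for _, raw in seed_pairs}

    model = None
    for _ in range(max_rounds):
        model = learn_model(alphabet)        # step 1: (re)build the FSM
        new_inputs = []
        for msg, raw in explore(model):      # step 2: explore around the FSM states
            label = abstract(raw)
            if label in seen_outputs:        # filter: abstract output already seen
                continue
            seen_outputs.add(label)
            if msg not in alphabet and msg not in new_inputs:
                new_inputs.append(msg)
        if not new_inputs:                   # step 3: nothing new, so converged
            break
        alphabet.extend(new_inputs)
    return model, alphabet


# Toy stand-ins, purely for illustration.
def toy_learn(alphabet):
    return tuple(alphabet)  # the "model" here is just the alphabet


def toy_explore(model):
    yield ("hello", b"200 ok")
    if "hello" in model:
        yield ("auth", b"401 denied")
        yield ("auth_retry", b"401 try again")


def toy_abstract(raw):
    return raw.split()[0].decode()  # keep only the status code


model, alphabet = refine_model(
    [("hello", b"200 welcome")], toy_learn, toy_explore, toy_abstract
)
# alphabet is now ["hello", "auth"]; "auth_retry" was filtered out because
# its abstract output "401" had already been seen.
```

Note how much rides on `abstract`: a coarser abstraction filters more
aggressively and converges sooner, but may hide messages that really
do lead to new states.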

In the meantime, record the input/output behavior of step 2.  If there
were any crashes (e.g., critical exceptions), record them, and then
later deduplicate them based on location of crash.
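
Deduplicating by crash location might look like the sketch below.
The `Crash` record and its fields are my own invention for
illustration; the paper's actual bookkeeping may differ.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Crash(NamedTuple):
    location: str                 # hypothetical: faulting address or "file:line"
    input_trace: Tuple[str, ...]  # message sequence that triggered the crash


def dedupe_crashes(crashes: Iterable[Crash]) -> List[Crash]:
    """Keep the first crash observed at each distinct location."""
    unique = {}
    for crash in crashes:
        unique.setdefault(crash.location, crash)
    return list(unique.values())


observed = [
    Crash("parser.c:120", ("hello", "auth")),
    Crash("parser.c:120", ("hello", "auth", "auth")),
    Crash("session.c:48", ("hello", "open")),
]
unique_crashes = dedupe_crashes(observed)
```

One caveat with this scheme: two distinct bugs that happen to crash
at the same location collapse into a single report.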


Key insight/innovation:
=======================

Pairing concolic exploration with an automatic protocol discovery tool
can produce mutual benefit:

1. The concolic exploration can be "seeded" to start at the protocol
   FSM states, thereby skipping over time sinks such as loops and
   increasing code/behavior coverage.
   
2. The automatic protocol discovery is helped by symbolic execution in
   that the symbolic execution can automatically generate more
   input/output pairs by taking branches in the code that were not
   traversed previously.
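
To make the "seeding" in point 1 concrete: given the inferred Mealy
machine, one can compute a shortest input sequence reaching each
state and replay it before starting exploration from that state.
The sketch below is my own illustration using breadth-first search;
the state names and the dictionary encoding of the machine are
assumptions, and the paper's actual mechanism may differ.

```python
from collections import deque
from typing import Dict, Tuple

# Hypothetical encoding: state -> {input symbol: (next state, abstract output)}
Mealy = Dict[str, Dict[str, Tuple[str, str]]]


def access_sequences(machine: Mealy, start: str) -> Dict[str, Tuple[str, ...]]:
    """Breadth-first search for a shortest input sequence to each reachable state."""
    paths = {start: ()}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for symbol, (nxt, _output) in machine.get(state, {}).items():
            if nxt not in paths:
                paths[nxt] = paths[state] + (symbol,)
                queue.append(nxt)
    return paths


toy_machine: Mealy = {
    "INIT": {"version": ("HANDSHAKE", "version_ok"), "auth": ("INIT", "error")},
    "HANDSHAKE": {"auth": ("AUTHED", "auth_ok"), "version": ("HANDSHAKE", "error")},
    "AUTHED": {},
}
seeds = access_sequences(toy_machine, "INIT")
# seeds["AUTHED"] is ("version", "auth"): replay those two messages, then
# hand control to the concolic engine.
```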


Pros/cons: important problem, assumptions, evaluation, approach
===============================================================

One still has the manual work of defining the equivalence classes
of outputs.  And this seems like an art -- one is trying to decide
which variations in output actually cause traversal of different parts
of the overall state machine.

The Mealy machine will still be incomplete if there is a closed
subspace of messages.  As the authors mention, "[MACE] might not
discover some types of messages required to infer the full state
machine of the protocol".  And case in point: they only discover 23
out of the 67 SMB message types.  But I guess it's better than before.

(4.4) The model assumes that 1 input maps to 1 output.  If there are
either 0 or >1 outputs, they had to introduce a bit of a kludge: an
artificial no-response message, and ignoring all but the first output
message.  Many protocols will certainly not be a 1-1 mapping between
input and output packets.  TCP acknowledgments are cumulative, so
multiple input packets can be ACKed with one output reply packet.
DNS zone transfer has different behavior: send me the list of stuff
that I haven't seen yet.  The model needs to be stronger.
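
The workaround described in (4.4) amounts to something like the
following sketch.  The `NO_RESPONSE` label and the function name are
hypothetical; only the behavior (synthesize a no-response message,
keep the first output) comes from the paper as I understand it.

```python
from typing import List

NO_RESPONSE = "NO_RESPONSE"  # hypothetical label for the artificial message


def normalize_outputs(outputs: List[str]) -> str:
    """Force exactly one abstract output per input.

    Zero outputs become an artificial no-response message; if there are
    several outputs, everything after the first is dropped.
    """
    return outputs[0] if outputs else NO_RESPONSE


silent = normalize_outputs([])
chatty = normalize_outputs(["zone_part_1", "zone_part_2", "zone_end"])
```

The second call shows what is lost: a multi-message reply collapses
to its first message, so any state changes signaled by the later
messages are invisible to the learned model.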

For a complex protocol such as SMB, there is more internal consistency
required in the protocol messages than this technique is likely to
discover automatically.  "The concrete messages generated... often had
invalid message parameters, so the server would simply respond with an
error."  This may be a fundamental limitation.

They basically use "does it crash" / "does it hang" to decide whether
a vulnerability has been discovered.  Actually, many insidious
vulnerabilities don't crash the application, but rather cause it to
misbehave silently.


Questions:
==========

Is "dynamic symbolic execution" the same thing as "concolic execution"?

Does the phrase "decision procedures" mean "SAT solvers"?

Section 3.3: Let s = m_0 ... m_{n-1}.  Does s_j = m_j, or
s_j = m_0 ... m_j?  Their definition was not clear.

Figure 5: Isn't the MACE line guaranteed to be at 100%, since it is
the only basis for comparison?  I find this to be a misleading
graph...


Ideas for future research:
==========================

Any improvement to the issues I list in the "pros/cons" section.

One low-hanging fruit might be finding non-crashing security
violations, i.e., check a security property more sophisticated than
"it doesn't segfault."

"Techniques that automatically reverse-engineer message encryption are
required."  I laughed out loud at this.  And yet, they cite a paper
(Wang, et al., "ReFormat").  Very curious how this could be possible
assuming the crypto is worth anything at all...