% !TEX root = appendix.tex
\section{Applications of Symbolic Execution}
\label{se:applications}
\revedit{
\cite{CGK-ICSE11} observes that the recent explosion of research work in symbolic execution makes for an interesting story about the growing impact of this program analysis since its introduction in the mid '70s. The availability of powerful off-the-shelf SMT solvers and hardware resources, along with advances in symbolic execution techniques that address the challenges identified in Section 1.2, has facilitated the application of symbolic execution to increasingly large problem instances from many domains.
In this section we do not aim to present a comprehensive overview of the applications of symbolic execution. Our goal is instead to provide the reader with a selection of works that have appeared in the last few years and that either incubated novel ideas that might prove effective in other domains as well (e.g., to deal with the path explosion problem) or significantly advanced the state of the art in a specific field.
The works we are about to discuss are drawn from four domains: software testing, program understanding, bug exploitation, and authentication bypass. Other fields that have seen uses of symbolic execution, such as automatic filter generation (e.g., \cite{BND-SP06,BOUNCER-SOSP07}) and code analysis (e.g., \cite{HMH-VSTTE12,BCP-USENIXSEC17}), are not covered here. Also, we do not address techniques tailored to programs with concurrent threads (e.g., \cite{BGC-OOPSLA14,GKW-ESEC15}) or floating-point arithmetic (e.g., \cite{RPW-SIGSOFT15,LSC-ASE17}).}
\subsection{Software Testing}
\label{ss:bug-detection}
Software testing strategies typically attempt to execute a program with the intent of finding bugs. As manual test input generation is an error-prone and usually non-exhaustive process, automated testing techniques have drawn a lot of attention over the years. Random testing techniques such as fuzzing are cheap in terms of run-time overhead, but fail to achieve a wide exploration of the program's state space. Symbolic and concolic execution techniques, on the other hand, achieve a more exhaustive exploration, but become expensive as the length of the execution grows: for this reason, they usually reveal only shallow bugs.
\cite{RK-ICSE07} proposes {\em hybrid concolic testing} for test input generation, which combines random search and concolic execution to achieve both deep program states and wide exploration. The two techniques are interleaved: in particular, when random testing saturates (i.e., it is unable to hit new code coverage points after a number of steps), concolic execution is used to mutate the current program state by performing a bounded depth-first search for an uncovered coverage point. For a fixed time budget, the technique outperforms both random and concolic testing in terms of branch coverage. The intuition behind this approach is that many programs show behaviors where a state can be easily reached through random testing, but then a precise sequence of events -- identifiable by a symbolic engine -- is required to hit a specific coverage point.
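As a rough illustration of this interleaving, the following Python sketch outlines the hybrid loop; \texttt{random\_step}, \texttt{concolic\_bounded\_dfs}, and \texttt{coverage} are hypothetical placeholders for the engine's primitives, not an actual API:
\begin{verbatim}
def hybrid_concolic_test(program, budget, saturation_limit):
    state = program.initial_state()   # concrete program state
    covered = set()                   # coverage points hit so far
    while budget.remaining():
        # Phase 1: cheap random testing from the current state.
        idle = 0
        while idle < saturation_limit:
            state = random_step(program, state)
            new = coverage(state) - covered
            idle = 0 if new else idle + 1
            covered |= new
        # Phase 2: random testing saturated; run a bounded DFS with
        # the concolic engine to reach an uncovered coverage point.
        state = concolic_bounded_dfs(program, state, covered)
        if state is None:
            break   # no uncovered point reachable within the bound
    return covered
\end{verbatim}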
\cite{DRILLER-NDSS16} refines this idea by devising Driller, a vulnerability excavation tool based on {\sc Angr}~\cite{ANGR-SSP16} that interleaves fuzzing and concolic execution to discover memory corruption vulnerabilities. The authors remark that user inputs can be categorized as {\em general} input, which has a wide range of valid values, and {\em specific} input; a check for particular values of a specific input splits an application into {\em compartments}. Driller offloads the majority of unique path discovery to a fuzzing engine, and relies on concolic execution to move across compartments. During the fuzzing phase, Driller marks a number of inputs as interesting (for instance, when an input was the first to trigger some state transition) and, once it gets stuck in the exploration, it passes the set of such paths to a concolic engine, which preconstrains the program state to ensure consistency with the results of the native execution. On the dataset used for the DARPA Cyber Grand Challenge qualifying event, Driller identified crashing inputs in 77 applications, including both the 68 applications for which fuzzing alone succeeded and the 16 for which symbolic execution alone succeeded. For 6 applications, Driller was the only technique to detect a vulnerability.
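To make the preconstraining step more concrete, the following Python sketch (using the \texttt{z3-solver} package) pins the symbolic input to the bytes of an interesting input found by the fuzzer, and then relaxes the constraint on the byte checked by an uncovered compartment guard; the input bytes and the guard condition are invented for illustration:
\begin{verbatim}
from z3 import BitVec, Solver, sat

seed = b"HDR\x00"   # hypothetical interesting input from the fuzzer
sym = [BitVec('in_%d' % i, 8) for i in range(len(seed))]
pre = [b == v for b, v in zip(sym, seed)]

s = Solver()
s.add(pre)   # replay: symbolic state consistent with the native run

# Suppose the fuzzer is stuck at a guard requiring in_3 == 0x7f.
# Drop the preconstraint on that byte and add the guard instead.
s = Solver()
s.add(pre[:3])          # keep the discovered prefix fixed
s.add(sym[3] == 0x7f)   # take the uncovered compartment branch
if s.check() == sat:
    m = s.model()
    new_input = bytes(m[b].as_long() for b in sym)
\end{verbatim}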
\smallskip
Maintenance of large and complex applications is a very hard task. Fixing bugs can sometimes introduce new and unexpected issues in the software, which in turn may require several hours or even weeks to be detected and properly addressed by the developers. \cite{QRL-TOSEM12} tackles the problem of identifying the root cause of failures during regression testing. Given a program $P$ and a newer revision of the program $P'$, if a testing input $t$ generates a failure in $P'$ but not in $P$, then symbolic execution is used to track the path constraints $\pi$ and $\pi'$ when executing $P$ and $P'$ on the failing input $t$, respectively. Using an SMT solver, a new input $t'$ is generated by solving the formula $\pi \wedge \neg\pi'$. If $t'$ exists (i.e., the formula is satisfiable), then $P'$ has one or more {\em deviations} in the control flow graph with respect to $P$ that can be the root cause of the failure. By carefully tracking branch conditions during symbolic execution, the authors of \cite{QRL-TOSEM12} are also able to pinpoint which branches are responsible for these deviations. If $\pi \wedge \neg\pi'$ is unsatisfiable, the symmetric formula $\neg\pi \wedge \pi'$ is evaluated and analogous actions are taken to detect possible branch conditions that may have led to the failure. If $\neg\pi \wedge \pi'$ is also unsatisfiable, the root cause of the problem cannot be determined.
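The deviation check can be illustrated with a small example based on the \texttt{z3-solver} package; the path constraints below are invented: assume $P$ takes the branch $x > 10$ on the failing input, while $P'$ additionally takes $x < 20$:
\begin{verbatim}
from z3 import Int, Solver, And, Not, sat

x = Int('x')
pi     = x > 10                # path constraint collected on P
pi_new = And(x > 10, x < 20)   # path constraint collected on P'

s = Solver()
s.add(And(pi, Not(pi_new)))    # solve pi /\ not pi'
if s.check() == sat:
    print("deviation witness:", s.model()[x])   # e.g., x = 20
else:
    s = Solver()               # symmetric check: not pi /\ pi'
    s.add(And(Not(pi), pi_new))
    if s.check() != sat:
        print("root cause cannot be determined")
\end{verbatim}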
Another interesting work that targets the problem of software regressions through the use of symbolic execution is~\cite{BOR-ICSE13}. The work introduces an approach called {\em partition-based regression verification} that combines the advantages of both regression verification (RV) and regression testing (RT). Indeed, RV is a very powerful technique for identifying regressions, but it hardly scales to large programs due to the difficulty of proving behavioral equivalence between the original and the modified program. On the other hand, RT checks a modified program for regressions by testing selected concrete sample inputs, making it more scalable but providing limited verification guarantees. The main intuition behind partition-based regression verification is the identification of {\em differential partitions}. Each differential partition can be seen as a subset of the input space for which the two program versions -- given the same path constraints -- either produce the same output ({\em equivalence-revealing partition}) or produce different outputs ({\em difference-revealing partition}). For each partition, a test case is generated and added to the regression test suite, which can later be used by a developer for classical RT. Since differential partitions are derived by exploiting symbolic execution, this approach suffers from the common limitations of the technique. However, if the exploration is interrupted (e.g., due to excessive time or memory usage), partition-based regression verification can still provide guarantees over the subset of the input space covered so far by the detected partitions.
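Classifying a differential partition boils down to an equivalence query under the shared path constraint. The following is a minimal sketch with the \texttt{z3-solver} package, where the path constraint and the two symbolic outputs are invented for illustration:
\begin{verbatim}
from z3 import Int, Solver, And, unsat

x = Int('x')
path    = And(x >= 0, x < 100)   # path constraint shared by P and P'
out_old = x * 2                  # symbolic output of P on this path
out_new = x + x                  # symbolic output of P' on this path

s = Solver()
s.add(And(path, out_old != out_new))
if s.check() == unsat:
    print("equivalence-revealing partition")
else:
    print("difference-revealing partition, witness:", s.model())
\end{verbatim}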
\revedit{
Directed incremental symbolic execution (DiSE) is typically employed in regression testing. As pointed out in the main article, its strength lies in applying static analyses in synergy with symbolic execution, directing the exploration only to the code portions affected by changes. \cite{BPR-SPIN13} uses DiSE to generate summaries of the behaviors affected by differences, and proves behavioral equivalence of two program versions by comparing the affected behaviors only. Their approach is sound and complete for sequential programs under a given depth bound for the symbolic exploration.}
\smallskip
Static data flow analysis tools can significantly help developers track malicious data leaks in software applications. Unfortunately, they often report alleged bugs that can be dismissed as false positives only after manual inspection. To mitigate this issue,~\cite{ARH-SOAP15} proposes TASMAN, a system that, after performing data-flow analysis to track information leaks, uses symbolic backward execution to test each reported bug. Starting from a leaking statement, TASMAN explores the code backwards, pruning any path that can be proved infeasible. If all the paths starting at the leaking statement are discarded by TASMAN, the bug is deemed a false positive.
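The pruning step can be sketched as a satisfiability check on the branch conditions collected while walking backwards from the leaking statement; in the following Python fragment (using the \texttt{z3-solver} package) the candidate paths are invented for illustration:
\begin{verbatim}
from z3 import Int, Solver, And, unsat

x = Int('x')
# Branch conditions collected backwards from the alleged leak:
candidate_paths = [
    [x > 10, x < 5],   # contradictory: pruned as infeasible
    [x > 10, x > 0],   # feasible: the leak may be reachable
]

feasible = []
for conds in candidate_paths:
    s = Solver()
    s.add(And(*conds))
    if s.check() != unsat:
        feasible.append(conds)

if not feasible:
    print("all paths pruned: the reported leak is a false positive")
\end{verbatim}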
\subsection{Program Understanding}
While symbolic execution is largely employed in testing activities, over the last few years several works (e.g., \cite{GDV-ISSTA12,FPV-ICSE13,CLL-ICSE16}) have shown how it can be valuable also for program understanding activities.
\cite{GDV-ISSTA12} introduces {\em probabilistic symbolic execution}, an approach that makes it possible to compute the probability of executing different code portions of a program. This is achieved by exploiting model counting techniques, such as the {\tt LattE}~\cite{LHT-JSC04} toolset, to determine the number of solutions for the different path constraints given by the alternative execution paths of a program.
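As a toy example of the underlying computation: over a finite input domain, the probability of a path is the number of inputs satisfying its path constraint divided by the size of the domain. The following Python fragment brute-forces the count for an invented constraint; actual engines delegate this step to a model counter such as {\tt LattE}:
\begin{verbatim}
DOMAIN = range(256)   # a single 8-bit symbolic input

def path_probability(constraint):
    count = sum(1 for x in DOMAIN if constraint(x))
    return count / len(DOMAIN)

# Path constraint of the branch: if (x < 10 and x % 2 == 0) ...
print(path_probability(lambda x: x < 10 and x % 2 == 0))  # 5/256
\end{verbatim}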
The work by~\cite{FPV-ICSE13} takes a step further by using probabilistic symbolic execution to perform software reliability analysis. Reliability is computed as the probability of executing paths that have been labeled as successful given a usage profile, which intuitively can be seen as a distribution over the input space capturing the program's successfully accomplished external interactions (with the user and with external resources). Since in general the termination of symbolic execution cannot be guaranteed in the presence of loops, the proposed technique resorts to bounded exploration. Nonetheless, the authors define a metric for evaluating the confidence of their reliability estimation, allowing a developer to increase the bounds in order to improve the confidence value.
Of a different flavor is the work by~\cite{CLL-ICSE16}, which uses probabilistic symbolic execution to conduct performance analysis. Based on usage profiles and on path execution probabilities, paths are classified into two types: {\em low-probability} and {\em high-probability}. Initially, high-probability paths are explored in a way that maximizes path diversity, generating a first set of test inputs. In a second phase, low-probability paths are analyzed using symbolic execution, generating a second set of test inputs that should expose executions characterized by the best and by the worst execution times. Finally, the program is executed using the test inputs generated during the two phases, and its running time is measured to generate performance distributions.
Another interesting application of symbolic execution to program understanding is presented in~\cite{PPM-CSF18}. The technique exploits model counting and symbolic execution for computing quantitative bounds on the amount of information that can be leaked by a program through side-channel attacks.
\subsection{Bug Exploitation}
\label{ss:bug-exploitation}
Bugs are an inevitable consequence of the human factors involved in software development, and they are everywhere. Those that can be exploited by an attacker should normally be fixed first: systems that automatically and effectively identify them are thus very valuable.
{\sc AEG}~\cite{AEG-NDSS11} employs preconditioned symbolic execution to analyze a potentially buggy program in source form and look for bugs amenable to stack smashing or return-into-libc exploits~\cite{PB-SSP04}, which are popular control hijack attack techniques. The tool augments path constraints with exploitability constraints and queries a constraint solver, generating a concrete exploit when the constraints are satisfiable. The authors devise the {\em buggy-path-first} and {\em loop-exhaustion} strategies (Table~\ref{tab:heuristics}) to prioritize paths in the search. On a suite of 14 Linux applications, {\sc AEG} discovered 16 vulnerabilities, 2 of which were previously unknown, and constructed control hijack exploits for them.
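The augmentation of path constraints with exploitability constraints can be sketched with the \texttt{z3-solver} package; the buffer layout, offsets, and target address below are entirely hypothetical (a 16-byte buffer whose overrun clobbers a 4-byte saved return address):
\begin{verbatim}
from z3 import BitVec, Solver, Concat, sat

inp = [BitVec('in_%d' % i, 8) for i in range(20)]
s = Solver()
# Path constraint: e.g., a magic byte guards the vulnerable copy.
s.add(inp[0] == 0x42)
# Exploitability constraint: bytes 16..19 overwrite the saved return
# address, which must point to the (hypothetical) shellcode location.
ret = Concat(inp[19], inp[18], inp[17], inp[16])  # little-endian
s.add(ret == 0xdeadbeef)

if s.check() == sat:
    m = s.model()
    exploit = bytes(m[b].as_long() if m[b] is not None else 0
                    for b in inp)
\end{verbatim}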
{\sc Mayhem}~\cite{MAYHEM-SP12} takes another step forward by presenting the first system able to identify end-to-end exploitable bugs in binary programs. It adopts a hybrid execution model based on checkpoints and two components: a concrete executor that injects taint-analysis instrumentation in the code, and a symbolic executor that takes over when a tainted branch or jump instruction is met. Exploitability constraints for symbolic instruction pointers and format strings are generated, targeting a wide range of exploits, e.g., SEH-based and jump-to-register ones. Three path selection heuristics help prioritize paths that are most likely to contain vulnerabilities (e.g., those containing symbolic memory accesses or instruction pointers). A virtualization layer intercepts and emulates all the system calls to the host OS, while preconditioned symbolic execution can be used to reduce the size of the search space. Also, restricting symbolic execution to tainted basic blocks only yields very good speedups in this setting, as in the reported experiments more than $95\%$ of the processed instructions were not tainted. {\sc Mayhem} was able to find exploitable vulnerabilities in the 29 Linux and Windows applications considered in the evaluation, 2 of which were previously undocumented. Although the goal of {\sc Mayhem} is to reveal exploitable bugs, the simple exploits it generates can likely be transformed in an automated fashion to work in the presence of classical OS defenses such as data execution prevention and address space layout randomization~\cite{Q-SEC11}.
\vspace{-1mm}
\subsection{Authentication Bypass}
\label{ss:auth-bypass}
Software backdoors are a method of bypassing authentication in an algorithm, a software product, or even in a full computer system. Although sometimes these software flaws are injected by external attackers using subtle tricks such as compiler tampering~\cite{KRS-TR74}, there are reported cases of backdoors that have been surreptitiously installed by the hardware and/or software manufacturers~\cite{CZF-USEC14}, or even by governments~\cite{NSA-BACKDOOR}.
Several works (e.g., \cite{DMR-USEC13,ZBF-NDSS14,FIRMALICE-NDSS15}) have exploited symbolic execution for analyzing the behavior of binary firmware. Indeed, an advantage of this technique is that it can be used even in environments, such as embedded systems, where the documentation and source code publicly released by the manufacturer are typically very limited, if available at all. For instance,~\cite{FIRMALICE-NDSS15} proposes Firmalice, a binary analysis framework based on {\sc Angr}~\cite{ANGR-SSP16} that can be effectively used to identify authentication bypass flaws in firmware running on devices such as routers and printers. Given a user-provided description of a privileged operation in the device, Firmalice identifies a set of program points that, if executed, force the privileged operation to be performed. The program slice that involves the privileged program points is then symbolically analyzed using {\sc Angr}. If any such point can be reached by the engine, a set of concrete inputs is generated using an SMT solver. These values can then be used to effectively bypass authentication inside the device. On three commercially available devices, Firmalice could detect vulnerabilities in two of them, and determine that a backdoor in the third firmware is not remotely exploitable.