Software diversity is a research field about the comprehension and engineering of diversity in the context of software.

Areas

The different areas of software diversity are discussed in surveys on diversity for fault tolerance or for security. A recent survey emphasizes the most recent advances in the field. The main areas are:
* design diversity, n-version programming, and data diversity for fault tolerance
* randomization
* software variability
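As an illustration of the n-version programming area listed above, the following is a minimal sketch of majority voting over independently written versions of the same function. The three `sum_*` functions and the `majority_vote` helper are hypothetical names introduced only for this example; real n-version systems use considerably more elaborate voting and recovery logic.

```python
from collections import Counter
from typing import Callable, Sequence


def sum_v1(xs: Sequence[int]) -> int:
    # Version 1: relies on the built-in sum.
    return sum(xs)


def sum_v2(xs: Sequence[int]) -> int:
    # Version 2: independent implementation with an explicit loop.
    total = 0
    for x in xs:
        total += x
    return total


def sum_faulty(xs: Sequence[int]) -> int:
    # Version 3: deliberately buggy, it ignores the last element.
    return sum(xs[:-1])


def majority_vote(versions: Sequence[Callable[[Sequence[int]], int]],
                  xs: Sequence[int]) -> int:
    """Run every version and return the result a strict majority agrees on."""
    results = []
    for version in versions:
        try:
            results.append(version(xs))
        except Exception:
            # A crashing version simply does not get a vote.
            continue
    if not results:
        raise RuntimeError("all versions failed")
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(versions):
        raise RuntimeError("no majority among versions")
    return value


if __name__ == "__main__":
    print(majority_vote([sum_v1, sum_v2, sum_faulty], [1, 2, 3, 6]))
```

In this sketch the faulty version is outvoted by the two correct ones. With only two versions that disagree, no strict majority exists and the helper raises an error rather than guessing.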

Domains

Software can be diversified in most domains:
* in firmware of embedded systems and sensors
* in internet applications
* in mobile applications
* in browser applications, including those using WebAssembly

Techniques

Code transformations

It is possible to amplify software diversity through automated transformation processes that create synthetic diversity. A "multicompiler" is a compiler that embeds a diversification engine. A multi-variant execution environment (MVEE) is responsible for selecting the variant to execute and comparing the outputs.

Fred Cohen was among the very early promoters of such an approach. He proposed a series of rewriting and code reordering transformations that aim at producing massive quantities of different versions of operating system functions. These ideas have been developed over the years and have led to the construction of integrated obfuscation schemes to protect key functions in large software systems.

Another approach to increasing software diversity for protection consists in adding randomness to certain core processes, such as memory loading. Randomness implies that all versions of the same program run differently from each other, which in turn creates a diversity of program behaviors. This idea was initially proposed and experimented with by Stephanie Forrest and her colleagues.

Recent work on automatic software diversity explores different forms of program transformations that slightly vary the behavior of programs. The goal is to evolve one program into a population of diverse programs that all provide similar services to users, but with different code. This diversity of code enhances the protection of users against a single attack that could otherwise crash all programs at the same time. Transformation operators include:
* code layout randomization: reorder functions in code
* globals layout randomization: reorder and pad globals
* stack variable randomization: reorder variables in each stack frame
* heap layout randomization

As exploring the space of diverse programs is computationally expensive, finding efficient strategies to conduct this exploration is important. To do so, recent work studies plastic regions in software code: plastic regions are those parts of the code that can more easily be changed without disrupting the functionality provided by the software. These regions can be specifically targeted by automatic code transformation to create artificial diversity in existing software. Turning the search for software diversity into a constraint satisfaction problem is another approach, which explores trade-offs between the number of program variants and the code size of these variants. In a context where code is automatically generated from a formal specification, it is possible to adapt the code generator so that it generates software diversity in the form of multiple versions of the source code that all conform to the specification.
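The code layout and globals layout randomization operators mentioned in this section can be sketched on a toy model of a program. The example below assumes a program is described only by a list of function names and a mapping from global variable names to sizes in bytes; `randomize_code_layout`, `randomize_globals_layout`, `make_variant`, and the `max_pad` parameter are hypothetical names chosen for illustration and do not correspond to any particular multicompiler.

```python
import random
from typing import Dict, List, Tuple


def randomize_code_layout(functions: List[str], rng: random.Random) -> List[str]:
    """Code layout randomization: emit the functions in a shuffled order."""
    order = list(functions)
    rng.shuffle(order)
    return order


def randomize_globals_layout(sizes: Dict[str, int], rng: random.Random,
                             max_pad: int = 16) -> Dict[str, int]:
    """Globals layout randomization: reorder globals and pad between them.

    Returns a mapping from each global's name to its byte offset.
    """
    names = list(sizes)
    rng.shuffle(names)
    offsets: Dict[str, int] = {}
    cursor = 0
    for name in names:
        cursor += rng.randrange(max_pad + 1)  # 0..max_pad bytes of padding
        offsets[name] = cursor
        cursor += sizes[name]
    return offsets


def make_variant(seed: int, functions: List[str],
                 sizes: Dict[str, int]) -> Tuple[List[str], Dict[str, int]]:
    """Derive one program variant; the seed makes the variant reproducible."""
    rng = random.Random(seed)
    return randomize_code_layout(functions, rng), randomize_globals_layout(sizes, rng)


if __name__ == "__main__":
    funcs = ["init", "parse", "eval", "emit"]
    globs = {"counter": 4, "buffer": 64, "flags": 1}
    for seed in range(3):
        print(seed, make_variant(seed, funcs, globs))
```

Each seed yields a variant with the same functions and globals but a different layout, so an attack that depends on fixed addresses or offsets in one variant is less likely to transfer to another. Seeding the generator is a design choice of this sketch that keeps every variant reproducible for debugging.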

Natural software diversity

Some functionalities are available in multiple interchangeable implementations; this has been called natural software diversity. For example, a diversity of libraries that implement similar features naturally emerges in software repositories. This natural diversity can be exploited: for example, it has been shown to be valuable for increasing security in cloud systems. Natural diversity can also be used to combine the strengths of different tools: for example, when many decompilers are combined, the resulting meta-decompiler is more effective (Harrand, Soto-Valero, Monperrus, and Baudry, "Java decompiler diversity and its application to meta-decompilation", Journal of Systems and Software, vol. 168, 2020, 110645, doi:10.1016/j.jss.2020.110645, arXiv:2005.11315).
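The idea of combining interchangeable tools can be illustrated with a small sketch. It is not the method of the cited work: it assumes a simple first-success strategy, in which each input unit is handed to several hypothetical tools in turn and the first usable result is kept. `tool_a`, `tool_b`, and `meta_tool` are invented stand-ins, and each toy tool fails on a different kind of input.

```python
from typing import Callable, Dict, Iterable, Optional, Sequence

# A tool maps an input unit to a result, or None when it fails on that unit.
Tool = Callable[[str], Optional[str]]


def tool_a(unit: str) -> Optional[str]:
    # Hypothetical tool that cannot handle units mentioning "lambda".
    return None if "lambda" in unit else f"A({unit})"


def tool_b(unit: str) -> Optional[str]:
    # Hypothetical tool that cannot handle units mentioning "generic".
    return None if "generic" in unit else f"B({unit})"


def meta_tool(tools: Sequence[Tool], units: Iterable[str]) -> Dict[str, Optional[str]]:
    """For each unit, keep the result of the first tool that succeeds."""
    merged: Dict[str, Optional[str]] = {}
    for unit in units:
        merged[unit] = None
        for tool in tools:
            result = tool(unit)
            if result is not None:
                merged[unit] = result
                break
    return merged


if __name__ == "__main__":
    units = ["plain", "lambda_user", "generic_box"]
    print(meta_tool([tool_a, tool_b], units))
```

On these three toy inputs each individual tool fails on one unit, while the combination handles all of them, because the tools fail on different inputs.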
