Software archaeology
   HOME

TheInfoList



OR:

Software archaeology or source code archeology is the study of poorly documented or undocumented legacy software implementations, as part of software maintenance. Software archaeology, named by analogy with
archaeology Archaeology or archeology is the scientific study of human activity through the recovery and analysis of material culture. The archaeological record consists of artifacts, architecture, biofacts or ecofacts, sites, and cultural landsca ...
, includes the reverse engineering of software modules, and the application of a variety of tools and processes for extracting and understanding program structure and recovering design information. Software archaeology may reveal dysfunctional team processes which have produced poorly designed or even unused software modules, and in some cases deliberately obfuscatory code may be found. The term has been in use for decades. Software archaeology has continued to be a topic of discussion at more recent software engineering conferences.


Techniques

A workshop on Software Archaeology at the 2001
OOPSLA OOPSLA (Object-Oriented Programming, Systems, Languages & Applications) is an annual ACM research conference. OOPSLA mainly takes place in the United States, while the sister conference of OOPSLA, ECOOP, is typically held in Europe. It is opera ...
(Object-Oriented Programming, Systems, Languages & Applications) conference identified the following software archaeology techniques, some of which are specific to
object-oriented programming Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which can contain data and code. The data is in the form of fields (often known as attributes or ''properties''), and the code is in the form of ...
: *
Scripting language A scripting language or script language is a programming language that is used to manipulate, customize, and automate the facilities of an existing system. Scripting languages are usually interpreted at runtime rather than compiled. A scripting ...
s to build static reports and for filtering diagnostic output *Ongoing documentation in HTML pages or Wikis *Synoptic signature analysis, statistical analysis, and
software visualization Software visualization or software visualisation refers to the visualization of information of and related to software systems—either the architecture of its source code or metrics of their runtime behavior—and their development process by mea ...
tools *Reverse-engineering tools *Operating-system-level tracing via
truss A truss is an assembly of ''members'' such as beams, connected by ''nodes'', that creates a rigid structure. In engineering, a truss is a structure that "consists of two-force members only, where the members are organized so that the assembl ...
or
strace strace is a diagnostic, debugging and instructional userspace utility for Linux. It is used to monitor and tamper with interactions between processes and the Linux kernel, which include system calls, signal deliveries, and changes of proce ...
*Search engines and tools to search for keywords in source files * IDE file browsing *
Unit testing In computer programming, unit testing is a software testing method by which individual units of source code—sets of one or more computer program modules together with associated control data, usage procedures, and operating procedures&md ...
frameworks such as
JUnit JUnit is a unit testing framework for the Java programming language. JUnit has been important in the development of test-driven development, and is one of a family of unit testing frameworks which is collectively known as xUnit that originated ...
and CppUnit *API documentation generation using tools such as
Javadoc Javadoc (originally cased JavaDoc) is a documentation generator created by Sun Microsystems for the Java language (now owned by Oracle Corporation) for generating API documentation in HTML format from Java source code. The HTML format is used for ...
and
doxygen Doxygen ( ) is a documentation generator and static analysis tool for software source trees. When used as a documentation generator, Doxygen extracts information from specially-formatted comments within the code. When used for analysis, Doxyge ...
*
Debugger A debugger or debugging tool is a computer program used to test and debug other programs (the "target" program). The main use of a debugger is to run the target program under controlled conditions that permit the programmer to track its executi ...
s More generally, Andy Hunt and Dave Thomas note the importance of
version control In software engineering, version control (also known as revision control, source control, or source code management) is a class of systems responsible for managing changes to computer programs, documents, large web sites, or other collections o ...
, dependency management, text indexing tools such as GLIMPSE and
SWISH-E SWISH-E stands for ''Simple Web Indexing System for Humans - Enhanced''. It is used to index collections of documents ranging up to one million documents in size and includes import filters for many document types. SWISH-E is based on SWISH, dev ...
, and " rawinga map as you begin exploring." Like true archaeology, software archaeology involves investigative work to understand the thought processes of one's predecessors. At the OOPSLA workshop, Ward Cunningham suggested a synoptic signature analysis technique which gave an overall "feel" for a program by showing only punctuation, such as semicolons and
curly braces A bracket is either of two tall fore- or back-facing punctuation marks commonly used to isolate a segment of text or data from its surroundings. Typically deployed in symmetric pairs, an individual bracket may be identified as a 'left' or 'r ...
. In the same vein, Cunningham has suggested viewing programs in 2 point font in order to understand the overall structure. Another technique identified at the workshop was the use of
aspect-oriented programming In computing, aspect-oriented programming (AOP) is a programming paradigm that aims to increase modularity by allowing the separation of cross-cutting concerns. It does so by adding behavior to existing code (an advice) ''without'' modifying th ...
tools such as
AspectJ AspectJ is an aspect-oriented programming (AOP) extension created at PARC for the Java programming language. It is available in Eclipse Foundation open-source projects, both stand-alone and integrated into Eclipse. AspectJ has become a widely use ...
to systematically introduce tracing code without directly editing the legacy program. Network and temporal analysis techniques can reveal the patterns of collaborative activity by the developers of legacy software, which in turn may shed light on the strengths and weaknesses of the software artifacts produced. Michael Rozlog of
Embarcadero Technologies Embarcadero Technologies, Inc. is an American computer software company that develops, manufactures, licenses, and supports products and services related to software through several product divisions. It was founded in 1993, went public in 2000, ...
has described software archaeology as a six-step process which enables programmers to answer questions such as "What have I just inherited?" and "Where are the scary sections of the code?" These steps, similar to those identified by the OOPSLA workshop, include using visualization to obtain a visual representation of the program's design, using software metrics to look for design and style violations, using
unit testing In computer programming, unit testing is a software testing method by which individual units of source code—sets of one or more computer program modules together with associated control data, usage procedures, and operating procedures&md ...
and profiling to look for bugs and performance bottlenecks, and assembling design information recovered by the process. Software archaeology can also be a service provided to programmers by external consultants. Mitch Rosenberg of InfoVentions.net, Inc. claims that the first law of software archaeology (he calls it code or data archaeology) is:


In popular culture

The profession of ''programmer–archaeologist'' features prominently in
Vernor Vinge Vernor Steffen Vinge (; born October 2, 1944) is an American science fiction author and retired professor. He taught mathematics and computer science at San Diego State University. He is the first wide-scale popularizer of the technological singu ...
's 1999 sci-fi novel '' A Deepness in the Sky.''


See also

* Software architecture recovery * Code refactoring *
Retrocomputing Retrocomputing is the use of older computer hardware and software in modern times. Retrocomputing is usually classed as a hobby and recreation rather than a practical application of technology; enthusiasts often collect rare and valuable hardw ...
* Software brittleness * Software rot *
Software entropy Software entropy is the idea that software eventually rots as it is changed if sufficient care is not taken to maintain coherence with product design and established design principles. The common usage is only tangentially related to entropy as de ...
*
Abandonware Abandonware is a product, typically software, ignored by its owner and manufacturer, and for which no official support is available. Within an intellectual rights contextual background, abandonware is a software (or hardware) sub-case of the ...


References


External links

* * * * {{Software engineering Computer jargon Software maintenance