In computer programming and software design, code refactoring is the process of restructuring existing source code (changing the ''factoring'') without changing its external behavior. Refactoring is intended to improve the design, structure, and/or implementation of the software (its ''non-functional'' attributes), while preserving its functionality. Potential advantages of refactoring may include improved code readability and reduced complexity; these can improve the source code's maintainability and create a simpler, cleaner, or more expressive internal architecture or object model to improve extensibility. Another potential goal for refactoring is improved performance; software engineers face an ongoing challenge to write programs that perform faster or use less memory.
Typically, refactoring applies a series of standardized basic ''micro-refactorings'', each of which is (usually) a tiny change in a computer program's source code that either preserves the behavior of the software, or at least does not modify its conformance to functional requirements. Many development environments provide automated support for performing the mechanical aspects of these basic refactorings. If done well, code refactoring may help software developers discover and fix hidden or dormant bugs or vulnerabilities in the system by simplifying the underlying logic and eliminating unnecessary levels of complexity. If done poorly, it may fail the requirement that external functionality not be changed, and may thus introduce new bugs.
Motivation
Refactoring is usually motivated by noticing a code smell. For example, the method at hand may be very long, or it may be a near duplicate of another nearby method. Once recognized, such problems can be addressed by ''refactoring'' the source code, or transforming it into a new form that behaves the same as before but that no longer "smells".
For a long routine, one or more smaller subroutines can be extracted; or for duplicate routines, the duplication can be removed and replaced with one shared function. Failure to perform refactoring can result in accumulating technical debt; on the other hand, refactoring is one of the primary means of repaying technical debt.
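Replacing duplicate routines with one shared function can be illustrated with a minimal sketch. The function names, the discount percentages, and the use of integer cents are all hypothetical choices made for illustration; they do not come from any particular codebase.

```python
# Before: two near-duplicate routines, differing only in the discount applied.
def books_total_cents(prices_cents):
    total = 0
    for price in prices_cents:
        total += price
    return total * 95 // 100  # hypothetical 5% discount


def games_total_cents(prices_cents):
    total = 0
    for price in prices_cents:
        total += price
    return total * 90 // 100  # hypothetical 10% discount


# After: the duplicated logic lives in one shared function, and the
# two original routines become thin wrappers with unchanged behavior.
def discounted_total_cents(prices_cents, discount_percent):
    """Sum the prices and apply an integer percentage discount."""
    return sum(prices_cents) * (100 - discount_percent) // 100


def books_total_cents_refactored(prices_cents):
    return discounted_total_cents(prices_cents, 5)


def games_total_cents_refactored(prices_cents):
    return discounted_total_cents(prices_cents, 10)
```

Callers see the same results before and after the change; only the internal structure differs, which is what distinguishes this from a functional change.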
Benefits
There are two general categories of benefits to the activity of refactoring.
# Maintainability. It is easier to fix bugs because the source code is easy to read and the intent of its author is easy to grasp. This might be achieved by reducing large monolithic routines into a set of individually concise, well-named, single-purpose methods. It might be achieved by moving a method to a more appropriate class, or by removing misleading comments.
# Extensibility. It is easier to extend the capabilities of the application if it uses recognizable design patterns, and it provides some flexibility where none before may have existed.
Performance engineering can remove inefficiencies in programs, known as software bloat, arising from traditional software-development strategies that aim to minimize an application's development time rather than the time it takes to run. Performance engineering can also tailor software to the hardware on which it runs, for example, to take advantage of parallel processors and vector units.
Timing and responsibility
There are two possible times for refactoring.
# Preventive refactoring – the original developer of the code makes the code more robust when it is still free of smells to prevent the formation of smells in the future.
# Corrective refactoring – a subsequent developer performs refactoring to correct code smells as they occur.
A method that balances preventive and corrective refactoring is "shared responsibility for refactoring". This approach splits the refactoring action into two stages and two roles. The original developer of the code just prepares the code for refactoring, and when the code smells form, a subsequent developer carries out the actual refactoring action.
Challenges
Refactoring requires extracting software system structure, data models, and intra-application dependencies to recover knowledge of an existing software system.
Team turnover implies missing or inaccurate knowledge of the current state of a system and of the design decisions made by departing developers. Further code refactoring activities may require additional effort to regain this knowledge.
Refactoring activities generate architectural modifications that can deteriorate the structural architecture of a software system. Such deterioration affects architectural properties such as maintainability and comprehensibility, which can lead to a complete redevelopment of software systems.
Code refactoring activities are secured with software intelligence when they use tools and techniques that provide data about algorithms and sequences of code execution. Providing a comprehensible format for the inner state of software system structure, data models, and intra-component dependencies is critical to forming a high-level understanding, and then refined views, of what needs to be modified and how.
Testing
Automatic unit tests should be set up before refactoring to ensure routines still behave as expected. Unit tests can bring stability to even large refactors when performed with a single atomic commit. A common strategy to allow safe and atomic refactors spanning multiple projects is to store all projects in a single repository, known as a monorepo.
With unit testing in place, refactoring is then an iterative cycle of making a small program transformation, testing it to ensure correctness, and making another small transformation. If at any point a test fails, the last small change is undone and repeated in a different way. Through many small steps the program moves from where it was to where you want it to be. For this very iterative process to be practical, the tests must run very quickly, or the programmer would have to spend a large fraction of their time waiting for the tests to finish. Proponents of extreme programming and other agile software development methodologies describe this activity as an integral part of the software development cycle.
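The cycle described above can be sketched with Python's standard `unittest` module. This is only an illustration under stated assumptions: `word_lengths`, `word_lengths_refactored`, and `run_checks` are hypothetical names, and in practice the refactored code would replace the original rather than sit beside it.

```python
import unittest


def word_lengths(text):
    """Original implementation: explicit loop and accumulator."""
    result = []
    for word in text.split():
        result.append(len(word))
    return result


def word_lengths_refactored(text):
    """One small transformation: the loop becomes a comprehension."""
    return [len(word) for word in text.split()]


class WordLengthsTest(unittest.TestCase):
    # The same fast tests pin the behavior of both versions.
    implementations = (word_lengths, word_lengths_refactored)

    def test_typical_sentence(self):
        for impl in self.implementations:
            with self.subTest(impl=impl.__name__):
                self.assertEqual(impl("refactor in small steps"), [8, 2, 5, 5])

    def test_empty_text(self):
        for impl in self.implementations:
            with self.subTest(impl=impl.__name__):
                self.assertEqual(impl(""), [])


def run_checks():
    """Run the suite and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(WordLengthsTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

If `run_checks()` returned `False` after a transformation, that change would be undone and attempted differently before moving on to the next step.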
Techniques
Here are some examples of micro-refactorings; some of these may only apply to certain languages or language types. A longer list can be found in Martin Fowler's refactoring book and website, although those catalogs concern object-oriented programming only. Many development environments provide automated support for these micro-refactorings. For instance, a programmer could click on the name of a variable and then select the "Encapsulate field" refactoring from a context menu. The IDE would then prompt for additional details, typically with sensible defaults and a preview of the code changes. After confirmation by the programmer it would carry out the required changes throughout the code.
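As a rough illustration of what an "Encapsulate field" refactoring produces, the following sketch shows a class before and after the change. The class names are hypothetical, and the use of a Python `property` is one idiomatic choice rather than the output of any specific IDE.

```python
class AccountBefore:
    """Before: callers read and write the field directly."""

    def __init__(self, balance):
        self.balance = balance


class Account:
    """After: the field is private by convention and reached via accessors."""

    def __init__(self, balance):
        self._balance = balance

    @property
    def balance(self):
        # Getter: all reads of the field now pass through here.
        return self._balance

    @balance.setter
    def balance(self, value):
        # Setter: all writes pass through here, so checks or logging
        # could later be added in one place.
        self._balance = value
```

Because the property keeps the attribute name `balance`, existing callers such as `account.balance = 50` continue to work unchanged, which keeps the external behavior intact.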
Static analysis
Static program analysis (called "linting" when performed on less strict interpreted languages) detects problems in a valid but substandard program.
* Program dependence graph - explicit representation of data and control dependencies
* System dependence graph - representation of procedure calls between PDGs
* Cyclomatic complexity analysis
* Software intelligence - reverse engineers the initial state to understand existing intra-application dependencies
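One of the analyses listed above, cyclomatic complexity, can be approximated with Python's standard `ast` module. The sketch below is a simplified estimate, not a full implementation of the metric: it counts one path plus one for each branching construct it recognizes, and the function name and sample source are hypothetical.

```python
import ast

# Node types treated as decision points in this simplified estimate.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def approximate_cyclomatic_complexity(source):
    """Return 1 plus the number of decision points found in the source."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra short-circuit branches.
            complexity += len(node.values) - 1
    return complexity


SAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
'''

# Two ifs, one for loop, and one "and" give 1 + 4 = 5.
print(approximate_cyclomatic_complexity(SAMPLE))
```

A routine with a high score under such a measure is a common candidate for the extraction refactorings described in the next section.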
Transformations
Transformations modify the syntactic representation of a program. Some modifications alter the semantics or structure of the program in a way that improves its flexibility or robustness. Such modifications require knowledge of the problem domain and intended logic, and thus are infeasible to automate. Other modifications make the program easier to read and modify but do not alter its underlying logic; these transformations can be automated.
* Techniques that allow for more abstraction
** Encapsulate field – force code to access the field with getter and setter methods
** Generalize type – create more general types to allow for more code sharing
** Replace type-checking code with state/strategy
** Replace conditional with polymorphism
* Techniques for breaking code apart into more logical pieces
** Componentization breaks code down into reusable semantic units that present clear, well-defined, simple-to-use interfaces.
** Extract class moves part of the code from an existing class into a new class.
** Extract method turns part of a larger method into a new method. Breaking code down into smaller pieces makes it easier to understand. This also applies to functions.
* Techniques for improving names and location of code
** Move method or move field – move to a more appropriate class or source file
** Rename method or rename field – changing the name into a new one that better reveals its purpose
** Pull up – in object-oriented programming (OOP), move to a superclass
** Push down – in OOP, move to a subclass
* Automatic clone detection
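One of the abstraction techniques listed above, "Replace conditional with polymorphism", can be sketched as follows. The shape example, the dictionary layout used by the original function, and all class names are hypothetical and chosen only for illustration.

```python
import math


# Before: one function branches on a type tag.
def area_before(shape):
    if shape["kind"] == "circle":
        return math.pi * shape["radius"] ** 2
    elif shape["kind"] == "rectangle":
        return shape["width"] * shape["height"]
    raise ValueError("unknown shape kind: " + str(shape["kind"]))


# After: each variant is a class that supplies its own behavior,
# so the conditional on the type tag disappears.
class Shape:
    def area(self):
        raise NotImplementedError


class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height
```

Adding a new shape now means adding a class rather than editing every function that switches on the type tag, which is the extensibility benefit this technique aims at.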
Hardware refactoring
While the term ''refactoring'' originally referred exclusively to refactoring of software code, in recent years code written in hardware description languages has also been refactored. The term ''hardware refactoring'' is used as a shorthand term for refactoring of code in hardware description languages. Since hardware description languages are not considered to be programming languages by most hardware engineers, hardware refactoring is to be considered a separate field from traditional code refactoring.
Automated refactoring of analog hardware descriptions (in VHDL-AMS) has been proposed by Zeng and Huss. In their approach, refactoring preserves the simulated behavior of a hardware design. The non-functional property that improves is that the refactored code can be processed by standard synthesis tools, while the original code cannot. Refactoring of digital hardware description languages, albeit manual refactoring, has also been investigated by Synopsys fellow Mike Keating. His aim is to make complex systems easier to understand, which increases the designers' productivity.
History
The first known use of the term "refactoring" in the published literature was in a September 1990 article by William Opdyke and Ralph Johnson.
Although refactoring code has been done informally for decades, William Griswold's 1991 Ph.D. dissertation is one of the first major academic works on refactoring functional and procedural programs, followed by William Opdyke's 1992 dissertation on the refactoring of object-oriented programs, although all the theory and machinery have long been available as program transformation systems. All of these resources provide a catalog of common methods for refactoring; a refactoring method has a description of how to apply the method and indicators for when you should (or should not) apply the method.
Martin Fowler's book ''Refactoring: Improving the Design of Existing Code'' is the canonical reference.
The terms "factoring" and "factoring out" have been used in this way in the Forth community since at least the early 1980s. Chapter Six of Leo Brodie's book ''Thinking Forth'' (1984) is dedicated to the subject.
In extreme programming, the Extract Method refactoring technique has essentially the same meaning as factoring in Forth: to break down a "word" (or function) into smaller, more easily maintained functions.
Refactorings can also be reconstructed post hoc to produce concise descriptions of complex software changes recorded in software repositories like Git.
Automated code refactoring
Many software editors and IDEs have automated refactoring support. Here is a list of a few of these editors, or so-called refactoring browsers.
* DMS Software Reengineering Toolkit (Implements large-scale refactoring for C, C++, C#, COBOL, Java, PHP and other languages)
* Eclipse based:
** Eclipse (for Java, and to a lesser extent, C++, PHP, Ruby and JavaScript)
** PyDev (for Python)
* Embarcadero Delphi
* IntelliJ based:
** AppCode (for Objective-C, C and C++)
** IntelliJ IDEA (for Java)
** PyCharm (for Python)
** WebStorm (for JavaScript)
** PhpStorm (for PHP)
** Android Studio (for Java and C++)
* JDeveloper (for Java)
* NetBeans (for Java)
* Smalltalk: Most dialects include powerful refactoring tools. Many use the original refactoring browser produced in the early '90s by Ralph Johnson.
* Visual Studio based:
** Visual Studio (for .NET and C++)
** Visual Assist (addon for Visual Studio with refactoring support for C# and C++)
* Wing IDE (for Python)
* Xcode (for C, Objective-C, and Swift)
* Qt Creator (for C++, Objective-C and QML)
See also
* Amelioration pattern
* Code review
* Database refactoring
* Decomposition (computer science)
* Modular programming
* Obfuscated code
* Prefactoring
* Rewrite (programming)
* Separation of concerns
* Software peer review
* Test-driven development
External links
* What Is Refactoring? (c2.com article)
* Martin Fowler's homepage about refactoring