dd is a shell command for reading, writing and converting file data. Originally developed for Unix, it has been implemented in many other environments including Unix-like operating systems, Windows, Plan 9 and Inferno.

The command can be used for many purposes. For relatively simple copying operations, it tends to be slower than domain-specific alternatives, but it excels at overwriting or truncating a file at any point or seeking in a file. The command supports reading and writing files, and if a driver is available to support file-like access, the command can access devices too. Such access is typically supported on Unix-based systems that provide file-like access to devices (such as storage) and special device files (such as /dev/zero and /dev/random). Therefore, the command can be used for tasks such as backing up the boot sector of a drive, and obtaining random data. The command can also convert data while copying, including byte order swapping and converting between ASCII and EBCDIC text encodings.

dd is sometimes humorously called "Disk Destroyer", due to its drive-erasing capabilities involving typos.


History

In 1974, the command appeared as part of Version 5 Unix. According to Dennis Ritchie, the name is an allusion to the DD statement found in IBM's Job Control Language (JCL), where DD is short for data definition. According to Douglas McIlroy, dd was "originally intended for converting files between the ASCII, little-endian, byte-stream world of DEC computers and the EBCDIC, big-endian, blocked world of IBM", which explains the cultural context of its syntax. Eric S. Raymond believes "the interface design was clearly a prank", due to the command's syntax resembling a JCL statement more than other Unix commands do.

In 1987, the command was specified in the X/Open Portability Guide issue 2. This specification was inherited by IEEE Std 1003.1-2008 (POSIX), which is part of the Single UNIX Specification.

In 1990, David MacKenzie announced GNU fileutils (now part of coreutils), which includes the dd command; it was written by Paul Rubin, David MacKenzie, and Stuart Kemp. Jim Meyering has been its maintainer since 1991.

In 1995, Plan 9 2nd edition was released with a dd command that uses a more traditional command-line option style than the JCL statement style.

Since at least 1999, UnxUtils has provided a native implementation for the Windows platform.


Use

The command-line interface of dd differs significantly from that of most modern shell commands in that an option is formatted as option=value instead of the more typical syntax that denotes an option with a dash prefix, such as -x, -y value, --abc, --def value. By default, dd reads from standard input and writes to standard output, but input and output can be overridden. Option if specifies an input file and option of specifies an output file.

Non-standardized aspects of dd depend on the underlying system or implementation, including:
* Direct memory access
* Signal handling
* End-of-file (EOF) handling; in particular the Windows ports vary: Cygwin uses Ctrl+D (the usual for Unix) and MKS Toolkit uses Ctrl+Z (the usual for Windows)
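
The operand syntax described above can be sketched as follows. This is a minimal illustration rather than a canonical usage: the temporary directory and the file names in.txt and out.txt are assumptions made for the example, and the statistics that dd prints on standard error are discarded.

```shell
# Work in a throwaway directory; the file names are illustrative only.
tmpdir=$(mktemp -d)
printf 'hello\n' > "$tmpdir/in.txt"

# No operands: dd copies standard input to standard output.
piped=$(printf 'hello\n' | dd 2>/dev/null)

# Operands use the name=value form rather than dash-prefixed options.
dd if="$tmpdir/in.txt" of="$tmpdir/out.txt" 2>/dev/null
copied=$(cat "$tmpdir/out.txt")

echo "$piped"
echo "$copied"
rm -r "$tmpdir"
```

Both variables should end up holding the same text, since the two invocations differ only in where the data comes from and goes to.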


Output messages

On completion, dd writes statistics to standard error. The format is standardized in POSIX. The manual page for GNU dd does not describe this format, but the BSD manuals do. Each of the "Records in" and "Records out" lines shows the number of complete blocks transferred, a plus sign, and the number of partial blocks. A partial block occurs, for example, when the physical medium ended before a complete block was read, or when a physical error prevented reading the complete block.

If dd receives a SIGINFO signal while it is running (typically triggered by the user pressing Ctrl+T), it writes intermediate statistics to standard error and continues processing.
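
As a small sketch of the complete-plus-partial record counts, the following copies a 5-byte file with a 4-byte block size and captures the statistics from standard error. The file name five.bin is hypothetical, and LC_ALL=C is set on the assumption that some implementations would otherwise translate the messages.

```shell
tmpdir=$(mktemp -d)
printf 'abcde' > "$tmpdir/five.bin"      # 5 bytes of input

# bs=4 yields one complete 4-byte block and one partial 1-byte block.
# The data goes to /dev/null; only the statistics on stderr are captured.
stats=$(LC_ALL=C dd if="$tmpdir/five.bin" of=/dev/null bs=4 2>&1)

echo "$stats"
rm -r "$tmpdir"
```

The captured text should include the lines "1+1 records in" and "1+1 records out"; GNU dd typically adds a further line with byte count and timing.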


Block size

The command processes data in blocks. The default size is 512 bytes (the POSIX-mandated size and a common legacy size for disk hardware) but can be specified via command-line options. Option bs specifies the size for both input (read) and output (write) operations. Alternatively, option ibs specifies the size for input operations and obs for output operations. Option cbs affects conversion operations. Options count, skip and seek specify a number of blocks: maximum to read, to start reading at offset from the start of the input, and to start writing at offset from the start of the output, respectively.

A block size option value is specified as a whole decimal number of bytes with an optional suffix to indicate a multiplier. POSIX requires suffixes b (blocks) for 512 and k (kibibytes) for 1024, but implementations differ on other suffixes. (Free)BSD uses m for mebibytes, g for gibibytes and so on for larger power of two units. GNU uses M and G and so on for these units and uses kB, MB, and GB for SI units. For example, for GNU dd, bs=16M indicates a size of 16 mebibytes (16777216 bytes) and bs=3kB specifies 3000 bytes.

For POSIX compliance, some implementations interpret the x character as a multiplication operator for both block size and count option values. For example, bs=2x80x18b is interpreted as 2 × 80 × 18 × 512 = 1474560 bytes, the size of a 1440 KiB floppy disk. For implementations that do not support this feature, the POSIX shell arithmetic syntax of bs=$((2*80*18))b may be used.

Block size affects performance. Many small reads and writes are often slower than fewer, larger ones. On the downside, larger blocks require more RAM and can complicate error recovery. When used with a variable block size device such as a tape drive or a network, the block size may determine the tape record size or network packet size, depending on the network protocol.
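
The block arithmetic above can be checked with a short sketch. It computes the floppy size with POSIX shell arithmetic and then uses bs, skip and count to pick bytes out of the middle of a small file; the file name digits.txt is an assumption made for the example.

```shell
tmpdir=$(mktemp -d)
printf '0123456789' > "$tmpdir/digits.txt"

# Shell arithmetic instead of an implementation-specific size suffix:
# 2 x 80 x 18 blocks of 512 bytes, the floppy size from the text.
floppy_bytes=$((2 * 80 * 18 * 512))

# With bs=2, skip=1 starts reading 2 bytes into the input and
# count=2 stops after two blocks, so the bytes "2345" are copied.
middle=$(dd if="$tmpdir/digits.txt" bs=2 skip=1 count=2 2>/dev/null)

echo "$floppy_bytes"
echo "$middle"
rm -r "$tmpdir"
```

Because skip and count are expressed in blocks rather than bytes, changing bs changes which bytes are selected even when the other operands stay the same.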


Examples

The examples below apply to many implementations, but are specifically written for GNU dd. Generally, the only difference between implementations is the syntax of block size values, which can be made portable by using a shell arithmetic expression instead of a size multiplier suffix. For example, instead of bs=64M use bs=$((64*1024*1024)) or bs=$((64<<20)).


Data transfer

The command can duplicate data across files, devices, partitions and volumes, and it can transform data during transfer as specified via option conv. In some cases, data transfer is faster with cat.

To create an ISO disk image from a CD-ROM, DVD or Blu-ray disc:

    blocks=$(isosize -d 2048 /dev/sr0)
    dd if=/dev/sr0 of=isoimage.iso bs=2048 count=$blocks status=progress

To restore a drive from an image file:

    dd if=system.img of=/dev/sdc bs=64M conv=noerror

To create an image of partition sdb2, using a 64 MiB block size:

    dd if=/dev/sdb2 of=partition.image bs=64M conv=noerror

To clone one partition to another:

    dd if=/dev/sda2 of=/dev/sdb2 bs=64M conv=noerror

To clone drive ad0 to ad1, ignoring any errors:

    dd if=/dev/ad0 of=/dev/ad1 bs=64M conv=noerror


In-place modification

The command can modify data in place. For example, this overwrites the first 512 bytes of a file with null bytes:

    dd if=/dev/zero of=path/to/file bs=512 count=1 conv=notrunc

Option conv=notrunc requests that the output file not be truncated. That is, if the output file already exists, the command replaces the specified bytes and leaves the rest of the output file as-is. Without this option, the command would create an output file 512 bytes long.
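
The effect of conv=notrunc can be seen by running the same overwrite on two identical scratch files, once with the option and once without. This is a sketch under the assumption that the hypothetical files kept.bin and cut.bin are regular files in a temporary directory.

```shell
tmpdir=$(mktemp -d)

# Two identical 1024-byte scratch files filled with the letter "x".
dd if=/dev/zero bs=1024 count=1 2>/dev/null | tr '\000' 'x' > "$tmpdir/kept.bin"
cp "$tmpdir/kept.bin" "$tmpdir/cut.bin"

# With conv=notrunc only the first 512 bytes are replaced.
dd if=/dev/zero of="$tmpdir/kept.bin" bs=512 count=1 conv=notrunc 2>/dev/null

# Without it, the output file is truncated before writing.
dd if=/dev/zero of="$tmpdir/cut.bin" bs=512 count=1 2>/dev/null

kept_size=$(wc -c < "$tmpdir/kept.bin" | tr -d ' ')
cut_size=$(wc -c < "$tmpdir/cut.bin" | tr -d ' ')
echo "$kept_size $cut_size"
rm -r "$tmpdir"
```

The first file keeps its original length of 1024 bytes, while the second shrinks to the 512 bytes that were written.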


Master boot record backup and restore

The example above can also be used to back up and restore any region of a device to a file, including a master boot record. To duplicate the first two sectors of a floppy disk:

    dd if=/dev/fd0 of=MBRboot.img bs=512 count=2
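
The restore direction is not shown above, so the following is a hedged sketch of a full backup-and-restore round trip. It uses a regular file, disk.img, as a hypothetical stand-in for a device so that it can be run safely; on a real device the of= operand would name the device node instead, and conv=notrunc would have no effect.

```shell
tmpdir=$(mktemp -d)
disk="$tmpdir/disk.img"       # hypothetical stand-in for a device such as /dev/fd0
backup="$tmpdir/boot.img"

# Build a 4-sector scratch "disk" whose contents are all "A".
dd if=/dev/zero bs=512 count=4 2>/dev/null | tr '\000' 'A' > "$disk"

# Back up the first sector.
dd if="$disk" of="$backup" bs=512 count=1 2>/dev/null

# Damage the first sector, then restore it from the backup.
# conv=notrunc keeps the remaining three sectors of the image file.
dd if=/dev/zero of="$disk" bs=512 count=1 conv=notrunc 2>/dev/null
dd if="$backup" of="$disk" bs=512 count=1 conv=notrunc 2>/dev/null

restored=$(dd if="$disk" bs=1 count=4 2>/dev/null)
disk_size=$(wc -c < "$disk" | tr -d ' ')
echo "$restored $disk_size"
rm -r "$tmpdir"
```

Restoring swaps the if and of operands of the backup command; everything else stays the same.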


Disk wipe

For security reasons, it is sometimes necessary to wipe a discarded device. This can be achieved by a "data transfer" from the Unix special files.
* To write zeros to a disk, use dd if=/dev/zero of=/dev/sda bs=16M.
* To write random data to a disk, use dd if=/dev/urandom of=/dev/sda bs=16M.

When compared to the data modification example above, conversion option conv=notrunc is not required as it has no effect when the output file is a block device. Option bs=16M makes dd read and write 16 mebibytes at a time. For modern systems, an even greater block size may be faster. Note that filling the drive with random data may take longer than zeroing the drive, because the random data must be created by the CPU, while creating zeroes is very fast. On modern hard-disk drives, zeroing the drive will render most data it contains permanently irrecoverable. However, with other kinds of drives such as flash memories, much data may still be recoverable by data remanence.

Modern hard disk drives contain a Secure Erase command designed to permanently and securely erase every accessible and inaccessible portion of a drive. It may also work for some solid-state drives (flash drives). As of 2017, it does not work on USB flash drives nor on Secure Digital flash memories. When available, this is both faster than using dd, and more secure. On Linux machines it is accessible via the hdparm command's secure-erase options. The shred program offers multiple overwrites, as well as more secure deletion of individual files.
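
To try the zero-fill technique without touching a real drive, the following sketch wipes a scratch image file and then checks that no non-zero bytes remain. The file scratch.img is a hypothetical stand-in for a device; unlike a device, a regular file has no fixed end, so count is used here to bound the write.

```shell
tmpdir=$(mktemp -d)
image="$tmpdir/scratch.img"   # hypothetical stand-in for a real device

# Create a 1 MiB image filled with non-zero bytes.
dd if=/dev/zero bs=1024 count=1024 2>/dev/null | tr '\000' 'S' > "$image"

# Overwrite it with zeros. count bounds the write because a regular
# file would otherwise keep growing; conv=notrunc is harmless here.
dd if=/dev/zero of="$image" bs=1024 count=1024 conv=notrunc 2>/dev/null

# Count the bytes that are not zero; 0 means the wipe reached everything.
leftover=$(tr -d '\000' < "$image" | wc -c | tr -d ' ')
echo "$leftover"
rm -r "$tmpdir"
```

The same verification idea applies to a device, although reading back a whole drive takes roughly as long as wiping it.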


Data recovery

Data recovery involves reading from a drive with some parts potentially inaccessible. The command is a good fit for this job with its flexible skipping and other low-level settings. The vanilla dd, however, is clumsy to use as the user has to read the error messages and manually calculate the regions that can be read. The single block size also limits the granularity of the recovery, as a trade-off has to be made: either use a small one for more data recovered or use a large one for speed.

A C program called dd_rescue was written in October 1999. It did away with the conversion functionality of dd, and supports two block sizes to deal with the dilemma. If a read using a large size fails, it falls back to the smaller size to gather as much data as possible. It can also run backwards. In 2003, a dd_rhelp script was written to automate the process of using dd_rescue, keeping track of what areas have been read on its own.

In 2004, GNU wrote a separate utility, unrelated to dd, called ddrescue. It has a more sophisticated dynamic block-size algorithm and keeps track of what has been read internally. The authors of both dd_rescue and dd_rhelp consider it superior to their implementation. To help distinguish the newer GNU program from the older script, alternate names are sometimes used for GNU's ddrescue, including addrescue (the name on freecode.com and freshmeat.net), gddrescue (Debian package name), and gnu_ddrescue (openSUSE package name).

Another open-source program called savehd7 uses a sophisticated algorithm, but it also requires the installation of its own programming-language interpreter.


Benchmark drive performance

To benchmark a drive and analyze the sequential (and usually single-threaded) system read and write performance for 1024-byte blocks:
* Write performance: dd if=/dev/zero bs=1024 count=1000000 of=1GB_file_to_write
* Read performance: dd if=1GB_file_to_read of=/dev/null bs=1024


Generate a file with random data

To make a file of 100 random bytes using the random driver:

    dd if=/dev/urandom of=myrandom bs=100 count=1


Convert a file to upper case

To convert a file to uppercase:

    dd if=filename of=filename1 conv=ucase,notrunc
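
The conversion can also be tried on a pipeline without any files. This is a minimal sketch; the sample text is arbitrary and the statistics on standard error are discarded.

```shell
# conv=ucase maps lowercase letters to uppercase while copying;
# characters that are not lowercase letters pass through unchanged.
upper=$(printf 'Hello, dd\n' | dd conv=ucase 2>/dev/null)
echo "$upper"
```

The corresponding conv=lcase value performs the opposite mapping.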


Progress feedback

On request, the command reports progress. When it receives signal USR1 (INFO on BSD systems), it writes the number of transferred blocks to standard error. The following bash script requests progress every 10 seconds until the transfer completes. The text PID stands for the process identifier.

    while kill -USR1 PID ; do sleep 10 ; done

Newer versions of GNU dd support the option status=progress, which enables periodic status feedback.


Forks


dcfldd

dcfldd is a fork of GNU dd that is an enhanced version developed by Nick Harbour, who at the time was working for the United States' Department of Defense Computer Forensics Lab. Compared to dd, dcfldd allows more than one output file, supports simultaneous multiple checksum calculations, provides a verification mode for file matching, and can display the percentage progress of an operation. As of February 2024, the last release was 1.9.1 from April 2023.


dc3dd

dc3dd is another fork of GNU dd from the United States Department of Defense Cyber Crime Center (DC3). It can be seen as a continuation of dcfldd, with a stated aim of updating whenever the GNU upstream is updated. The last release was 7.3.1 from April 2023.

