split is a utility on Unix, Plan 9, and Unix-like operating systems most commonly used to split a computer file into two or more smaller files.
History
The split command first appeared in Version 3 Unix and has been part of the X/Open Portability Guide since issue 2 of 1987. It was inherited into the first version of POSIX.1 and the Single Unix Specification. The version of split bundled in GNU coreutils was written by Torbjorn Granlund and Richard Stallman. The command has also been ported to the IBM i operating system.
Usage
The command syntax is:
split [OPTION] [INPUT [PREFIX]]
By default, split generates output files of a fixed size, 1,000 lines each. The files are named by appending aa, ab, ac, etc. to the output filename prefix. If no output filename is given, the default prefix x is used, producing xaa, xab, etc. When a hyphen (-) is used in place of the input filename, data is read from standard input. The files are typically rejoined using a utility such as cat.
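The naming rules above can be sketched as follows. This is a minimal example, assuming a POSIX-style split with awk and mktemp available; the file name numbers.txt and the prefixes part_ and piped_ are hypothetical, chosen only for illustration.

```shell
# Work in a scratch directory so the generated pieces are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a 2,500-line input file (hypothetical name: numbers.txt).
awk 'BEGIN { for (i = 1; i <= 2500; i++) print i }' > numbers.txt

# With an explicit prefix, the pieces are named part_aa, part_ab, part_ac.
split numbers.txt part_

# A hyphen as the input name makes split read standard input instead.
split - piped_ < numbers.txt

# List the input file and the generated pieces.
ls
```

With the default of 1,000 lines per piece, the last piece of each set holds the remaining 500 lines.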
Additional options set a maximum character count per output file (instead of a line count), a maximum line length, the number of incrementing suffix characters in generated filenames, and whether the suffixes use letters or digits.
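As a hedged illustration of these options, the sketch below uses only -l, -b, and -a, which POSIX specifies. The file name sample.txt and the prefixes are hypothetical. Switching to digit suffixes is implementation-specific (GNU coreutils offers -d, for instance) and is left out here.

```shell
# Work in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# 100 lines of the form "line N" (hypothetical file name: sample.txt).
awk 'BEGIN { for (i = 1; i <= 100; i++) print "line", i }' > sample.txt

# -l: at most 40 lines per piece, giving pieces of 40, 40, and 20 lines.
split -l 40 sample.txt lines_

# -b: at most 512 bytes per piece, regardless of line boundaries.
split -b 512 sample.txt bytes_

# -a: use 3 suffix characters instead of 2 (aaa, aab, ...).
split -a 3 -l 40 sample.txt wide_

# Show the line counts and the generated names.
wc -l lines_*
ls bytes_* wide_*
```

Note that -b can cut a line in half, since it counts bytes and ignores newlines; the pieces still concatenate back to the original.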
Split file into pieces
Create a file named "myfile.txt" with exactly 3,000 lines of data:
$ head -n 3000 < /dev/urandom > myfile.txt
Now, use the split command to break this file into pieces (note: unless otherwise specified, split will break the file into 1,000-line files):
$ split myfile.txt
$ ls -lh
-rw-r--r-- 1 root root 761K Jun 16 18:17 myfile.txt
-rw-r--r-- 1 root root 242K Jun 16 18:17 xaa
-rw-r--r-- 1 root root 263K Jun 16 18:17 xab
-rw-r--r-- 1 root root 256K Jun 16 18:17 xac
$ wc --lines xa*
1000 xaa
1000 xab
1000 xac
3000 total
As seen above, the split command has broken the original file (keeping the original intact) into three files of 1,000 lines each: xaa, xab, and xac.
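Because the pieces are named in sorted order, concatenating them reproduces the original. The following is a minimal sketch, assuming the default x prefix and a hypothetical text file named data.txt in place of the random data used above; cmp -s prints nothing and exits with status 0 when the two files are identical.

```shell
# Work in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# 3,000 lines of text (hypothetical file name: data.txt).
awk 'BEGIN { for (i = 1; i <= 3000; i++) print "record", i }' > data.txt

# Default behavior: 1,000-line pieces named xaa, xab, xac.
split data.txt

# The shell expands xa* in sorted order, so cat restores the original sequence.
cat xa* > rejoined.txt

# Compare the rejoined file with the original, byte for byte.
if cmp -s data.txt rejoined.txt; then
    echo "pieces rejoin to the original"
fi
```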
See also
* csplit – splits by content rather than by size
* File spanning
* List of Unix commands