The Sort/Merge utility is a mainframe program to sort records in a file into a specified order, merge pre-sorted files into a sorted file, or copy selected records. Internally, these utilities use one or more of the standard sorting algorithms, often with proprietary fine-tuned code. Mainframes were originally supplied with limited main memory by today's standards, and the amount of data to be sorted was frequently very large. Because of this, unlike more recent sort programs, early Sort/Merge programs placed great emphasis on efficient techniques for sorting data on secondary storage, typically tape or disk. In 1968 the OS/360 Sort/Merge program provided five different "sequence distribution techniques" that could be used depending on the number and type of devices available. Historically, the "alias" SORT has been used to refer to an installation's preferred sort program, IBM's Sort/Merge, and third-party Sort/Merge programs (e.g., SYNCSORT, CASORT). DFSORT is often referred to by its program name, ICEMAN (component ICE; the original OS/360 Sort/Merge program name was IERRCO00, component IER, also with "alias" SORT).
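The merge function described above, combining pre-sorted files into one sorted file, can be illustrated with a minimal Python sketch. This is not how any mainframe Sort/Merge product is implemented; the `Record` layout and the function name `merge_sorted_inputs` are assumptions made for illustration, and in-memory lists stand in for files.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

# Hypothetical record layout: (sort key, payload).
Record = Tuple[str, str]


def merge_sorted_inputs(inputs: List[Iterable[Record]]) -> Iterator[Record]:
    """Merge inputs that are each already in ascending key order.

    heapq.merge keeps only one pending record per input, which shows why
    merging pre-sorted files needs far less main memory than a full sort.
    """
    return heapq.merge(*inputs, key=lambda record: record[0])


if __name__ == "__main__":
    file_a = [("A001", "alpha"), ("C003", "gamma")]
    file_b = [("B002", "beta"), ("D004", "delta")]
    for key, payload in merge_sorted_inputs([file_a, file_b]):
        print(key, payload)
```

Because the result is produced lazily, the same approach would work with iterators that read records from files one at a time rather than from lists.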


Virtual storage systems

Prior to the System/370, all IBM mainframe operating systems included sort/merge utilities. With the announcement of the virtual storage operating systems DOS/VS and OS/VS, IBM unbundled much of the software and offered chargeable sort/merge program products. For OS/VS, IBM offered 5734-SM1, OS Sort/Merge, and later offered 5740-SM1, OS/VS Sort/Merge, subsequently renamed Data Facility Sort (DFSORT). In 1990 IBM introduced a new merge algorithm called BLOCKSET in DFSORT, the successor to OS/360 Sort/Merge. Of historical note, the BLOCKSET algorithm was invented by an IBM Systems Engineer in 1963, later found in IBM's archives, and implemented in 1990.


Usage

Sort/Merge is very frequently used; it is often the most commonly used application program in a mainframe shop, generally consuming about twenty percent of the shop's processing power. Modern Sort/Merge programs can also copy files, select or omit certain records, summarize records, remove duplicates, reformat records, append new data, and produce reports. Indeed, most Sort/Merge applications use this wide range of additional processing capabilities rather than purely sorting or merging records, because the Sort/Merge product is a very fast way of performing the input and output for these functions. A number of "user exits" are supported. These may be load modules (i.e., members of a library) or object decks (i.e., the output of an assembler); the Sort/Merge application loads load modules and links object decks (termed "dynamic link editing" in DFSORT) as specified and required. Working storage datasets (i.e., SORTWK01, ..., SORTWKnn) may be on disk or tape, although the BLOCKSET algorithm is restricted to disk working storage; more working storage datasets generally improve performance.
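The additional processing described above (selecting or omitting records, sorting, and summarizing records with equal keys) can be sketched in Python. This is only an illustration of the idea, not the control-statement syntax or behavior of DFSORT or any other product; the `Record` fields and the function `select_sort_summarize` are hypothetical.

```python
from itertools import groupby
from typing import Iterable, List, NamedTuple


class Record(NamedTuple):
    account: str  # hypothetical key field
    amount: int   # hypothetical numeric field


def select_sort_summarize(records: Iterable[Record], min_amount: int) -> List[Record]:
    # Select: keep only records that satisfy the condition and omit the rest.
    selected = [r for r in records if r.amount >= min_amount]
    # Sort: order the surviving records by key.
    selected.sort(key=lambda r: r.account)
    # Summarize: collapse records with equal keys into one, totalling amount.
    return [
        Record(account, sum(r.amount for r in group))
        for account, group in groupby(selected, key=lambda r: r.account)
    ]


if __name__ == "__main__":
    data = [Record("B", 5), Record("A", 10), Record("B", 7), Record("A", 1)]
    print(select_sort_summarize(data, min_amount=5))
```

Doing the selection before the sort means omitted records never consume sort work space, which is one reason combining these steps in a single pass is attractive.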


Competition

Sort/merge is important enough that multiple companies each sell their own sort/merge package for IBM mainframes and their z/OS, z/VM and z/VSE operating systems. These programs are largely compatible with IBM's SORT programs, often with some extensions. The major Sort/Merge packages are:
* DFSORT, sold by IBM.
* SyncSort, sold by Syncsort, Inc.
* CA-Sort, sold by CA Technologies.
(Some of these companies also sell versions for other platforms, such as Unix, Linux, or Windows.)


Migration

Sort/Merge is a critical component of many mainframe environments. When migrating from the mainframe to other platforms such as Unix, Linux or Windows, a Sort/Merge utility is needed; MFSORT from Micro Focus and AHLSORT emulate the functions of DFSORT outside of the mainframe environment.


IBM OS/360 SORT

Prior to virtual storage operating systems, "The input data set [was] almost always too large to be brought into main storage and sorted all at once." SORT used a replacement selection technique to reduce storage usage. The program placed emphasis on sequence distribution techniques, which could be defaulted depending on the number and type of devices available, or could be specified by the user, for making best use of secondary storage "sort work" (SORTWK) files. These techniques were methods of distributing partially sorted sequences of records most efficiently. There were five distribution techniques available to the OS/360 SORT:
* Magnetic tape techniques
** Balanced (BALN): required a minimum of 12,000 bytes of main storage and 2x+1 tape devices for intermediate storage, where x is the number of input tape volumes, up to a maximum of 15 input reels.
** Polyphase (POLY): required a minimum of 12,000 bytes and 3 intermediate storage tape devices. Only one input reel was allowed.
** Oscillating (OSCL): required 21,000 bytes and max(x+2, 4) intermediate tape devices, where x is the number of input volumes, up to a maximum of 15.
* Direct access techniques
** Balanced (BALN): required a minimum of 13,000 bytes and 3 to 6 disk work areas. The maximum number of records that could be sorted depended on the main and auxiliary storage available.
** Crisscross (CRCX): not available for IBM 2311 or IBM 2301 auxiliary storage devices. Required a minimum of 24,000 bytes of main storage and 6 to 17 auxiliary storage work areas. The maximum number of records that could be sorted depended on the main and auxiliary storage available.
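Replacement selection, mentioned above, can be illustrated with a short Python sketch of the textbook technique. It is not a reconstruction of the OS/360 SORT code; the function name `replacement_selection` and the `memory_slots` parameter (the number of records assumed to fit in main storage) are illustrative assumptions, and integers stand in for records.

```python
import heapq
from itertools import islice
from typing import Iterable, List

_END = object()  # sentinel marking the end of the input


def replacement_selection(records: Iterable[int], memory_slots: int) -> List[List[int]]:
    """Split records into sorted runs while holding at most memory_slots records."""
    if memory_slots < 1:
        raise ValueError("memory_slots must be at least 1")
    source = iter(records)
    # Fill the available memory with the first records and order them as a heap.
    heap = list(islice(source, memory_slots))
    heapq.heapify(heap)

    runs: List[List[int]] = []
    current_run: List[int] = []
    held_back: List[int] = []  # records too small for the current run
    while heap:
        smallest = heapq.heappop(heap)
        current_run.append(smallest)
        incoming = next(source, _END)
        if incoming is not _END:
            if incoming >= smallest:
                # Still fits after the record just written: joins this run.
                heapq.heappush(heap, incoming)
            else:
                # Would break the ordering: wait for the next run.
                held_back.append(incoming)
        if not heap:
            # Current run is complete; start the next one from held-back records.
            runs.append(current_run)
            current_run = []
            heap, held_back = held_back, []
            heapq.heapify(heap)
    return runs


if __name__ == "__main__":
    print(replacement_selection([5, 1, 9, 2, 8, 3, 7], memory_slots=3))
```

With three slots, the seven-record example yields the runs `[1, 2, 5, 8, 9]` and `[3, 7]`: the first run is longer than the memory size, which is the benefit of the technique. The resulting runs would then be distributed to work files and merged.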


IBM OS/VS SORT

The distribution techniques listed for tape sorts were retained by the OS/VS SORT program, now called "conventional techniques." The disk sort techniques were replaced by four new ones:
* FLR-Blockset for fixed-length records
* VLR-Blockset for variable-length records
* Peerage for fixed-length records
* Vale for both fixed- and variable-length records


See also

* BatchPipes
* External sort


External links


* IBM DFSORT Manuals
* Some basic DFSORT and SyncSort examples