Byte Sieve
   HOME

TheInfoList



OR:

The Byte Sieve is a computer-based implementation of the
Sieve of Eratosthenes In mathematics, the sieve of Eratosthenes is an ancient algorithm for finding all prime numbers up to any given limit. It does so by iteratively marking as composite number, composite (i.e., not prime) the multiples of each prime, starting with ...
published by ''
Byte The byte is a unit of digital information that most commonly consists of eight bits. Historically, the byte was the number of bits used to encode a single character of text in a computer and for this reason it is the smallest addressable un ...
'' as a
programming language A programming language is a system of notation for writing computer programs. Programming languages are described in terms of their Syntax (programming languages), syntax (form) and semantics (computer science), semantics (meaning), usually def ...
performance benchmark. It first appeared in the September 1981 edition of the magazine and was revisited on occasion. Although intended to compare the performance of different languages on the same computers, it quickly became a widely used machine benchmark. The Sieve was one of the more popular benchmarks of the
home computer Home computers were a class of microcomputers that entered the market in 1977 and became common during the 1980s. They were marketed to consumers as affordable and accessible computers that, for the first time, were intended for the use of a s ...
era, another being the
Creative Computing Benchmark The Creative Computing Benchmark, also called Ahl's Simple Benchmark, is a Benchmark (computing), computer benchmark that was used to compare the performance of the BASIC programming language on various machines. It was first introduced in the Nove ...
of 1983, and the Rugg/Feldman benchmarks, mostly seen in the UK in this era. ''Byte'' later published the more thorough
NBench NBench, short for Native mode Benchmark and later known as BYTEmark, is a synthetic computing benchmark program developed in the mid-1990s by the now defunct BYTE magazine intended to measure a computer's CPU, FPU, and Memory System speed. Hist ...
in 1995 to replace it.


History


Origins

Jim Gilbreath of the Naval Ocean System Center had been considering the concept of writing a small language benchmarking program for some time, desiring one that would be portable across languages, small enough that the program code would fit on a single printed page, and did not rely on specific features like hardware multiplication or division. The solution was inspired by a meeting with
Chuck Forsberg Charles Alton "Chuck" Forsberg (May 6, 1944 – September 24, 2015) developed two data transmission protocols popular in the 1990s, for uploading and downloading files from dial-up bulletin board systems. He received a Dvorak Award for Excell ...
at the January 1980
USENIX USENIX is an American 501(c)(3) nonprofit membership organization based in Berkeley, California and founded in 1975 that supports advanced computing systems, operating system (OS), and computer networking research. It organizes several confe ...
meeting in
Boulder, CO Boulder is a List of municipalities in Colorado#Home rule municipality, home rule city in Boulder County, Colorado, United States, and its county seat. With a population of 108,250 at the 2020 United States census, 2020 census, it is the most ...
, where Forsberg mentioned an implementation of the sieve written by
Donald Knuth Donald Ervin Knuth ( ; born January 10, 1938) is an American computer scientist and mathematician. He is a professor emeritus at Stanford University. He is the 1974 recipient of the ACM Turing Award, informally considered the Nobel Prize of comp ...
. Gilbreath felt the sieve would be an ideal benchmark as it avoided indirect tests on arithmetic performance, which varied widely between systems. The algorithm mostly stresses array lookup performance and basic logic and branching capabilities. Nor does it require any advanced language features like
recursion Recursion occurs when the definition of a concept or process depends on a simpler or previous version of itself. Recursion is used in a variety of disciplines ranging from linguistics to logic. The most common application of recursion is in m ...
or advanced collection types. The only modification from Knuth’s original version was to remove a multiplication by two and replace it with an addition instead. With the original version, machines with hardware multipliers would otherwise run so much faster that the rest of the performance would be hidden. After six months of effort porting it to as many platforms as he had access to, the first results were introduced in the September 1981 edition of ''
Byte The byte is a unit of digital information that most commonly consists of eight bits. Historically, the byte was the number of bits used to encode a single character of text in a computer and for this reason it is the smallest addressable un ...
'' in an article entitled "A High-Level Language Benchmark". Gilbreath was quick to point out that: The article provided reference implementations in ten languages, including more popular selections like
BASIC Basic or BASIC may refer to: Science and technology * BASIC, a computer programming language * Basic (chemistry), having the properties of a base * Basic access authentication, in HTTP Entertainment * Basic (film), ''Basic'' (film), a 2003 film ...
, C, Pascal,
COBOL COBOL (; an acronym for "common business-oriented language") is a compiled English-like computer programming language designed for business use. It is an imperative, procedural, and, since 2002, object-oriented language. COBOL is primarily ...
, and FORTRAN, and some less well-known examples like
Forth Forth or FORTH may refer to: Arts and entertainment * ''forth'' magazine, an Internet magazine * ''Forth'' (album), by The Verve, 2008 * ''Forth'', a 2011 album by Proto-Kaw * Radio Forth, a group of independent local radio stations in Scotl ...
, ZSPL,
Ratfor Ratfor (short for ''Rational Fortran'') is a programming language implemented as a preprocessor for Fortran#FORTRAN 66, Fortran 66. It provides Structured programming, modern control structures, unavailable in Fortran 66, to replace GOTOs and sta ...
,
PL/1 PL/I (Programming Language One, pronounced and sometimes written PL/1) is a procedural, imperative computer programming language initially developed by IBM. It is designed for scientific, engineering, business and system programming. It has b ...
and PLMX. Example runs were provided for a variety of machines, mostly
Zilog Z80 The Zilog Z80 is an 8-bit computing, 8-bit microprocessor designed by Zilog that played an important role in the evolution of early personal computing. Launched in 1976, it was designed to be Backward compatibility, software-compatible with the ...
or
MOS 6502 The MOS Technology 6502 (typically pronounced "sixty-five-oh-two" or "six-five-oh-two") William Mensch and the moderator both pronounce the 6502 microprocessor as ''"sixty-five-oh-two"''. is an 8-bit microprocessor that was designed by a small ...
-based. The best time was initially 16.5 seconds, turned in by Ratfor on a 4 MHz Z80 machine, but
Gary Kildall Gary Arlen Kildall (; May 19, 1942 – July 11, 1994) was an American computer scientist and microcomputer entrepreneur. During the 1970s, Kildall created the CP/M operating system among other operating systems and programming tools, and s ...
personally provided a version in
Digital Research Digital Research, Inc. (DR or DRI) was a privately held American software company created by Gary Kildall to market and develop his CP/M operating system and related 8-bit, 16-bit and 32-bit systems like MP/M, Concurrent DOS, FlexOS, Multiuser ...
's prototype version of PL/1 that ran in 14 seconds and set the mark for this first collection. The slowest was Microsoft COBOL on the same machine, which took a whopping 5115 seconds (almost one and a half hours), longer even than interpreted languages like BASIC. A notable feature of this first run was that C, Pascal and PL/1 all turned in a roughly similar performance that easily beat the various interpreters. A second set of tests was carried out on more powerful machines, with
Motorola 68000 The Motorola 68000 (sometimes shortened to Motorola 68k or m68k and usually pronounced "sixty-eight-thousand") is a 16/32-bit complex instruction set computer (CISC) microprocessor, introduced in 1979 by Motorola Semiconductor Products Sector ...
assembly language In computing, assembly language (alternatively assembler language or symbolic machine code), often referred to simply as assembly and commonly abbreviated as ASM or asm, is any low-level programming language with a very strong correspondence bet ...
turning in the fastest times at 1.12 seconds, slightly besting C on a
PDP-11/70 The PDP–11 is a series of 16-bit minicomputers originally sold by Digital Equipment Corporation (DEC) from 1970 into the late 1990s, one of a set of products in the Programmed Data Processor (PDP) series. In total, around 600,000 PDP-11s of all ...
and almost twice as fast as 8086 assembler. Most PDP-11 and HP-3000 times were much slower, on the order of 10 to 50 seconds. Tests on these machines using only high-level languages was led by NBS Pascal on the PDP-11, at 2.6 seconds.
UCSD Pascal UCSD Pascal is a Pascal programming language system that runs on the UCSD p-System, a portable, highly machine-independent operating system. UCSD Pascal was first released in 1977. It was developed at the University of California, San Diego (UC ...
provided another interesting set of results as the same program can be run on multiple machines. Running on the dedicated Ithaca InterSystems Pascal-100 machine, a
Pascal MicroEngine Pascal MicroEngine is a series of microcomputer products manufactured by Western Digital from 1979 through the mid-1980s, designed specifically to run the UCSD p-System efficiently. Compared to other microcomputers, which use a machine language p-c ...
based computer, it ran in 54 seconds, while on the Z80 it was 239, and 516 on the Apple II.


Spread

Gilbreath, this time along with his brother Gary, revisited the code in the January 1983 edition of ''Byte''. This version removed most of the less popular languages, leaving Pascal, C, FORTRAN IV, and COBOL, while adding
Ada Ada may refer to: Arts and entertainment * '' Ada or Ardor: A Family Chronicle'', a novel by Vladimir Nabokov Film and television * Ada, a character in 1991 movie '' Armour of God II: Operation Condor'' * '' Ada... A Way of Life'', a 2008 Bollywo ...
and
Modula-2 Modula-2 is a structured, procedural programming language developed between 1977 and 1985/8 by Niklaus Wirth at ETH Zurich. It was created as the language for the operating system and application software of the Lilith personal workstation. It w ...
. Thanks to readers providing additional samples, the number of machines, operating systems and languages compared in the resulting tables was greatly expanded.
Motorola 68000 The Motorola 68000 (sometimes shortened to Motorola 68k or m68k and usually pronounced "sixty-eight-thousand") is a 16/32-bit complex instruction set computer (CISC) microprocessor, introduced in 1979 by Motorola Semiconductor Products Sector ...
(68k) assembly remained the fastest, almost three times the speed of the
Intel 8086 The 8086 (also called iAPX 86) is a 16-bit computing, 16-bit microprocessor chip designed by Intel between early 1976 and June 8, 1978, when it was released. The Intel 8088, released July 1, 1979, is a slightly modified chip with an external 8-b ...
running at the same 8 MHz clock. Using high-level languages the two were closer in performance, with the 8086 generally better than half the speed of the 68k and often much closer. A wider variety of
minicomputer A minicomputer, or colloquially mini, is a type of general-purpose computer mostly developed from the mid-1960s, built significantly smaller and sold at a much lower price than mainframe computers . By 21st century-standards however, a mini is ...
s and
mainframe A mainframe computer, informally called a mainframe or big iron, is a computer used primarily by large organizations for critical applications like bulk data processing for tasks such as censuses, industry and consumer statistics, enterpris ...
s was also included, with times that the 68k generally beat except for the very fastest machines like the IBM 3033 and high-end models of the
VAX VAX (an acronym for virtual address extension) is a series of computers featuring a 32-bit instruction set architecture (ISA) and virtual memory that was developed and sold by Digital Equipment Corporation (DEC) in the late 20th century. The V ...
. Older machines like the
Data General Nova The Nova is a series of 16-bit computing, 16-bit minicomputers released by the American company Data General. The Nova family was very popular in the 1970s and ultimately sold tens of thousands of units. The first model, known simply as "Nov ...
, PDP-11 and HP-1000 were nowhere near as fast as the 68k. Gilbreath's second article appeared as the benchmark was becoming quite common as a way to compare the performance of various machines, let alone languages. In spite of his original warning not to do so, it soon began appearing in magazine advertisements as a way to compare performance against the competition, and as a general benchmark. ''Byte'' once again revisited the sieve later in August 1983 as part of a whole-magazine series of articles on the C language. In this case the use was more in keeping with the original intent, using a single
source code In computing, source code, or simply code or source, is a plain text computer program written in a programming language. A programmer writes the human readable source code to control the behavior of a computer. Since a computer, at base, only ...
and running it on a single machine to compare the performance of C compilers on the
CP/M-86 CP/M-86 is a discontinued version of the CP/M operating system that Digital Research (DR) made for the Intel 8086 and Intel 8088. The system commands are the same as in CP/M-80. Executable files used the relocatable .CMD file format. Digital Re ...
operating system, on CP/M-80, and for the
IBM PC The IBM Personal Computer (model 5150, commonly known as the IBM PC) is the first microcomputer released in the List of IBM Personal Computer models, IBM PC model line and the basis for the IBM PC compatible ''de facto'' standard. Released on ...
. In spite of Gilbreath's concern in the original article, by this time the code had become almost universal for testing, and one of the articles remarked that "The Sieve of Eratosthenes is a mandatory benchmark". It was included in the ''Byte'' UNIX Benchmark Suite introduced in August 1984.


Today

New versions of the code continue to appear for new languages, e.g
Rosetta Code
and
GitHub GitHub () is a Proprietary software, proprietary developer platform that allows developers to create, store, manage, and share their code. It uses Git to provide distributed version control and GitHub itself provides access control, bug trackin ...
has many versions available. It is often used as an example of
functional programming In computer science, functional programming is a programming paradigm where programs are constructed by Function application, applying and Function composition (computer science), composing Function (computer science), functions. It is a declarat ...
in spite of the common version not actually using the sieve algorithm.


Implementation

The provided implementation calculated odd primes only, so the 8191 element array actually represented primes less than 16385. As shown in a sidebar table, the 0th element represented 3, 1st element 5, 2nd element 7, and so on. This is the original BASIC version of the code presented in 1981. The dialect is not specified, but a number of details mean it does not run under early versions of
Microsoft BASIC Microsoft BASIC is the foundation software product of the Microsoft company and evolved into a line of BASIC interpreters and compiler(s) adapted for many different microcomputers. It first appeared in 1975 as Altair BASIC, which was the first v ...
(4.x and earlier), among these the use of long variable names like and . The lack of line numbers may suggest a minicomputer variety that reads source from a text file, but may have also been a printing error. REM Eratosthenes Sieve Prime Number Program in BASIC 1 SIZE = 8190 2 DIM FLAGS(8191) 3 PRINT "Only 1 iteration" 5 COUNT = 0 6 FOR I = 0 TO SIZE 7 FLAGS (I) = 1 8 NEXT I 9 FOR I = 0 TO SIZE 10 IF FLAGS (I) = 0 THEN 18 11 PRIME = I+I + 3 12 K = I + PRIME 13 IF K > SIZE THEN 17 14 FLAGS (K) = 0 15 K = K + PRIME 16 GOTO 13 17 COUNT = COUNT + 1 18 NEXT I 19 PRINT COUNT," PRIMES" And in C, with some whitespace adjustments from the original: #define true 1 #define false 0 #define size 8190 #define sizepl 8191 char flags izepl main()


Notes


References


Citations


Bibliography

* * * {{cite book , first=Donald , last=Knuth , title=The Art of Computer Programming Volume 2: Seminumerical Algorithms , date=1969 , publisher=Addison-Wesley , url=https://books.google.com/books?id=_NHDswEACAAJ , isbn=978-0-201-89684-8 Benchmarks (computing) History of computing Articles with example BASIC code Articles with example C code