The Whetstone benchmark is a synthetic
benchmark for evaluating the performance of
computers. It was first written in
ALGOL 60
in 1972 at the Technical Support Unit of the Department of Trade and Industry (later part of the
Central Computer and Telecommunications Agency
) in the
United Kingdom
. It was derived from statistics on program behaviour gathered on the
KDF9 computer at the National Physical Laboratory (NPL), using a modified version of its Whetstone
ALGOL 60
compiler. The workload on the machine was represented as a set of frequencies of execution of the 124 instructions of the Whetstone Code. The Whetstone Compiler was built at the Atomic Power Division of the
English Electric
Company in
Whetstone, Leicestershire, England, hence its name. Dr. B.A. Wichmann at NPL produced a set of 42 simple ALGOL 60 statements which, in a suitable combination, matched the execution statistics.
To make a more practical benchmark, Harold Curnow of TSU wrote a program incorporating the 42 statements. This program worked in its ALGOL 60 version, but when translated into
Fortran it was not executed correctly by the IBM optimizing compiler, which omitted calculations whose results were not output. He then produced a set of program fragments which were more like real code and which collectively matched the original 124 Whetstone instructions. Timing this program gave a measure of the machine's speed in thousands of Whetstone instructions per second (KWIPS). The Fortran version became the first general purpose benchmark that set industry standards of computer system performance. Further development was carried out by Roy Longbottom, also of TSU/CCTA, who became the official design authority.
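As a rough illustration of the rating described above, the following Python sketch converts a count of executed Whetstone instructions and an elapsed time into KWIPS and MWIPS. The function names and the example figures are hypothetical and are not taken from the original ALGOL 60 or Fortran programs.

```python
def kwips(whetstone_instructions: float, seconds: float) -> float:
    """Thousands of Whetstone instructions per second."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return whetstone_instructions / seconds / 1_000.0


def mwips(whetstone_instructions: float, seconds: float) -> float:
    """Millions of Whetstone instructions per second."""
    return kwips(whetstone_instructions, seconds) / 1_000.0


# Hypothetical run: 10 million Whetstone instructions timed at 4 seconds.
print(kwips(10_000_000, 4.0))  # 2500.0
print(mwips(10_000_000, 4.0))  # 2.5
```

The rating is only as good as the timing, which is why the elapsed time has to be long enough to measure reliably.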
The ALGOL 60 program was run under the Whetstone compiler in July 2010, for the first time since the last KDF9 was shut down in 1980, this time executed by a KDF9 emulator.
Benchmark content and enhancements
The benchmark employs eight test procedures: three execute standard floating point calculations, two use mathematical functions such as COS or EXP, and one each covers integer arithmetic, branching and memory assignments. Output from the original comprised the parameters used for each test, the numeric results produced and the overall KWIPS performance rating. In 1978, the program was updated to log the running time of each test, allowing
MFLOPS (Millions of Floating Point Operations Per Second) to be included in reports, along with an estimate of Integer MIPS (Millions of Instructions Per Second). In 1987, MFLOPS calculations were included in the log for the three appropriate tests and MOPS (Millions of Operations Per Second) for the others. Code changes were also carried out, including by Bangor University, where necessary to identify unexpected behaviour, without changing the implementation of the original 124 Whetstone instructions. One necessary change was to maintain measurement accuracy at increasing CPU speeds: the program calibrates itself to run for a noticeable finite time, typically set to 10 seconds, or 100 seconds for early PCs with low timer resolution.
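The self calibration idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in the benchmark: the function `calibrate`, its parameters and the sample workload are all hypothetical. The pass count is doubled until a run is long enough to time reliably, then scaled linearly to the target duration.

```python
import time
from typing import Callable


def calibrate(run_passes: Callable[[int], None],
              target_seconds: float = 10.0,
              min_measurable: float = 0.1,
              clock: Callable[[], float] = time.perf_counter) -> int:
    """Return a pass count expected to run for about target_seconds."""
    if min_measurable <= 0 or target_seconds <= 0:
        raise ValueError("durations must be positive")
    passes = 1
    while True:
        start = clock()
        run_passes(passes)
        elapsed = clock() - start
        if elapsed >= min_measurable:
            break
        passes *= 2  # too fast to time reliably: double and retry
    # Scale the measured count linearly to the target duration.
    return max(1, round(passes * target_seconds / elapsed))


def sample_workload(passes: int) -> None:
    """Hypothetical stand-in for the benchmark's test loops."""
    x = 1.0
    for _ in range(passes * 1000):
        x = (x + 1.0 / x) * 0.5


if __name__ == "__main__":
    # Short target so the demonstration finishes quickly.
    n = calibrate(sample_workload, target_seconds=0.2, min_measurable=0.02)
    print("passes for about 0.2 s:", n)
```

A longer target, such as the 100 seconds mentioned for early PCs, reduces the relative error introduced by a coarse timer.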
Note that there are other versions of the Whetstone Benchmark available online, some claiming copyright, without reference to CCTA or the design authority.
Initial CCTA results
In conjunction with the undertaking controlled by the Contracts Division, CCTA engineers were responsible for designing and supervising acceptance trials of all UK Government computers and those centrally funded for universities and
Research Councils
, with systems varying from
minicomputers to
supercomputers
. This provided the opportunity to gather verified Whetstone Benchmark results. Other results were obtained via new computer system appraisal activities.
CCTA records, including technical reports, are now available in the UK National Archives.
Original Whetstone Benchmark results are in the 1985 CCTA Technical Memorandum 1182, where overall speed is only shown as MWIPS. This contains more than 1000 results for 244 computers from 32 manufacturers, including the first for PCs and supercomputers. The report may be accessible from the National Archives. The details were later included in a publicly available report (see Available Reports below).
Vector processing version
Roy Longbottom converted the original Whetstone Benchmark to fully exploit the capabilities of the new vector processors. Results were included in the paper “Performance of Multi-User Supercomputing Facilities”, presented at the 1989 Fourth International Conference on Supercomputing, Santa Clara.
The results were also repeated in the Harold Curnow paper “Whither Whetstone? The synthetic benchmark after 15 years”, presented at the “Evaluating supercomputers: strategies for exploiting, evaluating and benchmarking computers with advanced architecture” conference in 1990 and published in book form.
Whetstone benchmark influences
Harold Curnow also reported comments made at the 1989 conference “Software for Parallel Computers” in a presentation by Gordon Bell, designer of the
Digital Equipment Corporation
VAX
range of minicomputers, indicating that the range was designed to perform well on the Whetstone Benchmark.
The Whetstone Benchmark also had high visibility concerning floating point performance of
Intel
CPUs and PCs, starting with the 1980 Intel
8087
coprocessor. This was reported in the 1986 Intel Application Report “High Speed Numerics with the 80186/80188 and 8087”.
The 8087 includes hardware functions for exponential, logarithmic and trigonometric calculations, as used in two of the eight Whetstone Benchmark tests, where these can dominate running time. Only two other benchmarks were included in the Intel procedures, and all three programs showed huge gains over the earlier software based routines.
Later tests, by a SSEMC Laboratory, evaluated Intel 80486 compatible CPU chips using their Universal Chip Analyzer.
Considering two floating point benchmarks, as used by Intel in the above report, they preferred Whetstone, stating “Whetstone utilizes the complete set of instructions available on early
x87 FPUs”. This might suggest that the Whetstone Benchmark influenced the hardware instruction set.
By the 1990s the Whetstone Benchmark and its results had become relatively popular. A notable quotation appeared in 1985 in “A portable seismic computing benchmark”, published in the European Association of Geoscientists and Engineers Journal: "The only commonly used benchmark to my knowledge is the venerable Whetstone benchmark, designed many years ago to test floating point operations".
Details of the Vector Whetstone Benchmark performance were also repeated, by Roy Longbottom, at the June 1990 Advanced Computing Seminar at
Natural Environment Research Council
Wallingford. This led to
Council for the Central Laboratory of the Research Councils
Distributed Computing Support collecting results from running “on a variety of machines, including vector supercomputers, minisupers, super-workstations and workstations, together with that obtained on a number of vector CPUs and on single nodes of various MPP machines”. More than 200 results, up to 2006, are included in the report available on the Wayback Machine archive, in entries dated up to at least 2007.
The report also noted: “The wide variety of standard functions exercised (sqrt, exp, cos etc.) consume a far larger fraction of the reported times.”
The First 1 MIPS minicomputer and Dhrystone benchmark
On achieving 1 MWIPS, the
Digital Equipment Corporation
VAX-11/780
minicomputer
became accepted as the first commercially available 32-bit computer to demonstrate 1 MIPS (Millions of Instructions Per Second) (CERN), a rating not really appropriate for a benchmark dependent on floating point speed. This had an impact on the
Dhrystone
Benchmark, the second accepted general purpose computer performance measurement program, which has no floating point calculations. This produced a result of 1757 Dhrystones Per Second on the VAX-11/780, leading to a revised rating of 1 DMIPS (also known as VAX MIPS), obtained by dividing the original result by 1757.
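The DMIPS conversion is a single division by the VAX-11/780 reference figure. A minimal Python sketch, in which the function and constant names are hypothetical and the second example value is invented purely for illustration:

```python
# Reference score quoted above: Dhrystones Per Second on the VAX-11/780.
VAX_11_780_DHRYSTONES_PER_SECOND = 1757


def dmips(dhrystones_per_second: float) -> float:
    """Convert a Dhrystones Per Second score to DMIPS (VAX MIPS)."""
    return dhrystones_per_second / VAX_11_780_DHRYSTONES_PER_SECOND


print(dmips(1757))     # 1.0, the VAX-11/780 itself
print(dmips(175_700))  # 100.0, a hypothetical machine 100 times faster
```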
Later developments
Following retirement from CCTA, Roy Longbottom continued providing free benchmarking and stress testing programs available on his web site, latterly roylongbottom.org.uk, with most development using
C
, via
Microsoft Windows
and
Linux
based
Operating Systems
on PCs. This was initially in conjunction with the
CompuServe
Benchmarks and Standards Forum, see
Wayback Machine
Archive,
covering PC hardware 1997 to 2008, providing numerous new benchmark results.
From 2008 to 2013 further PC results were collected privately. By then, PC processor clock speeds had reached 4000 MHz and did not increase much further by the 2020s, reducing the need to gather results of the original scalar benchmark. In 2017 “Whetstone Benchmark History and Results” was published for public access, adding the year of first delivery and purchase prices, and doubling the number of computers covered in the CCTA report. The most notable citation of this report was by Tony Voellm, then Google Cloud Performance Engineering Manager, in “Cloud Benchmarking: Fight the black hole”.
This considered available benchmarks and performance over time with detailed graphs, including those from the Whetstone reports. At a later stage, 504 of the results, by year, were included in the report “Techniques used for analyzing basic performance measurements”.
During this period, versions of the Whetstone Benchmark were produced to exploit multithreading
, initially for PCs running under
Microsoft Windows
, the latest supporting up to 8 CPUs or CPU cores particularly for those known as 4 core/8 thread varieties.
Compiler and interpreter efficiency
The History report includes new sections for PC results, with CPUs from 1979, particularly results produced by up to 12 different compilers or interpreters, covering C/C++ (up to 64 bit SSE level), old Fortran, Basic and Java. These are based on the ratio MWIPS per MHz (multiplied by 100) to represent efficiency. The bottom line is for a Core i7 CPU, with ratings varying from 0.39, via the Basic interpreter, to 311, via C using 64 bit SSE options, then 1003 with the multithreading benchmark, using all four CPU cores.
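The efficiency rating is MWIPS divided by the clock speed in MHz, multiplied by 100. A minimal Python sketch of that ratio follows. The function name and the example inputs are hypothetical and are not results from the report.

```python
def efficiency(mwips: float, clock_mhz: float) -> float:
    """MWIPS per MHz, multiplied by 100."""
    if clock_mhz <= 0:
        raise ValueError("clock speed must be positive")
    return 100.0 * mwips / clock_mhz


# Hypothetical CPU clocked at 3000 MHz.
print(efficiency(1500.0, 3000.0))  # 50.0
print(efficiency(3000.0, 3000.0))  # 100.0, one Whetstone instruction per clock cycle
```

On this scale a rating above 100 means more than one Whetstone instruction completed per clock cycle, which helps explain why SSE and multithreaded builds can rate so much higher than an interpreter on the same CPU.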
Results with individual test performance
Another report, “Whetstone Benchmark Detailed Later Results”, was produced in 2017.
This document provides a summary of speeds of the eight test loops in the benchmark, as MFLOPS or MOPS, plus the MWIPS ratings. There are 22 pages of results covering the same Windows based PCs as the History report, with different compilers and compiling options, some with multithreaded versions. Later results cover PCs using Linux. There are also results for a sample of Android phones and tablets and for what was, at the time, the full range of Raspberry Pi computers. For the latter, Roy Longbottom had been recruited as a voluntary member of the Raspberry Pi Foundation new products Alpha Testing Team.
Cray 1 supercomputer performance comparisons
Later scalar, vector and multithreading results were included in a 2022 report, “Cray 1 Supercomputer Performance Comparisons With Home Computers Phones and Tablets”.
This included the following, originally in a report on the first Raspberry Pi computer:
"In 1978, the Cray 1 supercomputer cost $7 Million, weighed 10,500 pounds and had a 115 kilowatt power supply. It was, by far, the fastest computer in the world. The Raspberry Pi costs around $70 (CPU board, case, power supply, SD card), weighs a few ounces, uses a 5 watt power supply and is more than 4.5 times faster than the Cray 1."
This claim was based on the official average performance of the Livermore Loops Benchmark, which was used to demonstrate that the first Cray 1 met its contractual requirements. The scalar Whetstone Benchmark showed a much higher gain of 16.7 times.
The report includes comparisons with other supercomputers, a modern fairly fast laptop PC and the 2020 Raspberry Pi 400, where the latter obtained MWIPS gains over the Cray 1 of 155 times scalar, 38 vector and 593 scalar multithreading (4 CPU cores versus 1). The quad core laptop, using advanced
SIMD
compilations, obtained gains of 400, 215 and 3520 times respectively.
Detailed Results, Source and Executable Codes
Whetstone Benchmark source codes, compiled programs and reports including results are currently (at the time of writing) on Roy Longbottom’s website roylongbottom.org.uk, but this has a limited lifetime.
For main reference purposes the HTML based reports were converted to PDF format and uploaded to ResearchGate. Brief descriptions of all files are included in an indexing file
(download via More v for menu choices).
Unfortunately, the file structure was changed, disabling access to most older compressed files containing benchmark source codes and compiled programs.
The original website provides the same indexing format but includes the links
to access both local files and those at ResearchGate, the former having options to download program codes.
Presently, and hopefully for long-term future access, the website has been captured numerous times by the Wayback Machine Internet Archive site, but not all captures include compressed program files. If the file name is known, available captures can be found, such as for benchnt.zip (copy and modify link address).
Other benchmarks and references
The Whetstone benchmark primarily measures the
floating-point arithmetic
performance. A similar benchmark for
integer
and string operations is the
Dhrystone
.
See also
* Dhrystone
* FLOPS
* Gibson Mix
* LINPACK benchmarks
* Million instructions per second (MIPS)
External links
* Benchmark Programs and Reports (see also Netlib)
* Whetstone Algol Revisited, or Confessions of a compiler writer, PDF file (B. Randell, 1964)