SPSS Statistics is a statistical software suite developed by IBM for data management, advanced analytics, multivariate analysis, business intelligence, and criminal investigation. Long produced by SPSS Inc., it was acquired by IBM in 2009. Versions of the software released since 2015 have the brand name IBM SPSS Statistics. The software name originally stood for Statistical Package for the Social Sciences (SPSS), reflecting the original market, and was later changed to Statistical Product and Service Solutions.


Overview

SPSS is a widely used program for statistical analysis in social science. It is also used by market researchers, health researchers, survey companies, government, education researchers, industries, marketing organizations, data miners, and others. The original SPSS manual (Nie, Bent & Hull, 1970) has been described as one of "sociology's most influential books" for allowing ordinary researchers to do their own statistical analysis. In addition to statistical analysis, data management (case selection, file reshaping and creating derived data) and data documentation (a metadata dictionary is stored in the data file) are features of the base software.

The many features of SPSS Statistics are accessible via pull-down menus or can be programmed with a proprietary 4GL command syntax language. Command syntax programming has the benefits of reproducible output, simplifying repetitive tasks, and handling complex data manipulations and analyses. Additionally, some complex applications can only be programmed in syntax and are not accessible through the menu structure. The pull-down menu interface also generates command syntax: this can be displayed in the output, although the default settings have to be changed to make the syntax visible to the user. The syntax can also be pasted into a syntax file using the "paste" button present in each menu. Programs can be run interactively or unattended, using the supplied Production Job Facility.

A "macro" language can be used to write command language subroutines. A Python programmability extension can access the information in the data dictionary and data and dynamically build command syntax programs. This extension, introduced in SPSS 14, replaced the less functional SAX Basic "scripts" for most purposes, although SAX Basic remains available. In addition, the Python extension allows SPSS to run any of the statistics in the free software package R. From version 14 onwards, SPSS can be driven externally by a Python or a VB.NET program using supplied "plug-ins". (From version 20 onwards, these two scripting facilities, as well as many scripts, are included on the installation media and are normally installed by default.)

SPSS Statistics places constraints on internal file structure,
data types, data processing, and matching files, which together considerably simplify programming. SPSS datasets have a two-dimensional table structure, where the rows typically represent cases (such as individuals or households) and the columns represent measurements (such as age, sex, or household income). Only two data types are defined: numeric and text (or "string"). All data processing occurs sequentially case-by-case through the file (dataset). Files can be matched one-to-one and one-to-many, but not many-to-many. In addition to that cases-by-variables structure and processing, there is a separate Matrix session where one can process data as matrices using matrix and linear algebra operations.

The
graphical user interface has two views which can be toggled. The 'Data View' shows a spreadsheet view of the cases (rows) and variables (columns). Unlike spreadsheets, the data cells can only contain numbers or text, and formulas cannot be stored in these cells. The 'Variable View' displays the metadata dictionary, where each row represents a variable and shows the variable name, variable label, value label(s), print width, measurement type, and a variety of other characteristics. Cells in both views can be manually edited, defining the file structure and allowing data entry without using command syntax. This may be sufficient for small datasets. Larger datasets such as statistical surveys are more often created in data entry software, or entered during computer-assisted personal interviewing, by scanning and using optical character recognition and optical mark recognition software, or by direct capture from online questionnaires. These datasets are then read into SPSS.

SPSS Statistics can read and write data from
ASCII text files (including hierarchical files), other statistics packages, spreadsheets and databases. It can also read and write to external relational database tables via ODBC and SQL. Statistical output is to a proprietary file format (*.spv file, supporting pivot tables) for which, in addition to the in-package viewer, a stand-alone reader can be downloaded. The proprietary output can be exported to text or Microsoft Word, PDF, Excel, and other formats. Alternatively, output can be captured as data (using the OMS command), as text, tab-delimited text, PDF, XLS, HTML, XML, SPSS dataset or a variety of graphic image formats (JPEG, PNG, BMP and EMF).

Several variants of SPSS Statistics exist. SPSS Statistics Gradpacks are highly discounted versions sold only to students. SPSS Statistics Server is a version of the software with a client/server architecture. Add-on packages can enhance the base software with additional features (examples include complex samples, which can adjust for clustered and stratified samples, and custom tables, which can create publication-ready tables). SPSS Statistics is available under either an annual or a monthly subscription license.

Version 25 of SPSS Statistics launched on August 8, 2017. It added new and advanced statistics, such as random effects solution results (GENLINMIXED), robust standard errors (GLM/UNIANOVA), and profile plots with error bars within the Advanced Statistics and Custom Tables add-on. V25 also includes new Bayesian statistics capabilities (Bayesian statistics being a method of statistical inference) and publication-ready charts, with new charting capabilities that include new default templates and the ability to share with Microsoft Office applications.
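
As a rough sketch of the idea described above, that a program can read the data dictionary and build command syntax dynamically, the following plain-Python example generates a DESCRIPTIVES command for every numeric variable in a small, hypothetical dictionary. It does not use the SPSS Python extension itself: the `VARIABLES` layout and the function names are assumptions made for illustration, and the generated text is only printed, not executed.

```python
# Illustrative only: builds SPSS command syntax as a string.
# The dictionary layout and helper names are hypothetical, not an SPSS API.

VARIABLES = {
    "age": {"type": "numeric", "label": "Age in years"},
    "sex": {"type": "numeric", "label": "Sex of respondent"},
    "income": {"type": "numeric", "label": "Household income"},
    "name": {"type": "string", "label": "Respondent name"},
}


def numeric_variables(dictionary):
    """Return the names of all numeric variables, in dictionary order."""
    return [name for name, meta in dictionary.items() if meta["type"] == "numeric"]


def build_descriptives(dictionary):
    """Build a DESCRIPTIVES command covering every numeric variable."""
    names = numeric_variables(dictionary)
    if not names:
        raise ValueError("the dictionary contains no numeric variables")
    return (
        "DESCRIPTIVES VARIABLES=" + " ".join(names) + "\n"
        "  /STATISTICS=MEAN STDDEV MIN MAX."
    )


syntax = build_descriptives(VARIABLES)
print(syntax)
```

In a real session the variable names would come from the dataset's own dictionary and the resulting text would be handed to SPSS to run; the sketch stops at producing the syntax so that it can be tried without SPSS installed.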


Versions and ownership history

* SPSS 1 - 1968
* SPSS 2 - 1983
* SPSS 5 - 1993
* SPSS 6.1 - 1995
* SPSS 7.5 - 1997
* SPSS 8 - 1998
* SPSS 9 - 1999
* SPSS 10 - 1999
* SPSS 11 - 2002
* SPSS 12 - 2004
* SPSS 13 - 2005
* SPSS 14 - 2006
* SPSS 15 - 2006
* SPSS 16 - 2007
* SPSS 17 - 2008
* PASW 17 - 2009
* PASW 18 - 2009
* SPSS 19 - 2010
* SPSS 20 - 2011
* SPSS 21 - 2012
* SPSS 22 - 2013
* SPSS 23 - 2015
* SPSS 24 - 2016, March
* SPSS 25 - 2017, July
* SPSS 26 - 2018
* SPSS 27 - 2019, June (and 27.0.1 in November, 2020)
* SPSS 28 - 2021, May
* SPSS 29 - 2022, Sept
* SPSS 30 - 2024, Sept

SPSS was released in its first version in 1968 as the Statistical Package for the Social Sciences (SPSS) after being developed by Norman H. Nie, Dale H. Bent, and C. Hadlai Hull. Those principals incorporated as SPSS Inc. in 1975. Early versions of SPSS Statistics were written in Fortran and designed for batch processing on mainframes, including for example IBM and ICL versions, originally using punched cards for data and program input. A processing run read a command file of SPSS commands and either a raw input file of fixed-format data with a single record type, or a 'getfile' of data saved by a previous run. To save precious computer time an 'edit' run could be done to check command syntax without analysing the data. From version 10 (SPSS-X) in 1983, data files could contain multiple record types.

Prior to SPSS 16.0, different versions of SPSS were available for
Windows, Mac OS X and Unix. SPSS Statistics version 13.0 for Mac OS X was not compatible with Intel-based Macintosh computers, due to the Rosetta emulation software causing errors in calculations. SPSS Statistics 15.0 for Windows needed a downloadable hotfix to be installed in order to be compatible with Windows Vista. From version 16.0, the same version runs under Windows, Mac, and Linux. The graphical user interface is written in Java. The Mac OS version is provided as a Universal binary, making it fully compatible with both PowerPC and Intel-based Mac hardware.

SPSS Inc announced on July 28, 2009, that it was being acquired by IBM for US$1.2 billion. Because of a dispute about ownership of the name "SPSS", between 2009 and 2010, the product was referred to as PASW (Predictive Analytics SoftWare). As of January 2010, it became "SPSS: An IBM Company". Complete transfer of business to IBM was done by October 1, 2010. By that date, SPSS: An IBM Company ceased to exist. IBM SPSS is now fully integrated into the IBM Corporation, and is one of the brands under IBM Software Group's Business Analytics Portfolio, together with IBM Algorithmics, IBM Cognos and IBM OpenPages.

Companion software in the "IBM SPSS" family is used for
data mining and text analytics (IBM SPSS Modeler), realtime credit scoring services (IBM SPSS Collaboration and Deployment Services), and structural equation modeling (IBM SPSS Amos). SPSS Data Collection and SPSS Dimensions were sold in 2015 to UNICOM Systems, Inc., a division of UNICOM Global, and merged into the integrated software suite UNICOM Intelligence (survey design, survey deployment, data collection, data management and reporting).


IDA (Interactive Data Analysis)

IDA (Interactive Data Analysis) was a software package that originated at what was formerly the National Opinion Research Center (NORC), at the University of Chicago. It was initially offered on the HP-2000; somewhat later, under the ownership of SPSS, it was also available on MUSIC/SP. Regression analysis was one of IDA's strong points.


SCSS - Conversational / Columnar SPSS

SCSS was a software product intended for online use of IBM mainframes. Although the "C" was for "conversational", it also represented a distinction regarding how the data was stored: it used a column-oriented rather than a row-oriented (internal) database. This gave good interactive response time for the SPSS Conversational Statistical System (SCSS), whose strong point, as with SPSS, was cross-tabulation.
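
The difference between row-oriented and column-oriented storage can be illustrated with a minimal Python sketch. It is not a reconstruction of the SCSS storage format: the sample records and the names `to_columns` and `crosstab` are assumptions made for illustration. The point is only that a cross-tabulation of two variables needs to touch just those two columns when the data are stored column by column.

```python
from collections import Counter

# Row-oriented layout: one record per case (hypothetical sample data).
rows = [
    {"sex": "F", "region": "North", "income": 41000},
    {"sex": "M", "region": "South", "income": 38000},
    {"sex": "F", "region": "South", "income": 52000},
    {"sex": "M", "region": "North", "income": 47000},
    {"sex": "F", "region": "North", "income": 39000},
]


def to_columns(records):
    """Convert a list of records into a column-oriented layout."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def crosstab(columns, row_var, col_var):
    """Count cases for each (row_var, col_var) pair, reading only two columns."""
    return Counter(zip(columns[row_var], columns[col_var]))


columns = to_columns(rows)
table = crosstab(columns, "sex", "region")
for (sex, region), count in sorted(table.items()):
    print(sex, region, count)
```

Here `crosstab` never reads the `income` column, whereas a scan over the row-oriented `rows` list would have to pass over every field of every case.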


Project NX

In October 2020, IBM announced the start of an Early Access Program for the "New SPSS Statistics", codenamed Project NX. It contains "many of your favorite SPSS capabilities presented in a new easy to use interface, with integrated guidance, multiple tabs, improved graphs and much more". In December 2021, IBM opened the Early Access Program for the next generation of SPSS Statistics to more users and shared more visuals about it.


See also

* Comparison of statistical packages
* JASP and jamovi, both open-source and free of charge alternatives, offering frequentist and Bayesian models
* PSPP, a free SPSS replacement from the GNU Project
* SPSS Modeler


External links

* Official SPSS User Community
* 50 years of SPSS history
* Raynald Levesque's SPSS Tools - library of worked solutions for SPSS programmers (FAQ, command syntax; macros; scripts; Python)
* Archives of SPSSX-L Discussion - SPSS Listserv active since 1996. Discusses programming, statistics and analysis
* UCLA ATS Resources to help you learn SPSS - Resources for learning SPSS
* UCLA ATS Technical Reports - Report 1 compares Stata, SAS, and SPSS against R (R is a language and environment for statistical computing and graphics).
* SPSS Community - Support for developers of applications using SPSS products, including materials and examples of the Python and R programmability features
* Biomedical Statistics - An educational website dedicated to statistical evaluation of biomedical data using SPSS software