Tidyverse

The tidyverse is a collection of open source packages for the R programming language introduced by Hadley Wickham and his team that "share an underlying design philosophy, grammar, and data structures" of tidy data. Characteristic features of tidyverse packages include extensive use of non-standard evaluation and the encouragement of piping, that is, chaining function calls with a pipe operator so that the output of one step becomes the input of the next.

As of November 2018, the tidyverse package and some of its individual packages accounted for 5 of the 10 most downloaded R packages. The tidyverse is the subject of multiple books and papers. In 2019, a paper describing the ecosystem was published in the ''Journal of Open Source Software''. Its syntax has been referred to as "supremely readable", and some have argued that the tidyverse is an effective way to introduce complete beginners to programming, because pedagogically it allows students to quickly begin doing data processing tasks. Some practitioners have also pointed out that data processing tasks are intuitively easier to chain together with the tidyverse than with pandas, Python's equivalent data processing package. There is an active R community around the tidyverse. For example, the TidyTuesday social data project, organised by the Data Science Learning Community (DSLC), releases varied real-world datasets each week so that community members can participate, share, practice, and learn to work with data more easily.

Critics of the tidyverse have argued that it promotes tools that are harder to teach and learn than their built-in, base R equivalents and that are too dissimilar from some other programming languages. More generally, the tidyverse principles are intended to encourage a universe of streamlined packages that, in principle, alleviates dependency issues and eases compatibility with current and future features. An example of this principled approach is the pharmaverse, a collection of R packages for clinical reporting in the pharmaceutical industry.
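The two characteristic features named above, non-standard evaluation and piping, can be illustrated with a minimal sketch. It assumes the dplyr package is installed and uses R's built-in `mtcars` dataset; the variable names `nested` and `piped` are chosen here for illustration only.

```r
library(dplyr)

# Nested form: the same three steps, read from the inside out.
nested <- summarise(
  group_by(filter(mtcars, cyl == 4), gear),
  mean_mpg = mean(mpg),
  .groups = "drop"
)

# Piped form: each step passes its result to the next one.
# Column names such as cyl, gear and mpg are written unquoted;
# dplyr looks them up inside the data frame (non-standard evaluation).
piped <- mtcars %>%
  filter(cyl == 4) %>%        # keep four-cylinder cars
  group_by(gear) %>%          # one group per number of gears
  summarise(mean_mpg = mean(mpg), .groups = "drop")

print(piped)
```

Both forms compute the same summary; the piped version lists the steps in the order in which they are applied, which is the readability argument usually made for it.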


Packages

The core tidyverse packages, which provide functionality to model, transform, and visualize data, include:
* ggplot2 – for data visualization
* dplyr – for wrangling and transforming data
* tidyr – helps transform data specifically into tidy data, where each variable is a column, each observation is a row, and each value is a cell
* readr – helps read in common delimited text files with data
* purrr – a functional programming toolkit
* tibble – a modern implementation of the built-in data frame data structure
* stringr – helps to manipulate string data types
* forcats – helps to manipulate categorical data types

Additional packages assist the core collection. Other packages based on the tidy data principles are regularly developed, such as tidytext for text analysis, tidymodels for machine learning, and tidyquant for financial operations.
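The tidy data layout described in the list above (one column per variable, one row per observation, one cell per value) can be sketched with tidyr and tibble. The table `wide` and its values are hypothetical and exist only for this illustration; the example assumes both packages are installed.

```r
library(tibble)
library(tidyr)

# Hypothetical "wide" table: the year variable is spread across column names.
wide <- tibble(
  country = c("A", "B"),
  `2019`  = c(10, 20),
  `2020`  = c(11, 22)
)

# Reshape so that year becomes its own column and each row is
# one country-year observation.
long <- pivot_longer(
  wide,
  cols      = c(`2019`, `2020`),
  names_to  = "year",
  values_to = "cases"
)

print(long)
```

After reshaping, `long` has the columns `country`, `year`, and `cases`, with one row for each combination of country and year.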

