Crowdsourced science (not to be confused with citizen science, a subtype of crowdsourced science) refers to collaborative contributions of a large group of people to the different steps of the research process in science. In psychology, the nature and scope of the collaborations can vary in their application and in the benefits they offer.

What is crowdsourcing?

A complement to the traditional way of doing science

Crowdsourcing is a collaborative sourcing model in which a large and diverse number of people or organizations can contribute to a common goal or project. The first examples of crowdsourced science can be found in the 19th century. A Yale University professor launched a call for open participation of citizens to maximize the number and diversity of observations he could get on the Leonid meteor storm of 1833. Crowdsourced science was then set aside for a long time and has only recently gained popularity in science. It helps overcome several limits of the classic model on which science has been conducted until now. For example, in psychology, scientific research has often been limited by small sample sizes and a lack of diversity in studied populations. These limits can be tackled with a more collaborative approach to scientific research (i.e., crowdsourced science).

A two-dimensional concept

Crowdsourcing initiatives can be described along two different continuous axes. The first dimension represents the degree of communication between project members, ranging from very few collaborators and little communication to a large crowd of collaborators with a rich amount of communication. The second dimension corresponds to the degree of inclusiveness, ranging from selecting only people with high expertise in the field of interest to being open to anyone, with or without expertise, who is interested in collaborating. Crossing these two dimensions yields four types of crowdsourced science projects. However, because both dimensions are continuous, a single project is neither entirely inclusive (vs. selective) nor entirely high (vs. low) in communication.

Why crowdsource psychological science?

Limits of the traditional vertical model

In psychology, the traditional way of doing research follows a vertical model. In other words, most of the time, research is carried out by small teams from a specific lab or university, and organized around one or two main researchers (often referred to as the Principal Investigator, or PI). This small team contributes to all the different stages of the research project: conception of the research question, design of the study, data collection, data analysis, discussion of the results, and publication of the manuscript. However, vertical science has its own limitations that can impede the progress of science. Conducting science within small independent teams makes it difficult to conduct large-scale projects (with large samples, large databases and high statistical power), which may limit the scope of research. In traditional science, researchers have access to fewer resources, data, and methodologies. Small, independent team projects are also limited in the feedback they can get from other perspectives. Nowadays, research teams do not necessarily have to make such compromises anymore (e.g., small sample size, same stimuli, same contexts and no replication), because many of these limits could be tackled by more crowdsourced research.
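The link between sample size and statistical power mentioned above can be illustrated with a rough calculation. The following is a minimal sketch, assuming a two-group comparison of means, a normal approximation to the test statistic, and a hypothetical standardized effect size of 0.2; the function name `approx_power` and all numbers are illustrative assumptions, not values taken from any study.

```python
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    Uses the normal approximation; effect_size is a standardized mean
    difference (Cohen's d) and both groups have n_per_group participants.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability that the statistic lands in either rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


if __name__ == "__main__":
    # Hypothetical small effect (d = 0.2) at increasing sample sizes.
    for n in (20, 50, 200, 1000):
        print(n, round(approx_power(0.2, n), 2))
```

Under these assumptions, a single lab recruiting a few dozen participants per group has little chance of detecting a small effect, whereas pooling participants across many labs brings the power close to 1.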

The replication crisis

In the scientific method, replicability is a fundamental criterion to qualify a theory as scientific. The replication crisis (or credibility crisis) is a methodological crisis in science that researchers began to acknowledge around the 2010s. The controversy revolves around the lack of reproducibility of many scientific findings, including those in psychology (e.g., among 100 studies, less than 50% of the findings were replicated). Some of its main underlying causes are referred to as “questionable research practices”. These include p-hacking (i.e., exploiting researcher degrees of freedom until a significant result is obtained), HARKing (i.e., hypothesizing after results are known), publication bias (i.e., the tendency among scholarly journals to only publish significant results), statistical reporting errors, and low statistical power often due to small sample sizes. Among these various issues, small sample sizes and the lack of diversity within samples can be addressed through crowdsourced science, which increases the generalizability of findings and therefore their replicability as well. Indeed, samples in psychology often rely on college students and Western, Educated, Industrialized, Rich and Democratic (WEIRD) populations. For these reasons, the replication crisis has contributed to the rise of crowdsourced, large-scale projects, especially replication projects held at an international scale like the Many Labs project and the Psychological Science Accelerator (PSA). These crowdsourced projects aim at solving some issues raised by the replication crisis, more specifically by assessing the replicability of studies and the generalization of the results to other populations and contexts.

Ambitions of the horizontal model

In contrast to the vertical model of doing research, the horizontal model (an inherent principle of crowdsourced science) mainly relies on variations in terms of inclusiveness and communication. Its core principle is that authority over resources, ownership and expertise does not rest with one or two researchers. Following this principle, the different tasks within a project are distributed among many researchers. The whole project is then supported by a team, which ensures the coordination of the contributors. A perfect horizontal model does not really exist, because the vertical and horizontal models are better understood as the extremes of the same continuum. This distributed model of scientific work is gaining popularity in the scientific community. In about 40 years and across different scientific disciplines, research teams have roughly doubled in size (from 2 to 4 people on average). By encouraging larger crowds to contribute to research projects in psychology, the horizontal model aims at reducing some limits of the vertical one. It has three distinct ambitions:
* Carry out wide-ranging works
* Encourage a democratized psychological science
* Establish robust findings

Large-scale research projects

The first ambition of the horizontal model is to enable researchers to conduct more large-scale research projects (i.e., ambitious projects that cannot be conducted by small teams). By aggregating various skills and resources, it is possible to move from a model where research is defined by available means, to a model where research itself defines the necessary means to answer the research question.

Democratizing psychological science

The second ambition of the horizontal model is to compensate for inequalities (e.g., in terms of recognition, status, and success in a researcher's career). Psychology, and more generally the social sciences, show a strong bias known as the Matthew effect (i.e., academic advantages going to those who are already the most renowned). Early career researchers from less renowned institutions, less economically developed countries or underrepresented demographic groups are generally less likely to have access to high-profile projects. Crowdsourcing enables such researchers to contribute to impactful projects and gain recognition for their work.

Robust findings

The third ambition of the horizontal model is to improve the robustness, generalizability, and reliability of findings in order to increase the credibility of psychological science. Horizontal collaboration between research teams facilitates the replication of studies and makes it easier to detect biased effect sizes, p-hacking, and publication bias, all of which are problems raised by the replication crisis (see also the section “The replication crisis”).

Crowdsourcing in practice

Contributions at different stages of research

This section aims at detailing how crowdsourcing can contribute to the different stages of the research process, from the generation of research ideas to the publication of the outcomes.

Ideation

Ideation is the first step of any research project. In psychology, it refers to the process of defining the general idea behind a project (purpose, research question, and hypotheses). This step can be done in collaboration between several researchers to scan a broader spectrum of ideas and select those of broadest interest and impact. Faced with a research question, the different collaborators can bring their expertise to the construction of hypotheses. Crowds can also help to generate new ideas to solve complex problems, as illustrated by the Polymath project.

Assembling resources

When assembling resources, crowdsourcing can be useful, especially for labs with fewer resources at their disposal. Online platforms such as Science Exchange and StudySwap allow researchers to establish new communication lines and share resources between labs. Matching resources from labs across the globe minimizes waste and facilitates research teams’ ability to meet their goals. Sharing tasks across labs can improve the efficiency of a research project, especially a highly time-consuming one. In biology, for example, studying the entire genome takes a lot of time. By distributing its investigation and combining resources across multiple labs, it is possible to accelerate the research process.

Study design

There can be many ways to design a study. Research teams across the world do not all share the same theoretical background, nor are they all equipped with the same materials. Crowdsourcing can be useful in the case of conceptual replications (i.e., testing the same research question through different operationalizations). When testing the same research question, variations in study designs can lead to strong variations in effect size estimates. Diversifying the methods used to test the same hypothesis across different populations, through collaborative projects, enables better estimates of the true consistency of a scientific claim.
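One common way to summarize effect size estimates coming from several labs or designs is inverse-variance pooling, together with Cochran's Q as a measure of how much the estimates disagree. The sketch below is only an illustration under that assumption: the function name `pool_effects` and the lab estimates are hypothetical, and the projects discussed in this article may rely on different (e.g., random-effects or multilevel) models.

```python
def pool_effects(effects, variances):
    """Fixed-effect (inverse-variance) pooling of per-lab effect sizes.

    Returns the pooled estimate, its standard error, and Cochran's Q,
    which grows when the lab estimates are more spread out than their
    sampling variances would suggest.
    """
    if len(effects) != len(variances) or not effects:
        raise ValueError("need one variance per effect size")
    # More precise estimates (smaller variance) get more weight.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    standard_error = (1.0 / total_weight) ** 0.5
    q_statistic = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    return pooled, standard_error, q_statistic


if __name__ == "__main__":
    # Hypothetical estimates from four labs using different operationalizations.
    lab_effects = [0.10, 0.35, 0.22, 0.05]
    lab_variances = [0.02, 0.03, 0.01, 0.04]
    print(pool_effects(lab_effects, lab_variances))
```

A large Q relative to the number of labs minus one suggests that the effect depends on the design or the population, which is the kind of information that a single-lab study cannot provide.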

Data collection

In psychology, data collection often relies on samples drawn from Western, Educated, Industrialized, Rich and Democratic (WEIRD) populations, which impedes the overall generalizability of findings. Crowdsourcing the data collection process by relying on multi-lab data collection and online crowdsourcing platforms (e.g., Amazon Mechanical Turk, Prolific) makes it easier to reach a wider audience of participants from different cultural backgrounds and non-WEIRD populations. When the research question makes it possible to rely on internet samples, it is also an easy way to recruit larger samples of participants with minimal financial input and within short amounts of time. Most of the time, members of the general public are recruited to take part in studies as research participants, but they can also be recruited to collect data and observations.

Data analysis

In research, data analysis refers to the process of cleaning, transforming, and modeling data using statistical tools, often with the purpose of answering a research question. Within a research project, this is typically done by a single analyst (or team) and results in a single analysis of a dataset. Analytic strategies can differ greatly from one team to another. For example, a study found that among 241 published articles on fMRI, 223 different analytic strategies were used. Moreover, there can be many ways to test a single hypothesis from the same dataset. Although defensible, decisions in data analysis remain subjective, which can greatly affect research results. A way to counter this subjectivity is transparency. When data, analytic plans, and analytic decisions are made transparent and openly accessible to the rest of the community, it facilitates criticism and gives the opportunity to explore alternative ways to analyse data. “Crowdsourcing the analysis of the data reveals the extent to which research conclusions are contingent on the defensible yet subjective decision made by different analysts.” (Uhlmann et al., 2019)
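The dependence of a result on defensible analytic choices can be made concrete by running the same data through every combination of a few such choices. The following is a minimal sketch with made-up scores and two hypothetical decisions (an outlier cutoff and the choice of summary statistic); the names `trim` and `run_all_analyses` are illustrative and do not come from any published project.

```python
from itertools import product
from statistics import mean, median


def trim(values, cutoff):
    """Drop values above the cutoff; a cutoff of None keeps everything."""
    return [v for v in values if cutoff is None or v <= cutoff]


def run_all_analyses(group_a, group_b, cutoffs, summaries):
    """Compute the group difference under every combination of choices."""
    results = {}
    for cutoff, (name, summarize) in product(cutoffs, summaries.items()):
        kept_a, kept_b = trim(group_a, cutoff), trim(group_b, cutoff)
        results[(cutoff, name)] = summarize(kept_a) - summarize(kept_b)
    return results


if __name__ == "__main__":
    # Made-up scores; group A contains one extreme observation.
    scores_a = [5, 6, 7, 8, 30]
    scores_b = [4, 5, 6, 7, 8]
    choices = run_all_analyses(
        scores_a,
        scores_b,
        cutoffs=(None, 20),
        summaries={"mean": mean, "median": median},
    )
    for decision, difference in choices.items():
        print(decision, difference)
```

In this toy example, the estimated group difference changes substantially depending on whether the extreme score is excluded and on which summary is used, even though each individual choice could be defended.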

Writing research reports

A research report is a document reporting the findings of a research project. The overall quality of a research report can benefit from crowdsourcing practices, especially during the writing process. Aggregating a large number of contributors increases the range of expertise and perspectives, which contributes to building more solid arguments. It also makes proofreading easier (e.g., catching grammatical errors, awkward phrasing, typos, biases, and factual errors, and checking claims). To facilitate collaborative writing, some researchers have suggested guidelines for writing manuscripts with many authors. There should always be a leading author (or a few, but with individual responsibilities) who takes care of managing the writing process and takes explicit responsibility for any mistake, which avoids diffusion of responsibility in case of errors. It is also recommended that the leading author(s) follow the four general principles below:
* Care in crediting the coauthor team
* Clear and frequent mass communication
* Making sure materials associated with the manuscript are well organized
* Early and deliberate decision-making

Peer review

Before getting published in an academic journal, submitted papers undergo peer review (i.e., the process of having an author's academic work reviewed by experts from the same field). Typically, it is performed by a limited number of selected reviewers. Crowdsourcing the peer review process increases the chances of getting reviews from a larger number of experts in the relevant domain. This is also a way to significantly increase opportunities for better criticism and faster fact-checking before an article gets published. Crowdsourced peer review can be accomplished alongside open access peer review (e.g., through centralized platforms dedicated to the discussion and criticism of research reports).

Examples in psychology

The replication crisis served as a prelude to the emergence of many large-scale collaborative projects. Some of the most important ones include the Many Labs project, the ManyBabies project, the #EEGManyLabs project, the Reproducibility Project, the Collaborative Replication and Education Project (CREP), and the Psychological Science Accelerator (PSA).

Projects surrounding the COVID-19 pandemic

In response to the COVID-19 pandemic, several initiatives to improve collaboration between scientists from a variety of fields have been launched. Studies exploring the impact of the COVID-19 pandemic on behavior and mental health are conducted and widely shared across social media such as Twitter. In this context, the PSA made a call for studies surrounding COVID-19 and received 66 study proposals from different experts within one week. As of May 2020, three studies had been selected and were being conducted worldwide. Two of these studies aim to improve the adoption of health behaviors that limit the spread of COVID-19, and one aims at helping people regulate their negative emotions during the crisis. This project aims to get results that could help all countries to face the pandemic with means adapted to their population.

Challenges and future directions

Confronting the vertical and horizontal models

Although a horizontal model of conducting science seems promising to overcome some limits of the vertical model (see also the sections “Limits of the traditional vertical model” and “Ambitions of the horizontal model”), it is difficult to empirically assess its benefits. It remains unclear whether two sets of research studying the same question, one conducted vertically and the other horizontally, would lead to different outcomes.

Financial independence

Currently, to sustain a collaborative project, researchers either have to use money from their own grants or apply for funding. Such projects are thus limited in the studies they can conduct without financial independence. Coordinating hundreds of labs to conduct a study requires considerable administrative work. Structures like the Psychological Science Accelerator (PSA) have to ensure each participating lab obtains ethics approval to conduct a given study (see also Institutional review board). Within the PSA, this background work is currently completed voluntarily by dedicated teams or researchers alongside their main occupation. By achieving financial independence (as is the case for CERN), these projects could optimize their functioning by creating jobs dedicated to these tasks.

Authorship in the era of crowdsourced science

A researcher's career path in the academic world (i.e., job opportunities, grant attributions, etc.) depends on major contributions to research projects, which are often assessed through the number of publications on which one appears as the lead author. In psychology, it is common practice to list authors in order of contribution, with involvement decreasing as we move down the list. In practice, it is not rare for multiple authors on the same paper to have contributed equally to the project, but in different ways. In that sense, contributors on a project do not always get the credit they deserve, as the order in which authors are listed does not capture the contribution of each author well. This is especially true for large-scale collaborative projects with many contributors (e.g., the PSA001 project has over 200 contributors). An alternative to the current authorship system is the CRediT taxonomy, a taxonomy describing 14 distinctive categories (e.g., conceptualization of the project, administration of the project, funding acquisition, investigation) that represent the roles typically played by contributors in a scientific project. Papers relying on this taxonomy allow for a more representative description of the involvement of each contributor on a project.
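As an illustration of how role-based credit differs from an ordered author list, the sketch below records which CRediT roles each contributor held and then lists contributors per role. The 14 role labels follow the CRediT taxonomy (written here with a plain hyphen); the contributor names and the function `contributors_by_role` are hypothetical.

```python
# The 14 CRediT contributor roles.
CREDIT_ROLES = (
    "Conceptualization",
    "Data curation",
    "Formal analysis",
    "Funding acquisition",
    "Investigation",
    "Methodology",
    "Project administration",
    "Resources",
    "Software",
    "Supervision",
    "Validation",
    "Visualization",
    "Writing - original draft",
    "Writing - review & editing",
)


def contributors_by_role(contributions):
    """Invert a {contributor: [roles]} mapping into {role: [contributors]}.

    Raises ValueError for any role that is not part of the taxonomy, and
    omits roles that nobody held.
    """
    by_role = {role: [] for role in CREDIT_ROLES}
    for person, roles in contributions.items():
        for role in roles:
            if role not in by_role:
                raise ValueError(f"unknown CRediT role: {role!r}")
            by_role[role].append(person)
    return {role: sorted(people) for role, people in by_role.items() if people}


if __name__ == "__main__":
    # Hypothetical contributors to a multi-lab project.
    example = {
        "Researcher A": ["Conceptualization", "Project administration"],
        "Researcher B": ["Investigation", "Data curation"],
        "Researcher C": ["Investigation", "Formal analysis"],
    }
    for role, people in contributors_by_role(example).items():
        print(f"{role}: {', '.join(people)}")
```

In such a representation, two contributors who both collected data are credited identically for that role, regardless of where they would appear in an ordered author list.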

Developing open-science and crowdsourced practices

The enrolment of students in different collaborative projects could foster open science and crowdsourced practices early in a researcher's career. For instance, in the Collaborative Replication and Education Project (CREP), students are taught the roots and importance of such practices through the replication of recent major findings in psychology. Editorial policies of scientific journals also play a role in the adoption of open science and crowdsourced practices, especially by defining new publication criteria. For instance, more than 200 journals now adopt an “in principle acceptance” format of peer-reviewing papers. In this publishing format, articles are accepted for publication prior to data collection, on the basis of the provided theoretical framework, methodology, and analysis plan.

Remaining issues

Collaborative projects are expected to address flexibility in data analyses and other biases by aggregating experts, but these problems are not always overcome. It has also been shown that crowdsourced projects involving poorly trained and weakly involved contributors (e.g., students) can lead to the falsification of data. Linking up a wide array of contributors can thus create structural problems that may impact research outcomes. Both issues highlight the importance of education on open science and crowdsourced practices (see also the section “Developing open-science and crowdsourced practices”).

Controversies

Controversies surrounding crowdsourced science do not directly involve a criticism of crowdsourced science itself, but rather of its costs, in terms of both financial and time investment. Collaborative practices in research remain very expensive and face a considerable number of challenges. Solutions to address these challenges require major structural changes within research institutions and have significant repercussions on researchers’ academic careers (see also the section “Challenges and future directions”). The shift from a vertical model toward a more horizontal one was partly motivated by the replication crisis in psychology. However, some authors are skeptical about the extent of this crisis in psychology. According to these authors, the failure to replicate most findings is overestimated and mostly due to a lack of fidelity in replication protocols. These claims call into question whether large collaborative research projects are worth the cost, suggesting that a shift towards a horizontal model of doing science may not be necessary. Given the cost of crowdsourced projects and the resources they require, crowdsourcing may not always be the optimal approach. Nonetheless, the crowdsourced science approach has helped the development of tools from which any project, collaborative or not, can benefit. An optimal approach would be a compromise between the vertical and horizontal models, which would depend on the research question at hand and on the constraints of each project.

External links

Psychological Science Accelerator

Science Exchange

Study Swap
