E-Science or eScience is computationally intensive science that is carried out in highly distributed network environments, or science that uses immense data sets that require grid computing; the term sometimes includes technologies that enable distributed collaboration, such as the
Access Grid. The term was created by John Taylor, the Director General of the United Kingdom's
Office of Science and Technology in 1999 and was used to describe a large funding initiative starting in November 2000. E-science has been more broadly interpreted since then, as "the application of computer technology to the undertaking of modern scientific investigation, including the preparation, experimentation, data collection, results dissemination, and long-term storage and accessibility of all materials generated through the scientific process. These may include data modeling and analysis, electronic/digitized laboratory notebooks, raw and fitted data sets, manuscript production and draft versions, pre-prints, and print and/or electronic publications."
[Bohle, S. "What is E-science and How Should it Be Managed?" Nature.com, Spektrum der Wissenschaft (Scientific American), http://www.scilogs.com/scientific_and_medical_libraries/what-is-e-science-and-how-should-it-be-managed/.] In 2014, the IEEE eScience Conference Series condensed the definition to "eScience promotes innovation in collaborative, computationally- or data-intensive research across all disciplines, throughout the research lifecycle" in one of the working definitions used by the organizers. E-science encompasses "what is often referred to as big data [which] has revolutionized science... [such as] the Large Hadron Collider (LHC) at CERN... [that] generates around 780 terabytes per year... highly data intensive modern fields of science...that generate large amounts of E-science data include: computational biology, bioinformatics, genomics" and the human digital footprint for the social sciences.
[DT&SC 7-2: Computational Social Science. https://www.youtube.com/watch?v=TEo0Au1brHs From the DT&SC online course at the University of California: https://canvas.instructure.com/courses/949415]
Turing Award winner Jim Gray imagined "data-intensive science" or "e-science" as a "fourth paradigm" of science (empirical, theoretical, computational and now data-driven) and asserted that "everything about science is changing because of the impact of information technology" and the data deluge.
E-Science revolutionizes both fundamental legs of the scientific method: empirical research, especially through digital big data; and scientific theory, especially through computer simulation model building. [Hilbert, M. (2015). e-Science for Digital Development: ICT4ICT4D. Centre for Development Informatics, SEED, University of Manchester.] These ideas were reflected by the White House's Office of Science and Technology Policy in February 2013, which slated many of the aforementioned e-Science output products for preservation and access requirements under the memorandum's directive. E-sciences include particle physics, earth sciences and social simulations.
Characteristics and examples
Most of the research activities into e-Science have focused on the development of new computational tools and infrastructures to support scientific discovery. Due to the complexity of the software and the backend infrastructural requirements, e-Science projects usually involve large teams managed and developed by research laboratories, large universities or governments. Currently there is a large focus in e-Science in the United Kingdom, where the UK e-Science programme provides significant funding. In Europe the development of computing capabilities to support the
CERN Large Hadron Collider has led to the development of e-Science and Grid infrastructures which are also used by other disciplines.
Consortiums
Example e-Science infrastructures include the Worldwide LHC Computing Grid, a federation with various partners including the European Grid Infrastructure and the Open Science Grid.
To support e-Science applications,
Open Science Grid combines interfaces to more than 100 nationwide clusters, 50 interfaces to geographically distributed storage caches, and 8 campus grids (Purdue, Wisconsin-Madison, Clemson, Nebraska-Lincoln, FermiGrid at FNAL, SUNY-Buffalo, and Oklahoma in the United States; and
UNESP in Brazil). Areas of science benefiting from Open Science Grid include:
* astrophysics, gravitational physics, high-energy physics, neutrino physics, nuclear physics
* molecular dynamics, materials science, materials engineering, computer science, computer engineering, nanotechnology
* structural biology, computational biology, genomics, proteomics, medicine
UK programme
After his appointment as Director General of the Research Councils in 1999, John Taylor, with the support of the Science Minister David Sainsbury and the Chancellor of the Exchequer Gordon Brown, bid to HM Treasury to fund a programme of e-infrastructure development for science which would provide the foundation for UK science and industry to be a world leader in the knowledge economy, which motivated the Lisbon Strategy for sustainable economic growth that the UK government committed to in March 2000.
In November 2000 John Taylor announced £98 million for a national UK e-Science programme. An additional £20 million contribution was planned from UK industry in matching funds to projects that they participated in. From this budget of £120 million over three years, £75 million was to be spent on grid application pilots in all areas of science, administered by the Research Council responsible for each area, while £35 million was to be administered by the
EPSRC as a Core Programme to develop "industrial strength" Grid middleware. Phase 2 of the programme for 2004-2006 was supported by a further £96 million for application projects, and £27 million for the EPSRC core programme. Phase 3 of the programme for 2007-2009 was supported by a further £14 million for the EPSRC core programme and a further sum for applications. Additional funding for UK e-Science activities was provided from European Union funding, from
university funding council SRIF funding for hardware, and from
Jisc
for networking and other infrastructure.
The UK e-Science programme comprised a wide range of resources, centres and people including the National e-Science Centre (NeSC) which is managed by the Universities of Glasgow and Edinburgh, with facilities in both cities.
Tony Hey led the core programme from 2001 to 2005.
Within the UK, regional e-Science centres support their local universities and projects, including:
* White Rose Grid e-Science Centre (WRGeSC)
* Belfast e-Science Centre (BeSC)
* Centre for eResearch Bristol (CeRB)
* Cambridge e-Science Centre (CeSC)
* STFC e-Science Centre (STFCeSC)
* e-Science North West (eSNW)
* National Grid Service (NGS)
* OMII-UK
* Lancaster University Centre for e-Science
* London e-Science Centre (LeSC)
* North East Regional e-Science Centre (NEReSC)
* Oxford e-Science Centre (OeSC)
* Southampton e-Science Centre (SeSC)
* Welsh e-Science Centre (WeSC)
* (MeSC)
There are also various centres of excellence and research centres.
In addition to centres, the grid application pilot projects were funded by the Research Council responsible for each area of UK science funding.
The EPSRC funded 11 pilot e-Science projects in three phases (for about £3 million each in the first phase):
* First phase (2001–2005): CombEchem, DAME, Discovery Net, GEODISE, myGrid and RealityGrid
* Second phase (2004–2008): GOLD and Integrative Biology
* Third phase (2005–2010): PMSEG (MESSAGE), CARMEN and NanoCMOS
The PPARC/STFC funded two projects: GridPP (phase 1 for £17 million, phase 2 for £5.9 million, phase 3 for £30 million and a 4th phase running from 2011 to 2014) and Astrogrid (£14 million over 3 phases).
The remaining £23 million of phase one funding was divided between the application projects funded by BBSRC, MRC and NERC:
* BBSRC: Biomolecular Grid, Proteome Annotation Pipeline, High-Throughput Structural Biology, Global Biodiversity
* MRC: Biology of Ageing, Sequence and Structure Data, Molecular Genetics, Cancer Management, Clinical e-Science Framework, Neuroinformatics Modeling Tools
* NERC: Climateprediction.com, Oceanographic Grid, Molecular Environmental Grid, NERC DataGrid
The funded UK e-Science programme was reviewed on its completion in 2009 by an international panel led by Daniel E. Atkins, director of the Office of Cyberinfrastructure of the US
NSF. The report concluded that the programme had developed a skilled pool of expertise, some services, and had led to cooperation between academia and industry, but that these achievements were at a project level rather than by generating infrastructure or transforming disciplines to adopt e-Science as a normal method of work, and that they were not self-sustainable without further investment.
United States
United States-based initiatives, where the term cyberinfrastructure is typically used to define e-Science projects, are primarily funded by the National Science Foundation Office of Cyberinfrastructure (NSF OCI) and the Department of Energy (in particular the Office of Science). After the conclusion of TeraGrid in 2011, the ACCESS program was established and funded by the National Science Foundation to help researchers and educators, with or without supporting grants, to utilize the nation's advanced computing systems and services.
The Netherlands
Dutch eScience research is coordinated by the Netherlands eScience Center in Amsterdam, an initiative founded by NWO and SURF.
Europe
Plan-Europe (PLAN-E) is a Platform of National e-Science/Data Research Centers in Europe, established during the constituting meeting on 29–30 October 2014 in Amsterdam, the Netherlands, and based on agreed Terms of Reference. PLAN-E has a kernel group of active members and convenes twice annually. More information is available from PLAN-E.
Sweden
Two academic research projects have been carried out in Sweden by two different groups of universities, to help researchers share and access scientific computing resources and knowledge:
* Swedish e-Science Research Center (SeRC): Kungliga Tekniska högskolan (KTH), Stockholm University (SU), Karolinska institutet (KI) and Linköping University (LiU)
* eSSENCE, The e-Science Collaboration (eSSENCE): Uppsala University, Lund University and Umeå University
Comparison with traditional science
Traditional science is representative of two distinct philosophical traditions within the history of science, but e-Science, it is being argued, requires a paradigm shift, and the addition of a third branch of the sciences. "The idea of open data is not a new one; indeed, when studying the history and philosophy of science, Robert Boyle is credited with stressing the concepts of skepticism, transparency, and reproducibility for independent verification in scholarly publishing in the 1660s. The scientific method later was divided into two major branches, deductive and empirical approaches. Today, a theoretical revision in the scientific method should include a new branch, Victoria Stodden advocate[s], that of the computational approach, where like the other two methods, all of the computational steps by which scientists draw conclusions are revealed. This is because within the last 20 years, people have been grappling with how to handle changes in high performance computing and simulation."
As such, e-science aims at combining both empirical and theoretical traditions, while computer simulations can create artificial data, and real-time big data can be used to calibrate theoretical simulation models.
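The interplay between simulation and data described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy exponential-growth model, the function names `simulate` and `calibrate`, and the grid of candidate rates are hypothetical and do not come from any particular e-Science platform. The "observed" data are themselves produced by the simulation, standing in for real measurements.

```python
import math


def simulate(rate, steps, initial=1.0):
    """Toy simulation model (assumed): exponential growth at integer time steps."""
    return [initial * math.exp(rate * t) for t in range(steps)]


def squared_error(simulated, observed):
    """Sum of squared differences between simulated and observed values."""
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))


def calibrate(observed, candidate_rates):
    """Return the candidate rate whose simulation best matches the observations."""
    return min(
        candidate_rates,
        key=lambda r: squared_error(simulate(r, len(observed)), observed),
    )


# Artificial data created by the simulation itself (hypothetical stand-in
# for a stream of real measurements).
observed = simulate(0.30, steps=10)

# Calibrate the model parameter against the data by brute-force grid search.
candidates = [i / 100 for i in range(0, 101)]  # 0.00, 0.01, ..., 1.00
best_rate = calibrate(observed, candidates)
```

A grid search is used here only because it is easy to read; with real data, a numerical optimiser or a statistical fitting method would usually be a more appropriate choice.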
Conceptually, e-Science revolves around developing new methods to support scientists in conducting scientific research with the aim of making new scientific discoveries by analyzing vast amounts of data accessible over the internet using vast amounts of computational resources. However, discoveries of value cannot be made simply by providing computational tools, a cyberinfrastructure or by performing a pre-defined set of steps to produce a result. Rather, there needs to be an original, creative aspect to the activity that by its nature cannot be automated. This has led to various research efforts that attempt to define the properties that e-Science platforms should provide in order to support a new paradigm of doing science, and new rules to fulfill the requirements of preserving and making computational data results available in a manner such that they are reproducible in traceable, logical steps, as an intrinsic requirement for the maintenance of modern scientific integrity that allows an extension of "Boyle's tradition in the computational age".
Modelling e-Science processes
One view argues that since a modern discovery process instance serves a similar purpose to a mathematical proof, it should have similar properties: namely, it should allow results to be deterministically reproduced when re-executed, and its intermediate results should be viewable to aid examination and comprehension. In this case, simply modelling the provenance of data is not sufficient. One also has to model the provenance of the hypotheses and results generated from analyzing the data, so as to provide evidence that supports new discoveries.
Scientific workflows have thus been proposed and developed to assist scientists to track the evolution of their data, intermediate results and final results as a means to document and track the evolution of discoveries within a piece of scientific research.
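The two properties named above, deterministic re-execution and inspectable intermediate results, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the design of any actual scientific workflow system: the `Workflow` class, the `fingerprint` helper, and the two example steps are hypothetical, and values are assumed to be JSON-serialisable so that they can be hashed.

```python
import hashlib
import json


def fingerprint(value):
    """Stable SHA-256 digest of a JSON-serialisable value."""
    encoded = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class Workflow:
    """Hypothetical linear workflow that records provenance for each step."""

    def __init__(self):
        self.steps = []       # ordered (name, function) pairs
        self.provenance = []  # one record per executed step

    def add_step(self, name, func):
        self.steps.append((name, func))

    def run(self, data):
        """Execute all steps in order, keeping every intermediate result."""
        self.provenance = []
        for name, func in self.steps:
            result = func(data)
            self.provenance.append({
                "step": name,
                "input_digest": fingerprint(data),
                "output": result,
                "output_digest": fingerprint(result),
            })
            data = result
        return data

    def reproduces(self, data):
        """Re-execute from the same input and compare all recorded digests."""
        recorded = [r["output_digest"] for r in self.provenance]
        replay = Workflow()
        replay.steps = list(self.steps)
        replay.run(data)
        return recorded == [r["output_digest"] for r in replay.provenance]


# Example: two deterministic steps over a small, made-up data set.
wf = Workflow()
wf.add_step("clean", lambda xs: [x for x in xs if x is not None])
wf.add_step("mean", lambda xs: sum(xs) / len(xs))

raw = [2.0, None, 4.0, 6.0]
result = wf.run(raw)
```

After the run, `wf.provenance` holds each step's name, input digest and output, so a reader can inspect the cleaned list before the mean is taken, and `wf.reproduces(raw)` re-executes the steps to check that the same digests come out. A step that depended on randomness or on external state would fail that check, which is the kind of non-reproducibility this view is concerned with.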
Science 2.0
Other views include Science 2.0, where e-Science is considered to be a shift from the publication of final results by well-defined collaborative groups towards a more open approach, which includes the public sharing of raw data, preliminary experimental results, and related information. To facilitate this shift, the Science 2.0 view focuses on providing tools that simplify communication, cooperation and collaboration between interested parties. Such an approach has the potential to: speed up the process of scientific discovery; overcome problems associated with academic publishing and peer review; and remove time and cost barriers that limit the process of generating new knowledge.
See also
* Citizen science
* Cyberinfrastructure
* Distributed computing
* E-research
* e-Science librarianship
* e-Social Science
* Grid computing
* List of e-Science infrastructures
* Science 2.0
* Scientific workflow system
External links
* DOE and NSF Open Science Grid
* The eScience Institute at the University of Washington
* The Dutch Virtual Laboratory for e-science (VL-e) project
* UK Research Council's e-Science program
* e-science : personnalisation des résultats de recherches Google et sociologies du web (in French)
* UK National Centre for e-Social Science and their Wiki on e-Social Science
* NSF TeraGrid Project
* Arts and Humanities E-Science Support Centre (AHESSC)
* E-Science and Data Services Collaborative (EDSC)
* The European Commission's e-Infrastructures activity
* Swedish e-Science Research Centre
* eSSENCE the e-Science Collaboration