Biomedical Data Science

Biomedical data science is a multidisciplinary field that leverages large volumes of data to promote biomedical innovation and discovery. It draws on several fields, including biostatistics, biomedical informatics, and machine learning, with the goal of understanding biological and medical data. It can be viewed as the study and application of data science to solve biomedical problems.

Modern biomedical datasets often have specific features that make their analysis difficult, including:
* Large numbers of features (sometimes billions), typically far larger than the number of samples (typically tens or hundreds)
* Noisy and missing data
* Privacy concerns (e.g., electronic health record confidentiality)
* Requirement of interpretability from decision makers and regulatory bodies

Many biomedical data science projects apply machine learning to such datasets. These characteristics, while also present in many data science applications more generally, make biomedical data science a distinct field.

Examples of biomedical data science research include:
* Computational genomics
* Computational imaging
* Electronic health record data mining
* Biomedical network science
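
The first dataset characteristic listed above, far more features than samples, can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the data are pure Gaussian noise, and the function names (`pearson`, `strongest_chance_correlation`), sample sizes, and seed are hypothetical choices, not taken from any particular study. It shows that when thousands of unrelated features are screened against an outcome measured on only 20 samples, some feature tends to look strongly correlated by chance alone.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def strongest_chance_correlation(n_samples, n_features, seed=0):
    """Largest |correlation| between pure-noise features and a pure-noise outcome."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    outcome = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        # Each feature is independent noise: any correlation is spurious.
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(feature, outcome)))
    return best


few = strongest_chance_correlation(n_samples=20, n_features=10)
many = strongest_chance_correlation(n_samples=20, n_features=2000)
print(f"10 noise features:   max |r| = {few:.2f}")
print(f"2000 noise features: max |r| = {many:.2f}")
```

Because both calls share a seed, the 2000-feature run contains the same first ten features as the 10-feature run, so its maximum can only be equal or larger. In real analyses this effect is one reason such datasets usually call for safeguards such as held-out validation, regularization, or multiple-testing correction.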


Training in Biomedical Data Science

The National Library of Medicine of the US National Institutes of Health (NIH) identified key biomedical data scientist attributes in an NIH-wide review: general biomedical subject matter knowledge; programming language expertise; predictive analytics, modeling, and machine learning; team science and communication; and responsible data stewardship.


University Departments and Programs

* Johns Hopkins University's Department of Biomedical Engineering offers biomedical data science training at the undergraduate, master's, and PhD levels. It was the first university to offer programs at both undergraduate and graduate levels.
* Dartmouth College's Geisel School of Medicine houses the Department of Biomedical Data Science, where Quantitative Biomedical Sciences programs are available at the master's and PhD levels.
* Imperial College London's Faculty of Medicine and Data Science Institute offer an MRes in Biomedical Research (Data Science).
* The Icahn School of Medicine at Mount Sinai offers a Master of Science in Biomedical Data Science.
* Stanford University's Department of Biomedical Data Science offers multiple biomedical informatics graduate programs (MS, PhD, and MD/PhD).
* The University of Exeter's College of Healthcare and Medicine offers an MSc in Health Data Science.


Biomedical Data Science Research in Academia


Scholarly Journals

The first journal dedicated to biomedical data science, the Annual Review of Biomedical Data Science, appeared in 2018. Its stated scope reads:
“The Annual Review of Biomedical Data Science provides comprehensive expert reviews in biomedical data science, focusing on advanced methods to store, retrieve, analyze, and organize biomedical data and knowledge. The scope of the journal encompasses informatics, computational, and statistical approaches to biomedical data, including the sub-fields of bioinformatics, computational biology, biomedical informatics, clinical and clinical research informatics, biostatistics, and imaging informatics. The mission of the journal is to identify both emerging and established areas of biomedical data science, and the leaders in these fields.”
Other journals, such as Health Data Science and Nature Machine Intelligence, have a broader scope but regularly publish biomedical data science research. Data science depends on curated datasets, and the field has seen the rise of journals dedicated to describing and validating such datasets. Some of these are useful for biomedical applications, including Scientific Data, Biomedical Data, and Data.


Example

The Human Genome Project (HGP), which uncovered the DNA sequences that compose human genes, would not have been possible without biomedical data science. Significant computational resources were required to process the data in the HGP, as the human genome contains roughly 3 billion DNA base pairs per haploid copy (about 6 billion in a diploid cell). Scientists constructed the genome by piecing together small fragments of DNA, and computing overlaps between these sequences alone required over 10,000 CPU hours. At this scale, scientists relied on advanced algorithms to perform data processing steps such as sequence assembly and sequence alignment for quality control. Some of these algorithms, such as BLAST (Basic Local Alignment Search Tool), are still used in modern bioinformatics. Scientists in the HGP also had to address complexities often associated with biomedical data, including noisy data, such as DNA read errors, and the privacy rights of the research subjects (Venter, J. Craig, et al. "The sequence of the human genome". Science 291 (5507): 1304–1351, 2001. doi:10.1126/science.1058040. PMID 11181995).

The HGP, declared complete in 2003, has had immense impact both biologically, shedding light on human evolution, and medically, helping to establish the field of bioinformatics and leading to technologies such as genetic screening and gene therapy.
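
The idea of piecing fragments together by their overlaps can be sketched in a few lines. The code below is a toy greedy overlap merger, written only for illustration: it is not the assembly method used in the HGP, the function names (`overlap`, `greedy_assemble`), the `min_len` threshold, and the example fragments are all hypothetical, and it assumes error-free reads from a single strand. Real assemblers must also handle read errors, repeats, and reverse complements.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` characters exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    reads = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        # All-pairs comparison: quadratic in the number of fragments,
        # which is why overlap computation dominates at genome scale.
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to merge; return the remaining contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Hypothetical error-free fragments of the sequence ATGGCGTACGTTAGC.
fragments = ["ATGGCGT", "GCGTACG", "TACGTTA", "GTTAGC"]
contigs = greedy_assemble(fragments)
print(contigs)  # ['ATGGCGTACGTTAGC']
```

The nested loop compares every fragment with every other, so the work grows quadratically with the number of fragments. This gives a rough sense of why overlap computation was so expensive for a whole genome and why more efficient indexing and alignment algorithms were needed.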

