
Target is a collaborative research project specialising in big data processing and management in the northern Netherlands. It is a public-private cooperation, initiated in 2009 and supported by government subsidies. It is run by a consortium of ten academic and computer industry partners, coordinated by the University of Groningen, and researches data management for science projects in the areas of astronomy, life sciences, artificial intelligence and medical diagnosis. Cooperating in the Target project are various divisions of the University of Groningen, its medical center, IBM, Oracle, ASTRON and the Dutch IT firms Elkoog/Heeii and Nspyre. Target's computer center is hosted by the Center for Information Technology, the computing center of the University of Groningen, and consists of more than 10 petabytes of storage based on IBM's GPFS storage technology, a high-performance computing cluster and a grid cluster, which is part of the European Grid Infrastructure.


History

The project was initiated to transfer the expertise of astronomers in massive data processing to other areas of science. Target builds on a distributed computing environment called Astro-WISE. Astro-WISE itself originated as an initiative of the OPTICON Wide Field Imaging Working Group, which was set up to consider a standardised European survey system to facilitate research, data reduction and data mining using data from the new generation of wide field survey cameras. The Target project launched in 2009 after receiving 32 million euros of funding for a period of five years from the European Fund for Regional Development, the Dutch Ministry of Economic Affairs ("Pieken in de Delta" project), and the provinces of Groningen and Drenthe. The project runs under the auspices of the Northern Netherlands Provinces Alliance (SNN) and the Groningen municipality.


Technological findings

At the start of the project, one aim was to develop a single integrated processing system, consisting of a multi-petabyte scale file system and several different types of grid and compute clusters. During the first few years, it became apparent that different e-Science disciplines have different data processing requirements. In some areas, a massive data streaming effort takes place, as in LOFAR. In astronomy, the number of data objects may run in the billions, with a limited number of data columns. In genomics, only a small number of rows are needed, but the number of columns can range in the hundreds of thousands. Other areas, such as visual text retrieval in the Monk search engine for historical manuscripts, are at an intermediate position, with hundreds of millions of rows and thousands of dimensions. Furthermore, genomics applications often require stringent access control, whereas other disciplines do not need to prioritize privacy. Consequently, the various sub-projects within Target took a pragmatic approach to deciding which aspects of the WISE technology and which components of the Target hardware infrastructure were applicable to their specific field.


Projects

Target participates in a number of data-intensive scientific projects in astronomy, big data visualization (in collaboration with the eScience center in Amsterdam), handwritten text recognition algorithms, medical research on healthy aging, development of diagnostic tools for Parkinson's disease, and more.


LOFAR Long-term Archive

Much of the data from the LOFAR telescope is stored in, accessed from, and archived in the LOFAR Long-term Archive, designed by ASTRON and Target. The data will be hosted at the Target data center and several other European centers.


Monk

Monk is a system developed by Schomaker and his group at the Artificial Intelligence Institute (ALICE) at the University of Groningen. It uses pattern-recognition and machine-learning algorithms for handwritten text recognition in a variety of existing archives. Currently a number of books from the Dutch National Archives, as well as more than 70 international historical collections, with texts ranging from Western medieval to handwritten Chinese manuscripts, have been ingested into Monk. The system applies continuous ('24/7') machine learning over the internet, yielding fundamental results. Monk employs the computational and storage resources of Target. It recently became part of a collaboration, led by Prof. Popovic from the Department of Theology and Religious Studies at the University of Groningen, which will use a combination of carbon dating, paleography, and text/image recognition techniques to try to pinpoint the authors of the Dead Sea Scrolls manuscripts.


LifeLines

LifeLines is a long-term medical research project run by the University Medical Center Groningen (UMCG). An array of genotype and phenotype data will be gathered from 165,000 people once every five years for a total period of thirty years. The accumulated data will be used by researchers and medical specialists to gain insights into the processes related to aging, in an attempt to understand why age-related health degradation varies so widely from person to person. Target provides LifeLines with the infrastructure for data storage, access, and processing. Data from LifeLines, as well as the SURFsara and Target infrastructure, were used in the Genome of the Netherlands project, run by a consortium of the UMCG, LUMC, Erasmus MC, UMCU, and the Free University of Amsterdam. Results from the project, which used whole-genome sequencing to deduce the population structure and demographic history of the Dutch population, were published in the journal Nature Genetics.


GLIMPS

Run by K. Leenders, a professor of neurology at the UMCG, GLIMPS is a research project that aims to find faster and more reliable diagnostic tools for Parkinson's disease. GLIMPS explores the possibilities of using complex image-based algorithms and PET scans for early detection of Parkinson's. To test the effectiveness of such algorithms, GLIMPS is building a large database of PET scans from several hospitals in the Netherlands. Target is responsible for building and maintaining the GLIMPS database as well as ensuring the smooth running of the image-based algorithms on its computing facilities.


Others

Additionally, Target is involved in the data management for other astronomical projects, such as the KiDS/VIKING astronomical survey using OmegaCAM, the ESO's MUSE instrument (mounted on the Very Large Telescope), and MICADO (to be mounted on the E-ELT). In addition, the data-centric approach to data management prompted by Target has been adopted by the ESA's Euclid mission. The project's spin-off company Target Holding B.V. also manages a number of commercial projects with private businesses in the north of the Netherlands. Public outreach and education are also part of the project remit, and Target has organised many public events. The Infoversum 3D theatre is a spin-off of the Target project and provides a facility for the visualisation and explanation of scientific data for large groups.

