The Chemistry Development Kit (CDK) is computer software, a library in the programming language Java, for chemoinformatics and bioinformatics. It is available for Windows, Linux, Unix, and macOS. It is free and open-source software distributed under the GNU Lesser General Public License (LGPL) 2.0.
History
The CDK was created by Christoph Steinbeck, Egon Willighagen and Dan Gezelter, then developers of Jmol and JChemPaint, to provide a common code base, on 27–29 September 2000 at the University of Notre Dame. The first source code release was made on 11 May 2001. Since then, more than 100 people have contributed to the project, leading to the rich set of functions listed under Major features. Between 2004 and 2007, ''CDK News'' was the project's newsletter; all of its articles are available from a public archive. Because contributions arrived at an unsteady rate, the newsletter was put on hold.
Later, unit testing, code quality checking, and Javadoc validation were introduced. Rajarshi Guha developed a nightly build system, named Nightly, which is still operating at Uppsala University. In 2012, the project became a supporter of the InChI Trust, to encourage continued development. The library uses JNI-InChI to generate International Chemical Identifiers (InChIs).
In April 2013, John Mayfield (né May) joined the ranks of release managers of the CDK, to handle the development branch.
Library
The CDK is a library rather than an end-user program. However, it has been integrated into various environments to make its functions available. CDK is currently used in several applications, including the programming language R, CDK-Taverna (a Taverna workbench plugin), Bioclipse, PaDEL, and Cinfony. CDK extensions also exist for the Konstanz Information Miner (KNIME) and for Excel, the latter called LICSS.
In 2008, pieces of GPL-licensed code were removed from the library. Although that code was independent of the main CDK library and no copylefting was involved, the ChemoJava project was started to reduce confusion among users.
Major features
Chemoinformatics
* 2D molecule editor and generator
* 3D geometry generation
* ring finding
* substructure search using exact structures and a SMILES arbitrary target specification (SMARTS)-like query language
* QSAR descriptor calculation
* fingerprint calculation, including the ECFP and FCFP fingerprints
* force field calculations
* many input-output chemical file formats, including simplified molecular-input line-entry system (SMILES), Chemical Markup Language (CML), and chemical table file (MDL)
* structure generators
* International Chemical Identifier support, via JNI-InChI
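As an illustration of the ring finding item in this list, the number of independent rings in a molecular graph follows from a standard graph-theory identity: bonds minus atoms plus connected components. The sketch below is not CDK code and does not use the CDK API. The class `RingCountSketch` and its method `ringCount` are hypothetical names chosen for this example, and the molecule is assumed to be given as a plain list of bonds between zero-based atom indices, with no duplicate bonds.

```java
/** Hypothetical helper for illustration only; not part of the CDK API. */
final class RingCountSketch {

    /** Follows parent links to the representative atom of a connected component. */
    private static int find(int[] parent, int i) {
        while (parent[i] != i) {
            parent[i] = parent[parent[i]]; // path halving keeps the trees shallow
            i = parent[i];
        }
        return i;
    }

    /**
     * Returns the number of independent rings: bonds - atoms + components.
     * Each bond is a pair of zero-based atom indices (assumed free of duplicates).
     */
    static int ringCount(int atomCount, int[][] bonds) {
        int[] parent = new int[atomCount];
        for (int i = 0; i < atomCount; i++) {
            parent[i] = i; // every atom starts in its own component
        }
        int components = atomCount;
        for (int[] bond : bonds) {
            int a = find(parent, bond[0]);
            int b = find(parent, bond[1]);
            if (a != b) {
                parent[a] = b; // the bond joins two components
                components--;
            }
        }
        return bonds.length - atomCount + components;
    }

    public static void main(String[] args) {
        // Heavy-atom skeletons only; hydrogens are left implicit.
        int[][] propane = {{0, 1}, {1, 2}};
        int[][] cyclohexane = {{0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 5}, {5, 0}};
        System.out.println("propane rings: " + ringCount(3, propane));
        System.out.println("cyclohexane rings: " + ringCount(6, cyclohexane));
    }
}
```

This count only says how many rings a minimal ring set contains. Enumerating the rings themselves, as a cheminformatics library does, requires a separate cycle-perception algorithm.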
Bioinformatics
* protein active site detection
* cognate ligand detection
* metabolite identification
* pathway databases
* 2D and 3D protein descriptors
General
* Python wrapper; see Cinfony
* Ruby wrapper
* active user community
See also
* Bioclipse – an Eclipse–RCP based chemo-bioinformatics workbench
* Blue Obelisk
* JChemPaint – Java 2D molecule editor, applet and application
* Jmol – Java 3D renderer, applet and application
* JOELib – Java version of Open Babel, OELib
* List of free and open-source software packages
* List of software for molecular mechanics modeling
External links
* CDK Wiki – the community wiki
* Planet CDK – a blog planet
* OpenScience.org