
The hierarchical editing language for macromolecules (HELM) is a method of describing complex biological molecules. It is a machine-readable notation that represents the composition and structure of peptides, proteins, oligonucleotides, and related small-molecule linkers. HELM was developed by a consortium of pharmaceutical companies known as the Pistoia Alliance. Development began in 2008, and in 2012 the notation was published openly and free of charge. The HELM open source project can be found on GitHub.


HELM

The need for HELM became obvious as researchers began working on modeling and computational projects involving engineered biomolecules of this type. No existing language could describe these entities accurately, capturing both their composition and the complex branching and structure common to them. Protein sequences can describe larger proteins, and chemical formats such as mol files can describe simple peptides. But large, complex research biomolecules are difficult to describe with chemical formats, and peptide formats are not flexible enough to describe non-natural amino acids and other chemistries.

In HELM, molecules are represented at four levels in a hierarchy:

* Complex polymer
* Simple polymer
* Monomer
* Atom

Monomers are assigned short unique identifiers in internal HELM databases and can be represented by those identifiers in strings. The approach is similar to that used in the Simplified molecular-input line-entry system (SMILES). An exchangeable file format allows data to be shared between companies that have assigned different identifiers to monomers.

In 2014 ChEMBL announced plans to adopt HELM. The informatics company BIOVIA developed a modified Molfile format called the Self-Contained Sequence Representation (SCSR). A goal of HELM is to provide a standard that can incorporate such individual attempts to solve the problem, be used universally, and avoid a proliferation of standards.
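
The four-level hierarchy can be illustrated with a minimal Python sketch. It is not the HELM toolkit and does not implement the full notation: it assumes a simplified string form in which each simple polymer is written as a name followed by dot-separated monomer identifiers in braces, with polymers separated by a vertical bar. The monomer library, class names, and function name are hypothetical, chosen only for illustration, and the SMILES strings omit stereochemistry.

```python
import re
from dataclasses import dataclass

# Hypothetical monomer library: short identifier -> SMILES string.
# The SMILES string stands in for the atom level of the hierarchy.
# Real HELM monomer databases store considerably more information.
MONOMER_LIBRARY = {
    "G": "NCC(=O)O",      # glycine
    "A": "CC(N)C(=O)O",   # alanine (stereochemistry omitted)
    "C": "NC(CS)C(=O)O",  # cysteine (stereochemistry omitted)
}


@dataclass
class Monomer:
    identifier: str  # monomer level
    smiles: str      # atom level


@dataclass
class SimplePolymer:
    name: str
    monomers: list


@dataclass
class ComplexPolymer:
    polymers: list


# Matches a polymer name such as PEPTIDE1 followed by {...}.
SIMPLE_POLYMER = re.compile(r"([A-Z]+\d+)\{([^{}]*)\}")


def parse_simple_polymers(text):
    """Build the hierarchy from a simplified, HELM-like string (assumed form)."""
    polymers = []
    for name, body in SIMPLE_POLYMER.findall(text):
        # Each identifier is looked up in the library to reach the atom level.
        monomers = [Monomer(i, MONOMER_LIBRARY[i]) for i in body.split(".")]
        polymers.append(SimplePolymer(name, monomers))
    return ComplexPolymer(polymers)


molecule = parse_simple_polymers("PEPTIDE1{A.G.C}|PEPTIDE2{G.G}")
for polymer in molecule.polymers:
    print(polymer.name, [m.identifier for m in polymer.monomers])
```

Because the string carries only identifiers, two organizations that use different identifiers for the same monomer would need to exchange their library entries along with the string, which is the role the exchangeable file format plays.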


Tools

An editor tool is needed to visualize and work with biomolecules at the correct level of detail. It must let the user "zoom out" to see a large molecule at the amino-acid sequence level, then "zoom in" to the atomic level at a particular site of conjugation or derivatization. The HELM Editor and HAbE (HELM Antibody Editor) are two client tools, which may in the future be released as web-based applications.


Pistoia Alliance

At a conference in Pistoia, Italy, a group of researchers from Pfizer, AstraZeneca, GlaxoSmithKline, and Novartis formed what came to be known as the Pistoia Alliance. All parties were interested in solving problems of data aggregation, data sharing, and analytics for pharmaceutical research. The alliance was incorporated in 2008. It is now composed of informatics experts and researchers from industry, academia, and life science service organizations.


See also

* Simplified molecular-input line-entry system (SMILES)
* International Chemical Identifier (InChI)
* Molecular Query Language
* Molecule editor
* Chemical table file
* Protein line notation (PLN)
* Molecular graphics
* FASTA

