
OrthoFinder is a command-line software tool for comparative genomics. OrthoFinder determines the correspondence between genes in different organisms (also known as orthology analysis). This correspondence provides a framework for understanding the evolution of life on Earth, and enables the extrapolation and transfer of biological knowledge between organisms.

OrthoFinder takes FASTA files of protein sequences as input (one per species) and as output provides:
* Orthogroups
* Rooted phylogenetic trees of all orthogroups
* A rooted species tree for the set of species included in the input dataset
* Hierarchical orthogroups for each node in the species tree
* Orthologs between all species
* Gene duplication events mapped to branches in the species tree
* Comparative genomic statistics

As of August 2021, the tool has been referenced by more than 1500 published studies. ("Citations for Emms & Kelly 2015", Google Scholar, https://scholar.google.co.nz/scholar?cites=10931031451104788868&as_sdt=2005&sciodt=0,5&hl=en, accessed 6 August 2021.)


See also

* Bioinformatics
* Homology (biology)
* Sequence homology
* Protein family
* Sequence clustering


