HOME

TheInfoList



OR:

The Nussinov algorithm is a
nucleic acid structure prediction Nucleic acid structure prediction is a computational method to determine ''secondary'' and ''tertiary'' nucleic acid structure from its sequence. Secondary structure can be predicted from one or several nucleic acid sequences. Tertiary structure ...
algorithm used in
computational biology Computational biology refers to the use of data analysis, mathematical modeling and computational simulations to understand biological systems and relationships. An intersection of computer science, biology, and big data, the field also has fo ...
to predict the folding of an
RNA Ribonucleic acid (RNA) is a polymeric molecule essential in various biological roles in coding, decoding, regulation and expression of genes. RNA and deoxyribonucleic acid ( DNA) are nucleic acids. Along with lipids, proteins, and carbohydra ...
molecule that makes use of dynamic programming principles. The algorithm was developed by
Ruth Nussinov Ruth Nussinov (Hebrew: פרופסור רות נוסינוב) is an Israeli-American biologist who works as a Professor in the Department of Human Genetics, School of Medicine at Tel Aviv University and is the Senior Principal Scientist and Princi ...
in the late 1970s.


Background

RNA origami RNA origami is the nanoscale folding of RNA, enabling the RNA to create particular shapes to organize these molecules. It is a new method that was developed by researchers from Aarhus University and California Institute of Technology. RNA origami ...
occurs when an RNA molecule "folds" and binds to itself. This folding often determines the function of the RNA molecule. RNA folds at different levels, this algorithm predicts the secondary structure of the RNA.


Algorithm


Scoring

We score a solution by counting the total number of paired bases. Thus, attempting to maximize the score that maximizes the total number of bonds between bases.


Motivation

Consider an RNA sequence S whose elements are taken from the set \. Let us imagine we have an optimal solution to the subproblem of folding S_i to S_, and an optimal solution for folding S_u to S_v i\leq u\leq v\leq j-1 . Now, to align S_i to S_, we have two options: # Leave S_ unpaired, and keep the structure of S_i to S_. The score for this alignment will be equal to the score of the alignment of S_i to S_, as no new base pairs were created. # Pair S_ with S_, where i\leq k. The score for this alignment will be the score of the base pairing, plus the score of the best alignment of S_i to S_ and S_ to S_.


Algorithm

Consider an RNA sequence S of length n such that S_i\in \. Construct an n\times n matrix M. Initialize M such that M(i, i)=0 M(i, i-1) = 0 for 1\leq i\leq n. M(i,j) will contain the maximum score for the subsequence S_i...S_j. Now, fill in entries of M up and to the right, so that M(i,j) = \max_\beginM(i, k-1)+M(k+1, j-1)+\text(S_k,S_j) \\ M(i, j-1)\end where \text(S_k,S_j)=\begin1,&S_k\textS_j \text\\ 0,&\text\end After this step, we have a matrix M where M(i,j) represents the optimal score of the folding of S_i...S_j. To determine the structure of the folded RNA by traceback, we first create an empty list of pairs P. We initialize with i=1,j=n. Then, we follow one of three scenarios. # If j\leq i, the procedure stops. # If M(i,j)=M(i,j-1), then set i=i,j=j-1 and continue. # Otherwise, for all k: i\leq k, ifS_k and S_j are complementary and M(i,j)=M(i,k-1)+M(k+1,j-1)+1, append (k,j) to P, then traceback both with i=i,j=k-1 and i=k+1,j=j-1. When the traceback finishes, P contains all of the paired bases.


Limitations

The Nussinov algorithm does not account for the three-dimensional shape of RNA, nor predict RNA pseudoknots.{{Cite web, title=RNA Structure and RNA Structure Prediction, url=https://math.mit.edu/classes/18.417/Slides/rna-prediction-nussinov.pdf Furthermore, in its basic form, it does not account for a minimum stem loop size. However, it is still useful as a fast algorithm for basic prediction of secondary structure.


References

Bioinformatics algorithms