Protein structure prediction is the prediction of the three-dimensional structure of a protein from its amino acid sequence, that is, the prediction of its secondary, tertiary, and quaternary structure from its primary structure. It is one of the most important goals pursued by bioinformatics and theoretical chemistry, and it is highly important in medicine and technology.
Secondary structure prediction is a set of techniques in bioinformatics that aim to predict the local secondary structures of protein and RNA sequences based only on knowledge of their primary structure, that is, the amino acid or nucleotide sequence. For proteins, a prediction consists of assigning regions of the amino acid sequence as likely alpha helices, beta strands, or turns. The success of a prediction is determined by comparing it to the results of the DSSP algorithm applied to the crystal structure of the protein. For nucleic acids, it may be determined from the hydrogen bonding pattern.
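The comparison against DSSP output described above is often summarized as a per-residue three-state accuracy, commonly called Q3. The following is a minimal sketch of that calculation, not a reference implementation. The mapping from DSSP's eight codes to three states is one common convention (others exist), and the DSSP string and prediction are hypothetical examples.

```python
# One common reduction of DSSP's 8 codes to 3 states (conventions vary):
# H, G, I -> helix (H); E, B -> strand (E); everything else -> coil (C).
DSSP_TO_3 = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}


def reduce_dssp(dssp: str) -> str:
    """Map an 8-state DSSP string to a 3-state H/E/C string."""
    return "".join(DSSP_TO_3.get(code, "C") for code in dssp)


def q3(predicted: str, reference: str) -> float:
    """Fraction of residues whose predicted state matches the reference."""
    if len(predicted) != len(reference):
        raise ValueError("prediction and reference must have the same length")
    if not predicted:
        raise ValueError("sequences must not be empty")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(predicted)


# Hypothetical 10-residue fragment: DSSP assignment and a 3-state prediction.
reference = reduce_dssp("HHHGT-EEBS")
predicted = "HHHHCCEECC"
score = q3(predicted, reference)
print(f"Q3 = {score:.2f}")
```

A single mismatch in ten residues gives a score of 0.9 here. Per-residue accuracy of this kind says nothing about whether whole helices or strands were recovered, so it is best read alongside segment-level measures.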
Early methods of secondary structure prediction, introduced in the 1960s and 1970s, focused on identifying likely alpha helices and were based mainly on helix-coil transition models. The evolutionary conservation of secondary structures can be exploited by simultaneously assessing many homologous sequences in a multiple sequence alignment. Limitations are also imposed by the inability of secondary structure prediction to account for tertiary structure.
The Chou-Fasman method was among the first secondary structure prediction algorithms developed. It relies predominantly on probability parameters determined from the relative frequencies of each amino acid's appearance in each type of secondary structure. Its results are poor by later standards: the method is roughly 50-60% accurate in predicting secondary structure.
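The probability parameters mentioned above can be illustrated as propensities: how much more often a residue appears in a given structure type than would be expected by chance. The sketch below assumes the definition P(state | residue) / P(state) and uses two tiny hypothetical training sequences. It shows only the parameter estimation step, not the method's rules for locating and extending helices and strands.

```python
from collections import Counter


def propensities(sequences, structures, states="HEC"):
    """Return {(residue, state): propensity} from labeled training data.

    Propensity is P(state | residue) / P(state); values above 1 mean the
    residue favors that state, values below 1 mean it disfavors it.
    """
    pair_counts = Counter()
    residue_counts = Counter()
    state_counts = Counter()
    for seq, ss in zip(sequences, structures):
        if len(seq) != len(ss):
            raise ValueError("sequence and structure lengths differ")
        for residue, state in zip(seq, ss):
            pair_counts[(residue, state)] += 1
            residue_counts[residue] += 1
            state_counts[state] += 1

    total = sum(state_counts.values())
    table = {}
    for residue, n_residue in residue_counts.items():
        for state in states:
            if state_counts[state] == 0:
                continue  # state never observed; propensity undefined
            p_state_given_residue = pair_counts[(residue, state)] / n_residue
            p_state = state_counts[state] / total
            table[(residue, state)] = p_state_given_residue / p_state
    return table


# Hypothetical training data (H = helix, E = strand, C = coil/turn).
sequences = ["AEAAGP", "GPVVAV"]
structures = ["HHHHCC", "CCEEHE"]
table = propensities(sequences, structures)
print(round(table[("A", "H")], 2))
```

With real data the table would be estimated from a large set of solved structures. On a sample this small the numbers only demonstrate the arithmetic.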
The GOR method, named for the three scientists who developed it (Garnier, Osguthorpe, and Robson), is an information theory-based method developed not long after Chou-Fasman. It uses the more powerful probabilistic technique of Bayesian inference. The GOR method takes into account not only the probability of each amino acid having a particular secondary structure, but also the conditional probability of the amino acid assuming each structure given the contributions of its neighbors. The approach is both more sensitive and more accurate than the Chou-Fasman method because amino acid structural propensities are strong for only a small number of amino acids, such as proline and glycine, whereas weak contributions from many neighbors can add up to a strong effect overall. This method is roughly 65% accurate.
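The idea of combining evidence from neighboring residues can be sketched as a sum of log-odds terms over a sliding window. The code below is a simplified illustration in the spirit of GOR, not the published algorithm or its parameters. It assumes a 17-residue window (8 positions on either side), treats window positions as independent, and smooths counts with a hypothetical pseudocount. The training sequence is made up.

```python
import math
from collections import Counter

HALF_WINDOW = 8  # 8 neighbors on each side: a 17-residue window (assumed)


def train(sequences, structures, pseudocount=1.0):
    """Estimate information terms log(P(state | residue at offset) / P(state))."""
    joint = Counter()    # (state, offset, residue) -> count
    context = Counter()  # (offset, residue) -> count
    state_counts = Counter()
    for seq, ss in zip(sequences, structures):
        for j, state in enumerate(ss):
            state_counts[state] += 1
            for m in range(-HALF_WINDOW, HALF_WINDOW + 1):
                k = j + m
                if 0 <= k < len(seq):
                    joint[(state, m, seq[k])] += 1
                    context[(m, seq[k])] += 1
    total = sum(state_counts.values())
    priors = {s: n / total for s, n in state_counts.items()}

    def info(state, offset, residue):
        prior = priors[state]
        # Smoothing toward the prior: unseen contexts contribute zero.
        posterior = (joint[(state, offset, residue)] + pseudocount * prior) / (
            context[(offset, residue)] + pseudocount
        )
        return math.log(posterior / prior)

    return sorted(priors), info


def predict(seq, states, info):
    """Assign each residue the state with the largest summed information."""
    def score(state, j):
        return sum(
            info(state, m, seq[j + m])
            for m in range(-HALF_WINDOW, HALF_WINDOW + 1)
            if 0 <= j + m < len(seq)
        )
    return "".join(
        max(states, key=lambda s: score(s, j)) for j in range(len(seq))
    )


# Hypothetical training example (H = helix, E = strand, C = coil).
sequences = ["AEELLKKAGPGSVTVTVYGP"]
structures = ["HHHHHHHHCCCCEEEEEECC"]
states, info = train(sequences, structures)
predicted = predict(sequences[0], states, info)
print(predicted)
```

Because each neighbor adds its own term, many weak signals can outweigh a single residue's intrinsic preference, which is the behavior the paragraph above attributes to the method.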
Neural network methods use training sets of solved structures to identify common sequence motifs associated with particular arrangements of secondary structures. These methods are over 70% accurate in their predictions. Beta strands are still often underpredicted, however, because the methods lack the three-dimensional structural information that would allow assessment of the hydrogen bonding patterns that can promote formation of the extended conformation required for a complete beta sheet.
Secondary structure formation is reported to depend on factors other than the protein sequence alone. For example, secondary structure tendencies are reported to depend also on the local environment, the solvent accessibility of residues, the protein structural class, and even the organism from which the protein is obtained. Based on such observations, some studies have shown that secondary structure prediction can be improved by adding information about protein structural class, solvent accessibility, and the contact number of residues.
Sequence covariation methods rely on the existence of a data set composed of multiple homologous RNA sequences that are related but dissimilar. These methods analyze the covariation of individual base sites in evolution: maintenance of a pair of base-pairing nucleotides at two widely separated sites indicates the presence of a structurally required hydrogen bond between those positions.
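One common way to quantify covariation between two alignment columns is their mutual information. The sketch below assumes that measure, which is only one of several used in practice, and applies it to a hypothetical four-sequence alignment. It treats every character, including gaps, as an ordinary symbol and does not check whether the covarying bases are actually complementary.

```python
import math
from collections import Counter


def mutual_information(alignment, i, j):
    """Mutual information, in bits, between alignment columns i and j."""
    n = len(alignment)
    counts_i = Counter(seq[i] for seq in alignment)
    counts_j = Counter(seq[j] for seq in alignment)
    counts_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), n_ab in counts_ij.items():
        p_ab = n_ab / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical alignment: columns 0 and 5 change together as
# complementary pairs (G-C, C-G, A-U, U-A); the middle is conserved.
alignment = [
    "GAAAAC",
    "CAAAAG",
    "AAAAAU",
    "UAAAAA",
]
paired = mutual_information(alignment, 0, 5)
unpaired = mutual_information(alignment, 0, 2)
print(paired, unpaired)
```

Columns 0 and 5 score 2 bits, the maximum for a four-letter alphabet, while a variable column paired with a conserved one scores 0. High-scoring column pairs are candidates for base-paired positions and would normally be checked for complementarity afterward.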