What is protein homology detection?
Motivation: Remote homology detection between protein sequences is a central problem in computational biology. Local alignment kernels measure the similarity between two sequences by summing the scores of all their gapped local alignments.
Which method is used to compare distantly related proteins?
Profile analysis is a method for detecting distantly related proteins by sequence comparison. Tests with globin and immunoglobulin sequences show that profile analysis can distinguish all members of these families from all other sequences in a database containing 3800 protein sequences.
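To make the idea of scoring a sequence against a family concrete, here is a minimal Python sketch. It is a simplified position-specific scoring scheme, not the published profile analysis procedure: the names `build_profile` and `best_window_score`, the uniform background, the pseudocount, and the toy family are all assumptions made for illustration, and gaps are ignored entirely.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def build_profile(aligned, pseudocount=1.0):
    """Turn equal-length, ungapped aligned sequences into per-column log-odds scores.

    Assumption: a uniform background frequency and a flat pseudocount.
    """
    background = 1 / len(ALPHABET)
    profile = []
    for column in zip(*aligned):
        total = len(column) + pseudocount * len(ALPHABET)
        scores = {}
        for aa in ALPHABET:
            freq = (column.count(aa) + pseudocount) / total
            scores[aa] = math.log2(freq / background)
        profile.append(scores)
    return profile


def best_window_score(profile, seq):
    """Slide the profile along seq (no gaps) and return the best window score.

    Assumes seq is at least as long as the profile.
    """
    width = len(profile)
    return max(
        sum(profile[k][seq[start + k]] for k in range(width))
        for start in range(len(seq) - width + 1)
    )


family = ["WGHE", "WGHE", "WGKE"]  # hypothetical aligned family members
profile = build_profile(family)
member_score = best_window_score(profile, "AAWGHEAA")
other_score = best_window_score(profile, "AAAAAAAA")
```

In this toy setup a sequence containing the family pattern scores well above one that does not, which is the kind of separation the answer above describes for globins and immunoglobulins.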
Is local or global alignment better for finding similar parts of distantly related proteins?
While global alignment algorithms produce more accurate alignments for proteins of similar length, local alignment algorithms are better at identifying similar regions within sequences when the sequences are not related over their entire length.
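The difference can be seen in a small Python sketch that computes both scores with the same dynamic programming table. It assumes a simple match/mismatch scheme and a linear gap penalty rather than a real substitution matrix, and the function name `align_score` and the example sequences are made up for illustration.

```python
def align_score(a, b, match=2, mismatch=-1, gap=-2, local=False):
    """Best alignment score of a and b with a linear gap penalty.

    local=False fills the table Needleman-Wunsch style (global alignment);
    local=True fills it Smith-Waterman style (local alignment).
    """
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment pays for gaps at the start of either sequence.
        for i in range(1, rows):
            table[i][0] = i * gap
        for j in range(1, cols):
            table[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            value = max(
                table[i - 1][j - 1] + pair,  # align the two residues
                table[i - 1][j] + gap,       # gap in b
                table[i][j - 1] + gap,       # gap in a
            )
            if local:
                value = max(value, 0)        # a local alignment may restart anywhere
                best = max(best, value)
            table[i][j] = value
    return best if local else table[-1][-1]


# Two sequences that share only the region "WGHE".
global_score = align_score("CCCCWGHE", "WGHETTTT")
local_score = align_score("CCCCWGHE", "WGHETTTT", local=True)
```

Here the local score rewards the shared region on its own, while the global score is dragged down by the unrelated flanking residues that it is forced to align or gap.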
What is homology detection?
Abstract. Protein remote homology detection is one of the most fundamental problems in the study of protein structure and function. It aims to detect distant evolutionary relationships among proteins using computational methods.
What is remote homology detection?
Motivation: Remote homology detection is the problem of detecting homology in cases of low sequence similarity. The motif content of a pair of sequences is used to define a similarity that is used as a kernel for a Support Vector Machine (SVM) classifier.
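A rough Python sketch of a motif-content similarity follows. It assumes motifs are given as regular expressions and that the kernel is the dot product of motif occurrence counts; the quoted abstract does not spell out these details, and the motif list and sequences below are hypothetical.

```python
import re


def motif_features(seq, motifs):
    """Count occurrences (overlaps included) of each motif pattern in seq."""
    # The lookahead (?=...) lets overlapping matches be counted.
    return [len(re.findall(f"(?={m})", seq)) for m in motifs]


def motif_kernel(seq_a, seq_b, motifs):
    """Similarity of two sequences as the dot product of their motif counts."""
    fa = motif_features(seq_a, motifs)
    fb = motif_features(seq_b, motifs)
    return sum(x * y for x, y in zip(fa, fb))


motifs = ["G.G", "C..C", "HE"]          # hypothetical motif patterns
seqs = ["GAGHE", "GTGCAAC", "AAAA"]     # hypothetical sequences

# Gram matrix of pairwise similarities.
gram = [[motif_kernel(s, t, motifs) for t in seqs] for s in seqs]
```

A Gram matrix like `gram` is the form of input that SVM implementations accepting a precomputed kernel expect, so the classifier never needs to see the sequences themselves.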
What is a BLOSUM matrix used for?
In bioinformatics, a BLOSUM (BLOcks SUbstitution Matrix) is a substitution matrix used for sequence alignment of proteins. BLOSUM matrices are used to score alignments between evolutionarily divergent protein sequences. They are based on local alignments.
How are BLOSUM matrices generated?
BLOSUM stands for BLOcks SUbstitution Matrix (Henikoff & Henikoff, 1992). The matrices were created by observing substitution frequencies in local ungapped multiple sequence alignments. Each score is a log-odds value: it compares how often two amino acids are observed aligned with each other in those alignments with how often that pairing would be expected by chance.
Which alignment is useful for detecting highly similar sequences?
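The log-odds calculation can be sketched in Python on a toy block. This is an illustration under stated assumptions, not the full published procedure: it skips the step of clustering similar sequences before counting, the function name `log_odds_scores` and the three-column block are made up, and scores are rounded in half-bit units.

```python
import math
from collections import Counter
from itertools import combinations


def log_odds_scores(block):
    """Log-odds substitution scores from one ungapped block of aligned sequences."""
    # Count every residue pair observed within each column.
    pair_counts = Counter()
    for column in zip(*block):
        for x, y in combinations(column, 2):
            pair_counts[tuple(sorted((x, y)))] += 1

    total = sum(pair_counts.values())
    observed = {pair: n / total for pair, n in pair_counts.items()}

    # Marginal probability of each residue appearing in a pair.
    marginal = Counter()
    for (x, y), freq in observed.items():
        if x == y:
            marginal[x] += freq
        else:
            marginal[x] += freq / 2
            marginal[y] += freq / 2

    scores = {}
    for (x, y), freq in observed.items():
        # Expected frequency of the pair if residues were paired at random.
        expected = marginal[x] ** 2 if x == y else 2 * marginal[x] * marginal[y]
        scores[(x, y)] = round(2 * math.log2(freq / expected))  # half-bit units
    return scores


block = ["AAS", "AAS", "AAS", "ASS"]  # toy block: 4 sequences, 3 columns
scores = log_odds_scores(block)
```

In this toy block the identical pairs A/A and S/S come out positive and the A/S substitution comes out negative, because A and S are seen aligned less often than their overall frequencies would predict. Pairs never observed in the block get no entry at all, since their log-odds would be undefined.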
Quasi-alignment
Conclusion: Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences.
Which alignment method is best suited to aligning closely related sequences?
Global alignments
Global and local alignments: Global alignments, which attempt to align every residue in every sequence, are most useful when the sequences in the query set are similar and of roughly equal size. (This does not mean global alignments cannot start and/or end in gaps.)
What is remote homolog?
Remote homologs are pairs of proteins that have similar structures and functions but lack easily detectable sequence similarity. Many remote homologs have been discovered by a systematic structural neighbouring procedure [1].