Bioinformatics Vol. 19 Suppl. 1 2003
Pages i26-i33
© 2003 Oxford University Press
Remote homology detection: a motif based approach
Department of Biochemistry, B400 Beckman Center, Stanford University, CA 94305-5307, USA
Received on January 6, 2003; accepted on February 20, 2003
Motivation: Remote homology detection is the problem of detecting homology in cases of low sequence similarity. It is a hard computational problem with no approach that works well in all cases.
Results: We present a method for detecting remote homology that is based on the presence of discrete sequence motifs. The motif content of a pair of sequences defines a similarity measure, which serves as the kernel for a Support Vector Machine (SVM) classifier. We test the method on two remote homology detection tasks: prediction of a previously unseen SCOP family, and prediction of an enzyme class given other enzymes that perform a similar function on other substrates. We find that it performs significantly better than an SVM method that uses BLAST or Smith-Waterman similarity scores as features.
Availability: The software is available from the authors upon request.
Contact: asa.benhur{at}stanford.edu
Keywords: remote homology, discrete sequence motifs, sequence similarity, Support Vector Machines, kernel methods
* To whom correspondence should be addressed.
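The idea summarized under Results, a similarity between two sequences computed from their shared motif content and used as an SVM kernel, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the motifs are hypothetical regular expressions, each sequence is represented by its motif occurrence counts, and the kernel is the plain dot product of those count vectors.

```python
import re

# Hypothetical motifs written as regular expressions ('.' matches any residue).
# The actual motif set used in the paper is not given in this abstract.
MOTIFS = [r"C..C", r"G[KR]S", r"H.H"]


def motif_counts(seq, motifs=MOTIFS):
    """Return how often each motif occurs in seq, counting overlapping hits."""
    # A zero-width lookahead lets re.findall report overlapping matches.
    return [len(re.findall(f"(?={m})", seq)) for m in motifs]


def motif_kernel(x, y, motifs=MOTIFS):
    """Similarity of two sequences: dot product of their motif count vectors."""
    fx = motif_counts(x, motifs)
    fy = motif_counts(y, motifs)
    return sum(a * b for a, b in zip(fx, fy))


def gram_matrix(seqs, motifs=MOTIFS):
    """Pairwise kernel values for a list of sequences."""
    return [[motif_kernel(a, b, motifs) for b in seqs] for a in seqs]


if __name__ == "__main__":
    sequences = ["ACAACGKS", "GRSGKS"]
    for row in gram_matrix(sequences):
        print(row)
```

Because this kernel is an inner product of explicit feature vectors, the resulting matrix is symmetric and positive semidefinite, so it could be handed to any SVM implementation that accepts a precomputed kernel matrix (for example, scikit-learn's `SVC(kernel="precomputed")`).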