Bioinformatics, Vol 14, 232-243, Copyright © 1998 by Oxford University Press
J Kleffe, K Hermann, W Vahrson, B Wittig and V Brendel
MOTIVATION: We developed GeneGenerator to meet the need for a tool that
predicts gene structure without requiring advance knowledge of how to score
potential exons and introns for best results. This need is most acute for
less well-studied organisms, for which suitable training sets are small.
GeneGenerator is a flexible algorithm that, for a given genomic sequence,
generates a number of feasible gene structures satisfying user-defined
constraints. The specific implementation described in detail requires
minimum scores for translation start sites and for donor and acceptor
splice sites according to previously trained logitlinear models. In
addition, potential exons and introns are required to exceed specified
minimal lengths and threshold scores for coding or non-coding potential,
derived as log-likelihood ratios of appropriate Markov sequence models.
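The length and score constraints described above can be illustrated with a minimal sketch. It assumes homogeneous first-order Markov models with pseudocount smoothing, which may differ from the models used in the paper (for example, frame-dependent coding models). The names `train_markov`, `log_likelihood_ratio`, and `passes_filter`, as well as the toy training sequences, are hypothetical and not taken from GeneGenerator.

```python
import math

ALPHABET = "ACGT"

def train_markov(sequences, order=1, pseudocount=1.0):
    """Return a function giving P(base | preceding `order` bases).

    Counts are smoothed with a pseudocount so that unseen transitions keep
    a nonzero probability (an assumption of this sketch).
    """
    counts = {}
    for seq in sequences:
        for i in range(order, len(seq)):
            row = counts.setdefault(seq[i - order:i], {})
            row[seq[i]] = row.get(seq[i], 0.0) + 1.0

    def prob(context, base):
        row = counts.get(context, {})
        total = sum(row.values()) + pseudocount * len(ALPHABET)
        return (row.get(base, 0.0) + pseudocount) / total

    return prob

def log_likelihood_ratio(seq, coding, noncoding, order=1):
    """Sum of log(P_coding / P_noncoding) over all transitions in `seq`."""
    return sum(
        math.log(coding(seq[i - order:i], seq[i])
                 / noncoding(seq[i - order:i], seq[i]))
        for i in range(order, len(seq))
    )

def passes_filter(seq, coding, noncoding, min_len, min_score, order=1):
    """Keep a candidate exon only if it meets both the minimal length and
    the threshold score for coding potential."""
    if len(seq) < min_len:
        return False
    return log_likelihood_ratio(seq, coding, noncoding, order) >= min_score

# Toy training data, chosen only to make the two models differ.
coding_model = train_markov(["GCGGCGGCGGCG"])
noncoding_model = train_markov(["ATATATATATAT"])
```

A candidate intron could be filtered the same way by swapping the roles of the two models, so that a positive score indicates non-coding potential.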
RESULTS: A database of 46 non-redundant genomic sequences from maize is
used for illustration. It is shown that the correct gene structures do not
always maximize the considered target function. However, in most cases, the
correct or nearly correct structures are found in a small set of
high-scoring structures. A critical review of the generated structures
sometimes allows the choices to be narrowed by considering additional
variables such as predicted splice site strength or local optimality of
splice site scores. Summary statistics for prediction accuracy over all 46
maize genes are derived under cross-validation and non-cross-validation
training conditions for the Markov sequence models. The algorithm achieved
exon sensitivity of 0.81 and specificity of 0.75 on an independent set of
14 novel maize genomic segments.
AVAILABILITY: GeneGenerator runs under Borland Pascal 7.0 on MS-DOS and as
a C program on UNIX workstations. The source code is available upon request.
CONTACT: jkleffe@euler.grumed.fu-berlin.de
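For readers unfamiliar with exon-level accuracy measures such as those quoted in RESULTS, the following sketch shows one common way to compute them. It assumes the usual gene-prediction convention, in which an exon counts as correct only if both boundaries match exactly, sensitivity is the fraction of true exons that were predicted, and specificity is the fraction of predicted exons that are true. Whether the paper uses exactly these definitions is an assumption; the function name `exon_accuracy` and the coordinates are hypothetical.

```python
def exon_accuracy(true_exons, predicted_exons):
    """Return (sensitivity, specificity) at the exon level.

    Exons are (start, end) tuples; a predicted exon is correct only if it
    matches a true exon at both ends.
    """
    true_set = set(true_exons)
    pred_set = set(predicted_exons)
    correct = len(true_set & pred_set)
    sensitivity = correct / len(true_set) if true_set else 0.0
    specificity = correct / len(pred_set) if pred_set else 0.0
    return sensitivity, specificity

# Illustrative coordinates only, not data from the paper.
true_exons = [(100, 200), (300, 400), (500, 650)]
predicted = [(100, 200), (300, 410), (500, 650), (700, 760)]
sn, sp = exon_accuracy(true_exons, predicted)
```

In this toy case two of the three true exons are recovered exactly, and two of the four predictions are correct.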
GeneGenerator--a flexible algorithm for gene prediction and its application to maize sequences
Freie Universität Berlin, Abteilung Molekularbiologie und Bioinformatik, Institut für Molekularbiologie und Biochemie, Arnimallee 22, 14195 Berlin, Germany. jkleffe@euler.grumed.fu-berlin.de

