Bioinformatics Vol. 16 no. 3 2000
Pages 190-202
© 2000 Oxford University Press

Homology-based gene structure prediction: simplified matching algorithm using a translated codon (tron) and improved accuracy by allowing for long gaps

Osamu Gotoh 1

1 Saitama Cancer Center Research Institute, 818 Komuro Ina-machi, Saitama 362-0806, Japan

Received on August 23, 1999; accepted on October 21, 1999

Motivation: Locating protein-coding exons (CDSs) on a eukaryotic genomic DNA sequence is an essential first step in predicting the functions of the genes embedded in that part of the genome. Accurate prediction of CDSs may be achieved by directly matching the DNA sequence with a known protein sequence or with a profile of homologous family members.

Results: A new convention for encoding a DNA sequence as a series of 23 possible letters (the translated codon, or tron, code) was devised to improve this type of analysis. Using this convention, a dynamic programming algorithm was developed that aligns a DNA sequence with a protein sequence or profile so that the spliced and translated sequence optimally matches the reference, in the same way as a standard protein sequence alignment that allows for long gaps. The objective function also takes into account frameshift errors, coding potential, and the signals for translational initiation, termination and splicing. The method was tested on Caenorhabditis elegans genes of known structure. The prediction accuracy, measured as a correlation coefficient (CC) at the nucleotide level, was about 95% for the 288 genes tested, and 97.0% for the 170 genes whose product and closest homologue share more than 30% identical amino acids. We also propose a strategy to improve the accuracy of prediction for a set of paralogous genes by means of iterative gene prediction and reconstruction of the reference profile derived from the predicted sequences.
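The general idea of recoding a DNA sequence into codon-derived letters can be illustrated with a minimal sketch. It is not the tron code itself: the abstract does not define the 23-letter alphabet, so the sketch below assumes a simpler 22-symbol alphabet (20 amino acids, `*` for a stop codon, `X` for a codon containing an ambiguous base) under the standard genetic code. The names `CODON_TABLE` and `encode_per_position` are hypothetical and do not come from the ‘aln’ program.

```python
# Illustrative sketch only: assign to every nucleotide position the letter
# obtained by translating the codon that starts there (standard genetic code).
# This is an assumed 22-symbol alphabet, not the paper's 23-letter tron code.

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)


def encode_per_position(dna: str) -> str:
    """Return one letter per codon start position (len(dna) - 2 letters).

    Codons containing anything other than A, C, G or T map to 'X'.
    """
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(len(dna) - 2)
    )


if __name__ == "__main__":
    code = encode_per_position("ATGGCCTAA")
    print(code)        # letters for all three reading frames, interleaved
    print(code[0::3])  # letters of reading frame 0 only
```

Because the encoded string carries all three forward reading frames interleaved, taking every third letter from offset 0, 1 or 2 recovers the translation in that frame. An aligner working on such a string can therefore compare protein letters directly at any nucleotide offset, which is one plausible reason a codon-level encoding simplifies DNA-to-protein matching.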

Availability: The source code for the program ‘aln’, written in ANSI C, and the test data will be available via anonymous FTP at ftp.genome.ad.jp/pub/genomenet/saitama-cc.

Contact: gotoh{at}cancer-c.pref.saitama.jp