Bioinformatics Vol. 17 no. 90001 2001
Pages S56-S64
© 2001 Oxford University Press
Gene recognition based on DAG shortest paths
Department of Computer Science, University of Illinois at Urbana-Champaign, Digital Computing Laboratory, Urbana, Illinois, 61801, USA
Received on February 5, 2001; revised on April 2, 2001; accepted on April 2, 2001
We describe DAGGER, an ab initio gene recognition program that combines the output of high-dimensional signal sensors in an intuitive gene model based on directed acyclic graphs. In the first stage, candidate start, donor, acceptor, and stop sites are scored using the SNoW learning architecture. These sites are then used to generate a directed acyclic graph in which each source-sink path represents a possible gene structure. Training sequences are used to optimize an edge weighting function so that the shortest source-sink path maximizes exon-level prediction accuracy. Experimental evaluation of prediction accuracy on two benchmark data sets demonstrates that DAGGER is competitive with ab initio gene finding programs based on Hidden Markov Models.
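To make the graph formulation concrete, the following is a minimal sketch of a shortest source-sink path computation on a directed acyclic graph, written in Python. It is not DAGGER's implementation: the node labels (such as `start@10`), the edge weights, and the function name `dag_shortest_path` are all hypothetical, and the weights are hand-picked toy values rather than the output of a trained edge weighting function. It assumes only that candidate sites, ordered by sequence position, give a topological order of the graph.

```python
import math


def dag_shortest_path(order, edges, source, sink):
    """Return (cost, path) for the shortest source-sink path in a DAG.

    order: nodes listed in topological order (here: by sequence position).
    edges: dict mapping each node to a list of (successor, weight) pairs.
    """
    dist = {node: math.inf for node in order}
    prev = {}
    dist[source] = 0.0
    for u in order:  # relax outgoing edges in topological order
        if dist[u] == math.inf:
            continue  # u is not reachable from the source
        for v, w in edges.get(u, []):
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                prev[v] = u
    if dist[sink] == math.inf:
        return math.inf, []  # no source-sink path exists
    path = [sink]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[sink], path[::-1]


# Toy graph (hypothetical sites and weights): one start site, two
# candidate donor sites, one acceptor site, and one stop site.
order = ["source", "start@10", "donor@50", "donor@80",
         "acceptor@120", "stop@200", "sink"]
edges = {
    "source": [("start@10", 0.0)],
    "start@10": [("donor@50", 2.0), ("donor@80", 1.0), ("stop@200", 5.0)],
    "donor@50": [("acceptor@120", 1.0)],
    "donor@80": [("acceptor@120", 1.5)],
    "acceptor@120": [("stop@200", 1.0)],
    "stop@200": [("sink", 0.0)],
}

cost, path = dag_shortest_path(order, edges, "source", "sink")
print(cost, path)
```

In this toy graph each source-sink path corresponds to one candidate gene structure: the edge from `start@10` directly to `stop@200` stands for a single-exon gene, while the paths through a donor and the acceptor stand for two-exon genes with different exon boundaries. Because nodes are processed in topological order, each edge is relaxed exactly once, so the run time is linear in the number of nodes and edges.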
Contact: jsc{at}ocf.berkeley.edu, danr{at}cs.uiuc.edu