Bioinformatics Advance Access published online on January 29, 2004
Bioinformatics, doi:10.1093/bioinformatics/btg486
Bioinformatics © Oxford University Press 2004; all rights reserved
1 Department of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY 14260
* To whom correspondence should be addressed. E-mail: chuntang{at}cse.buffalo.edu.
Motivation: DNA arrays permit rapid, large-scale screening for patterns of gene expression and simultaneously yield the expression levels of thousands of genes for each sample. Since the number of samples is usually limited, such data sets are very sparse in the high-dimensional gene space. Furthermore, most of the genes collected may not be of interest, and uncertainty about which genes are relevant makes it difficult to construct an informative gene space. Unsupervised discovery of empirical sample patterns and identification of informative genes in such sparse, high-dimensional data sets present interesting but challenging problems.
Results: A new model called empirical sample pattern detection (ESPD) is proposed to delineate pattern quality with informative genes. By integrating statistical metrics, data mining, and machine learning techniques, this model dynamically measures and manipulates the relationship between samples and genes while iteratively detecting the informative space and the empirical pattern. The performance of the proposed method is illustrated on various array data sets.
Availability: Software code is available on request from the first author. All programs were written in MATLAB.
Revised August 9, 2003
Accepted October 16, 2003
Article
ESPD: a pattern detection model underlying gene expression profiles
2 Department of Pharmaceutical Sciences, State University of New York at Buffalo, Buffalo, NY 14260