
Bioinformatics Vol. 16 no. 7 2000
Pages 628-638
© 2000 Oxford University Press


Original Paper

Object-oriented parsing of biological databases with Python

Chenna Ramu 1,*, Christine Gemünd 1 and Toby J. Gibson 1

1 European Molecular Biology Laboratory, Meyerhofstrasse 1, Postfach 10.2209, Heidelberg, Germany

Received on December 21, 1999; revised on February 23, 2000; accepted on March 8, 2000

Motivation: Although database activity in biology is increasing rapidly, rather little has been done to parse these databases in a simple, object-oriented way.

Results: We present a simple yet powerful way of parsing biological flat-file databases, taking EMBL, SWISS-PROT and GENBANK as examples. EMBL and SWISS-PROT differ little in format structure, whereas the GENBANK format differs markedly from both. Extracting the desired fields from an entry (for example, a sub-sequence with an associated feature) for later analysis is a constant need in the biological sequence-analysis community; this is illustrated with tools for building new splice-site databases. The interface to the parser is abstract in the sense that access to all the databases is independent of their different formats, since the parsing instructions are hidden.
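
The idea of a format-independent interface can be sketched as follows. This is a minimal illustration under stated assumptions, not the API of the modules described in this paper: the class names, `open_database`, `subsequence`, and the two toy records are all hypothetical and invented for the example. It only assumes the general layout of the formats (two-letter line codes in EMBL and SWISS-PROT, left-aligned keywords with indented continuation lines in GENBANK, and `//` as the entry terminator).

```python
import io


class FlatFileParser:
    """Hypothetical base class: iterating yields one dict per entry."""

    def __init__(self, handle):
        self.handle = handle

    def __iter__(self):
        lines = []
        for line in self.handle:
            line = line.rstrip("\n")
            if line.startswith("//"):  # entry terminator in all three formats
                if lines:
                    yield self.parse_entry(lines)
                lines = []
            else:
                lines.append(line)

    def parse_entry(self, lines):
        raise NotImplementedError  # format-specific rules live in subclasses


class EmblLikeParser(FlatFileParser):
    """EMBL / SWISS-PROT style: two-letter line code, data from column 6."""

    def parse_entry(self, lines):
        entry_id, desc, seq, in_seq = None, [], [], False
        for line in lines:
            code, data = line[:2], line[5:]
            if code == "ID":
                entry_id = data.split()[0].rstrip(";")
            elif code == "DE":
                desc.append(data.strip())
            elif code == "SQ":
                in_seq = True  # the lines that follow hold the residues
            elif in_seq and code == "  ":
                # keep letters only; this drops spaces and position counters
                seq.append("".join(c for c in data if c.isalpha()))
        return {"id": entry_id, "description": " ".join(desc),
                "sequence": "".join(seq)}


class GenBankParser(FlatFileParser):
    """GENBANK style: keyword at column 1, indented continuation lines."""

    def parse_entry(self, lines):
        entry_id, desc, seq, keyword = None, [], [], None
        for line in lines:
            if line.strip() and not line.startswith(" "):
                keyword = line.split()[0]
                data = line[len(keyword):].strip()
            else:
                data = line.strip()  # continuation of the current keyword
            if keyword == "LOCUS" and entry_id is None and data:
                entry_id = data.split()[0]
            elif keyword == "DEFINITION" and data:
                desc.append(data)
            elif keyword == "ORIGIN" and line.startswith(" "):
                seq.append("".join(c for c in data if c.isalpha()))
        return {"id": entry_id, "description": " ".join(desc),
                "sequence": "".join(seq)}


# Hypothetical registry: callers name a format and never see parsing rules.
PARSERS = {"embl": EmblLikeParser, "swissprot": EmblLikeParser,
           "genbank": GenBankParser}


def open_database(handle, fmt):
    return PARSERS[fmt](handle)


def subsequence(entry, start, end):
    """Slice with 1-based, inclusive coordinates, as used for features."""
    return entry["sequence"][start - 1:end]


# Toy records made up for this sketch; they are not real database entries.
EMBL_TEXT = """\
ID   TOY1; SV 1; linear; mRNA; STD; PLN; 12 BP.
DE   Toy entry for
DE   illustration.
SQ   Sequence 12 BP;
     acgtacgtac gt                                                      12
//
"""

GENBANK_TEXT = """\
LOCUS       TOY1                      12 bp    mRNA    linear   PLN
DEFINITION  Toy entry for
            illustration.
ORIGIN
        1 acgtacgtac gt
//
"""

for fmt, text in (("embl", EMBL_TEXT), ("genbank", GENBANK_TEXT)):
    for entry in open_database(io.StringIO(text), fmt):
        print(fmt, entry["id"], subsequence(entry, 3, 6))
```

In this sketch both parsers return the same dictionary for the two toy records, so downstream code such as `subsequence` does not need to know which database an entry came from. A real feature table (for example, intron and exon locations used to build a splice-site database) would need considerably more parsing than is shown here.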

Availability: The modules are available at http://shag.embl-heidelberg.de:8000/Biopy/

Contact: chenna{at}embl-heidelberg.de

Supplementary information: http://shag.embl-heidelberg.de:8000/Biopy/

* To whom correspondence should be addressed.






