Senior Scholar Award in Global Infectious Disease
David S. Roos, Ph.D.
University of Pennsylvania

Designing and Mining Pathogen Genome Databases: The Apicoplast as a Novel Drug Target in Plasmodium Parasites…and Other Stories

The excitement engendered by the completion of pathogen, vector, and human genome sequences is coupled with a concern: how are researchers to effectively exploit these data for drug and vaccine development?

Genomics research is characterized by the production of large-scale datasets derived from DNA sequencing, RNA and protein expression profiling, high-throughput structural data, global analysis of protein-protein interactions, metabolic pathway studies, journal publications, clinical outcomes data, etc. Successful exploitation of these resources requires access to the underlying datasets, databases designed to integrate this information, tools for data analysis, practical resources for querying the data, education of scientists in the use of these tools, and new paradigms for providing credit to the numerous investigators contributing to large-scale projects.

The Plasmodium genome database -- PlasmoDB -- establishes a single location for storing, downloading, browsing and querying the data emerging from various projects focused on malaria parasites, including the complete genome of P. falciparum. A relational architecture allows researchers to form queries integrating genetic and physical maps, DNA sequence, gene and protein predictions, RNA and protein expression data, taxonomic relationships with related parasite species, and comparisons with data from GenBank and other databases. Similar databases have been established for Toxoplasma and other parasites.
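
As a rough illustration of what an integrated query of this kind might look like, the sketch below builds a toy in-memory relational database using Python's standard sqlite3 module. The table names, column names, gene identifiers, and values are all hypothetical and do not reflect the actual PlasmoDB schema or data. The point is only that a join lets a single question span gene annotation and expression data at once.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE gene (
            gene_id TEXT PRIMARY KEY,
            chromosome INTEGER,
            predicted_product TEXT
        );
        CREATE TABLE expression (
            gene_id TEXT REFERENCES gene(gene_id),
            stage TEXT,
            level REAL
        );
    """)
    # Invented rows for illustration only; not real parasite genes.
    conn.executemany("INSERT INTO gene VALUES (?, ?, ?)", [
        ("toy_gene_1", 3, "hypothetical surface protein"),
        ("toy_gene_2", 3, "hypothetical plastid enzyme"),
        ("toy_gene_3", 7, "hypothetical kinase"),
    ])
    conn.executemany("INSERT INTO expression VALUES (?, ?, ?)", [
        ("toy_gene_1", "blood", 8.5),
        ("toy_gene_2", "blood", 1.2),
        ("toy_gene_3", "blood", 9.1),
        ("toy_gene_1", "mosquito", 0.4),
    ])
    conn.commit()
    return conn


def genes_expressed(conn, chromosome, stage, min_level):
    """Join annotation with expression data: genes on one chromosome
    whose expression in the given stage meets a minimum level."""
    return conn.execute("""
        SELECT g.gene_id, g.predicted_product, e.level
        FROM gene AS g
        JOIN expression AS e ON e.gene_id = g.gene_id
        WHERE g.chromosome = ? AND e.stage = ? AND e.level >= ?
        ORDER BY g.gene_id
    """, (chromosome, stage, min_level)).fetchall()


conn = build_toy_db()
result = genes_expressed(conn, 3, "blood", 5.0)
print(result)
```

A real genome database would involve many more tables (maps, sequence, protein predictions, cross-references to GenBank), but the same join pattern extends to them.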

Success stories attributable to the Plasmodium genome project and the PlasmoDB database include the identification of novel secretory and surface antigens, and cataloging of all metabolic pathways associated with the apicoplast. Phylogenetic studies demonstrate that the apicoplast -- an essential organelle -- was acquired when an ancestral parasite 'ate' a eukaryotic alga, and retained the algal chloroplast. Integrated computational, genomic, genetic, biochemical, cell biological, and pharmacological studies have identified several hundred nuclear-encoded plastid genes (~15% of the parasite genome), providing a virtually complete picture of the apicoplast "metabolome", and several promising targets for parasiticidal drug design.

Similar strategies can be exploited to identify other targets for drug and vaccine design. For example, efforts to develop an anti-malaria vaccine would benefit from the identification of potential surface antigens that are predicted to be immunodominant and show evidence of (positive) immune selection. Drug development efforts would benefit from the identification of enzymes that are expressed in bloodstream parasites and differ significantly from their human counterparts, providing a potential therapeutic window.
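
The drug-target criterion described above can be sketched as a simple filter. This is a minimal illustration, not a description of any actual pipeline: the record fields, the enzyme names, and the 30% identity cutoff are all assumptions chosen for the example. Enzymes are kept if they are expressed in the blood stage and either have no detectable human homolog or fall at or below the identity cutoff.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Enzyme:
    name: str
    expressed_in_blood_stage: bool
    # Fraction identity to the closest human protein (hypothetical field);
    # None means no human homolog was detected.
    human_identity: Optional[float]


def candidate_targets(enzymes: List[Enzyme], max_identity: float = 0.3) -> List[str]:
    """Return names of enzymes expressed in blood-stage parasites that are
    absent from, or highly divergent from, the human proteome.
    The default cutoff is an arbitrary illustrative value."""
    picked = []
    for e in enzymes:
        if not e.expressed_in_blood_stage:
            continue
        if e.human_identity is None or e.human_identity <= max_identity:
            picked.append(e.name)
    return picked


# Invented records for illustration only.
enzymes = [
    Enzyme("toy_enzyme_A", True, None),
    Enzyme("toy_enzyme_B", True, 0.72),
    Enzyme("toy_enzyme_C", False, 0.10),
    Enzyme("toy_enzyme_D", True, 0.25),
]
shortlist = candidate_targets(enzymes)
print(shortlist)
```

In practice the expression and similarity values would come from the kinds of integrated datasets described earlier, and sequence identity alone is only a crude proxy for a therapeutic window.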

