CEBS Phase III: Integrate Microarray Gene Expression and Protein Expression Databases Using a Gene/Protein Group Strategy

Integrating microarray (gene expression) and protein expression data is a critical step. It requires knowledge of gene/protein functional relationships and gene/protein groups, together with algorithms that extend our understanding of the functions of these groups through actual experimentation. To build this knowledge, we will mine the published literature for genes, and for groups of functionally related genes or protein products, that are relevant to known endpoints in toxicology, pathology, cell regulatory processes, metabolism, and the like. The literature mining and analysis will use vetted gene names, and its output will be groups of genes/proteins that represent putative functional groups based on the literature. We will then develop algorithms that test these literature-derived putative functional gene groups against treatment-related expression profiles and against clustered genes (and coregulated ESTs) to confirm gene grouping based on phenotype, as illustrated in Figure 10.9.
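One simple way such a test could be framed is as an overlap (enrichment) test between a literature-derived group and an expression cluster. The sketch below is illustrative only: the text does not specify the algorithm, so the choice of a one-sided hypergeometric test, the function names overlap_p_value and coregulated_candidates, and the gene identifiers are all assumptions introduced here.

```python
from math import comb


def overlap_p_value(group, cluster, universe_size):
    """One-sided hypergeometric p-value for the overlap of two gene sets.

    Returns the probability of seeing at least the observed number of
    shared genes if the cluster were drawn at random from a universe of
    `universe_size` genes (for example, all genes on the array).
    Assumption: this test is a stand-in, not the method of the source.
    """
    group, cluster = set(group), set(cluster)
    shared = len(group & cluster)
    group_size, cluster_size = len(group), len(cluster)
    if group_size > universe_size or cluster_size > universe_size:
        raise ValueError("gene sets cannot be larger than the universe")
    total = comb(universe_size, cluster_size)
    # Sum the upper tail: overlaps of size `shared` or larger.
    tail = sum(
        comb(group_size, i) * comb(universe_size - group_size, cluster_size - i)
        for i in range(shared, min(group_size, cluster_size) + 1)
    )
    return tail / total


def coregulated_candidates(group, cluster):
    """Cluster members (e.g. ESTs) that are not in the literature group."""
    return set(cluster) - set(group)


# Hypothetical identifiers, used only to exercise the functions.
literature_group = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
expression_cluster = {"GENE_A", "GENE_B", "GENE_C", "EST_0001"}

p = overlap_p_value(literature_group, expression_cluster, universe_size=1000)
candidates = coregulated_candidates(literature_group, expression_cluster)
```

A small p-value would suggest that the cluster recovers the literature group more often than chance allows, and the remaining cluster members (here the hypothetical EST_0001) become candidates for coregulated, not yet characterized genes. In practice a correction for testing many groups against many clusters would also be needed.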

This literature-based functional classification of gene groups and their association with known toxicant-responsive pathways will begin to define the relationships between gene and protein expression and our conventional understanding of metabolism, toxicology/pathology, modulation and homeostasis, cell regulation, and cell signalling. It will also offer an opportunity for discovery of yet-unidentified genes (ESTs) that are coregulated with known genes.

To the extent possible, we will confirm gene group membership by sequence analysis, and we will develop statistical procedures and algorithms (Wolfinger et al. 2001) to continually refine our knowledge of gene/protein groups and their relationship to functional pathways. With full sequence definition of all genes, proteins, and gene/protein group members, it will be possible to begin to BLAST outlier genes and proteins from new experimental datasets against datasets already contained in the CEBS database. This will begin to facilitate and inform the integration of transcriptomics and proteomics datasets across treatment, dose, time, tissue type, and phenotypic severity. We also propose to integrate metabonomics datasets into CEBS Phase III, because of the pivotal role that metabolism plays in experimental and clinical toxicology as well as in hazard identification and risk assessment (Nicholson et al. 1999; Holmes et al. 2000; Holmes et al. 2001; Bundy et al. 2002; Nicholson et al. 2002).

Fig. 10.9 Literature-derived putative functional gene groups validated against actual expression profiles of known toxicant-responsive pathways.
