SGN (http://sgn.cornell.edu)

The SOL Genomics Network (SGN) (Mueller et al. 2005a), the front-end website of the SOL project, is a comprehensive resource based on the clade-oriented database (COD) principle: data from an entire clade of organisms are integrated into the database rather than data from a single model species. This allows more meaningful integration of diverse datasets for different organisms in a comparative and phylogenetic context. In this vein, SGN contains extensive comparative mapping data and sequence data from EST sequencing projects for tomato species, potato (S. tuberosum), pepper (C. annuum), eggplant (S. melongena), petunia (P. hybrida), and tobacco (N. tabacum), as well as species from closely related families such as coffee (Coffea canephora var. robusta) and snapdragon (Antirrhinum majus). All data are related to Arabidopsis and the emerging tomato genome sequence. The EST sequences are assembled into unigene sets that are annotated extensively using sequence matches to Arabidopsis and GenBank sequences, InterPro domains, detected features such as signal peptides, and other analyses. The unigene sequences form the basis of pre-calculated gene family "tribes" generated with the TribeMCL program (Enright et al. 2002). Member sequences of each multigene family are aligned, and gene trees are calculated and stored in the database. These can be browsed online using the alignment viewer and tree browser tools.

SGN recently introduced a locus database that contains annotated genetic loci from the literature for all species in the database. The locus database can be updated by the SGN user community. Each locus has an associated editor who, after logging on to the SGN website, can edit the locus name, symbol, descriptions, and chromosomal location. Similarly, a phenotype database was introduced that allows users to submit mutant descriptions, images, and other data about mutants for any of the species in the database.

SGN is one of the bioinformatics nodes of the tomato sequencing project (Mueller et al. 2005b), providing the ten sequencing partners with tools such as a BAC registry, project statistics, a sequence repository, and viewers for the annotated sequence. Data in SGN are being mapped comprehensively to the emerging tomato reference sequence.

Tools available on SGN include BLAST (Altschul et al. 1990), the Intron Finder for Solanaceae ESTs, the CAPS Designer, and bulk downloads. A comprehensive FTP site with complete data sets is also available.
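
As a rough illustration of the idea behind a CAPS (cleaved amplified polymorphic sequence) marker search, the sketch below looks for restriction enzymes whose recognition sites occur a different number of times in two alleles. It is a minimal sketch under stated assumptions and not the SGN CAPS Designer itself: the function names, the three-enzyme table, and the two short allele sequences are all hypothetical and chosen only for illustration.

```python
# Hypothetical illustration of a CAPS candidate search; this is NOT the
# SGN CAPS Designer's code, and the example sequences are made up.

# A few well-known restriction enzymes and their recognition sites.
# All three sites are palindromic, so scanning one strand is sufficient.
ENZYMES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def count_sites(seq, site):
    """Count occurrences (overlapping allowed) of a recognition site in seq."""
    seq = seq.upper()
    count = 0
    start = seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count


def caps_candidates(allele_a, allele_b, enzymes=ENZYMES):
    """Return enzymes whose site counts differ between the two alleles.

    The result maps enzyme name -> (sites in allele_a, sites in allele_b).
    """
    result = {}
    for name, site in enzymes.items():
        a = count_sites(allele_a, site)
        b = count_sites(allele_b, site)
        if a != b:
            result[name] = (a, b)
    return result


# Two made-up alleles differing by a single base inside an EcoRI site.
allele_a = "ATGCGAATTCTTAGGCTA"
allele_b = "ATGCGAGTTCTTAGGCTA"
print(caps_candidates(allele_a, allele_b))  # {'EcoRI': (1, 0)}
```

An enzyme reported here cuts one allele but not the other, so digesting the amplified fragment would give different band patterns for the two alleles. A real design tool would also need a full enzyme list, support for degenerate and non-palindromic sites, and fragment-size checks.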
