From: Methods in Molecular Biology, vol. 338: Gene Mapping, Discovery, and Expression: Methods and Protocols Edited by: M. Bina © Humana Press Inc., Totowa, NJ

Comparison of the DNA sequences of different species is a powerful method for decoding genomic information, since functional sequences tend to evolve at a slower rate than nonfunctional sequences. Analysis of conservation helps to identify coding sequences (1-4) and conserved noncoding elements with regulatory functions (5,6), and also to determine which sequences are unique to a given species. Several groups have aligned entire vertebrate genome assemblies, including human, mouse, dog, and chicken, thus providing comprehensive statistical data on the patterns of DNA conservation among these species (7-10). Recently published reviews on comparative sequence analysis (11-15) describe this fast-growing field and present computational resources available for a wide range of biological investigations.

All VISTA tools (16-19) use a global (or global/local) alignment strategy that assumes a monotonic end-to-end correspondence of DNA intervals. The AVID (20) and LAGAN (21) programs allow for fast global alignment of sequences up to megabases in length. The alignment is visualized as a continuous curve showing the level of conservation in a moving window of a predefined length. A global alignment strategy, preceded by an additional mapping step, was also used in the pairwise and three-way alignments of whole-genome assemblies performed in our group (22,23). A different web-accessible software package for comparative genomics, PipMaker (24-26), is based on a local alignment approach. A comparison of PipMaker and VISTA (14), as well as a comparative assessment of several alignment methods and programs (27), has been published elsewhere.
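The moving-window conservation curve described above can be illustrated with a minimal sketch. It is not the VISTA implementation: the function name `window_identity`, the default window length, and the choice to score a gap column as a mismatch are all assumptions made for illustration. The input is a pair of already aligned sequences of equal length, with `-` marking gaps.

```python
def window_identity(seq_a, seq_b, window=100):
    """Return percent identity for every full window along a pairwise alignment.

    seq_a, seq_b: aligned sequences of equal length ('-' marks a gap).
    window: window length in alignment columns (hypothetical default).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if window <= 0 or window > len(seq_a):
        raise ValueError("window must be between 1 and the alignment length")

    # 1 for a column where both sequences carry the same base, else 0.
    # Assumption: a gap in either sequence counts as a mismatch.
    matches = [
        1 if a == b and a != "-" else 0
        for a, b in zip(seq_a.upper(), seq_b.upper())
    ]

    # Running sum: add the column entering the window, drop the one leaving.
    current = sum(matches[:window])
    profile = [100.0 * current / window]
    for i in range(window, len(matches)):
        current += matches[i] - matches[i - window]
        profile.append(100.0 * current / window)
    return profile


if __name__ == "__main__":
    # Toy alignment of 8 columns scored with a 4-column window.
    print(window_identity("ACGTACGT", "ACGTTCGA", window=4))
```

Plotting the returned list against the window start position gives a curve of the kind described in the text; the running sum keeps the cost linear in the alignment length regardless of the window size.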

The VISTA web page serves as a portal for access to the suite of VISTA tools. VISTA Browser gives access to precomputed pairwise and multiple alignments of whole-genome assemblies of different groups of organisms. The three main VISTA servers (GenomeVISTA, mVISTA, and rVISTA) offer a range of options for comparative analysis of user-submitted sequences. GenomeVISTA aligns and compares a single sequence (draft or finished) with whole genomes. mVISTA is designed for comparison of orthologous sequences from different species. rVISTA (19) takes conservation among species into account to improve the prediction of transcription factor binding sites (TFBS). The VISTA pages offer extensive help on selecting a type of analysis, finding optimal parameters for a particular project, and navigating the web site.
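The general idea of using conservation to narrow down predicted binding sites can be sketched as follows. This is an illustration only and should not be read as the rVISTA algorithm: the function name `conserved_sites`, the half-open coordinate convention, and the 80% threshold are hypothetical choices. The sketch assumes a list of predicted site intervals and a per-position conservation profile for the same sequence.

```python
def conserved_sites(sites, conservation, min_identity=80.0):
    """Keep predicted sites whose mean conservation reaches a threshold.

    sites: list of (start, end) intervals, 0-based and half-open.
    conservation: per-position percent identity for the same sequence.
    min_identity: hypothetical cutoff, not a value taken from rVISTA.
    """
    kept = []
    for start, end in sites:
        if not 0 <= start < end <= len(conservation):
            raise ValueError(f"site ({start}, {end}) lies outside the profile")
        # Average the conservation values over the positions of the site.
        mean = sum(conservation[start:end]) / (end - start)
        if mean >= min_identity:
            kept.append((start, end))
    return kept


if __name__ == "__main__":
    profile = [90, 90, 90, 90, 40, 40, 40, 40]
    predicted = [(0, 4), (2, 6), (4, 8)]
    # Only the first site lies entirely in the well-conserved half.
    print(conserved_sites(predicted, profile))
```

Sites predicted in poorly conserved stretches are discarded, which reduces the number of candidates that need to be examined further.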

The VISTA web site also provides access to the results of comparative analysis of specific sets of genes, as well as other relevant internal and external resources. In addition to a description of these services, widely used by the biological community, a short overview of recently developed tools and techniques is given. The capabilities of our programs are illustrated by analyzing an arbitrarily selected disease-associated gene, RUNX1, located on human chromosome 21.
