The methods below describe (1) the prerequisites and assumptions for this analysis, (2) where to obtain eukaryotic genome assemblies, (3) how to install the BLAST suite of programs, and (4) how to create BLAST databases. To identify segmental duplications in eukaryotic genomes, they then cover (5) how to align all possible pairs of chromosomes using MegaBLAST, (6) how to convert the MegaBLAST alignments into Generic Feature Format (GFF), (7) the criteria for filtering GFF records, and (8) how to chain alignments together. Finally, we describe how to identify gene duplicates by (9) mapping RefSeq genes to segmental duplications and (10) using the Gene Ontology to characterize the gene duplicates by function.
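
As a rough orientation for steps (5) and (6), the following Python sketch enumerates chromosome pairs and turns one alignment line into a GFF record. It is an illustration under stated assumptions, not the procedure used in this chapter: it assumes self-pairs (a chromosome aligned against itself) are included, that alignments are in the standard 12-column BLAST tabular format, and that the query coordinates define the feature. The function names, the `megablast` source label, the `match` feature type, and the attribute layout are all hypothetical choices.

```python
from itertools import combinations_with_replacement


def chromosome_pairs(chromosomes):
    """Return every unordered pair of chromosomes, self-pairs included (assumption)."""
    return list(combinations_with_replacement(chromosomes, 2))


def blast_hit_to_gff(line, source="megablast", feature="match"):
    """Convert one 12-column BLAST tabular line into a 9-column GFF record.

    Assumed column order: qseqid, sseqid, pident, length, mismatch, gapopen,
    qstart, qend, sstart, send, evalue, bitscore.
    """
    f = line.rstrip("\n").split("\t")
    qseqid, sseqid, pident = f[0], f[1], f[2]
    qstart, qend = int(f[6]), int(f[7])
    sstart, send = int(f[8]), int(f[9])
    bitscore = f[11].strip()

    # Nucleotide hits on the minus strand are reported with sstart > send.
    strand = "+" if sstart <= send else "-"

    # GFF requires start <= end, so coordinates are normalized here.
    start, end = min(qstart, qend), max(qstart, qend)
    attributes = (
        f"Target={sseqid} {min(sstart, send)} {max(sstart, send)};"
        f"identity={pident}"
    )
    return "\t".join(
        [qseqid, source, feature, str(start), str(end),
         bitscore, strand, ".", attributes]
    )
```

Each pair returned by `chromosome_pairs` would correspond to one MegaBLAST run, and each line of its tabular output to one call of `blast_hit_to_gff`.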

