Cluster Heat Maps

One of the most powerful ways to mine and visualize high-dimensional data is cluster analysis. The mathematical basis of clustering algorithms is beyond the scope of this chapter, but simply put, cluster analyses attempt to detect natural groups in data using a combination of distance metrics and linkage methods. GEO provides nine classic varieties of precomputed unsupervised hierarchical clusters, as well as user-defined K-means and K-median clustering (Fig. 1). The columns (samples) and, independently, the rows (genes) are rearranged so that rows with similar response patterns lie near each other and columns with similar response patterns lie near each other. Cluster results are represented graphically as "heat maps," in which high through low expression levels are displayed as a two-color spectrum, allowing the user to identify groups of interesting genes through visual pattern recognition. Each distinctly colored "island" in the heat map is taken to represent a coordinated transcriptional response, based on the assumption that genes with similar expression profiles across a set of conditions are likely to be involved in the same biological processes. Such biologically relevant clusters can lead to the formulation of testable predictions and can suggest functional roles for previously uncharacterized genes.
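To make the roles of the distance metric and the linkage concrete, the following is a minimal sketch of how rows and columns can be reordered independently before a heat map is drawn. It is not GEO's implementation: the toy expression matrix, the gene names, and the helper functions pearson_distance and cluster_order are hypothetical, and the choice of Pearson correlation distance with average linkage is an assumption standing in for just one of the many possible metric and linkage combinations. In practice a library routine such as SciPy's scipy.cluster.hierarchy.linkage would normally be used instead of hand-written code.

```python
from math import sqrt


def pearson_distance(a, b):
    """Distance metric: 1 - Pearson correlation of two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0  # assumption: a flat profile is treated as uncorrelated
    return 1.0 - cov / sqrt(var_a * var_b)


def cluster_order(profiles):
    """Average-linkage agglomerative clustering; returns indices in leaf order."""
    n = len(profiles)
    # Pairwise distances between the original profiles, keyed by (low, high).
    dist = {
        (i, j): pearson_distance(profiles[i], profiles[j])
        for i in range(n)
        for j in range(i + 1, n)
    }

    def average_linkage(c1, c2):
        total = sum(dist[(min(i, j), max(i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    # Each cluster is a list of original indices; start with singletons.
    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        # Find the closest pair of clusters and merge them.
        _, a, b = min(
            (average_linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters[0]


# Hypothetical toy data: rows are genes, columns are samples.
genes = ["geneA", "geneB", "geneC", "geneD"]
expression = [
    [1.0, 2.0, 8.0, 9.0],  # geneA: higher in samples 3 and 4
    [9.0, 8.0, 1.0, 2.0],  # geneB: higher in samples 1 and 2
    [1.5, 2.5, 7.0, 9.5],  # geneC: pattern similar to geneA
    [8.0, 9.0, 2.0, 1.0],  # geneD: pattern similar to geneB
]

# Rows (genes) and columns (samples) are clustered independently.
row_order = cluster_order(expression)
col_order = cluster_order([list(col) for col in zip(*expression)])

# Reordered matrix: this is what would be colored to produce the heat map.
for r in row_order:
    print(genes[r], [expression[r][c] for c in col_order])
```

In this toy example, geneA and geneC end up in adjacent rows, as do geneB and geneD, so each pair would form a contiguous block of similar color once the reordered values are mapped onto a two-color scale.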

The GEO cluster heat map images are interactive: using a movable box, users can select one or more regions of interest. A selected region can be enlarged, and its underlying raw data can be downloaded, plotted as line charts, or linked out to the corresponding profiles in Entrez GEO Profiles (Fig. 3; see Note 3).

