Centromere Organization

Eukaryotic Centromere Organization

The sequences that make up the centromeres of diverse organisms are extremely variable. Most well-characterized centromeres contain repetitive DNA with an AT-richness greater than that of the genome average (9,10). However, individual organisms have evolved different genomic structures to create a locus capable of chromosome segregation (11).

The simplest centromere organization is found in the chromosomes of the budding yeast, Saccharomyces cerevisiae. Only approx 125 bp is required for centromere function in the budding yeast, and this sequence motif is shared among the centromeres of all 16 chromosomes. Unlike other characterized centromeres, the budding yeast centromere consists of largely unique DNA (12).

In contrast to the simple centromere of the budding yeast, the fission yeast centromere is more similar to the centromeres of higher eukaryotes in its size and complexity. Schizosaccharomyces pombe centromeres are made up of inner and outer inverted repeats flanking a nonrepetitive central core (12), and each of these regions is AT-rich. Among the three S. pombe chromosomes, the centromeres are similar, but not identical, in organization (Fig. 1). Overall, S. pombe centromeres are 35-110 kb in size (12), spanning a few percent of the linear length of each chromosome. The inner repeats and central core are necessary for centromere function and bind spindle microtubules (13,14), whereas the outer repeats recruit heterochromatin proteins and are more likely responsible for functions such as heterochromatin formation and sister chromatid cohesion (15,16).

Centromeres in several other organisms are characterized by long stretches of so-called "satellite DNA." The Drosophila centromere has been defined by a 420-kb region of a minichromosome that is required for chromosome transmission. The fly centromeric region consists of two adjacent blocks of short microsatellites, based on AATAT and AAGAG repeats, that are interspersed with transposons as well as AT-rich DNA (11,17,18). Normal fly centromeres have not been fully sequenced, owing to the difficulty of sequencing and assembling highly heterochromatic regions of the genome (19). However, the chromatin environment of endogenous Drosophila centromeres has been very well characterized (20-22).

Fig. 1. Schematic representation of genomic organization of centromeres. In the center of the figure are representative chromosomes drawn to scale from the human (Homo sapiens), rice (Oryza sativa), Arabidopsis (Arabidopsis thaliana), and fission yeast (Schizosaccharomyces pombe) genomes. For each, the extent of the centromere region is indicated by the gray oval and comprises a few percent of each chromosome. At the right is an expanded view of a centromere from S. pombe. Each S. pombe centromere contains an approx 4-kb central core (gray box), bordered by approx 6 kb of imperfect repeats on the chromosome arms. The organization of the outer repeats is more variable among chromosomes, but all belong to the same subfamilies of repeats. At the left is an expanded view of a centromere from a human chromosome. A typical centromere region spans several megabases and consists of 1000 or more tandem copies of α-satellite higher-order repeats (indicated by arrows). Higher-resolution views are available from refs. 33 and 34 and in Fig. 2.

Plant centromeres are very similar to the satellite- and transposon-rich fly centromeres (23). The major component of the Arabidopsis thaliana centromere is an AT-rich 180-bp repeat unit, arranged in arrays spanning 400 kb to 1.4 Mb among chromosomes (24). The Arabidopsis centromere is also enriched for retrotransposons not usually found on chromosome arms. A similar picture has emerged for the rice (Oryza sativa) centromere, which is composed predominantly of a 155-bp tandem repeat unit arranged in arrays ranging from 65 kb to 2 Mb among the 12 rice chromosomes (25) (Fig. 1). These arrays are interspersed with gypsy-class retrotransposons. The Arabidopsis and rice centromeres have recently been defined at the level of chromatin, and chromatin immunoprecipitation experiments using antibodies to proteins required for centromere function have been conducted in both species. As expected, centromere proteins are associated with satellite repeats in both Arabidopsis (26) and rice (27). However, within the functional domain of the smallest rice centromere, there are also four expressed genes. This finding is surprising because centromeres are classically thought of as heterochromatic regions resistant to gene expression (28,29).

As opposed to all other normal centromeres described, the centromeres of Caenorhabditis elegans appear to be completely sequence-independent. C. elegans chromosomes are holocentric, meaning that many sites along the chromosome act as a centromere, capable of recruiting centromere proteins necessary for segregation (30-32). Holocentric chromosomes are a curious contrast to the monocentric chromosomes found in most other species, which typically contain repetitive AT-rich DNA at their centromeres.

Human Centromere Organization

The human centromere is made up of highly repetitive DNA known as α-satellite, which comprises an estimated 2-3% of the human genome (7,33). All normal human centromeres are composed of α-satellite DNA, although the organization of α-satellite varies from centromere to centromere (8). The most basic unit of α-satellite DNA is an approx 171-bp monomer, and monomers may be arranged in one of two configurations of α-satellite, designated "higher-order" or "monomeric" (6,33,34). Higher-order α-satellite is made up of monomers organized in highly homogeneous higher-order repeat units (8,34). For example, the higher-order α-satellite array found on chromosome 17, D17Z1, is made up of 16 monomers arranged head to tail to form a 2.7-kb higher-order repeat unit that is in turn repeated in tandem over a thousand times at the chromosome 17 centromere (7). Higher-order α-satellite has been found at all human centromeres, and higher-order arrays on individual chromosomes in the population range from a few hundred kilobases on some Y chromosomes to nearly 5 Mb in size for some autosomes (Fig. 1).
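The sizes quoted for D17Z1 follow from simple multiplication, which the short sketch below makes explicit. It uses only the figures given in the text (an approx 171-bp monomer, 16 monomers per higher-order repeat) and takes 1000 tandem copies as an illustrative lower bound for "over a thousand"; the function names are hypothetical and chosen for this example only.

```python
MONOMER_BP = 171       # approximate monomer length, from the text
MONOMERS_PER_HOR = 16  # monomers in one D17Z1 higher-order repeat, from the text


def hor_unit_bp(monomers_per_hor, monomer_bp=MONOMER_BP):
    """Length in bp of one higher-order repeat unit."""
    return monomers_per_hor * monomer_bp


def array_bp(hor_copies, monomers_per_hor, monomer_bp=MONOMER_BP):
    """Length in bp of a tandem array of higher-order repeat units."""
    return hor_copies * hor_unit_bp(monomers_per_hor, monomer_bp)


unit = hor_unit_bp(MONOMERS_PER_HOR)       # 16 x 171 bp
array = array_bp(1000, MONOMERS_PER_HOR)   # 1000 copies assumed for illustration
print(f"HOR unit: {unit / 1e3:.1f} kb; array: {array / 1e6:.1f} Mb")
```

With these inputs the unit comes to 2736 bp (the 2.7 kb cited above), and a thousand copies give an array of roughly 2.7 Mb, in line with the megabase-scale arrays described for human centromeres.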

α-Satellite with a less homogeneous monomeric organization is also found at most, if not all, human centromeric regions, and this type of α-satellite by definition lacks any higher-order periodicity (33,34). Where monomeric α-satellite has been described, it has been found adjacent to higher-order α-satellite and is less abundant than the megabase-sized arrays of higher-order α-satellite. Unlike higher-order α-satellite, monomeric α-satellite is regularly interspersed with other repeat elements and with duplicated sequences, as well as some unique sequences (34,35).

Although higher-order α-satellite has been linked to centromere function, there is no evidence for monomeric α-satellite contributing to proper chromosome segregation. Thus, higher-order and monomeric α-satellites occupy physically and functionally distinct regions of each chromosome (Fig. 2A). To reflect these distinctions, the arrays of higher-order α-satellite and adjacent regions including monomeric α-satellite and other sequences are most clearly termed the "centromere" and "pericentromere," respectively (36,37).

α-SATELLITE EVOLUTION

The organization of α-satellite is a product of concerted evolutionary processes (7), and, thus, these sequences typically exhibit higher sequence identity within a species than between species (38). Although α-satellite has been found at all primate centromeres studied, the organization and types of α-satellite vary among species (6,7) and have begun to inform hypotheses about centromere evolution.

Fig. 2. Genomic organization and annotation of human centromeres. (A) General model of centromeric region. The centromere itself is contained within a large array of higher-order α-satellite (light gray) that spans several megabases and is only rarely interrupted by nonsatellite sequences such as transposable elements. The array is flanked by shorter segments of monomeric α-satellite, which is interspersed with other satellite sequences, duplicated sequence, a high frequency of transposable elements, and occasional single-copy sequences before transitioning into the euchromatin of the chromosome arms. (B) Because the human genome sequence covers mostly euchromatic sequence, the available contigs do not contain much (and in most cases, any) of the higher-order repeat arrays (34) on individual chromosomes. Thus, by comparison to the model in (A), the current map and sequence of each chromosome has a large centromere gap, whose size can only be approximated. Two of the most complete annotated maps are shown for chromosomes 8 and the X, indicating the location of known higher-order repeat α-satellite, monomeric α-satellite, other satellite sequences, and genes close to the centromere.

Higher-order α-satellite has been found at some of the centromeres of chimpanzees, gorillas, and orangutans (7). Notably, higher-order α-satellite has not been found in more distant primates; indeed, only monomeric α-satellite has been found in Old World monkeys, New World monkeys, and prosimians. As the centromeres from these species have not been fully analyzed, however, the apparent absence of higher-order α-satellite should be interpreted with caution. Nonetheless, these findings are consistent with a model of α-satellite evolution in which higher-order α-satellite evolved relatively recently from monomeric α-satellite (6,39,40).

Although a number of processes, collectively referred to as "molecular drive" (38) and including mechanisms such as unequal crossover, gene conversion, and transposition, may be participating in α-satellite evolution to some extent, the homogenization of α-satellite can largely be accounted for by unequal crossover. Recurring rounds of crossovers will homogenize tandem repeats, leading to nearly identical repeat units. This process can explain not only the emergence of α-satellite DNA approx 30-50 million years ago, but also the initial homogenization of subsets of monomeric α-satellite to form the higher-order repeat units that subsequently expanded to make up the megabase-sized arrays currently present on human centromeres.
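How recurring unequal crossovers can homogenize a tandem array may be easier to see in a toy simulation. The sketch below is illustrative only and is not a model taken from the cited references. It assumes an array of repeat units that all start as distinct variants, no new mutation, exchanges that misalign two identical copies by one to three units, and a hypothetical size window that discards arrays that grow or shrink too far. All names and parameter values are invented for the example.

```python
import random


def unequal_crossover(array, rng, max_shift=3):
    """Return one product of an unequal exchange between two copies of `array`.

    The two copies are misaligned by `shift` repeat units and joined at a
    random breakpoint, giving either a duplication or a deletion of `shift`
    units. One of the two reciprocal products is chosen at random.
    """
    n = len(array)
    shift = rng.randint(1, max_shift)
    if n < 2 * shift:
        return list(array)  # array too short to misalign by this much
    i = rng.randint(shift, n - shift)  # breakpoint, in repeat units
    if rng.random() < 0.5:
        return array[:i] + array[i - shift:]  # gains `shift` units
    return array[:i] + array[i + shift:]      # loses `shift` units


def simulate(n_units=40, rounds=5000, seed=0, min_len=20, max_len=80):
    """Apply repeated unequal exchanges to an array of distinct variants."""
    rng = random.Random(seed)
    array = list(range(n_units))  # each unit starts as its own variant
    for _ in range(rounds):
        product = unequal_crossover(array, rng)
        # Assumed constraint: discard products outside a size window.
        if min_len <= len(product) <= max_len:
            array = product
    return array


final = simulate()
print(f"{len(final)} units, {len(set(final))} distinct variants (started with 40)")
```

Because no new variants arise in this sketch, the number of distinct variants can only stay the same or fall as duplications spread some units and deletions remove others. This is the sense in which repeated exchanges drive an array toward nearly identical repeat units.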

The relationships among α-satellite on different chromosomes, homologs of the same chromosome, and sister chromatids are very informative for determining the relative rates of unequal crossover events predicted to occur in α-satellite evolution. With the exception of the centromeres on the acrocentric chromosomes, higher-order α-satellite in the human genome is chromosome-specific, meaning that higher-order α-satellite on one chromosome may be distinguished from that on another chromosome (8). This can best be explained by unequal crossover events limited to homologous chromosomes that homogenized α-satellite into a chromosome-specific higher-order array (40-42). The high sequence identity among thousands of higher-order repeat units on a given chromosome argues that intrachromosomal exchange (i.e., within and between homologs) is an efficient mechanism for homogenizing α-satellite (7,38).

However, there is also evidence of ancient interchromosomal exchanges involving α-satellite. Higher-order repeats from different chromosomes have related organizations and fall into suprachromosomal families (7,43). Although the related higher-order arrays on different chromosomes provide evidence for ancient interchromosomal exchanges, the overall sequence variation among higher-order repeats within a suprachromosomal family suggests that this type of exchange event occurs much less frequently than intrachromosomal exchanges between homologous chromosomes (42).
