Structures of LAGLIDADG Homing Endonucleases

The structures of six LAGLIDADG enzymes bound to their DNA targets have been determined. These include two isoschizomeric homodimers (I-CreI: Heath et al. 1997; Jurica et al. 1998; Chevalier et al. 2001, 2003 and I-MsoI: Chevalier et al. 2003), which are both encoded within group I introns in the 23S rDNA of the green algae Chlamydomonas reinhardtii and Monomastix; two pseudo-symmetric monomers (I-AniI: Bolduc et al. 2003 and I-SceI: Moure et al. 2003), which are encoded in mitochondrial introns of the fungi Aspergillus nidulans and Saccharomyces cerevisiae; one artificially engineered chimera (H-DreI: Chevalier et al. 2002, which is composed of a domain of the monomeric archaeal enzyme I-DmoI fused to a subunit of I-CreI); and an intein-associated endonuclease from yeast (PI-SceI: Moure et al. 2002). Structures of two additional enzymes have also been determined in the absence of DNA: the archaeal intron-encoded I-DmoI (encoded within an intron in the 23S rRNA gene of Desulfurococcus mobilis; Silva et al. 1999), and the archaeal intein-encoded PI-PfuI (found in the ribonucleotide reductase gene of Pyrococcus furiosus; Ichiyanagi et al. 2000). These crystallographic structures illustrate the structural and functional significance of the LAGLIDADG motif, the mechanism of DNA recognition and binding, and the structure and likely mechanism of the enzymes' active sites.

LAGLIDADG enzyme domains form an elongated protein fold that consists of a core fold with mixed α/β topology (α-β-β-α-β-β-α). The overall shape of this domain is a half-cylindrical "saddle" that averages approximately 25 × 25 × 35 Å, with the longest dimension along a groove formed by the underside of the saddle. The surface of the groove is formed by an antiparallel, four-stranded β-sheet that presents a large number of exposed basic and polar residues for DNA contacts and binding. Each individual β-strand crosses the groove axis at an angle of approximately 45° and displays a continuous N- to C-terminal bend. The length of the core protein domain is often increased by extended loops connecting the β-strands at the periphery of the β-sheet structure. The β-sheets are stabilized by hydrophobic packing between the tops of the sheets and the α-helices of the core enzyme fold.

In the case of homodimeric enzymes, the full endonuclease structure is generated by a two-fold symmetry axis located at the N-termini of the individual subunits. For monomeric LAGLIDADG enzymes, a pseudo-dyad symmetry axis at the same position arranges individual domains from a single peptide chain into similar relative positions (Fig. 1). For the monomeric enzymes, the C- and N-terminal helices of the two related core domains (Dalgaard et al. 1997) are connected by flexible linker peptides ranging in length from 3 to over 100 residues. In either enzyme subfamily, the complete DNA-binding surfaces of the full-length enzymes are 70-85 Å long, and thus can accommodate DNA targets of up to 24 base pairs.
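As a rough consistency check on these figures, the length of the binding surface can be converted into a number of base pairs. The sketch below assumes an axial rise of about 3.4 Å per base pair, the textbook value for straight B-form DNA. That value and the function name are assumptions made for illustration; they are not taken from the structures themselves, in which the DNA is slightly bent.

```python
# Rough estimate: how many base pairs of DNA span a binding surface?
# ASSUMPTION: 3.4 Å axial rise per base pair (idealized, straight B-form DNA).
RISE_PER_BP_ANGSTROM = 3.4


def base_pairs_spanned(surface_length_angstrom: float,
                       rise: float = RISE_PER_BP_ANGSTROM) -> float:
    """Return the number of base pairs covered by a surface of the given length."""
    if surface_length_angstrom < 0 or rise <= 0:
        raise ValueError("length must be non-negative and rise must be positive")
    return surface_length_angstrom / rise


if __name__ == "__main__":
    for length in (70.0, 85.0):  # range quoted in the text
        print(f"{length:.0f} angstrom ~ {base_pairs_spanned(length):.1f} bp")
```

Under this assumption, the quoted 70-85 Å range corresponds to roughly 21 to 25 base pairs, which is of the same order as the limit of about 24 base pairs stated above.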

Fig. 1. Ribbon diagrams of homodimeric I-CreI (top) and asymmetric, monomeric I-AniI (bottom) endonucleases. In the latter, the core α-ββ-α-ββ-α domain fold is duplicated within the single polypeptide chain, and a long flexible linker (highlighted in yellow) connects their N- and C-termini, respectively. In the two structures, a perfect dyad symmetry axis, or a pseudo-symmetry axis, extends vertically in the plane of the page between the central two helices at the domain interface. The DNA target of I-CreI is 22 base pairs long and is a pseudo-palindrome; the DNA target of I-AniI is 19 base pairs long and asymmetric. Both enzymes use a four-stranded, antiparallel β-sheet to contact individual base pairs in the major groove of each DNA half-site. The DNA is in a very similar, slightly bent conformation in both structures. In I-CreI, there are three bound metal ions visible in the presence of manganese or magnesium. In the I-AniI structure, only two bound metal ions are visible, as discussed in the text.

The LAGLIDADG motif plays three distinct, but interrelated, roles in the structure and function of this enzyme family (Fig. 2). The first seven amino acid residues of each conserved motif form the last two turns of the N-terminal helices in each folded domain, which are packed against one another. Individual side chains from these helices participate either in core packing within individual domains or in contacts across the interdomain interface. The final three conserved residues (typically a Gly-Asp/Glu-Gly sequence) facilitate a tight turn from the N-terminal α-helix into the first β-strand of each DNA-binding surface. The conserved acidic residues of these sequences are positioned in the active sites and bind divalent cations that are essential for catalytic activity.

The structure and packing of the parallel, two-helix bundle in the domain interface of the LAGLIDADG enzymes are strongly conserved among the otherwise highly diverged members of this enzyme family. Helix packing at this interface is not mediated by a classic "ridges into grooves" strategy, but rather by small residues such as glycine and alanine that allow van der Waals contacts between backbone atoms along the helix-helix interface. The first two glycine and/or alanine residues in the LAGLIDADG motif participate directly in the dimer interface and allow tight packing of the helices. The close packing of the interface helices in these enzymes reflects the need to pack two symmetry-related endonuclease active sites less than 10 Å apart, to facilitate cleavage of homing site DNA across the narrow minor groove.

Fig. 2. The LAGLIDADG helices at the domain interface of I-CreI. Other enzymes in the family that have been visualized crystallographically have very similar motifs and structures. Note that a series of hydrophobic residues (Phe 9, Leu 10, 11 and 13, and Val 17) pack into the core of the individual enzyme domains, while other aromatic residues and small residues are involved in packing around and between the helices, respectively. The conserved acidic residues (Asp 20 and 20') are contributed to the active sites, where they participate in metal binding.

Despite little primary sequence homology among the LAGLIDADG homing endonucleases outside of the motif itself, the topologies of the endonuclease domains of the enzymes visualized to date, and the shape of their DNA-bound β-sheets, are remarkably similar. A structural alignment of endonuclease domains and subunits in their DNA-bound conformation indicates that the structure of the central core of the β-sheets is well conserved (Bolduc et al. 2003). At least 12 Cα positions within these β-sheets are in close juxtaposition and have a Cα root-mean-square deviation (RMSD) of approximately 1 Å. These positions correspond to residues that make contacts to base pairs ±1 to 6 in each DNA half-site (see below). The conformations of the more distant ends of the β-strands and connecting turns are more poorly conserved, displaying RMSD values of over 3 Å for DNA-contacting residues. Similar alignments of intein-associated endonuclease domains indicate a more diverged structure of the β-sheet motifs.
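For readers unfamiliar with the RMSD values quoted above, the following minimal sketch shows how a Cα RMSD is computed once two structures have been superimposed and equivalent Cα positions have been paired. It is not the alignment procedure of Bolduc et al. (2003): the function name is hypothetical, the superposition step is assumed to have been done already, and the coordinates are invented toy values rather than data from any deposited structure.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def ca_rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """RMSD (in Å) between paired, already-superimposed C-alpha coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    # Sum of squared distances between each pair of equivalent atoms.
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    # Root of the mean squared distance.
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # HYPOTHETICAL toy coordinates, not taken from any real structure.
    domain_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    domain_2 = [(0.0, 1.0, 0.0), (3.8, 0.0, 1.0), (7.6, -1.0, 0.0)]
    print(f"C-alpha RMSD: {ca_rmsd(domain_1, domain_2):.2f} angstrom")
```

In this toy case every paired atom is displaced by exactly 1 Å, so the RMSD is 1 Å, comparable to the value reported for the conserved β-sheet core. Because deviations are squared before averaging, a few poorly matched positions, such as those at the distant strand ends, raise the RMSD disproportionately.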

In contrast to the LAGLIDADG enzymes, which have a relatively compact structure in which DNA-binding and catalytic activities are intimately connected, the HNH and GIY-YIG homing endonuclease families have been shown by sequence analyses and by structural comparisons to display bipartite structures with separable catalytic and DNA-binding domains (Dalgaard et al. 1997; Derbyshire et al. 1997; Van Roey et al. 2001, 2002; Sitbon and Pietrokovski 2003; Shen et al. 2004). These enzymes often share common DNA-binding domain structures, which may indicate a common ancestral origin for a useful and reusable binding domain. For example, both the GIY-YIG enzyme I-TevI and the HNH enzyme I-HmuI share a common helix-turn-helix motif at their C-termini that is critical for DNA recognition and binding (Van Roey et al. 2001; Shen et al. 2004). This pattern of swapping structural domains (which are usually part of tandemly arranged functional regions) is generally not observed for the LAGLIDADG family. However, recent analyses of homing endonuclease sequence alignments indicate that, in rare cases, the core fold of LAGLIDADG enzymes can be tethered to additional functional domains involved in DNA binding, usually termed NUMODs (nuclease-associated modular DNA-binding domains; Sitbon and Pietrokovski 2003). For example, a single copy of a canonical NUMOD1 region is found downstream (C-terminal) from the LAGLIDADG core of the intron-associated gene product of ORF Q0255 in yeast. This motif is similar to a conserved region of the bacterial sigma54-activator DNA-binding protein, and its C-terminal 15 amino acids are also similar to the N-terminal helix of typical helix-turn-helix (HTH) DNA-binding domains (Wintjens and Rooman 1996). In HTH domains, this helix is responsible for sequence-specific interactions with DNA.
