Much of our understanding of His-Cys box homing endonuclease structure and function is derived from studies of I-PpoI. Although the very high sequence divergence within the family makes it difficult to generalize the information garnered from these studies, it is likely that common themes of protein structure stabilization, DNA recognition and catalytic mechanism exist.
3.1 Structure of a His-Cys Box Endonuclease: I-PpoI
Currently, I-PpoI is the only member of the His-Cys box family of homing endonucleases with a known structure (Flick et al. 1998). This small enzyme (163 aa) forms a stable homodimer, and X-ray structures have been solved for both the DNA-bound and apo forms of the protein. I-PpoI displays a fold of mixed α/β topology with dimensions of 25 × 35 × 80 Å (Fig. 2A). The extended length of the dimer allows the protein to interact fully with its 18-bp DNA homing site. Each monomer of I-PpoI contains three antiparallel β-sheets flanked by two long α-helices and a long carboxy-terminal tail. The folded structure is stabilized by two zinc ions that are 15 Å apart in the monomer. The central interface of the enzyme dimer is small and highly solvated, with subunit contacts that bury only 700 Å² of surface area. The C-terminal tails (residues 146-163) are domain-swapped and extend 34 Å across opposite monomers, burying an additional 900 Å² per subunit (Fig. 2C).
Fig. 2. (a) The structure of I-PpoI bound to its DNA homing site. (b) Close-up of the two I-PpoI zinc-binding sites created by conserved histidine and cysteine residues. (c) Sequence alignment of the known active His-Cys box homing endonucleases (I-DirI, I-NitI, I-NgrI, I-NjaI, I-NanI and I-PpoI). Conserved residues are indicated in bold. * designates residues involved in zinc binding. @ designates active site residues. The C-terminal dimerization domain of I-PpoI is underlined.
Eight of the residues conserved among the known His-Cys box homing endonucleases are involved in zinc coordination (Fig. 2B, C). In contrast to other zinc-binding motifs (e.g. RING finger) in DNA-binding proteins that are primarily associated with DNA recognition, the I-PpoI zinc-binding motifs play a central role in stabilizing the folded structure of the enzyme. In fact, the zinc ions appear to substitute for a tightly packed hydrophobic core, allowing this rather small protein to have an extended footprint for DNA binding. The first bound zinc ion is coordinated by a cluster of three cysteine ligands and one histidine ligand. One side chain (Cys 41) is contributed by an amino-terminal β-strand (β2) and the remaining three side chains (Cys 100, Cys 105 and His 110) are donated from a short loop between β7 and β8. The second zinc ion is coordinated by a short cluster of four side chains (Cys 125, Cys 132, His 134 and Cys 138).
The remaining conserved residues of the His-Cys box homing endonucleases are positioned in the active site of the enzyme (Fig. 2C). A conserved asparagine (Asn 119) coordinates a divalent metal ion that in turn contacts the 3' hydroxyl of the cleaved DNA and four bound water molecules. The metal is positioned to interact with the scissile phosphate and cannot position a water molecule for in-line nucleophilic attack. Instead, a conserved histidine residue (His 98) is positioned to activate a water molecule.
As noted, the zinc-binding and active site residues found in the I-PpoI structure are conserved among the His-Cys box family members. However, outside of these sequences, the family is highly divergent (Fig. 2C). Most of the other His-Cys box enzymes contain an additional long N-terminal extension that makes them nearly 50% larger than I-PpoI. Also, the C-terminal tail of I-PpoI that serves as a dimerization motif does not appear to be conserved in most of the other His-Cys box enzymes. As all these enzymes cleave nearly symmetric homing sites (see below for details), it is likely that they also dimerize but have evolved alternate means of dimer stabilization.
The characterized His-Cys box endonucleases recognize extended (up to 20 nt) pseudopalindromic homing sites. Members of the family cleave to generate 4-5 nt 3' overhangs, indicating cleavage across the minor groove. Unlike restriction sites, homing sites may vary in sequence at many individual nucleotide positions while still being recognized and cleaved. Site preference for I-PpoI was explored by an in vitro selection strategy using a plasmid library containing partially randomized cleavage sites (Argast 1998). The selected sites revealed that I-PpoI tolerates base-pair substitutions at several positions within the homing site, with some positions being recognized more stringently than others.
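The notion of a pseudopalindromic site can be made concrete with a small sketch. The code below compares a site with its own reverse complement and reports which positions break the twofold symmetry. The two example sequences are hypothetical and chosen only for illustration; they are not the I-PpoI homing site, and the function names are not taken from any published tool.

```python
from __future__ import annotations

# Watson-Crick complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def palindrome_mismatches(site: str) -> list[int]:
    """0-based positions in the left half that break palindromic symmetry.

    Position i pairs with position len(site) - 1 - i, so scanning the
    left half counts each symmetry-related pair once. For odd-length
    sites the central base is ignored.
    """
    site = site.upper()
    rc = reverse_complement(site)
    return [i for i in range(len(site) // 2) if site[i] != rc[i]]


def symmetry_fraction(site: str) -> float:
    """Fraction of symmetry-related position pairs that are complementary."""
    half = len(site) // 2
    return 1.0 - len(palindrome_mismatches(site)) / half


if __name__ == "__main__":
    perfect = "GGTCTTAAGACC"  # hypothetical perfect palindrome
    pseudo = "GGTCTTAAGGCC"   # hypothetical site with one asymmetric pair
    for name, site in (("perfect", perfect), ("pseudo", pseudo)):
        print(name, palindrome_mismatches(site), symmetry_fraction(site))
```

A perfect palindrome returns no mismatches, while the second sequence reports a single asymmetric pair, which is the sense in which a homing site can be nearly but not exactly symmetric.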
I-PpoI was crystallized with a 20-nt DNA duplex containing a perfectly palindromic variant of its recognition sequence (Flick et al. 1998). The twofold symmetry of the recognition site is mirrored in the homodimer structure of the enzyme, with each monomer contacting one half of the palindrome. I-PpoI employs an antiparallel β-sheet motif that adopts a curvature and twist complementary to the DNA and forms an extended interface with the DNA major groove. The enzyme places five alternating side chains from the second β-sheet (β3-β4-β5) of each enzyme monomer in the major groove of each DNA half-site to contact base pairs 5-9 (Fig. 3). Additional DNA contacts are made in the center of the complex within the minor groove and with several phosphate groups in the cleavage site.
Fig. 3. DNA recognition by I-PpoI. (a) Close-up of the antiparallel β-strands in the major groove of the recognition half-site. (b) Schematic representation of contacts made by I-PpoI in a recognition half-site with the sequence-specific hydrogen bond donors and acceptors presented in the major groove. Where alternate base pairs can be recognized by I-PpoI, both nucleotide identities are shown.
Unlike most restriction endonucleases, I-PpoI does not fully read the DNA by contacting all the possible hydrogen bond donors and acceptors presented by the homing sequence (Fig. 3). It directly satisfies only 8 of 24 hydrogen-bonding possibilities in the major groove (Jurica and Stoddard 1999). These limited contacts explain much of the variability allowed within I-PpoI's target site. Positions where I-PpoI makes multiple contacts with a given base pair were highly conserved for a single base-pair orientation in the homing site selection experiments (Argast 1998). For example, the bipartite contact between Gln 63 and Ade +6 observed in the structure nicely explains why an adenine was selected at this position. Conversely, positions with a single protein contact were found with two of the four possible base-pair identities.
The screen for homing site variants revealed an absolute requirement for A-T base pairs in the central four positions of the homing site (Argast 1998). However, no sequence-specific contacts between protein and DNA were observed in this region in the co-crystal structures (Flick et al. 1998; Galburt et al. 2000). The preference for these base pairs probably reflects the DNA sequence requirements of the severely bent conformation of the target across these base pairs. The distortion results in a compaction of the major groove and an expansion of the minor groove such that, at the center of the homing site, the minor groove is 5 Å wider than the major groove. The bent conformation increases the buried surface between protein and DNA, thus stabilizing the complex. At the same time, the bend results in the proper positioning of the substrate phosphodiester bonds relative to the two active sites. I-PpoI distorts the DNA substrate to widen the minor groove, thus separating the two scissile phosphates and facilitating the positioning of two sets of active site residues across the usually narrow minor groove.
Does I-PpoI induce bending or selectively bind bent DNA? Circular permutation experiments suggest that the homing site does not exhibit a stable kink; however, the structure of I-PpoI in the apo form is virtually unchanged from the DNA-bound form, indicating that the protein does not bind unbent B-form DNA (Galburt et al. 2000). One residue, Leu 116, is positioned near the DNA bend in the complex crystal structure and makes an edge-on contact to Ade +2 and a face-on contact to Gua +3. Substitution of an alanine at this position (L116A) has a dramatic effect on both binding and catalysis (Galburt et al. 2000). The structure of an L116A/DNA complex is similar to the wild-type complex except that the DNA is less bent and the central AATT base pairs are disordered. This suggests that the surface area buried by Leu 116 and the nucleotides bracketing the bend is crucial for stable binding of the deformed conformation of the DNA.
It is interesting to note that a subfamily of His-Cys box endonucleases, typified by I-NjaI, has been shown to generate 5-base, 3' overhangs (Elde et al. 1999). In these cases, the homing site scissile phosphates would be further apart in unperturbed B-form DNA (15 Å instead of 10 Å), and the corresponding homing endonucleases do not contain a residue that corresponds to Leu 116 in I-PpoI. This increased spacing of active sites and the corresponding lack of a leucine residue make it likely that the DNA-binding mode of these enzymes differs from the mode observed for I-PpoI. In conclusion, even homing endonucleases within the same family appear to have evolved a variety of DNA-binding modes and active site chemistries that accomplish the same biological function.
Enzymatic cleavage by I-PpoI has been studied both biochemically and structurally. I-PpoI cleaves its target site with a kcat/Km of 10⁸ M⁻¹ s⁻¹ and is activated by many divalent metal ions (in order of activity: Mg²⁺ > Mn²⁺ > Ca²⁺ = Co²⁺ > Ni²⁺ > Zn²⁺; Lowery et al. 1992; Ellison and Vogt 1993; Wittmayer and Raines 1996). A series of I-PpoI-DNA complex structures has been solved along the nucleolytic reaction pathway (Galburt et al. 1999). The enzyme-substrate (ES) complex of I-PpoI has been trapped in two ways: by substitution of a monovalent cation for an activating divalent cation in the active site, and by substitution of alanine for the His 98 general base (H98A). Both of these substitutions inhibit DNA cleavage. The enzyme-product (EP) complex has been solved in the presence of magnesium.
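To give a feel for what a kcat/Km of 10⁸ M⁻¹ s⁻¹ implies, the following back-of-the-envelope sketch uses the standard low-substrate limit of Michaelis-Menten kinetics, in which the rate is approximately (kcat/Km)[E][S] and substrate decays with an observed rate constant of (kcat/Km)[E]. The enzyme concentration of 1 nM is a hypothetical value chosen for illustration, and the function names are not from any published analysis.

```python
import math

# Specificity constant reported in the text for I-PpoI (M^-1 s^-1).
KCAT_OVER_KM = 1e8


def pseudo_first_order_rate(enzyme_molar, kcat_over_km=KCAT_OVER_KM):
    """Observed rate constant (s^-1) for substrate decay when [S] << Km.

    In this limit v = (kcat/Km) * [E] * [S], so the substrate decays
    exponentially with k_obs = (kcat/Km) * [E].
    """
    return kcat_over_km * enzyme_molar


def substrate_half_life(enzyme_molar, kcat_over_km=KCAT_OVER_KM):
    """Half-life (s) of the substrate under the same assumption."""
    return math.log(2) / pseudo_first_order_rate(enzyme_molar, kcat_over_km)


if __name__ == "__main__":
    enzyme = 1e-9  # hypothetical enzyme concentration: 1 nM
    print(f"k_obs = {pseudo_first_order_rate(enzyme):.3g} s^-1")
    print(f"t_1/2 = {substrate_half_life(enzyme):.3g} s")
```

Under these assumptions, 1 nM enzyme would give an observed rate constant of about 0.1 s⁻¹ and a substrate half-life of roughly 7 s. The estimate applies only when the substrate concentration is well below Km.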
In the structure of the ES complex, the bound Mg²⁺ ion is six-coordinate, bound by the side-chain oxygen of Asn 119, the bridging 3' oxygen and a single non-bridging oxygen atom of the scissile phosphate, and three well-ordered water molecules. An ideal octahedral coordination geometry is observed in the cleaved product complex. In contrast, both structures of the trapped substrate complex show that the bond angle from the non-bridging phosphate oxygen, through the bound cation, to the 3' bridging oxygen is significantly strained.
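One simple way to quantify this kind of strain is to measure how far the ligand-metal-ligand angles depart from the ideal octahedral values of 90° (cis) and 180° (trans). The sketch below does this for a set of six ligand positions. The coordinates are hypothetical (an ideal octahedron with 2.1 Å bonds, and a copy with one ligand tilted by 20°); they are not taken from the I-PpoI structures.

```python
import math
from itertools import combinations


def angle_deg(metal, a, b):
    """Angle a-metal-b in degrees, from Cartesian coordinates."""
    va = [x - m for x, m in zip(a, metal)]
    vb = [x - m for x, m in zip(b, metal)]
    dot = sum(p * q for p, q in zip(va, vb))
    norm_a = math.sqrt(sum(p * p for p in va))
    norm_b = math.sqrt(sum(q * q for q in vb))
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cosine))


def max_octahedral_deviation(metal, ligands):
    """Largest deviation (degrees) of any ligand-metal-ligand angle
    from the nearest ideal octahedral value (90 or 180 degrees)."""
    worst = 0.0
    for a, b in combinations(ligands, 2):
        theta = angle_deg(metal, a, b)
        ideal = 90.0 if theta < 135.0 else 180.0
        worst = max(worst, abs(theta - ideal))
    return worst


if __name__ == "__main__":
    d = 2.1  # hypothetical metal-ligand distance in angstroms
    metal = (0.0, 0.0, 0.0)
    ideal = [(d, 0, 0), (-d, 0, 0), (0, d, 0), (0, -d, 0), (0, 0, d), (0, 0, -d)]
    tilt = math.radians(20.0)  # hypothetical distortion of one ligand
    strained = [(d * math.cos(tilt), d * math.sin(tilt), 0.0)] + ideal[1:]
    print(max_octahedral_deviation(metal, ideal))
    print(max_octahedral_deviation(metal, strained))
```

With real data, the ligand positions would come from the refined coordinates of the substrate and product complexes, and the substrate complex would be expected to show the larger deviation.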
The single bound metal ion in I-PpoI appears to serve three distinct roles in catalysis. First, direct interaction of the bound metal ion with the scissile phosphate indicates that Mg²⁺ stabilizes the phosphoanion intermediate and the 3' hydroxylate leaving group. Second, a water molecule in the inner coordination sphere of the metal is appropriately positioned to donate a proton to the 3' hydroxyl leaving group. The metal ion decreases the pKa of this water molecule and accelerates proton transfer. Finally, the bound metal forms a geometrically strained octahedral complex with surrounding protein, DNA and solvent atoms that is relaxed after DNA bond cleavage.
3.3.2 Alignment and Activation of the Hydrolytic Water
In both ES complexes, an ordered water molecule is observed positioned for in-line attack on the scissile phosphate (Fig. 4A, B). Thus, the appearance and position of the bound solvent molecule are not simply a result of the H98A mutation, but are consistent features of the uncleaved enzyme-DNA complex. The δN of His 98 is directly hydrogen-bonded to the water molecule. This histidine appears to act as a general base by activating the water molecule and may also participate in stabilization of the phosphoanion transition state.
Fig. 4. I-PpoI active site structures in both substrate and product states. (a) Substrate form trapped by substituting sodium for magnesium. (b) Substrate form trapped by H98A mutation. (c) Product form.
Because the observed nucleophilic water molecule is not associated with a bound cation or any other electrophilic group, its pKa is likely to be higher than that of the metal-bound water nucleophiles that are postulated for enzymes such as BamHI or EcoRV. However, the pKa of an uncharged histidine residue is only about 6, so it would seem likely that such a side chain must be rendered a stronger base through an interaction with a hydrogen bond acceptor in order to deprotonate a water molecule effectively. In both substrate complexes the backbone carbonyl oxygen of Cys 105 is 2.8 Å from the His 98 εN, and is positioned to form a linear hydrogen bond. The structure of the Serratia nuclease displays a similar active site architecture (Miller and Krause 1996). Here, the δN of His 89, the putative general base, is similarly stabilized by Asn 106. However, there are currently no reported pH-dependence studies of the chemical step of the I-PpoI reaction, nor has the importance of this interaction been experimentally tested by mutation of Asn 106 in Serratia nuclease or by measurement of the pKa of His 98 in I-PpoI.
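The pKa argument above can be illustrated with a rough thermodynamic estimate. For proton transfer from a donor to a base, the equilibrium constant is 10 raised to the difference between the pKa of the base's conjugate acid and the pKa of the donor. The sketch below applies this to histidine and water. The histidine pKa of 6 comes from the text; the water pKa of 15.7 is a commonly quoted textbook value assumed here, and the one-unit pKa shift is purely hypothetical.

```python
def proton_transfer_equilibrium(pka_conjugate_acid_of_base, pka_donor):
    """Equilibrium constant for: donor-H + base <-> donor(-) + base-H(+).

    K = 10 ** (pKa of the base's conjugate acid - pKa of the donor).
    """
    return 10.0 ** (pka_conjugate_acid_of_base - pka_donor)


if __name__ == "__main__":
    PKA_HIS = 6.0     # approximate value quoted in the text
    PKA_WATER = 15.7  # assumed textbook value for bulk water
    k_unperturbed = proton_transfer_equilibrium(PKA_HIS, PKA_WATER)
    # Hypothetical: hydrogen bonding raises the histidine pKa by one unit.
    k_shifted = proton_transfer_equilibrium(PKA_HIS + 1.0, PKA_WATER)
    print(f"K (pKa 6)  = {k_unperturbed:.3g}")
    print(f"K (pKa 7)  = {k_shifted:.3g}")
    print(f"fold change = {k_shifted / k_unperturbed:.3g}")
```

With these assumed values the unperturbed equilibrium constant is on the order of 10⁻¹⁰, and each unit increase in the histidine pKa improves it tenfold, which is why an interaction that makes the imidazole a stronger base would be expected to matter.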
3.3.3 Conformational Changes and Transition State Stabilization
A series of conformational changes is observed in the active site as a result of DNA bond cleavage (Fig. 4C). The free 5'-phosphate moves by over 2.5 Å from its position in the substrate complex and forms a 2.8-Å electrostatic bond with a guanido nitrogen of Arg 61, which moves by ~0.5 Å. The movement of the 5'-phosphate disrupts the interaction between its non-bridging oxygen and the bound metal ion. A fourth well-resolved water molecule is added to the inner metal coordination sphere, which assumes a more ideal octahedral geometry. The metal ion does not move significantly upon cleavage, and maintains interactions with Asn 119 and the 3' oxygen leaving group of the cleaved phosphodiester bond.
These structures indicate that the phosphoanion transition state is stabilized through contacts with the bound metal ion and the imidazole ring of His 98. The His 98 contact exists in the ES complex as a polar interaction with the hydrolytic water molecule and is maintained with the free 5'-phosphate group in the EP complex. Arg 61 does not appear to play a role in transition state stabilization, because the distance from this side chain to the scissile phosphate before bond cleavage, at 5.5 Å, is too long. Arg 61 does, however, appear to stabilize the final product complex, and thus may help to drive the reaction forward by inhibiting re-ligation on the enzyme.
All the important active site residues are conserved among the His-Cys box endonucleases, suggesting that even though their overall folded structure might differ, their active site architectures will be very similar. In addition to members of the His-Cys box family, there are examples of other nucleases with similar active site architectures. We have already mentioned the Serratia nuclease, but a recent structure of an HNH homing endonuclease family member reveals a similar active site geometry and chemistry.