Architecture of Inteins: A Two-Domain Organization

Despite their divergent primary sequences, the structures of PI-SceI and PI-PfuI (Fig. 1) reveal a common architecture that consists of two domains with independent functions. In PI-SceI, the splicing domain (residues 1-180 and 411-454) is formed by segments from the amino- (N-) terminal and carboxy- (C-) terminal ends and contains the residues involved in protein splicing. The endonuclease domain (residues 181-410) harbors the catalytic sites involved in double-stranded DNA cleavage. This domain is also present in intron-encoded homing endonucleases. The functional independence of the two domains was demonstrated by deleting the endonuclease domain in PI-SceI and PI-MtuI, the recA intein of Mycobacterium tuberculosis; the deletion did not cause a significant reduction in protein-splicing activity in vivo, although the kinetics of the reaction were not determined (Chong and Xu 1997; Derbyshire et al. 1997). Virtually all inteins that are observed to be associated with homing endonucleases belong to the LAGLIDADG family (the exceptions are the Ssp GyrB and Npu GyrB inteins of cyanobacteria, which are associated with an HNH endonuclease domain). These inteins should share a common architecture with PI-SceI and PI-PfuI.
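The domain boundaries quoted above for PI-SceI can be written down as a small lookup, which makes the discontinuous nature of the splicing domain explicit. The following is a minimal sketch, not a description of the published deletion constructs: the names `PI_SCEI_DOMAINS`, `domain_of`, and `excise_endonuclease` are hypothetical, residues are numbered from 1 with inclusive ranges as in the text, and removing exactly residues 181-410 is an assumption made for illustration only.

```python
# Residue ranges for PI-SceI as quoted in the text (1-based, inclusive).
# The dictionary and function names are illustrative, not from any library.
PI_SCEI_DOMAINS = {
    "splicing": [(1, 180), (411, 454)],   # N- and C-terminal segments
    "endonuclease": [(181, 410)],
}


def domain_of(residue, domains=PI_SCEI_DOMAINS):
    """Return the name of the domain containing `residue`, or None."""
    for name, segments in domains.items():
        for start, end in segments:
            if start <= residue <= end:
                return name
    return None


def excise_endonuclease(sequence, domains=PI_SCEI_DOMAINS):
    """Return `sequence` without the residues assigned to the endonuclease domain.

    This only joins the two splicing-domain segments; real deletion
    constructs may use different boundaries or add linker residues.
    """
    return "".join(
        aa
        for position, aa in enumerate(sequence, start=1)
        if domain_of(position, domains) != "endonuclease"
    )
```

For example, `domain_of(100)` and `domain_of(430)` both fall in the splicing domain even though they are separated in sequence by the entire endonuclease domain.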

The free-standing HO endonuclease, which is responsible for the mating-type switch in Saccharomyces cerevisiae, has a strong sequence similarity (50%) to PI-SceI. A model of this protein has been built using the PI-SceI structure as a template (Bakhrat et al. 2004). The HO endonuclease contains all the motifs that typify inteins but does not auto-splice, presumably because of mutations in essential residues in the active site of the intein domain (experiments to recover splicing activity by reverting these mutations have not been reported). In addition, it contains a 53-residue Zn-finger motif at its C-terminal end that is essential for endonuclease activity.

Fig. 1. Domain architecture of LAGLIDADG homing endonucleases: inteins (above) and intron-encoded homing endonucleases (below). The structures depicted represent the unbound state, except for I-SceI, for which only the DNA-bound structure has been determined. The domains and the β-saddle DNA-binding motif are labeled in PI-SceI. Two active sites are located at the end of the LAGLIDADG helices.

The two-domain architecture of intein-associated homing endonucleases and the functional independence of their domains suggest that they originated by the invasion of a homing endonuclease gene into a gene that encoded a self-splicing mini-intein. In this association, both components would benefit from each other (Duan et al. 1997). The intein provides a convenient means of excising itself from the host protein without causing deleterious effects, while the homing endonuclease function allows for their mobility and persistence. Consistent with this idea, an artificial bifunctional intein was made by inserting a homing endonuclease, the intron-encoded I-CreI, into the GyrA mini-intein of Mycobacterium xenopi (Fitzsimons Hall et al. 2002).

2.1 The Splicing Domain

The splicing domain resembles a horseshoe and consists almost exclusively of β-strands that have a two-fold pseudo-symmetric arrangement (Fig. 1). This suggests that self-splicing inteins originated from the duplication and fusion of two genes that encoded proteins with cleavage activity (Hall et al. 1997). The splicing domain was later shown to have the same fold as the Hedgehog C-terminal autoprocessing domain from Drosophila melanogaster (PDB entry 1AT0; Hall et al. 1997), which indicated a common ancestor of both inteins and Hedgehog signaling domains. This fold is now termed the Hint module (for Hedgehog and intein; see Dassa and Pietrokovski, this Vol.). The structures of the Mxe GyrA mini-intein of Mycobacterium xenopi, which lacks an endonuclease domain (PDB entry 1AM2; Klabunde et al. 1998), and of the splicing domain of the Ssp DnaB intein of Synechocystis (PDB entry 1MI8; Ding et al. 2003) also contain a Hint fold. Recently, new types of Hint domains that harbor protein-splicing activity but are not inteins have been found in bacteria (Amitai et al. 2003). The conserved residues that are involved in protein splicing in PI-SceI (the N-terminal Cys 1, His 79, and the C-terminal His 453 and Asn 454) are located within the hydrophobic core of the Hint domain in close proximity to each other. The corresponding residues in PI-PfuI, the Mxe GyrA mini-intein, and the Ssp DnaB intein are superimposable on those of PI-SceI, indicating that the geometry of the splicing catalytic site is conserved. The protein-splicing domains in PI-SceI and PI-PfuI contain unique subdomain insertions. For example, in PI-SceI, the DNA recognition region (DRR, residues 90-133) is inserted in the N-terminal region of the splicing domain. This subdomain is composed of three antiparallel β-strands and a single α-helix and contains residues that are critical for DNA binding (He et al. 1998). Moreover, the loop that resides at the interface of the splicing and endonuclease domains is also unique to PI-SceI and is likewise involved in DNA binding (Wende et al. 2000). In contrast, PI-PfuI contains a stirrup subdomain (residues 339-414) at the interface between the endonuclease domain and the C-terminal region of the splicing domain. This subdomain contains a three-stranded β-sheet with positively charged residues lining its surface, which suggests a likely DNA-binding activity (Ichiyanagi et al. 2000).

2.2 Endonuclease Domain

The endonuclease domains of PI-SceI and PI-PfuI have an α/β fold with an internal pseudo-two-fold axis (Duan et al. 1997) that is described in detail by Chevalier et al. in the chapter on free-standing LAGLIDADG endonucleases. This domain has also been proposed to have emerged from gene duplication and fusion. It is characterized by the presence of two conserved LAGLIDADG sequence motifs that are located along the two helices that define the interface between the N-terminal and C-terminal subdomains. It is similar to the structures of intron-encoded homing endonucleases such as the homodimeric I-CreI (PDB entry 1AF5; Heath et al. 1997) and I-MsoI (PDB entry 1M5X; Chevalier et al. 2003), in which each monomer provides a LAGLIDADG motif, and the monomeric I-DmoI (PDB entry 1B24; Silva et al. 1999) and I-SceI (Moure et al. 2003), which contain two motifs. The structures of I-CreI and I-SceI are depicted in Fig. 1.
