Introduction

Homing endonucleases are intron- and intein-encoded proteins that initiate the mobility of their host elements. They recognize an intron- or intein-less version of their host gene and introduce a double-strand break in the DNA. The host's cellular machinery then repairs this break by copying the intron- or intein-plus allele. The result is a gene-conversion event whereby both alleles become intron- or intein-plus (Belfort et al. 2002).

Homing endonucleases can be classified into four distinct families, based on the presence of conserved sequence elements: the LAGLIDADG, GIY-YIG, His-Cys box, and HNH families (Belfort et al. 2002). However, the His-Cys box and HNH families have been hypothesized to constitute a single ββα-Me family, and recent structural data support this classification (Kuhlmann et al. 1999; Shen et al. 2004).

The GIY-YIG endonucleases, the second most numerous family and the focus of this chapter, were first identified through the presence of the sequence GIY-X10/11-YIG in intron-encoded proteins of filamentous fungi and bacteriophage T4 (Michel and Dujon 1986; Cummings et al. 1989). As additional sequences became available, it became clear that these motifs were shared with other proteins, including intergenic endonucleases (Belfort and Perlman 1995). A detailed computational analysis of proteins that contain the GIY-YIG motif showed that family members share several additional sequence elements (Kowalski et al. 1999). Together, these elements form a conserved module of 70-100 amino acids, containing up to five distinct sequence motifs (Fig. 1a). These motifs range from 7 to 17 amino acids in length, and all but one include at least one highly conserved residue. Motif A contains the signature GIY and YIG elements, motif B an absolutely conserved Arg residue, motif D a conserved Glu, and motif E a highly conserved Asn. Motif C is less well conserved and is absent in approximately one-third of the proteins.

V. Derbyshire (e-mail: [email protected]) P. Van Roey (e-mail: [email protected])

Wadsworth Center, New York State Department of Health, Center for Medical Sciences, 150 New Scotland Avenue, Albany, New York 12208, USA

Nucleic Acids and Molecular Biology, Vol. 16 Marlene Belfort et al. (Eds.) Homing Endonucleases and Inteins © Springer-Verlag Berlin Heidelberg 2005

[Figure 1: image not available; panel b labels: CATALYTIC DOMAIN, DNA-BINDING DOMAIN]

Fig. 1. a LOGOS (Schneider and Stephens 1990; Henikoff et al. 1995) representation of the GIY-YIG sequence module, based on those presented by Kowalski et al. (1999), but updated with sequences from the literature and sequence databases. Beneath each logo motif is the sequence of I-TevI, with its amino acid position in the protein. b Cartoon representation of GIY-YIG endonuclease I-TevI and its homing site. The enzyme is represented as a red dumbbell. The DNA shown to be contacting the protein in footprinting experiments is white. CS Cleavage site; IS intron insertion site

Work on I-TevI as a model GIY-YIG enzyme has shown that the conserved Arg and Glu residues (Arg 27 and Glu 75) are essential for catalytic activity (Derbyshire et al. 1997; Kowalski et al. 1999). These and other data have led to the idea that the GIY-YIG module forms a "catalytic cartridge" that associates with a variety of DNA-binding domains, giving the individual GIY-YIG endonucleases their sequence specificity (Derbyshire et al. 1997). This concept has been borne out by structural and functional studies showing that these enzymes are assembled from individual modules, like beads on a string.
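The GIY-X10/11-YIG signature through which the family was first identified can be written as a simple pattern search. The following is a minimal sketch, not the method used in the cited analyses: the function name find_giy_yig and the toy sequence are hypothetical, and the pattern assumes a literal GIY, then 10 or 11 residues of any kind, then a literal YIG, in one-letter amino acid codes. Conserved motifs are normally described by logos or profiles that tolerate substitutions, so an exact match of this kind is at best a first-pass filter.

```python
import re

# Assumption: literal "GIY", then 10 or 11 arbitrary residues, then literal "YIG".
GIY_YIG_SIGNATURE = re.compile(r"GIY[A-Z]{10,11}YIG")


def find_giy_yig(protein_seq: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched segment) for each signature hit."""
    seq = protein_seq.upper()
    return [(m.start() + 1, m.group()) for m in GIY_YIG_SIGNATURE.finditer(seq)]


# Hypothetical toy sequence, invented for illustration; it is not a real protein.
toy = "MKS" + "GIY" + "A" * 10 + "YIG" + "KT"
print(find_giy_yig(toy))  # [(4, 'GIYAAAAAAAAAAYIG')]
```

A spacer shorter than 10 or longer than 11 residues yields no hit, which mirrors the X10/11 constraint in the signature.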
