Origin of the Hint Domains

3.1 Features of the Progenitor Hint Domain

Hint families diverged from each other very early in evolution. This is evident from their sequence diversity and phylogenetic dispersion. The presence of Hint families in different phylogenetic domains is a very strong indicator that the Hint progenitor was present before the split between prokaryotes and eukaryotes (Pietrokovski 2001). In the simplest scenario, the progenitor Hint domain did not have additional domains inserted in it, being similar in size to current Hog-Hint and BIL domains (Fig. 4A).

All characterized Hint families apparently catalyze an N-S/O acyl shift of the peptide bond between their N-terminus and host flank (Paulus 2000; Dassa et al. 2004a). This is supported by the presence, in all Hint families, of the N-terminal sequence motifs (N1 and N3) that are necessary for this reaction (Amitai et al. 2003; Fig. 4A). We thus assume that the progenitor Hint domain also had these sequence features and activity. However, the C-termini of current Hint families are diverse in sequence and seem responsible for different reactions. Hence, it is unknown which, if any, of these reactions were present in the progenitor domain.

Current Hint domains have very different known biological roles. Best characterized are inteins, whose role seems limited to selfish propagation of their genes, and Hog-Hint domains, which perform a crucial role in the maturation of their host proteins. Was the progenitor Hint domain a selfish element, or did it have some beneficial role for its host cell? The Hint progenitor domain was probably already a complex fold, catalyzing several reactions. This domain was thus more likely to have arisen by positive selection for some advantageous cellular role than by the non-selective (spontaneous) appearance of a selfish function.

Fig. 4. Structure and active sites of Hint domains. a Two symmetrical subdomains of a Hog-Hint domain (Hall et al. 1997) are shown in red and green, with common Hint active-site residues as balls, and balls with stars marking specific active sites: the Hog-Hint sterol/adduct activating site [1AT0 Asp303], and the intein's C-terminal residue. A linear scheme of a Hint domain shows the subdomains below, with intein motifs as boxes. Beneath the scheme are amino-acid distributions in active sites and corresponding positions of the four Hint families. Each column points to its position in the scheme. b Two Hint subdomains are superimposed (top) and aligned in the linear scheme (bottom). c A representative secondary-structure-element diagram of Hint domains, colored by subdomain, with active sites indicated by balls.

We present two possible beneficial roles for Hint progenitor domains: combinatorial trans-protein-splicing and the generation of protein variability. These roles are not mutually exclusive, and other roles are possible too.

Trans-protein-splicing by split inteins was suggested to confer a selective advantage on early organisms (Perler 1999; Pietrokovski 2001). A protein domain carrying one portion of a split intein could be ligated to any of several other proteins that carry the complementary portion. Moreover, a few short protein regions, each with a C-terminal intein part at its N-terminus and an N-terminal intein part at its C-terminus, would be able to ligate to each other in many different combinations, producing a large mixture of diverse linear and cyclic protein products. These processes would allow a primitive cell, with a small genome coding for relatively few protein domains, to produce many complex proteins without intricate enzymatic systems. Such proteins would enable the emergence of more intricate cells that could support larger genomes directly coding for large proteins.
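
The combinatorial argument can be made concrete with a small counting sketch. It is a toy model under loud assumptions that are not part of the original text: every fragment carries both split-intein parts, any fragment can ligate to any other (or to another copy of itself), each fragment type may be reused, and chains are capped at a hypothetical maximum length. The function names `linear_products` and `cyclic_products` are illustrative only.

```python
from itertools import product


def linear_products(fragments, max_len):
    """Enumerate ordered chains of 1..max_len fragments (repeats allowed).

    Assumption: any fragment end is compatible with any other fragment end.
    """
    return [
        chain
        for k in range(1, max_len + 1)
        for chain in product(fragments, repeat=k)
    ]


def cyclic_products(fragments, max_len):
    """Enumerate distinct rings of 1..max_len fragments.

    A ring forms when the last fragment of a chain ligates back to the first,
    so all rotations of a chain describe the same cyclic product. Each ring is
    stored once, as the lexicographically smallest rotation.
    """
    rings = set()
    for chain in linear_products(fragments, max_len):
        rotations = [chain[i:] + chain[:i] for i in range(len(chain))]
        rings.add(min(rotations))
    return rings


if __name__ == "__main__":
    fragments = ["A", "B", "C"]  # three hypothetical fragment types
    print(len(linear_products(fragments, 3)))  # 3 + 9 + 27 = 39 linear chains
    print(len(cyclic_products(fragments, 3)))  # 3 + 6 + 11 = 20 distinct rings
```

Even with three fragment types and chains of at most three units, this model yields 39 linear and 20 cyclic products, which illustrates how a few coded domains could give rise to a much larger product repertoire.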

The progenitor Hint domain could also enhance protein variability by the processes we suggest for present-day BIL domains. Each protein precursor could be in one of several states, depending on whether its Hint domain spliced, cleaved either end, cleaved both ends, ligated a molecule to one of its ends, or was inert. The different activities could be modulated by some signal or even be stochastic. Once more, this could be of great benefit for early primitive cells that could only code for a limited number of proteins and did not possess complex systems for protein processing and degradation.
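
The stochastic case can be illustrated with a minimal simulation. The outcome names follow the states listed above, but the weights are purely hypothetical placeholders chosen for illustration, not measured or proposed values, and `sample_outcomes` is an illustrative helper rather than an established method.

```python
import random
from collections import Counter

# Outcomes named in the text; the weights are illustrative assumptions only.
OUTCOME_WEIGHTS = {
    "spliced": 0.30,
    "cleaved_N_end": 0.15,
    "cleaved_C_end": 0.15,
    "cleaved_both_ends": 0.10,
    "ligated_adduct": 0.10,
    "inert": 0.20,
}


def sample_outcomes(n_precursors, weights=OUTCOME_WEIGHTS, seed=0):
    """Draw one outcome per precursor molecule and tally the results.

    Each precursor is treated as an independent draw from the same
    distribution, which is a simplifying assumption.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    outcomes = list(weights)
    draws = rng.choices(
        outcomes,
        weights=[weights[o] for o in outcomes],
        k=n_precursors,
    )
    return Counter(draws)


if __name__ == "__main__":
    for outcome, count in sample_outcomes(1000).most_common():
        print(f"{outcome:18s} {count}")
```

Under these assumptions a single coded precursor yields a mixture of product forms in one cell, which is the sense in which a Hint domain could increase protein variability without additional genes.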

3.2 Emergence of the Progenitor Hint Domain

Hint domains have a pseudo-two-fold symmetry, being made up of two tandem subdomains that can be structurally superimposed (Hall et al. 1997). This is also apparent from subtle sequence similarity between the N2 and N4 motifs, present in all Hint families (Pietrokovski 1998). The originally described structural symmetry can be extended to the strand-loop-strand-loop C-terminal region of Hint domains (Amitai and Pietrokovski, unpubl. research). The progenitor Hint domain itself thus arose from a tandem duplication of a primordial subdomain, about 60 residues long, that exchanged short regions within the Hint fold by rearrangements of its gene (Hall et al. 1997). The active-site residues of both inteins and Hog-Hint domains are positioned asymmetrically on the two subdomains (Fig. 4B). For example, the catalytic N-terminal and N3 His residues are present in different subdomains, and each of their corresponding residues on the other subdomain is not a catalytic residue. This is notable since both these and adjacent residues are necessary for catalyzing the N-terminal acyl shift probably present in the progenitor Hint domain. For the primordial Hint subdomain to catalyze the acyl shift reaction, by itself or as a homodimer, the subdomain needed to include all active-site residues. This is not the current situation and thus either the active-site residues were differently distributed in the primordial Hint subdomain, or it was unlikely to catalyze the activities of present Hint domains.

Figure 1 summarizes the likely evolution of Hint domain families from a progenitor Hint subdomain, through the primordial Hint domain, to current Hint families. Each family adopted its activity according to its biological role. This led to the differences in activity, structure and sequence we can observe at present.

Many inteins include an EN domain that seems to be accompanied by a DNA-binding domain. The relationship between the Hint domain and the EN and DNA-binding domains is dynamic. During evolution, intein genes acquire, inactivate, lose and probably reacquire EN and DNA-binding coding regions (Gimble 2000, 2001).

Inteins and A-type BIL domains have a C-terminal motif that allows them to cleave that end and ligate their flanks. However, the ligation reaction and its temporal position relative to cleavage of the C-terminus differ in the two families. This is probably related to the role of protein-splicing in each family and to its required fidelity. Inteins need to protein splice very efficiently to avoid negative selection. Cys/Ser/Thr residues at the +1 position of intein integration points allow such efficient splicing. Intein integration points also need to be in conserved positions of essential proteins, reducing the loss of intein genes due to genomic rearrangements and mutations.

Hog-Hint domains include a conserved Asp/His active-site residue and downstream domains that bind sterols and perhaps other adducts. This bi-domain Hog unit might have the general role of processing its N-terminal flanking domain by cleaving it off while adding an adduct to the resulting C-terminus of the flank. Animal, nematode and red algae Hog domains diverged independently from a common ancestral Hog domain. The biological roles of Hog domains in nematode and red algae are not yet known.

BIL domains mainly differ in the sequence of their C-termini. A-type BIL domains have the same motif present in inteins but are flanked by diverse C-terminal flanking residues. The C-terminal motif of B-type BIL domains is different from that of inteins and A-type BIL domains, and from that of Hog-Hint domains. BIL domains are present in hyper-variable positions of non-conserved proteins in bacteria, and in polyubiquitin-like coding genes of ciliates. Both types of BIL domain are hypothesized to benefit their hosts through various post-translational modifications.

Acknowledgements. This research was supported by The Israel Science Foundation, founded by The Israel Academy of Sciences and Humanities. S. Pietrokovski holds the Ronson and Harris Career Development Chair.

References

Abrahamsen M, Templeton T, Enomoto S, Abrahante J, Zhu G, Lancto C, Deng M, Liu C, Widmer G, Tzipori S, Buck G, Xu P, Bankier A, Dear P, Konfortov B, Spriggs H, Iyer L, Anantharaman V, Aravind L, Kapur V (2004) Complete genome sequence of the apicomplexan, Cryptosporidium parvum. Science 304:441-445

Amitai G, Belenkiy O, Pietrokovski S (2003) Distribution and function of new bacterial intein-like protein domains. Mol Microbiol 47:61-73

Amitai G, Dassa D, Pietrokovski S (2004) Protein-splicing of inteins with atypical glutamine and aspartate C-terminal residues. J Biol Chem 279:3121-3131

Aspock G, Kagoshima H, Niklaus G, Burglin TR (1999) Caenorhabditis elegans has scores of hedgehog-related genes: sequence and expression analysis. Genome Res 9:909-923

Belfort M, Roberts RJ (1997) Homing endonucleases: keeping the house in order. Nucleic Acids Res 25:3379-3388

Butler G, Kenny C, Fagan A, Kurischko C, Gaillardin C, Wolfe KH (2004) Evolution of the MAT locus and its Ho endonuclease in yeast species. Proc Natl Acad Sci USA 101:1632-1637

Caspi J, Amitai G, Belenkiy O, Pietrokovski S (2003) Distribution of split DnaE inteins in cyanobacteria. Mol Microbiol 50:1569-1577

Dalgaard JZ, Klar AJ, Moser MJ, Holley WR, Chatterjee A, Mian IS (1997) Statistical modeling and analysis of the LAGLIDADG family of site-specific endonucleases and identification of an intein that encodes a site-specific endonuclease of the HNH family. Nucleic Acids Res 25:4626-4638

Dassa B, Haviv H, Amitai G, Pietrokovski S (2004a) Protein-splicing and auto-cleavage of bacterial intein-like domains lacking a C'-flanking nucleophilic residue. J Biol Chem 279:32001-32007

Dassa B, Yanai I, Pietrokovski S (2004b) New type of polyubiquitin-like genes with intein-like autoprocessing domains. Trends Genet 20:538-542

Duan X, Gimble FS, Quiocho FA (1997) Crystal structure of PI-SceI, a homing endonuclease with protein-splicing activity. Cell 89:555-564

Duchaud E, Rusniok C, Frangeul L, Buchrieser C, Givaudan A, Taourit S, Bocs S, Boursaux-Eude C, Chandler M, Charles J, Dassa E, Derose R, Derzelle S, Freyssinet G, Gaudriault S, Medigue C, Lanois A, Powell K, Siguier P, Vincent R, Wingate V, Zouine M, Glaser P, Boemare N, Danchin A, Kunst F (2003) The genome sequence of the entomopathogenic bacterium Photorhabdus luminescens. Nat Biotechnol 21:1307-1313

Fsihi H, Vincent V, Cole ST (1996) Homing events in the gyrA gene of some mycobacteria. Proc Natl Acad Sci USA 93:3410-3415

Gimble FS (2000) Invasion of a multitude of genetic niches by mobile endonuclease genes. FEMS Microbiol Lett 185:99-107

Gimble FS (2001) Degeneration of a homing endonuclease and its target sequence in a wild yeast strain. Nucleic Acids Res 29:4215-4223

Gogarten JP, Senejani AG, Zhaxybayeva O, Olendzenski L, Hilario E (2002) Inteins: structure, function, and evolution. Annu Rev Microbiol 56:263-287

Haber J (1998) Mating-type gene switching in Saccharomyces cerevisiae. Annu Rev Genet 32:561-599

Hall TM, Porter JA, Young KE, Koonin EV, Beachy PA, Leahy DJ (1997) Crystal structure of a hedgehog autoprocessing domain: homology between hedgehog and self-splicing proteins. Cell 91:85-97

Hammerschmidt M, Brook A, McMahon AP (1997) The world according to hedgehog. Trends Genet 13:14-21

Hirata R, Ohsumi Y, Nakano A, Kawasaki H, Suzuki K, Anraku Y (1990) Molecular structure of a gene, VMA1, encoding the catalytic subunit of H(+)-translocating adenosine triphosphatase from vacuolar membranes of Saccharomyces cerevisiae. J Biol Chem 265:6726-6733

Isabelle S, Mamadou D, Jean-Michel M (2001) Distribution of GyrA intein in non-tuberculous mycobacteria and genomic heterogeneity of Mycobacterium gastri. FEBS Lett 508:121-125

Janssen R, Prpic NM, Damen WG (2004) Gene expression suggests decoupled dorsal and ventral segmentation in the millipede Glomeris marginata (Myriapoda: Diplopoda). Dev Biol 268:89-104

Jentsch S, Pyrowolakis G (2000) Ubiquitin and its kin: how close are the family ties? Trends Cell Biol 10:335-342

Kang D, Huang F, Li D, Shankland M, Gaffield W, Weisblat DA (2003) A hedgehog homolog regulates gut formation in leech (Helobdella). Development 130:1645-1657

Koonin EV (1995) A protein splice-junction motif in hedgehog family proteins. Trends Biochem Sci 20:141-142

Koufopanou V, Goddard MR, Burt A (2002) Adaptation for horizontal transfer in a homing endonuclease. Mol Biol Evol 19:239-246

Liu XQ (2000) Protein-splicing intein: genetic mobility, origin, and evolution. Annu Rev Genet 34:61-76

Mann R, Beachy P (2004) Novel lipid modifications of secreted protein signals. Annu Rev Biochem 73:891-923

Matsuzaki M, Misumi O, Shin IT, Maruyama S, Takahara M, Miyagishima SY, Mori T, Nishida K, Yagisawa F, Yoshida Y, Nishimura Y, Nakao S, Kobayashi T, Momoyama Y, Higashiyama T, Minoda A, Sano M, Nomoto H, Oishi K, Hayashi H, Ohta F, Nishizaka S, Haga S, Miura S, Morishita T, Kabeya Y, Terasawa K, Suzuki Y, Ishii Y, Asakawa S, Takano H, Ohta N, Kuroiwa H, Tanaka K, Shimizu N, Sugano S, Sato N, Nozaki H, Ogasawara N, Kohara Y, Kuroiwa T (2004) Genome sequence of the ultrasmall unicellular red alga Cyanidioschyzon merolae 10D. Nature 428:653-657

Mills KV, Manning JS, Garcia AM, Wuerdeman LA (2004) Protein-splicing of a Pyrococcus abyssi intein with a C-terminal glutamine. J Biol Chem 279:20685-20691

Nederbragt AJ, van Loon AE, Dictus WJ (2002) Evolutionary biology: hedgehog crosses the snail's midline. Nature 417:811-812

Okuda Y, Sasaki D, Nogami S, Kaneko Y, Ohya Y, Anraku Y (2003) Occurrence, horizontal transfer and degeneration of VDE intein family in Saccharomycete yeasts. Yeast 20:563-573

Paulus H (2000) Protein-splicing and related forms of protein autoprocessing. Annu Rev Biochem 69:447-496

Perler FB (1999) A natural example of protein trans-splicing. Trends Biochem Sci 24:209-211

Perler FB, Olsen GJ, Adam E (1997) Compilation and analysis of intein sequences. Nucleic Acids Res 25:1087-1093

Pietrokovski S (1994) Conserved sequence features of inteins (protein introns) and their use in identifying new inteins and related proteins. Protein Sci 3:2340-2350

Pietrokovski S (1998) Modular organization of inteins and C-terminal autocatalytic domains. Protein Sci 7:64-71

Pietrokovski S (2001) Intein spread and extinction in evolution. Trends Genet 17:465-472

Porter JA, Ekker SC, Park WJ, von Kessler DP, Young KE, Chen CH, Ma Y, Woods AS, Cotter RJ, Koonin EV, Beachy PA (1996) Hedgehog patterning activity: role of a lipophilic modification mediated by the carboxy-terminal autoprocessing domain. Cell 86:21-34

Romanelli A, Shekhtman A, Cowburn D, Muir TW (2004) Semisynthesis of a segmental isotopically labeled protein-splicing precursor: NMR evidence for an unusual peptide bond at the N-extein-intein junction. Proc Natl Acad Sci USA 101:6397-6402

Saves I, Laneelle MA, Daffe M, Masson JM (2000) Inteins invading mycobacterial RecA proteins. FEBS Lett 480:221-225

Shimeld S (1999) The evolution of the hedgehog gene family in chordates: insights from amphioxus hedgehog. Dev Genes Evol 209:40-47

Shingledecker K, Jiang S-Q, Paulus H (1998) Molecular dissection of the Mycobacterium tuberculosis RecA intein: design of a minimal intein and of a trans-splicing system involving two intein fragments. Gene 207:187-195

Southworth MW, Adam E, Panne D, Byer R, Kautz R, Perler FB (1998) Control of protein-splicing by intein fragment reassembly. EMBO J 17:918-926

Southworth MW, Benner J, Perler FB (2000) An alternative protein-splicing mechanism for inteins lacking an N-terminal nucleophile. EMBO J 19:5019-5026

Sun W, Yang J, Liu X-Q (2004) Synthetic two-piece and three-piece split inteins for protein trans-splicing. J Biol Chem 279:35281-35286

Takatori N, Satou Y, Satoh N (2002) Expression of hedgehog genes in Ciona intestinalis embryos. Mech Dev 116:235-238

Vigneron N, Stroobant V, Chapiro J, Ooms A, Degiovanni G, Morel S, van der Brüggen P, Boon T, van den Eynde B (2004) An antigenic peptide produced by peptide splicing in the proteasome. Science 304:587-590

Waters E, Hohn MJ, Ahel I, Graham DE, Adams MD, Barnstead M, Beeson KY, Bibbs L, Bolanos R, Keller M, Kretz K, Lin X, Mathur E, Ni J, Podar M, Richardson T, Sutton GG, Simon M, Soll D, Stetter KO, Short JM, Noordewier M (2003) The genome of Nanoarchaeum equitans: insights into early archaeal evolution and derived parasitism. Proc Natl Acad Sci USA 100:12984-12988

Wu H, Hu Z, Liu X-Q (1998) Protein trans-splicing by a split intein encoded in a split DnaE gene of Synechocystis sp. PCC6803. Proc Natl Acad Sci USA 95:9226-9231

Ziebuhr W, Ohlsen K, Karch H, Korhonen T, Hacker J (1999) Evolution of bacterial pathogenesis. Cell Mol Life Sci 56:719-728
