Reporter Protein Reconstitution

Fluorescent or bioluminescent reporter proteins have proven to be of great use for detecting a variety of cellular activities (Grimm 2004). A typical example is the reporter gene assay with a firefly or Renilla luciferase reporter gene. Activation of transcription factors leads to gene expression of the luciferase, which leads to a detectable light signal in the presence of the substrate, d-luciferin or coelenterazine. Green fluorescent protein (GFP) and its variants are also used as a marker of gene expression in living cells and living subjects (Zhang et al. 2002) because GFPs form their own chromophores without external substrates (Nishiuchi et al. 1998). Reporter gene assays have also been applied to detect protein-protein and protein-DNA interactions, by use of yeast or mammalian two-hybrid assays (Fields and Song 1989; Shioda et al. 2000). One of the drawbacks of the reporter gene assay is that transcription factor activation occurs only in the nucleus, and therefore it is impossible to detect protein-protein interactions outside the nucleus (Ozawa and Umezawa 2002).

To overcome this drawback, our group has developed novel reporters in which protein splicing generates a reconstituted reporter protein from two split protein fragments (Ozawa et al. 2000). One example is reconstitution of split GFP. GFP is composed of eleven β-strands that form a barrel structure, with short α-helices forming lids on each end (Ormo et al. 1996; Yang et al. 1996; Brejc et al. 1997). The fluorescence-active center of GFP is located inside the barrel. When GFP is dissected between amino acid positions 128 and 129, located at the end of the sixth β-strand, the fluorescence is completely lost. However, insertion of the VDE intein at this position results in the ligation of the amino- and carboxy-terminal fragments of GFP by protein splicing (Fig. 2). The ligated fragments fold correctly, form the fluorophore, and recover fluorescence in vitro and in vivo. The basic concept of reconstitution by protein splicing is of general use for any reporter protein. Reconstitution of split firefly luciferase and of split Renilla luciferase by protein splicing has also been demonstrated (Ozawa et al. 2001a; Paulmurugan et al. 2002).

Fig. 2. Reconstitution of split GFP by protein splicing. A single polypeptide, composed of 128 amino acid residues of N-GFP, 454 residues of VDE and 110 residues of C-GFP, undergoes protein splicing and thereby N- and C-terminal fragments of GFP ligate by a peptide bond. The EGFP fragment thus formed folds correctly and its fluorophore is formed


Inteins for Split-Protein Reconstitutions and Their Applications

3.1 Detection of Protein-Protein Interactions

In living cells, protein-protein interactions constitute essential regulatory steps that modulate the activity of signaling pathways. Identification of the interactions and characterization of their physiological significance is one of the main goals of current research in different fields of biology. Towards this goal, several technologies have been developed for detecting protein-protein interactions without the need to disrupt living cells. We describe herein a method for detecting protein-protein interactions based on reporter protein reconstitution and some potential applications of the technique.

The basic principle of detecting protein-protein interactions using the VDE intein is shown in Fig. 3 (Ozawa et al. 2000). An N-terminal fragment of VDE (N-VDE; 1-184 amino acids) is fused to an N-terminal fragment of GFP (N-GFP; 1-128 amino acids), and the C-terminal fragment of VDE (C-VDE; 389-454 amino acids) is fused to the rest of GFP (C-GFP; 129-239 amino acids). Each of these fusion proteins is linked to a protein of interest (protein A) and its target (protein B). When an interaction occurs between the two proteins of interest, the N- and C-VDE fragments are brought into close proximity and undergo correct folding, which induces a splicing event. The N-GFP and C-GFP become linked by a peptide bond, and thereby the mature GFP forms its fluorophore with an emission maximum at 510 nm. The extent of the protein-protein interaction can be evaluated by measuring the magnitude of fluorescence intensity generated by the formation of GFP. In a proof of principle, calmodulin (CaM) and its target peptide (M13) were used to show that their interaction in E. coli results in the formation of fluorescent GFP (Ozawa et al. 2000).
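The fragment boundaries quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original work: the `Fragment` class and the `tiles_parent` helper are hypothetical names introduced for illustration, and only the residue ranges come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """A protein fragment given as an inclusive, 1-based residue range."""
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Inclusive range, so residues 1-128 span 128 residues.
        return self.end - self.start + 1


def tiles_parent(n_frag: Fragment, c_frag: Fragment, parent_length: int) -> bool:
    """True if the two fragments cover residues 1..parent_length with no gap or overlap."""
    return (
        n_frag.start == 1
        and c_frag.start == n_frag.end + 1
        and c_frag.end == parent_length
    )


# Residue ranges as quoted in the text.
N_GFP = Fragment("N-GFP", 1, 128)
C_GFP = Fragment("C-GFP", 129, 239)
N_VDE = Fragment("N-VDE", 1, 184)
C_VDE = Fragment("C-VDE", 389, 454)

if __name__ == "__main__":
    for frag in (N_GFP, C_GFP, N_VDE, C_VDE):
        print(f"{frag.name}: residues {frag.start}-{frag.end}, {frag.length} aa")
    # The GFP halves are contiguous; the VDE halves are not.
    print("GFP halves tile residues 1-239:", tiles_parent(N_GFP, C_GFP, 239))
    print("VDE halves tile residues 1-454:", tiles_parent(N_VDE, C_VDE, 454))
```

Under these ranges the two GFP halves together account for every residue of the reporter, whereas VDE residues 185-388 belong to neither intein fragment.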

On the basis of this design, we have demonstrated a bacterial screening and selection system with several advantages (Ozawa et al. 2001b). Several bacterial one- and two-hybrid systems have been proposed, where there is a common principle: when the proteins interact, they trigger transcriptional activation of a reporter gene to produce a signal protein that is accumulated in the bacteria. Unlike the earlier protein interaction assays, the split-GFP system in bacteria involves the reconstitution of GFP, and does not require that a reporter gene is translated into its protein or that an enzyme substrate be present. This will make the method more generally useful in eukaryotic cells and allow the interactions to be screened in the cytosol, intracellular organelles or at the inner-membrane level.

Fig. 3. A scheme for detecting protein-protein interactions by split GFP reconstitution. Ribbon diagrams of the N-terminal half of VDE (1-184 amino acids) and the C-terminal half of VDE (389-454 amino acids) are connected to the N-GFP and C-GFP fragments, respectively. Interacting proteins A and B are linked to opposite ends of the split VDE. Interaction between protein A and protein B accelerates the folding of N- and C-VDE and protein splicing occurs. The N- and C-GFP are thereby linked together by a peptide bond to yield the GFP fluorescence

Using the same concept as the GFP reconstitution system, a luciferase reconstitution system has been developed for monitoring protein-protein interactions in mammalian cells (Ozawa et al. 2001a). Firefly luciferase oxidizes its substrate d-luciferin, resulting in light emission (bioluminescence), which enables background-free and high-sensitivity detection. X-ray analysis of firefly luciferase revealed that the 3-D structure is folded into two compact domains. The C-terminal portion of the enzyme is separated from the larger N-terminal globular domain by a wide cleft, which is the location of the active site of the enzyme (Fig. 4; Conti et al. 1996). N- and C-terminal fragments of DnaE are connected to N- and C-terminal fragments of firefly luciferase, respectively. The free ends of the DnaE intein fragments are then fused to a pair of proteins of interest and the resulting fusions are expressed in mammalian cells. Upon interaction between the two proteins, the two DnaE fragments are brought close enough to fold and initiate splicing, restoring the luciferase by formation of a peptide bond. Reconstitution of luciferase is monitored by its bioluminescence, the intensity of which is again proportional to the number of interacting protein pairs. As a model system, an interaction between the insulin receptor substrate 1 and the SH2 domain of phosphatidylinositol-3-kinase has been demonstrated by using the split firefly luciferase reconstitution (Ozawa et al. 2001a).

Fig. 4. 3-D structure of the split firefly luciferase. Split firefly luciferase, composed of N-terminal (1-437 amino acids) and C-terminal (438-544 amino acids) domains, does not possess bioluminescence activity. When the N- and C-terminal domains are linked together by protein splicing, its bioluminescence activity can be recovered

Luciferase reconstitution by protein splicing was used to non-invasively monitor protein-protein interactions in living mice (Paulmurugan et al. 2002). The split reporters of firefly luciferase were fused to a pair of test proteins, MyoD and ID, which are known to interact strongly. Both proteins were transiently expressed in cultured cells, and the cells were then implanted onto the backs of living mice. After injection of d-luciferin, the mice were placed in a light-shielding chamber, and photons emitted from the backs of the mice were collected by a cooled CCD camera for a period of 1 min. The MyoD-ID interaction was induced by injecting TNF-α. The TNF-α-induced mouse showed significantly higher luminescence than the mouse not receiving TNF-α.

These split GFP and split luciferase approaches have an advantage in that protein interactions can be monitored anywhere in the cells, whereas the traditional two-hybrid approaches are limited to interactions only in the nucleus. In addition, GFP or luciferase accumulates in a target cell until it degrades, and evidence of the interaction is thereby integrated in the cell. Imaging interacting protein pairs in living subjects may pave the way to functional proteomics in whole animals and provide a new tool for evaluating new pharmaceuticals targeted to modulate protein-protein interactions.

The basic concept of the above approach for detecting protein-protein interactions (Ozawa et al. 2000) was further applied to control protein splicing with a small molecule, rapamycin (Mootz and Muir 2002; Mootz et al. 2003). Rapamycin, a potent immunosuppressive agent, is known to bind a pair of proteins: the FK506-binding protein (FKBP) and the FKBP-rapamycin-associated protein (FRAP; Ho et al. 1996). The FKBP and the binding domain of FRAP (FRB) were used as a pair of protein interaction partners. The proteins were fused to a pair of N- and C-terminal fragments of an artificially split VMA intein (Fig. 5A). Upon addition of rapamycin, protein trans-splicing was triggered and the intein was excised. The two exteins were ligated to produce a new polypeptide with a potentially new function. This on/off switching of protein splicing with rapamycin was confirmed in vitro and in vivo.

In contrast to the trans-splicing based on the FKBP-FRB interaction, Buskirk et al. (2004) created another on/off switching system of protein splicing based on an intramolecular interaction, i.e. protein conformational change. A ligand-binding domain (LBD) of the human estrogen receptor (ER) binds a synthetic small molecule, 4-hydroxytamoxifen (4-HT), with high affinity. The ER was connected to the N- and C-terminal halves of RecA intein, yielding a 424-residue RecA(N)-ER-RecA(C) fusion (Fig. 5B). Using the fusion protein as a template, a cDNA library of mutated genes was obtained by error-prone polymerase chain reaction (PCR). The cDNAs were connected with the split GFP reporter, and screened based on the fluorescence in the presence and absence of 4-HT. A mutant RecA(N)-ER-RecA(C) fusion protein showed 4-HT-dependent activation of protein splicing. Using this intein, a clear-cut post-translational activation of KanR, LacZ, Ade2p and GFP upon addition of 4-HT was demonstrated.

Fig. 5. Scheme for small molecule-induced conditional splicing. a FKBP and FRB are linked to N-VDE and C-VDE, respectively. Interaction between FKBP and FRB induced by rapamycin results in the folding of N- and C-VDE and protein splicing occurs. The N- and C-proteins (exteins) are linked together by a peptide bond. b N- and C-RecA are connected at the N- and C-terminal ends of the ligand-binding domain (LBD) in the estrogen receptor (ER), respectively. The fusion protein contains four mutations, V376A and R521G in LBD, and A34V and H41L in the RecA intein. Upon binding 4-hydroxytamoxifen (4-HT), the ER undergoes a conformational change, the N- and C-RecA fold correctly, and protein splicing occurs. The N- and C-proteins (exteins) are linked together by a peptide bond. As the exteins, KanR, LacZ, Ade2p and GFP were reconstituted by 4-HT-induced protein splicing

3.2 Protein Splicing in Intracellular Organelles

3.2.1 Identification of Organelle-Localized Proteins from cDNA Libraries

One of the most distinct features of eukaryotic cells, especially mammalian cells, is the compartmentalization of each protein. Protein localization is tightly bound to function, such that preferential localization of a protein is often essential for its function. Therefore, functional assays aimed at characterizing the cellular localization of proteins are very important for understanding complicated protein networks. The technique for identifying proteins localized in organelles largely relies on the isolation of compartments through cell fractionation and electrophoresis, combined with mass spectrometry. This biochemical method is useful for systematic identification, but it depends on the yield and purity of the intracellular organelle, and therefore the technique can be problematic for organelles that are difficult to isolate (Westermann and Neupert 2003).

Protein reconstitution technology opens a new avenue of genetic approaches for identifying mitochondrial proteins from large-scale cDNA libraries (Ozawa et al. 2003). The strategy is based on reconstitution of enhanced GFP (EGFP) by protein splicing with the DnaE intein (Fig. 6). A tandem fusion protein, containing a mitochondrial targeting signal (MTS) fused to the C-terminal fragments of the DnaE intein and EGFP, localizes to the mitochondrial matrix in mammalian cells. cDNA libraries generated from mRNAs are genetically fused to a sequence encoding the N-terminal fragments of EGFP and the DnaE intein. If test proteins expressed from the cDNA libraries contain a functional MTS, the fusion products translocate into the mitochondrial matrix, bringing the N- and C-terminal fragments of the DnaE intein close enough to fold correctly. This initiates protein splicing, thus linking the EGFP fragments with a peptide bond. This method has the advantage that only "mitochondria-positive" clones yield a fluorescence signal. These cells can then be isolated by fluorescence-activated cell sorting (FACS). Relevant genes can subsequently be identified by PCR and DNA sequencing. In this work, 27 proteins, including 10 novel proteins, were identified as mitochondrial (Ozawa et al. 2003).

This basic concept for identifying mitochondrial proteins was extended to design a new indicator for identifying proteins that translocate into the endoplasmic reticulum. A tandem fusion protein, containing an endoplasmic reticulum-targeting signal (ERTS) and the C-terminal fragments of DnaE and EGFP, localizes in the lumen of the endoplasmic reticulum. If test proteins expressed from the cDNA library contain an ERTS, the fusion products translocate into the endoplasmic reticulum, generating mature EGFP. Using the same procedure as in the mitochondrial case, the fluorescent cells are collected by FACS, and relevant genes are identified. In this work, 110 non-redundant proteins targeted to the endoplasmic reticulum were identified (Ozawa et al. 2005). An important point of these genetic methods for identifying organelle-localized proteins is that they do not require purification of target organelles or separation of each protein. Therefore, proteins localized in other organelles do not contaminate the results. Moreover, the methods provide accurate identification of gene products that are compartmentalized in the mitochondria or endoplasmic reticulum. This genetic approach will also allow the identification of proteins localized in the nucleus and peroxisomes by replacing the signal sequence attached to the C-terminal fragments of DnaE and EGFP with the targeting signal of each organelle.
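The selection logic shared by the mitochondrial and endoplasmic reticulum screens can be written out as a toy model. This is only an idealized sketch: it assumes that EGFP is reconstituted if and only if the library fusion reaches the compartment where the C-terminal probe resides, it ignores expression level and splicing efficiency, and the `Clone` class, the compartment labels and the gene names are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    """One library clone; `destination` is where its fusion protein ends up."""
    gene: str
    destination: str


def reconstitutes_egfp(clone: Clone, probe_compartment: str) -> bool:
    """Splicing is modelled as possible only when both halves share a compartment."""
    return clone.destination == probe_compartment


def sort_positive(library: list[Clone], probe_compartment: str) -> list[str]:
    """Return the genes an idealized FACS gate on EGFP fluorescence would recover."""
    return [c.gene for c in library if reconstitutes_egfp(c, probe_compartment)]


# Hypothetical library; gene names are placeholders, not real hits.
LIBRARY = [
    Clone("geneA", "mitochondrial_matrix"),
    Clone("geneB", "cytosol"),
    Clone("geneC", "er_lumen"),
    Clone("geneD", "mitochondrial_matrix"),
]

if __name__ == "__main__":
    # Changing the probe's targeting signal changes which clones are recovered.
    print(sort_positive(LIBRARY, "mitochondrial_matrix"))
    print(sort_positive(LIBRARY, "er_lumen"))
```

Swapping the probe compartment is the only change needed to move from one screen to another, which mirrors the idea of replacing the targeting signal on the C-terminal probe.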

Fig. 6. Scheme showing how to identify mitochondrial proteins from cDNA libraries. a Principle for detecting translocation of a test protein into mitochondria using protein splicing of split-EGFP. EGFPc is connected with DnaEc and the mitochondrial targeting signal (MTS), which is predominantly localized in mitochondria. A test protein is connected with the EGFPn and DnaEn, which is expressed in the cytosol. When the test protein translocates into mitochondria, the DnaEn interacts with DnaEc, and protein splicing results. The EGFPn and EGFPc are linked together by a peptide bond, and the reconstituted EGFP recovers its fluorescence. b Strategy for identifying mitochondrial proteins. BNL1MEmito cells, which are expressing EGFPc-DnaEc, are infected with retrovirus libraries. Fluorescent cells are sorted by FACS, and subcloned into individual cells. cDNAs integrated in the genome are extracted by PCR, and identified by DNA sequencing

3.2.2 Detection of Protein Nuclear Transport

Protein movement inside living cells is an important dynamic event in eukaryotic cells. A typical example is protein nuclear transport, which plays a key role in regulating gene expression in response to extracellular signals. In order to non-invasively detect this molecular event in living subjects, a probe molecule using Renilla luciferase (Rluc) reconstitution by protein splicing has been developed (Fig. 7; Kim et al. 2004). Rluc has desirable features as a reporter: it is a monomeric protein of small size (36 kDa), gives strong luminescence, and does not require ATP for its activity (Lorenz et al. 1991). In addition, its substrate, coelenterate luciferin (coelenterazine), rapidly penetrates through cell membranes, making it suitable for in vivo imaging. The luciferase is split into N- and C-terminal fragments, and they are connected to N- and C-terminal fragments of the DnaE intein, respectively. The C-terminal fragment is permanently localized in the nucleus, while the N-terminal fragment is fused to a test protein in the cytosol. If the test protein translocates into the nucleus, the N-terminal Rluc can interact with the C-terminal Rluc in the nucleus, and full-length Rluc is reconstituted by protein splicing. In order to demonstrate the utility of this indicator, a well-known nuclear receptor, androgen receptor (AR), was examined. This receptor translocates from the cytosol into the nucleus upon binding to 5α-dihydrotestosterone (DHT). Quantitative analysis of AR translocation was shown with various exo- and endogenous chemical compounds in vitro. Moreover, the indicator enabled non-invasive in vivo imaging of AR translocation in the brains of living mice with a cooled CCD imaging system. This rapid and quantitative analysis in vitro and in vivo will provide a wide variety of applications for screening pharmacological or toxicological compounds and testing them in living animals (Kim et al. 2004).

Fig. 7. Strategy for detecting protein translocations. a Principle of monitoring translocation of a particular protein (X) into the nucleus using protein splicing of split-Renilla luciferase (Rluc). RLuc-N (1-229 aa) is connected with DnaE-N and the nuclear localization signal (NLS), which is predominantly localized in the nucleus. DnaE-C is connected with RLuc-C (230-311 aa) and a protein X, which is localized in the cytosol. When the protein X translocates into the nucleus, the DnaE-C interacts with DnaE-N, and protein splicing results. RLuc-N and RLuc-C are linked by a peptide bond, and the reconstituted RLuc recovers its bioluminescence activity. b DHT-dependent translocation of AR in the mouse brain. The COS-7 cells expressing the probe were implanted in the forebrain of the mice. Of mouse groups 1-4, groups 1 and 2 were stimulated with 1% DMSO, whereas groups 3 and 4 were stimulated with procymidone and PCB, respectively. After DHT stimulation, the mice were imaged with a cooled CCD camera

3.2.3 trans-Splicing in the Chloroplast in Plant Cells

Split-reporter protein reconstitution by protein splicing with the DnaE intein (Ozawa et al. 2001a, 2003) has been further applied to growing safer transgenic plants. Genetic crossing in transgenic plants by pollen has been reported and is known to be a potential environmental risk (Bergelson et al. 1998). Genetically modified plants with a transgene integrated in the nucleus may be able to hybridize with a sexually compatible species to give rise to unexpected hybrids and their progeny.

Chin et al. (2003) have demonstrated the reconstitution of a herbicide-resistance protein from split genes in tobacco plants. An N-terminal portion of the herbicide-resistance gene 5-enol-pyruvylshikimate-3-phosphate synthase (EPSPS) containing a chloroplast localization signal fused to the N-terminal DnaE intein fragment (EPSPSn-DnaEn) was integrated into the nuclear DNA of a Nicotiana tabacum plant. The remaining EPSPS gene fragment (EPSPSc) fused to the C-terminal DnaE intein fragment (EPSPSc-DnaEc) was integrated into the chloroplast genome by homologous recombination. The full-length EPSPS protein was generated in the chloroplast after translocation of the EPSPSn-DnaEn fusion protein fragment to the plastid followed by trans-splicing. The resulting transgenic plants displayed improved resistance to the herbicide N-(phosphonomethyl)glycine (glyphosate), when compared with wild-type plants.

The key point is that in most crops chloroplasts are maternally transmitted, so pollen-mediated gene transfer is limited to genes in the nucleus. Pollen can therefore spread only the nucleus-based fragment of the split EPSPS gene, which is inactive on its own. Thus, putting one part of a new gene into the chloroplast protects against transfer of the full gene to other plants. This system may be broadly applicable to crop species for transgene containment, and a combination of approaches may prove most effective for environmentally safe transgenic crops.
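The containment argument can be followed step by step in a small model. The sketch below rests on two loudly simplifying assumptions that are not taken from the source: plastids are inherited strictly from the seed parent, and the offspring is treated as carrying the nuclear fragment whenever either parent carries it (a worst case for containment). The `Plant` class and the `cross` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plant:
    nuclear_epsps_n: bool  # EPSPSn-DnaEn fragment in the nuclear genome
    plastid_epsps_c: bool  # EPSPSc-DnaEc fragment in the chloroplast genome

    @property
    def herbicide_resistant(self) -> bool:
        # Full-length EPSPS needs both halves in the same plant for trans-splicing.
        return self.nuclear_epsps_n and self.plastid_epsps_c


def cross(seed_parent: Plant, pollen_parent: Plant) -> Plant:
    """Offspring genotype under strict maternal plastid inheritance (assumption)."""
    return Plant(
        # Worst case: the nuclear fragment is passed on if either parent has it.
        nuclear_epsps_n=seed_parent.nuclear_epsps_n or pollen_parent.nuclear_epsps_n,
        # Plastids come from the seed parent only; pollen contributes none.
        plastid_epsps_c=seed_parent.plastid_epsps_c,
    )


TRANSGENIC = Plant(nuclear_epsps_n=True, plastid_epsps_c=True)
WILD_TYPE = Plant(nuclear_epsps_n=False, plastid_epsps_c=False)

if __name__ == "__main__":
    escaped = cross(seed_parent=WILD_TYPE, pollen_parent=TRANSGENIC)
    print("Carries nuclear fragment:", escaped.nuclear_epsps_n)
    print("Herbicide resistant:", escaped.herbicide_resistant)
```

Even in this worst case, a wild-type plant pollinated by the transgenic line receives only the nuclear half and cannot assemble the full-length enzyme.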

3.3 Screening of Potential Antimycobacterial Agents

The split GFP reporter (Ozawa et al. 2000) has also been applied to the screening of antimycobacterial agents. Inhibitors of protein splicing could become highly specific antimycobacterial antibiotics because, among the many bacteria associated with a human host, only mycobacteria include intein genes in their genomes (Paulus 2003). For example, the RecA DNA repair enzyme of Mycobacterium tuberculosis contains an intein. Protein splicing by the RecA intein restores the DNA-repairing enzyme, which is essential for mycobacterial growth and survival (Davis et al. 1992).

Gangopadhyay et al. (2003) developed a high-throughput screening system for general protein-splicing inhibitors in vitro. The principle is based on split GFP reconstitution by protein splicing by the RecA intein. Insertion of the RecA intein at amino acid residue 129 of a GFP variant (GFPuv) causes the resulting fusion protein to be expressed entirely as inclusion bodies (Fig. 8). When the GFPuv-intein fusion protein is solubilized with urea and renatured, the fusion protein is able to undergo efficient protein splicing to yield GFPuv, leading to formation of the fluorescent chromophore. The formation of fluorescent GFPuv is thus sensitive to the presence of anti-splicing, and therefore potentially antimycobacterial, agents.

Conventional in vivo screening systems have the disadvantage that they involve monitoring the growth of bacteria and are therefore subject to interference by non-specific antibacterial agents. In contrast, this in vitro system examines inhibition of the splicing reaction specifically, and is therefore not susceptible to false signals from compounds that inhibit other reactions essential for growth.
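A fluorescence readout of this kind is usually converted into a percent inhibition for each compound. The sketch below uses a common background-subtracted normalization; it is an assumption for illustration and is not the analysis described by Gangopadhyay et al. (2003). The function name and the example readings are hypothetical.

```python
def percent_inhibition(sample: float, uninhibited: float, background: float) -> float:
    """Percent inhibition of splicing from GFPuv fluorescence readings.

    sample      -- fluorescence of the well containing the test compound
    uninhibited -- fluorescence of a well spliced without any compound
    background  -- fluorescence of a well in which splicing was not induced
    """
    window = uninhibited - background
    if window <= 0:
        raise ValueError("uninhibited signal must exceed background")
    # Fraction of the assay window that the compound still allows.
    remaining = (sample - background) / window
    return 100.0 * (1.0 - remaining)


if __name__ == "__main__":
    # Made-up readings in arbitrary fluorescence units.
    print(percent_inhibition(sample=550.0, uninhibited=1000.0, background=100.0))
```

Values near 0 indicate no effect on splicing and values near 100 indicate complete inhibition; a compound that is itself fluorescent or that quenches GFPuv would distort this number and needs a separate control.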

Fig. 8. In vitro screening system for protein splicing of RecA intein. A fusion protein composed of N-GFPuv, RecA intein, and C-GFPuv is expressed in E. coli, forming inclusion bodies. The inclusion bodies are solubilized with urea, refolded in the presence of zinc, and then possible inhibitors are added. Splicing is induced by addition of EDTA to neutralize zinc inhibition of protein splicing. The fluorescence intensity of the reconstituted GFP is measured. This method eliminates the isolation of compounds that simply interfere with refolding but are not intein-specific
