Rapid screening of novel dehydrogenases

Even when highly reliable computer modeling techniques exist for dehydrogenases, the need for rapid screening of dehydrogenases will remain, both to verify the predictions experimentally and to determine basic kinetic parameters (substrate Km and kcat values) and stereochemical properties. Ideally, screening should be carried out under reaction conditions that mimic the final process as closely as possible. This step is often one of the most time-consuming phases of process development, and improvements here can have significant impacts.
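As an illustration of how Km and kcat might be extracted from initial-rate screening data, the sketch below fits the Michaelis-Menten equation, v = kcat[E][S] / (Km + [S]), using a double-reciprocal (Lineweaver-Burk) linear regression. The function names, the concentrations and the synthetic, noise-free data are assumptions made for this example only; they are not taken from the work described here.

```python
def michaelis_menten_rate(substrate, kcat, km, enzyme_conc):
    """Initial rate v = kcat * [E] * [S] / (Km + [S])."""
    return kcat * enzyme_conc * substrate / (km + substrate)


def fit_lineweaver_burk(substrate_concs, rates, enzyme_conc):
    """Estimate (Km, kcat) from the line 1/v = (Km/Vmax)(1/[S]) + 1/Vmax.

    Ordinary least squares on the reciprocal data, standard library only.
    """
    x = [1.0 / s for s in substrate_concs]
    y = [1.0 / v for v in rates]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # equals Km / Vmax
    intercept = mean_y - slope * mean_x  # equals 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    kcat = vmax / enzyme_conc
    return km, kcat


# Hypothetical example values (assumptions, not measured data):
# Km = 0.5 mM, kcat = 10 per second, enzyme at 0.001 mM.
TRUE_KM, TRUE_KCAT, ENZYME = 0.5, 10.0, 0.001
substrate_concs = [0.1, 0.25, 0.5, 1.0, 2.0, 5.0]  # mM
rates = [michaelis_menten_rate(s, TRUE_KCAT, TRUE_KM, ENZYME)
         for s in substrate_concs]

km_fit, kcat_fit = fit_lineweaver_burk(substrate_concs, rates, ENZYME)
```

The double-reciprocal transform weights the lowest substrate concentrations most heavily, so with noisy experimental data a direct nonlinear fit of the rate equation would usually be the better choice; the linear form is used here only to keep the example free of external dependencies.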

Traditionally, libraries of enzymes - either naturally occurring or deliberately created mutants - are created in bacterial cells and the resulting clones are screened (Fig. 2, left). This is a time-tested strategy, but it suffers from two key problems. First, it is relatively labor-intensive. This is particularly true when the collection consists of wild-type dehydrogenase genes derived from genome sequence databases. These are normally PCR-amplified from the appropriate cells or genomic DNA, cloned into plasmid vectors and sequenced to ensure fidelity prior to screening. In some cases, the expressed proteins are purified prior to screening, while in other cases, whole cells or crude extracts are employed. Limited library sizes due to bacterial transformation efficiencies are the other major disadvantage of this approach. This constraint is particularly important when examining collections of dehydrogenases created by random or semi-random mutagenesis.

[Figure 2 schematic. Left: extract DNA, PCR-amplify full-length gene, clone into plasmid, transform suitable host cells, sequence insert, purify protein. Right: chemically synthesize full-length gene, add to coupled transcription/translation cocktail.]

Figure 2: Comparison of cloning and expression methods. In the conventional strategy (left), dehydrogenase genes obtained by PCR amplification of the original source DNAs are cloned into overexpression plasmids and verified by sequencing. Those with the desired structure are individually transformed into suitable host strains and the proteins are obtained, either as crude extracts or as purified samples. In the proposed streamlined approach (right), full-length dehydrogenase genes obtained by chemical synthesis are used directly in coupled transcription/translation reactions to obtain the proteins of interest.

There has been intense interest in high-throughput methods for protein expression and characterization.16-18 Many of these studies have focused on defining protein-protein interactions or on small molecule binding. Directly screening for enzyme activity is much less common, since this requires proper protein folding and post-translational modifications.19 Fortunately, dehydrogenases rarely require such modifications, which considerably simplifies their production in heterologous expression systems. Martzen et al. pioneered the idea of genome-wide protein libraries by creating a complete set of yeast overexpression strains for every ORF identified in the S. cerevisiae genome.20 We have used a dehydrogenase subset of this collection to profile the substrate selectivity and stereoselectivity patterns of many key yeast reductases21-23 and have used some of them for synthetic applications.24-26 Despite the success of this methodology, the effort involved convinced us that future progress required a new approach, one that is amenable to a larger number of candidate genes and requires significantly less hands-on work.

One way to solve the key problems associated with current dehydrogenase screening methods would be to eliminate the gene cloning and bacterial transformation steps altogether (Fig. 2, right). In this strategy, full-length synthetic dehydrogenase genes would be added to coupled transcription/translation cocktails to produce the correctly folded protein directly.27,28 The genes would be chemically synthesized from genome sequence data, thereby eliminating the need for access to the original source organism. Chemical synthesis would also allow the codons to be optimized for the cell extract used for protein production. If desired, the proteins could be immobilized in situ by appending an affinity tag that binds to a corresponding site on the container's surface. This would allow for different reaction conditions in the activity screening step, which could be conducted in the same container. A variety of analytical methods could be coupled to this system; the only requirement is adequate sensitivity. Enantiomer-specific isotopic labeling could be used to probe stereoselectivity at the same time.29 Proteins with desirable properties could then be cloned from a sample of the full-length synthetic DNAs. Such an approach would be much more rapid than current methods of screening dehydrogenases, and the number of library members would be limited only by the scale of DNA synthesis. Finally, this strategy naturally lends itself to automation and miniaturization, so that customized dehydrogenase "chips" could be produced after computational examination of genomic sequence databases to identify those whose active site structures were most likely to accommodate the substrate of interest.
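The codon optimization step mentioned above can be pictured as a back-translation of the protein sequence using one preferred codon per amino acid. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and the five-entry codon table is an illustrative placeholder rather than a measured codon-usage table for any particular cell extract. A real design would use a complete 20-amino-acid table and would typically also screen for problematic features such as unwanted restriction sites or strong secondary structure.

```python
# Illustrative placeholder table (an assumption for this example): one valid
# codon per amino acid, covering only the residues used below.
ILLUSTRATIVE_PREFERRED_CODONS = {
    "M": "ATG",  # methionine (single codon in the standard genetic code)
    "W": "TGG",  # tryptophan (single codon in the standard genetic code)
    "K": "AAA",  # lysine
    "L": "CTG",  # leucine
    "*": "TAA",  # stop
}


def back_translate(protein, preferred_codons):
    """Return a DNA coding sequence using one chosen codon per residue."""
    missing = sorted(set(protein) - set(preferred_codons))
    if missing:
        raise ValueError(f"no codon defined for residues: {missing}")
    return "".join(preferred_codons[residue] for residue in protein)


# Hypothetical five-residue example, including the stop codon.
coding_sequence = back_translate("MKLW*", ILLUSTRATIVE_PREFERRED_CODONS)
print(coding_sequence)
```

Choosing a single codon per residue is the simplest possible scheme; it is shown only to make the idea concrete, and other strategies (for example, sampling codons in proportion to their usage frequency) are equally compatible with the approach described in the text.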

Given the potential of the strategy described above to dramatically speed biocatalyst discovery and optimization, why has it not already been used? All the required elements have been demonstrated, albeit separately. One practical problem is that the efficiency of synthesizing long DNA strands (>150 nucleotides) is relatively low. This means that full-length genes must be prepared in pieces and assembled later. The cost of gene synthesis is another issue. Total synthesis of an average-sized gene costs approximately $3000 USD. For even a modest-sized collection of 100 dehydrogenases, this would require resources beyond the reach of a single academic laboratory. On the other hand, individual genes need only be synthesized once, since even small-scale methods yield sufficient DNA for a nearly limitless number of transcription/translation reactions. Seen in this light, even at the current costs of gene synthesis, the speed advantage of building libraries directly from genome database data makes such a strategy worthy of consideration as a way to increase the number of candidates for solving chemical problems.
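The arithmetic behind these two obstacles can be made explicit with a short calculation. The sketch below uses the figures quoted in the text (a practical limit of about 150 nucleotides per synthetic strand, roughly $3000 USD per gene, a library of 100 genes). The 1000-nucleotide gene length and the 20-nucleotide overlap between adjacent fragments are assumptions chosen for illustration, and the single-strand tiling is a simplification of real gene assembly schemes.

```python
import math


def fragments_needed(gene_length_nt, max_fragment_nt=150, overlap_nt=20):
    """Minimum number of overlapping fragments needed to span a gene.

    Each fragment after the first contributes (max_fragment_nt - overlap_nt)
    new nucleotides, because overlap_nt is shared with its neighbour.
    """
    if gene_length_nt <= max_fragment_nt:
        return 1
    step = max_fragment_nt - overlap_nt
    return math.ceil((gene_length_nt - overlap_nt) / step)


def library_cost_usd(n_genes, cost_per_gene_usd=3000):
    """Total synthesis cost for a library at a flat per-gene price."""
    return n_genes * cost_per_gene_usd


# Assumed 1000-nt gene, tiled with 150-nt fragments overlapping by 20 nt.
n_fragments = fragments_needed(1000)

# Library of 100 dehydrogenase genes at about $3000 USD each (from the text).
total_cost = library_cost_usd(100)

print(n_fragments, total_cost)
```

Under these assumptions a single gene needs eight fragments, and the 100-gene library costs $300,000 USD, which makes concrete why the text places such a collection beyond a single academic laboratory.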
