Eukaryotic Promoters Are More Complex

The signals in DNA that control transcription in eukaryotic cells are of several types. Two types of sequence elements are promoter-proximal. One defines where transcription is to commence along the DNA, and the other contributes to the mechanisms that control how frequently this event occurs. For example, in the thymidine kinase gene of the herpes simplex virus, which utilizes transcription factors of its mammalian host for gene expression, there is a single unique transcription start site, and accurate transcription from this start site depends upon a nucleotide sequence located 32 nucleotides upstream from the start site (ie, at -32) (Figure 37-7). This region has the sequence TATAAAAG and bears remarkable similarity to the functionally related TATA box that is located about 10 bp upstream from the prokaryotic mRNA start site (Figure 37-5). Mutation or inactivation of the TATA box markedly reduces transcription of this and many other genes that contain this consensus cis element (see Figures 37-7, 37-8). Most mammalian genes have a TATA box that is usually located 25-30 bp upstream from the transcription start site. The consensus sequence for a TATA box is TATAAA, though numerous variations have been characterized. The TATA box is bound by the 34-kDa TATA-binding protein (TBP), which in turn binds several other proteins called TBP-associated factors (TAFs). This complex of TBP and TAFs is referred to as TFIID. Binding of TFIID to the TATA box sequence is thought to represent the first step in the formation of the transcription complex on the promoter.
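The positional rule described above (a TATAAA-like sequence a few tens of base pairs upstream of +1) can be illustrated with a minimal Python sketch. The function name, the search window of -40 to -20, and the toy sequence are all assumptions made for illustration; the toy sequence is not the real thymidine kinase promoter, and exact string matching ignores the many TATA box variants mentioned in the text.

```python
def find_tata_boxes(coding_strand, tss_index, consensus="TATAAA", window=(-40, -20)):
    """Return positions, relative to +1, at which the consensus starts.

    coding_strand: 5'->3' DNA sequence of the coding strand.
    tss_index: 0-based index of the +1 nucleotide in coding_strand.
    window: inclusive range of upstream positions to search (hypothetical default).
    """
    seq = coding_strand.upper()
    hits = []
    for rel in range(window[0], window[1] + 1):
        # There is no position 0: -1 is the base just before +1,
        # so a negative relative position maps to tss_index + rel.
        start = tss_index + rel
        if start < 0:
            continue
        if seq[start:start + len(consensus)] == consensus:
            hits.append(rel)
    return hits


# Made-up promoter: 18 filler bases, TATAAAAG, 24 filler bases, then +1.
toy_upstream = "G" * 18 + "TATAAAAG" + "C" * 24   # 50 bases upstream of +1
toy_gene = toy_upstream + "ACGTACGT"              # +1 is at index 50
print(find_tata_boxes(toy_gene, tss_index=50))    # [-32]
```

In this toy case the element is reported at -32, the same position the text gives for the tk gene. A more realistic scan would score partial matches rather than require the exact hexamer.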

A small number of genes lack a TATA box. In such instances, two additional cis elements, an initiator sequence (Inr) and the so-called downstream promoter element (DPE), direct RNA polymerase II to the promoter and in so doing provide basal transcription starting from the correct site. The Inr element spans the start

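[Figure 37-6 diagram. Top: double-stranded DNA, transcribed from left to right, with coding strand 5'-AGCCCGC...GCGGGCT...TTTTTTTT-3' and template strand 3'-TCGGGCG...CGCCCGA...AAAAAAAA-5'. Bottom: the RNA transcript folded into a hairpin and ending in a run of U residues at its 3' end.]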

Figure 37-6. The predominant bacterial transcription termination signal contains an inverted, hyphenated repeat (the two boxed areas) followed by a stretch of AT base pairs (top figure). The inverted repeat, when transcribed into RNA, can generate the secondary structure in the RNA transcript shown at the bottom of the figure. Formation of this RNA hairpin causes RNA polymerase to pause, and subsequently the ρ (rho) termination factor interacts with the paused polymerase and somehow induces chain termination.
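Why an inverted repeat can fold into a hairpin once transcribed can be checked with a short Python sketch. It uses the two repeat arms shown in the figure (AGCCCGC and GCGGGCT on the coding strand); the function names are hypothetical, and the check covers only base-pairing complementarity, not hairpin stability or the action of the termination factor.

```python
# Watson-Crick complements for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def is_inverted_repeat(left_arm, right_arm):
    """True if the right arm is the reverse complement of the left arm.

    When both arms lie on the same strand, this is the condition for the
    transcribed RNA to base-pair with itself and form a stem.
    """
    return reverse_complement(left_arm) == right_arm.upper()


def transcribe(coding_strand):
    """RNA has the coding-strand sequence with U in place of T."""
    return coding_strand.upper().replace("T", "U")


print(is_inverted_repeat("AGCCCGC", "GCGGGCT"))  # True
print(transcribe("GCGGGCT"))                     # GCGGGCU
print(transcribe("TTTTTTTT"))                    # UUUUUUUU
```

The first line confirms that the two boxed arms can pair with each other in the transcript, and the last shows the run of U residues that follows the stem.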

Figure 37-7. Transcription elements and binding factors in the herpes simplex virus thymidine kinase (tk) gene. DNA-dependent RNA polymerase II binds to the region of the TATA box (which is bound by transcription factor TFIID) to form a multicomponent preinitiation complex capable of initiating transcription at a single nucleotide (+1). The frequency of this event is increased by the presence of upstream cis-acting elements (the GC and CAAT boxes). These elements bind trans-acting transcription factors, in this example Sp1 and CTF (also called C/EBP, NF1, NFY). These cis elements can function independently of orientation (arrows).



Figure 37-8. Schematic diagram showing the transcription control regions in a hypothetical class II (mRNA-producing) eukaryotic gene. Such a gene can be divided into its coding and regulatory regions, as defined by the transcription start site (arrow; +1). The coding region contains the DNA sequence that is transcribed into mRNA, which is ultimately translated into protein. The regulatory region consists of two classes of elements. One class is responsible for ensuring basal expression. These elements generally have two components. The proximal component, generally the TATA box or the Inr or DPE elements, directs RNA polymerase II to the correct site (fidelity). In TATA-less promoters, an initiator (Inr) element that spans the initiation site (+1) may direct the polymerase to this site. Another component, the upstream elements, specifies the frequency of initiation. Among the best studied of these is the CAAT box, but several other elements (Sp1, NF1, AP1, etc) may be used in various genes. A second class of regulatory cis-acting elements is responsible for regulated expression. This class consists of elements that enhance or repress expression and of others that mediate the response to various signals, including hormones, heat shock, heavy metals, and chemicals. Tissue-specific expression also involves specific sequences of this sort. The orientation dependence of all the elements is indicated by the arrows within the boxes. For example, the proximal element (the TATA box) must be in the 5' to 3' orientation. The upstream elements work best in the 5' to 3' orientation, but some of them can be reversed. The locations of some elements are not fixed with respect to the transcription start site. Indeed, some elements responsible for regulated expression can be located either interspersed with the upstream elements or downstream from the start site.

site (from -3 to +5) and consists of the general consensus sequence TCA+1(G/T)T(T/C), which is similar to the initiation site sequence per se. (A+1 indicates the first nucleotide transcribed.) The proteins that bind to Inr in order to direct pol II binding include TFIID. Promoters that have both a TATA box and an Inr may be stronger than those that have just one of these elements. The DPE has the consensus sequence (A/G)G(A/T)CGTG and is located about 25 bp downstream of the +1 start site. Like the Inr, DPE sequences are also bound by the TAF subunits of TFIID. In a survey of over 200 eukaryotic genes, roughly 30% contained a TATA box and Inr, 25% contained Inr and DPE, 15% contained all three elements, while ~30% contained just the Inr.
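Consensus sequences with alternative bases, such as those given above for the Inr and DPE, translate directly into regular expressions, with each position that allows two bases becoming a character class. The following Python sketch assumes that reading of the two consensus sequences; the function name and the toy sequence (including its spacing) are made up for illustration and do not represent a real promoter.

```python
import re

# Each two-base alternative in the consensus becomes a character class.
INR_PATTERN = "TCA[GT]T[TC]"    # T C A(+1) G/T T T/C
DPE_PATTERN = "[AG]G[AT]CGTG"   # A/G G A/T C G T G


def find_motif(pattern, coding_strand):
    """Return 0-based start indices of every match, including overlapping ones.

    A zero-width lookahead is used so that one match does not hide
    another match starting inside it.
    """
    lookahead = re.compile(f"(?=({pattern}))")
    return [m.start() for m in lookahead.finditer(coding_strand.upper())]


# Made-up sequence: an Inr match, ten filler bases, then a DPE match.
toy = "GG" + "TCAGTC" + "A" * 10 + "AGACGTG" + "CC"
print(find_motif(INR_PATTERN, toy))  # [2]
print(find_motif(DPE_PATTERN, toy))  # [18]
```

Exact pattern matching of this kind only flags candidate sites; whether a match functions as an Inr or DPE also depends on its position relative to +1, as the text notes.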

Sequences farther upstream from the start site determine how frequently the transcription event occurs. Mutations in these regions reduce the frequency of transcriptional starts tenfold to twentyfold. Typical of these DNA elements are the GC and CAAT boxes, so named because of the DNA sequences involved. As illustrated in Figure 37-7, each of these boxes binds a protein, Sp1 in the case of the GC box and CTF (or C/EBP, NF1, NFY) in the case of the CAAT box; both bind through their distinct DNA binding domains (DBDs). The frequency of transcription initiation is a consequence of these protein-DNA interactions and of complex interactions between particular domains of these transcription factors (the so-called activation domains, ADs, which are distinct from the DBDs) and the rest of the transcription machinery (RNA polymerase II and the basal factors TFIIA, B, D, E, F) (see below and Figures 37-9 and 37-10). The protein-DNA interaction at the TATA box involving RNA polymerase II and other components of the basal transcription machinery ensures the fidelity of initiation.

Together, then, the promoter and promoter-proximal cis-active upstream elements confer fidelity and frequency of initiation upon a gene. The TATA box has a particularly rigid requirement for both position and orientation. Single-base changes in any of these cis elements have dramatic effects on function by reducing the binding affinity of the cognate trans factors (either TFIID/TBP or Sp1, CTF, and similar factors). The spacing of these elements with respect to the transcription start site can also be critical. This is particularly true for the TATA box, Inr, and DPE.

A third class of sequence elements can either increase or decrease the rate of transcription initiation of eukaryotic genes. These elements are called either enhancers or repressors (or silencers), depending on which effect they have. They have been found in a variety of locations both upstream and downstream of the transcription start site and even within the transcribed portions of some genes. In contrast to proximal and upstream promoter elements, enhancers and silencers can exert their effects when located hundreds or even thousands of bases away from transcription units located on the same chromosome. Surprisingly, enhancers and silencers can function in an orientation-independent fashion. Literally hundreds of these elements have been described. In some cases, the sequence requirements for binding are rigidly constrained; in others, considerable sequence variation is

Figure 37-9. The eukaryotic basal transcription complex. Formation of the basal transcription complex begins when TFIID binds to the TATA box. It directs the assembly of several other components by protein-DNA and protein-protein interactions. The entire complex spans DNA from position -30 to +30 relative to the initiation site (+1, marked by bent arrow). The atomic-level, x-ray-derived structures of RNA polymerase II alone and of TBP bound to TATA promoter DNA in the presence of either TFIIB or TFIIA have all been solved at 3 Å resolution. The structures of TFIID complexes have been determined by electron microscopy at 30 Å resolution. Thus, the molecular structures of the transcription machinery are beginning to be elucidated. Much of this structural information is consistent with the models presented here.



Figure 37-10. Two models for assembly of the active transcription complex and for how activators and coactivators might enhance transcription. Shown here as a small oval is TBP, a subunit of TFIID, and as a large oval the complex that contains all the other components of the basal transcription complex illustrated in Figure 37-9 (ie, RNAP II and TFIIA, TFIIB, TFIIE, TFIIF, and TFIIH). Panel A: The basal transcription complex is assembled on the promoter after the TBP subunit of TFIID is bound to the TATA box. Several TAFs (coactivators) are associated with TBP. In this example, a transcription activator, CTF, is shown bound to the CAAT box, forming a loop complex by interacting with a TAF bound to TBP. Panel B: The recruitment model. The transcription activator CTF binds to the CAAT box and interacts with a coactivator (TAF in this case). This allows for an interaction with the preformed TBP-basal transcription complex. TBP can now bind to the TATA box, and the assembled complex is fully active.

allowed. Some sequences bind only a single protein, but the majority bind several different proteins. Similarly, a single protein can bind to more than one element.

Hormone response elements (for steroids, T3, retinoic acid, peptides, etc) act as, or in conjunction with, enhancers or silencers (Chapter 43). Other processes that enhance or silence gene expression, such as the response to heat shock, heavy metals (Cd2+ and Zn2+), and some toxic chemicals (eg, dioxin), are mediated through specific regulatory elements. Tissue-specific expression of genes (eg, the albumin gene in liver, the hemoglobin gene in reticulocytes) is also mediated by specific DNA sequences.
