Transcription
Transcription (from Latin transcriptio = rewriting, transfer; verb: to transcribe), RNA synthesis, is the first step of gene expression. Apart from certain RNA viruses, it is a DNA-dependent synthesis of ribonucleic acids (deoxyribonucleic acid acts as the template), catalyzed by DNA-dependent RNA polymerase, that produces messenger RNA (mRNA), transfer RNA (tRNA), ribosomal RNA (rRNA; see Fig. 1), and a number of other RNA species. Transcription resembles DNA replication in that both are template-dependent syntheses of nucleic acids from nucleoside triphosphates (2'-deoxyribonucleoside-5'-triphosphates in replication, ribonucleoside-5'-triphosphates in transcription). However, it differs in essential features: in transcription, only one DNA strand (the template strand, also called the codogenic or antisense strand), and only a small part of it, is transcribed into RNA. Moreover, RNA polymerases lack the proofreading capability of DNA polymerases.
During transcription, the four ribonucleoside-5'-triphosphates ATP (adenosine-5'-triphosphate), CTP (cytidine-5'-triphosphate), GTP (guanosine-5'-triphosphate), and UTP (uridine-5'-triphosphate) are used as substrates and are successively incorporated as ribonucleoside monophosphates into the growing RNA chain, which is extended in the 5' to 3' direction. In this way, the nucleotide sequences of genes (DNA) are copied as single RNA chains, i.e., rewritten from DNA to RNA, hence the term transcription.
As the first step in expressing the genetic information encoded in DNA (gene expression), transcription occurs in three phases: initiation, elongation, and termination. Initiation and termination in particular usually require additional proteins alongside RNA polymerase. In prokaryotes, for example, these are the cAMP-binding protein and sigma factors for initiation and the rho factor for termination (see Fig. 2). Elongation is influenced, for example, by the NusA protein. In eukaryotes, a variety of transcription factors mediate the initiation of transcription; their binding to regulatory gene sequences is significantly influenced by the local chromatin organization (euchromatin and heterochromatin), the arrangement of nucleosomes, and DNA methylation.
Regulatory sequences on the DNA that determine the start point and frequency of initiation are called promoters. The binding of general transcription factors at the promoter forms the preinitiation complex; with the binding of RNA polymerase and a nucleoside triphosphate, the initiation complex is formed. The frequency and efficiency of transcription of individual genes or gene groups are regulated by a variety of signal structures. In eukaryotic and viral genes, transcription can be enhanced by so-called enhancers, DNA segments that can occur at almost any position (up to several kb [= kilobases; kilobase pairs] before, within, or after the coding region) and in either orientation relative to the gene. So-called silencers in the flanking region of eukaryotic and viral genes reduce the frequency of initiation. The signal structures mediating chain termination are called terminators (for more information on transcriptional control, see: differential gene expression, gene regulation, response element; antitermination, attenuator regulation, transcriptional antitermination).
Transcription occurs in a complex of DNA, RNA, and RNA polymerase, which is called the transcription bubble because of its characteristic appearance under the electron microscope (see Fig. 3). The products of transcription are single-stranded RNA chains that are complementary to the template strand of the DNA and therefore have the same nucleotide sequence as the non-template (coding) DNA strand, except that they contain uracil instead of thymine and ribose instead of deoxyribose. Through processing, the primary transcripts are converted into the mature, usually shorter RNAs (mRNAs, rRNAs, or tRNAs); in so-called mosaic genes (gene mosaic structure), this processing includes splicing. Eukaryotic mRNAs are additionally modified by capping at the 5' end and polyadenylation at the 3' end.
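The base-pairing relationship between the DNA strands and the transcript can be illustrated with a minimal Python sketch. The function names and the example sequence are hypothetical and chosen only for illustration; the template strand is assumed to be written 3' to 5', so that the resulting RNA reads 5' to 3'.

```python
# Pairing rules: DNA template base -> RNA base, and DNA base -> DNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (5'->3') complementary to a template strand given 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def non_template_strand(template_3_to_5: str) -> str:
    """Return the non-template (coding) DNA strand, written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in template_3_to_5.upper())


template = "TACGGCATT"                 # hypothetical template strand, 3'->5'
rna = transcribe(template)             # "AUGCCGUAA"
coding = non_template_strand(template) # "ATGCCGTAA"

# The transcript matches the non-template strand, with U in place of T.
print(rna, coding, rna == coding.replace("T", "U"))
```

The sketch covers only the sequence relationship; it does not model initiation, termination, or processing.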
In prokaryotes, the transcription of neighboring genes often results in a single RNA chain (polycistronic mRNA), as in the genes of the arabinose operon, galactose operon, and lactose operon. Moreover, translation usually begins while transcription is still in progress (cotranscriptional translation). In eukaryotes, the transcription of nuclear genes is spatially and temporally separated from translation, which takes place in the cytoplasm. The resulting primary transcripts (hnRNA) are processed in the nucleus and transported through the nuclear pore complex into the cytoplasm.
In mitochondria, transcription occurs on both strands of the mtDNA (mitochondrial DNA). The transcripts correspond to the complete sequence of each mtDNA strand and are cleaved by RNases (ribonucleases) into the individual tRNAs, rRNAs, and mRNAs while transcription is still under way (mitochondrial RNA). Transcription in plastids occurs similarly. Many genes of the plastome are organized polycistronically, as in prokaryotes. Some genes show the gene mosaic structure typical of eukaryotes (and also found in the mitochondrial genome), in which exons can lie far apart and on different DNA strands and must be joined by trans-splicing.
Various agents specifically inhibit transcription, e.g., by intercalation into double-stranded template DNA (actinomycins), by blocking RNA polymerase (rifampicin in prokaryotes, amatoxins in eukaryotes), or by preventing chain elongation (cordycepin), which makes them useful for analytical and therapeutic purposes.
Methods for analyzing transcription or RNA transcripts include in vitro transcription, in vivo analysis using reporter genes, Northern blotting (blotting techniques), nuclear run-on transcription, primer-extension analysis, RNase protection assays, RT-PCR (polymerase chain reaction), run-off transcription, and S1 mapping. Opposite: reverse transcription (reverse transcriptase).
Fig. 1: Electron micrographs of a transcription process: the synthesis of ribosomal RNA (rRNA) along a gene for ribosomal RNA. Many RNA polymerases synthesize rRNA along the central DNA from the start signal (top of panel a) to the stop signal (bottom of panel a), where the RNA molecules reach their full length and are released (a, higher magnification). In many organisms, the DNA contains a series of genes for rRNA (b, lower magnification).
Fig. 2: Schematic of transcription in prokaryotes
Fig. 3: The complex of DNA, RNA, and RNA polymerase forms the transcription bubble. In the front part, the DNA double helix is unwound and opened; in the rear part, the newly synthesized RNA is displaced from the template strand and the DNA double helix is reformed. In between, an RNA/DNA hybrid is formed, with the RNA polymerase checking the base pairing between the template-strand base and the incoming ribonucleotide and forming the phosphodiester bond.
Fig. 4: Hairpin-like secondary structure in the 3'-noncoding region of a prokaryotic mRNA. On the left, a stop codon for translation (UAA) can be seen; to the right, following the GC-rich region of the hairpin, a poly(U) sequence. This sequence forms relatively weak interactions with the poly(A) sequence of the DNA, which causes the RNA to detach from the template strand.
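The terminator structure in Fig. 4 (a GC-rich hairpin followed by a poly(U) stretch) can be searched for with a simple string scan. The following Python sketch is a toy illustration only: the function name, the thresholds (stem length, loop size, GC content, number of U residues), and the test sequence are assumptions made here, and the scan requires perfect Watson-Crick pairing while ignoring folding energetics.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    return "".join(RNA_PAIR[base] for base in reversed(rna))


def find_terminator(rna, stem=5, loop_range=(3, 8), min_u=4, min_gc=0.6):
    """Return (start, end) of the first GC-rich hairpin that is directly
    followed by at least `min_u` U residues, or None if none is found.
    All thresholds are illustrative assumptions, not measured values."""
    rna = rna.upper()
    for i in range(len(rna) - stem):
        left = rna[i:i + stem]
        gc_fraction = (left.count("G") + left.count("C")) / stem
        if gc_fraction < min_gc:
            continue  # stem arm is not GC-rich enough
        for loop in range(loop_range[0], loop_range[1] + 1):
            j = i + stem + loop          # start of the right stem arm
            right = rna[j:j + stem]
            tail = rna[j + stem:j + stem + min_u]
            if right == reverse_complement(left) and tail == "U" * min_u:
                return i, j + stem       # hairpin spans rna[i:j + stem]
    return None


# Hypothetical 3' region: stop codon, spacer, hairpin, poly(U) tail.
example = "UAA" + "AC" + "GCCGC" + "UUCG" + "GCGGC" + "UUUUUU"
print(find_terminator(example))
```

Real terminator prediction would also weigh the stability of the hairpin and tolerate mismatches; the sketch only shows the two sequence features named in the caption.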
Gene Regulation
Gene Regulation (from Latin regulare = to regulate), also genetic regulation, regulation of gene activity, or regulation of transcription, is the control of gene transcription depending on the biochemical and biophysical state of a cell or a multicellular organism as well as on environmental influences. Gene regulation is crucial for differential gene expression. Two basic mechanisms of gene regulation are distinguished:
1) Negative gene regulation: The binding of a repressor to the control region of a structural gene prevents transcription of the gene. The repressor proteins encoded by regulatory genes mediate the signal effect of small molecules, effectors (substrates or end products of biochemical pathways), on the activity of certain genes or gene groups. It is crucial that repressors have specificity for binding the respective effectors and for binding to the corresponding DNA control regions (operators). Two types of repressors are distinguished:
a) Repressors controlling the transcription of genes in a catabolic pathway (catabolism, dissimilation, metabolism) can bind to their operator region only in the absence of the effector (substrate of one of the encoded enzymes) and thus block the activity of the gene or operon;
b) Repressors controlling the transcription of genes in an anabolic pathway (anabolism) are altered allosterically (allostery) by the presence of the effector (end product of the pathway) so that they bind to the operator region and block transcription.
The principle of negative gene regulation was postulated in 1961 by F. Jacob and J.L. Monod (Jacob-Monod model) for the regulation of the synthesis of sugar-degrading enzymes (lactose operon) in microorganisms. Many other prokaryotic operons whose transcription is regulated by the same principle have since been studied in detail (arabinose operon, galactose operon). Another mechanism of negative gene regulation is stringent control in the transcription of prokaryotic rRNA and tRNA genes (transfer RNA). If the concentration of uncharged tRNAs in the cytoplasm increases because of amino acid deficiency, the nucleotide guanosine-5'-diphosphate-3'-diphosphate (ppGpp) is formed, which inhibits the initiation of transcription of rRNA and tRNA genes.
The significance of negative gene regulation for controlling gene activity is probably lower in eukaryotic cells (eucyte) than in prokaryotes (protocyte). The main reason is the huge eukaryotic genome, of which only a small fraction (about 7%, depending on the organism) is transcribed. Thus, most of the eukaryotic genome must be transcriptionally inactivated. Eukaryotes have various mechanisms for this, such as packaging non-transcribed parts of the genome into inactive chromatin domains (chromatin). Methylated DNA segments (DNA methylation) are not transcribed. Genes that are highly methylated and inactive in one tissue may be unmethylated and expressed in another. So-called silencers act selectively on individual genes via specific protein-DNA interactions; of these mechanisms, they are the most rapidly reversible and the most comparable to prokaryotic repression.
Repressors of transcription are now also known to directly interfere with the formation of preinitiation complexes, e.g., via protein-protein interactions of repressor molecules with components of the general transcription machinery or with activators of transcription (e.g., the proteins Krüppel, Engrailed, or WT1).
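The two repressor types distinguished under 1) a) and b) differ only in which state of the effector lets the repressor bind its operator. A minimal Python sketch of this logic follows; the function names are hypothetical, and the all-or-nothing Boolean treatment is a deliberate simplification of what are in reality graded binding equilibria.

```python
def catabolic_operon_transcribed(effector_present: bool) -> bool:
    """Type a): the repressor binds the operator only when the effector
    (a substrate of the pathway) is absent."""
    repressor_on_operator = not effector_present
    return not repressor_on_operator


def anabolic_operon_transcribed(effector_present: bool) -> bool:
    """Type b): the repressor binds the operator only when the effector
    (the end product of the pathway) is present."""
    repressor_on_operator = effector_present
    return not repressor_on_operator


for present in (False, True):
    print(
        f"effector present={present}: "
        f"catabolic operon on={catabolic_operon_transcribed(present)}, "
        f"anabolic operon on={anabolic_operon_transcribed(present)}"
    )
```

In this simplified picture, a catabolic operon is switched on by its substrate, whereas an anabolic operon is switched off by its end product.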
2) Positive gene regulation: The transcription of prokaryotic genes or gene groups is stimulated by the binding of activator proteins (again products of regulatory genes) to the control region. Without bound activators, RNA polymerase has only low affinity for the promoters of the corresponding genes or operons. The effectiveness of the activator depends (analogous to repressors) on the presence of certain small effector molecules. An example of positive gene regulation in Escherichia coli is catabolite repression: in the absence of glucose in the medium, the intracellular concentration of cAMP (cyclic adenosine monophosphate) increases, stimulating the transcription of several operons that encode enzymes for the utilization of other sugars. cAMP, the effector molecule, binds to the cAMP-binding protein, which acts as the activator protein for these operons (arabinose operon, galactose operon, lactose operon). Positive gene regulation in this case is superordinate to the specific negative regulation of each operon.
In eukaryotes, positive transcriptional control is the dominant principle of gene regulation. Transcription efficiency is regulated mainly by the binding of specific transcription factors to control regions (TATA box, CCAAT box; enhancer) on the DNA, which, in genes transcribed by RNA polymerase I and II, are located in the 5'-flanking region. Such protein-DNA interactions are prerequisites for transcription initiation. The various gene classes of eukaryotic cells are transcribed by three different RNA polymerases, which are supported by different general transcription factors (GTFs): RNA polymerases I, II, and III by TFI, TFII, and TFIII (transcription factors) as well as TAFI, TAFII, and TAFIII (TBP-associated factors), respectively. All three RNA polymerases also use the common general transcription factor TBP (TATA-binding protein).
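How positive control by the cAMP-binding protein combines with the operon-specific repressor in Escherichia coli can be sketched for the lactose operon. The Python function below is an assumed, strongly simplified model: its name and the three output levels ("off", "low", "high") are illustrative choices made here, and effects such as basal leakiness are ignored.

```python
def lac_operon_level(glucose_present: bool, lactose_present: bool) -> str:
    """Qualitative transcription level of the lactose operon (toy model)."""
    # Negative control: the repressor leaves the operator only when the
    # substrate (lactose) is available.
    repressor_on_operator = not lactose_present
    # Positive control: cAMP rises when glucose is absent, and the
    # cAMP-loaded binding protein then activates the promoter.
    activator_bound = not glucose_present

    if repressor_on_operator:
        return "off"
    return "high" if activator_bound else "low"


for glucose in (True, False):
    for lactose in (True, False):
        print(f"glucose={glucose}, lactose={lactose}: "
              f"{lac_operon_level(glucose, lactose)}")
```

Only the combination of absent glucose and present lactose gives full transcription in this sketch, which reflects the interplay of the two control levels described for the prokaryotic operons.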
The "promoter strength," i.e., the potential of a promoter to activate transcription, depends on the exact nucleotide sequence of the promoter region, as this controls the affinity of transcription factors for this DNA region. Different "promoter strengths" cause different genes to be transcribed at different levels. The specificity of transcription initiation is achieved by the interplay of various transcription factors. This results in a basal transcription machinery (preinitiation complex) of impressive complexity, consisting of many protein components, with the general transcription factors (GTFs) themselves made up of several subunits. Differential gene activity, i.e., the pattern of eukaryotic transcription depending on temporal, spatial, cell-type-specific, organ-specific, or signal-mediated parameters, is mainly defined by the activity of regulatory transcription factors (RTFs), which interact directly or indirectly with the GTFs of the basal transcription machinery and contribute to activation. In tissue- or stage-specific gene activation, enhancer-binding protein factors, whose recognition sequence can be located both upstream and downstream of the transcribed region (or even in an intron), play a key role. Through such regulatory protein factors, extracellular factors, including light, temperature (heat-shock proteins), and growth factors, also influence gene regulation (signal transduction). Transcription factors have characteristic DNA binding motifs (protein-DNA interaction), which recognize specific sequences, often characterized by palindromic sequence symmetry (palindrome), on the DNA. The binding of specific protein factors to DNA control elements may require local disruption of the nucleosome structure (DNase I-hypersensitive sites). Chromatin structure-based parameters, e.g., the nucleosome with its histone core or HMG proteins, can have potentially repressive or activating effects on gene activity. 
For example, the topological folding of DNA packaged in nucleosomes can bring otherwise distant DNA regions into desired neighboring positions. Helper proteins (SWI/SNF complexes) mediate dynamic chromatin changes to facilitate transcription or its regulation. In rare cases, the extent of transcription initiation can also be affected by DNA translocations. A gene may come under the control of stronger transcription start signals and thus be expressed at a higher rate (activation of oncogenes). Such mechanisms are associated with the development of certain tumors.
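The palindromic sequence symmetry mentioned for transcription-factor recognition sequences means that a double-stranded site reads the same 5' to 3' on both strands, i.e., the sequence equals its own reverse complement. A minimal Python check is sketched below; the function names are hypothetical, and the test sequences are the well-known EcoRI and BamHI restriction sites, used here only as familiar palindromes rather than as transcription-factor binding sites.

```python
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(DNA_PAIR[base] for base in reversed(dna.upper()))


def is_palindromic(dna: str) -> bool:
    """True if the sequence equals its own reverse complement."""
    return dna.upper() == reverse_complement(dna)


for site in ("GAATTC", "GGATCC", "GATTACA"):
    print(site, is_palindromic(site))
```

Such symmetry fits the frequent binding of regulatory proteins as dimers, with each subunit contacting one half of the site.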
In yeast, genes for the two different mating types a and α are not expressed in vegetatively reproducing cells. Under certain circumstances, either the a- or the α-type genes are moved to the MAT locus. Only in this position are they expressed. The product of the MAT locus in yeast, like the products of homeobox-containing genes of Drosophila melanogaster and of homologous sequences in other organisms, seems to act as a regulatory element for scattered genes that are activated simultaneously during development. It is assumed that the genes induced by the products of homeotic genes have sequences comparable to bacterial promoters or possibly viral enhancers, to which the regulatory proteins that stimulate transcription bind. In many eukaryotic cell types, mechanisms exist for transposing genes on a chromosome. A rearrangement of genes, as seen with the yeast mating-type genes, is also a necessary step for immunoglobulin and T-cell antigen receptor genes in the activation of the corresponding lymphoid clones.
The activation of specific transcription factors is usually triggered by signal transduction mechanisms. The signal effect of most hormones, cytokines, and other substances is based on their binding to plasma membrane receptors and subsequent phosphorylation of cytoplasmic substrate proteins, which either become active as transcription factors themselves or activate such factors. Besides the regulation of transcription initiation, premature termination, as found in attenuator regulation, can also contribute to gene regulation. The helical torsion of DNA (DNA topoisomerases) can also influence transcription efficiency. While the control of messenger RNA synthesis is often the first and decisive step in gene activity regulation, further regulation is possible after transcription. For example, mRNA stability, translational control, and protein stability can influence gene action. Disruptions in gene regulation, e.g., due to defective transcription factors, can cause developmental disorders, hormonal imbalances, cancers, and syndromes of defective DNA repair.