The editors of the cell

Figuratively speaking, information processing in the cell resembles a news agency in which editors write, edit, proofread and approve texts for publication. Great effort goes into ensuring that information is transmitted correctly. Still, mistakes can happen. Gabriele Neu-Yilik and Andreas Kulozik from the Department of Pediatric Oncology, Hematology and Immunology at the University Children's Hospital describe how even the "cell editors" are not infallible.

In molecular biology it has become fashionable to use metaphors of communication and text to talk about the human genome, genes and their products. One example is the word "information", which is often used to describe genetic relationships. Since the "genetic code" was deciphered, we have learned a great deal more about genetic information. The human genome - the totality of the genetic information of an organism - is often viewed as an information system that will one day be fully "readable", a kind of instruction manual for organisms that is "transcribed", "copied", "redacted", "edited" and "translated".

Such metaphors are used to illustrate abstract ideas and make complex processes easier to understand. Naturally, however, they remain what they are: images, albeit images that have a strong impact on scientific thinking. With this reservation, we also want to use analogies from the areas of text creation, editing and publication to explain how diseases arise at the molecular level.

Like many scientific terms, the meaning of the term "gene" is constantly changing. What do we know about genes today? Genes are located on the chromosomes in the cell nucleus. As far as we know today, genes make up only about 25 percent of the genome, the entire human genetic material. The remaining 75 percent contains regulatory elements and regions whose function is not yet known. From a biochemical point of view, genes are sections of a very long thread-like molecule, deoxyribonucleic acid (DNA). "Real" genes contain the building instructions for protein molecules, which carry out the actual functions of the cell. The genome is, to a certain extent, an "archive" of genetic information. The sequence of DNA building blocks (bases) that makes up a gene is rewritten, "transcribed", in the cell nucleus into messenger ribonucleic acid (messenger RNA; mRNA).

The messenger carries the genetic information for the production of a protein from the nucleus into the cytoplasm. There the information is read and translated into a chain of amino acids, the building blocks of proteins (translation). This path from gene to protein - gene expression - is extremely complex, both in its individual steps and as a whole, and is still not fully understood.
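The two steps just described, transcription and translation, can be sketched in a few lines of Python. This is a deliberately simplified illustration, not a model of the cellular machinery: the function names `transcribe` and `translate` are chosen here for illustration, the input is assumed to be the coding strand of the DNA, and the example sequence is a short toy fragment.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in the conventional U, C, A, G ordering ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA text for a DNA coding strand (T is written as U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from the first AUG up to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing to translate
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGTGCACCTGTAA")
print(mrna)             # AUGGUGCACCUGUAA
print(translate(mrna))  # MVHL (one-letter amino acid codes)
```

The sketch ignores everything that makes real gene expression complex, such as splicing, regulation and transport, which the following sections describe.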

In the not so distant early days of molecular genetics, the central dogma "DNA makes RNA, RNA makes protein" still applied. Another dogma was "one gene = one protein". Neither doctrine is considered valid today. The view that messenger RNA has only a very limited task is also considered outdated. The function of the mRNA was thought to be limited to transporting genetic information, and initially it was not assigned any significance beyond this "messenger role". That soon turned out to be a mistake.

In the protein-coding genes, the genetic information is not contained in an uninterrupted sequence of meaningful words that could be copied without thinking. Rather, meaningful (coding) sections of the DNA, which actually carry the information for the construction of a protein, alternate with "meaningless" sections. Before the mRNA can be used for protein synthesis, the "meaningless" sections must be removed and the meaningful ones joined together - "spliced". The meaningful sections are then exported from the cell nucleus, together with regulatory sections, into the cytoplasm. They are therefore called "exons". What remains in the nucleus are the "introns", i.e. those sections that are not translated into protein. They are degraded there. Both introns and exons are thus initially transcribed into the immature mRNA. This "text" is then edited while still in the cell nucleus.
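The editing step can be pictured as cutting a text at known positions and joining the pieces that are kept. The following minimal sketch assumes that the exon positions are already known; the function name `splice`, the coordinates and the toy sequence are all made up for illustration, and the lower-case letters for introns are only a reading aid.

```python
def splice(immature_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join the exon segments of an immature mRNA and drop what lies between.

    `exons` lists (start, end) positions, 0-based with end exclusive,
    in the order in which the exons appear on the transcript.
    """
    return "".join(immature_mrna[start:end] for start, end in exons)


# Toy transcript: three exons (upper case) separated by two introns (lower case).
immature_mrna = "AUGGCC" + "guaaguuuucag" + "GAUUAC" + "guaugucccag" + "AAAUAA"
exons = [(0, 6), (18, 24), (35, 41)]

mature = splice(immature_mrna, exons)
print(mature)  # AUGGCCGAUUACAAAUAA
```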

A molecular machine, the "spliceosome", is responsible for this: introns, which contain no information relevant for protein synthesis, are cut out of the mRNA, and the units of meaning, i.e. the exons, are joined. The activity of the spliceosome (from the English "to splice", meaning to join together) thus resembles the working method of an author who repeatedly assembles text modules from an originally detailed manuscript, depending on the client and the audience. The spliceosome can also combine exons in different ways, so that a different text is created each time. In terms of molecular biology, this produces building instructions for proteins that are structurally related but often functionally very different.

From a limited number of "text modules", the cell can put together different "texts" as required. As a result, organisms achieve a greater complexity and diversity of their products than the relatively small number of genes suggests. It is believed that some genes can produce thousands of splice variants.
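How quickly the number of possible "texts" grows can be illustrated with a small enumeration. The sketch below makes a strong simplifying assumption that is not taken from the article: the first and last exon are always kept, and each internal exon is either included as a whole or skipped. Real splicing patterns are more varied. The function name `splice_variants` and the exon sequences are hypothetical.

```python
from itertools import combinations


def splice_variants(exons: list[str]) -> list[str]:
    """Enumerate exon-skipping variants of a gene with at least two exons.

    The first and last exon are always kept; internal exons may be
    included or skipped, and the original exon order is preserved.
    """
    first, *internal, last = exons
    variants = []
    for k in range(len(internal) + 1):
        for chosen in combinations(internal, k):
            variants.append("".join([first, *chosen, last]))
    return variants


toy_exons = ["AUGGCC", "GAU", "UAC", "AAAUAA"]
for variant in splice_variants(toy_exons):
    print(variant)
```

Under this assumption a gene with n exons yields 2^(n-2) variants, so the four toy exons give four texts, and twelve exons would already give 1024.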

A finished mRNA typically has a number of characteristic features. In addition to the cap and the poly-A tail, most mRNA molecules have so-called "untranslated" regions. Although these do not serve as building instructions for the protein, they often contain instructions of their own, for example for the stability of the mRNA itself or for the time, place and frequency of translation.

In addition to the editorial processing, the mRNA also receives finishing touches at both ends while still in the cell nucleus. The front end is given a special "cap", which is important for its stability and for its transport. In addition, the cap later serves as a "landing platform" from which the ribosomes - the protein synthesis factories in the cytoplasm - can operate. A conspicuous "tail" is attached to the rear end of the mRNA; it consists exclusively of one RNA building block, the base adenine. This is why one speaks of the "poly-A tail". Among other things, it determines the lifespan of the messenger molecule. Taken together, the steps that turn the immature mRNA into a mature one are called "processing". A fully processed mRNA that is ready for translation into protein usually has a typical structure: at the front end it carries the "cap", in the middle lies the protein-coding region, also known as the "reading frame", and at the end comes the poly-A tail. Processing is also of great importance for understanding diseases at the molecular level. We want to illustrate this with two examples.

The first example is a hereditary form of thrombophilia, a tendency of the blood to form dangerous clots (thrombi). The prothrombin gene is altered (mutated) in around one to two percent of people. People who carry this mutation are at a significantly greater risk of developing dangerous thromboses. The product of the prothrombin gene - the protein prothrombin - is an important part of the blood coagulation system. Due to the mutation, the amount of prothrombin in the blood is increased by 50 percent. The risk of developing a thrombosis is correspondingly higher.

The mutation does not occur in the protein-coding part of the mRNA, but at the site where the poly-A tail is attached to the messenger molecule. This finding was initially a mystery. The proximity of the mutation to the poly-A tail suggested a change in the lifespan of the mRNA or an improved translation into protein. However, experimental analysis showed that the mutation improved the manufacturing process of the prothrombin mRNA: more mRNA molecules were made, and as a result more templates were available for translation into protein. This finding was astonishing, because it shows that in some cases the cell "intentionally" keeps a production process inefficient. The question is why it does this. This is the subject of our current research.

It would be conceivable, for example, that the organism holds production in reserve and makes large quantities of prothrombin mRNA only when the protein is actually needed, for example when the blood coagulation system becomes active to close wounds after an operation or a serious injury. The process is comparable to modern "publishing on demand" in the publishing industry, where books are produced according to immediate demand.

In our second example, the mutation does not affect the poly-A tail, but rather the coding region of an mRNA, which contains the information for the production of a protein. The reading frame always begins with the base sequence adenine-uracil-guanine (AUG), the code word for the amino acid methionine. A stop signal marks the end of the reading frame. It is, as it were, the full stop that marks the end of a sentence. The stop signal tells the ribosomes to end protein synthesis. This shows that the accuracy with which the genetic information is transcribed and edited in the cell nucleus is of the greatest importance: transcription errors or incorrectly inserted letters can obscure the meaning. If the cell's own editor (the spliceosome) is off by even a single "letter" (a base) when cutting out the introns and joining the exons, the reading frame shifts. The result: a completely different protein is produced.
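The effect of a shifted reading frame is easy to see when a sequence is split into its three-letter code words. The sketch below uses a made-up toy sequence and a hypothetical editing error in which a single base is lost; the helper name `codons` is chosen here for illustration.

```python
def codons(sequence: str) -> list[str]:
    """Split a reading frame into consecutive three-base code words.

    Leftover bases at the end that do not fill a whole codon are dropped.
    """
    return [sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3)]


normal = "AUGGUGCACCUGACUCCUUAA"
# Hypothetical editing error: the base at index 3 (a G) is lost.
shifted = normal[:3] + normal[4:]

print(codons(normal))   # ['AUG', 'GUG', 'CAC', 'CUG', 'ACU', 'CCU', 'UAA']
print(codons(shifted))  # ['AUG', 'UGC', 'ACC', 'UGA', 'CUC', 'CUU']
```

After the lost base, every code word is different. In this toy case the shifted frame also happens to contain the stop codon UGA as its fourth code word, so translation would end early.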

A typical mistake during editing is that a stop signal ends up in the meaningful section of the mRNA. According to the latest estimates, such "nonsense mutations" make up around 25 to 30 percent of all mutations. One would intuitively assume that ribosomes which encounter such a premature stop signal stop reading the assembly instructions too early and synthesize a shortened protein. In fact, there are examples of such a process - with grave consequences, as we shall see later.
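A premature stop signal can be found by walking through the reading frame codon by codon. The following sketch is an illustration under stated assumptions: the sequence is a toy example, the single-base change is hypothetical, and the function name `first_stop` is chosen here. The three stop codons UAA, UAG and UGA are those of the standard genetic code.

```python
from typing import Optional

STOP_CODONS = {"UAA", "UAG", "UGA"}


def first_stop(reading_frame: str) -> Optional[int]:
    """Return the 0-based codon index of the first in-frame stop codon."""
    for index, pos in enumerate(range(0, len(reading_frame) - 2, 3)):
        if reading_frame[pos:pos + 3] in STOP_CODONS:
            return index
    return None  # no stop codon in this frame


normal = "AUGGUGCAGCUGACUUAA"
# Hypothetical nonsense mutation: C -> U at index 6 turns CAG into UAG.
mutant = normal[:6] + "U" + normal[7:]

print(first_stop(normal))  # 5, the regular end of the reading frame
print(first_stop(mutant))  # 2, a premature stop
```

A ribosome reading the mutant text would stop after two amino acids instead of five.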

It is often the case, however, that the cell recognizes the incorrectly placed signal in the assembly instructions before larger amounts of the wrong protein are formed. Such a faulty mRNA is broken down again. Molecular biologists call this process "nonsense-mediated mRNA decay", or NMD for short. The editorial work of the NMD mechanism consists in checking the information content of the messenger molecule for correctness and legibility and eliminating rejects in good time. It is the cell's "quality control" for its own products, and its importance is evident from the fact that it is found in all higher organisms examined so far, from yeast to humans.

Higher organisms usually have a double set of chromosomes and therefore two copies of every gene. Usually only one copy is affected by the mutation. The error-free mRNA supplied by the unchanged, "healthy" gene is usually enough to produce sufficient protein. The second, defective mRNA, copied from the "sick" gene, is recognized and destroyed by the cell. This avoids the damage that a shortened protein could do.

An impressive example of this is beta-thalassemia, a hereditary form of anemia that occurs mainly in the Mediterranean region and Asia. Nonsense mutations in the gene for the protein beta-globin are often responsible for this hereditary disease, the most common worldwide. Beta-globin is a component of hemoglobin, the red blood pigment and oxygen carrier. Because the defective beta-globin mRNA is recognized and broken down by the cell, the disease follows a recessive mode of inheritance in most cases, i.e. people in whom only one copy of the gene is mutated do not become ill. The disease only develops when both copies are mutated. There are exceptions, however: if the nonsense mutation lies in a specific region of the beta-globin mRNA, it cannot be detected. A protein that is too short is then formed and incorporated into the hemoglobin molecule, which makes this vital protein non-functional. Affected people suffer from severe anemia and may need blood transfusions.

The "code sun" illustrates how the 20 different amino acids (outer ring) are encoded by the bases of the DNA (inner rings). Certain combinations of three bases also signal where protein synthesis at the ribosomes should begin (the start codon) and where it should end (the stop codons).

The quality controllers in the cell can also make mistakes. But why are not all nonsense mutations recognized? An answer came from a precise analysis of the mutations that lead to the disease. It turned out that a mutation in the front part of the gene is always followed by degradation of the mRNA. Mutations that lie in the last exon of the gene, or in the penultimate exon immediately before the "weld seam" (the exon-exon junction) to the last one, cannot be recognized by the NMD mechanism: the shortened beta-globin protein is produced.

The position-dependent effect of nonsense mutations has also been shown in many other genes. This suggests that splicing itself plays an important role. But the process of translation in the cytoplasm is also of great importance. Quality control therefore consists of two components: one depends on the cell nucleus and on splicing, the other on the cytoplasm and on translation. Both components have to work for a correct protein to be made.

This conclusion from the experiments was initially difficult to understand, as the two processes are spatially separated from each other. A possible explanation is that, during translation into protein, a nonsense codon in the mRNA can only be recognized as an error if it is followed by a "weld seam". This is usually not the case with a natural stop codon. This notion can also explain why mutations in the last exon of the beta-globin gene have such grave consequences: because no exon-exon junction marked by a weld seam follows them, they are mistaken for the natural stop signal and the shortened beta-globin protein is produced.
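The position rule described here can be written down as a small decision function. This is a rough sketch, not the authors' model: the function name `escapes_nmd`, the exon lengths and the stop positions are hypothetical, and the threshold `min_distance` of 50 bases is an assumption taken from the commonly cited rule of thumb of roughly 50 to 55 nucleotides, a figure the article itself does not give.

```python
def escapes_nmd(stop_pos: int, exon_lengths: list[int], min_distance: int = 50) -> bool:
    """Predict whether a stop codon escapes nonsense-mediated mRNA decay.

    stop_pos     -- 0-based position of the stop codon on the spliced mRNA
    exon_lengths -- lengths of the exons in their order on the mRNA
    min_distance -- ASSUMED threshold: a stop codon closer than this to the
                    last exon-exon junction (or behind it) is not detected
    """
    if len(exon_lengths) < 2:
        return True  # no exon-exon junction, hence no marker to follow the stop
    last_junction = sum(exon_lengths[:-1])
    distance_to_junction = last_junction - stop_pos
    return distance_to_junction <= min_distance


# Hypothetical gene with three exons; the last junction lies at position 360.
exon_lengths = [140, 220, 260]

print(escapes_nmd(120, exon_lengths))  # False: front part, mRNA is degraded
print(escapes_nmd(340, exon_lengths))  # True: just before the last junction
print(escapes_nmd(400, exon_lengths))  # True: inside the last exon
```

A natural stop codon normally lies in the last exon, so under this rule it is never flagged, which matches the explanation given above.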

A single gene can give rise to multiple gene products (proteins) through variable combinations of the exons and, in rare cases, even of the introns. A molecular machine, the "spliceosome", is responsible for this. Its work resembles that of an author who uses a limited number of text modules to put together different texts as required.

The question that remains is what the weld seams between the exons are all about. A currently widespread model postulates that the spliceosome leaves something behind at these points which serves as a marker and allows correct and incorrect stop codons to be distinguished. Such marking does in fact take place. The cell uses a complex of proteins for this. How this protein label is recognized during translation, and what then causes the defective mRNA to be destroyed, is still unknown.

The examples are intended to make it clear that the mRNA is by no means a boring intermediate carrier of genetic information. On the contrary: it is so important a molecule for gene expression that nature has developed dedicated methods to ensure that the genetic information is transmitted in the correct amount and without errors. The role of the mRNA is by no means limited to that of an office messenger whose job is simply to supply the design offices of the decision-makers with plans. Rather, the mRNA shows that cells can intervene in the transmission of information at an early stage. Editing the information contained in the human genome is a remarkable achievement, as is the subsequent quality control. Both processes are extremely complex and are only just beginning to be understood. It is our goal to understand the mechanisms and the medical significance of these processes and thus to better understand diseases.

Dr. Gabriele Neu-Yilik and Prof. Dr. Andreas Kulozik
University Children's Hospital, Department of Pediatric Oncology, Hematology and Immunology,
Im Neuenheimer Feld 153, 69120 Heidelberg
Telephone (0 62 21) 56 45 79, Fax: (0 62 21) 56 45 80