This information was most recently updated October 31, 2006.

Nucleotide Excision Repair

Although base excision repair is clearly important, it cannot deal with all types of damage. For a given type of damage to be corrected by base excision repair, there must be a DNA glycosylase capable of recognizing that specific damage. The DNA-reactive chemicals in our environment, together with radiation and oxidative and free radical attack on DNA, can generate so many types of damage that evolving a damage-specific DNA glycosylase for each type would be difficult if not impossible. Fortunately, a different, more flexible repair mechanism has evolved in living organisms: nucleotide excision repair (NER), which recognizes damaged regions based on their abnormal structure as well as their abnormal chemistry, then excises and replaces them.

In all organisms, NER involves the following steps:

  1. Damage recognition
  2. Binding of a multi-protein complex at the damaged site
  3. Double incision of the damaged strand several nucleotides away from the damaged site, on both the 5' and 3' sides
  4. Removal of the damage-containing oligonucleotide from between the two nicks
  5. Filling in of the resulting gap by a DNA polymerase
  6. Ligation

NER in eukaryotic cells

The overall process of NER in eukaryotic cells resembles that in E. coli, but there are many differences in detail, and many more proteins are required in eukaryotes than in prokaryotes. Genetic and biochemical studies have identified the following proteins as important for the damage recognition and incision steps of NER. Human proteins are shown here, but similar proteins have also been identified in yeasts, in lower animals (such as Drosophila) and in other mammals.

Table showing proteins of eukaryotic NER

In the above table, proteins that interact with each other strongly enough to form isolatable complexes, and proteins that have similar functions, have been grouped together and given a common background color. Particularly noteworthy is the 10-protein complex called TFIIH (background color = green), which is essential for both DNA repair and transcription (it stimulates promoter clearance by RNA polymerase II).

The names of many of the genes in the above table start with the letters "XP". That's because these genes were first identified in genetic complementation studies of the human DNA repair disease, Xeroderma pigmentosum (XP). These studies suggested that mutations in any of 7 genes (XPA-XPG) could give rise to the disease. I'll discuss XP in more detail later on.

Available evidence suggests the following model for the initial stages of NER in human cells.

The initial steps depend on whether the damage is in the actively transcribed strand of a gene or elsewhere in the genome. If the damage is not in the actively transcribed strand of a gene, then the repair process is called "global genomic NER" (GG-NER). In this case two different heterodimeric proteins, UV-DDB and XPC/HR23B, cooperate to recognize the damage and initiate repair. UV-DDB (for UV-damaged-DNA-binding protein) consists of DDB1 and XPE, which is sometimes called DDB2. This heterodimer binds selectively to a variety of UV-induced DNA lesions, including CPDs. It is also a component of an E3 ubiquitin ligase (E3UL), and it appears to bring the other components of the E3UL with it when it binds to UV-damaged DNA. The XPE-containing E3UL has many substrates that appear relevant to activation of repair at damaged sites. Among these substrates are the four histones. Histone ubiquitylation probably stimulates the chromatin remodeling that is undoubtedly important to give the NER proteins access to the damaged DNA. A second substrate is the XPC protein. Ubiquitylation of XPC by this E3UL does not cause XPC degradation. Instead, it appears to enhance XPC's affinity for damaged DNA. It seems likely, therefore, that the XPE-containing E3UL helps to recruit XPC to the damaged site. A third substrate is XPE itself. Ubiquitylation of XPE leads to its dissociation from DNA and to its degradation by the proteasome. Removal of XPE (and its associated E3UL) from the damaged site is probably essential to give XPC full access to that site.

The XPC/HR23B dimer appears to recognize damaged DNA based on the extent of distortion of the normal helical DNA structure caused by the damage. Consequently, 6-4PPs, which distort the helix substantially, are recognized much more readily than CPDs, which distort it only slightly. Genetic experiments indicate that recognition of CPDs is heavily dependent on the XPE protein (see previous paragraph), whereas recognition of 6-4PPs is not. In the process of binding to the damaged region, XPC/HR23B is thought to further increase the extent of structural distortion, as illustrated in this diagram (the red box indicates a damaged site, for example a thymine dimer):

The early stages of GG-NER

The increased distortion produced by XPC/HR23B permits the entry and binding of the general transcription factor TFIIH, whose 10 subunits are colored in various shades of green in the above diagram (see also the table above). Two of these subunits (XPB and XPD; shown in brighter green) are helicases, which bind to the damaged strand and use the energy of ATP to unwind a stretch of 20-30 nucleotides including the damaged site.

This unwinding occurs in two steps (not shown in the diagram). Initial unwinding by XPB (stimulated by TTD-A) opens up a small stretch and permits access of XPA to the damaged region. XPA contains a DNA-binding site that binds preferentially to structurally distorted DNA molecules. Thus XPA binding provides a second (after XPE and XPC) level of selection for damaged DNA versus normal DNA, ensuring that normal DNA will not be subjected to futile rounds of nucleotide excision repair.

After XPA binds, subsequent unwinding (to generate an open stretch of 20-30 nucleotides) by XPB and XPD permits two more proteins to bind. One of these is RPA, whose recruitment is facilitated by its physical interaction with XPA. RPA is the major eukaryotic single-stranded-DNA-binding protein. It is a heterotrimer, and it binds to and protects both of the separated strands in the open complex. For clarity in the diagram, it is shown binding only to the bottom strand. The long oval labeled "RPA" in the diagram represents a single RPA heterotrimer or possibly two binding side-by-side. The second protein recruited at this time, XPG, is a structure-specific nuclease (see below).

Concomitant with the binding of XPA, RPA and XPG, XPC and HR23B are released. These two proteins are then free to recycle to other damaged sites where the repair process has not yet been initiated.

Before proceeding to the next step (double incision of the damaged strand), I wish to discuss another type of NER: transcription-coupled NER (TC-NER). Numerous experiments have demonstrated that damage within the transcribed strands of genes is usually repaired more rapidly than damage in the non-transcribed strand or damage in non-gene regions. In general, the less structural distortion produced by the damage, the greater the ratio of the rate of repair in transcribed strands to the rate of repair elsewhere. In humans TC-NER requires all of the proteins needed for GG-NER except for XPE, XPC and HR23B, suggesting that a different mechanism (not requiring XPE and XPC) is involved in recognizing damage in transcribed strands. Numerous experiments suggest that this different mechanism involves the stalling of RNA polymerase at damaged sites. RNA polymerase stops when it runs into damage in the template strand, even minimally distorting damage such as CPDs. This probably explains why CPDs are repaired much more rapidly by TC-NER than by GG-NER.

Early stages of transcription-coupled repair

Defects in either of the two proteins shown associated with RNA polymerase in the above diagram, CSA and CSB, can lead to the human genetic disease, Cockayne's syndrome, which I'll discuss in more detail below. Their function is important for TC-NER, presumably in helping to recruit TFIIH to the damaged site and in helping to displace RNA polymerase and the nascent transcript so that TFIIH can access the damaged region. As in the case of GG-NER (above), after recruitment TFIIH unwinds a 20-30 nucleotide stretch of DNA including the damaged region. Presumably the partially unwound region produced by the stalled polymerase assists in providing access to TFIIH. The fact that the stalled polymerase produces a partially unwound region on its own may be one reason why XPC is not necessary (in humans) for TC-NER.

The above diagram suggests that RNA polymerase and the nascent transcript are removed at the same time as TFIIH is recruited. However, that has not been clearly established. It is also possible that RNA polymerase and the nascent transcript persist near the DNA lesion, while the lesion is being repaired. If RNA polymerase should persist near the lesion until after the lesion is repaired, then it might be possible for the RNA polymerase to resume transcription without the requirement for re-initiation at the gene promoter. Whether RNA polymerase is cleared from the damaged region and then re-initiates at the promoter or simply resumes transcription after the damage is repaired, the process of TC-NER is essential for proper resumption of transcription.

Additional evidence, some of which is discussed below, suggests that the XPB and XPD helicase subunits of TFIIH, the TTD-A subunit of TFIIH, and also the XPG nuclease, play special roles in TC-NER, roles that go beyond their roles in GG-NER. It may be that these proteins assist in the resumption of transcription.
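The protein requirements of the two sub-pathways described above can be summarized in a few lines of code. What follows is only a minimal sketch under simplifying assumptions: the function name `affected_subpathways` and the three set names are hypothetical, every defect is treated as a complete loss of function, and the special TC-NER roles of XPB, XPD, TTD-A and XPG are ignored. DDB1 is omitted because its requirement in TC-NER is not stated explicitly here.

```python
# Protein groups as described in the text. DDB1 is left out because the
# text does not state explicitly whether TC-NER requires it.
GG_NER_ONLY = {"XPE", "XPC", "HR23B"}   # damage recognition in GG-NER
TC_NER_ONLY = {"CSA", "CSB"}            # act at the stalled RNA polymerase
SHARED = {"XPA", "RPA", "XPB", "XPD", "TTD-A", "XPG", "XPF", "ERCC1"}


def affected_subpathways(defective):
    """Return the NER sub-pathways expected to fail for a set of defective proteins.

    Assumption: each listed protein is completely non-functional.
    """
    defective = set(defective)
    unknown = defective - (GG_NER_ONLY | TC_NER_ONLY | SHARED)
    if unknown:
        raise ValueError(f"proteins not covered by this sketch: {sorted(unknown)}")
    affected = set()
    if defective & (GG_NER_ONLY | SHARED):
        affected.add("GG-NER")
    if defective & (TC_NER_ONLY | SHARED):
        affected.add("TC-NER")
    return affected


if __name__ == "__main__":
    for proteins in ({"XPC"}, {"CSB"}, {"XPA"}):
        print(sorted(proteins), "->", sorted(affected_subpathways(proteins)))
```

In this toy model a defect in XPC affects only GG-NER, a defect in CSB affects only TC-NER, and a defect in a shared factor such as XPA affects both.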

The next step in the repair process, for both GG-NER and TC-NER, is recruitment of another structure-specific endonuclease, the XPF-ERCC1 heterodimer:

Double incision step of NER

Both XPG and XPF-ERCC1 are specific for junctions between single- and double-stranded DNA. XPG, which is closely related to the FEN-1 nuclease that participates in base excision repair, cuts on the 3' side of such a junction, while XPF-ERCC1 cuts on the 5' side. The cut made by XPG is 2-8 nucleotides from the lesion, and the cut made by XPF-ERCC1 is 15-24 nucleotides away. The cuts are paired with each other (probably as a consequence of the structure of the multiprotein complex) in such a way that the damage-containing oligonucleotide between the cuts averages 27 nucleotides (range 24-32 nucleotides).
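The geometry of the two cuts can be checked with a little arithmetic. The sketch below is a toy calculation, not a description of any published model. The function names `excised_length` and `compatible_cut_pairs` are hypothetical, and so is the counting convention: each distance is taken to be the number of undamaged nucleotides between a nick and the lesion, and the lesion itself is assumed to span 2 nucleotides (as a pyrimidine dimer would). Under those assumptions it lists which combinations of cut distances give a fragment within the observed 24-32 nucleotide range.

```python
from itertools import product

# Ranges quoted in the text, in nucleotides.
XPG_3P_RANGE = range(2, 9)       # 3' cut: 2-8 nt from the lesion
XPF_5P_RANGE = range(15, 25)     # 5' cut: 15-24 nt from the lesion
OBSERVED_RANGE = range(24, 33)   # excised oligonucleotide: 24-32 nt


def excised_length(five_prime_dist, three_prime_dist, lesion_size=2):
    """Length of the fragment between the two nicks.

    Assumption (not from the text): distances count the undamaged
    nucleotides between each nick and the lesion, and the lesion spans
    `lesion_size` nucleotides.
    """
    return five_prime_dist + lesion_size + three_prime_dist


def compatible_cut_pairs(lesion_size=2):
    """All (5' distance, 3' distance) pairs that give an observed length."""
    return [
        (d5, d3)
        for d5, d3 in product(XPF_5P_RANGE, XPG_3P_RANGE)
        if excised_length(d5, d3, lesion_size) in OBSERVED_RANGE
    ]


if __name__ == "__main__":
    pairs = compatible_cut_pairs()
    total = len(XPF_5P_RANGE) * len(XPG_3P_RANGE)
    lengths = sorted({excised_length(d5, d3) for d5, d3 in pairs})
    print(f"{len(pairs)} of {total} cut pairs are compatible")
    print("fragment lengths:", lengths)
```

Under these assumptions, some combinations of the extreme distances (for example the closest 5' cut with the closest 3' cut) give fragments outside the observed range. That is consistent with the idea that the two cuts are paired rather than chosen independently.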

Next the replicative gap-repair proteins, RFC, PCNA, and DNA polymerase delta or epsilon, bind to the 3'-OH group (arrowhead) generated by the XPF-ERCC1 cut, and they carry out new DNA synthesis that fills the gap. This leads to displacement of the damage-containing oligonucleotide and of TFIIH, XPA, XPG, and XPF-ERCC1. The final nick is sealed by DNA ligase I.
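The excision, gap-filling and ligation steps can be mimicked on plain strings to make the logic concrete. This is a toy illustration only. The names `complement_of` and `repair_strand` are hypothetical, the nick positions are supplied by the caller rather than derived from any protein behavior, and both strands are written left to right so that position i of one pairs with position i of the other (real strands are antiparallel).

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_of(strand):
    """Base-by-base complement, written in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand)


def repair_strand(damaged, template, cut5, cut3):
    """Toy excision repair of `damaged` using the intact `template` strand.

    `cut5` and `cut3` are string indices of the two nicks; the fragment
    damaged[cut5:cut3] is removed and resynthesized from the template.
    Returns (repaired_strand, excised_fragment).
    """
    if len(damaged) != len(template):
        raise ValueError("strands must have equal length")
    if not 0 <= cut5 < cut3 <= len(damaged):
        raise ValueError("nicks must satisfy 0 <= cut5 < cut3 <= len(damaged)")
    # Double incision and removal of the damage-containing oligonucleotide.
    excised = damaged[cut5:cut3]
    # Gap filling: new DNA is copied from the undamaged template strand.
    patch = complement_of(template[cut5:cut3])
    # Ligation: the new patch is joined to the flanking, untouched DNA.
    repaired = damaged[:cut5] + patch + damaged[cut3:]
    return repaired, excised


if __name__ == "__main__":
    intact = "ATGCATTGCATG"
    template = complement_of(intact)
    damaged = intact[:5] + "XX" + intact[7:]   # "XX" marks a damaged dimer
    repaired, excised = repair_strand(damaged, template, cut5=3, cut3=9)
    print("excised :", excised)
    print("repaired:", repaired)
```

The point of the sketch is that the repaired sequence is recovered entirely from the complementary strand, which is why NER requires the undamaged strand to be intact across the excised region.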

NER and human genetic diseases

The genes encoding many of the human NER proteins (see the table above) were first identified in genetic complementation studies of the human DNA repair disease, Xeroderma pigmentosum (XP), which suggested that mutations in any of 7 genes (XPA-XPG) could give rise to the disease. In addition to XP, two other human genetic diseases involve defects in or related to NER. These two additional diseases are Cockayne's syndrome (CS) and trichothiodystrophy (TTD). Although all 3 diseases are associated with repair defects, they have strikingly different clinical manifestations:

Patients suffering from XP have

Cockayne's syndrome gives rise to

Trichothiodystrophy patients display

Note that the clinical symptoms of XP on the one hand and CS or TTD on the other are largely non-overlapping. In particular, CS and TTD patients do not have elevated rates of cancer, while XP patients do not display premature aging. How could it be that all of these patients suffer from defects in NER, yet display such different symptoms?

Part of the answer comes from the results of complementation and cloning studies. These reveal that TTD has 3 complementation groups corresponding to 3 gene products:

The notation in the latter two cases is a result of the fact that these two TTD complementation groups have proved to correspond to the XPB and XPD genes. Only a subset of XPB or XPD mutations give rise to TTD symptoms. That subset is indicated by the notation shown here.

All three TTD complementation groups correspond to proteins that are components of TFIIH, which is also a transcription factor (see table and diagrams above). The relationship to transcription is extended, though not completely, by the complementation groups of CS:

CSB appears to be required for the chromatin remodeling necessary for access of the NER proteins to the DNA damage at the site of the stalled RNA polymerase and also for post-repair resumption of transcription. In cells lacking CSB, many genes (for example, housekeeping genes) cannot resume transcription after UV irradiation, although some genes (for example, p53-responsive genes) can. CSA, and its associated E3UL proteins, are essential for promoting the removal of the CSB protein after repair is complete. This post-repair removal of CSB is also essential for proper resumption of transcription. Thus defects in either CSA or CSB can lead to transcriptional problems.

However, the symptoms of CS and TTD cannot be explained simply by failure to repair transcribed strands. People who are totally defective in NER due to null mutations in other NER genes (such as XPA), which are required to repair lesions in both transcribed and non-transcribed strands, do not suffer from the symptoms peculiar to CS or TTD (see above).

In my view, the best current explanation of these observations is the hypothesis recently proposed by Mitchell, Hoeijmakers and Niedernhofer (Divide and conquer: nucleotide excision repair battles cancer and ageing. Current Opinion in Cell Biology 15:232-240, 2003). Although most types of damage can be repaired, repair systems do not always work perfectly. Some types of residual damage do not interfere with transcription and thus do not prevent cellular proliferation. These types of damage frequently lead to mutations and thus may contribute to the development of cancer. In contrast, other types of damage directly interfere with transcription. This category would include unrepaired damage in the transcribed strands of genes, which would block transcription. This category would also include problems in restarting RNA synthesis in genes where RNA polymerase stalled at damaged sites, even though those damaged sites might eventually be repaired. Interference with transcription would lead to activation of the p53 signaling network, resulting in cessation of cellular proliferation (cell senescence) or cell death (apoptosis). Senescence or apoptosis of many cells within a given tissue would cause premature aging of that tissue. Some types of unrepaired DNA damage might lead to mutations in some cell types and senescence or death in other cell types. Interestingly, many human mutations affecting proteins involved in DNA repair or checkpoint signaling lead to increased cancer or accelerated aging or (occasionally) both.

In the cases of XP, CS and TTD, the genetic evidence suggests that mutations that affect only GG-NER or affect only those portions of NER that are common to GG-NER and TC-NER lead to residual DNA damage that is mutagenic but that does not stimulate cell senescence or apoptosis, thus explaining the tendency of XP patients to suffer from cancer but not from premature aging. In contrast, mutations that affect only TC-NER or that affect proteins common to TC-NER and GG-NER in such a way that only their involvement in TC-NER is altered would lead to failure to properly resume transcription, thus activating p53 and inducing senescence or apoptosis. Since the frequencies of various types of DNA damage vary from tissue to tissue, and the sensitivity of cells to p53 signaling also varies from tissue to tissue, one would expect to see premature aging (which is the consequence of increased cellular senescence and/or apoptosis) more frequently in some tissues than in others, depending on the nature of the mutation. This is what is observed.


Interactions Between Excision Repair Pathways in Transcription-Coupled Repair

It has been demonstrated that thymine glycols in the transcribed strands of genes are repaired faster than thymine glycols elsewhere in the genome. Thus transcription-coupled repair (TCR) occurs in BER (TC-BER) as well as in NER (TC-NER). TC-BER of thymine glycols is dependent on CSA and CSB and is defective in XPG/CS mutants but not in XPG/NER mutants. Thus CSA, CSB and XPG (and presumably also XPB and XPD) have functions outside of NER. They are important for TCR of some, perhaps all, lesions removed by BER.

Recent results from other laboratories suggest that some of the proteins of mismatch repair (MMR) are also required for TCR of certain types of damage. The particular requirements differ between humans (where, for example, MLH1 is required for TCR of UV-induced damage but not thymine glycols) and yeast (where MLH1 is required for TCR of thymine glycols but not UV damage). The mechanisms by which MMR proteins participate in TCR are not yet clear in either yeast or humans. This is a fascinating topic for future research.

