This information was most recently updated November 9, 2006.
The major DNA repair mechanisms take advantage of the facts that DNA is double-stranded and the same information is present in both strands. Consequently, in cases where damage is present in just one strand, the damage can be accurately repaired by cutting it out (excision) and replacing it with new DNA synthesized using the complementary strand as template. All organisms, prokaryotic and eukaryotic, employ at least three excision mechanisms: mismatch repair, base excision repair, and nucleotide excision repair.
The mechanism of mismatch repair was first thoroughly studied in E. coli. The research groups of Modrich, Kolodner and others reconstituted the repair process from purified proteins. The proteins that initiate the repair process are MutS, MutL, and MutH:
As implied by the above diagram, most mismatches are due to replication errors. However, mismatches can also be produced by other mechanisms--for example, by deamination of 5-methylcytosine to produce thymine improperly paired with G. Regardless of the mechanism by which they are produced, mismatches can always be repaired by the mismatch repair pathway. In cases where the appropriate DNA-N-glycosylase is available, mismatches can also be repaired by the base excision repair pathway (see below).
The mismatch produced by a replication error in the above diagram is indicated by the distorted double helix. MutS recognizes such mismatches (true mismatches plus loops caused by insertions or deletions of up to 4 nucleotides) and binds to them. Binding of MutL stabilizes the complex. E. coli DNA is normally methylated at GATC sequences, but the newly synthesized strand is not immediately methylated. The fact that the old strand, but not the new, is methylated near the replication fork allows E. coli cells to distinguish the old (presumably correct) strand from the newly synthesized (presumably incorrect) strand. The MutS-MutL complex activates MutH, which locates a nearby methyl group and nicks the newly synthesized strand opposite the methyl group.
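The strand-discrimination logic just described can be sketched as a toy model. This is only an illustration under stated assumptions, not a description of how MutH actually searches DNA: the function names, the "top"/"bottom" strand labels, the use of sets to record which GATC sites are methylated, and the choice of the site nearest the mismatch are all hypothetical simplifications. Because GATC reads the same on both strands, one position identifies the site on either strand.

```python
def gatc_sites(seq):
    """Return the 0-based start position of every GATC in seq."""
    sites = []
    start = seq.find("GATC")
    while start != -1:
        sites.append(start)
        start = seq.find("GATC", start + 1)
    return sites


def choose_nick_site(seq, mismatch_pos, methylated_top, methylated_bottom):
    """Pick a hemimethylated GATC site and the strand to nick.

    methylated_top / methylated_bottom are sets of GATC start positions
    that carry a methyl group on that strand (hypothetical representation).
    Returns (site, strand_to_nick), or None if no site is hemimethylated.
    """
    # A site is hemimethylated when exactly one strand is methylated there.
    hemi = [s for s in gatc_sites(seq)
            if (s in methylated_top) != (s in methylated_bottom)]
    if not hemi:
        return None  # fully methylated or unmethylated: old and new strands look alike
    # Assumption: take the hemimethylated site closest to the mismatch.
    site = min(hemi, key=lambda s: abs(s - mismatch_pos))
    # The unmethylated strand is the newly synthesized one, so it gets the nick.
    strand = "bottom" if site in methylated_top else "top"
    return site, strand


seq = "AAGATCTTTTGATCAA"
print(choose_nick_site(seq, mismatch_pos=8,
                       methylated_top={2, 10}, methylated_bottom=set()))
# (10, 'bottom')
```

Once both strands are methylated at every site, the function returns None, which mirrors the point that strand discrimination is possible only while the new strand remains unmethylated.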
Excision is accomplished by cooperation between the UvrD (Helicase II) protein, which unwinds from the nick toward the mismatch, and a single-strand specific exonuclease of appropriate polarity (one of several in E. coli), followed by resynthesis (Polymerase III) and ligation (DNA ligase).
It is important to note that the use of methylation to distinguish the parental strand is probably peculiar to E. coli. Data from yeast and vertebrate in vitro mismatch repair experiments suggest that single-strand nicks provide a signal for strand specificity in these organisms. Note that single-strand breaks are present in nascent DNA strands--between Okazaki fragments in the lagging strand and at the 3' end of the leading strand.
The structure of the MutS homodimer in association with a DNA mismatch has recently been solved. The structure of this protein is one of the most complex and interesting of which I am aware:
Note that, even though the two MutS proteins in this structure have identical amino acid sequences, they do not have totally symmetric spatial positions. The two monomers can be labelled "A" and "B". Near the "top" of this structure, monomer A binds the mismatch in the DNA, while monomer B binds to other nearby points on the DNA, thus strengthening the overall interaction between MutS and the mismatch-containing DNA. Further down there is another large hole in the protein structure, big enough for binding to a second DNA molecule. And at the bottom there is an ATP-binding and ATP-hydrolysis site, composed of amino acids from both monomers. Also notice that the DNA is kinked at the mismatch. It will be interesting in the future to learn the functions of all the portions of this interesting protein.
The eukaryotic genes listed in the upper portion of the table are homologs of the corresponding E. coli genes both in terms of amino acid sequence and in terms of functional similarities. Whereas MutS and MutL function as homodimers, the eukaryotic proteins function as heterodimers. Heterodimers of MutS homologs are responsible for initial recognition of mismatches and small insertions/deletions, and heterodimers of MutL homologs interact with the resulting complex, as in E. coli. In contrast to E. coli, however, some eukaryotic MutL homologs play a specific catalytic role in MMR. As I shall discuss in more detail below, these eukaryotic MutL homologs have an important endonuclease activity.
Two heterodimers of MutS homologs are found in human cells. One of these dimers (MSH2/MSH6) is called MutSalpha and the second (MSH2/MSH3) is called MutSbeta. The first heterodimer preferentially recognizes single base mismatches and small insertion/deletion loops (1-2 bases). The second heterodimer primarily recognizes larger loops (2-10 bases). Most (80-90%) of the MSH2 in the cell is complexed with MSH6 to form MutSalpha. Thus MutSbeta is a relatively minor activity. Structural studies of MutS homologs suggest that MSH2 resembles the B monomer of the MutS homodimer, while MSH3 and MSH6 resemble the A monomer.
Three MutL homolog heterodimers are known. MLH1 is one of the dimer subunits in every case. The major heterodimer, MutLalpha, accounts for about 90% of the MLH1 and contains PMS1(yeast)/PMS2(human) in addition to MLH1. The second heterodimer, MutLbeta, consists of MLH1 and MLH2(yeast)/PMS1(human). Note that human PMS2 is a better homolog of yeast PMS1 than is human PMS1. The third dimer, MutLgamma, consists of MLH1 and MLH3. MutLalpha plays a major role in mismatch repair. The role of MutLbeta is not yet clear. MutLgamma plays a minor role in mismatch repair.
Recently the Modrich laboratory succeeded in reconstituting eukaryotic MMR in vitro, using purified human proteins. When defined substrates were employed, which contained both a mismatch and a nick in one of the strands, the non-nicked strand was always used as template for the repair. The complete in vitro system contained MutSalpha, MutLalpha, RFC (the replication clamp loader), PCNA (the replication clamp), RPA (the single-stranded-DNA-binding protein), exonuclease I (which hydrolyzes the 5'-ended strand in double-stranded DNA) and DNA polymerase delta. Surprisingly, the results suggested that one of the roles of MutLalpha in this system was to introduce a second nick into the already-nicked strand, on the side of the MutSalpha-MutLalpha complex distal from the original nick:
This diagram shows a replication fork, because a primary role of MMR is to correct mistakes made during replication. However, the DNA substrates employed by the Modrich laboratory were small circular DNA molecules containing mismatches and nicks at defined locations. The results obtained by the Modrich laboratory suggest that RFC and PCNA, which bind to double-stranded DNA at the 3' ends of strands at nicks and gaps, serve to locate the positions of nicks and gaps in the DNA near mismatches. As a result of communication between the RFC-PCNA complex at a nick or gap and MutSalpha-MutLalpha at the mismatch (indicated by the double-headed purple arrow in the diagram), the human PMS2 subunit of MutLalpha introduces a nick near the MutSalpha-MutLalpha complex, on the side of that complex furthest from the nick or gap. This leads to the originally nicked strand being doubly nicked, with one nick on each side of the MutSalpha-MutLalpha complex.
Notice that, regardless of whether the nick or gap was upstream or downstream of the mismatch in the diagram above, and regardless of the fact that exonuclease I degrades DNA only in the 5' to 3' direction, generation of the second nick ensures that exonuclease I will have an entry point at which it can begin to degrade the nicked strand toward the mismatch. Ultimately, exonuclease I degrades this strand between the two nicks, cutting away the mismatched nucleotide. The resulting gap is filled by DNA polymerase delta, and the final nick is sealed by DNA ligase I, as in normal DNA replication.
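The geometric argument in this paragraph can be checked with a small sketch. It assumes positions are counted in the 5' to 3' direction along the nicked strand. The function name and the `offset` parameter (the distance from the mismatch to the second nick) are hypothetical, and the default of 5 is an arbitrary placeholder, not a measured value.

```python
def excision_tract(original_nick, mismatch, offset=5):
    """Model the two-nick excision step on the nicked strand.

    Positions are counted 5'->3' along the nicked strand. The second nick
    is placed `offset` nucleotides past the mismatch, on the side distal
    to the original nick (offset is an illustrative assumption).
    Returns (second_nick, entry, end): a 5'->3' exonuclease enters at
    `entry` and degrades the strand up to `end`.
    """
    if offset <= 0:
        raise ValueError("offset must be positive")
    if original_nick == mismatch:
        raise ValueError("nick and mismatch must be at different positions")
    if original_nick < mismatch:
        second_nick = mismatch + offset   # original nick is 5' of the mismatch
    else:
        second_nick = mismatch - offset   # original nick is 3' of the mismatch
    # Whichever nick lies 5' of the mismatch is the entry point for a
    # 5'->3' exonuclease; the other nick marks the end of the tract.
    entry, end = sorted((original_nick, second_nick))
    return second_nick, entry, end


print(excision_tract(10, 50))  # (55, 10, 55)
print(excision_tract(90, 50))  # (45, 45, 90)
```

In both orientations the entry point lies 5' of the mismatch and the end point lies 3' of it, so the degraded tract always includes the mismatched nucleotide.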
Hereditary non-polyposis colon cancer (HNPCC) is a form of colon cancer characterized by early age of onset and autosomal dominant inheritance with high penetrance. It is frequently associated with defects in the genes encoding MSH2 (about 35% of cases in which responsible genes have been identified) and MLH1 (about 60% of cases). Note that these two genes are essential for formation of all of the MutS and MutL homolog heterodimers that function in the nucleus during normal cell division cycles. HNPCC is occasionally associated with defects in other mismatch repair genes (defects in MSH6, PMS2 and PMS1 have been detected). The rarity of defects in these other MMR genes in HNPCC is probably a consequence of their functional redundancy, which makes them individually non-essential for MMR.
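The reasoning about which genes are indispensable can be made concrete with a short lookup. The subunit compositions below are the human heterodimers named earlier in this section; the dictionary and function names are hypothetical, and the sketch says nothing about how much each complex contributes to repair.

```python
# Human MutS and MutL homolog heterodimers, as listed in the text above.
HUMAN_COMPLEXES = {
    "MutSalpha": {"MSH2", "MSH6"},
    "MutSbeta": {"MSH2", "MSH3"},
    "MutLalpha": {"MLH1", "PMS2"},
    "MutLbeta": {"MLH1", "PMS1"},
    "MutLgamma": {"MLH1", "MLH3"},
}


def complexes_lost(gene):
    """Return the names of the heterodimers that cannot form without `gene`."""
    return {name for name, subunits in HUMAN_COMPLEXES.items()
            if gene in subunits}


print(sorted(complexes_lost("MSH2")))  # ['MutSalpha', 'MutSbeta']
print(sorted(complexes_lost("MLH1")))  # ['MutLalpha', 'MutLbeta', 'MutLgamma']
print(sorted(complexes_lost("MSH6")))  # ['MutSalpha']
```

Loss of MSH2 or MLH1 removes every complex of its family, whereas loss of any other subunit leaves at least one complex of that family intact.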
HNPCC is nearly always associated with defects in DNA repair evidenced by "microsatellite instability"--frequent variations in the number of repeat units of short tandemly repeated sequences. In many cases of colon cancer associated with microsatellite instability (MIN) in which no mismatch-repair gene mutation is evident, extensive methylation of the MLH1 promoter is evident. This methylation silences the MLH1 gene, so no MLH1 protein is produced. Thus it is likely that all cases of MIN-associated colon cancer will eventually be found to be due to defects in the mismatch repair pathway.
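To illustrate what "variations in the number of repeat units" means, the sketch below counts the longest run of a tandemly repeated unit in two sequences and reports whether the counts differ. It is a toy comparison, not a clinical assay: the function names are hypothetical and the sequences are invented for the example.

```python
import re


def longest_repeat_run(seq, unit):
    """Return the largest number of consecutive copies of `unit` in `seq`."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    # Each match is one uninterrupted run of the repeat unit.
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


def repeat_count_differs(normal_seq, tumor_seq, unit):
    """True if the repeat-unit count at this locus differs between samples."""
    return longest_repeat_run(normal_seq, unit) != longest_repeat_run(tumor_seq, unit)


# Invented example: a CA dinucleotide repeat flanked by GG and TT.
normal = "GG" + "CA" * 12 + "TT"
tumor = "GG" + "CA" * 10 + "TT"

print(longest_repeat_run(normal, "CA"))           # 12
print(longest_repeat_run(tumor, "CA"))            # 10
print(repeat_count_differs(normal, tumor, "CA"))  # True
```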