Base-base mispairs and small nucleotide insertion/deletion mismatches are dominant sources of genetic mutations. Each mismatch imposes a distinct DNA conformation dictated by the number, arrangement, and chemical identity of the unpaired residues. These distortions can affect interactions with DNA-binding proteins, including regulatory transcription factors (TFs).
Recent studies found that interactions between TFs and damaged DNA may play an important role in mutagenesis. However, the structural impact of damage-induced DNA shapes on protein-DNA recognition has not been well characterized. We present the Saturation Mismatch Binding Assay (SaMBA), a new technique to characterize the effects of mismatches on TF-DNA binding in high throughput. SaMBA generates DNA duplexes containing all possible single-base or insertion mismatches to quantitatively assess the effects of mismatches on TF-DNA interactions.
We applied SaMBA to measure the binding of 21 TFs to thousands of mismatched sequences and mapped the impact of mismatches on these TFs. Remarkably, for all TFs examined, the introduction of mismatches at certain positions resulted in significantly increased binding, with some mismatches creating high-affinity binding sites in nonspecific DNA and some converting known binding sites into “super-sites” stronger than any canonical Watson-Crick site.
Structural analyses revealed that these mismatches often distort the naked DNA so that its structure resembles that of protein-bound DNA sites, explaining the increased binding measured in our assay. Our results show that the cost of deforming the DNA structure is a major determinant of protein-DNA recognition and reveal mechanisms by which mismatches can recruit TFs and modulate replication and repair activities in the cell.