ILANIT 2023

Mapping the affinity of protein-protein interactions with multiple amino acid mutations using deep neural networks

Reut Moshe 1, Shay-Lee Aharoni Lotati 2, Niv Papo 2, Yaron Orenstein 1,3,4
1School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Israel
2Avram and Stella Goldstein-Goren Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Israel
3Department of Computer Science, Bar-Ilan University, Israel
4The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Israel

Protein-protein interactions (PPIs) play vital roles in diverse biological processes. Hence, measuring PPIs is critical for decoding the evolution of proteins and for developing potent interactions for drug applications. To date, owing to limitations in experimental techniques, studies aimed at developing high-affinity PPIs have focused mainly on a narrow range of affinities and on single mutations in the amino acid sequence of a given protein. Our study introduces a novel approach to comprehensively map PPIs and identify multiple affinity-enhancing or affinity-reducing mutations by applying machine-learning methods to next-generation sequencing selection data. The method accurately predicts the impact of multiple interacting mutations that were not observed in the experimental data. We applied it to the N-TIMP2/MMP9 protein complex as a case study because of its unique interface, in which seven positions in N-TIMP2 are crucial for binding. We developed a neural network to accurately and quantitatively predict the impact of multiple, potentially interacting mutations on binding affinity. In cross-validation (training on 90% of the data and testing on the held-out 10%), the network achieved a Pearson correlation of 0.954 between predicted and observed enrichment ratios. In addition, on an independent dataset of 18 experimentally validated variants, the Pearson correlation between their affinity constants and predicted enrichment ratios was 0.504. We are currently testing the affinity of five novel multiple-mutation variants that we predicted to be high-affinity variants. More generally, our approach can be applied to many more protein-function datasets to provide a rich characterization of a PPI affinity landscape.
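
The evaluation protocol described above (a 90%/10% split, then a Pearson correlation between predicted and observed enrichment ratios) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the input encoding or the network architecture, so the one-hot encoding of the seven interface positions is an assumption, and the names `one_hot`, `pearson`, and `split_90_10` are hypothetical helpers. The neural network itself is omitted.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
N_POSITIONS = 7  # interface positions in N-TIMP2, per the abstract


def one_hot(variant: str) -> list[int]:
    """Flatten a 7-residue variant into a 7 x 20 one-hot vector.

    Assumed encoding; the abstract does not state how variants are encoded.
    """
    if len(variant) != N_POSITIONS:
        raise ValueError(f"expected {N_POSITIONS} residues, got {len(variant)}")
    vec = [0] * (N_POSITIONS * len(AMINO_ACIDS))
    for pos, aa in enumerate(variant):
        # str.index raises ValueError for a non-standard residue letter.
        vec[pos * len(AMINO_ACIDS) + AMINO_ACIDS.index(aa)] = 1
    return vec


def pearson(x, y) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def split_90_10(items, seed: int = 0):
    """Shuffle reproducibly, then return (90% training, 10% held-out) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 9 // 10
    return shuffled[:cut], shuffled[cut:]
```

In such a setup, a model would be fit on the encoded training variants, and `pearson` would then be applied to its predictions and the observed enrichment ratios of the held-out variants.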