ILANIT 2020

Evolution-CSI: hunting for evidence in the protein universe

Sergey Nepomnyachiy², Aya Narunsky², Ron Solan², Amit Kessel², Rachel Kolodny¹, Nir Ben-Tal²
¹Computer Science, University of Haifa, Israel
²Biochemistry and Molecular Biology, Tel Aviv University, Israel

Reuse, the co-option of segments from unrelated proteins to produce new ones, underlies protein evolution. Thus, characterizing reuse can offer insights into protein function and evolution. To study the protein universe from this perspective, we developed an algorithm that identifies "themes" (reused segments of similar sequence and structure) from protein alignments. Our algorithm finds themes of varying minimal lengths, ranging from 35 to 200 residues. Using it, we quantify and study reuse in the ECOD database of domains and in the PDB. We find that theme reuse is prevalent, and that it is more extensive when shorter themes are included. Structural domains, which are autonomously folded protein parts and the best-characterized form of reuse in proteins, are just one of many complex and intertwined evolutionary traces. Others include long themes shared among a few proteins, which encompass and overlap with shorter themes that recur in more proteins. To better understand the role of themes in ancient proteins, we study themes linked to an ancient function: adenine binding. Adenine has probably been on Earth since the beginning of life. It is part of many protein-binding cofactors, and there are thousands of adenine-binding structures in the PDB. We analyze the themes in this set and find that specific themes mediate specific binding to cofactors. These themes could be the ancestral segments, or descendants of ancient mini-proteins that ultimately evolved into the proteins that we see today through duplication and divergence.








