ISMBE 2020

Under-Represented Short Nucleotide Sequences Found in Viruses and in their Related Hosts

Yoram Zarai, Tamir Tuller
Tel Aviv University, Israel

Due to their complete reliance on the host gene expression machinery, viruses are under constant evolutionary pressure to interact effectively with the host intracellular factors while evading its immune system. Understanding how viruses co-evolve with their hosts is a fundamental topic in molecular evolution, and may also aid in developing novel viral-based applications such as vaccines, oncologic therapies, and anti-bacterial treatments. Here, based on a novel statistical framework and a large-scale genomic analysis, we identify short nucleotide sequences that are under-represented in the coding regions of viruses and their hosts. The under-representation of these sequences cannot be explained by the coding regions’ amino acid content, codon frequencies, or dinucleotide frequencies. We specifically show that short homooligonucleotide and palindromic sequences tend to be under-represented in many viruses, probably due to their effect on gene expression regulation and on the interaction with the host immune system. In addition, we show that more sequences tend to be under-represented in dsDNA viruses than in other viral groups, and that many of these sequences are under-represented in viruses but not in their related hosts. Finally, we demonstrate, based on in vitro and in vivo experiments, how under-represented sequences can be used to attenuate Zika virus strains.








