Evolutionary Selection for Short Under-Represented Nucleotide Sub-Sequences in Viruses and Related Hosts

Yoram Zarai, Tamir Tuller
Tel Aviv University, Israel

Viruses are small infectious agents that replicate only inside living cells. They are believed to play a central role in evolution and have important potential applications in biotechnology and nanotechnology. Because they rely completely on the host gene expression machinery, viruses are under constant evolutionary pressure to interact effectively with the host's intracellular factors while evading its immune system. Thus, understanding how viruses co-evolve with their hosts to maintain their fitness may help in developing novel viral-based applications such as vaccines, oncologic therapies, and anti-bacterial treatments. Here, based on a large-scale genomic analysis of 2,692 viruses from all classes affecting 444 host organisms from all kingdoms of life, we detect short nucleotide sub-sequences that are under-represented in the coding regions of the viruses and their hosts, and whose under-representation cannot be explained by amino acid content, codon frequencies, or GC content. For example, we detect 23,838 sub-sequences three nucleotides long across the analysed viruses, with an estimated false discovery rate (FDR) of 0.015. We show that a subset of these are also under-represented in the corresponding hosts (which may be related to mechanisms and selection forces shared by the hosts and the viruses), while the others are unique to the viruses.
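As an illustration of what it can mean for a sub-sequence to be under-represented beyond what amino acid content, codon frequencies, and GC content explain, the following is a minimal sketch, not the authors' pipeline. It assumes one possible null model: synonymous codons are permuted among the positions of each amino acid, which keeps the protein sequence, the codon counts, and therefore the GC content unchanged. All function names and the toy sequence are hypothetical, and the input is assumed to be an upper-case DNA coding sequence whose length is a multiple of three.

```python
import random
from collections import defaultdict

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codons(cds):
    """Split a coding sequence into consecutive in-frame codons."""
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def translate(cds):
    """Translate a coding sequence into one-letter amino acids ('*' = stop)."""
    return "".join(CODON_TABLE[c] for c in codons(cds))


def shuffle_synonymous(cds, rng):
    """Permute codons among positions that encode the same amino acid.

    Assumed null model: the protein and the codon counts are preserved,
    so only the order of synonymous codons changes.
    """
    cs = codons(cds)
    positions = defaultdict(list)
    for idx, codon in enumerate(cs):
        positions[CODON_TABLE[codon]].append(idx)
    out = list(cs)
    for idxs in positions.values():
        pool = [cs[i] for i in idxs]
        rng.shuffle(pool)
        for i, codon in zip(idxs, pool):
            out[i] = codon
    return "".join(out)


def count_kmer(seq, kmer):
    """Count overlapping occurrences of kmer in seq, in all frames."""
    k = len(kmer)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == kmer)


def under_representation_pvalue(cds, kmer, n_random=200, seed=0):
    """Empirical p-value: share of randomized sequences with count <= observed."""
    rng = random.Random(seed)
    observed = count_kmer(cds, kmer)
    at_most = sum(
        count_kmer(shuffle_synonymous(cds, rng), kmer) <= observed
        for _ in range(n_random)
    )
    return (1 + at_most) / (1 + n_random)


# Toy coding sequence (hypothetical, for illustration only).
toy_cds = "ATGGCTGCCGCAGCGAAAAAGCTGTTACTCGGTGGAGGCTAA"
p_value = under_representation_pvalue(toy_cds, "CGA")
```

A small p-value suggests that the sub-sequence occurs less often than the null model predicts. When many sub-sequences are tested across many genomes, as in the analysis described above, the resulting p-values would additionally need a multiple-testing correction to control the FDR; that step is omitted here.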
