In recent years, the genomes of hundreds of bacteria isolated from the rhizosphere, endophytic compartment, and phyllosphere of a variety of plants have been sequenced. These bacterial collections have proven valuable by enabling the development of tractable host-microbe synthetic ecology systems. In addition, these genomes have allowed the characterization of novel plant-associated genes via large-scale comparative genomics.
Selective constraints imposed by the environment, such as host-microbe or microbe-microbe interactions, can leave specific genetic signatures in bacterial genomes. In particular, signatures of positive selection have been identified in proteins involved in host-microbe interactions, such as virulence effectors that actively interact with the immune system of the plant host. Beyond well-studied model bacteria, little is known about the proteins under positive selection in the bacteria that make up the plant microbiome. In this study, we developed a pipeline that employs state-of-the-art phylogenetic approaches to scan ~250 genomes encompassing five genera of bacteria that inhabit the rhizosphere and phyllosphere of plants. This allowed us to identify novel proteins that may play a relevant role in the survival of each genus in the plant environment. In addition, comparing positively selected proteins across the five genera allowed us to delimit the probable niches and roles of each genus in the context of the community.
Additionally, because closely related genomes were available from diverse plant hosts and from different environmental niches, such as the rhizosphere and the phyllosphere, we re-scanned the genomes for positively selected proteins that showed statistically significant differences between hosts and environments. This allowed us to identify candidate proteins that may play a relevant role in host specificity.