How to handle big data in fungal genomics?

Igor Grigoriev
Fungal Program, US Department of Energy Joint Genome Institute, Walnut Creek, California, USA

Genomics has a transformational effect on biology. The genome of Saccharomyces cerevisiae was the first fungal genome to be sequenced and played a critical role in developing a variety of molecular tools to move biology forward. As additional species were sequenced, comparisons of unique genes and of gene family expansions and contractions across dozens of diverse fungal species shed light on the genetic toolkits of mycorrhizae, pathogens, and saprobes.

Genomes of over 1000 fungal species have been sequenced, and this number continues to grow through large-scale projects such as the 1000 Fungal Genomes Project, which samples diversity across the entire Fungal Tree of Life, and the 300 Aspergillus genomes project, which focuses on a single genus. In addition, large-scale functional genomics studies employing transcriptomics, proteomics, and other omics approaches, such as fungal ENCODE, add new dimensions to the genomic data.

Genomics Big Data has huge unexplored potential, but we can no longer efficiently explore hundreds of millions of data points by staring at tables of gene counts. How can we visualize Big Data in fungal genomics? What are the determinants of fungal lifestyles? Can we predict them from genomic sequences? These questions, and new approaches to answering them for specific groups of fungi, will be discussed.