Leveraging the 3D Genome to Solve Genome Assembly

Noam Kaplan
Department of Physiology, Biophysics & Systems Biology, Rappaport Faculty of Medicine, Technion - Israel Institute of Technology

Genome assembly, the problem of determining an entire genome sequence from shorter sequences, has long been a fundamental challenge in genomics and bioinformatics. Over the past decade, next-generation DNA sequencing (NGS) technologies have improved dramatically, enabling inexpensive measurement of massive numbers of short DNA sequences. Despite this, genome assembly based on short NGS reads is still unable to produce assemblies of a quality comparable to that of the human genome. This limitation significantly diminishes the usefulness of genome sequences for addressing biological questions.

We have developed a new approach to short-read genome assembly, based on repurposing a genomic experiment known as Hi-C. Hi-C is a molecular biology technique that measures the spatial proximity of pairs of DNA loci genome-wide, and it has led to key insights into how the genome’s 3D organization is associated with its functional state. We show that canonical invariant features of the genome’s 3D organization provide a quantitative link between the genome’s 1D and 3D structure, and we build on these features to achieve, for the first time, chromosome-scale genome scaffolding using only NGS short reads. In the past year, several major genomes have been solved using this novel concept. Finally, we and others are currently extending this approach to address several key challenges in genomics, including haplotype phasing, metagenomic assembly and cancer genome analysis.
