ILANIT 2023

rG4detector: convolutional neural network to predict RNA G-quadruplex propensity based on rG4-seq data

Maor Turner 1, Mira Barshai 1, Yaron Orenstein 2
1 School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Israel
2 Department of Computer Science and The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Israel

RNA G-quadruplexes (rG4s) are RNA secondary structures that are formed by guanine-rich sequences and have important cellular functions. Researchers therefore want to know where and when rG4s form throughout the transcriptome. Measuring rG4s experimentally is a long and laborious process, so researchers often rely on computational methods to predict the rG4 propensity of a given RNA sequence. However, existing computational methods for rG4 propensity prediction are suboptimal because they rely on specific sequence features and/or were trained on small datasets without considering rG4 stability information. Here, we developed rG4detector, a convolutional neural network that predicts the rG4 propensity of any given RNA sequence. We demonstrated that rG4detector outperforms existing methods on various transcriptomic datasets. In addition, we used rG4detector to detect potential rG4s in transcriptomic data and showed that it improves detection performance compared to existing methods. Finally, we interrogated rG4detector for the important features it learned and discovered known and novel molecular principles behind rG4 formation. We expect rG4detector to advance future rG4 research through accurate detection and propensity prediction of rG4s. The code, trained models, and processed datasets are publicly available at github.com/OrensteinLab/rG4detector.
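
To illustrate the general idea of scoring an RNA sequence with a convolutional filter, the following is a minimal sketch in plain Python. It is not the rG4detector architecture and does not use its trained models: the function names, the single hand-set filter, and the output weights are all hypothetical and chosen only to show one-hot encoding, a 1D convolution with ReLU, global max pooling, and a sigmoid output.

```python
import math

ALPHABET = "ACGU"  # RNA bases; a DNA-style T is mapped to U below


def one_hot(seq):
    """Encode an RNA sequence as a list of 4-element rows (A, C, G, U)."""
    seq = seq.upper().replace("T", "U")
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET] for base in seq]


def conv1d_max(encoded, kernel, bias=0.0):
    """Slide one filter along the sequence, apply ReLU, and keep the maximum."""
    width = len(kernel)
    best = 0.0
    for start in range(len(encoded) - width + 1):
        total = bias
        for offset in range(width):
            row = encoded[start + offset]
            total += sum(w * x for w, x in zip(kernel[offset], row))
        best = max(best, max(total, 0.0))  # ReLU, then global max pooling
    return best


def propensity(seq, kernel, bias, out_weight, out_bias):
    """Map the pooled filter activation to a score in (0, 1) with a sigmoid."""
    pooled = conv1d_max(one_hot(seq), kernel, bias)
    return 1.0 / (1.0 + math.exp(-(out_weight * pooled + out_bias)))


# Hypothetical, hand-set weights (not learned): a width-3 filter that
# rewards G and penalizes every other base at each position.
G_RUN_KERNEL = [[-1.0, -1.0, 1.0, -1.0]] * 3

score_g_rich = propensity("GGGAGGGAGGGAGGG", G_RUN_KERNEL, 0.0, 2.0, -3.0)
score_g_poor = propensity("AUCAUCAUCAUCAUC", G_RUN_KERNEL, 0.0, 2.0, -3.0)
```

With these illustrative weights, the guanine-rich sequence receives a higher score than the guanine-poor one. A trained network would instead learn many filters and their weights from data, which is what allows it to go beyond hand-specified sequence features.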