Using Public Data to Train a Convolutional Neural Network for Classifying Radiological Images According To Imaging Modality and Organ System

Zehavit Kirshenboim 1,2 Peniel Argaw 3 Nathalie Bloch 4 Shelly Soffer 2 Michal Amitai 1,2 Orit Shimon 2 Eli Konen 1,2 Eyal Klang 1,2
1The Division of Diagnostic Imaging, The Chaim Sheba Medical Center, Israel
2Sackler Faculty of Medicine, Tel Aviv University, Israel
3Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA
4Innovation Center, The Chaim Sheba Medical Center, Israel

PURPOSE: Convolutional neural network (CNN) algorithms have shown remarkable performance in image analysis, and interest in this technology within radiology is expanding rapidly.

The aim of this study is to present a method that uses the Google Images search engine to acquire publicly available radiological images and then classifies them with a CNN algorithm according to imaging modality and organ system.

MATERIAL AND METHODS: This study was conducted with the help of ARC, the Innovation Center at the Chaim Sheba Medical Center.

For each of the main imaging modalities, X-ray, ultrasound (US), CT, and MRI, we selected the 3 most frequently scanned organ systems in our hospital in the year 2017, giving a total of 12 categories (e.g., head CT, chest X-ray).

The algorithms were written in Python 3.6 using the Keras library with TensorFlow as the backend.

A dedicated library was used to scrape the Google Images search engine and acquire publicly available images for each category. The images were manually labeled "proper" if they were relevant radiological images and "improper" otherwise.
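The abstract does not name the scraping library used. As a hypothetical sketch, the freely available icrawler package exposes a GoogleImageCrawler that can bulk-download results for a search query:

```python
def scrape_category(query, out_dir, max_num=700):
    """Download Google Images results for one category.

    Hypothetical sketch: the abstract does not name the library used;
    icrawler's GoogleImageCrawler is one publicly available option.
    The max_num default is also an assumption.
    """
    from icrawler.builtin import GoogleImageCrawler
    crawler = GoogleImageCrawler(storage={"root_dir": out_dir})
    crawler.crawl(keyword=query, max_num=max_num)

# Example usage (performs network requests):
# scrape_category("chest x-ray", "images/chest_xray")
```

The downloaded images would then still require the manual "proper"/"improper" review described above, since a web search returns diagrams, illustrations, and other irrelevant hits alongside genuine radiological images.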

For the CNN architecture, we used the VGG16 network with transfer learning. The top model consisted of two fully connected ReLU layers of 256 and 32 neurons.
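A minimal Keras sketch of this architecture, assuming a 224×224 RGB input, a frozen VGG16 convolutional base, and a 12-way softmax output head (only the 256/32 ReLU top layers are stated in the abstract; the other choices are assumptions):

```python
import tensorflow as tf

def build_model(num_classes=12, weights="imagenet"):
    # Convolutional base: VGG16 without its original classification head.
    base = tf.keras.applications.VGG16(
        weights=weights, include_top=False, input_shape=(224, 224, 3))
    base.trainable = False  # transfer learning: freeze the pretrained weights

    # Top model: two fully connected ReLU layers (256 and 32 neurons),
    # followed by a softmax over the 12 modality/organ categories.
    return tf.keras.Sequential([
        base,
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
```

Freezing the base and training only the small top model is the standard transfer-learning recipe when the labeled dataset is modest, as it is here (~4,500 images).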

We examined the capability of the network to classify the imaging modality and the organ system. Accuracy was the metric used to evaluate network performance, with an 80/20 training/testing split.
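The evaluation protocol can be sketched in plain Python; the shuffling and seed are assumptions, since the abstract specifies only the 80/20 split and the accuracy metric:

```python
import random

def split_80_20(items, seed=42):
    # Shuffle and hold out 20% for testing (seed is an assumption).
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def accuracy(y_true, y_pred):
    # Fraction of predictions that match the ground-truth labels.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```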

RESULTS: The categories were: X-ray: chest, abdomen, and lower extremities; US: gallbladder, kidneys, and lower limb Doppler; CT: head, abdomen, and chest; MRI: head, breast, and knee.
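The 12 categories above can be enumerated programmatically; the exact search-query phrasing is an assumption, as the abstract does not report the query strings used:

```python
# The 12 modality/organ-system categories reported in the abstract.
CATEGORIES = {
    "X-ray": ["chest", "abdomen", "lower extremities"],
    "US": ["gallbladder", "kidneys", "lower limb Doppler"],
    "CT": ["head", "abdomen", "chest"],
    "MRI": ["head", "breast", "knee"],
}

# Hypothetical search-query phrasing (actual queries are not reported).
QUERIES = [f"{organ} {modality}"
           for modality, organs in CATEGORIES.items()
           for organ in organs]
```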

The total number of images acquired from Google Images was 7,595, of which 4,488 were relevant. The average number of images per category was 632 before and 374 after filtering for relevance.

The trained CNN showed 98% accuracy for classifying the imaging modality and 90% accuracy for classifying both the imaging modality and the specific organ system.

CONCLUSION:

We showed that freely available online data can be used to train deep learning networks for radiological tasks. The trained network showed excellent results in classifying imaging modalities and organ systems.

Using such methods can expand the collection of online radiological data for both common and rare pathologies, helping to build powerful artificial intelligence (AI) systems.

Moving towards an AI future in diagnostic radiology, this work demonstrates the power of combining CNNs with the ever-growing body of online information.
