Introduction:
Breast cancer is the most lethal cancer found in women, but screening and therapy have improved survival rates dramatically. However, recurrence remains a major problem for patients. Most drugs become ineffective when the cancer reappears because of the aggressive growth rate of the recurrent tumor. Effective classification of tumor grade before therapy could help patients avoid recurrence and allow clinicians to choose the optimal therapy.
Methods:
We used a classical machine learning approach, information gain, to prioritize breast cancer genes, and structural equation modeling in AMOS software to find the structural relationships among these genes, recurrence events, and recurrence-free survival. The analysis was performed on a large microarray breast cancer cohort obtained from KMplotter.
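As an illustration of the gene prioritization step, information gain for a gene can be computed as the reduction in entropy of the recurrence label once patients are split by that gene's expression. The sketch below is a minimal example under stated assumptions, not the pipeline used in this study: the median split into "high" and "low" expression, the function names, and the toy gene names `GENE_A` and `GENE_B` are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def binarize_at_median(expression):
    """Assumed discretization: label each sample 'high' or 'low' relative to the median."""
    ordered = sorted(expression)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        median = ordered[mid]
    else:
        median = (ordered[mid - 1] + ordered[mid]) / 2
    return ["high" if x > median else "low" for x in expression]


def rank_genes(expression_by_gene, recurrence):
    """Rank genes by information gain with respect to the recurrence label."""
    scores = {
        gene: information_gain(binarize_at_median(values), recurrence)
        for gene, values in expression_by_gene.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy data for illustration only: four patients, two hypothetical genes.
    toy_expression = {
        "GENE_A": [1.0, 2.0, 8.0, 9.0],
        "GENE_B": [1.0, 9.0, 2.0, 8.0],
    }
    toy_recurrence = [0, 0, 1, 1]  # 1 = recurrence event, 0 = no event
    for gene, score in rank_genes(toy_expression, toy_recurrence):
        print(gene, round(score, 3))
```

Genes with the highest scores would then be candidates for the structural equation model. A different discretization, such as tertiles or an entropy-based cut point, could change the ranking.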
Results:
We identified signature genes that are predictive of recurrence-free survival in breast cancer patients. The model was able to find unique genes that can differentiate between the grades. For grade 1, the identified genes NEUROD2, IFNA14, SMCP, A1CF, CNTNAP1, APBB3, and G6PC2 are predictive of survival with a hazard ratio of 0.43 (p-value = 1e-16). For grade 2, we identified CYP11B1, ARHGEF38, CHRNB3, and NTNG1, which are predictive of survival with a hazard ratio of 0.75 (p-value = 2.2e-4). For grade 3, we identified NTNG1, ALS2CL, A1CF, and P2RY4, which are predictive of survival with a hazard ratio of 0.62 (p-value = 2.8e-09).
Discussion:
The recurrence-free survival model developed here might aid in classifying tumor grades effectively, predicting survival, and recommending optimal therapy for patients.