Predicting clinical outcomes in neuroblastoma with genomic data integration
Acar, D. Alp Emre
Aderinwale, Tunde W.
MetadataShow full item record
Background: Neuroblastoma is a heterogeneous disease with diverse clinical outcomes. Current risk group models require improvement as patients within the same risk group can still show variable prognosis. Recently collected genome-wide datasets provide opportunities to infer neuroblastoma subtypes in a more unified way. Within this context, data integration is critical as different molecular characteristics can contain complementary signals. To this end, we utilized the genomic datasets available for the SEQC cohort patients to develop supervised and unsupervised models that can predict disease prognosis. Results: Our supervised model trained on the SEQC cohort can accurately predict overall survival and event-free survival profiles of patients in two independent cohorts. We also performed extensive experiments to assess the prediction accuracy of high risk patients and patients without MYCN amplification. Our results from this part suggest that clinical endpoints can be predicted accurately across multiple cohorts. To explore the data in an unsupervised manner, we used an integrative clustering strategy named multi-view kernel k-means (MVKKM) that can effectively integrate multiple high-dimensional datasets with varying weights. We observed that integrating different gene expression datasets results in a better patient stratification compared to using these datasets individually. Also, our identified subgroups provide a better Cox regression model fit compared to the existing risk group definitions. Conclusion: Altogether, our results indicate that integration of multiple genomic characterizations enables the discovery of subtypes that improve over existing definitions of risk groups. Effective prediction of survival times will have a direct impact on choosing the right therapies for patients.