Repo for submitted paper (Bifarin et al. 2021, Journal of Proteome Research)
Machine Learning-enabled Renal Cell Carcinoma Status Prediction Using Multi-Platform Urine-based Metabolomics
Abstract: Renal cell carcinoma (RCC) is diagnosed through expensive cross-sectional imaging, frequently followed by renal mass biopsy, which is not only invasive but also prone to sampling errors. Hence, there is a critical need for a non-invasive diagnostic assay. RCC exhibits altered cellular metabolism, which, combined with the close proximity of the tumor(s) to the urine in the kidney, suggests that urine metabolomic profiling is an excellent choice for assay development. Here, we acquired liquid chromatography-mass spectrometry (LC-MS) and nuclear magnetic resonance (NMR) data and then used machine learning (ML) to discover candidate metabolomic panels for RCC. The study cohort consisted of 105 RCC patients and 179 controls separated into two sub-cohorts: the model cohort and the test cohort. Univariate, wrapper, and embedded methods were used to select discriminatory features using the model cohort. Three ML techniques, each with different inductive biases, were used for training and hyperparameter tuning. RCC status prediction was evaluated on the test cohort using the selected biomarkers and the optimally tuned ML algorithms. A seven-metabolite panel predicted RCC in the test cohort with 88% accuracy, 94% sensitivity, 85% specificity, and an AUC of 0.98.
Novel Aspect: Our results provide evidence that RCC diagnosis may be possible via a routine urine test.
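For readers unfamiliar with the metrics quoted in the abstract, the following is a minimal sketch of how accuracy, sensitivity, specificity, and AUC can be computed from test-cohort predictions. It is not the code used in the paper: the function names are hypothetical, the labels and scores are made-up toy values (1 = RCC, 0 = control) rather than study data, and the 0.5 decision threshold is an assumption chosen for illustration.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating label 1 as RCC and 0 as control."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity (true-positive rate), and specificity (true-negative rate)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example only: these values are invented and are NOT from the study.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.05]
# Assumed decision threshold of 0.5 on the predicted RCC probability.
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

toy_metrics = classification_metrics(toy_labels, toy_preds)
toy_auc = roc_auc(toy_labels, toy_scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC is computed from the ranking of the scores and is threshold-independent, which is why a panel can have a high AUC alongside lower accuracy at one particular operating point.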