The aim of this study was to develop a diagnostic strategy for esophageal squamous cell carcinoma (ESCC) that combines plasma metabolomics with machine learning algorithms. Plasma-based untargeted metabolomics analysis was performed with samples derived from 88 ESCC patients and 52 healthy controls. The dataset was split into a training set and a test set. After identification of differential metabolites in the training set, single-metabolite-based receiver operating characteristic (ROC) curves and multiple-metabolite-based machine learning models were used to distinguish between ESCC patients and healthy controls. Kaplan-Meier survival analysis and Cox proportional hazards regression analysis were performed to investigate the prognostic significance of the plasma metabolites. In total, twelve differential plasma metabolites (six up-regulated and six down-regulated) were annotated. The predictive performance of the six most prevalent diagnostic metabolites in the diagnostic models, evaluated in the test set, was as follows: arachidonic acid (accuracy: 0.887), sebacic acid (accuracy: 0.867), indoxyl sulfate (accuracy: 0.850), phosphatidylcholine (PC)(14:0/0:0) (accuracy: 0.825), deoxycholic acid (accuracy: 0.773), and trimethylamine N-oxide (accuracy: 0.653). The prediction accuracies of the machine learning models in the test set were as follows: partial least squares (accuracy: 0.947), random forest (accuracy: 0.947), gradient boosting machine (accuracy: 0.960), and support vector machine (accuracy: 0.980). Additionally, survival analysis demonstrated that acetoacetic acid was an unfavorable prognostic factor (hazard ratio (HR): 1.752), while PC(14:0/0:0) (HR: 0.577) was a favorable prognostic factor for ESCC. This study devised an innovative strategy for ESCC diagnosis by combining plasma metabolomics with machine learning algorithms and revealed its potential to become a novel screening test for ESCC.
Zhongjian Chen, Xiancong Huang, Yun Gao, Su Zeng, Weimin Mao
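
The single-metabolite evaluation described in the abstract (a ROC curve derived from the training set, then accuracy measured on the test set) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the intensity values below are made-up toy numbers, the helper names `roc_auc`, `accuracy`, and `best_threshold` are hypothetical, and the sketch assumes an up-regulated metabolite, so that a higher intensity points toward ESCC.

```python
from typing import List, Tuple


def roc_auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(scores: List[float], labels: List[int], threshold: float) -> float:
    """Fraction of samples classified correctly when score >= threshold means ESCC."""
    correct = sum((s >= threshold) == (y == 1) for s, y in zip(scores, labels))
    return correct / len(labels)


def best_threshold(scores: List[float], labels: List[int]) -> Tuple[float, float]:
    """Choose the cut-off (among observed values) that maximizes training accuracy."""
    best_t, best_acc = scores[0], -1.0
    for t in sorted(set(scores)):
        acc = accuracy(scores, labels, t)
        if acc > best_acc:  # strict '>' keeps the lowest of equally good cut-offs
            best_t, best_acc = t, acc
    return best_t, best_acc


# Toy data (assumed, NOT from the study): 1 = ESCC patient, 0 = healthy control.
train_scores = [0.8, 1.0, 1.9, 1.5, 2.1, 2.6]
train_labels = [0, 0, 0, 1, 1, 1]
test_scores = [0.7, 1.6, 1.2, 2.4]
test_labels = [0, 1, 0, 1]

train_auc = roc_auc(train_scores, train_labels)
threshold, train_acc = best_threshold(train_scores, train_labels)
test_acc = accuracy(test_scores, test_labels, threshold)

print(f"training AUC: {train_auc:.3f}")
print(f"cut-off chosen on training set: {threshold}")
print(f"test accuracy at that cut-off: {test_acc:.3f}")
```

Choosing the cut-off on the training set only and reporting accuracy on the held-out test set mirrors the train/test split mentioned in the abstract. The multiple-metabolite models would follow the same split but feed several metabolite intensities into a classifier instead of a single threshold.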