#AI Reads Urine #Artificial intelligence in pancreatic cancer diagnosis -- Is the result attributable to the algorithm or to the urine?

Published 25 April, 2025

Pancreatic cancer remains a significant threat despite medical advancements, with its low five-year survival rate and difficulties in early diagnosis. Current diagnostic methods, such as endoscopic ultrasonography, positron emission tomography, computed tomography, and magnetic resonance imaging, are costly and have limited sensitivity and specificity. Identifying pancreatic cancer biomarkers in physiological fluids like urine offers a more accessible and less invasive approach.

This study utilized a publicly available dataset from Kaggle (https://www.kaggle.com/datasets/johndavisiurinary-biomarkers-for-pancreatic-cancer). Comprising 590 patient urine samples, the dataset contains multiple attributes including 'id', 'patient cohort', 'sample origin', 'age', 'sex', 'diagnosis', 'stage', 'benign sample diagnosis', 'plasma CA19-9', 'creatinine', 'LYVE1', 'REG1B', 'TFF1', and 'REG1A'. The 'diagnosis' values of '1', '2', and '3' represent healthy patients, benign cases, and malignant pancreatic conditions respectively.
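As a rough illustration of the coding scheme described above, the sketch below maps the 'diagnosis' codes to labels and derives a binary cancer flag. The rows are made up for illustration and are not taken from the Kaggle dataset; the underscore-style column names and the choice to group codes 1 and 2 as non-cancer are assumptions.

```python
import csv
import io

# Diagnosis codes as described in the text: 1 = healthy, 2 = benign, 3 = malignant.
DIAGNOSIS_LABELS = {1: "healthy", 2: "benign", 3: "malignant"}

# Made-up rows for illustration only; NOT real patient data.
# Column names are assumed, not confirmed against the actual file.
SAMPLE_CSV = """sample_id,age,sex,diagnosis,creatinine,LYVE1,REG1B,TFF1
A1,45,F,1,0.9,0.5,30.0,80.0
A2,61,M,2,1.1,2.0,60.0,300.0
A3,70,M,3,0.8,6.5,150.0,900.0
"""


def load_rows(handle):
    """Read CSV rows and attach a readable label and a binary cancer flag."""
    rows = []
    for row in csv.DictReader(handle):
        code = int(row["diagnosis"])
        row["diagnosis_label"] = DIAGNOSIS_LABELS[code]
        # Assumption: only code 3 counts as cancer; codes 1 and 2 are non-cancer.
        row["is_cancer"] = int(code == 3)
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(row["sample_id"], row["diagnosis_label"], row["is_cancer"])
```

With the real file, the `io.StringIO` object would be replaced by an open file handle, and the column names would need to be checked against the actual header.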

Six machine-learning algorithms (Logistic Regression, K-nearest neighbors, Random Forest, Support Vector Machine, Naïve Bayes, and Decision Tree) were applied to develop conventional models for classifying cancerous and non-cancerous pancreatic cases. An ensemble voting classifier was created from the best-performing single models, and six novel hybrid models were formed by hybridizing the ensemble classifier with each of the single models.
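The general idea of training several single models and combining the best ones in a voting classifier can be sketched with scikit-learn. This is not the paper's pipeline: the data are synthetic (generated by `make_classification`), the hyperparameters are defaults, and keeping the top three models with soft voting is an assumption. The hybrid models are not shown because this summary does not describe how they were built.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (NOT the urine dataset): 590 rows, 8 numeric features.
X, y = make_classification(
    n_samples=590, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The six single-model families named in the text, with default settings.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "svm": make_pipeline(StandardScaler(), SVC(probability=True, random_state=0)),
    "naive_bayes": GaussianNB(),
    "decision_tree": DecisionTreeClassifier(random_state=0),
}

# Rank single models by 5-fold cross-validated accuracy on the training split.
cv_scores = {
    name: cross_val_score(model, X_train, y_train, cv=5).mean()
    for name, model in models.items()
}

# Assumption: the ensemble keeps the top three models and averages probabilities.
top_three = sorted(cv_scores, key=cv_scores.get, reverse=True)[:3]
ensemble = VotingClassifier(
    estimators=[(name, models[name]) for name in top_three], voting="soft"
)
ensemble.fit(X_train, y_train)

# Held-out accuracy for every single model and for the ensemble.
test_accuracy = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in models.items()
}
test_accuracy["voting_ensemble"] = ensemble.score(X_test, y_test)

for name, acc in sorted(test_accuracy.items(), key=lambda kv: -kv[1]):
    print(f"{name:20s} {acc:.3f}")
```

Printing all scores side by side also shows how far apart the algorithms really are, which is the comparison that matters when asking whether the algorithm or the data drives the result.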

The performance of these models was evaluated using metrics like accuracy, precision, recall, F1-score, and AUC-ROC. The ensemble voting classifier outperformed all standalone models, achieving an accuracy of 96.61% and a precision of 98.72%. Among the six hybrid models, the voting classifier-random forest hybrid model had the best performance, with an AUC of 99.05% (95% confidence interval: 0.93-1.00).
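For readers unfamiliar with these metrics, here is a minimal sketch of how they are computed with scikit-learn. The labels and predicted probabilities are toy values chosen for illustration, not results from the paper, and the 0.5 decision threshold is an assumption.

```python
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)

# Toy values for illustration only: 1 = cancer, 0 = non-cancer.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]

# Assumed decision threshold of 0.5 to turn probabilities into class labels.
y_pred = [int(p >= 0.5) for p in y_prob]

metrics = {
    "accuracy": accuracy_score(y_true, y_pred),    # (TP + TN) / all samples
    "precision": precision_score(y_true, y_pred),  # TP / (TP + FP)
    "recall": recall_score(y_true, y_pred),        # TP / (TP + FN)
    "f1": f1_score(y_true, y_pred),                # harmonic mean of the two above
    "auc_roc": roc_auc_score(y_true, y_prob),      # uses probabilities, not labels
}

for name, value in metrics.items():
    print(f"{name:10s} {value:.4f}")
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUC-ROC is computed from the ranking of the probabilities. AUC is defined on a 0 to 1 scale, so an AUC of 99.05% corresponds to 0.9905 on the same scale as the reported confidence interval.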

Shapley Additive Explanations (SHAP) was used to interpret the model outcomes. It indicated that features such as benign sample diagnosis, TFF1, and LYVE1 had a significant positive impact on the machine-learning model's performance in diagnosing pancreatic cancer. Overall, this study demonstrated the potential of using urine-based biomarkers and machine-learning models for pancreatic cancer diagnosis.
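The Shapley values that SHAP builds on can be illustrated with a small exact computation. The sketch below is not the SHAP library and not the paper's analysis: it enumerates all feature subsets for a hypothetical three-feature linear scoring function, with made-up weights, inputs, and baseline, and replaces absent features with baseline values.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values by enumerating every subset of the other features.

    Features outside a subset are set to their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                # Weight of a coalition of size k that feature i joins.
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical linear score over three made-up features (illustration only).
WEIGHTS = [0.5, 2.0, -1.0]


def predict(z):
    return sum(w * v for w, v in zip(WEIGHTS, z))


x = [1.0, 3.0, 2.0]         # the sample being explained
baseline = [0.0, 1.0, 1.0]  # reference values, e.g. feature means

phi = shapley_values(predict, x, baseline)
print(phi)
```

For a linear function each Shapley value reduces to the weight times the feature's distance from the baseline, and the values sum to the difference between the prediction for `x` and the prediction for the baseline. The SHAP library approximates the same quantity efficiently for larger models.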

The authors probably attribute the result to a good algorithm. Judging from the graph, however, every algorithm performs quite well. The result is therefore more likely due to urine providing high-quality data for pancreatic cancer diagnosis than to any one algorithm being excellent.

Sci Rep. 2025 Apr 23;15(1):14038. doi: 10.1038/s41598-025-98298-0.

 

Youhe Gao

Statement: During the preparation of this work the author(s) used Doubao / AI reading for summarizing the content. After using this tool/service, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the content of the published article.

 

For earlier AI Reads Urine articles:

https://www.keaipublishing.com/en/journals/advances-in-biomarker-sciences-and-technology/ai-reads-urine/

 
