Sheza Malik, MD1, Renisha Redij, MD2, Dushyant S. Dahiya, MD3, Umar Hayat, MD4, Douglas G. Adler, MD, FACG5
1Emory University, Atlanta, GA; 2Trinity Health Livonia Hospital, Livonia, MI; 3University of Kansas School of Medicine, Kansas City, KS; 4Geisinger Wyoming Valley Medical Center, Wilkes-Barre, PA; 5Center for Advanced Therapeutic (CATE), Centura Health, Porter Adventist Hospital, Peak Gastroenterology, Denver, CO
Introduction: Inflammatory Bowel Disease (IBD), including Crohn’s Disease (CD) and Ulcerative Colitis (UC), presents significant challenges in predicting treatment response and long-term remission due to disease heterogeneity. Traditional clinical and endoscopic assessments may be subjective and resource-intensive, necessitating novel approaches. Machine Learning (ML) models have emerged as potential tools for improving predictive accuracy in treatment outcomes, enabling more personalized management strategies for IBD patients.
Methods: Following PRISMA guidelines, a systematic review was conducted to evaluate the application of ML in predicting treatment responses and remission in IBD. Data extraction was performed using the CHARMS checklist, and bias assessment was conducted using the PROBAST tool. Given the heterogeneity in study methodologies, a meta-analysis was not feasible; instead, descriptive statistics summarized the findings.
Results: The systematic review analyzed six studies (3 retrospective cohort analyses and 3 post-hoc analyses of Phase III trials), encompassing 67 to 3,004 IBD patients. ML models, predominantly ensemble methods such as Random Forest and XGBoost, demonstrated low-to-moderate predictive accuracy for treatment response and remission (AUROC: 0.489 to 0.811; sensitivity: 0.46–0.96; specificity: 0.56–0.98). Input features varied widely, including biomarkers (e.g., CRP, fecal calprotectin), endoscopic scores (SES-CD, Mayo), and genetic markers (NOD2). While models integrating multi-modal data achieved superior performance, only two studies employed external validation, and 50% exhibited high risk of bias in statistical analysis due to inadequate handling of missing data or overfitting. Heterogeneity in outcome definitions (clinical vs. biochemical remission) and validation strategies further limited generalizability.
Discussion: ML models show significant promise in predicting treatment outcomes and remission in IBD, potentially enhancing clinical decision-making and reducing reliance on invasive assessments. However, the substantial heterogeneity across studies and methodological limitations highlight the need for larger, prospective studies with standardized ML assessment frameworks. Future research should focus on robust validation, feature selection standardization, and improved interpretability to enhance the clinical applicability of ML in IBD management.
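Note on the reported metrics: the Results summarize model performance as AUROC, sensitivity, and specificity. The following is a minimal sketch of how these three quantities are conventionally defined, not the method used by any of the reviewed studies. The function names, the 0.5 decision threshold, and the eight toy patients are all illustrative assumptions and do not come from the review.

```python
# Illustrative only: toy labels and scores, NOT data from the reviewed studies.

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = responder)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def auroc(y_true, scores):
    """AUROC as the probability that a random responder is scored above a
    random non-responder (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical cohort: 4 responders, 4 non-responders, with model scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.35, 0.7, 0.3, 0.2, 0.1]

# Assumed decision threshold of 0.5 for turning scores into predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = auroc(y_true, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUROC={auc:.3f}")
```

An AUROC of 0.5 corresponds to chance-level ranking, which gives context for the lower end of the reported range (0.489). Sensitivity and specificity depend on the chosen threshold, whereas AUROC does not.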
Figure: PROBAST Score for Risk of Bias Assessment
Disclosures: Sheza Malik indicated no relevant financial relationships. Renisha Redij indicated no relevant financial relationships. Dushyant Dahiya indicated no relevant financial relationships. Umar Hayat indicated no relevant financial relationships. Douglas Adler: Boston Scientific – Consultant.
Sheza Malik, MD1, Renisha Redij, MD2, Dushyant S. Dahiya, MD3, Umar Hayat, MD4, Douglas G. Adler, MD, FACG5. P5421 - Machine Learning in Predicting Treatment Response and Remission in Inflammatory Bowel Disease: A Systematic Review, ACG 2025 Annual Scientific Meeting Abstracts. Phoenix, AZ: American College of Gastroenterology.