Sri Harsha Boppana, MBBS, MD1, Manaswitha Thota, MD2, Gautam Maddineni, MD3, Sachin Sravan Kumar Komati, 4, C. David Mintz, MD, PhD5
1Nassau University Medical Center, East Meadow, NY; 2Virginia Commonwealth University, Richmond, VA; 3Florida State University, Cape Coral, FL; 4Florida International University, Florida, FL; 5Johns Hopkins University School of Medicine, Baltimore, MD
Introduction: Colorectal cancer (CRC) is a leading cause of cancer-related morbidity and mortality worldwide. Microsatellite instability (MSI) is associated with improved prognosis and a better response to immunotherapy in CRC patients. This study aims to develop and validate predictive models using genomic data, clinical variables, and MSI status to improve personalized treatment strategies.
Methods: The study used the MSI/MSS dataset, consisting of 420 colorectal cancer patients with complete genomic profiles, including mutations, copy number variations (CNVs), gene expression, and clinical data. Three machine learning and statistical models were developed. XGBoost classified MSI-high (MSI-H) versus microsatellite stable (MSS) tumors based on genetic and clinical features. Cox proportional hazards models assessed tumor progression and survival time, incorporating genetic features and clinical outcomes. Logistic regression predicted therapy response (chemotherapy or immunotherapy) using genomic markers and clinical data. Data preprocessing included z-score normalization, missing data handling, and synthetic data generation using SMOTE to balance the MSI-H and MSS groups. Model validation was performed with 5-fold cross-validation, and performance was evaluated using accuracy, AUC-ROC, and precision-recall curves.
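The preprocessing and validation steps named in the Methods can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: the toy data, the function names (`zscore_columns`, `smote_like`, `stratified_folds`, `balanced_training_sets`), and the choice of k = 5 neighbours are all assumptions made for illustration. The abstract does not say whether oversampling was applied before or within the cross-validation folds; this sketch oversamples the training portion of each fold only, so that synthetic samples never appear in the held-out data.

```python
import random
import statistics


def zscore_columns(rows):
    """Standardize each feature column to mean 0 and unit (population) SD."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def smote_like(minority, n_new, k=5, rng=None):
    """SMOTE-style oversampling: interpolate between a minority sample
    and one of its k nearest minority-class neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append([a + gap * (b - a) for a, b in zip(base, mate)])
    return synthetic


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign sample indices to folds, keeping class ratios similar per fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % n_folds].append(i)
    return folds


def balanced_training_sets(X, y, n_folds=5, seed=0):
    """Yield (X_train, y_train, X_test, y_test); only the training part is oversampled."""
    rng = random.Random(seed)
    for test_idx in stratified_folds(y, n_folds, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        pos = [x for x, lab in zip(X_tr, y_tr) if lab == 1]
        neg = [x for x, lab in zip(X_tr, y_tr) if lab == 0]
        minority, label = (pos, 1) if len(pos) < len(neg) else (neg, 0)
        n_new = abs(len(pos) - len(neg))
        extra = smote_like(minority, n_new, rng=rng) if n_new else []
        yield (X_tr + extra, y_tr + [label] * len(extra),
               [X[i] for i in test_idx], [y[i] for i in test_idx])


# Hypothetical toy data: 80 "MSS-like" (0) and 20 "MSI-H-like" (1) samples.
_rng = random.Random(42)
X = [[_rng.gauss(0, 1), _rng.gauss(0, 1)] for _ in range(80)]
X += [[_rng.gauss(2, 1), _rng.gauss(2, 1)] for _ in range(20)]
y = [0] * 80 + [1] * 20

# For brevity the whole toy table is standardized at once; a stricter
# pipeline would fit the scaling on each training fold separately.
X = zscore_columns(X)
splits = list(balanced_training_sets(X, y))
```

Each element of `splits` holds a class-balanced training set and an untouched test fold, which could then be passed to any classifier, such as the XGBoost model described above.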
Results: The XGBoost model achieved 87% accuracy in classifying MSI-H versus MSS tumors, with an AUC-ROC of 0.92 and a precision-recall AUC of 0.91. Balancing the MSI-H and MSS classes with SMOTE contributed to this performance. In survival analysis, the Cox model showed that MSI-H patients had significantly longer progression-free survival (PFS) than MSS patients (p < 0.05), with key mutations in TP53, KRAS, and APC correlating with poorer outcomes. The logistic regression model for therapy response prediction achieved 80% accuracy, with MSI-H patients demonstrating a higher likelihood of responding to immunotherapy than MSS patients.
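For readers less familiar with the reported metrics, the following sketch shows one way accuracy, AUC-ROC, and a precision-recall summary can be computed from predicted scores. The labels and scores are made-up toy values, not study data, and the function names are illustrative. The precision-recall summary here is average precision, one common estimator of the area under the precision-recall curve; the abstract does not state which estimator was used.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC-ROC as the probability that a randomly chosen positive
    scores higher than a randomly chosen negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, scores):
    """Mean of the precision values at the rank of each true positive.
    Assumes no tied scores, for simplicity."""
    ranked = sorted(zip(scores, y_true), reverse=True)
    hits, precisions = 0, []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    if not precisions:
        raise ValueError("average precision needs at least one positive")
    return sum(precisions) / len(precisions)


# Hypothetical toy example: 1 = MSI-H, 0 = MSS.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.35]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

acc = accuracy(y_true, y_pred)            # 6 of 8 correct -> 0.75
auc = roc_auc(y_true, scores)             # 14 of 15 pairs ordered correctly
ap = average_precision(y_true, scores)    # (1 + 1 + 0.75) / 3
```

Accuracy depends on the chosen decision threshold, whereas AUC-ROC and average precision summarize ranking quality across all thresholds, which is why the three figures are typically reported together.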
Discussion: This study presents data-driven models that integrate genomic and clinical data to predict tumor progression, therapy response, and microsatellite instability status in colorectal cancer. MSI status and key genetic markers correlate strongly with survival and treatment outcomes, underscoring their value for personalizing therapy. Further validation and real-time clinical integration could enhance these models’ utility in optimizing treatment plans and patient outcomes.
Disclosures: Sri Harsha Boppana indicated no relevant financial relationships. Manaswitha Thota indicated no relevant financial relationships. Gautam Maddineni indicated no relevant financial relationships. Sachin Sravan Kumar Komati indicated no relevant financial relationships. C. David Mintz indicated no relevant financial relationships.
Sri Harsha Boppana, MBBS, MD1, Manaswitha Thota, MD2, Gautam Maddineni, MD3, Sachin Sravan Kumar Komati, 4, C. David Mintz, MD, PhD5. P4574 - Genomic and Clinical Predictive Models for Tumor Progression, Therapy Response, and Microsatellite Instability Status in Colorectal Cancer, ACG 2025 Annual Scientific Meeting Abstracts. Phoenix, AZ: American College of Gastroenterology.