University of Chicago Medicine Inflammatory Bowel Disease Center, Chicago, IL, USA
Julian Lehrer¹, Pavel Brodskiy, PhD¹, Mohammad Haft-Javaherian, PhD¹, Daniel Colucci², Darren Thomason, MBA¹, Klaus Gottlieb, MD, PhD, JD³, David T. Rubin, MD⁴
¹Iterative Health Inc, Cambridge, MA; ²Iterative Health Inc, New York, NY; ³Eli Lilly and Company, Indianapolis, IN; ⁴University of Chicago Medicine Inflammatory Bowel Disease Center, Chicago, IL, USA

Introduction: Artificial Intelligence assessment of Endoscopic Severity and Extent (AI-ESe) is a deep learning approach to continuous assessment of inflammation on endoscopy in ulcerative colitis (UC). AI-ESe measures the endoscopic subscore at discrete segments from the rectum to the maximum extent, generating a granular heatmap of inflammatory activity. We aimed to validate the performance of our segment-level endoscopic subscore assessment model.

Methods: We used a previously developed deep learning model to assess the endoscopic subscore, and adapted the algorithm to optimize performance on segments. A total of 47 endoscopic video recordings from the Phase 3 induction trial for mirikizumab in UC (NCT03518086) and from routine practice were used as a holdout test set. Each video had segments pre-defined every 15 seconds with correction for stalling, aligned with AI-ESe's design. A panel of seven experienced human readers was trained to evaluate the endoscopic subscore, with a minimum quadratic weighted kappa (QWK) of 0.6 required. Three readers independently assigned each segment an endoscopic subscore, and the median was taken as the final segment score to mimic the 2+1 workflow, the current regulatory standard. We evaluated the agreement between the model and human readers.

Results: A total of 823 segments were scored. Complete agreement in the endoscopic subscore among all three readers was achieved in 57.2% of segments, in line with published data on inter-rater variability among human reviewers. Agreement on segment scores between the model and the final score generated via the 2+1 workflow was very good (QWK 0.82; 95% confidence interval 0.79-0.85). Model performance was similar to that of any individual human reader (Spearman correlation range: human-human 0.57-0.85, model-human 0.70-0.87).

Discussion: We demonstrate that our deep learning model accurately assesses the endoscopic subscore at the segment level in UC. This supports the use of AI-ESe to assess inflammation severity at a more granular level in UC.
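The agreement statistics described above (median of three reads as the segment score, the complete-agreement rate, and QWK against that reference) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy data, and the assumption of a four-level 0 to 3 subscore scale are all illustrative, and no confidence interval is computed.

```python
from statistics import median

# Assumption: the endoscopic subscore takes four ordered levels, 0 to 3.
SCORES = (0, 1, 2, 3)

def consensus_score(reads):
    """Median of three independent reads for one segment."""
    return int(median(reads))

def complete_agreement_rate(all_reads):
    """Fraction of segments on which every reader gave the same score."""
    return sum(len(set(r)) == 1 for r in all_reads) / len(all_reads)

def quadratic_weighted_kappa(a, b, categories=SCORES):
    """Cohen's kappa with quadratic disagreement weights."""
    k = len(categories)
    idx = {c: i for i, c in enumerate(categories)}
    n = len(a)
    observed = [[0] * k for _ in range(k)]
    for x, y in zip(a, b):
        observed[idx[x]][idx[y]] += 1
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2     # quadratic weight
            num += w * observed[i][j]           # observed disagreement
            den += w * row[i] * col[j] / n      # disagreement expected by chance
    # Kappa is undefined when both raters use a single identical category.
    return float("nan") if den == 0 else 1.0 - num / den

# Hypothetical toy data: five segments, three reads each, plus model scores.
reads = [(0, 0, 0), (1, 1, 2), (2, 3, 3), (3, 3, 3), (0, 1, 2)]
model = [0, 1, 2, 3, 1]
reference = [consensus_score(r) for r in reads]
print(reference, complete_agreement_rate(reads),
      quadratic_weighted_kappa(model, reference))
```

In practice a library routine such as scikit-learn's `cohen_kappa_score` with `weights="quadratic"` would normally be preferred; the explicit loops here only make the weighting visible.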
Table 1. Key model performance metrics for assessment of the endoscopic subscore against the 2+1 reference standard. Abbreviations: Acc, accuracy; QWK, quadratic weighted kappa.
Figure 1. Spearman correlation matrix for the endoscopic subscore between AI-ESe and all individual human readers.
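A pairwise Spearman matrix of the kind shown in Figure 1 could be computed along the following lines. This is a sketch under simplifying assumptions: every rater (including the model) is assumed to have scored the same segments, whereas in the study each segment was read by three of the seven readers, so real pairs would be restricted to shared segments. Rater names and scores are hypothetical.

```python
from itertools import combinations

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def spearman_matrix(scores_by_rater):
    """Symmetric matrix of pairwise Spearman correlations, as nested dicts."""
    names = list(scores_by_rater)
    matrix = {a: {b: 1.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        rho = spearman(scores_by_rater[a], scores_by_rater[b])
        matrix[a][b] = matrix[b][a] = rho
    return matrix

# Hypothetical scores on six shared segments.
scores = {
    "model":    [0, 1, 1, 2, 3, 3],
    "reader_A": [0, 1, 2, 2, 3, 3],
    "reader_B": [0, 0, 1, 2, 2, 3],
}
matrix = spearman_matrix(scores)
```

Because subscores take only a few levels, ties are common, which is why the ranks are averaged instead of using the no-ties shortcut formula.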
Disclosures: Julian Lehrer: Iterative Health Inc – Employee. Pavel Brodskiy: Iterative Health Inc – Employee. Mohammad Haft-Javaherian: Iterative Health Inc – Employee. Daniel Colucci: Iterative Health Inc – Employee. Darren Thomason: Iterative Health – Employee. Klaus Gottlieb: Eli Lilly – Employee. David Rubin: AbbVie – Advisory Committee/Board Member, Consultant, Speaker fees. Abivax SA – Consultant. Altrubio – Advisory Committee/Board Member, Consultant, Speaker fees, Stock Options. Avalo – Advisory Committee/Board Member, Consultant, Speaker fees. Bausch Health – Consultant. Bristol Myers Squibb – Advisory Committee/Board Member, Consultant, Speaker fees. Buhlmann Diagnostics – Advisory Committee/Board Member, Consultant, Speaker fees. Celltrion – Consultant. ClostraBio – Consultant. Connect BioPharma – Consultant. Cornerstones Health, Inc – Board of Directors membership. Douglas Pharmaceuticals – Consultant. Eli Lilly & Co. – Consultant. Foresee, Genentech (Roche) Inc. – Consultant. Image Analysis Group – Consultant. InDex Pharmaceutical – Consultant. Intouch Group – Advisory Committee/Board Member, Consultant, Speaker fees. Iterative Health – Advisory Committee/Board Member, Consultant, Speaker fees. Iterative Health – Stock Options. Janssen Pharmaceuticals – Consultant. Lilly – Advisory Committee/Board Member, Consultant, Speaker fees. Odyssey Therapeutics – Consultant. Pfizer – Advisory Committee/Board Member, Consultant, Speaker fees. Sanofi – Consultant. Takeda – Advisory Committee/Board Member, Consultant, Grant/Research Support, Speaker fees. Throne – Consultant. Vedanta – Consultant.
Julian Lehrer, Pavel Brodskiy, PhD, Mohammad Haft-Javaherian, PhD, Daniel Colucci, Darren Thomason, MBA, Klaus Gottlieb, MD, PhD, JD, David T. Rubin, MD. P1067 - Segment Level Validation of a Deep Learning Model to Assess Endoscopic Severity in Ulcerative Colitis Using Regulatory-Grade Consensus, ACG 2025 Annual Scientific Meeting Abstracts. Phoenix, AZ: American College of Gastroenterology.