Sri Harsha Boppana, MBBS, MD1, Manaswitha Thota, MD2, Gautam Maddineni, MD3, Sachin Sravan Kumar Komati, 4, Sarath Chandra Ponnada, 5, Sai Lakshmi Prasanna Komati, MBBS6, C. David Mintz, MD, PhD7
1Nassau University Medical Center, East Meadow, NY; 2Virginia Commonwealth University, Richmond, VA; 3Florida State University, Cape Coral, FL; 4Florida International University, Florida, FL; 5Great Eastern Medical School and Hospital, Srikakulam, Srikakulam, Andhra Pradesh, India; 6Government Medical College, Ongole, Ongole, Andhra Pradesh, India; 7Johns Hopkins University School of Medicine, Baltimore, MD
Introduction: Accurate classification of liver disease stage remains essential for guiding treatment and monitoring progression. Traditional assessment relies on invasive biopsy or specialist interpretation of laboratory trends. We leveraged routinely collected demographic and biochemical data to develop and compare five machine-learning algorithms for automated assignment into five clinically relevant categories: donor, hepatitis, fibrosis, cirrhosis, and suspect.
Methods: We assembled a structured dataset of demographic variables and 13 routinely collected laboratory biomarkers from electronic health records. We handled missing values with median imputation for continuous data and mode imputation for categorical data, then engineered ALT/AST and ALP/ALT ratios before applying min–max normalization. To correct for imbalance across five clinically relevant categories (Donor; Hepatitis; Fibrosis; Cirrhosis; Suspect), we employed the Synthetic Minority Oversampling Technique. We split the data 80:20 into training and test sets, trained logistic regression, random forest, XGBoost, stacking ensemble (random forest + XGBoost), and feedforward neural network models using five-fold cross-validation, and evaluated final performance on the held-out test set. We applied SHAP (SHapley Additive exPlanations) to interpret feature contributions.
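The preprocessing steps named in the Methods (median imputation, the ALT/AST and ALP/ALT ratios, min–max normalization, and minority oversampling) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' code: the function names, the three-row toy table, and its values are hypothetical, and `smote_like_sample` is a simplified stand-in for the Synthetic Minority Oversampling Technique that interpolates toward only the single nearest minority neighbour (k = 1) rather than one of several.

```python
import random
from statistics import median


def impute_median(rows, key):
    """Fill missing (None) values of a continuous column with the observed median."""
    fill = median(r[key] for r in rows if r[key] is not None)
    for r in rows:
        if r[key] is None:
            r[key] = fill
    return fill


def add_ratios(rows):
    """Add the ALT/AST and ALP/ALT ratio features (assumes nonzero denominators)."""
    for r in rows:
        r["ALT_AST"] = r["ALT"] / r["AST"]
        r["ALP_ALT"] = r["ALP"] / r["ALT"]


def min_max_scale(rows, keys):
    """Return one feature vector per row, each column rescaled to [0, 1]."""
    scaled = [[] for _ in rows]
    for key in keys:
        lo = min(r[key] for r in rows)
        hi = max(r[key] for r in rows)
        span = hi - lo
        for vec, r in zip(scaled, rows):
            vec.append((r[key] - lo) / span if span else 0.0)
    return scaled


def smote_like_sample(minority, rng):
    """Simplified SMOTE step (k = 1): interpolate between a random minority
    sample and its nearest minority neighbour."""
    base = rng.choice(minority)
    others = [m for m in minority if m is not base]
    neighbour = min(
        others, key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m))
    )
    gap = rng.random()  # position along the segment base -> neighbour
    return [a + gap * (b - a) for a, b in zip(base, neighbour)]


# Hypothetical values for illustration only (not study data).
rows = [
    {"ALT": 20.0, "AST": 25.0, "ALP": 60.0},
    {"ALT": None, "AST": 50.0, "ALP": 80.0},
    {"ALT": 40.0, "AST": 80.0, "ALP": 100.0},
]
for key in ("ALT", "AST", "ALP"):
    impute_median(rows, key)
add_ratios(rows)
X = min_max_scale(rows, ["ALT", "AST", "ALP", "ALT_AST", "ALP_ALT"])

# Pretend the last two rows belong to a minority class and synthesize one sample.
synthetic = smote_like_sample(X[1:], random.Random(0))
```

In practice, imputation statistics, scaling ranges, and synthetic samples are commonly derived from the training portion only, so that the held-out test set does not influence them; the abstract does not state which order was used.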
Results: The feedforward neural network (FNN) achieved the highest accuracy at 95.1%, outperforming XGBoost (93.5%), random forest (92.7%), stacking ensemble (92.7%), and logistic regression (88.6%). Per-class F1 scores for the FNN reached 0.99 for Donor and 0.83 for Cirrhosis, with all classes maintaining precision and recall above 0.80. SHAP analysis identified AST, albumin, and bilirubin as the most influential predictors across classifications. Balanced performance on minority classes and clear differentiation among disease stages underscore the model’s promise for real-time detection and staging of hepatitis C–related liver disease.
Discussion: A feedforward neural network achieved the highest overall accuracy (95.1%) and maintained strong F1 scores across all classes, including minority categories. Interpretability via SHAP highlighted AST, albumin, and bilirubin as primary drivers. Embedding this model within electronic health record systems offers a scalable strategy for real-time screening and staging; prospective, multi-center validation will determine its clinical utility.
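For readers less familiar with the metrics reported in the Results, the following standard-library Python sketch shows how overall accuracy and per-class precision, recall, and F1 are computed from true and predicted labels. The function names and the six toy labels are hypothetical and are not study data; the sketch only illustrates the definitions, not the authors' evaluation code.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred):
    """Return {class: {"precision", "recall", "f1"}} using one-vs-rest counts."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    metrics = {}
    for c in sorted(set(y_true) | set(y_pred)):
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[c] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Hypothetical labels for illustration only (not study data).
y_true = ["Donor", "Donor", "Donor", "Donor", "Cirrhosis", "Cirrhosis"]
y_pred = ["Donor", "Donor", "Donor", "Cirrhosis", "Cirrhosis", "Donor"]

overall = accuracy(y_true, y_pred)
by_class = per_class_metrics(y_true, y_pred)
```

Because accuracy is dominated by the largest class, per-class F1 is the more informative figure for minority categories such as Cirrhosis.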
Figure: Figure 1
Disclosures: Sri Harsha Boppana indicated no relevant financial relationships. Manaswitha Thota indicated no relevant financial relationships. Gautam Maddineni indicated no relevant financial relationships. Sachin Sravan Kumar Komati indicated no relevant financial relationships. Sarath Chandra Ponnada indicated no relevant financial relationships. Sai Lakshmi Prasanna Komati indicated no relevant financial relationships. C. David Mintz indicated no relevant financial relationships.
Sri Harsha Boppana, MBBS, MD1, Manaswitha Thota, MD2, Gautam Maddineni, MD3, Sachin Sravan Kumar Komati, 4, Sarath Chandra Ponnada, 5, Sai Lakshmi Prasanna Komati, MBBS6, C. David Mintz, MD, PhD7. P5865 - Artificial Intelligence-Driven Multiclass Model for Early Detection and Prognostication of Hepatitis C Using EHR-Based Laboratory Data, ACG 2025 Annual Scientific Meeting Abstracts. Phoenix, AZ: American College of Gastroenterology.