Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Posts

portfolio

MedCausalAtlas

Published: April 11, 2026

Automated causal knowledge synthesis from biomedical literature

PanOptic

Published: April 11, 2026

Multidisciplinary AI Board for Dementia Risk Assessment

publications

Prenatal exposure to air pollutants and childhood atopic dermatitis and allergic rhinitis adopting machine learning approaches: 14-year follow-up birth cohort study

Published in Science of the Total Environment, 2021

The incidence of childhood atopic dermatitis (AD) and allergic rhinitis (AR) is increasing. This warrants development of measures to predict and prevent these conditions. We aimed to investigate the predictive ability of a spectrum of data mining methods to predict childhood AD and AR using longitudinal birth cohort data. We conducted a 14-year follow-up of infants born to pregnant women who had undergone maternal examinations at nine selected maternity hospitals across Taiwan during 2000–2005.

Download Paper

Snippet policy network for multi-class varied-length ECG early classification

Published in IEEE Transactions on Knowledge and Data Engineering, 2022

Arrhythmia detection from ECG is an important research subject in the prevention and diagnosis of cardiovascular diseases. Prevailing studies formulate arrhythmia detection from ECG as a time series classification problem. Meanwhile, early detection of arrhythmia meets a real-world demand for early prevention and diagnosis. In this paper, we address the problem of early classification of cardiovascular diseases, which is also a varied-length and long-length time series early classification problem.

Download Paper

Spatio-attention embedded recurrent neural network for air quality prediction

Published in Knowledge-Based Systems, 2022

Predicting the air quality index (AQI) has been regarded as a critical problem for environmental control management. Many factors over time and space may relate to the diffusion of pollutants. In other words, there exist very intricate spatio-temporal interactions among the characteristics that reveal the diffusion of pollutants. Recently, some relevant works have studied AQI prediction considering spatial and temporal correlations simultaneously, but most of them either ignore geospatially topological structures when learning spatio-temporal dependency or use separate sub-modules to encode the spatial and temporal information. Unfortunately, ignoring geospatially topological structures or the correlations among spatial properties and temporal dependencies prevents the AQI prediction model from handling the prediction task well…

Download Paper

Snippet policy network v2: Knee-guided neuroevolution for multi-lead ECG early classification

Published in IEEE Transactions on Neural Networks and Learning Systems, 2022

Early time series classification predicts the class label of a given time series before it is completely observed. In time-critical applications, such as arrhythmia monitoring in the ICU, early treatment contributes to the patient’s fast recovery, and early warning could even save lives. Hence, in these cases, it is worth trading some classification accuracy for earlier decisions while the time series data are still being collected. In this article, we propose a novel deep reinforcement learning-based framework, snippet policy network V2 (SPN-V2), for long and varied-length multi-lead electrocardiogram (ECG) early classification.

Download Paper
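The accuracy-versus-earliness trade-off described in the entry above can be illustrated with a very simple stopping rule. This is only a sketch: it uses a fixed confidence threshold rather than the reinforcement learning policy of SPN-V2, and the function name `classify_early`, the `threshold` parameter, and the sample probabilities are all hypothetical.

```python
def classify_early(snippet_probs, threshold=0.9):
    """Return (label, snippets_used) once a class probability reaches threshold.

    snippet_probs: per-snippet class-probability lists in arrival order
    (hypothetical classifier output, not taken from the paper).
    Falls back to the prediction on the last snippet if the threshold
    is never reached.
    """
    if not snippet_probs:
        raise ValueError("need at least one snippet")
    for t, probs in enumerate(snippet_probs, start=1):
        # Index of the most probable class for the data seen so far.
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            return best, t
    return best, len(snippet_probs)


# Hypothetical stream: confidence in class 1 grows as more snippets arrive.
stream = [[0.5, 0.5], [0.3, 0.7], [0.05, 0.95], [0.01, 0.99]]
label, used = classify_early(stream, threshold=0.9)
```

Lowering `threshold` yields earlier decisions at the cost of acting on less certain predictions, which is the trade-off the abstract refers to.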

A novel constraint-based knee-guided neuroevolutionary algorithm for context-specific ECG early classification

Published in IEEE Journal of Biomedical and Health Informatics, 2022

Cardiovascular diseases (CVDs) are considered the greatest threat to human life according to the World Health Organization. Early classification of CVDs and appropriate follow-up treatment are crucial for preventing sudden deaths. The electrocardiogram (ECG) is one of the most common non-invasive tools used to evaluate the state of the heart, and it can also be exploited for automatic diagnosis. However, the importance of diagnosing CVDs varies across context-specific scenarios…

Download Paper

Neural Network to Identifying Type 2 Diabetes (T2D) Progression Subphenotypes

Published in American Diabetes Association, 2024

T2D is a heterogeneous disease with variations in presentation, progression, and response to treatments across individuals. We developed a novel GNN-based framework to identify distinct T2D progression pathways using electronic health records (EHR) data.

Download Paper

A scoping review of fair machine learning techniques when using real-world data

Published in Journal of Biomedical Informatics, 2024

The integration of artificial intelligence (AI) and machine learning (ML) in health care to aid clinical decisions is widespread. However, as AI and ML take on important roles in health care, there are concerns about the fairness and bias associated with them. That is, an AI tool may have a disparate impact, with its benefits and drawbacks unevenly distributed across societal strata and subpopulations, potentially exacerbating existing health inequities. Thus, the objectives of this scoping review were to summarize existing literature and identify gaps in the topic of tackling algorithmic bias and optimizing fairness in AI/ML models using real-world data (RWD) in health care domains.

Download Paper

Evolution of digital twins in precision health applications: a scoping review study

Published in NPJ Digital Medicine, 2024

An increasing amount of research is incorporating the concept of digital twins in biomedical and health care applications. This scoping review summarizes existing research and identifies gaps in the development and use of digital twins in the health care domain. We reviewed the current state of digital twins for precision health, including their definitions, frameworks, enabling technologies, and applications across various health domains.

Download Paper

A fair individualized polysocial risk score for identifying increased social risk in type 2 diabetes

Published in Nature Communications, 2024

Racial and ethnic minorities bear a disproportionate burden of type 2 diabetes (T2D) and its complications, with social determinants of health (SDoH) recognized as key drivers of these disparities. Implementing efficient and effective social needs management strategies is crucial…

Download Paper

An Interpretable Population Graph Network to Identify Rapid Progression of Alzheimer’s Disease Using UK Biobank

Published in AMIA Annual Symposium Proceedings, 2024

This study proposes an interpretable population graph network framework for identifying rapid progressors of Alzheimer’s Disease by utilizing patient information from electronic health-related records in the UK Biobank. The framework creates a patient similarity graph in which each AD patient is represented as a node, with edges defined by the distance between patients’ clinical characteristics, and applies graph neural networks (GNNs) and a GNN Explainer with SHAP analysis for interpretability.

Download Paper
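As a rough illustration of the patient similarity graph mentioned in the entry above, the sketch below links each patient to its nearest neighbors in feature space. The abstract does not say which distance or edge rule the study uses, so Euclidean distance, the k-nearest-neighbor rule, the function name `build_similarity_graph`, and the toy feature vectors are all assumptions.

```python
import math


def build_similarity_graph(features, k=2):
    """Connect each patient to its k nearest neighbors by Euclidean distance.

    features: dict mapping a patient id to a list of numeric clinical
    features (hypothetical values, assumed to be already scaled).
    Returns a set of undirected edges as sorted (id, id) tuples.
    """
    edges = set()
    for pid, vec in features.items():
        # Distances from this patient to every other patient, nearest first.
        dists = sorted(
            (math.dist(vec, other), oid)
            for oid, other in features.items()
            if oid != pid
        )
        for _, oid in dists[:k]:
            # Store each undirected edge once, with ids in sorted order.
            edges.add(tuple(sorted((pid, oid))))
    return edges


# Toy cohort: two pairs of patients with similar characteristics.
patients = {
    "p1": [0.0, 0.0],
    "p2": [0.1, 0.0],
    "p3": [5.0, 5.0],
    "p4": [5.1, 5.0],
}
graph = build_similarity_graph(patients, k=1)
```

A graph built this way could then be passed to a GNN library as an edge list; that step is omitted here because the abstract does not specify the tooling.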

Cerebra: A Multidisciplinary AI Board for Multimodal Dementia Characterization and Risk Assessment

Published in arXiv, 2025

Download Paper

Optimizing Strategy for Lung Cancer Screening: From Risk Prediction to Clinical Decision Support

Published in JCO Clinical Cancer Informatics, 2025

This study proposes an advanced pipeline that integrates machine learning (ML) and causal inference techniques to optimize lung cancer screening decisions. Using real-world data from the OneFlorida+ Clinical Research Consortium, we developed ML models to predict individual lung cancer risk and estimate the benefits of LDCT screening, and applied explainable artificial intelligence techniques to identify key risk factors. The models demonstrated predictive performance with AUCs of 0.777 and 0.793 for 1-year and 3-year risk predictions, respectively.

Download Paper

From Image to Report: Automating Lung Cancer Screening Interpretation and Reporting with Vision-Language Models

Published in Journal of Biomedical Informatics, 2025

We present LUMEN (Lung cancer screening with Unified Multimodal Evaluation and Navigation), a multimodal AI framework that mimics the radiologist’s workflow by identifying nodules in LDCT images, generating their characteristics, and drafting corresponding radiology reports in accordance with reporting guidelines. LUMEN integrates computer vision, vision-language models (VLMs), and large language models (LLMs) to automate lung cancer screening interpretation and reporting.

Download Paper

Identifying Alzheimer’s Disease Progression Subphenotypes via a Graph-based Framework using Electronic Health Records

Published in Journal of Healthcare Informatics Research, 2026

This study developed a novel approach that combines a graph neural network (GNN)-based framework with time series clustering to characterize progression subphenotypes from MCI to AD. Applied to a real-world cohort of 2,525 patients (61.66% female; mean age 76 years), the model identified four distinct progression subphenotypes, each exhibiting characteristic clinical patterns, with average MCI-to-AD progression times ranging from 805 to 1,236 days. The findings indicate that AD does not follow a uniform progression trajectory but instead manifests heterogeneous pathways.

Download Paper

Temporally Detailed Hypergraph Neural ODEs for Type 2 Diabetes Progression Modeling

Published in International Conference on Learning Representations (ICLR), 2026

We propose Temporally Detailed Hypergraph Neural Ordinary Differential Equation (TD-HNODE), which represents disease progression on clinically recognized trajectories as a temporally detailed hypergraph and learns the continuous-time progression dynamics via a neural ODE framework. Experiments on two real-world clinical datasets demonstrate that TD-HNODE outperforms multiple baselines in modeling the progression of type 2 diabetes and related cardiovascular diseases.

Download Paper

talks

Care coordination and patient safety outcome: a network study: Celebration of Research

Published: February 27, 2023

This is a presentation of the study “Care coordination and patient safety outcome: a graph-based approach”

Developing A Fair Individualized Polysocial Risk Score (iPsRS) for Identifying Increased Social Risk of Hospitalizations in Patients with Type 2 Diabetes (T2D)

Published: June 24, 2023

This is a presentation of the study “Developing A Fair Individualized Polysocial Risk Score (iPsRS) for Identifying Increased Social Risk of Hospitalizations in Patients with Type 2 Diabetes (T2D)”

Preconference Course: Artificial Intelligence for Pharmacoepidemiology Research: An Introduction (Types of machine learning methods and algorithms)

Published: April 14, 2024

This is a course about the foundations of machine learning methods.

Health Digital Twin: AI and Machine Learning Meet Real-world Data

Published: September 06, 2024

This is a talk about how to create a health digital twin using real-world data.

Real-World Data to Real-World Evidence: Successes, Challenges, and Opportunities

Published: September 12, 2025

Real-world data (RWD), including electronic health records (EHR), claims, and billing data, capture detailed, longitudinal patient information, covering demographics, comorbidities, treatments, and outcomes. They have become a vital data source reflecting real-world patient populations and clinical settings. Applying AI/ML models to RWD has generated insights and real-world evidence (RWE), such as predicting disease onset risk, modeling progression pathways, and supporting more nuanced patient stratification strategies. AI-derived RWE increasingly informs clinical and regulatory decisions.

Yu Huang
