“Distributed Learning from Multiple EHR Databases: Contextual Embedding Models for Predicting Medical Events”
Abstract: Electronic health record (EHR) data offer great promise for personalized medicine. However, EHR data also present significant analytical challenges because of their irregularity and complexity. In addition, analyzing EHR data raises privacy concerns, and sharing such data across multiple institutions or sites may be infeasible because of regulatory hurdles. Building on a contextual embedding model, we propose a distributed learning method that learns from multiple EHR databases and builds predictive models for multiple diagnoses simultaneously. The method can use both structured and unstructured data. We also augment the proposed method with differential privacy to further strengthen privacy protection. Our numerical studies demonstrate that the proposed method can build predictive models in a distributed fashion with privacy protection, and that the resulting models achieve prediction accuracy comparable to that of existing methods that use data pooled across all sites. This is joint work with Ziyi Li, Kirk Roberts, and Xiaoqian Jiang.
Division of Biostatistics seminars
For inquiries contact Chengjie Xiong.