Projects & Code
Computer Vision
As part of a European Union Horizon 2020 grant, I led a team that built a deep learning model for automated classification of CT scan contrast phases. This allowed our collaborators at Oxford to curate the data necessary for digital contrast.
Performance: 95%+ accuracy
Lilly's TuneLab
I worked on TuneLab, an AI/ML platform launched by Eli Lilly that gives biotech companies access to drug discovery models trained on Lilly's proprietary data. TuneLab runs on Rhino's federated computing platform, so small biotech firms can leverage Lilly's AI capabilities while each party's sensitive data stays protected. This application is the largest active deployment of federated computing across enterprises to date.
Impact: Enabled 15+ pharmaceutical partnerships while maintaining HIPAA compliance.
Data Harmonization Engine (RhinoDHE)
Led the development of RhinoDHE, which uses GenAI to streamline data harmonization with human-in-the-loop validation while ensuring that data stays behind the data custodians' firewalls.
RAG (Retrieval-Augmented Generation) System for Healthcare Benefits
Created a chatbot for Rightway's care coordination staff to submit queries about the insurance coverage of a specific procedure or medication. Required parsing of Explanation of Benefits documents from a handful of distinct health insurers.
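The retrieval step of a RAG system like this can be illustrated with a toy sketch. It is not the production implementation: it assumes bag-of-words cosine similarity in place of whatever embedding model and vector store the real system used, and the passages, `retrieve`, and `build_prompt` are hypothetical names and made-up sample text for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=2):
    """Return the k passages most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, passages, k=2):
    """Assemble a grounded prompt from the top-k retrieved passages."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Made-up benefit snippets standing in for parsed insurer documents.
PASSAGES = [
    "Plan A covers physical therapy up to 20 visits per year.",
    "Plan A requires prior authorization for MRI scans.",
    "Plan B excludes cosmetic surgery.",
]

if __name__ == "__main__":
    print(build_prompt("Does Plan A require prior authorization for an MRI?", PASSAGES))
```

The prompt returned by `build_prompt` would then be passed to a language model. Restricting the answer to retrieved passages is what keeps coverage answers tied to the insurer's own documents.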
Visual Analytics for Chronic Disease Care
Interactive visualization system for pattern recognition in patient-generated health data. Features automated anomaly detection and temporal pattern discovery.
Published: JAMIA
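As a rough illustration of automated anomaly detection on patient-generated readings, here is a minimal sketch using a rolling z-score. The description above does not say which detection method the system used, so this technique, the function name `rolling_zscore_anomalies`, and the sample readings are all assumptions.

```python
from statistics import mean, stdev


def rolling_zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices whose value deviates from the preceding `window`
    readings by more than `threshold` sample standard deviations."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A flat history gives no scale to judge deviation against.
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    # Hypothetical daily readings with one spike at index 6.
    readings = [100, 102, 98, 101, 99, 100, 180, 101, 100]
    print(rolling_zscore_anomalies(readings))
```

Comparing each reading only against its recent history lets the baseline adapt to an individual patient rather than relying on a fixed population threshold.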
Clinical NLP for HIV Risk
NLP pipeline to extract HIV risk factors from unstructured clinical notes. Includes named entity recognition, relation extraction, and risk scoring algorithms.
Results: 85%+ accuracy in automated risk identification
Visual Analytics for Patient-Generated Data
Interactive visualization system for pattern recognition in patient-generated health data. Features automated anomaly detection and temporal pattern discovery.
Published: JAMIA
NLP System for Social Determinants of Health
Semi-supervised learning system for extracting social and behavioral determinants of health from clinical notes. Created a gold-standard annotated corpus for training.
Published: AMIA 2018
OMOP Analysis Utilities
Python utilities for working with OMOP Common Data Model databases. Includes cohort builders, data quality checks, and common analysis patterns.
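A cohort builder of the kind described above might look like the following minimal sketch. It is not the utilities' actual API: `build_condition_cohort` and `demo_connection` are hypothetical names, the in-memory SQLite table holds made-up rows, and only three columns of the OMOP `condition_occurrence` table are modeled. The concept IDs in the demo data are illustrative.

```python
import sqlite3


def build_condition_cohort(conn, concept_ids):
    """Return {person_id: earliest condition_start_date} for people with
    at least one condition_occurrence row matching the given concept IDs."""
    concept_ids = list(concept_ids)
    if not concept_ids:
        return {}
    placeholders = ",".join("?" for _ in concept_ids)
    query = f"""
        SELECT person_id, MIN(condition_start_date)
        FROM condition_occurrence
        WHERE condition_concept_id IN ({placeholders})
        GROUP BY person_id
    """
    return dict(conn.execute(query, concept_ids).fetchall())


def demo_connection():
    """Create an in-memory database with a tiny slice of the OMOP schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE condition_occurrence ("
        "person_id INTEGER, "
        "condition_concept_id INTEGER, "
        "condition_start_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO condition_occurrence VALUES (?, ?, ?)",
        [
            (1, 201826, "2020-01-10"),
            (1, 201826, "2019-03-01"),
            (2, 320128, "2018-05-05"),
            (3, 201826, "2021-07-15"),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(build_condition_cohort(conn, [201826]))
```

Dates are stored as ISO 8601 strings here, so `MIN` picks the earliest date by plain string comparison. A real OMOP database would typically use a native date type.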
Population Health Dashboards
Interactive dashboards for population health management and quality reporting. Real-time visualization of care gaps, outcome metrics, and health disparities.