I study learning from structured data, motivated by applications in healthcare and social science. Specifically, my research focuses on high-dimensional data with underlying structure, such as graphs/networks, low-dimensional manifolds, or sparsity, and explores how knowledge of that structure can be exploited to solve challenging inference problems more accurately and quickly.

Mathematically, I am interested in using data geometry to develop nonconvex optimization algorithms with provable performance guarantees, drawing on signal processing, machine learning, optimization, information theory, network science, spectral graph theory, and (high-dimensional) statistics.

In author lists, * denotes equal contribution and † indicates alphabetical ordering.

Data Theory and Methods

Graph-structured data

Graph regularization for trend filtering (applied to transportation data), matrix factorization (applied to remote sensing), and federated multi-task learning (applied to census data).

Data supported on low-dimensional structure

Optimal transport

  • A Blob Method for Dynamic Optimal Transport with State and Control Constraints. Ongoing work with Katy Craig and Karthik Elamvazhuthi.

Computational Social Science Applications

Science of science

Network analysis and natural language processing to bring together traditionally disjoint scientific fields in diverse intelligence (including cognitive science and ML/AI).

Data science for social good

  • Power analysis of a statistical test to quantify gerrymandering. Ongoing work with Ranthony AC Edmonds, Susan Glenn, and Soledad Villar.

  • Branching models for gender imbalance in math. Ongoing work with Heather Zinn Brooks, Phil Chodrow, Anna Haensch, Mason Porter, and Juan Restrepo.

Healthcare Applications

Speech and audio

Deep learning for non-semantic speech and audio signals, such as coughing and sneezing.

Sleep health

Pediatric sleep scoring.

Risk model from long-term EEG signals.