I am an Assistant Professor in the Department of Statistics and Actuarial Science at the University of Iowa. I did my PhD in the Department of Biostatistics at the University of Michigan, working with Professor Peter X.K. Song. My research focuses on developing real-time analytics to address methodological needs in the analysis of streaming data, such as periodically updated large-scale databases and mobile health data. In particular, I am interested in developing an online analysis toolbox for regression models, with a main focus on statistical inference. Specific topics I am currently working on include online high-dimensional statistical inference, addressing dynamic heterogeneity and inter-data batch correlation in streaming data, and online semi-parametric regression.

- Streaming processing
- Change-point detection
- Sequential testing
- Mobile health

PhD in Biostatistics, 2020

University of Michigan - Ann Arbor

MS in Biostatistics, 2016

University of Michigan - Ann Arbor

BS in Biology, 2013

Huazhong University of Science and Technology

MORA is a real-time regression analysis method that incorporates both dynamics and inter-data batch correlation.

An incremental data analytic based on the quadratic inference function (QIF) for analyzing streaming correlated datasets, with a screening procedure for abnormal data batches.

Detection and prediction of ovulation by incorporating biorhythm information into the processing of high-frequency basal body temperature measurements via a hidden Markov model (HMM).

A hybrid paradigm that integrates online streaming processing into each parallelized data process in a MapReduce framework.

Accepted & Published

Multivariate Online Regression Analysis with Heterogeneous Streaming Data.
The Canadian Journal of Statistics (Accepted).

(2021).
Renewable Estimation and Incremental Inference in Generalized Linear Models with Streaming Datasets.
Journal of the Royal Statistical Society: Series B, 82, Part 1, 69-97.

(2020).
Detection and Prediction of Ovulation from Body Temperature Measured by An In-Ear Wearable Thermometer.
IEEE Transactions on Biomedical Engineering.

(2019).
Identification of gene pairs through penalized regression subject to constraints.
BMC Bioinformatics.

(2017).
Novel EDA p.Ile260Ser Mutation Linked to Non-syndromic Hypodontia.
Journal of Dental Research.

(2013).

Submitted & Under revision

Two dominant distributed computing strategies have emerged to overcome the computational bottleneck of supervised learning with big …

This paper develops an incremental data analytic based on quadratic inference function (QIF) to analyze streaming datasets with …

Modern longitudinal data, for example from wearable devices, measures biological signals on a fixed set of participants at a diverging …

This course is intended for lower-level undergraduate students. The goal is to equip students with the knowledge and skills needed to tackle real-world data analysis challenges. The course covers basic statistical concepts and computing skills in the field of data science. Topics to be covered include:

- R basics (importing data, data types, sorting and summarizing)
- data visualization with ggplot2, robust summaries
- intro to probability, statistical inference, regression models
- intro to machine learning (classification, clustering, and prediction).

This course is a continuation of STAT4100. It is intended for upper-level undergraduate students in the mathematical sciences
as well as for graduate students in all disciplines. The goal is to give students a solid foundation in the theory and methods of statistical inference.
Main topics include:

- point estimation and confidence intervals
- convergence in distribution and convergence in probability
- maximum likelihood methods
- sufficient statistics
- hypothesis testing.

This is a course in mathematical statistics intended for upper-level undergraduate students
in the mathematical sciences as well as for graduate students in all disciplines. The goal is
to provide a solid foundation in the theory of random variables and probability distributions.

- Probability and distributions (sets, random variables, expectation, important inequalities)
- Multivariate distributions (joint/marginal distributions, transformation, independence)
- Some special distributions (Binomial, Negative Binomial, Geometric, Poisson, Normal)

This is an event to encourage collaboration between biostatistics
and statistics PhD students. Yinqiu He (from stats) and I proposed a project entitled
“real-time regression analysis of streaming health datasets with heterogeneity and correlation”.

The Rackham Predoctoral Fellowship supports outstanding doctoral
students who have achieved candidacy and are actively working on dissertation research
and writing.