Research

MORA is a real-time regression analysis method that incorporates both dynamics and inter-data batch correlation.

An incremental data analytic based on the quadratic inference function (QIF) for analyzing streaming correlated datasets, with a screening procedure for abnormal data batches.

Detection and prediction of ovulation by incorporating biorhythm information when processing high-frequency basal body temperature measurements via a hidden Markov model (HMM).

PASA

A hybrid paradigm that integrates online streaming processing into each parallelized data process in a MapReduce framework.

A new framework of real-time estimation and incremental inference in generalized linear models (GLMs) with cross-sectional data streams.

Publications

Submitted & Under revision

Two dominant distributed computing strategies have emerged to overcome the computational bottleneck of supervised learning with big …

This paper develops an incremental data analytic based on quadratic inference function (QIF) to analyze streaming datasets with …

Modern longitudinal data, for example from wearable devices, measures biological signals on a fixed set of participants at a diverging …

Teaching

Introduction to Data Science (STAT1015, new course)

University of Iowa

Aug 2021 – Dec 2021 Iowa City, Iowa
This course is intended for lower-level undergraduate students. The goal is to equip students with the knowledge and skills needed to tackle real-world data analysis challenges. The course covers basic statistical concepts and computing skills in the field of data science. Topics to be covered include:

  • R basics (importing data, data types, sorting and summarizing)
  • data visualization with ggplot2, robust summaries
  • intro to probability, statistical inference, regression models
  • intro to machine learning (classification, clustering, and prediction).
Mathematical Statistics II (STAT4101)

University of Iowa

Jan 2021 – May 2021 Iowa City, Iowa
This course is a continuation of STAT4100. It is intended for upper-level undergraduate students in the mathematical sciences as well as for graduate students in all disciplines. The goal is to give students a solid foundation in the theory and methods of statistical inference. Main topics include:

  • point estimation and confidence intervals
  • convergence in distribution and convergence in probability
  • maximum likelihood methods
  • sufficient statistics
  • hypothesis testing.
Mathematical Statistics I (STAT4100)

University of Iowa

Aug 2020 – Dec 2020 Iowa City, Iowa
This is a course in mathematical statistics intended for upper-level undergraduate students in the mathematical sciences as well as for graduate students in all disciplines. The goal is to provide a solid foundation in the theory of random variables and probability distributions. Main topics include:

  • Probability and distributions (sets, random variables, expectation, important inequalities)
  • Multivariate distributions (joint/marginal distributions, transformation, independence)
  • Some special distributions (Binomial, Negative Binomial, Geometric, Poisson, Normal)

Awards

Student Paper Competition Award for Statistical Learning and Data Science (SLDS) 2020

Winning Proposal in the 2020 Joint Shark Tank Retreat

This event encourages collaboration between biostatistics and statistics PhD students. Yinqiu He (from statistics) and I proposed a project entitled “real-time regression analysis of streaming health datasets with heterogeneity and correlation”.

Rackham Predoctoral Fellowship for Academic Year 2019-2020

The Rackham Predoctoral Fellowship supports outstanding doctoral students who have achieved candidacy and are actively working on dissertation research and writing.

2019 ENAR Distinguished Student Paper Award

2018 MIDAS Annual Symposium Poster Award of Most Innovative Use of Data