About me
I am a first-year Master's student in Data Science at Harvard and recently graduated from UCLA with a double major in Statistics and Math/Econ. My research interest is statistical machine learning, especially its applications in genomics and bioinformatics. I am grateful to have been advised by Prof. Jessica Jingyi Li and Prof. Mathieu Bauchy in my undergraduate research. I have also interned at Adobe, Thumbtack, TAL Education Group, and onomy.
My research focuses on developing statistical tools for single-cell RNA sequencing data. I built ML-based cell type similarity trees to reduce ambiguity and subjectivity in cell type annotation, and designed an R package for scGTM (Cui et al., 2022), which uses a GAM-motivated model to fit gene expression trends along cell pseudotime.
Besides applying statistics to biology, I am also interested in model interpretability and robustness. I conducted research on symbolic regression to predict a material's fracture energy from geometric features, as a way to overcome the interpretability issues of CNNs.
Here is a list of machine learning projects I completed as an intern:
- Sales Lead Score Model
- Click Sequence Clustering
- Reddit Sentiment Analysis and Topic Modeling
- Markov Chain x Customer Adoption Journey