CV
Updated 2021-01
Education
- Ph.D. in Biostatistics, New York University, 2018-2022
- M.S. in Biostatistics, Duke University, 2016-2018
- Bachelor of Medicine in Clinical Medicine (5-year program), Peking University, 2011-2016
- B.S. in Statistics (double major), Peking University, 2013-2016
Experience
- Data Science Intern, Duke Clinical Research Institute (DCRI), Durham, NC, 2016-2018
- Prediction Model for Necrotizing Enterocolitis
- Conducted exploratory analysis of raw data collected from Duke hospitals
- Applied regular expressions to extract information from large datasets
- Built logistic regression models to predict necrotizing enterocolitis
- Data Science Intern, Duke University Department of Sociology, Durham, NC, 2016-2017
- Conducted a meta-analysis of the correlation between church strictness and church growth
- Conducted a systematic review of relevant papers and extracted data
- Conducted the meta-analysis accounting for the hierarchical data structure and built a model of the correlations
- Intern (similar to a resident), Beijing Jishuitan Hospital (北京积水潭医院), Beijing, China, 2014-2016
- Received training for clinicians and surgeons
- Rotated in 15 clinical departments including Cardiology, Surgery, Gynecology, Obstetrics, Pediatrics
- Assisted doctors with their work, including conducting physical examinations of patients and writing medical records
- Aided in surgeries including appendectomy and TIPS (transjugular intrahepatic portosystemic shunt)
Honors and Awards
Poster Award from International Chinese Statistical Association (ICSA) Applied Statistics Symposium, 2021
Eisai (China) Pharmaceutical Scholarship, 2014
China National Scholarship, 2013
Beijing Excellent Student Award, 2013
Peking University Excellent Learning Award, 2012, 2014
Publications
Turner EL, Yao L, Li F, Prague M. Properties and pitfalls of weighting as an alternative to multilevel multiple imputation in cluster randomized trials with missing binary outcomes under covariate-dependent missingness. Statistical Methods in Medical Research. 2019 Jul 11:0962280219859915.
Wang, B., Yao, L., Hu, J. and Li, H., 2020. A New Algorithm for Convex Biclustering and Its Extension to the Compositional Data. arXiv preprint arXiv:2011.12182.
Zhang Z, Meng L, Ni C, Yao L, Zhang F, Jin Y, Mu X, Zhu S, Lu X, Liu S, Yu C. Engineering Escherichia coli to bind to cyanobacteria. Journal of Bioscience and Bioengineering. 2017 Mar 1;123(3):347-52.
R Packages
- biADMM: A New Algorithm for Convex Biclustering
Talks and Posters
Extracting Scalar Measures from Functional Data with Missingness, Eastern North American Region, 2021
A Single Index Model for Longitudinal Outcomes to Optimize Individual Treatment Decision Rules, Joint Statistical Meetings, American Statistical Association, 2020
Contributed Posters:
Discovering Linear Biosignatures for Treatment Response: A Convexity-Based Clustering Approach, Tom Ten Have Symposium on Statistics in Mental Health, Yale University, 2019
Discovering Linear Biosignatures for Treatment Response Based on Maximizing Kullback-Leibler Divergence in Linear Mixed-Effects Models, Precision Health Interest Group Meetings, NYU Grossman School of Medicine, 2019
Teaching
Teaching Assistant: Rigor & Reproducibility, Fall 2020, NYU School of Medicine
Student Tutor: Statistical Inference, Fall 2020, Spring 2021, Fall 2021
Scientific Memberships
Eastern North American Region, International Biometric Society, 2021-Present
American Statistical Association, 2018-Present
Skills
- R, Python, SQL, SAS, C, C++, MATLAB, SPSS, SPSS Modeler, EViews
- Microsoft Office (Excel, Word, PowerPoint), Adobe Photoshop, Adobe After Effects
- Molecular cloning, Western blotting, PCR, ELISA, Protein Purification, Protein Engineering, Fluorescence Microscopy, Flow Cytometry
Competitions
HHS Opioid Code-a-Thon, Dec 2017
Project description
• Cleaned and summarized National Survey data provided by the Department of Health and Human Services (HHS)
• Performed principal component analysis (PCA) and built a topic model to classify the risk of opioid abuse
• Provided an innovative and user-friendly solution for opioid abuse based on our algorithm
• Ranked third in the Usage Track
ASA DataFest, Apr 2017
Project description
• Cleaned and summarized data provided by Expedia
• Plotted heat maps to visualize the data
• Applied support vector machines (SVM) to classify Expedia international hotel income
• Provided a practical solution to help Expedia save money
Undergraduates Innovative Experimentation Competition, Apr 2015 – Aug 2016
- This is a competition for undergraduates at Peking University. Students who join the competition propose research topics and then carry out the research themselves. I worked with four other students and was advised by Aimei Dong, a doctor at Peking University No.1 Hospital. We investigated the relationship between male climacteric syndrome and metabolic syndrome by analyzing the patient populations of three hospitals in Beijing.
• Collected patient data by random cluster sampling from three hospitals
• Designed questionnaires based on the Aging Males’ Symptoms (AMS) questionnaire
• Conducted statistical tests to assess the relationship between patients’ questionnaire scores and the severity of their metabolic syndrome
• Performed principal component analysis (PCA) to identify the questions with higher weights in the questionnaire, then repeated the statistical analysis above using the PCA results
Interdisciplinary Contest in Modeling (ICM), Feb 2016
Interdisciplinary Contest in Modeling (ICM), Feb 2015
Mathematical Contest in Modeling (MCM), Feb 2014
• Provided an evaluation method for high-school basketball coaches
• Used web scraping in Python to collect data from websites
• Applied the analytic hierarchy process (AHP) to weight the coefficients and used fuzzy preference programming (FPP) to optimize the model
• Won an “Honorable Mention” prize
China Undergraduate Mathematical Contest in Modeling (CUMCM), Sep 2013
• Analyzed road traffic capacity under the conditions of a traffic accident
• Built an improved traffic-flow model based on M/M/1 queueing theory
• Won second prize