Yishu Xue

Data Scientist / Coder / Novice Sprinter / Gym Enthusiast

Google Inc.


Hello, I am Yishu Xue. I hold a Ph.D. in Statistics and am passionate about data science. I love data and programming; together they help us find truths about the world. I am also interested in anthropology, foreign languages, aviation, and photography, and I enjoy tinkering with computers. I have been weight training and getting into running for the last six months. My Strava running badge is included here as a minor motivator to keep me moving.

I like cats a lot. I have a seal lynx mitted ragdoll girl named Adira, though I most often call her “猛虎”, which reads měng hǔ and means “tiger” in Chinese. She was born on 04/20/2019 in Middletown, PA, and in only two years she has grown into a big, gorgeous cat. She brings a lot of happiness to my life. Seeing her grow up gives me a huge sense of accomplishment, perhaps more than finishing a paper.



tidyverse + MCMC + survival analysis


Pandas + NumPy + Plotly + Dash

Machine/Deep Learning

TensorFlow + PyTorch + Scikit-learn + XGBoost


version control + website building


bash + distributed computing


MySQL + database design


Data Scientist, Engineering
Jul 2021 – Present San Bruno, CA
Data Scientist, Data Science Leadership Development Program
Travelers Insurance
Jul 2019 – Jul 2021 Hartford, CT

Recent Publications

  • Refereed Journal Publications
    Hu, G., Yang, H.-C., Xue, Y.*, and Dey, D.K. (2022+) Zero Inflated Poisson Model with Clustered Regression Coefficients: an Application to Heterogeneity Learning of Field Goal Attempts of Professional Basketball. Canadian Journal of Statistics, Forthcoming (Journal Website, arXiv)
    Hu, G., Xue, Y., and Ma, Z. (2021+) Bayesian Clustered Coefficients Regression with Auxiliary Covariates Assistant Random Effects. Statistical Modelling, Forthcoming (Journal Website, arXiv, GitHub)
    Yang, H.-C.+, Xue, Y.+, Liu, Q., Pan, Y. and Hu, G. (2021+) Time Fusion Coefficient SIR Model with Application to COVID-19 Epidemic in the United States. Journal of Applied Statistics, special issue "Statistical Perspectives on Analytics for COVID-19 Data", Forthcoming (Journal Website, arXiv, Supplemental Code)
    Xue, Y., and Hu, G. (2021). Online Updating of Information Based Model Selection in the Big Data Setting. Communications in Statistics – Simulation and Computation 50(11), 3516-3529 (Journal Website)
    Yang, H.-C., Geng, L., Xue, Y.*, and Hu, G. (2022) Spatial Weibull Regression with Multivariate Log Gamma Process and Its Applications to China Earthquake Economic Loss. Statistics and Its Interface 15(1), 29-38 (Journal Website, arXiv)
    Ma, Z., Xue, Y.*, and Hu, G. (2021) Geographically Weighted Regression Analysis for Spatial Economics Data: a Bayesian Recourse. International Regional Science Review 44(5), 582-604. (Journal Website, arXiv)
    Hu, G., Xue, Y.*, and Huffer, F. (2021) A Comparison of Bayesian Accelerated Failure Time Models with Spatially Varying Coefficients. Sankhya B 83, 541-557 (Journal Website, arXiv)
    Xue, Y., Yan, J., and Schifano, E.D. (2021) Simultaneous Monitoring for Regression Coefficients and Baseline Hazard Profile in Cox Modeling of Time-to-Event Data. Biostatistics 22(4), 756-771 (Journal Website, GitHub)
    Ma, Z., Xue, Y.*, and Hu, G. (2020) Heterogeneous Regression Models for Clusters of Spatial Dependent Data. Spatial Economic Analysis 15(4), 459-475. (Journal Website, arXiv, Supplemental)
    Hu, G., Yang, H.-C., and Xue, Y. (2020) Bayesian Group Learning for Shot Selection of Professional Basketball Players. Stat 10(1), e324 (Journal Website, arXiv)
    Xue, Y., Schifano, E.D. and Hu, G. (2020). Geographically Weighted Cox Regression for Prostate Cancer Survival Data in Louisiana. Geographical Analysis 52(4), 570-587. (Journal Website, arXiv, GitHub)
    Xue, Y., Wang, H., Yan, J., and Schifano, E.D. (2020). An Online Updating Approach for Testing the Proportional Hazards Assumption with Streams of Survival Data. Biometrics 76(1), 171-182 (ENAR 2019 Distinguished Student Paper Award, Journal Website, arXiv, GitHub)
    Xue, Y., Harel, O., and Aseltine, R.H. (2019). Imputing Race and Ethnic Information in Administrative Health Data. Health Services Research 54(4), 957–963. (Journal Website)
    Ma, Z., Xue, Y., and Hu, G. (2019). Nonparametric Analysis of Income Distributions Among Different Regions Based on Energy Distance with Applications to China Health and Nutrition Survey Data. Communications for Statistical Applications and Methods 26(1), 57–67. (Journal Website)
    Xue, Y. and Schifano, E.D. (2017). Diagnostics for the Cox Model. Communications for Statistical Applications and Methods 24(6), 583–604. (Journal Website)

  • Refereed Conference Publications
    Geng, L., Xue, Y. and Hu, G. Subsampled Information Criteria for Bayesian Model Selection in the Big Data Setting. Proceedings of IEEE International Conference on Big Data, 2019 (IEEE Website)

  • Manuscripts Under Review
    Hu, G., Geng, J., Xue, Y., and Sang, H. Bayesian Spatial Homogeneity Pursuit of Functional Data: an Application to the U.S. Income Distribution. (arXiv)
  + equal contribution; * corresponding author