Professional Highlights


  • Gordon M. Shrum Entrance Scholarship

    Simon Fraser University

    I was one of 40 students entering Simon Fraser University to receive the Gordon M. Shrum Entrance Scholarship, recognizing top academic achievement, community service, and leadership potential. This scholarship was worth $24,000.

  • Master of Science Graduate Funding Guarantee

    Department of Statistical Sciences, University of Toronto

    I completed my Master of Science degree in Statistics at one of the world’s best departments in statistics with a guaranteed funding package worth over $18,000.


  • Silver Medal

    Canadian Society for Chemistry

    I received the Canadian Society for Chemistry’s Silver Medal for being the top 4th-year chemistry student at Simon Fraser University among 100-120 possible recipients.

  • Undergraduate Student Research Award (USRA)

    Simon Fraser University – Office of the Vice-President of Research

    I was one of 11 undergraduate students who received research awards in the Department of Mathematics at Simon Fraser University in 2010-2011. This funded my research position in the Interdisciplinary Research in the Mathematical and Computational Sciences (IRMACS) Centre, where I modelled HIV epidemics using ordinary differential equations and Monte Carlo simulations.

  • Undergraduate Student Research Award (USRA)

    Natural Sciences and Engineering Research Council of Canada (NSERC)

    I was one of 5 undergraduate students who received research awards in the Department of Biomedical Physiology and Kinesiology at Simon Fraser University in 2005-2006. This funded my research position in the Cardiac Membrane Research Laboratory, where I used immunocytochemistry and confocal microscopy to study the physiology of cardiac muscle cells and stem cells.

  • Travel Award for Conference Presentation

    Statistical Society of Canada

    I received a Travel Award to attend the 2012 Statistical Society of Canada’s Conference to present my statistical computing research on Monte Carlo simulations of power functions of correlation tests. I discovered a pathology in the sampling distributions of these correlation tests when testing on data from the bivariate t-distribution, and I explored the nature of this pathology and how to overcome it.

  • Webber Prize

    Science and Environment Cooperative Education Program – Simon Fraser University

    I received the Webber Prize for writing the top co-operative education report, based on my position as the Technology Transfer Assistant at TRIUMF, Canada’s National Laboratory for Particle and Nuclear Physics. This prize specifically recognized my efforts in writing the 2006-2007 Annual Report on TRIUMF’s Business Development Plan.

  • Dean’s List

    Faculty of Science, Simon Fraser University

    I was on the Faculty of Science’s Dean’s List for academic excellence throughout my undergraduate degree at Simon Fraser University, where I studied kinesiology, chemistry, economics and mathematics. I finished my Bachelor’s degree with Distinction with a major in chemistry and a minor in mathematics in 2011.



  • Scott, S.A., van der Zanden, C., Cai, E., McGahan, C.E. and Kwon, J.S., 2017. Prognostic significance of peritoneal cytology in low-intermediate risk endometrial cancer. Gynecologic Oncology.


  • McColl, R.J., McGahan, C.E., Cai, E., Olson, R., Cheung, W.Y., Raval, M.J., Phang, P.T., Karimuddin, A.A. and Brown, C.J., 2017. Impact of hospital volume on quality indicators for rectal cancer surgery in British Columbia, Canada. The American Journal of Surgery, 213(2), pp.388-394.


  • Olson, R.A., Tiwana, M., Barnes, M., Cai, E., McGahan, C., Roden, K., Yurkowski, E., Gentles, Q., French, J., Halperin, R. and Olivotto, I.A., 2016. Impact of using audit data to improve the evidence-based use of single-fraction radiation therapy for bone metastases in British Columbia. International Journal of Radiation Oncology* Biology* Physics, 94(1), pp.40-47.


  • Lee, G.Q., Lachowski, C., Cai, E., Lima, V.D., Boum, Y., Muzoora, C., Mocello, A.R., Hunt, P.W., Martin, J.N., Bangsberg, D.R. and Harrigan, P.R., 2016. Non-R5-tropic HIV-1 in subtype A1 and D infections were associated with lower pretherapy CD4+ cell count but not with PI/(N)NRTI therapy outcomes in Mbarara, Uganda. AIDS, 30(11), pp.1781-1788.


  • Swenson, L.C., Min, J.E., Woods, C.K., Cai, E., Li, J.Z., Montaner, J.S., Harrigan, P.R. and Gonzalez-Serna, A., 2014. HIV drug resistance detected during low-level viremia is associated with subsequent virologic failure. AIDS (London, England), 28(8), p.1125.



  • Pathological Monte Carlo Simulations of Power Functions of Correlation Tests

    I began this very fruitful and interesting research project in my graduate statistical computing class, where I originally intended to compare the power functions of the Pearson and the Spearman correlation tests using Monte Carlo simulations. This was specifically done for data from two distributions: the bivariate normal distribution and the bivariate t-distribution. However, I encountered grossly pathological power functions of the Pearson correlation test for data from the bivariate t-distribution. I later found a subtle but surprising pathology for the Spearman correlation test’s power functions. I presented my study of these pathologies and the antidotes to obtain the correct power functions at the Statistical Society of Canada’s annual conference in Guelph in June, 2012. I completed further work on this project after the conference, and I was invited to present the improved results at greater length at the graduate biostatistics weekly seminar at the University of Toronto in October, 2012.

  • Analyzing Data from the Motivated Strategies for Learning Questionnaire (MSLQ) to Determine the Factors Affecting Performance of First-Year University Students

    Team Members: Eric Cai, Derrick Gray, Craig Burkett

    A lecturer at the University of Toronto approached Derrick Gray, Craig Burkett and me to determine the factors for success in a first-year course, using results that he collected from his students with the Motivated Strategies for Learning Questionnaire (MSLQ). His colleague at Kasetsart University in Thailand taught the same course and collected the same data, so we also compared the results between the two cultures.

    Derrick, Craig and I analyzed the data in SAS, wrote about our analyses and findings in detailed reports, and presented the highlights of our reports to our client. After the completion of my course, I voluntarily pursued further research with this client to analyze his data sets with improved methods in machine learning, and I wrote a new report based on my findings. In June, 2013, I presented my findings to the client and concluded our successful collaboration.

  • Seminar Presentation: Using Advanced Predictive Modelling and Pattern Recognition in Business Analytics

    I delivered this 1-hour presentation to a diverse group of business analysts, statisticians, academics, students, and executives in a seminar series organized by the Southern Ontario Regional Association (SORA) of the Statistical Society of Canada (SSC). My main goal was to introduce how machine learning and data mining are used in business analytics, especially in overcoming the limitations of traditional statistical techniques. I then described 2 predictive modelling (or supervised learning) techniques – partial least squares regression and bootstrap (random) forest – and illustrated how Predictum succeeded in using these techniques to solve analytical problems for 2 actual clients.

  • Seminar Presentation: Discriminant Analysis – A Machine Learning Technique for Classification in JMP and SAS

    I was invited by the Toronto Area SAS Society (TASS) to deliver this presentation on discriminant analysis and how to implement it in JMP and SAS. Discriminant analysis is a classification technique that uses continuous predictor variables to predict categorical target variables. I specifically focused on Gaussian discriminant analysis and gave a non-technical and intuitive explanation that appealed to a diverse audience of statisticians, analysts, and database managers.

  • Seminar Presentation: Finding Patterns in Data with K-Means Clustering in JMP and SAS

    I was invited by the Toronto Area SAS Society (TASS) to deliver this presentation on K-means clustering and how to implement it in JMP and SAS. K-means clustering is a common machine learning technique for finding patterns in continuous data and grouping them based on proximity. This presentation was gently mathematical and used many diagrams, plots, and pictures to illustrate the essential ideas of K-means clustering for a wide audience of analysts, statisticians, and database managers. It was very well received by the audience, and many non-mathematical analysts praised the accessibility and clarity of this presentation.

  • Seminar Presentation: Overcoming Multicollinearity and Overfitting – Partial Least Squares Regression in JMP and SAS

    I was invited by the Toronto Area SAS Society (TASS) to deliver this presentation on partial least squares regression and how to implement it in JMP and SAS. Partial least squares regression is a powerful predictive modelling and variable selection technique in machine learning, and it is particularly useful for overcoming multicollinearity and overfitting, two common problems encountered by linear or logistic regression. This presentation was very well received by the audience, especially non-mathematicians who found it to be clear and accessible. Matt Malczewski and Chris Battiston, two active members of the SAS Canada Community, recapped my presentations in their respective blogs.

  • Mathematical Model of Treatment as a Strategy to Reduce HIV Transmission

    The World Health Organization published a controversial study in 2009 in support of using highly active anti-retroviral therapy (HAART) to prevent HIV transmission with results from two mathematical models – one deterministic and one stochastic – built in Visual Basic. I attempted to replicate these models in MATLAB. The stochastic model without treatment was replicated successfully, but the assumptions of the model were found to be tenuous, and an alternative model was produced with slightly different results. The deterministic model could not be replicated, with or without treatment. However, a simplification of the model opened the door to 3 different ways of examining treatment as prevention. With the given parameters of the population from the WHO, the simplified model showed that a sufficiently high treatment rate could eliminate the epidemic.

  • Mathematical Model of the Cathode Catalyst Layer in Proton Exchange Membrane Fuel Cells

    I studied a model of the proton exchange membrane fuel cell’s cathode catalyst layer, where the reduction half-reaction occurs. This model was a system of 3 partial differential equations that were reduced to ordinary differential equations, and I showed, both analytically by re-deriving the entire model and numerically using MATLAB, that this model had flawed boundary values and could not be solved. Given the opportunity in the near future, I am eager to continue this project to rectify these flaws and write a MATLAB script that clearly and correctly solves these equations.

  • Log-Normal Regression of Left-Censored Environmental Data: An Example with Trichloroethylene Concentrations in Ground Water

    In my graduate applied statistics course, I wrote a paper on left-censored data and used a sample data set from an environmental chemistry textbook to illustrate how to analyze it. I defined censored data, introduced the difficulties of analyzing them, and discussed methods to overcome those difficulties. Using a parametric approach, I assumed a log-normal distribution for the data and used maximum likelihood estimation to estimate the parameters in my model.

  • Comparing Non-parametric, Semi-parametric, and Parametric Survival Analyses of the Determinants of Surviving AIDS in Quebec: 1979-1994

    Team Members: Eric Cai, Jiafen Gong

    In my graduate course on survival analysis, I partnered with Jiafen Gong to analyze a data set of HIV patients in Quebec from 1979 to 1994. Using non-parametric (Kaplan-Meier), semi-parametric (Cox proportional hazards) and parametric (exponential, Weibull, log-normal, log-logistic) approaches, we sought to find the determinants of survival from AIDS. Significant data cleaning was done to ease the analysis, and inconsistencies in the recording of the data were discovered in the process. Despite the difficulties encountered with this data set, we found several interesting results. Younger patients were found to survive longer than older patients. Men who had sex with men and intravenous drug users were found to survive longer than any other risk category. Patients who were diagnosed after 1989 had higher survival than patients who were diagnosed before 1989. Gender was found to be an insignificant factor for survival, though its effect on survival may have been confounded by the risk categories.
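
As a minimal illustration of the non-parametric (Kaplan-Meier) approach named in the survival analysis project above, the estimator can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the function name `kaplan_meier` and the toy follow-up times are hypothetical, and it is not the code or the data from the original analysis.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with an observed death.

    times  : observed follow-up time for each patient
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # deaths and censorings both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy data (NOT the Quebec data set): months of follow-up.
toy_times = [2, 3, 3, 5, 8, 8, 12]
toy_events = [1, 1, 0, 1, 1, 0, 0]
toy_curve = kaplan_meier(toy_times, toy_events)

for t, s in toy_curve:
    print(f"t = {t:2d}  S(t) = {s:.3f}")
```

Censored patients lower the number at risk without lowering the survival estimate, which is what separates this estimator from a naive proportion of patients still alive.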
