Getting Ready for Mathematical Classes in the New Semester – Guest-Blogging on SFU’s Career Services Informer

The following blog post was slightly condensed for editorial brevity and then published on the Career Services Informer, the official blog of the Career Services Centre at my undergraduate alma mater, Simon Fraser University.

As a new Fall semester begins, many students start courses such as math, physics, computing science, engineering and statistics.  These can be tough classes with a rapid progression in workload and difficulty, but steady preparation can mount a strong defense against the inevitable pressure and stress.  Here are some tips to help you get ready for those classes.

Organic and Inorganic Chemistry Lesson of the Day – Optical Rotation is a Bulk Property

It is important to note that optical rotation is usually discussed as a bulk property, because it’s usually measured as a bulk property by a polarimeter.  Any individual enantiomeric molecule can almost certainly rotate linearly polarized light.  However, in a bulk sample of a chiral substance, there is usually another molecule that can rotate light in the opposite direction.  This is due to the uniform distribution of the stereochemistry of a random sample of the molecules of one compound.  (In other words, the substance consists of different stereoisomers of one compound, and the proportions of the different stereoisomers are roughly equal.)  Because one molecule’s rotation of the light can be cancelled by another molecule’s optical rotation in the opposite direction, such a random sample of the compound would have no net optical rotation.  This type of cancellation will definitely occur in a racemic mixture.  However, if a substance is enantiomerically pure, then all of the molecules in that substance will rotate linearly polarized light in the same direction – this substance is optically active.

Organic and Inorganic Chemistry Lesson of the Day – The Difference Between (+)/(-) and (R)/(S) in Stereochemical Notation

In a previous Chemistry Lesson of the Day, I introduced the concept of optical rotation (a.k.a. optical activity).  You may also be familiar with the Cahn-Ingold-Prelog priority rules for designating stereogenic centres as either (R) or (S).   There is no direct association between the (+)/(-) designation and the (R)/(S) designation.  In other words, an (R)-enantiomer can be dextrorotary or levorotary – it must be determined on a case-by-case basis.  The same holds true for an (S)-enantiomer.

There is one exception in which (R)/(S) can be used to distinguish between enantiomers: if the compound has only 1 stereogenic centre, then the (R)/(S) designation of that centre also identifies which of the 2 enantiomers the molecule is.

Furthermore, note that the designation of optical rotation applies to a molecule, whereas the R/S designation applies to a particular stereogenic centre within a molecule.  Thus, a molecule with 2 stereogenic centres may have one (R) stereogenic centre and one (S) stereogenic centre.  However, a chiral compound consisting purely of one enantiomer can rotate linearly polarized light in only one direction, and that direction must be determined on a case-by-case basis by a polarimeter.

University of Toronto Alumni Reception with Meric Gertler – Tuesday, September 16, 2014 @ Sheraton Vancouver Wall Centre

I will attend the upcoming University of Toronto Alumni Reception in Vancouver to meet the new President of the University of Toronto, Meric Gertler.  If you will attend, please feel free to come up and say “Hello”!

Date: Tuesday, September 16, 2014

Time: 6:30 PM to 8:30 PM

Location:

Sheraton Vancouver Wall Centre
1088 Burrard St.
Vancouver, BC
V6Z 2R9

Mathematics and Mathematical Statistics Lesson of the Day – Convex Functions and Jensen’s Inequality

Consider a real-valued function f(x) that is continuous on the interval [x_1, x_2], where x_1 and x_2 are any 2 points in the domain of f(x).  Let

x_m = 0.5x_1 + 0.5x_2

be the midpoint of x_1 and x_2.  Then, if

f(x_m) \leq 0.5f(x_1) + 0.5f(x_2),

then f(x) is defined to be midpoint convex.

More generally, let’s consider any point within the interval [x_1, x_2].  We can denote this arbitrary point as

x_\lambda = \lambda x_1 + (1 - \lambda)x_2, where 0 < \lambda < 1.

Then, if

f(x_\lambda) \leq \lambda f(x_1) + (1 - \lambda) f(x_2),

then f(x) is defined to be convex.  If

f(x_\lambda) < \lambda f(x_1) + (1 - \lambda) f(x_2),

then f(x) is defined to be strictly convex.

There is a very elegant and powerful relationship about convex functions in mathematics and in mathematical statistics called Jensen’s inequality.  It states that, for any random variable Y with a finite expected value and for any convex function g(y),

E[g(Y)] \geq g[E(Y)].

A function f(x) is defined to be concave if -f(x) is convex.  Thus, Jensen’s inequality can also be stated for concave functions.  For any random variable Z with a finite expected value and for any concave function h(z),

E[h(Z)] \leq h[E(Z)].
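Both forms of Jensen’s inequality can be checked numerically.  Here is a minimal sketch in Python; the fair six-sided die, the convex function g(y) = y^2 and the concave function h(z) = log(z) are illustrative assumptions and are not taken from the lesson itself.

```python
import math
from fractions import Fraction

# Hypothetical example: Y is the outcome of a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]
prob = Fraction(1, 6)

def expectation(func):
    """Return E[func(Y)] for the fair-die distribution above."""
    return sum(prob * func(y) for y in outcomes)

mean_y = expectation(lambda y: y)            # E(Y) = 7/2

# Convex case: g(y) = y^2, so E[g(Y)] >= g[E(Y)].
lhs_convex = expectation(lambda y: y ** 2)   # E(Y^2) = 91/6
rhs_convex = mean_y ** 2                     # [E(Y)]^2 = 49/4

# Concave case: h(z) = log(z), so E[h(Z)] <= h[E(Z)].
lhs_concave = sum(float(prob) * math.log(y) for y in outcomes)
rhs_concave = math.log(float(mean_y))

print(lhs_convex >= rhs_convex)
print(lhs_concave <= rhs_concave)
```

Exact fractions are used for the convex case so that the comparison does not depend on floating-point rounding.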

In future Statistics Lessons of the Day, I will prove Jensen’s inequality and discuss some of its implications in mathematical statistics.

Organic and Inorganic Chemistry Lesson of the Day – DO NOT USE THE PREFIXES (d-) and (l-) TO CLASSIFY ENANTIOMERS

In a recent Chemistry Lesson of the Day, I introduced the concept of optical rotation, and I mentioned the use of (+) and (-) to denote dextrorotary and levorotary compounds, respectively.

Some people use d- and l- instead of (+) and (-), respectively.  I strongly discourage this, because there is an old system of classifying stereogenic centres that uses the prefixes D- and L-, and the obvious similarity between the prefixes of the 2 systems causes much confusion.

This old system classifies stereogenic centres based on the similarities of their configurations to the 2 enantiomers of glyceraldehyde.  It is confusing, non-intuitive, and outdated, so I will not discuss its rationale or details on my blog.  (If you are interested, here is a good explanation from the University of Maine’s chemistry department.)

Also, note that D- and L- classify stereogenic centres, whereas d- and l- classify enantiomers – this just adds more confusion.

In short,

  • DO NOT use d- and l- to classify enantiomers; use (+) and (-) instead.
  • DO NOT use D- and L- to classify stereogenic centres; use the Cahn-Ingold-Prelog priority rules (R/S) instead.

Mathematical Statistics Lesson of the Day – The Glivenko-Cantelli Theorem

In 2 earlier tutorials that focused on exploratory data analysis in statistics, I introduced empirical cumulative distribution functions (empirical CDFs).

There is actually an elegant theorem that provides a rigorous basis for using empirical CDFs to estimate the true CDF – and this is true for any probability distribution.  It is called the Glivenko-Cantelli theorem, and here is what it states:

Given a sequence of n independent and identically distributed random variables, X_1, X_2, ..., X_n,

P[\lim_{n \to \infty} \sup_{x \in \mathbb{R}} |\hat{F}_n(x) - F_X(x)| = 0] = 1.

In other words, with probability 1, the empirical CDF of X_1, X_2, ..., X_n converges uniformly to the true CDF as n \to \infty.
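To see this convergence in a simulation, here is a minimal sketch in Python.  It assumes that the data come from the Uniform(0, 1) distribution, whose true CDF is F(x) = x on [0, 1]; the function name sup_distance_uniform and the seed are hypothetical choices made only for illustration.

```python
import random

def sup_distance_uniform(sample):
    """Return sup_x |F_n(x) - F(x)| for a sample from Uniform(0, 1).

    For the Uniform(0, 1) distribution, F(x) = x on [0, 1].  The empirical
    CDF is a step function, so the supremum is attained at (or just before)
    one of the sorted sample points.
    """
    xs = sorted(sample)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs, start=1):
        # Just before x the empirical CDF equals (i - 1)/n; at x it equals i/n.
        worst = max(worst, abs(i / n - x), abs((i - 1) / n - x))
    return worst

random.seed(2014)  # arbitrary seed, only for reproducibility
for n in (10, 100, 1000, 10000):
    sample = [random.random() for _ in range(n)]
    print(n, round(sup_distance_uniform(sample), 4))
```

Each value of n uses a fresh sample, so the printed distances should generally shrink as n grows, but they need not decrease at every step.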

My mathematical statistics professor at the University of Toronto, Keith Knight, told my class that this is often referred to as “The First Theorem of Statistics” or “The Fundamental Theorem of Statistics”.  I think that this is a rather subjective title – the central limit theorem is likely more useful and important – but Page 261 of John Taylor’s An introduction to measure and probability (Springer, 1997) recognizes this attribution to the Glivenko-Cantelli theorem, too.

Mathematical and Applied Statistics Lesson of the Day – The Motivation and Intuition Behind Chebyshev’s Inequality

In 2 recent Statistics Lessons of the Day, I

Chebyshev’s inequality is just a special version of Markov’s inequality; thus, their motivations and intuitions are similar.

P[|X - \mu| \geq k \sigma] \leq 1 \div k^2

Markov’s inequality roughly says that a random variable X is most frequently observed near its expected value, \mu.  Remarkably, it quantifies just how often X is far away from \mu.  Chebyshev’s inequality goes one step further and quantifies that distance between X and \mu in terms of the number of standard deviations away from \mu.  It roughly says that the probability of X being at least k standard deviations away from \mu is at most k^{-2}.  Notice that this upper bound decreases as k increases – confirming our intuition that it is highly improbable for X to be far away from \mu.

As with Markov’s inequality, Chebyshev’s inequality applies to any random variable X, as long as E(X) and V(X) are finite.  (Markov’s inequality requires only that X be non-negative and that E(X) be finite.)  This is quite a marvelous result!
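The bound can be compared with exact tail probabilities for a concrete distribution.  Here is a minimal sketch in Python; the fair six-sided die and the chosen values of k are illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical example: X is the outcome of a fair six-sided die.
outcomes = [1, 2, 3, 4, 5, 6]
prob = Fraction(1, 6)

mu = sum(prob * x for x in outcomes)                     # E(X) = 7/2
variance = sum(prob * (x - mu) ** 2 for x in outcomes)   # V(X) = 35/12

def tail_probability(k):
    """Return the exact P[|X - mu| >= k * sigma].

    Squaring both sides avoids taking a square root:
    |X - mu| >= k * sigma  is equivalent to  (X - mu)^2 >= k^2 * V(X).
    """
    threshold = Fraction(k) ** 2 * variance
    return sum(prob for x in outcomes if (x - mu) ** 2 >= threshold)

for k in (1, Fraction(5, 4), Fraction(3, 2), 2):
    bound = 1 / Fraction(k) ** 2
    print(k, tail_probability(k), bound, tail_probability(k) <= bound)
```

For this distribution the exact tail probabilities sit well below 1 \div k^2, which illustrates that Chebyshev’s inequality is an upper bound that holds for every qualifying distribution rather than a sharp estimate for any particular one.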

Organic and Inorganic Chemistry Lesson of the Day – Optical Rotation (a.k.a. Optical Activity)

A substance consisting of a chiral compound can rotate linearly polarized light – this property is known as optical rotation (more commonly called optical activity).  The direction in which the light is rotated is one way to distinguish between a pair of enantiomers, as they rotate linearly polarized light in opposite directions.

Imagine that you are an enantiomer, and linearly polarized light approaches you.

  • If the light is rotated clockwise from your perspective, then you are a dextrorotary enantiomer.
  • Otherwise, if the light is rotated counterclockwise from your perspective, then you are a levorotary enantiomer.

In a previous Chemistry Lesson of the Day, I introduced the concept of diastereomers, and I used threose as an example.  Let’s use threose to illustrate some notation about optical activity.

[Image: (-)-Threose]

  • Levorotary compounds are denoted by the prefix (-), followed by a hyphen, then followed by the name of the compound.  The above molecule is (-)-threose.
  • Dextrorotary compounds are denoted by the prefix (+), followed by a hyphen, then followed by the name of the compound.  The enantiomer of (-)-threose is (+)-threose.

A compound’s optical rotation is determined by a polarimeter.

I strongly discourage the use of the prefixes (d-) and (l-) to distinguish between enantiomers.  Use (+) and (-) instead.

Beware of the difference between designating enantiomers as (+) or (-) and designating stereogenic centres as either (R) or (S).

It is important to note that optical rotation is usually referred to as a bulk property.

Mathematical Statistics Lesson of the Day – Chebyshev’s Inequality

The variance of a random variable X is just an expected value of a function of X.  Specifically,

V(X) = E[(X - \mu)^2], \ \text{where} \ \mu = E(X).

Let’s substitute (X - \mu)^2 into Markov’s inequality and see what happens.  (This substitution is valid, because (X - \mu)^2 is a non-negative random variable.)  For convenience and without loss of generality, I will replace the constant c with another constant, b^2.

\text{Let} \ b^2 = c, \ b > 0. \ \ \text{Then,}

P[(X - \mu)^2 \geq b^2] \leq E[(X - \mu)^2] \div b^2

P[ (X - \mu) \leq -b \ \ \text{or} \ \ (X - \mu) \geq b] \leq V(X) \div b^2

P[|X - \mu| \geq b] \leq V(X) \div b^2

Now, let’s substitute b with k \sigma, where \sigma is the standard deviation of X.  (I can make this substitution, because \sigma is just another constant.)

\text{Let} \ k \sigma = b. \ \ \text{Then,}

P[|X - \mu| \geq k \sigma] \leq V(X) \div (k^2 \sigma^2)

P[|X - \mu| \geq k \sigma] \leq 1 \div k^2

This last inequality is known as Chebyshev’s inequality, and it is just a special version of Markov’s inequality.  In a later Statistics Lesson of the Day, I will discuss the motivation and intuition behind it.  (Hint: Read my earlier lesson on the motivation and intuition behind Markov’s inequality.)

Organic and Inorganic Chemistry Lesson of the Day – Cis/Trans Isomers Are Diastereomers

Recall that diastereomers are defined simply as 2 stereoisomers that are NOT enantiomers.  Diastereomers often have at least 2 stereogenic centres, and my previous lesson showed an example of how such diastereomers can arise.

However, while an enantiomer must have at least 1 stereogenic centre, there is nothing in the definition of a diastereomer that requires it to have any stereogenic centres.  In fact, a diastereomer does not have to be chiral.  A pair of cis/trans isomers are also diastereomers.  Recall the example of trans-1,2-dibromoethylene and cis-1,2-dibromoethylene:

[Image: cis/trans isomers of 1,2-dibromoethylene]

Image courtesy of Roland1952 on Wikimedia.

These 2 molecules are stereoisomers – they have the same atoms and sequence/connectivity of bonds, but they differ in their spatial orientations.  They are NOT mirror images of each other, let alone non-superimposable mirror images.  Thus, by definition, they are diastereomers, even though they are not chiral.

Mathematical and Applied Statistics Lesson of the Day – The Motivation and Intuition Behind Markov’s Inequality

Markov’s inequality may seem like a rather arbitrary pair of mathematical expressions that are coincidentally related to each other by an inequality sign:

P(X \geq c) \leq E(X) \div c, where c > 0.

However, there is a practical motivation behind Markov’s inequality, and it can be posed in the form of a simple question: How often is the random variable X “far” away from its “centre” or “central value”?

Intuitively, the “central value” of X is the value of X that is most commonly (or most frequently) observed.  Thus, as X deviates further and further from its “central value”, we would expect those distant-from-the-centre values to be less frequently observed.

Recall that the expected value, E(X), is a measure of the “centre” of X.  Thus, we would expect that the probability of X being very far away from E(X) is very low.  Indeed, Markov’s inequality rigorously confirms this intuition; here is its rough translation:

As c becomes really far away from E(X), the event X \geq c becomes less probable.

You can confirm this by substituting several key values of c.

  • If c = E(X), then P[X \geq E(X)] \leq 1; this is the highest upper bound that P(X \geq c) can get.  This makes intuitive sense; X is going to be frequently observed near its own expected value.
  • If c \rightarrow \infty, then P(X \geq \infty) \leq 0.  By Kolmogorov’s axioms of probability, any probability must be inclusively between 0 and 1, so P(X \geq \infty) = 0.  This makes intuitive sense; there is no possible way that X can be bigger than positive infinity.
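The same check can be carried out for intermediate values of c.  Here is a minimal sketch in Python; the random variable (the number of heads in 4 fair coin flips) and the values of c are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

# Hypothetical example: X = number of heads in 4 fair coin flips (non-negative).
n_flips = 4
pmf = {x: Fraction(comb(n_flips, x), 2 ** n_flips) for x in range(n_flips + 1)}
expected_value = sum(x * p for x, p in pmf.items())   # E(X) = 2

def upper_tail(c):
    """Return the exact P(X >= c)."""
    return sum(p for x, p in pmf.items() if x >= c)

def markov_bound(c):
    """Return E(X) / c, the upper bound from Markov's inequality (c > 0)."""
    return expected_value / c

for c in (2, 3, 4, 8, 100):
    print(c, upper_tail(c), markov_bound(c), upper_tail(c) <= markov_bound(c))
```

As c moves further above E(X), both the exact tail probability and the bound E(X) \div c shrink, and the bound is never violated.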

Organic and Inorganic Chemistry Lesson of the Day – Meso Isomers

A molecule is a meso isomer if it

  • is a member of a set of stereoisomers that includes enantiomers
  • has a superimposable mirror image (i.e. it is achiral)

Meso isomers have an internal plane of symmetry, which arises from 2 identically substituted but oppositely oriented stereogenic centres.  (By “oppositely oriented”, I mean the stereochemical orientation as defined by the Cahn-Ingold-Prelog priority system.  For example, in a meso isomer with 2 tetrahedral stereogenic centres, one stereogenic centre needs to be “R”, and the other stereogenic centre needs to be “S”. )  This symmetry results in the superimposability of a meso isomer’s mirror image.

By definition, a meso isomer and an enantiomer from the same set of stereoisomers are a pair of diastereomers.

Having at least 2 stereogenic centres is a necessary but not sufficient condition for a molecule to have meso isomers.  Recall that a molecule with n tetrahedral stereogenic centres has at most 2^n stereoisomers; such a molecule would have fewer than 2^n stereoisomers if it has meso isomers.
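The counting argument can be sketched as a toy bookkeeping exercise in Python.  This is not a general stereochemistry algorithm; it assumes a molecule with exactly 2 identically substituted tetrahedral stereogenic centres (as in tartaric acid), so that the labels (R, S) and (S, R) describe the same molecule.  The helper names mirror and canonical are hypothetical.

```python
from itertools import product

def mirror(config):
    """Return the mirror image: every (R) centre becomes (S) and vice versa."""
    return tuple("S" if c == "R" else "R" for c in config)

def canonical(config):
    """Pick one label for a molecule whose 2 centres are identically substituted.

    Assumption: the 2 centres are interchangeable, so reading the labels in
    the opposite order, e.g. (R, S) vs. (S, R), describes the same molecule.
    """
    return min(config, tuple(reversed(config)))

# All 2^n labelings for n = 2 tetrahedral stereogenic centres.
labelings = list(product("RS", repeat=2))
stereoisomers = sorted({canonical(c) for c in labelings})

# A stereoisomer is meso if it is identical to its own mirror image.
meso = [c for c in stereoisomers if canonical(mirror(c)) == c]
chiral = [c for c in stereoisomers if canonical(mirror(c)) != c]

print(len(labelings), len(stereoisomers))   # upper bound 2^n vs. actual count
print(meso)
print(chiral)
```

Under these assumptions, the 4 possible labelings collapse to 3 distinct stereoisomers: one meso isomer, (R, S), and one pair of enantiomers, (R, R) and (S, S).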

Meso isomers are also called meso compounds.

Here is an example of a meso isomer; notice the internal plane of symmetry – the horizontal line that divides the 2 stereogenic carbons:

(2R,3S)-tartaric acid

Image courtesy of Project Osprey from Wikimedia (with a slight modification).

The Chi-Squared Test of Independence – An Example in Both R and SAS

Introduction

The chi-squared test of independence is one of the most basic and common hypothesis tests in the statistical analysis of categorical data.  Given 2 categorical random variables, X and Y, the chi-squared test of independence assesses whether there is evidence of a statistical dependence between them.  Formally, it is a hypothesis test with the following null and alternative hypotheses:

H_0: X \perp Y \ \ \ \ \ \text{vs.} \ \ \ \ \ H_a: X \not \perp Y

If you’re not familiar with probabilistic independence and how it manifests in categorical random variables, watch my video on calculating expected counts in contingency tables using joint and marginal probabilities.  For your convenience, here is another video that gives a gentler and more practical understanding of calculating expected counts using marginal proportions and marginal totals.

Today, I will continue from those 2 videos and illustrate how the chi-squared test of independence can be implemented in both R and SAS with the same example.
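For readers who want to see the arithmetic behind the test statistic, here is a minimal sketch in Python.  It is not the R or SAS code from the full post; the function name chi_squared_statistic and the 2 x 2 table of counts are hypothetical, and no continuity correction is applied, so software that applies one by default may report a different statistic.

```python
from math import erfc, sqrt

def chi_squared_statistic(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence:
            # (row total)(column total) / (grand total).
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical 2 x 2 table of observed counts (made up for illustration).
observed = [[30, 10],
            [20, 40]]
statistic, df = chi_squared_statistic(observed)

# With 1 degree of freedom, the chi-squared upper tail has a closed form.
p_value = erfc(sqrt(statistic / 2)) if df == 1 else None
print(round(statistic, 3), df, p_value)
```

The closed-form p-value works only for 1 degree of freedom; larger tables would need a chi-squared distribution function from a statistics library.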

Mathematical Statistics Lesson of the Day – Markov’s Inequality

Markov’s inequality is an elegant and very useful inequality that relates the probability of an event concerning a non-negative random variable, X, with the expected value of X.  It states that

P(X \geq c) \leq E(X) \div c,

where c > 0.

I find Markov’s inequality to be beautiful for 2 reasons:

  1. It applies to both continuous and discrete random variables.
  2. It applies to any non-negative random variable from any distribution with a finite expected value.

In a later lesson, I will discuss the motivation and intuition behind Markov’s inequality, which has useful implications for understanding a data set.

Organic and Inorganic Chemistry Lesson of the Day – Racemic Mixtures

A racemic mixture is a mixture that contains equal amounts of both enantiomers of a chiral molecule.  (By amount, I mean the usual unit of quantity in chemistry – the mole.  Of course, since enantiomers are isomers, their molar masses are equal, so a racemic mixture would contain equal masses of both enantiomers, too.)

In synthesizing enantiomers, if a set of reactants combine to form a racemic mixture, then the reactants are called non-stereoselective or non-stereospecific.

In 1895, Otto Wallach proposed that a racemic crystal is denser than a crystal consisting purely of one of the enantiomers; this is known as Wallach’s rule.  Brock et al. (1991) substantiated this with crystallographic data.

Reference:

Brock, C. P., Schweizer, W. B., & Dunitz, J. D. (1991). On the validity of Wallach’s rule: on the density and stability of racemic crystals compared with their chiral counterparts. Journal of the American Chemical Society, 113(26), 9811-9820.

Applied Statistics Lesson of the Day – The Coefficient of Variation

In my statistics classes, I learned to use the variance or the standard deviation to measure the variability or dispersion of a data set.  However, consider the following 2 hypothetical cases:

  1. the standard deviation for the incomes of households in Canada is $2,000
  2. the standard deviation for the incomes of the 5 major banks in Canada is $2,000

Even though this measure of dispersion has the same value for both sets of income data, $2,000 is a significant amount for a household, whereas $2,000 is not a lot of money for one of the “Big Five” banks.  Thus, the standard deviation alone does not give a fully accurate sense of the relative variability between the 2 data sets.  One way to overcome this limitation is to take the mean of the data sets into account.

A useful statistic for measuring the variability of a data set while scaling by the mean is the sample coefficient of variation:

\text{Sample Coefficient of Variation (} \bar{c_v} \text{)} \ = \ s \ \div \ \bar{x},

where s is the sample standard deviation and \bar{x} is the sample mean.

Analogously, the coefficient of variation for a random variable is

\text{Coefficient of Variation} \ (c_v) \ = \ \sigma \div \ \mu,

where \sigma is the random variable’s standard deviation and \mu is the random variable’s expected value.
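The contrast between the 2 hypothetical cases above can be reproduced with a few lines of Python.  This is a minimal sketch; the income figures are made up so that both data sets have a standard deviation of $2,000, and the function name sample_cv is hypothetical.

```python
from statistics import mean, stdev

def sample_cv(data):
    """Return the sample coefficient of variation, s / x-bar."""
    return stdev(data) / mean(data)

# Hypothetical data sets with the same standard deviation but different means.
household_incomes = [58_000, 60_000, 62_000]
bank_incomes = [h + 1_000_000_000 for h in household_incomes]

print(stdev(household_incomes), stdev(bank_incomes))
print(sample_cv(household_incomes), sample_cv(bank_incomes))
```

The 2 standard deviations agree, but the coefficient of variation is far smaller for the data set with the much larger mean.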

The coefficient of variation is a very useful statistic that I, unfortunately, never learned in my introductory statistics classes.  I hope that all new statistics students get to learn this alternative measure of dispersion.

Using Your Vacation to Develop Your Career – Guest Blogging on Simon Fraser University’s Career Services Informer

The following post was originally published on the Career Services Informer.

I recently took a vacation from my former role as a statistician at the BC Centre for Excellence in HIV/AIDS. I did not plan a trip out of town – the spring weather was beautiful in Vancouver, and I wanted to spend time on the things that I like to do in this city. Many obvious things came to mind – walking along beaches, practicing Python programming and catching up with friends – just to name a few.

Yes, Python programming was one of the obvious things on my vacation to-do list, and I understand how ridiculous this may seem to some people. Why tax my brain during a time that is meant for mental relaxation, especially when the weather is great?

Machine Learning and Applied Statistics Lesson of the Day – Positive Predictive Value and Negative Predictive Value

For a binary classifier,

  • its positive predictive value (PPV) is the proportion of positively classified cases that were truly positive.

\text{PPV} = \text{(Number of True Positives)} \ \div \ \text{(Number of True Positives} \ + \ \text{Number of False Positives)}

  • its negative predictive value (NPV) is the proportion of negatively classified cases that were truly negative.

\text{NPV} = \text{(Number of True Negatives)} \ \div \ \text{(Number of True Negatives} \ + \ \text{Number of False Negatives)}
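Both formulas translate directly into code.  Here is a minimal sketch in Python; the function names and the confusion-matrix counts are hypothetical and serve only as an illustration.

```python
def positive_predictive_value(true_pos, false_pos):
    """PPV = TP / (TP + FP)."""
    return true_pos / (true_pos + false_pos)

def negative_predictive_value(true_neg, false_neg):
    """NPV = TN / (TN + FN)."""
    return true_neg / (true_neg + false_neg)

# Hypothetical confusion-matrix counts (made up for illustration).
tp, fp, tn, fn = 40, 10, 45, 5

ppv = positive_predictive_value(tp, fp)
npv = negative_predictive_value(tn, fn)
print(ppv, npv)
```

Note that each denominator is a count of classified cases (all positively classified cases for PPV, all negatively classified cases for NPV), not a count of truly positive or truly negative cases.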

In a later Statistics and Machine Learning Lesson of the Day, I will discuss the differences between PPV/NPV and sensitivity/specificity in assessing the predictive accuracy of a binary classifier.

(Recall that sensitivity and specificity can also be used to evaluate the performance of a binary classifier.  Based on those 2 statistics, we can construct receiver operating characteristic (ROC) curves to assess the predictive accuracy of the classifier, and a minimum standard for a good ROC curve is being better than the line of no discrimination.)

Organic and Inorganic Chemistry Lesson of the Day – Diastereomers

I previously introduced the concept of chirality and how it is a property of any molecule with only 1 stereogenic centre.  (A molecule with n stereogenic centres may or may not be chiral, depending on its stereochemistry.)  I also defined 2 stereoisomers as enantiomers if they are non-superimposable mirror images of each other.  (Recall that chirality in inorganic chemistry can arise in 2 different ways.)

It is possible for 2 stereoisomers to NOT be enantiomers; in fact, such stereoisomers are called diastereomers.  Yes, I recognize that defining something as the negation of something else is unusual.  If you have learned set theory or probability (as I did in my mathematical statistics classes) then consider the set of all pairs of the stereoisomers of one compound – this is the sample space.  The enantiomers form a set within this sample space, and the diastereomers are the complement of the enantiomers.

It is important to note that, while diastereomers are not mirror images of each other, they are still non-superimposable.  Diastereomers often (but not always) arise from stereoisomers with 2 or more stereogenic centres; here is an example of how they can arise.  (A pair of cis/trans-isomers are also diastereomers, despite not having any stereogenic centres.)

1) Consider a compound with 2 tetrahedral stereogenic centres and no meso isomers*.  This compound has 2^n = 2^2 = 4 stereoisomers, where n = 2 denotes the number of stereogenic centres.

2) Find one pair of enantiomers based on one of the stereogenic centres.

3) Find the other pair of enantiomers based on the other stereogenic centre.

4) Take any one molecule from Step #2 and any one molecule from Step #3.  These cannot be mirror images of each other.  (One molecule cannot have 2 different mirror images of itself.)  These 2 molecules are diastereomers.

Think back to my above description of enantiomers as a proper subset within the sample space of the pairs of one set of stereoisomers.  You can now see why I emphasized that the sample space consists of pairs, since multiple different pairs of stereoisomers can form enantiomers.  In my example above, Steps #2 and #3 produced 2 subsets of enantiomers.  It should be clear by now that enantiomers and diastereomers are defined as pairs.  To further illustrate this point,

a) call the 2 molecules in Step #2 A and B.

b) call the 2 molecules in Step #3 C and D.

A and B are enantiomers.  A and C are diastereomers.  Thus, it is entirely possible for one molecule to be an enantiomer with a second molecule and a diastereomer with a third molecule.
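The idea of a sample space of pairs can be made concrete by enumerating every pair.  Here is a minimal sketch in Python; it assumes a compound with 2 distinct tetrahedral stereogenic centres and no meso isomers, it labels each stereoisomer by its (R)/(S) configurations, and it treats the mirror image as the molecule with every centre inverted.  The helper name mirror is hypothetical.

```python
from itertools import combinations, product

def mirror(config):
    """Return the mirror image: every (R) centre becomes (S) and vice versa."""
    return tuple("S" if c == "R" else "R" for c in config)

# The 2^2 = 4 stereoisomers of a compound with 2 distinct stereogenic centres
# and no meso isomers, labelled by their (R)/(S) configurations.
stereoisomers = list(product("RS", repeat=2))

enantiomer_pairs = []
diastereomer_pairs = []
for a, b in combinations(stereoisomers, 2):   # the "sample space" of pairs
    if mirror(a) == b:
        enantiomer_pairs.append((a, b))
    else:
        diastereomer_pairs.append((a, b))

print(len(enantiomer_pairs), len(diastereomer_pairs))
```

Of the 6 possible pairs, 2 are pairs of enantiomers and the remaining 4 are pairs of diastereomers, so every stereoisomer has exactly 1 enantiomer and 2 diastereomers in this example.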

Here is an example of 2 diastereomers.  Notice that they have the same chemical formula but different 3-dimensional orientations – i.e. they are stereoisomers.  These stereoisomers are not mirror images of each other, but they are non-superimposable – i.e. they are diastereomers.

[Image: (-)-Threose]

[Image: (-)-Erythrose]

Images courtesy of Popnose, DMacks and Edgar181 on Wikimedia.  For brevity, I direct you to the Wikipedia entry for diastereomers showing these 4 images in one panel.

In a later Chemistry Lesson of the Day on optical rotation (a.k.a. optical activity), I will explain what the (-) symbol means in the names of those 2 diastereomers.

*I will discuss meso isomers in a separate lesson.
