DNA | The Chemical Statistician

Eric’s Enlightenment for Wednesday, June 3, 2015

June 3, 2015

Jodi Beggs uses the Rule of 70 to explain why small differences in GDP growth rates have large ramifications.
Rick Wicklin illustrates the importance of choosing bin widths carefully when plotting histograms.
Shana Kelley et al. have developed an electrochemical sensor for detecting selected mutated nucleic acids (i.e. cancer markers in DNA!). “The sensor comprises gold electrical leads deposited on a silicon wafer, with palladium nano-electrodes.”
Rhett Allain provides a very detailed and analytical critique of Mjölnir (Thor’s hammer) – specifically, its unrealistic centre of mass. This is an impressive exercise in physics!
Congratulations to the Career Services Centre at Simon Fraser University for winning TalentEgg’s Special Award for Innovation by a Career Centre! I was fortunate to volunteer there as a career advisor for 5 years, and it was a wonderful place to learn, grow and give back to the community. My career has benefited greatly from that experience, and it is a pleasure to continue my involvement as a guest blogger for its official blog, The Career Services Informer. Way to go, everyone!


Eric’s Enlightenment for Tuesday, April 28, 2015

April 28, 2015

On a yearly basis, the production of almonds in California uses more water than businesses and residences in San Francisco and Los Angeles combined. Alex Tabarrok explains why.
How patient well-being and patient satisfaction become conflicting objectives in hospitals – a case study of a well-intended policy with deadly consequences. (HT: Frances Woolley – with a thought about academia.)
Contrary to a long-held presumption about the stability of DNA in mature cells, Huimei Yu et al. show that neurons use DNA methylation to rewrite their DNA throughout each day. This is done to adjust the brain to different activity levels as its function changes over time.
Alex Yakubovitch provides a tutorial on regular expressions (patterns that define sets of strings) and how to use them in R.
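To give a flavour of what regular expressions look like in R, here is a minimal sketch using the base functions grepl() and sub(). The sequences in `seqs` and the variable names are invented for illustration; they are not taken from Alex's tutorial.

```r
# Hypothetical sequences, invented for illustration only
seqs <- c("ATGGCA", "ATGNNA", "atgcca")

# TRUE only for strings made up entirely of the upper-case letters A, C, G, T
is.clean <- grepl("^[ACGT]+$", seqs)

# Remove a leading "ATG" where present; other strings are returned unchanged
trimmed <- sub("^ATG", "", seqs)

print(is.clean)  # TRUE FALSE FALSE
print(trimmed)   # "GCA" "NNA" "atgcca"
```

Matching is case-sensitive by default, which is why the lower-case string fails the first pattern and is left untouched by the second.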


Eric’s Enlightenment for Thursday, April 23, 2015

April 23, 2015

Reaching the NBA Finals has been much more difficult in the Western Conference than in the Eastern Conference in the past 15 years.
In terms of points above an average shooter per 100 shots, Kyle Korver ranks first in 2014-2015 with +30.4 points. DeAndre Jordan ranks second with +17.4 points. (Incredible!)
Evan Soltas evaluates “the rent hypothesis” – the claim that a larger share of income in recent years is unearned gains. (More rigorously, rent is “a payment for a resource in excess of its opportunity cost, one that instead reflects market power”.) This is Evan’s most read article.
A research team led by Junjiu Huang from 中山大学 (Sun Yat-Sen University) have successfully “edited the genes of human embryos using a new technique called CRISPR”. Carl Zimmer provides some background. (HT: Tyler Cowen.)


Useful Functions in R for Manipulating Text Data

February 27, 2014

Introduction

In my current job, I study HIV at the genetic and biochemical levels. Thus, I often work with sequences of nucleotides or amino acids from various patient samples of HIV, and this type of work involves a lot of text manipulation. (Strictly speaking, I analyze sequences of nucleotides from DNA that is reverse-transcribed from HIV’s RNA.) In this post, I describe some common functions in R that I often use for text processing.

Obtaining Basic Information about Character Variables

In R, I often work with text data in the form of character variables. To check if a variable is a character variable, use the is.character() function.

> year = 2014
> is.character(year)
[1] FALSE

If a variable is not a character variable, you can convert it to a character variable using the as.character() function.

> year.char = as.character(year)
> is.character(year.char)
[1] TRUE

A basic piece of information about a character variable is the number of characters in its string. Use the nchar() function to obtain this information.

> nchar(year.char)
[1] 4
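These functions also work on a whole vector of strings at once, which is handy when each element is a sequence. Here is a minimal sketch; the sequences in `seqs` are made up for illustration and are not real HIV data.

```r
# Hypothetical nucleotide sequences, invented for illustration only
seqs <- c("ATGGCA", "ATG", "")

# is.character() checks the type of the whole vector, so it returns one value
seqs.are.char <- is.character(seqs)

# nchar() is vectorized: it returns one count per element
seq.lengths <- nchar(seqs)

# length() counts the elements of the vector, not the characters in them
n.seqs <- length(seqs)

print(seqs.are.char)  # TRUE
print(seq.lengths)    # 6 3 0
print(n.seqs)         # 3
```

Note the difference between nchar() and length(): the former counts characters within each string, while the latter counts how many strings the vector holds.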
