Video Tutorial: Breaking Down the Definition of the Hazard Function

The hazard function is a fundamental quantity in survival analysis.  For an event occurring at some time on a continuous time scale, the hazard function, h(t), for that event is defined as

h(t) = \lim_{\Delta t \rightarrow 0} \frac{P(t < X \leq t + \Delta t \mid X > t)}{\Delta t},


where

  • t is the time,
  • X is the time of the occurrence of the event.
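The definition above can be checked numerically. The following is a minimal sketch, not part of the video: it assumes, purely for illustration, that X follows a Weibull distribution (the function names and parameter values are hypothetical), computes P(t < X ≤ t + Δt | X > t) ÷ Δt for shrinking values of Δt, and compares the result with the known closed-form Weibull hazard.

```python
import math


def weibull_survival(t, shape, scale):
    """S(t) = P(X > t) for a Weibull distribution (an illustrative assumption)."""
    return math.exp(-((t / scale) ** shape))


def hazard_from_definition(t, dt, shape, scale):
    """Approximate h(t) by P(t < X <= t + dt | X > t) / dt for a small dt."""
    s_t = weibull_survival(t, shape, scale)
    s_t_dt = weibull_survival(t + dt, shape, scale)
    prob_in_interval = s_t - s_t_dt        # P(t < X <= t + dt)
    conditional_prob = prob_in_interval / s_t  # condition on survival past t
    return conditional_prob / dt           # probability per unit of time


def weibull_hazard_exact(t, shape, scale):
    """Closed-form Weibull hazard: (shape / scale) * (t / scale)^(shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


if __name__ == "__main__":
    shape, scale, t = 1.5, 2.0, 1.0  # hypothetical parameter values
    for dt in (1.0, 0.1, 0.01, 0.001):
        approx = hazard_from_definition(t, dt, shape, scale)
        print(f"dt = {dt:<6} approximate h(t) = {approx:.6f}")
    print(f"closed-form h(t) = {weibull_hazard_exact(t, shape, scale):.6f}")
```

As Δt shrinks, the approximation should settle toward the closed-form value. Setting shape to 1 gives the exponential distribution, whose hazard is constant in t.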

However, what does this actually mean?  In this YouTube video, I break down the mathematics of this definition into its individual components and explain the intuition behind each component.

I am very excited about the release of this first video on my new YouTube channel!  This is yet another way in which The Chemical Statistician has expanded since the beginning of 2014.  As always, your comments are most appreciated!


6 Responses to Video Tutorial: Breaking Down the Definition of the Hazard Function

  1. jc says:

    Please do more on survival analysis! It was a great breakdown. Thank you.

  2. Alen says:

    Dear Eric,

    I have seen your YouTube video on the definition of the hazard function, which I really liked. I had just read a paper on the possible association between fruit/vegetable intake and cancer incidence, and I was wondering if there is any way of calculating the NUMBER NEEDED TO TREAT (NNT) from a hazard ratio. More specifically, the authors of the study divided participants into five groups according to their fruit/vegetable intake (quintiles) and recorded the number of cancer events. I’ll try to put it as simply as possible: is there any way of calculating NNT from the following data:

    Intake quintile             N      HR (95% CI)
    Quintile 1 (0–226 g/d)      6163   1.00 (reference)
    Quintile 2 (227–338 g/d)    6194   0.95 (0.92 to 0.99)
    Quintile 3 (339–462 g/d)    6263   0.91 (0.88 to 0.95)
    Quintile 4 (463–646 g/d)    6482   0.93 (0.89 to 0.97)
    Quintile 5 (≥647 g/d)       5502   0.89 (0.85 to 0.93)

    If it is of any help, the authors wrote: “As an example, under the assumption that study subjects shift one quintile upward in the distribution of fruit and vegetable intake corresponding to an average increase of approximately 150 g/d, 2.6% cancers in men and 2.3% cancers in women could be avoided.” I would be very grateful for any suggestion.

    Thank you!

    • Hi Alen,

      1) Could you please elaborate on what you mean by NNT?

      2) What does “N” mean in your table above?

      3) What event does the hazard represent?

      4) What are the 2 groups being compared in the hazard ratio?

      • Alen says:

        1. Wikipedia offers the following description of NNT: “The NNT is the average number of patients who need to be treated to prevent one additional bad outcome (i.e. the number of patients that need to be treated for one to benefit compared with a control in a clinical trial). It is defined as the inverse of the absolute risk reduction.” In other words, what I am interested in finding out is how many people would have to increase the intake of fruit and vegetable by approximately 150 g/d to prevent one event (cancer). As I understand it the NNT is calculated relative to a time interval. The median follow-up time was 8.7 years, during which time 9604 men and 21 000 women were diagnosed with cancer. The authors also mention that the “crude cancer incidence rates were 7.9 per 1000 person-years in men and 7.1 per 1000 person-years in women.”

        2. N is the number of men and women who were diagnosed with cancer within each category. The entire cohort consisted of 142 605 men and 335 873 women.

        3. Hazard ratios (HRs) and 95% confidence intervals (CIs) for incident cancer and distribution of incident cancers.

        4. This is actually a prospective cohort study. The authors divided the participants into five groups according to their intake of fruit and vegetable per day.

        (Boffetta, “Fruit and Vegetable Intake and Overall Cancer Risk in the European Prospective Investigation Into Cancer and Nutrition (EPIC)”)

        I hope this is helpful. Thank you.

      • Hi Alen,

        I’ve thought about this for a while, and I don’t think that NNT can be estimated from your data. The calculation of NNT is based on the difference in event rates between 2 groups, and I just don’t see how the hazard ratios can be used to estimate NNT.

        I am not an expert on estimating NNT, so I encourage you to seek the advice of someone with more expertise in this domain. If you do find the answer, please do share it with us here.

        Good luck,

