Rectangular Integration (a.k.a. The Midpoint Rule) – Conceptual Foundations and a Statistical Application in R

Introduction

Continuing on the recently born series on numerical integration, this post will introduce rectangular integration.  I will describe the concept behind rectangular integration, show a function in R for how to do it, and use it to check that the Beta(2, 5) distribution actually integrates to 1 over its support set.  This post follows from my previous post on trapezoidal integration.

midpoint rule

Image courtesy of Qef from Wikimedia Commons.

Conceptual Background of Rectangular Integration (a.k.a. The Midpoint Rule)

Rectangular integration is a numerical integration technique that approximates the integral of a function with a rectangle.  It uses rectangles to approximate the area under the curve.  Here are its features:

  • The rectangle’s width is determined by the interval of integration.
    • One rectangle could span the width of the interval of integration and approximate the entire integral.
    • Alternatively, the interval of integration could be sub-divided into n smaller intervals of equal lengths, and n rectangles would used to approximate the integral; each smaller rectangle has the width of the smaller interval.
  • The rectangle’s height is the function’s value at the midpoint of its base.
  • Within a fixed interval of integration, the approximation becomes more accurate as more rectangles are used; each rectangle becomes narrower, and the height of the rectangle better captures the values of the function within that interval.

Read more of this post

Trapezoidal Integration – Conceptual Foundations and a Statistical Application in R

Introduction

Today, I will begin a series of posts on numerical integration, which has a wide range of applications in many fields, including statistics.  I will introduce trapezoidal integration by discussing its conceptual foundations, write my own R function to implement trapezoidal integration, and use it to check that the Beta(2, 5) probability density function actually integrates to 1 over its support set.  Fully commented and readily usable R code will be provided at the end.

beta pdf

Given a probability density function (PDF) and its support set as vectors in an array programming language like R, how do you integrate the PDF over its support set to ensure that it equals to 1?  Read the rest of this post to view my own R function to implement trapezoidal integration and learn how to use it to numerically approximate integrals.

Read more of this post

Using the Golden Section Search Method to Minimize the Sum of Absolute Deviations

Introduction

Recently, I introduced the golden search method – a special way to save computation time by modifying the bisection method with the golden ratio – and I illustrated how to minimize a cusped function with this script.  I also wrote an R function to implement this method and an R script to apply this method with an example.  Today, I will use apply this method to a statistical topic: minimizing the sum of absolute deviations with the median.

While reading Page 148 (Section 6.3) in Michael Trosset’s “An Introduction to Statistical Inference and Its Applications”, I learned 2 basic, simple, yet interesting theorems.

If X is a random variable with a population mean \mu and a population median q_2, then

a) \mu minimizes the function f(c) = E[(X - c)^2]

b) q_2 minimizes the function h(c) = E(|X - c|)

I won’t prove these theorems in this blog post (perhaps later), but I want to use the golden section search method to show a result similar to b):

c) The sample median, \tilde{m} , minimizes the function

g(c) = \sum_{i=1}^{n} |X_i - c|.

This is not surprising, of course, since

|X - c| is just a function of the random variable X

– by the law of large numbers,

\lim_{n\to \infty}\sum_{i=1}^{n} |X_i - c| = E(|X - c|)

Thus, if the median minimizes E(|X - c|), then, intuitively, it minimizes \lim_{n\to \infty}\sum_{i=1}^{n} |X_i - c|.  Let’s show this with the golden section search method, and let’s explore any differences that may arise between odd-numbered and even-numbered data sets.

Read more of this post

Scripts and Functions: Using R to Implement the Golden Section Search Method for Numerical Optimization

In an earlier post, I introduced the golden section search method – a modification of the bisection method for numerical optimization that saves computation time by using the golden ratio to set its test points.  This post contains the R function that implements this method, the R functions that contain the 3 functions that were minimized by this method, and the R script that ran the minimization.

I learned some new R functions while learning this new algorithm.

– the curve() function for plotting curves

– the cat() function for concatenating strings and variables and, hence, for printing debugging statements

Read more of this post

The Golden Section Search Method: Modifying the Bisection Method with the Golden Ratio for Numerical Optimization

Introduction

The first algorithm that I learned for root-finding in my undergraduate numerical analysis class (MACM 316 at Simon Fraser University) was the bisection method.  It’s very intuitive and easy to implement in any programming language (I was using MATLAB at the time).  The bisection method can be easily adapted for optimizing 1-dimensional functions with a slight but intuitive modification.  As there are numerous books and web sites on the bisection method, I will not dwell on it in my blog post.

Instead, I will explain a clever and elegant way to modify the bisection method with the golden ratio that results in faster computation; I learned this method while reading “A First Course in Statistical Programming with R” by John Braun and Duncan Murdoch.  Using a script in R to implement this special algorithm, I will illustrate how to minimize a non-differentiable function with the golden section search method.  In a later post (for the sake of brevity), I will use the same method to show that the minimizer of the sum of the absolute deviations from a univariate data set is the median.  The R functions and script for doing everything are in another post.

Fibonacci_spiral

The Fibonacci spiral approximates the golden spiral, a logarithmic spiral whose growth factor is the golden ratio.

Source: Dicklyon via Wikimedia

Read more of this post