We often need to know the probability that the range of $n$ samples drawn from a given normally distributed population will exceed a certain value. A special case of this is $n = 2$, which can be treated as simply finding the distribution of the difference between two independent normally distributed random variables. It's well known that the sum (or difference, by simply negating one or more of the variables) of $n$ independent normally distributed random variables with means $u_1, u_2, \ldots, u_n$ and standard deviations $s_1, s_2, \ldots, s_n$ is also a normally distributed random variable, with mean $U$ and standard deviation $S$ given by

$$U = u_1 + u_2 + \cdots + u_n, \qquad S^2 = s_1^2 + s_2^2 + \cdots + s_n^2 \tag{0}$$

Hence the (signed) difference between two standard normal random variables is normally distributed with a mean of zero and a standard deviation of $\sqrt{2}$. In other words, the density of the difference is

$$h(x) = \frac{e^{-x^2/4}}{2\sqrt{\pi}}$$

Of course, the unsigned difference has twice this density, restricted to the range $x > 0$.

The additivity of normal distributions according to equations (0) is so familiar that we often assume it's self-evident, but it's interesting to review how this additivity (which is closely related to the central limit theorem and the special properties of the normal distribution) is actually proven. To illustrate, let's just take the simple case of finding the distribution of the difference between two standard normal random variables.
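Before working through the derivation, the claim that the difference of two standard normal variables has mean zero and standard deviation sqrt(2) can be spot-checked by simulation. The following is only a minimal sketch: the function name `simulate_differences`, the sample count, and the seed are arbitrary choices made for this illustration, and a simulation of this kind suggests rather than proves the result.

```python
import math
import random
import statistics


def simulate_differences(num_pairs=100_000, seed=12345):
    """Draw pairs of independent standard normal samples and return t1 - t2.

    The seed is an arbitrary illustrative choice that makes the run repeatable.
    """
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) - rng.gauss(0.0, 1.0) for _ in range(num_pairs)]


diffs = simulate_differences()
sample_mean = statistics.fmean(diffs)   # should be close to 0
sample_sd = statistics.pstdev(diffs)    # should be close to sqrt(2)

print("sample mean of t1 - t2:", sample_mean)
print("sample std deviation:  ", sample_sd)
print("sqrt(2):               ", math.sqrt(2.0))
```

With this many pairs the sample standard deviation should typically agree with sqrt(2) to about two decimal places; the exact figures depend on the seed.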
Letting $f(t)$ denote the standard normal density function, the probability that two random samples $t_1$ and $t_2$ will differ by more than $u$ can be expressed as

$$\Pr\{|t_1 - t_2| > u\} = \int_{s=-\infty}^{\infty} f(s) \left[ \int_{t=s+u}^{\infty} f(t)\,dt + \int_{t=-\infty}^{s-u} f(t)\,dt \right] ds$$

Now, let $F(x)$ denote the normal probability function given by integrating the normal density function over the upper tail,

$$F(x) = \int_{t=x}^{\infty} f(t)\,dt$$

The two terms inside the square brackets of the prior expression contribute equally to the outer integral: by the symmetry $f(-t) = f(t)$ the second term equals $F(u - s)$, which becomes $F(s + u)$ under the substitution $s \to -s$. Writing $x$ in place of $u$, we therefore have

$$\Pr\{|t_1 - t_2| > x\} = 2 \int_{s=-\infty}^{\infty} f(s)\, F(s+x)\, ds \tag{1}$$

Differentiating 1 minus this function with respect to $x$, and using $F'(x) = -f(x)$, gives the density

$$h(x) = 2 \int_{s=-\infty}^{\infty} f(s)\, f(s+x)\, ds$$

Completing the square in the exponent, $s^2 + (s+x)^2 = 2(s + x/2)^2 + x^2/2$, this can be evaluated explicitly to give the unsigned density

$$h(x) = \frac{e^{-x^2/4}}{\sqrt{\pi}}, \qquad x > 0$$

This confirms what we already knew, namely, that the density of the difference between two samples from a standard normal distribution is just a scaled version of the standard normal density, i.e.,

$$h(x) = \sqrt{2}\, f\!\left(\frac{x}{\sqrt{2}}\right), \qquad x > 0$$

It follows that the probability that the difference between two random samples $t_1, t_2$ from a standard normal distribution will exceed $x$ is exactly

$$\Pr\{|t_2 - t_1| > x\} = 2\, F\!\left(\frac{x}{\sqrt{2}}\right), \qquad x > 0$$

Tables of the normal probability integral $F$ (or sometimes $1 - F$) are given in many statistics books, so this formula is convenient for evaluating the probability of differences of various magnitudes.

Notice that this is a special case of the more general problem of finding the probability density function for the RANGE of $n$ samples, i.e., the difference between the max and min values of $n$ samples drawn from a population with density $f(x)$ and distribution $F(x)$.
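The two-sample result derived above, that the probability of a difference exceeding x is 2 F(x / sqrt(2)), can be compared against a direct numerical evaluation of integral (1). The sketch below is one possible check, not part of the original derivation: the function names, the trapezoidal rule, and the truncation of the infinite range to [-10, 10] are all assumptions made for this illustration. The upper-tail function F is computed from the standard library's `math.erfc`.

```python
import math


def f(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def F(x):
    """Upper-tail probability: integral of f from x to infinity."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def prob_diff_exceeds(x):
    """Closed form 2 F(x / sqrt(2)) for Pr{|t1 - t2| > x}, with x >= 0."""
    return 2.0 * F(x / math.sqrt(2.0))


def prob_diff_exceeds_numeric(x, lo=-10.0, hi=10.0, steps=4000):
    """Trapezoidal estimate of 2 * integral of f(s) F(s + x) ds, as in (1).

    The limits lo and hi stand in for -infinity and +infinity (an assumption;
    the density is negligible beyond them).
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        s = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # half weight at the endpoints
        total += weight * f(s) * F(s + x)
    return 2.0 * h * total


for x in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(x, prob_diff_exceeds(x), prob_diff_exceeds_numeric(x))
```

The two columns should agree to many decimal places, and both equal 1 at x = 0, as they must for the probability that an unsigned difference exceeds zero.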
For this general case it's more convenient to express the generalization of (1) in terms of the probability that the range of $n$ samples will be LESS than $x$, which is given by the integral

$$\Pr\{|t_{\max} - t_{\min}| < x\} = n \int_{t=-\infty}^{\infty} \left[ F(t+x) - F(t) \right]^{n-1} f(t)\, dt$$

where $F$ here denotes the cumulative distribution function, $F(t) = \int_{-\infty}^{t} f(s)\,ds$, i.e., one minus the upper-tail function used in (1). However, for $n$ greater than 2, this integral cannot be evaluated in closed form (as far as I know), nor expressed simply in terms of the standard normal functions, so it must be evaluated numerically.
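As a rough illustration of what such a numerical evaluation might look like, the sketch below applies a trapezoidal rule to the range integral for standard normal samples. It is only one of many possible approaches: the function names, the step count, and the truncation of the infinite range to [-10, 10] are assumptions chosen for this example, and `cdf` is the cumulative (lower-tail) distribution function. For n = 2 the result can be compared with the closed form, since one minus the two-sample exceedance probability is erf(x / 2).

```python
import math


def pdf(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def cdf(t):
    """Standard normal cumulative distribution (integral of pdf up to t)."""
    return 0.5 * math.erfc(-t / math.sqrt(2.0))


def prob_range_less_than(n, x, lo=-10.0, hi=10.0, steps=4000):
    """Trapezoidal estimate of n * integral of [cdf(t+x) - cdf(t)]^(n-1) pdf(t) dt.

    The limits lo and hi stand in for -infinity and +infinity (an assumption;
    the density is negligible beyond them).
    """
    if n < 2 or x < 0.0:
        raise ValueError("need n >= 2 samples and a non-negative range x")
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        t = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # half weight at the endpoints
        total += weight * (cdf(t + x) - cdf(t)) ** (n - 1) * pdf(t)
    return n * h * total


# For n = 2 the estimate should match the closed form erf(x / 2).
print(prob_range_less_than(2, 1.0), math.erf(0.5))

# Larger sample sizes, for which no simple closed form is assumed here.
for n in (3, 5, 10):
    print(n, prob_range_less_than(n, 2.0))
```

For a fixed x the probability should decrease as n grows, since a larger sample is more likely to contain widely separated extremes.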
