## Differences Between Normal Samples

We often need to know the probability that the range of n samples
drawn from a given normally distributed population will exceed a
certain value.  A special case is n=2, which amounts to finding the
distribution of the difference between two normally distributed
random variables.  It's well known that the sum (or difference, by
simply negating one or more of the variables) of n independent
normally distributed random variables with means u1,u2,...,un and
standard deviations s1,s2,...,sn is also a normally distributed
random variable with mean and standard deviation given by

$$
\begin{aligned}
U &= u_1 + u_2 + \cdots + u_n \\
S^2 &= s_1^2 + s_2^2 + \cdots + s_n^2
\end{aligned}
\tag{0}
$$

Hence the (signed) difference between two standard normal random
variables is normally distributed with a mean of zero and standard
deviation of sqrt(2).  In other words, the density of the
difference is

$$
h(x) = \frac{e^{-x^2/4}}{2\sqrt{\pi}}
$$

Of course, the unsigned difference has twice this density, restricted
to the range x > 0.
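
As a quick numerical cross-check of this density, the following Python
sketch applies equations (0) to the difference of two standard normals
and compares the result with the formula for h(x).  The helper names
`combine` and `h_signed` are illustrative assumptions, not part of the
original text, and the variables are assumed to be independent.

```python
import math
from statistics import NormalDist


def combine(means, sds, signs=None):
    """Mean and standard deviation of a signed sum of independent
    normal variables, following equations (0).  A sign of -1 negates
    that variable's mean; the variances add regardless of sign."""
    if signs is None:
        signs = [1] * len(means)
    mean = sum(sign * m for sign, m in zip(signs, means))
    sd = math.sqrt(sum(s * s for s in sds))
    return mean, sd


def h_signed(x):
    """Density of the signed difference of two standard normals."""
    return math.exp(-x * x / 4) / (2 * math.sqrt(math.pi))


# Difference t1 - t2 of two standard normal variables.
mean, sd = combine([0.0, 0.0], [1.0, 1.0], signs=[1, -1])
diff = NormalDist(mean, sd)

# The two columns of densities should agree to rounding error.
for x in (-1.5, 0.0, 0.7, 2.0):
    print(x, h_signed(x), diff.pdf(x))
```

The unsigned density is then simply `2 * h_signed(x)` for x > 0.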

The additivity of normal distributions according to equations (0)
is so familiar that we often assume it's self-evident, but it's
interesting to review how this additivity (which is closely related
to the central limit theorem and the special properties of the normal
distribution) is actually proven.  To illustrate, let's just take the
simple case of finding the distribution of the difference between 2
standard normal random variables.  Letting f(t) denote the standard
normal density function, the probability that two independent random
samples t1 and t2 will differ by more than u can be expressed as

$$
\Pr\{|t_1 - t_2| > u\} = \int_{s=-\infty}^{\infty} f(s) \left[ \int_{t=s+u}^{\infty} f(t)\,dt + \int_{t=-\infty}^{s-u} f(t)\,dt \right] ds
$$

Now, if we let F(x) denote the normal probability function given by
integrating the normal density function over the upper tail

$$
F(x) = \int_{t=x}^{\infty} f(t)\,dt
$$

and if we note that, by the symmetry of f, the two terms inside the
square brackets of the prior expression contribute equally when
integrated over s, we have (writing x in place of u)

$$
\Pr\{|t_1 - t_2| > x\} = 2 \int_{s=-\infty}^{\infty} f(s)\, F(s+x)\, ds
\tag{1}
$$
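
Equation (1) is easy to evaluate numerically.  The sketch below uses a
plain trapezoid rule over a truncated interval.  The function name
`prob_diff_exceeds`, the limits of +/-10, and the step count are
assumptions chosen for illustration.  The upper-tail function F is
computed from the complementary error function via the standard
identity F(x) = erfc(x/sqrt(2))/2.

```python
import math


def f(t):
    """Standard normal density."""
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)


def F(x):
    """Upper-tail probability: integral of f from x to infinity."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def prob_diff_exceeds(x, lo=-10.0, hi=10.0, steps=4000):
    """Trapezoid-rule estimate of 2 * integral of f(s) F(s+x) ds.

    The infinite range is truncated to [lo, hi]; the integrand is
    negligible outside that interval for moderate x."""
    step = (hi - lo) / steps
    total = 0.5 * (f(lo) * F(lo + x) + f(hi) * F(hi + x))
    for k in range(1, steps):
        s = lo + k * step
        total += f(s) * F(s + x)
    return 2 * step * total


# At x = 0 the two samples differ by more than 0 with probability 1.
print(prob_diff_exceeds(0.0))
print(prob_diff_exceeds(1.0))
```

A sanity check is that the value at x = 0 should be very close to 1 and
that the probability should fall steadily as x grows.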

Differentiating 1 minus this function with respect to x (and noting
that the derivative of F is -f) gives the density distribution

$$
h(x) = 2 \int_{s=-\infty}^{\infty} f(s)\, f(s+x)\, ds
$$

which can be evaluated explicitly to give the unsigned density
distribution

$$
h(x) = \frac{e^{-x^2/4}}{\sqrt{\pi}}, \qquad x > 0
$$
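
For reference, one way to carry out the explicit evaluation is to
complete the square in the exponent.  This is a sketch of the standard
argument, and it relies only on the Gaussian integral of exp(-v^2)
over the real line being sqrt(pi):

$$
\begin{aligned}
f(s)\, f(s+x) &= \frac{1}{2\pi} \exp\!\left( -\frac{s^2 + (s+x)^2}{2} \right)
  = \frac{1}{2\pi} \exp\!\left( -\left(s + \frac{x}{2}\right)^2 - \frac{x^2}{4} \right), \\
h(x) &= 2 \cdot \frac{e^{-x^2/4}}{2\pi} \int_{s=-\infty}^{\infty} \exp\!\left( -\left(s + \frac{x}{2}\right)^2 \right) ds
  = \frac{e^{-x^2/4}}{\pi} \sqrt{\pi}
  = \frac{e^{-x^2/4}}{\sqrt{\pi}} .
\end{aligned}
$$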

This confirms what we already knew, namely, that the density
distribution of the difference between two samples from a standard
normal distribution is just a scaled version of the standard normal
density, i.e.,

$$
h(x) = \sqrt{2}\, f\!\left( \frac{x}{\sqrt{2}} \right), \qquad x > 0
$$

It follows that the probability that the difference between two random
samples t1,t2 from a standard normal distribution will exceed x is
exactly

$$
\Pr\{|t_2 - t_1| > x\} = 2\, F\!\left( \frac{x}{\sqrt{2}} \right), \qquad x > 0
$$

Tables of the normal density integral F (or sometimes 1-F) are given
in many statistics books, so this formula is convenient for evaluating
the probability of differences of various magnitudes.
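
Where tables are not at hand, the same formula can be evaluated with
the complementary error function, since F(z) = erfc(z/sqrt(2))/2 and
hence 2 F(x/sqrt(2)) = erfc(x/2).  The Python sketch below does this and
compares the result against a simple Monte Carlo estimate.  The function
names, the sample count, and the seed are assumptions made for
illustration.

```python
import math
import random


def prob_diff_exceeds_closed(x):
    """Pr{|t2 - t1| > x} = 2 F(x / sqrt(2)) = erfc(x / 2), for x >= 0."""
    return math.erfc(x / 2)


def prob_diff_exceeds_mc(x, trials=200_000, seed=12345):
    """Monte Carlo estimate using pairs of independent standard normals."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if abs(rng.gauss(0.0, 1.0) - rng.gauss(0.0, 1.0)) > x:
            hits += 1
    return hits / trials


# The two columns should agree to within sampling error.
for x in (0.5, 1.0, 2.0):
    print(x, prob_diff_exceeds_closed(x), prob_diff_exceeds_mc(x))
```

The Monte Carlo column is only a rough confirmation; its error shrinks
like one over the square root of the number of trials.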

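Notice that this is a special case of the more general problem of
finding the probability density function for the RANGE of n samples,
i.e., the difference between the max and min values of n samples
drawn from a population with density f(x) and upper-tail probability
function F(x) as defined above.  In this case it's more convenient to
express the generalization of (1) in terms of the probability that the
range of n samples will be LESS than x, which is given by the integral

$$
\Pr\{|t_{\max} - t_{\min}| < x\} = n \int_{t=-\infty}^{\infty} \left[ F(t) - F(t+x) \right]^{n-1} f(t)\, dt
$$

Here F(t) - F(t+x) is the probability that a single sample falls
between t and t+x, so the integrand is the probability that the
smallest sample is at t and the other n-1 samples lie within x above
it.  For n=2 this reduces to 1 minus the right side of (1).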

However, for n greater than 2, this integral cannot be evaluated in
closed form (as far as I know), nor expressed simply in terms of the
standard normal functions, so it must be evaluated numerically.
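
A minimal sketch of such a numerical evaluation for the standard normal
case is given below.  It takes the bracketed factor in the integrand to
be the probability that a single sample falls between t and t+x,
computed here from the ordinary cumulative distribution function `cdf`.
The function name `range_cdf`, the truncation limits, and the step count
are assumptions, and a plain trapezoid rule stands in for whatever
quadrature method one prefers.

```python
import math


def pdf(t):
    """Standard normal density."""
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)


def cdf(t):
    """Standard normal cumulative distribution (lower tail)."""
    return 0.5 * math.erfc(-t / math.sqrt(2))


def range_cdf(x, n, lo=-10.0, hi=10.0, steps=4000):
    """Trapezoid-rule estimate of Pr{range of n samples < x}.

    Integrates n * [P(t <= sample <= t+x)]^(n-1) * pdf(t) over t,
    with the infinite range truncated to [lo, hi]."""
    if x <= 0:
        return 0.0

    def integrand(t):
        return (cdf(t + x) - cdf(t)) ** (n - 1) * pdf(t)

    step = (hi - lo) / steps
    total = 0.5 * (integrand(lo) + integrand(hi))
    for k in range(1, steps):
        total += integrand(lo + k * step)
    return n * step * total


# For n = 2 this should match 1 - erfc(x/2), the two-sample result.
print(range_cdf(1.0, 2), 1 - math.erfc(0.5))
# Larger n makes a small range less likely.
for n in (2, 3, 5, 10):
    print(n, range_cdf(1.0, n))
```

The n = 2 comparison gives a useful check on the quadrature settings
before trusting the values for larger n.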