# Measures of Spread in Discrete Distributions

.

In Year 9, we worked with:

• Range (largest minus smallest) and
• Inter-Quartile Range (Upper Quartile minus Lower Quartile)

as two ways of measuring the spread of the data.
.

In higher levels of maths we use variance and standard deviation:

• They give us a finer measurement of the spread of the data.
• This is because they are based on all the data, not just the endpoints (or the quartiles).

.

For both measures:

• a larger value means the data is more spread out,
• a smaller value means the data is clumped more closely together.

.

## Variance

.

Notation:

… … Var(X) … … or

… … $\sigma^2$ … … { $\sigma$ is the lower case sigma }

.

The rule for the Variance of the Variable X is:

… … $\sigma^2 = E\big[(X - \mu)^2\big]$ … … { $\mu$ is the mean or Expected Value of X }
.

… … $\sigma^2 = \sum \, (x - \mu)^2 \times p(x)$

.

### Example 1a (note shortcut below)

.

In the distribution below, calculate Var(X)

| $x$ | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| $p(x)$ | 0.25 | 0.35 | 0.2 | 0.2 |

Solution

… … First calculate the mean (Expected Value)

… … … … $\mu = \Sigma \; x \times p(x)$
.

… … … … $\mu = 0\times 0.25 + 1 \times 0.35 + 2 \times 0.2 + 3 \times 0.2$
.

… … … … $\mu = 1.35$

.

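… … Then fill in the table as shown:

| $x$ | $p(x)$ | $(x - \mu)^2$ | $(x - \mu)^2 \times p(x)$ |
|---|---|---|---|
| 0 | 0.25 | 1.8225 | 0.455625 |
| 1 | 0.35 | 0.1225 | 0.042875 |
| 2 | 0.2 | 0.4225 | 0.0845 |
| 3 | 0.2 | 2.7225 | 0.5445 |
| Total | 1 | | 1.1275 |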

.

… … Hence the Variance Var(X) is:
.

… … … … $\sigma^2 = \sum \, (x - \mu)^2 \times p(x)$

… … … … $\sigma^2 = 0.455625 + 0.042875 + 0.0845 + 0.5445$
.

… … … … $\sigma^2 = 1.1275$
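
The calculation above can be checked with a few lines of Python. This is only a sketch: the values of x and p(x) are read off the mean calculation in this example, and the function names `mean` and `variance_by_definition` are made up for illustration.

```python
# Distribution read off the mean calculation in Example 1a (an assumption).
xs = [0, 1, 2, 3]
ps = [0.25, 0.35, 0.2, 0.2]

def mean(xs, ps):
    """Expected value: the sum of x * p(x)."""
    return sum(x * p for x, p in zip(xs, ps))

def variance_by_definition(xs, ps):
    """Variance from the definition: the sum of (x - mu)^2 * p(x)."""
    mu = mean(xs, ps)
    return sum((x - mu) ** 2 * p for x, p in zip(xs, ps))

mu = mean(xs, ps)
var = variance_by_definition(xs, ps)
print(round(mu, 4), round(var, 4))  # 1.35 1.1275
```

Rounding is applied only when printing, so tiny floating-point errors do not show up in the result.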
.

… … {This is a lot of calculations. There is an easier way!!}

.

## Alternate Rule for Variance

.

… … $\sigma^2 = E(X^2) - \mu^2$
.

Recall that:
.

… … $E(X) = \sum \, x \, p(x)$

… … so, replacing $x$ with $x^2$,

… … $E(X^2) = \sum \, x^2 \, p(x)$
.

So the alternate rule for Variance can be expressed as:
.

… … $\sigma^2 = \sum \, x^2 \, p(x) - \mu^2$
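
One way to see why the alternate rule agrees with the original rule is to expand the square and use $E(X) = \mu$. This is a short sketch, treating $\mu$ as a constant:

… … $E\big[(X - \mu)^2\big] = E(X^2 - 2\mu X + \mu^2)$

… … … … $= E(X^2) - 2\mu \, E(X) + \mu^2$

… … … … $= E(X^2) - 2\mu^2 + \mu^2$

… … … … $= E(X^2) - \mu^2$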

.

### Example 1b (using alternate rule)

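In the distribution below (the same one as in Example 1a), calculate Var(X)

| $x$ | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| $p(x)$ | 0.25 | 0.35 | 0.2 | 0.2 |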

Solution

… … First calculate the mean (Expected Value)

… … … … $\mu = \Sigma \; x \times p(x)$
.

… … … … $\mu = 0\times 0.25 + 1 \times 0.35 + 2 \times 0.2 + 3 \times 0.2$
.

… … … … $\mu = 1.35$
.

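… … Then fill in the table as shown

| $x$ | $p(x)$ | $x^2$ | $x^2 \times p(x)$ |
|---|---|---|---|
| 0 | 0.25 | 0 | 0 |
| 1 | 0.35 | 1 | 0.35 |
| 2 | 0.2 | 4 | 0.8 |
| 3 | 0.2 | 9 | 1.8 |
| Total | 1 | | 2.95 |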

… … … $E(X^2) = \sum \, x^2 \times p(x)$

… … … $E(X^2) = 2.95$

… … So

… … … $\sigma^2 = E(X^2) - \mu^2$

… … … $\sigma^2 = 2.95 - 1.35^2$

… … … $\sigma^2 = 1.1275$
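
As a rough check, the alternate rule can be computed in Python and compared with the definition. The distribution values are taken from the mean calculation in this example, and the variable names are illustrative only.

```python
# Distribution from Example 1 (values read off the mean calculation).
xs = [0, 1, 2, 3]
ps = [0.25, 0.35, 0.2, 0.2]

mu = sum(x * p for x, p in zip(xs, ps))         # E(X)
e_x2 = sum(x ** 2 * p for x, p in zip(xs, ps))  # E(X^2)
var_alt = e_x2 - mu ** 2                        # alternate rule

# Definition of variance, for comparison.
var_def = sum((x - mu) ** 2 * p for x, p in zip(xs, ps))

print(round(e_x2, 4), round(var_alt, 4))  # 2.95 1.1275
print(abs(var_alt - var_def) < 1e-12)     # True
```

The two results are compared with a small tolerance rather than `==`, because floating-point arithmetic can differ in the last few digits.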
.

## Variance Theorems

… … $Var(aX) = a^2 Var(X)$ … … where a is any constant

… … $Var(X + b) = Var(X)$ … … where b is any constant

… … {The second result holds because adding b shifts every value, and the mean, by the same amount, so the spread of the data is not affected}
.

… … Eg: $Var(3X + 2) = 3^2 \, Var(X) = 9Var(X)$
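
The two theorems can be checked numerically. The sketch below assumes the distribution from Example 1 and builds the values of $3X + 2$ directly; the helper name `variance` is hypothetical.

```python
def variance(xs, ps):
    """Variance from the definition: the sum of (x - mu)^2 * p(x)."""
    mu = sum(x * p for x, p in zip(xs, ps))
    return sum((x - mu) ** 2 * p for x, p in zip(xs, ps))

# Distribution from Example 1 (values read off the mean calculation).
xs = [0, 1, 2, 3]
ps = [0.25, 0.35, 0.2, 0.2]

a, b = 3, 2
ys = [a * x + b for x in xs]  # values of 3X + 2, with the same probabilities

var_x = variance(xs, ps)
var_y = variance(ys, ps)
print(round(var_y, 4), round(a ** 2 * var_x, 4))  # 10.1475 10.1475
```

Changing `b` leaves `var_y` unchanged, while changing `a` scales it by `a ** 2`.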

.

## Standard Deviation

The Standard Deviation of X is the square root of the Variance.

Hence the Standard Deviation is simply $\sigma$

… … $\sigma = \sqrt{Var(X)}$

.
The Standard Deviation is useful because taking the square root undoes the squaring in the calculation for Variance.

Therefore Standard Deviation is measured in the same units as the original data.

.

### Example 1c

Calculate the Standard Deviation for the distribution in Examples 1a and 1b above

Solution

… … In the above example, the variance was:

… … … … $Var(X) = \sigma^2 = 1.1275$

… … hence

… … … … $SD(X) = \sigma = \sqrt{1.1275} \approx 1.0618$
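
The same step as a one-line check in Python, using the variance found above (the variable names are illustrative):

```python
import math

variance = 1.1275           # Var(X) from the example above
sd = math.sqrt(variance)    # standard deviation is the square root of the variance
print(round(sd, 4))         # 1.0618
```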

.

## Mean and Standard Deviation on the CAS Calculator

… … From the Main Menu select the Statistics package.

… … Enter the x values into list1 and $Pr(X=x)$ into list2

… … Go to the Calc menu and Select One-Variable

… … Set Xlist to list1

… … Set Freq to list2

… … Mean will appear as $\bar{x}$

… … Standard Deviation will appear as $\sigma_x$

.

For the Variance you will need to square the Standard Deviation

.

## Confidence Intervals

In many distributions (particularly those that are roughly symmetric and bell-shaped), approximately 95% of the data will lie within two standard deviations either side of the mean.

This means that approximately 95% of the data will lie between

… … $\mu - 2\sigma \text{ and } \mu + 2\sigma$

OR

… … $Pr(\mu - 2\sigma \leqslant X \leqslant \mu + 2\sigma) \approx 0.95$

.
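In this course this is referred to as the 95% confidence interval. (Elsewhere in statistics, the term "confidence interval" usually means an interval estimate calculated from a sample.)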

.

## Other Confidence Intervals

(Not in the course. Why not is a very deep mystery, since they are mentioned in continuous distributions.)

68% confidence interval (1 standard deviation from the mean)

… … $Pr(\mu - \sigma \leqslant X \leqslant \mu + \sigma) \approx 0.68$

.

99.7% confidence interval (3 standard deviations from the mean)

… … $Pr(\mu - 3\sigma \leqslant X \leqslant \mu + 3\sigma) \approx 0.997$

.

### Example 2

The probability distribution of X is given by

… … $p(x) = \dfrac{x^2}{54} \qquad \text { where } \; x \in \big\{ 2,\; 3,\; 4,\; 5 \big\}$

a) .. find the mean, variance and the standard deviation correct to 4 decimal places

b) .. hence find the probability that x is within 2 standard deviations of the mean correct to 4 decimal places
.

Solution

a) .. find the mean, variance and the standard deviation correct to 4 decimal places

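… … First fill out the probability distribution table

| $x$ | 2 | 3 | 4 | 5 |
|---|---|---|---|---|
| $p(x)$ | $\dfrac{4}{54}$ | $\dfrac{9}{54}$ | $\dfrac{16}{54}$ | $\dfrac{25}{54}$ |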

… … Mean (Expected Value)

… … … … $\mu = E(X) = \Sigma \; xp(x)$

… … … … $\mu = 2 \times \dfrac{4}{54} + 3 \times \dfrac{9}{54} + 4 \times \dfrac{16}{54} + 5 \times \dfrac{25}{54}$

… … … … $\mu = 4\dfrac{8}{54}$

… … … … $\mu \approx 4.1481$
.

… … Variance

… … … … $E(X^2) = 4 \times \dfrac{4}{54} + 9 \times \dfrac{9}{54} + 16 \times \dfrac{16}{54} + 25 \times \dfrac{25}{54}$

… … … … $E(X^2) = 18\dfrac{6}{54}$

… … … … $E(X^2) \approx 18.1111$

.
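… … … … $\sigma^2 = E(X^2) - \mu^2$

… … … … $\sigma^2 = \dfrac{978}{54} - \left(\dfrac{224}{54}\right)^2$ … … {use the unrounded values; substituting the rounded 18.1111 and 4.1481 gives 0.9044}

… … … … $\sigma^2 \approx 0.9040$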
.

… … Standard Deviation

… … … … $\sigma = \sqrt{0.9040}$

… … … … $\sigma \approx 0.9508$
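
Part a) can be checked with exact fractions, which avoids rounding until the very end. This is a sketch using Python's standard `fractions` module; the variable names are illustrative only.

```python
from fractions import Fraction
import math

xs = [2, 3, 4, 5]
ps = [Fraction(x * x, 54) for x in xs]  # p(x) = x^2 / 54

assert sum(ps) == 1                     # the probabilities must total 1

mu = sum(x * p for x, p in zip(xs, ps))        # E(X), exact
e_x2 = sum(x * x * p for x, p in zip(xs, ps))  # E(X^2), exact
var = e_x2 - mu ** 2                           # alternate rule, exact
sd = math.sqrt(var)                            # converted to a float here

print(mu, e_x2, var)
print(round(float(mu), 4), round(float(var), 4), round(sd, 4))
```

The exact values are 112/27, 163/9 and 659/729. Rounded to 4 decimal places these agree with 4.1481, 0.9040 and 0.9508 (Python prints 0.9040 as `0.904`).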
.

b) .. hence find the probability that x is within 2 standard deviations of the mean correct to 4 decimal places

… … Confidence Interval

… … … … $\mu-2\sigma \approx 2.2466$ … … {using unrounded values of $\mu$ and $\sigma$}

… … … … $\mu+2\sigma \approx 6.0497$

… … … … but $x \in \big\{2,\; 3,\; 4,\; 5\big\}$

… … … … so

… … … … $Pr(\mu-2\sigma \leqslant X \leqslant \mu+2\sigma)$

… … … … … $= Pr(2.2466 \leqslant X \leqslant 6.0497)$

… … … … … $= Pr(3 \leqslant X \leqslant 5)$

… … … … … $= \dfrac{9}{54} + \dfrac{16}{54} + \dfrac{25}{54}$

… … … … … $= \dfrac{50}{54}$

… … … … … $= 0.9259$ … or … $92.59\%$
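
Part b) can be checked the same way. The sketch below recomputes the mean and standard deviation, then adds up p(x) for every x inside the interval; `prob_within` is a hypothetical helper name.

```python
from fractions import Fraction
import math

xs = [2, 3, 4, 5]
ps = [Fraction(x * x, 54) for x in xs]  # p(x) = x^2 / 54

mu = sum(x * p for x, p in zip(xs, ps))
var = sum(x * x * p for x, p in zip(xs, ps)) - mu ** 2
sd = math.sqrt(var)

def prob_within(k):
    """Pr(mu - k*sd <= X <= mu + k*sd) for this distribution."""
    lo, hi = float(mu) - k * sd, float(mu) + k * sd
    return sum(p for x, p in zip(xs, ps) if lo <= x <= hi)

p2 = prob_within(2)
print(p2, round(float(p2), 4))  # 25/27 0.9259
```

Only x = 2 falls outside the interval, so the result is 50/54, which Python reduces to 25/27.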
.

… … We expected this probability to be approximately 0.95. The value of 0.9259 is sufficiently close, given that X takes only four values.

.