
Chapter 2 Minimum Variance Unbiased Estimation

Introduction

- In this chapter we begin our search for good estimators of unknown deterministic parameters.
- We restrict our attention to estimators which on the average yield the true parameter value.
- Then, within this class of estimators, the goal is to find the one that exhibits the least variability.
- The estimator thus obtained will produce values close to the true value most of the time.
- The notion of a minimum variance unbiased estimator is examined within this chapter.

Unbiased Estimators

- For an estimator to be unbiased we mean that on the average the estimator will yield the true value of the unknown parameter.
- Since the parameter value may in general be anywhere in the interval $a < \theta < b$, unbiasedness asserts that no matter what the true value of $\theta$, our estimator will yield it on the average:
  $E(\hat{\theta}) = \theta, \quad a < \theta < b$    (2.1)

Example 2.1 (1/2)

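- Consider the observations
  $x[n] = A + w[n], \quad n = 0, 1, \ldots, N-1$
  where $A$ is the parameter to be estimated and $w[n]$ is white Gaussian noise (WGN). The parameter $A$ can take on any value in the interval $-\infty < A < \infty$.
- A reasonable estimator for the average value of $x[n]$ is the sample mean
  $\hat{A} = \frac{1}{N} \sum_{n=0}^{N-1} x[n]$    (2.2)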

Example 2.1 (2/2)

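- Due to the linearity properties of the expectation operator,
  $E(\hat{A}) = E\left[\frac{1}{N} \sum_{n=0}^{N-1} x[n]\right] = \frac{1}{N} \sum_{n=0}^{N-1} E(x[n]) = \frac{1}{N} \sum_{n=0}^{N-1} A = A$
  for all $A$. The sample mean estimator is therefore unbiased.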

Unbiased Estimators

- The restriction that $E(\hat{\theta}) = \theta$ for all $\theta$ is an important one.
- It is possible that $E(\hat{\theta}) = \theta$ may hold for some values of $\theta$ and not others.

Example 2.2

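- Consider again Example 2.1 but with the modified sample mean estimator
  $\check{A} = \frac{1}{2N} \sum_{n=0}^{N-1} x[n]$
- Then,
  $E(\check{A}) = \frac{1}{2} A$, which equals $A$ if $A = 0$ and differs from $A$ if $A \neq 0$.
- It is seen that the unbiasedness condition (2.3) holds for the modified estimator only for $A = 0$.
- Clearly, it is a biased estimator.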
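The unbiasedness claims of Examples 2.1 and 2.2 can be checked numerically. The following is a minimal Monte Carlo sketch and not part of the original slides; the function name `simulate_estimators`, the noise level `sigma`, the record length `N`, and the trial count are illustrative assumptions.

```python
import random
import statistics

def simulate_estimators(A, sigma=1.0, N=10, trials=20000, seed=0):
    """Monte Carlo averages of the sample mean and of the halved sample mean.

    Data model (as in Example 2.1): x[n] = A + w[n], w[n] zero-mean Gaussian.
    All default values are assumptions chosen only for illustration.
    """
    rng = random.Random(seed)
    sample_means = []
    halved_means = []
    for _ in range(trials):
        x = [A + rng.gauss(0.0, sigma) for _ in range(N)]
        total = sum(x)
        sample_means.append(total / N)        # estimator (2.2)
        halved_means.append(total / (2 * N))  # modified estimator of Example 2.2
    return statistics.fmean(sample_means), statistics.fmean(halved_means)

for A in (0.0, 1.0, -3.0):
    avg_mean, avg_halved = simulate_estimators(A)
    print(f"A={A:+.1f}: avg sample mean={avg_mean:+.3f}, avg halved mean={avg_halved:+.3f}")
```

For each tested value of A the averaged sample mean should land near A, while the averaged modified estimator should land near A/2, so the two agree with the true value simultaneously only when A = 0.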

Unbiased Estimators

- That an estimator is unbiased does not necessarily mean that it is a good estimator.
- It only guarantees that on the average it will attain the true value.
- A persistent bias, on the other hand, will always result in a poor estimator.
- As an example, the unbiased property has an important implication when several estimators $\{\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_n\}$ of the same parameter are combined. A reasonable procedure is to combine these estimates into a better one by averaging them to form
  $\hat{\theta} = \frac{1}{n} \sum_{i=1}^{n} \hat{\theta}_i$

Unbiased Estimators

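- Assuming the estimators are unbiased, with the same variance, and uncorrelated with each other,
  $E(\hat{\theta}) = \theta$
  and
  $\mathrm{var}(\hat{\theta}) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{var}(\hat{\theta}_i) = \frac{\mathrm{var}(\hat{\theta}_1)}{n}$
  so that as more estimates are averaged, the variance will decrease.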

Unbiased Estimators

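- However, if the estimators are each biased, i.e. $E(\hat{\theta}_i) = \theta + b(\theta)$, then
  $E(\hat{\theta}) = \frac{1}{n} \sum_{i=1}^{n} E(\hat{\theta}_i) = \theta + b(\theta)$
  and no matter how many estimators are averaged, $\hat{\theta}$ will not converge to the true value.
- Note that, in general, $b(\theta) = E(\hat{\theta}) - \theta$ is defined as the bias of the estimator.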
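The effect of averaging can be illustrated with a small simulation. This is a sketch under stated assumptions: each individual estimate is modelled as the true value plus a fixed bias plus independent Gaussian noise, and the name `average_of_estimates` and all numeric values are hypothetical.

```python
import random
import statistics

def average_of_estimates(theta, bias, n, var_single=1.0, trials=2000, seed=1):
    """Empirical mean and variance of the average of n uncorrelated estimates.

    Assumed model for each estimate: theta + bias + Gaussian noise with
    variance var_single. The model and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    std = var_single ** 0.5
    averaged = []
    for _ in range(trials):
        estimates = [theta + bias + rng.gauss(0.0, std) for _ in range(n)]
        averaged.append(sum(estimates) / n)
    return statistics.fmean(averaged), statistics.pvariance(averaged)

theta = 2.0
for bias in (0.0, 0.5):
    for n in (1, 10, 100):
        mean_avg, var_avg = average_of_estimates(theta, bias, n)
        print(f"bias={bias:.1f}, n={n:3d}: mean={mean_avg:.3f}, variance={var_avg:.4f}")
```

In both cases the variance of the averaged estimate should shrink roughly as 1/n, but with a nonzero bias the mean stays near theta + bias however large n becomes.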


Minimum Variance Criterion

- In searching for optimal estimators we need to adopt some optimality criterion.
- A natural one is the mean square error (MSE), defined as
  $\mathrm{mse}(\hat{\theta}) = E[(\hat{\theta} - \theta)^2]$
- Unfortunately, adoption of this natural criterion leads to unrealizable estimators, ones that cannot be written solely as a function of the data.

Minimum Variance Criterion

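- To understand the problem which arises we first rewrite the MSE as
  $\mathrm{mse}(\hat{\theta}) = E\{[(\hat{\theta} - E(\hat{\theta})) + (E(\hat{\theta}) - \theta)]^2\} = \mathrm{var}(\hat{\theta}) + [E(\hat{\theta}) - \theta]^2 = \mathrm{var}(\hat{\theta}) + b^2(\theta)$    (2.6)
  which shows that the MSE is composed of errors due to the variance of the estimator as well as the bias.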

Minimum Variance Criterion

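- As an example, for the problem in Example 2.1 consider the modified estimator
  $\check{A} = a \frac{1}{N} \sum_{n=0}^{N-1} x[n]$
  for some constant $a$.
- We will attempt to find the $a$ which results in the minimum MSE.
- Since $E(\check{A}) = aA$ and $\mathrm{var}(\check{A}) = a^2 \sigma^2 / N$, where $\sigma^2$ is the variance of the WGN, we have
  $\mathrm{mse}(\check{A}) = \frac{a^2 \sigma^2}{N} + (a - 1)^2 A^2$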

Minimum Variance Criterion

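- Differentiating the MSE with respect to $a$ yields
  $\frac{d\,\mathrm{mse}(\check{A})}{da} = \frac{2 a \sigma^2}{N} + 2 (a - 1) A^2$
  which upon setting to zero and solving yields the optimum value
  $a_{\mathrm{opt}} = \frac{A^2}{A^2 + \sigma^2 / N}$
- It is seen that the optimal value of $a$ depends upon the unknown parameter $A$. The estimator is therefore not realizable.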
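A short numerical sketch makes the dependence on A concrete. It evaluates the MSE of the scaled sample mean, variance a^2 sigma^2 / N plus squared bias (a - 1)^2 A^2, and the gain that minimizes it; the function names and the chosen values of A, sigma2, and N are assumptions for illustration.

```python
def mse_scaled_mean(a, A, sigma2, N):
    """MSE of the estimator a * (sample mean): variance plus squared bias."""
    variance = a * a * sigma2 / N
    bias = (a - 1.0) * A
    return variance + bias * bias

def optimal_gain(A, sigma2, N):
    """Gain that sets d(mse)/da to zero; it requires the unknown A."""
    return A * A / (A * A + sigma2 / N)

sigma2, N = 1.0, 10  # illustrative values
for A in (0.1, 1.0, 10.0):
    a_opt = optimal_gain(A, sigma2, N)
    print(f"A={A:5.1f}: a_opt={a_opt:.4f}, "
          f"mse(a_opt)={mse_scaled_mean(a_opt, A, sigma2, N):.5f}, "
          f"mse(a=1)={mse_scaled_mean(1.0, A, sigma2, N):.5f}")
```

The optimal gain changes with every value of A, which is exactly why it cannot be computed from the data alone.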

Minimum Variance Criterion

- In retrospect the estimator depends upon $A$ since the bias term in (2.6) is a function of $A$.
- It would seem that any criterion which depends on the bias will lead to an unrealizable estimator.
- From a practical viewpoint the minimum MSE estimator needs to be abandoned.

Minimum Variance Criterion

- An alternative approach is to constrain the bias to be zero and find the estimator which minimizes the variance.
- Such an estimator is termed the minimum variance unbiased (MVU) estimator.
- Note from (2.6) that the MSE of an unbiased estimator is just the variance.
- Minimizing the variance of an unbiased estimator also has the effect of concentrating the PDF of the estimation error about zero.
- The estimation error will therefore be less likely to be large.

Existence of the Minimum Variance Unbiased Estimator

- The question arises as to whether an MVU estimator exists, i.e., an unbiased estimator with minimum variance for all $\theta$.

Example 2.3 (1/3)

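- Assume that we have two independent observations $x[0]$ and $x[1]$ with PDF
  $x[0] \sim \mathcal{N}(\theta, 1)$
  $x[1] \sim \mathcal{N}(\theta, 1)$ if $\theta \ge 0$, and $x[1] \sim \mathcal{N}(\theta, 2)$ if $\theta < 0$
- The two estimators
  $\hat{\theta}_1 = \frac{1}{2}(x[0] + x[1])$
  $\hat{\theta}_2 = \frac{2}{3} x[0] + \frac{1}{3} x[1]$
  can easily be shown to be unbiased.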

Example 2.3 (2/3)

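- To compute the variances we have that
  $\mathrm{var}(\hat{\theta}_1) = \frac{1}{4}(\mathrm{var}(x[0]) + \mathrm{var}(x[1]))$
  $\mathrm{var}(\hat{\theta}_2) = \frac{4}{9}\mathrm{var}(x[0]) + \frac{1}{9}\mathrm{var}(x[1])$
  so that
  $\mathrm{var}(\hat{\theta}_1) = 18/36$ if $\theta \ge 0$, and $27/36$ if $\theta < 0$
  and
  $\mathrm{var}(\hat{\theta}_2) = 20/36$ if $\theta \ge 0$, and $24/36$ if $\theta < 0$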

Example 2.3 (3/3)

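- Clearly, between these two estimators no MVU estimator exists: $\hat{\theta}_1$ has the smaller variance for $\theta \ge 0$, while $\hat{\theta}_2$ has the smaller variance for $\theta < 0$.
- No single estimator can have a variance uniformly less than or equal to the minima.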
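The variance comparison in Example 2.3 can be reproduced with exact arithmetic. This sketch assumes the PDFs of the example are x[0] ~ N(theta, 1) for all theta and x[1] ~ N(theta, 1) for theta >= 0 but N(theta, 2) for theta < 0, and that the two estimators use the weights (1/2, 1/2) and (2/3, 1/3); the helper names are hypothetical.

```python
from fractions import Fraction

def variance_of_x1(theta):
    """Assumed variance of x[1]: 1 for theta >= 0 and 2 for theta < 0."""
    return Fraction(1) if theta >= 0 else Fraction(2)

def estimator_variance(weights, theta):
    """Variance of w0*x[0] + w1*x[1] for independent observations.

    x[0] is assumed to have unit variance for every theta.
    """
    w0, w1 = weights
    return w0 * w0 * Fraction(1) + w1 * w1 * variance_of_x1(theta)

theta_hat_1 = (Fraction(1, 2), Fraction(1, 2))  # weights of the first estimator
theta_hat_2 = (Fraction(2, 3), Fraction(1, 3))  # weights of the second estimator

for theta in (1, -1):
    v1 = estimator_variance(theta_hat_1, theta)
    v2 = estimator_variance(theta_hat_2, theta)
    better = "first" if v1 < v2 else "second"
    print(f"theta={theta:+d}: var1={v1}, var2={v2}, smaller variance: {better}")
```

Which estimator wins depends on the sign of theta, so neither one is uniformly better than the other.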

Finding the Minimum Variance Unbiased Estimator

- Even if an MVU estimator exists, we may not be able to find it.
- In the next few chapters we shall discuss several possible approaches. They are:
  - Determine the Cramer-Rao lower bound (CRLB) and check to see if some estimator satisfies it (Chapters 3 and 4).
  - Apply the Rao-Blackwell-Lehmann-Scheffe (RBLS) theorem (Chapter 5).
  - Further restrict the class of estimators to be not only unbiased but also linear. Then, find the minimum variance estimator within this restricted class (Chapter 6).

Finding the Minimum Variance Unbiased Estimator

- The CRLB allows us to determine that for any unbiased estimator the variance must be greater than or equal to a given value.
- If an estimator exists whose variance equals the CRLB for each value of $\theta$, then it must be the MVU estimator.

Extension to a Vector Parameter

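- If $\boldsymbol{\theta} = [\theta_1\ \theta_2\ \ldots\ \theta_p]^T$ is a vector of unknown parameters, then we say that an estimator $\hat{\boldsymbol{\theta}} = [\hat{\theta}_1\ \hat{\theta}_2\ \ldots\ \hat{\theta}_p]^T$ is unbiased if
  $E(\hat{\theta}_i) = \theta_i, \quad a_i < \theta_i < b_i$    (2.7)
  for $i = 1, 2, \ldots, p$.
- By defining
  $E(\hat{\boldsymbol{\theta}}) = [E(\hat{\theta}_1)\ E(\hat{\theta}_2)\ \ldots\ E(\hat{\theta}_p)]^T$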

Extension to a Vector Parameter

- We can equivalently define an unbiased estimator to have the property
  $E(\hat{\boldsymbol{\theta}}) = \boldsymbol{\theta}$
  for every $\boldsymbol{\theta}$ contained within the space defined in (2.7).
- An MVU estimator has the additional property that $\mathrm{var}(\hat{\theta}_i)$ for $i = 1, 2, \ldots, p$ is minimum among all unbiased estimators.

Chapter 3 Cramer-Rao Lower Bound

Introduction

- Placing a lower bound on the variance of any unbiased estimator is useful in practice: if an estimator attains the bound, we can assert that it is the MVU estimator.
- Although many such variance bounds exist [McAulay and Hofstetter 1971, Kendall and Stuart 1979, Seidman 1970, Ziv and Zakai 1969], the Cramer-Rao lower bound (CRLB) is the easiest to determine.

3.3 Estimator Accuracy Considerations

- Consider the hidden factors that determine how well we can estimate a parameter.
- The more the PDF is influenced by the unknown parameter, the better we should be able to estimate it.
- Example 3.1, PDF dependence on unknown parameter: Suppose a single sample is observed as
  $x[0] = A + w[0]$
  where $w[0] \sim \mathcal{N}(0, \sigma^2)$, and it is desired to estimate $A$.

3.3 Estimator Accuracy Considerations

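- Example 3.1 (cont.): A good unbiased estimator is $\hat{A} = x[0]$. The variance is $\sigma^2$, so the estimator accuracy improves as $\sigma^2$ decreases. If
  $p_i(x[0]; A) = \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left[-\frac{1}{2\sigma_i^2}(x[0] - A)^2\right], \quad i = 1, 2$
  with $\sigma_1 < \sigma_2$, then, viewed as a function of $A$ for a fixed $x[0]$, the latter is a much weaker dependence on $A$.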

3.3 Estimator Accuracy Considerations

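- The sharpness of the likelihood function determines how accurately we can estimate the unknown parameter. It is measured by the negative of the second derivative of the log-likelihood function at its peak, i.e. by its curvature.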

3.3 Estimator Accuracy Considerations

- For this example,
  $-\frac{\partial^2 \ln p(x[0]; A)}{\partial A^2} = \frac{1}{\sigma^2}$
  so the second derivative does not depend on $x[0]$.
- In general, a more appropriate measure of curvature is
  $-E\left[\frac{\partial^2 \ln p(x[0]; A)}{\partial A^2}\right]$
  which measures the average curvature of the log-likelihood function.
- The expectation is taken with respect to $p(x[0]; A)$, resulting in a function of $A$ only.
- The larger the quantity, the smaller the variance of the estimator.

3.4 Cramer-Rao Lower Bound

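- Theorem 3.1 (CRLB, Scalar Parameter): It is assumed that the PDF $p(\mathbf{x}; \theta)$ satisfies the regularity condition
  $E\left[\frac{\partial \ln p(\mathbf{x}; \theta)}{\partial \theta}\right] = 0$ for all $\theta$.
  Then, the variance of any unbiased estimator $\hat{\theta}$ must satisfy
  $\mathrm{var}(\hat{\theta}) \ge \frac{1}{-E\left[\frac{\partial^2 \ln p(\mathbf{x}; \theta)}{\partial \theta^2}\right]}$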

3.4 Cramer-Rao Lower Bound

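- Theorem 3.1 (cont.): Furthermore, an unbiased estimator attains the bound for all $\theta$ if and only if
  $\frac{\partial \ln p(\mathbf{x}; \theta)}{\partial \theta} = I(\theta)\,(g(\mathbf{x}) - \theta)$
  for some functions $g$ and $I$. That estimator, which is the MVU estimator, is $\hat{\theta} = g(\mathbf{x})$, and the minimum variance is $1 / I(\theta)$.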

3.4 Cramer-Rao Lower Bound

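- To prove: when the CRLB is attained, then $\mathrm{var}(\hat{\theta}) = 1 / I(\theta)$.
- Proof: Because the CRLB is attained,
  $\mathrm{var}(\hat{\theta}) = \frac{1}{-E\left[\frac{\partial^2 \ln p(\mathbf{x}; \theta)}{\partial \theta^2}\right]}$
  and
  $\frac{\partial \ln p(\mathbf{x}; \theta)}{\partial \theta} = I(\theta)\,(\hat{\theta} - \theta)$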

3.4 Cramer-Rao Lower Bound

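- Proof (cont.): Differentiating $\frac{\partial \ln p(\mathbf{x}; \theta)}{\partial \theta} = I(\theta)\,(\hat{\theta} - \theta)$ with respect to $\theta$, we get
  $\frac{\partial^2 \ln p(\mathbf{x}; \theta)}{\partial \theta^2} = \frac{\partial I(\theta)}{\partial \theta}(\hat{\theta} - \theta) - I(\theta)$
  and then, taking the negative expected value and using $E(\hat{\theta}) = \theta$,
  $-E\left[\frac{\partial^2 \ln p(\mathbf{x}; \theta)}{\partial \theta^2}\right] = -\frac{\partial I(\theta)}{\partial \theta}[E(\hat{\theta}) - \theta] + I(\theta) = I(\theta)$
  Finally,
  $\mathrm{var}(\hat{\theta}) = \frac{1}{I(\theta)}$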

3.4 Cramer-Rao Lower Bound

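- Regularity: the regularity condition holds whenever the order of differentiation and integration may be interchanged, since then
  $E\left[\frac{\partial \ln p(\mathbf{x}; \theta)}{\partial \theta}\right] = \int \frac{\partial p(\mathbf{x}; \theta)}{\partial \theta}\, d\mathbf{x} = \frac{\partial}{\partial \theta} \int p(\mathbf{x}; \theta)\, d\mathbf{x} = \frac{\partial 1}{\partial \theta} = 0$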

3.4 Cramer-Rao Lower Bound

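- Example 3.2, CRLB for Example 3.1: From $\ln p(x[0]; A) = -\ln\sqrt{2\pi\sigma^2} - \frac{1}{2\sigma^2}(x[0] - A)^2$,
  $\frac{\partial \ln p(x[0]; A)}{\partial A} = \frac{1}{\sigma^2}(x[0] - A), \quad \frac{\partial^2 \ln p(x[0]; A)}{\partial A^2} = -\frac{1}{\sigma^2}$
  so that $\mathrm{var}(\hat{A}) \ge \sigma^2$ for all $A$. The estimator $\hat{A} = x[0]$ has variance $\sigma^2$, attains the bound, and is therefore the MVU estimator.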

3.4 Cramer-Rao Lower Bound

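- Example 3.3, DC level in white Gaussian noise: Consider the multiple observations
  $x[n] = A + w[n], \quad n = 0, 1, \ldots, N-1$
  where $w[n]$ is WGN with variance $\sigma^2$. The PDF is
  $p(\mathbf{x}; A) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[-\frac{1}{2\sigma^2} \sum_{n=0}^{N-1} (x[n] - A)^2\right]$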

3.4 Cramer-Rao Lower Bound

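- Example 3.3 (cont.): Taking the first derivative of the log-likelihood,
  $\frac{\partial \ln p(\mathbf{x}; A)}{\partial A} = \frac{1}{\sigma^2} \sum_{n=0}^{N-1} (x[n] - A) = \frac{N}{\sigma^2}(\bar{x} - A)$
  where $\bar{x}$ is the sample mean. Differentiating again,
  $\frac{\partial^2 \ln p(\mathbf{x}; A)}{\partial A^2} = -\frac{N}{\sigma^2}$
  so the CRLB is
  $\mathrm{var}(\hat{A}) \ge \frac{\sigma^2}{N}$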

3.4 Cramer-Rao Lower Bound

- Example 3.3 (cont.): We see that the sample mean estimator attains the bound and must therefore be the MVU estimator.
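The curvature interpretation can be checked numerically for the DC level in WGN. The sketch below approximates the second derivative of the log-likelihood by a central difference and inverts its negative; because this log-likelihood is quadratic in A, the curvature does not depend on the data and no averaging over realizations is needed. The function names, step size `h`, and parameter values are assumptions.

```python
import math
import random

def log_likelihood(x, A, sigma2):
    """ln p(x; A) for x[n] = A + w[n] with WGN of variance sigma2."""
    N = len(x)
    sq_err = sum((xn - A) ** 2 for xn in x)
    return -0.5 * N * math.log(2.0 * math.pi * sigma2) - sq_err / (2.0 * sigma2)

def numerical_crlb(x, A, sigma2, h=1e-3):
    """1 / (negative curvature), curvature taken by central differences at A."""
    curvature = (log_likelihood(x, A + h, sigma2)
                 - 2.0 * log_likelihood(x, A, sigma2)
                 + log_likelihood(x, A - h, sigma2)) / (h * h)
    return 1.0 / (-curvature)

rng = random.Random(0)
A_true, sigma2, N = 1.5, 2.0, 10  # illustrative values
x = [A_true + rng.gauss(0.0, math.sqrt(sigma2)) for _ in range(N)]
crlb = numerical_crlb(x, A_true, sigma2)
print(f"numerical CRLB = {crlb:.6f}, sigma^2/N = {sigma2 / N:.6f}")
```

The two printed numbers should agree up to floating-point error, consistent with the bound sigma^2/N that the sample mean attains.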

3.4 Cramer-Rao Lower Bound

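- Example 3.4, Phase estimation: Assume that we wish to estimate the phase $\phi$ of a sinusoid embedded in WGN,
  $x[n] = A \cos(2\pi f_0 n + \phi) + w[n], \quad n = 0, 1, \ldots, N-1$
  The amplitude $A$ and frequency $f_0$ are assumed known.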

3.4 Cramer-Rao Lower Bound

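- Example 3.4 (cont.): For $f_0$ not near 0 or 1/2,
  $-E\left[\frac{\partial^2 \ln p(\mathbf{x}; \phi)}{\partial \phi^2}\right] = \frac{A^2}{\sigma^2} \sum_{n=0}^{N-1} \left[\frac{1}{2} - \frac{1}{2}\cos(4\pi f_0 n + 2\phi)\right] \approx \frac{N A^2}{2\sigma^2}$
  So we get
  $\mathrm{var}(\hat{\phi}) \ge \frac{2\sigma^2}{N A^2}$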

3.4 Cramer-Rao Lower Bound

- Example 3.4 (cont.): In this example the condition for the bound to be attained, i.e. to hold with equality, is not satisfied. Hence, a phase estimator that is unbiased and attains the CRLB does not exist.
- However, an MVU estimator may still exist.

3.4 Cramer-Rao Lower Bound

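- Efficiency vs. minimum variance: an estimator that is unbiased and attains the CRLB is said to be efficient. An MVU estimator may or may not be efficient.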

3.4 Cramer-Rao Lower Bound

- Fisher information properties:
  - Nonnegative
  - Additive for independent observations

3.4 Cramer-Rao Lower Bound

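- The latter property leads to the result that the CRLB for $N$ IID observations is $1/N$ times that for one observation.
- For completely dependent samples, for example $x[0] = x[1] = \cdots = x[N-1]$, we have $I(\theta) = i(\theta)$, where $i(\theta)$ is the Fisher information of a single sample. Additional observations then carry no information and the CRLB does not decrease with $N$.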

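3.5 General CRLB for Signals in White Gaussian Noise

- Consider a deterministic signal with an unknown parameter $\theta$ observed in WGN of variance $\sigma^2$:
  $x[n] = s[n; \theta] + w[n], \quad n = 0, 1, \ldots, N-1$
  The likelihood function is
  $p(\mathbf{x}; \theta) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left\{-\frac{1}{2\sigma^2} \sum_{n=0}^{N-1} (x[n] - s[n; \theta])^2\right\}$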

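3.5 General CRLB for Signals in White Gaussian Noise

- Differentiating the log-likelihood $\ln p(\mathbf{x}; \theta) = -\frac{N}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{n=0}^{N-1}(x[n] - s[n; \theta])^2$ twice with respect to $\theta$ and taking the expected value gives
  $E\left[\frac{\partial^2 \ln p(\mathbf{x}; \theta)}{\partial \theta^2}\right] = -\frac{1}{\sigma^2} \sum_{n=0}^{N-1} \left(\frac{\partial s[n; \theta]}{\partial \theta}\right)^2$
  so that, finally,
  $\mathrm{var}(\hat{\theta}) \ge \frac{\sigma^2}{\sum_{n=0}^{N-1} \left(\frac{\partial s[n; \theta]}{\partial \theta}\right)^2}$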

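3.5 General CRLB for Signals in White Gaussian Noise

- Example 3.5, Sinusoidal frequency estimation: Assume
  $s[n; f_0] = A \cos(2\pi f_0 n + \phi), \quad 0 < f_0 < \frac{1}{2}$
  where the amplitude $A$ and the phase $\phi$ are known. So we get the CRLB
  $\mathrm{var}(\hat{f}_0) \ge \frac{\sigma^2}{A^2 \sum_{n=0}^{N-1} [2\pi n \sin(2\pi f_0 n + \phi)]^2}$
- If $f_0 \to 0$ with $\phi = 0$, every term of the sum vanishes and the CRLB goes to infinity.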
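The general bound for a signal in WGN, sigma^2 divided by the sum over n of the squared derivative of s[n; theta] with respect to theta, is easy to evaluate numerically. The sketch below uses a central-difference derivative so that any signal model can be plugged in; the function name `crlb_wgn`, the step size, and the values A = 1, phi = 0, sigma2 = 1, N = 10 are illustrative assumptions.

```python
import math

def crlb_wgn(signal, theta, sigma2, N, h=1e-6):
    """sigma2 / sum_n (d s[n; theta] / d theta)^2 via central differences.

    `signal(n, theta)` returns the noiseless sample s[n; theta].
    """
    total = 0.0
    for n in range(N):
        ds = (signal(n, theta + h) - signal(n, theta - h)) / (2.0 * h)
        total += ds * ds
    return sigma2 / total

A, phi, sigma2, N = 1.0, 0.0, 1.0, 10  # assumed values for illustration

def sinusoid(n, f0):
    """Sinusoid with known amplitude A and phase phi, unknown frequency f0."""
    return A * math.cos(2.0 * math.pi * f0 * n + phi)

def dc_level(n, level):
    """Constant signal s[n; A] = A, as in the DC level example."""
    return level

print("DC level CRLB:", crlb_wgn(dc_level, 3.0, sigma2, N))
for f0 in (0.01, 0.1, 0.25, 0.4):
    print(f"f0={f0:.2f}: frequency CRLB={crlb_wgn(sinusoid, f0, sigma2, N):.6e}")
```

For the DC level the result should reduce to sigma2/N, and for the sinusoid with phi = 0 the bound should grow as f0 moves toward zero, where a small change in frequency barely alters the signal.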


3.6 Transformation of Parameters

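- Usually, the parameter we wish to estimate is a function of some more fundamental parameter.
- In Example 3.3, suppose we wish to estimate $A^2$ rather than $A$. Knowing the CRLB for $A$, we can easily obtain it for $A^2$.
- As shown in Appendix 3A, if it is desired to estimate $\alpha = g(\theta)$, then the CRLB is
  $\mathrm{var}(\hat{\alpha}) \ge \frac{\left(\frac{\partial g}{\partial \theta}\right)^2}{-E\left[\frac{\partial^2 \ln p(\mathbf{x}; \theta)}{\partial \theta^2}\right]}$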

3.6 Transformation of Parameters

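- For the present example this becomes $\alpha = g(A) = A^2$ and
  $\mathrm{var}(\widehat{A^2}) \ge \frac{(2A)^2}{N / \sigma^2} = \frac{4 A^2 \sigma^2}{N}$
- In Example 3.3, the sample mean estimator was efficient for $A$. It might be supposed that $\bar{x}^2$ is efficient for $A^2$.
- But actually, $\bar{x}^2$ is not even an unbiased estimator!
- Proof: Since $\bar{x} \sim \mathcal{N}(A, \sigma^2 / N)$,
  $E(\bar{x}^2) = E^2(\bar{x}) + \mathrm{var}(\bar{x}) = A^2 + \frac{\sigma^2}{N} \neq A^2$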

3.6 Transformation of Parameters

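- The efficiency of an estimator is destroyed by a nonlinear transformation.
- But the efficiency is maintained for linear (more precisely, affine) transformations.
- Proof: Assume that an efficient estimator for $\theta$ exists and is given by $\hat{\theta}$. It is desired to estimate $g(\theta) = a\theta + b$.
- We choose $\widehat{g(\theta)} = g(\hat{\theta}) = a\hat{\theta} + b$. Then
  $E(a\hat{\theta} + b) = aE(\hat{\theta}) + b = a\theta + b = g(\theta)$
  so that $\widehat{g(\theta)}$ is unbiased.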

3.6 Transformation of Parameters

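- The CRLB for $g(\theta) = a\theta + b$ is
  $\mathrm{var}(\widehat{g(\theta)}) \ge \frac{\left(\frac{\partial g}{\partial \theta}\right)^2}{I(\theta)} = a^2\, \mathrm{var}(\hat{\theta})$
  where the last step uses $\partial g / \partial \theta = a$ and $\mathrm{var}(\hat{\theta}) = 1 / I(\theta)$ for the efficient estimator $\hat{\theta}$ of $\theta$.
- But $\mathrm{var}(\widehat{g(\theta)}) = \mathrm{var}(a\hat{\theta} + b) = a^2\, \mathrm{var}(\hat{\theta})$, so that the CRLB is achieved.
- So, the efficiency is maintained for linear transformations.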

3.6 Transformation of Parameters

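- The efficiency is approximately maintained over nonlinear transformations if the data record is large enough.
- Example: estimating $A^2$ by $\bar{x}^2$. Although $\bar{x}^2$ is biased, we note that
  $E(\bar{x}^2) = A^2 + \frac{\sigma^2}{N} \to A^2$ as $N \to \infty$
  so $\bar{x}^2$ is asymptotically unbiased, or unbiased as $N \to \infty$.
- Since $\bar{x} \sim \mathcal{N}(A, \sigma^2 / N)$, we can evaluate the variance
  $\mathrm{var}(\bar{x}^2) = E(\bar{x}^4) - E^2(\bar{x}^2)$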

3.6 Transformation of Parameters

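- Using the result that if $\xi \sim \mathcal{N}(\mu, \sigma^2)$, then
  $E(\xi^2) = \mu^2 + \sigma^2, \quad E(\xi^4) = \mu^4 + 6\mu^2\sigma^2 + 3\sigma^4$
  and therefore
  $\mathrm{var}(\xi^2) = E(\xi^4) - E^2(\xi^2) = 4\mu^2\sigma^2 + 2\sigma^4$
- For our problem, with mean $A$ and variance $\sigma^2 / N$, we have
  $\mathrm{var}(\bar{x}^2) = \frac{4 A^2 \sigma^2}{N} + \frac{2\sigma^4}{N^2}$
- Hence, as $N \to \infty$, the variance approaches $4 A^2 \sigma^2 / N$, the CRLB for $A^2$, since the last term converges to zero faster than the first. $\bar{x}^2$ is therefore an asymptotically efficient estimator of $A^2$.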
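The asymptotic behaviour can be tabulated directly from the closed-form expressions: bias sigma^2/N, variance 4 A^2 sigma^2/N + 2 sigma^4/N^2, and CRLB 4 A^2 sigma^2/N for A^2. The following sketch only evaluates those formulas; the function names and the values A = 1, sigma2 = 1 are assumptions.

```python
def mean_sq_estimator_stats(A, sigma2, N):
    """Bias and variance of (sample mean)^2 used as an estimator of A^2."""
    var_mean = sigma2 / N                # variance of the sample mean
    bias = var_mean                      # E(xbar^2) - A^2
    variance = 4.0 * A * A * var_mean + 2.0 * var_mean ** 2
    return bias, variance

def crlb_for_A_squared(A, sigma2, N):
    """CRLB for estimating A^2 from N samples of a DC level in WGN."""
    return 4.0 * A * A * sigma2 / N

A, sigma2 = 1.0, 1.0  # illustrative values
for N in (10, 100, 1000, 10000):
    bias, variance = mean_sq_estimator_stats(A, sigma2, N)
    ratio = variance / crlb_for_A_squared(A, sigma2, N)
    print(f"N={N:5d}: bias={bias:.5f}, variance/CRLB={ratio:.5f}")
```

Both the bias and the excess of the variance over the CRLB shrink as N grows, with the ratio equal to 1 + sigma^2 / (2 N A^2).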

3.6 Transformation of Parameters

- This situation occurs due to the statistical linearity of the transformation, as illustrated in the figure. As $N$ increases, the PDF of $\bar{x}$ becomes more concentrated about the mean $A$.

3.6 Transformation of Parameters

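- If we linearize $g$ about $A$, we have the approximation
  $g(\bar{x}) \approx g(A) + \frac{dg(A)}{dA}(\bar{x} - A)$
- Within this approximation,
  $E[g(\bar{x})] = g(A) = A^2$
  so the estimator is unbiased (asymptotically). Also,
  $\mathrm{var}[g(\bar{x})] = \left[\frac{dg(A)}{dA}\right]^2 \mathrm{var}(\bar{x}) = \frac{(2A)^2 \sigma^2}{N} = \frac{4 A^2 \sigma^2}{N}$
  so the estimator achieves the CRLB (asymptotically).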

3.7 Extension to a Vector Parameter

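- Now we extend the results to a vector parameter $\boldsymbol{\theta} = [\theta_1\ \theta_2\ \ldots\ \theta_p]^T$.
- As derived in Appendix 3B, the CRLB is found as the $[i, i]$ element of the inverse of a matrix,
  $\mathrm{var}(\hat{\theta}_i) \ge [\mathbf{I}^{-1}(\boldsymbol{\theta})]_{ii}$
  where $\mathbf{I}(\boldsymbol{\theta})$ is the $p \times p$ Fisher information matrix, defined by
  $[\mathbf{I}(\boldsymbol{\theta})]_{ij} = -E\left[\frac{\partial^2 \ln p(\mathbf{x}; \boldsymbol{\theta})}{\partial \theta_i\, \partial \theta_j}\right]$
  for $i = 1, 2, \ldots, p$ and $j = 1, 2, \ldots, p$.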

3.7 Extension to a Vector Parameter

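- Example 3.6, DC Level in White Gaussian Noise: We now extend Example 3.3 to the case where in addition to $A$ the noise variance $\sigma^2$ is also unknown.
- The parameter vector is $\boldsymbol{\theta} = [A\ \sigma^2]^T$, hence $p = 2$. The $2 \times 2$ Fisher information matrix is
  $\mathbf{I}(\boldsymbol{\theta}) = \begin{bmatrix} -E\left[\frac{\partial^2 \ln p(\mathbf{x}; \boldsymbol{\theta})}{\partial A^2}\right] & -E\left[\frac{\partial^2 \ln p(\mathbf{x}; \boldsymbol{\theta})}{\partial A\, \partial \sigma^2}\right] \\ -E\left[\frac{\partial^2 \ln p(\mathbf{x}; \boldsymbol{\theta})}{\partial \sigma^2\, \partial A}\right] & -E\left[\frac{\partial^2 \ln p(\mathbf{x}; \boldsymbol{\theta})}{\partial (\sigma^2)^2}\right] \end{bmatrix}$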

3.7 Extension to a Vector Parameter

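- The log-likelihood function is
  $\ln p(\mathbf{x}; \boldsymbol{\theta}) = -\frac{N}{2}\ln 2\pi - \frac{N}{2}\ln \sigma^2 - \frac{1}{2\sigma^2} \sum_{n=0}^{N-1} (x[n] - A)^2$
- The derivatives are easily found as
  $\frac{\partial^2 \ln p(\mathbf{x}; \boldsymbol{\theta})}{\partial A^2} = -\frac{N}{\sigma^2}$
  $\frac{\partial^2 \ln p(\mathbf{x}; \boldsymbol{\theta})}{\partial A\, \partial \sigma^2} = -\frac{1}{\sigma^4} \sum_{n=0}^{N-1} (x[n] - A)$
  $\frac{\partial^2 \ln p(\mathbf{x}; \boldsymbol{\theta})}{\partial (\sigma^2)^2} = \frac{N}{2\sigma^4} - \frac{1}{\sigma^6} \sum_{n=0}^{N-1} (x[n] - A)^2$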

3.7 Extension to a Vector Parameter

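- Taking the negative expectation, the Fisher information matrix becomes
  $\mathbf{I}(\boldsymbol{\theta}) = \begin{bmatrix} \frac{N}{\sigma^2} & 0 \\ 0 & \frac{N}{2\sigma^4} \end{bmatrix}$
- Although not true in general, for this example the Fisher information matrix is diagonal and hence easily inverted to yield
  $\mathrm{var}(\hat{A}) \ge \frac{\sigma^2}{N}$
  $\mathrm{var}(\hat{\sigma^2}) \ge \frac{2\sigma^4}{N}$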
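The vector CRLB for this example can be sketched in a few lines. The code assumes the Fisher information matrix for the parameter vector [A, sigma^2] is diag(N / sigma^2, N / (2 sigma^4)) and inverts it with the general 2x2 formula, so the same helper would also work for a non-diagonal matrix; the function names and numeric values are hypothetical.

```python
def fisher_matrix_dc_wgn(sigma2, N):
    """Assumed 2x2 Fisher information matrix for theta = [A, sigma^2]."""
    return [[N / sigma2, 0.0],
            [0.0, N / (2.0 * sigma2 ** 2)]]

def invert_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det],
            [-c / det, a / det]]

sigma2, N = 2.0, 10  # illustrative values
fisher = fisher_matrix_dc_wgn(sigma2, N)
inverse = invert_2x2(fisher)
print("CRLB for A       :", inverse[0][0])
print("CRLB for sigma^2 :", inverse[1][1])
```

The diagonal entries of the inverse should match sigma^2 / N and 2 sigma^4 / N, so the bound on A is the same as when sigma^2 is known.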