Overview
Reference:
Kay S M. Fundamentals of statistical signal processing[M]. Prentice Hall PTR, 1993. (Chapter 2)
Slides of ET4386, TUD
Content
- An Example
- Mean Square Error Criterion
- Minimum Variance Unbiased Estimator
- Existence of the Minimum Variance Unbiased Estimator
- Finding the Minimum Variance Unbiased Estimator
- Appendix: Some Useful Supplements
An Example
Consider a process, e.g., a constant in noise:
$$x[n]=A+w[n], \quad n=0,\ldots,N-1$$
where we assume
- $A$ is deterministic and unknown,
- $w[n]$ is a zero-mean random process with variance $\sigma^{2}$ whose samples are uncorrelated (the variance formulas used later rely on this),
- $x[n]$ is the measured data.
Potential estimators for $A$:
- $\hat{A}_{1}=x[0]$
- $\hat{A}_{2}=\frac{1}{N}\sum_{n=0}^{N-1}x[n]$
- $\hat{A}_{3}=\frac{a}{N}\sum_{n=0}^{N-1}x[n]$, for some constant $a$
- $\cdots$
Which estimator is good (or optimal)?
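To get a feel for the question, the three candidate estimators can be compared by simulation. The sketch below is only illustrative: it assumes Gaussian noise (the model itself only fixes the mean and variance), and the values of `A`, `sigma`, `N`, `a`, and `trials` are arbitrary choices, not taken from the reference.

```python
import random
import statistics


def simulate_estimators(A=1.0, sigma=1.0, N=20, a=0.5, trials=20000, seed=0):
    """Monte Carlo estimate of (mean, variance, MSE) for three estimators of A.

    Assumption for this sketch: w[n] is i.i.d. Gaussian with variance sigma^2.
    """
    rng = random.Random(seed)
    samples = {"A1": [], "A2": [], "A3": []}
    for _ in range(trials):
        x = [A + rng.gauss(0.0, sigma) for _ in range(N)]
        mean_x = sum(x) / N
        samples["A1"].append(x[0])        # first sample only
        samples["A2"].append(mean_x)      # sample mean
        samples["A3"].append(a * mean_x)  # scaled sample mean

    results = {}
    for name, est in samples.items():
        m = statistics.fmean(est)
        v = statistics.fmean((e - m) ** 2 for e in est)
        mse = statistics.fmean((e - A) ** 2 for e in est)
        results[name] = (m, v, mse)
    return results


if __name__ == "__main__":
    for name, (m, v, mse) in simulate_estimators().items():
        print(f"{name}: mean={m:.3f}  var={v:.4f}  mse={mse:.4f}")
```

With these settings the first two estimators should come out approximately unbiased, with the sample mean showing a much smaller variance, while the scaled mean trades a smaller variance for a bias.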
Mean Square Error Criterion
In searching for optimal estimators, we need to adopt some optimality criterion. A natural one is the mean square error (MSE), defined as
$$\mathrm{mse}(\hat\theta)=E\left[(\hat\theta-\theta)^2\right]$$
To get more insight, we can rewrite MSE as
$$\begin{aligned} \mathrm{mse}(\hat\theta)&=E\left[(\hat\theta-E(\hat\theta)+E(\hat\theta)-\theta)^2\right]\\ &=E\left[(\hat\theta-E(\hat\theta))^2\right]+\left[E(\hat\theta)-\theta\right]^2\\ &=\operatorname{var}(\hat\theta)+b^2(\theta) \end{aligned}$$
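where $b(\theta)=E(\hat\theta)-\theta$ is the bias, and the cross term vanishes because $E[\hat\theta-E(\hat\theta)]=0$. This shows that the MSE is composed of errors due to the variance of the estimator as well as the bias. Unfortunately, adoption of this natural criterion generally leads to unrealizable estimators, ones that cannot be written solely as a function of the data.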
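The decomposition into variance plus squared bias can be checked numerically. The following sketch uses the scaled sample mean from the example as a deliberately biased estimator; the Gaussian noise and all parameter values are assumptions made for illustration.

```python
import random
import statistics


def mse_decomposition(A=2.0, sigma=1.0, N=10, a=0.8, trials=20000, seed=1):
    """Return empirical (mse, variance, bias) of the estimator a * mean(x).

    Assumption for this sketch: Gaussian noise; the identity itself does not need it.
    """
    rng = random.Random(seed)
    est = []
    for _ in range(trials):
        x = [A + rng.gauss(0.0, sigma) for _ in range(N)]
        est.append(a * sum(x) / N)

    mean_est = statistics.fmean(est)
    variance = statistics.fmean((e - mean_est) ** 2 for e in est)
    bias = mean_est - A
    mse = statistics.fmean((e - A) ** 2 for e in est)
    return mse, variance, bias


if __name__ == "__main__":
    mse, variance, bias = mse_decomposition()
    print(f"mse = {mse:.5f}, var + bias^2 = {variance + bias ** 2:.5f}")
```

Because the same identity holds for empirical moments, the two printed numbers should agree up to floating-point rounding, and the bias should be close to $(a-1)A$.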
For instance, consider the estimator
$$\check A=a\frac{1}{N}\sum_{n=0}^{N-1}x[n]$$
for our example with some constant $a$. We will attempt to find the $a$ which results in the minimum MSE. Since $E(\check A)=aA$ and $\operatorname{var}(\check A)=a^{2}\sigma^{2}/N$, we have
$$\operatorname{mse}(\check{A})=\frac{a^{2}\sigma^{2}}{N}+(a-1)^{2}A^{2}$$
Differentiating the MSE with respect to $a$ yields
$$\frac{d\operatorname{mse}(\check{A})}{da}=\frac{2a\sigma^{2}}{N}+2(a-1)A^{2}$$
which upon setting to zero and solving yields the optimum value
$$a_{\mathrm{opt}}=\frac{A^{2}}{A^{2}+\sigma^{2}/N}$$
It is seen that the optimal value of $a$ depends upon the unknown parameter $A$. The estimator is therefore not realizable.
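A few lines of code make the dependence on $A$ concrete. This is a minimal sketch that simply evaluates the two closed-form expressions above; the function names and the test values of `A`, `sigma2`, and `N` are illustrative choices.

```python
def mse_scaled_mean(a, A, sigma2, N):
    """MSE of a * mean(x): a^2 * sigma^2 / N + (a - 1)^2 * A^2."""
    return a ** 2 * sigma2 / N + (a - 1.0) ** 2 * A ** 2


def a_opt(A, sigma2, N):
    """Scaling that minimizes the MSE; note that it needs the unknown A."""
    return A ** 2 / (A ** 2 + sigma2 / N)


if __name__ == "__main__":
    sigma2, N = 1.0, 10
    for A in (0.5, 1.0, 3.0):
        a = a_opt(A, sigma2, N)
        print(f"A={A}: a_opt={a:.4f}, "
              f"mse(a_opt)={mse_scaled_mean(a, A, sigma2, N):.4f}, "
              f"mse(a=1)={mse_scaled_mean(1.0, A, sigma2, N):.4f}")
```

Each value of `A` gives a different `a_opt`, so no single choice of `a` can be fixed in advance without knowing the quantity being estimated.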
From a practical viewpoint the minimum MSE estimator needs to be abandoned. An alternative approach is to constrain the bias to be zero and find the estimator which minimizes the variance. Such an estimator is termed the minimum variance unbiased (MVU) estimator.
Minimum Variance Unbiased Estimator
Constrain the bias to be zero, i.e., consider only estimators with $E(\hat{\theta})=\theta$. Then
$$\mathrm{mse}(\hat{\theta})=E\left[(\hat{\theta}-E(\hat{\theta}))^{2}\right]+(E(\hat{\theta})-\theta)^{2}=\operatorname{var}(\hat{\theta})$$
so for an unbiased estimator the MSE equals the variance. If $\hat{\theta}$ is unbiased and
$$\operatorname{var}(\hat{\theta})\leq\operatorname{var}(\tilde{\theta})$$
for every other unbiased estimator $\tilde{\theta}$ and for all $\theta$, then $\hat{\theta}$ is the minimum variance unbiased (MVU) estimator.
For the example, consider a more general estimator
$$\hat A=\sum_{n=0}^{N-1}a_n x[n]$$
To achieve unbiasedness, we should have
$$\sum_{n=0}^{N-1}a_n=1$$
The variance of $\hat A$ is
$$\operatorname{var}(\hat A)=\sum_{n=0}^{N-1}a_n^2\operatorname{var}(x[n])=\sigma^2\sum_{n=0}^{N-1}a_n^2$$
Use Lagrange multipliers with unbiasedness as the constraint equation. With $\mathbf a=[a_0,\ldots,a_{N-1}]^T$ and $\mathbf 1$ the all-ones vector, let
$$L(\mathbf a,\lambda)=\sigma^2\mathbf a^T\mathbf a-\lambda(\mathbf 1^T\mathbf a-1)$$
Differentiate $L$ with respect to $\mathbf a$ and set the result to zero:
$$2\sigma^2\mathbf a-\lambda\mathbf 1=\mathbf 0$$
Combining it with the constraint $\sum_{n=0}^{N-1}a_n=1$ (which gives $\lambda=2\sigma^2/N$), we obtain
$$\mathbf a=\frac{1}{N}\mathbf 1,$$
i.e.,
$$\hat A=\frac{1}{N}\sum_{n=0}^{N-1}x[n]$$
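The result of the Lagrangian argument can be sanity-checked numerically: among weight vectors that satisfy the unbiasedness constraint, none should give a smaller variance than the uniform weights $1/N$. The sketch below draws random feasible weights; the helper names and the sampling scheme are hypothetical choices made for this illustration.

```python
import random


def linear_estimator_variance(weights, sigma2=1.0):
    """var(A_hat) = sigma^2 * sum(a_n^2), valid for uncorrelated samples."""
    return sigma2 * sum(w * w for w in weights)


def random_unbiased_weights(N, rng):
    """Draw arbitrary weights, then shift them so that they sum to one."""
    raw = [rng.uniform(-1.0, 1.0) for _ in range(N)]
    shift = (1.0 - sum(raw)) / N
    return [r + shift for r in raw]


def uniform_weights(N):
    """The weights a_n = 1/N obtained from the Lagrangian solution."""
    return [1.0 / N] * N


if __name__ == "__main__":
    rng = random.Random(0)
    N = 8
    best = linear_estimator_variance(uniform_weights(N))
    smallest_random = min(
        linear_estimator_variance(random_unbiased_weights(N, rng))
        for _ in range(1000)
    )
    print(f"uniform weights: {best:.4f}, best of 1000 random draws: {smallest_random:.4f}")
```

This only compares linear unbiased estimators, which is the class considered in the derivation above.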
Existence of the Minimum Variance Unbiased Estimator
The question arises as to whether an MVU estimator exists, i.e., an unbiased estimator with minimum variance for all $\theta$.
In general, the MVU estimator does not always exist.
Another example: Given a single observation $x[0]$ from the distribution $\mathcal{U}[0,1/\theta]$, it is desired to estimate $\theta$. It is assumed that $\theta>0$. Write the estimator as $\hat\theta=g(x[0])$. Since the PDF of $x[0]$ equals $\theta$ on $[0,1/\theta]$, for an unbiased estimator we must have
$$\int_0^{1/\theta}\theta g(u)\,du=\theta\iff\int_0^{1/\theta}g(u)\,du=1$$
Assume that we can find a function $g(u)$ such that the condition above is satisfied for all $\theta>0$. Then for any $\theta_1>\theta_2>0$, we have
$$\int_0^{1/\theta_1}g(u)\,du=1,\quad\int_0^{1/\theta_2}g(u)\,du=1\Longrightarrow\int_{1/\theta_1}^{1/\theta_2}g(u)\,du=0$$
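Since $\theta_1$ and $\theta_2$ are arbitrary, the integral of $g$ over every interval must vanish, so we must have $g(u)=0$ for (almost) all $u$. But then $\int_0^{1/\theta}g(u)\,du=0\neq 1$, which produces a biased estimator. Hence no unbiased estimator exists for this problem, and therefore no MVU estimator either.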
Finding the Minimum Variance Unbiased Estimator
Even if an MVU estimator exists, we may not be able to find it. The following chapters of the reference discuss several possible approaches. They are:
- Determine the Cramer-Rao lower bound (CRLB) and check to see if some estimator satisfies it (Chapters 3 and 4).
- Apply the Rao-Blackwell-Lehmann-Scheffe (RBLS) theorem (Chapter 5).
- Further restrict the class of estimators to be not only unbiased but also linear. Then, find the minimum variance estimator within this restricted class (Chapter 6).
Appendix: Some Useful Supplements
That an estimator is unbiased does not necessarily mean that it is a good estimator. It only guarantees that on the average it will attain the true value. On the other hand, biased estimators are characterized by a systematic error, which presumably should not be present. A persistent bias will always result in a poor estimator.
It sometimes occurs that multiple estimates of the same parameter are available, i.e., $\{\hat{\theta}_1,\hat{\theta}_2,\cdots,\hat{\theta}_n\}$. A reasonable procedure is to combine these estimates into a better one by averaging them to form
$$\hat\theta=\frac{1}{n}\sum_{i=1}^n\hat{\theta}_i$$
Assuming the estimators are unbiased, with the same variance, and uncorrelated with each other,
$$E(\hat\theta)=\theta,\quad\operatorname{var}(\hat\theta)=\frac{1}{n^2}\sum_{i=1}^n\operatorname{var}(\hat{\theta}_i)=\frac{\operatorname{var}(\hat{\theta}_1)}{n}$$
so that as more estimates are averaged, the variance will decrease. Ultimately, as $n\to\infty$, $\hat\theta\to\theta$. However, if the estimators are biased, say $E(\hat\theta_i)=\theta+b(\theta)$, then $E(\hat\theta)=\theta+b(\theta)$ no matter how many estimators are averaged, and $\hat\theta$ will not converge to the true value, as is shown in the figure above.
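The effect of averaging can be illustrated with a short simulation. In this sketch each individual estimate is modeled as the true value plus an optional constant bias plus independent Gaussian error; that error model and the numbers used are assumptions made only for illustration.

```python
import random
import statistics


def averaged_estimate(n, theta=1.0, bias=0.0, sigma=1.0, rng=None):
    """Average n independent estimates, each with mean theta + bias.

    Assumption for this sketch: estimation errors are i.i.d. Gaussian.
    """
    rng = rng or random.Random(0)
    estimates = [theta + bias + rng.gauss(0.0, sigma) for _ in range(n)]
    return statistics.fmean(estimates)


if __name__ == "__main__":
    rng = random.Random(2)
    for n in (10, 1000, 100000):
        unbiased = averaged_estimate(n, bias=0.0, rng=rng)
        biased = averaged_estimate(n, bias=0.5, rng=rng)
        print(f"n={n:>6}: unbiased avg={unbiased:.3f}, biased avg={biased:.3f}")
```

As `n` grows, the unbiased average should settle near `theta`, while the biased average settles near `theta + bias` instead.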
The PDF of $\hat A=\frac{1}{N}\sum_{n=0}^{N-1}x[n]$ given in the example is $\mathcal{N}(A,\sigma^2/N)$ when the noise is Gaussian:
Assume that $w[n]\sim\mathcal{N}(0,\sigma^2)$; then $x[n]\sim\mathcal{N}(A,\sigma^2)$. Since the samples $x[n]$ are independent of each other, $\hat A$, a linear combination of independent Gaussian random variables, is also Gaussian. It is easy to verify that $E(\hat A)=A$ and $\operatorname{var}(\hat A)=\sigma^2/N$. Thus
$$\hat A\sim\mathcal{N}(A,\sigma^2/N)$$
The estimator can be proved to be consistent, i.e., $\hat A\to A$ as $N\to\infty$, by showing that
$$\lim_{N\to\infty}\Pr\{|\hat A-A|>\epsilon\}=0$$
for any $\epsilon>0$:
Since
$$\frac{\hat A-A}{\sqrt{\sigma^2/N}}\sim\mathcal{N}(0,1)$$
we have
$$\lim_{N\to\infty}\Pr\{|\hat A-A|>\epsilon\}=\lim_{N\to\infty}\Pr\left\{\left|\frac{\hat A-A}{\sqrt{\sigma^2/N}}\right|>\frac{\epsilon}{\sqrt{\sigma^2/N}}\right\}=0$$
because the threshold $\epsilon/\sqrt{\sigma^2/N}=\epsilon\sqrt{N}/\sigma$ grows without bound while the distribution of the normalized error stays fixed.
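Under the Gaussian assumption the probability in the limit has a closed form, $2\Phi(-\epsilon\sqrt{N}/\sigma)$, which can be evaluated directly to see how fast it decays. The sketch below uses the identity $2\Phi(-z)=\operatorname{erfc}(z/\sqrt{2})$; the values of `eps` and `sigma` are arbitrary.

```python
import math


def tail_probability(eps, sigma, N):
    """Pr{|A_hat - A| > eps} for A_hat ~ N(A, sigma^2 / N).

    Uses 2 * Phi(-z) = erfc(z / sqrt(2)) with z = eps * sqrt(N) / sigma.
    """
    z = eps * math.sqrt(N) / sigma
    return math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    for N in (10, 100, 1000, 10000):
        print(f"N={N:>5}: Pr = {tail_probability(0.1, 1.0, N):.3e}")
```

The printed probabilities should shrink monotonically toward zero as `N` increases, which is the statement of consistency.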
A probabilistic perspective of minimum variance:
Two unbiased estimators are proposed whose variances satisfy $\operatorname{var}(\hat\theta)<\operatorname{var}(\check\theta)$. If both estimators are Gaussian, prove that
$$\Pr\{|\hat\theta-\theta|>\epsilon\}<\Pr\{|\check\theta-\theta|>\epsilon\}$$
for any $\epsilon>0$. This says that the estimator with less variance is to be preferred since its PDF is more concentrated about the true value.
Since both estimators are unbiased and Gaussian,
$$\frac{\hat\theta-\theta}{\sqrt{\operatorname{var}(\hat\theta)}}\sim\mathcal{N}(0,1),\quad\frac{\check\theta-\theta}{\sqrt{\operatorname{var}(\check\theta)}}\sim\mathcal{N}(0,1)$$
Let the cumulative distribution function of $\mathcal{N}(0,1)$ be
$$\Phi(x)=\int_{-\infty}^{x}\frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}t^2}\,dt$$
Then
$$\Pr\{|\hat\theta-\theta|>\epsilon\}=\Pr\left\{\left|\frac{\hat\theta-\theta}{\sqrt{\operatorname{var}(\hat\theta)}}\right|>\frac{\epsilon}{\sqrt{\operatorname{var}(\hat\theta)}}\right\}=2\Phi\left(\frac{-\epsilon}{\sqrt{\operatorname{var}(\hat\theta)}}\right)$$
and likewise for $\check\theta$.
If $\operatorname{var}(\hat\theta)<\operatorname{var}(\check\theta)$, then $-\epsilon/\sqrt{\operatorname{var}(\hat\theta)}<-\epsilon/\sqrt{\operatorname{var}(\check\theta)}$, and because $\Phi$ is strictly increasing,
$$\Phi\left(\frac{-\epsilon}{\sqrt{\operatorname{var}(\hat\theta)}}\right)<\Phi\left(\frac{-\epsilon}{\sqrt{\operatorname{var}(\check\theta)}}\right)$$
or
$$\Pr\{|\hat\theta-\theta|>\epsilon\}<\Pr\{|\check\theta-\theta|>\epsilon\}.$$
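The inequality can also be checked numerically for a concrete pair of variances. This sketch evaluates the closed form $2\Phi(-\epsilon/\sqrt{\operatorname{var}})$ with `statistics.NormalDist` and compares it against a Monte Carlo estimate; the variances 0.1 and 0.4 and the threshold 0.5 are arbitrary illustrative values.

```python
import random
from statistics import NormalDist


def gaussian_tail(eps, variance):
    """Closed form 2 * Phi(-eps / sqrt(variance)) for an unbiased Gaussian estimator."""
    return 2.0 * NormalDist().cdf(-eps / variance ** 0.5)


def empirical_tail(eps, variance, theta=0.0, trials=50000, seed=3):
    """Fraction of simulated estimates that miss theta by more than eps."""
    rng = random.Random(seed)
    std = variance ** 0.5
    hits = sum(abs(rng.gauss(theta, std) - theta) > eps for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    eps = 0.5
    for variance in (0.1, 0.4):
        print(f"var={variance}: closed form={gaussian_tail(eps, variance):.4f}, "
              f"simulated={empirical_tail(eps, variance):.4f}")
```

The smaller variance should give the smaller probability of a large error in both columns.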
What will happen if an unbiased estimator undergoes a nonlinear transformation? For instance, if we choose to estimate the unknown parameter $\theta=A^2$ by
$$\hat\theta=\left(\frac{1}{N}\sum_{n=0}^{N-1}x[n]\right)^2,$$
can we say that the estimator is unbiased? What happens as $N\to\infty$?
We know that
$$\hat\theta=\hat{A}^2,\quad\hat A\sim\mathcal{N}(A,\sigma^2/N)$$
Therefore,
$$E(\hat\theta)=E(\hat{A}^2)=\operatorname{var}(\hat A)+E^2(\hat A)=\sigma^2/N+A^2=\sigma^2/N+\theta\neq\theta$$
so $\hat\theta$ is biased, with bias $\sigma^2/N$, but asymptotically unbiased because the bias vanishes as $N\to\infty$.
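The bias $\sigma^2/N$ of the squared sample mean is easy to observe in simulation. The sketch below assumes Gaussian noise and uses arbitrary values for `A`, `sigma`, and the number of trials.

```python
import random
import statistics


def squared_mean_bias(A=1.0, sigma=1.0, N=5, trials=40000, seed=4):
    """Empirical bias of (sample mean)^2 as an estimator of theta = A^2.

    Assumption for this sketch: i.i.d. Gaussian noise with variance sigma^2.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        mean_x = sum(A + rng.gauss(0.0, sigma) for _ in range(N)) / N
        estimates.append(mean_x ** 2)
    return statistics.fmean(estimates) - A ** 2


if __name__ == "__main__":
    for N in (5, 50):
        print(f"N={N:>2}: empirical bias={squared_mean_bias(N=N):.4f}, "
              f"sigma^2/N={1.0 / N:.4f}")
```

The empirical bias should track $\sigma^2/N$ and shrink as `N` grows, consistent with the estimator being asymptotically unbiased.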
In our example, if the value of $\sigma^2$ is also unknown, an unbiased estimator is
$$\hat{\boldsymbol{\theta}}=\left[\begin{matrix}\hat A\\ \hat{\sigma}^2\end{matrix}\right]=\left[\begin{matrix}\frac{1}{N}\sum_{n=0}^{N-1}x[n]\\ \frac{1}{N-1}\sum_{n=0}^{N-1}(x[n]-\hat A)^2\end{matrix}\right]$$
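The role of the $1/(N-1)$ factor can be seen by comparing it with the $1/N$ normalization in a simulation. This is a sketch under the assumption of Gaussian noise, with illustrative parameter values; the function name is hypothetical.

```python
import random
import statistics


def average_variance_estimates(sigma=2.0, A=1.0, N=5, trials=40000, seed=5):
    """Average the 1/(N-1) and 1/N variance estimators over many data records."""
    rng = random.Random(seed)
    unbiased, biased = [], []
    for _ in range(trials):
        x = [A + rng.gauss(0.0, sigma) for _ in range(N)]
        a_hat = sum(x) / N                              # sample mean
        sum_sq = sum((xi - a_hat) ** 2 for xi in x)     # squared deviations
        unbiased.append(sum_sq / (N - 1))
        biased.append(sum_sq / N)
    return statistics.fmean(unbiased), statistics.fmean(biased)


if __name__ == "__main__":
    u, b = average_variance_estimates()
    print(f"true sigma^2 = 4.0, 1/(N-1) average = {u:.3f}, 1/N average = {b:.3f}")
```

The $1/(N-1)$ version should average out close to the true $\sigma^2$, while the $1/N$ version should fall short by roughly the factor $(N-1)/N$.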