Delta Method
The delta method is a result that gives the approximate probability distribution of a function of an asymptotically normal estimator, given knowledge of the limiting variance of that estimator.
Method
While the delta method generalizes easily to a multivariate setting, the technique is most easily motivated in univariate terms. Roughly, if there is a sequence of random variables [math]X_n[/math] satisfying
- [[math]]\sqrt{n}[X_n-\theta]\,\xrightarrow{D}\,\mathcal{N}(0,\sigma^2),[[/math]]
where [math]\theta[/math] and [math]\sigma^2[/math] are finite-valued constants and [math]\xrightarrow{D}[/math] denotes convergence in distribution, then
- [[math]]\sqrt{n}[g(X_n)-g(\theta)]\,\xrightarrow{D}\,\mathcal{N}(0,\sigma^2[g'(\theta)]^2)[[/math]]
for any function [math]g[/math] such that [math]g'(\theta)[/math] exists and is non-zero.
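The univariate statement can be checked numerically. The following is a minimal sketch (not from the text, and assuming NumPy is available) that takes [math]X_n[/math] to be the sample mean of [math]n[/math] iid Exp(1) draws, so [math]\theta = 1[/math], [math]\sigma^2 = 1[/math], and uses [math]g(x) = x^2[/math], for which [math]g'(\theta) = 2[/math]:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: X_n is the sample mean of n iid Exp(1) draws, so
# theta = 1, sigma^2 = 1 and sqrt(n)(X_n - theta) -> N(0, 1) by the CLT.
n, reps = 1_000, 10_000
theta, sigma2 = 1.0, 1.0
x_bar = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)

def g(x):
    return x ** 2          # g'(theta) = 2 * theta = 2, nonzero at theta = 1

# Empirical variance of sqrt(n)[g(X_n) - g(theta)] across replications
empirical = np.var(np.sqrt(n) * (g(x_bar) - g(theta)))

# Delta-method prediction: sigma^2 * [g'(theta)]^2 = 1 * 4 = 4
predicted = sigma2 * (2 * theta) ** 2
print(empirical, predicted)
```

For large [math]n[/math] the empirical variance should be close to the delta-method prediction of 4.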
The method extends to the multivariate case. By definition, a consistent estimator [math]B[/math] converges in probability to its true value [math]\beta[/math], and often a central limit theorem can be applied to obtain asymptotic normality:
- [[math]]\sqrt{n}\left(B-\beta\right)\,\xrightarrow{D}\,\mathcal{N}(0,\Sigma),[[/math]]
where [math]n[/math] is the number of observations and [math]\Sigma[/math] is a covariance matrix. The multivariate delta method yields the following asymptotic property of a function [math]h[/math] of the estimator [math]B[/math] under the assumption that the gradient [math]\nabla h[/math] is non-zero:
- [[math]]\sqrt{n}\left(h(B)-h(\beta)\right)\,\xrightarrow{D}\,\mathcal{N}\left(0,\nabla h(\beta)^T \cdot \Sigma \cdot \nabla h(\beta)\right).[[/math]]
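As a concrete illustration of the quadratic form [math]\nabla h(\beta)^T \Sigma \nabla h(\beta)[/math], here is a small sketch (with hypothetical values of [math]\beta[/math] and [math]\Sigma[/math], and assuming NumPy) for the ratio of two means, [math]h(b_1, b_2) = b_1/b_2[/math]:

```python
import numpy as np

# Hypothetical setup: B = (mean of X, mean of Y), h(b1, b2) = b1 / b2.
beta = np.array([2.0, 4.0])            # illustrative true values
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])         # asymptotic covariance of sqrt(n)(B - beta)

def grad_h(b):
    b1, b2 = b
    return np.array([1.0 / b2, -b1 / b2 ** 2])   # gradient of b1 / b2

# Asymptotic variance of sqrt(n)[h(B) - h(beta)]: grad^T . Sigma . grad
grad = grad_h(beta)
avar = grad @ Sigma @ grad
print(avar)
```

With these values the gradient is [math](0.25, -0.125)[/math] and the asymptotic variance works out to 0.075.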
We start with the univariate case. Demonstration of this result is fairly straightforward under the assumption that [math]g'(\theta)[/math] is continuous. To begin, we use the mean value theorem:
- [[math]]g(X_n)=g(\theta)+g'(\tilde{\theta})(X_n-\theta),[[/math]]
where [math]\tilde{\theta}[/math] lies between [math]X_n[/math] and [math]\theta[/math]. Note that since [math]X_n\,\xrightarrow{P}\,\theta[/math] and [math]\tilde{\theta}[/math] lies between [math]X_n[/math] and [math]\theta[/math], it must be that [math]\tilde{\theta}\,\xrightarrow{P}\,\theta[/math], and since [math]g'(\theta)[/math] is continuous, applying the continuous mapping theorem yields
- [[math]]g'(\tilde{\theta})\,\xrightarrow{P}\,g'(\theta),[[/math]]
where [math]\xrightarrow{P}[/math] denotes convergence in probability. Rearranging the terms and multiplying by [math]\sqrt{n}[/math] gives
- [[math]]\sqrt{n}[g(X_n)-g(\theta)]=g'(\tilde{\theta})\sqrt{n}[X_n-\theta].[[/math]]
Since
- [[math]]{\sqrt{n}[X_n-\theta] \xrightarrow{D} \mathcal{N}(0,\sigma^2)}[[/math]]
by assumption, it follows immediately from appeal to Slutsky's Theorem that
- [[math]]{\sqrt{n}[g(X_n)-g(\theta)] \xrightarrow{D} \mathcal{N}(0,\sigma^2[g'(\theta)]^2)}.[[/math]]
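The conclusion of the univariate proof can be illustrated with a worked example of our own (not from the text, assuming NumPy): let [math]X_n = \hat{p}[/math] be the sample proportion of [math]n[/math] Bernoulli([math]p[/math]) trials, so [math]\sqrt{n}(\hat{p}-p)\xrightarrow{D}\mathcal{N}(0,p(1-p))[/math], and take [math]g[/math] to be the log-odds, [math]g(p)=\log(p/(1-p))[/math], with [math]g'(p)=1/(p(1-p))[/math]:

```python
import numpy as np

rng = np.random.default_rng(1)

# X_n = p_hat, the sample proportion of n Bernoulli(p) trials.
# Delta method: sqrt(n)[g(p_hat) - g(p)] -> N(0, sigma^2 [g'(p)]^2)
# with sigma^2 = p(1-p) and g'(p) = 1/(p(1-p)), giving variance 1/(p(1-p)).
n, p, reps = 2_000, 0.3, 20_000
p_hat = rng.binomial(n, p, size=reps) / n

logit = lambda q: np.log(q / (1 - q))
empirical = np.var(np.sqrt(n) * (logit(p_hat) - logit(p)))
predicted = 1 / (p * (1 - p))
print(empirical, predicted)
```

Here the two variance terms combine into [math]1/(p(1-p))[/math], and the simulated variance should match this prediction closely.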
Now we proceed to the multivariate case. Keeping only the first two terms of the Taylor series, and using vector notation for the gradient, we can estimate [math]h(B)[/math] as
- [[math]]h(B)\approx h(\beta)+\nabla h(\beta)^T \cdot (B-\beta),[[/math]]
which implies the variance of [math]h(B)[/math] is approximately
- [[math]]\operatorname{Var}\left(h(B)\right)\approx \nabla h(\beta)^T \cdot \frac{\Sigma}{n} \cdot \nabla h(\beta).[[/math]]
One can use the mean value theorem (for real-valued functions of many variables) to see that this does not rely on taking a first-order approximation.
The delta method therefore implies that
- [[math]]\sqrt{n}\left(h(B)-h(\beta)\right)\,\xrightarrow{D}\,\mathcal{N}\left(0,\nabla h(\beta)^T \cdot \Sigma \cdot \nabla h(\beta)\right),[[/math]]
or in univariate terms,
- [[math]]\sqrt{n}\left(h(B)-h(\beta)\right)\,\xrightarrow{D}\,\mathcal{N}\left(0,\sigma^2 \cdot \left[h'(\beta)\right]^2\right).[[/math]]
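The multivariate conclusion can also be checked by simulation. The sketch below (our own hypothetical example, assuming NumPy) uses [math]h(b) = b_1 b_2[/math], for which [math]\nabla h(\beta) = (\beta_2, \beta_1)[/math], and draws [math]B[/math] directly from its sampling distribution [math]\mathcal{N}(\beta, \Sigma/n)[/math], which is exact when the underlying data are normal:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical check with h(b) = b1 * b2, so grad h(beta) = (beta2, beta1).
n, reps = 2_000, 20_000
beta = np.array([1.0, 2.0])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])

# Draw B from N(beta, Sigma / n): exact sampling distribution of a mean
# of n iid N(beta, Sigma) observations.
B = rng.multivariate_normal(beta, Sigma / n, size=reps)

h = lambda b: b[..., 0] * b[..., 1]
grad = np.array([beta[1], beta[0]])          # gradient of b1 * b2 at beta

empirical = np.var(np.sqrt(n) * (h(B) - h(beta)))
predicted = grad @ Sigma @ grad              # quadratic form grad^T Sigma grad
print(empirical, predicted)
```

With these values [math]\nabla h(\beta)^T \Sigma \nabla h(\beta) = 8[/math], and the simulated variance of [math]\sqrt{n}(h(B)-h(\beta))[/math] should be close to that.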
References
- Wikipedia contributors. "Delta method". Wikipedia. Retrieved 30 May 2019.