
Computational Challenges

[math] \newcommand{\edis}{\stackrel{d}{=}} \newcommand{\fd}{\stackrel{f.d.}{\rightarrow}} \newcommand{\dom}{\operatorname{dom}} \newcommand{\eig}{\operatorname{eig}} \newcommand{\epi}{\operatorname{epi}} \newcommand{\lev}{\operatorname{lev}} \newcommand{\card}{\operatorname{card}} \newcommand{\comment}{\textcolor{Green}} \newcommand{\B}{\mathbb{B}} \newcommand{\C}{\mathbb{C}} \newcommand{\G}{\mathbb{G}} \newcommand{\M}{\mathbb{M}} \newcommand{\N}{\mathbb{N}} \newcommand{\Q}{\mathbb{Q}} \newcommand{\T}{\mathbb{T}} \newcommand{\R}{\mathbb{R}} \newcommand{\E}{\mathbb{E}} \newcommand{\W}{\mathbb{W}} \newcommand{\bU}{\mathfrak{U}} \newcommand{\bu}{\mathfrak{u}} \newcommand{\bI}{\mathfrak{I}} \newcommand{\cA}{\mathcal{A}} \newcommand{\cB}{\mathcal{B}} \newcommand{\cC}{\mathcal{C}} \newcommand{\cD}{\mathcal{D}} \newcommand{\cE}{\mathcal{E}} \newcommand{\cF}{\mathcal{F}} \newcommand{\cG}{\mathcal{G}} \newcommand{\cg}{\mathcal{g}} \newcommand{\cH}{\mathcal{H}} \newcommand{\cI}{\mathcal{I}} \newcommand{\cJ}{\mathcal{J}} \newcommand{\cK}{\mathcal{K}} \newcommand{\cL}{\mathcal{L}} \newcommand{\cM}{\mathcal{M}} \newcommand{\cN}{\mathcal{N}} \newcommand{\cO}{\mathcal{O}} \newcommand{\cP}{\mathcal{P}} \newcommand{\cQ}{\mathcal{Q}} \newcommand{\cR}{\mathcal{R}} \newcommand{\cS}{\mathcal{S}} \newcommand{\cT}{\mathcal{T}} \newcommand{\cU}{\mathcal{U}} \newcommand{\cu}{\mathcal{u}} \newcommand{\cV}{\mathcal{V}} \newcommand{\cW}{\mathcal{W}} \newcommand{\cX}{\mathcal{X}} \newcommand{\cY}{\mathcal{Y}} \newcommand{\cZ}{\mathcal{Z}} \newcommand{\sF}{\mathsf{F}} \newcommand{\sM}{\mathsf{M}} \newcommand{\sG}{\mathsf{G}} \newcommand{\sT}{\mathsf{T}} \newcommand{\sB}{\mathsf{B}} \newcommand{\sC}{\mathsf{C}} \newcommand{\sP}{\mathsf{P}} \newcommand{\sQ}{\mathsf{Q}} \newcommand{\sq}{\mathsf{q}} \newcommand{\sR}{\mathsf{R}} \newcommand{\sS}{\mathsf{S}} \newcommand{\sd}{\mathsf{d}} \newcommand{\cp}{\mathsf{p}} \newcommand{\cc}{\mathsf{c}} \newcommand{\cf}{\mathsf{f}} \newcommand{\eU}{{\boldsymbol{U}}} \newcommand{\eb}{{\boldsymbol{b}}} \newcommand{\ed}{{\boldsymbol{d}}} \newcommand{\eu}{{\boldsymbol{u}}} \newcommand{\ew}{{\boldsymbol{w}}} \newcommand{\ep}{{\boldsymbol{p}}} \newcommand{\eX}{{\boldsymbol{X}}} \newcommand{\ex}{{\boldsymbol{x}}} \newcommand{\eY}{{\boldsymbol{Y}}} \newcommand{\eB}{{\boldsymbol{B}}} \newcommand{\eC}{{\boldsymbol{C}}} \newcommand{\eD}{{\boldsymbol{D}}} \newcommand{\eW}{{\boldsymbol{W}}} \newcommand{\eR}{{\boldsymbol{R}}} \newcommand{\eQ}{{\boldsymbol{Q}}} \newcommand{\eS}{{\boldsymbol{S}}} \newcommand{\eT}{{\boldsymbol{T}}} \newcommand{\eA}{{\boldsymbol{A}}} \newcommand{\eH}{{\boldsymbol{H}}} \newcommand{\ea}{{\boldsymbol{a}}} \newcommand{\ey}{{\boldsymbol{y}}} \newcommand{\eZ}{{\boldsymbol{Z}}} \newcommand{\eG}{{\boldsymbol{G}}} \newcommand{\ez}{{\boldsymbol{z}}} \newcommand{\es}{{\boldsymbol{s}}} \newcommand{\et}{{\boldsymbol{t}}} \newcommand{\ev}{{\boldsymbol{v}}} \newcommand{\ee}{{\boldsymbol{e}}} \newcommand{\eq}{{\boldsymbol{q}}} \newcommand{\bnu}{{\boldsymbol{\nu}}} \newcommand{\barX}{\overline{\eX}} \newcommand{\eps}{\varepsilon} \newcommand{\Eps}{\mathcal{E}} \newcommand{\carrier}{{\mathfrak{X}}} \newcommand{\Ball}{{\mathbb{B}}^{d}} \newcommand{\Sphere}{{\mathbb{S}}^{d-1}} \newcommand{\salg}{\mathfrak{F}} \newcommand{\ssalg}{\mathfrak{B}} \newcommand{\one}{\mathbf{1}} \newcommand{\Prob}[1]{\P\{#1\}} \newcommand{\yL}{\ey_{\mathrm{L}}} \newcommand{\yU}{\ey_{\mathrm{U}}} \newcommand{\yLi}{\ey_{\mathrm{L}i}} \newcommand{\yUi}{\ey_{\mathrm{U}i}} \newcommand{\xL}{\ex_{\mathrm{L}}} 
\newcommand{\xU}{\ex_{\mathrm{U}}} \newcommand{\vL}{\ev_{\mathrm{L}}} \newcommand{\vU}{\ev_{\mathrm{U}}} \newcommand{\dist}{\mathbf{d}} \newcommand{\rhoH}{\dist_{\mathrm{H}}} \newcommand{\ti}{\to\infty} \newcommand{\comp}[1]{#1^\mathrm{c}} \newcommand{\ThetaI}{\Theta_{\mathrm{I}}} \newcommand{\crit}{q} \newcommand{\CS}{CS_n} \newcommand{\CI}{CI_n} \newcommand{\cv}[1]{\hat{c}_{n,1-\alpha}(#1)} \newcommand{\idr}[1]{\mathcal{H}_\sP[#1]} \newcommand{\outr}[1]{\mathcal{O}_\sP[#1]} \newcommand{\idrn}[1]{\hat{\mathcal{H}}_{\sP_n}[#1]} \newcommand{\outrn}[1]{\mathcal{O}_{\sP_n}[#1]} \newcommand{\email}[1]{\texttt{#1}} \newcommand{\possessivecite}[1]{\ltref name="#1"\gt\lt/ref\gt's \citeyear{#1}} \newcommand\xqed[1]{% \leavevmode\unskip\penalty9999 \hbox{}\nobreak\hfill \quad\hbox{#1}} \newcommand\qedex{\xqed{$\triangle$}} \newcommand\independent{\protect\mathpalette{\protect\independenT}{\perp}} \DeclareMathOperator{\Int}{Int} \DeclareMathOperator{\conv}{conv} \DeclareMathOperator{\cov}{Cov} \DeclareMathOperator{\var}{Var} \DeclareMathOperator{\Sel}{Sel} \DeclareMathOperator{\Bel}{Bel} \DeclareMathOperator{\cl}{cl} \DeclareMathOperator{\sgn}{sgn} \DeclareMathOperator{\essinf}{essinf} \DeclareMathOperator{\esssup}{esssup} \newcommand{\mathds}{\mathbb} \renewcommand{\P}{\mathbb{P}} [/math]

As a rule of thumb, the difficulty in computing estimators of identification regions and confidence sets depends on whether a closed form expression is available for the boundary of the set. For example, often nonparametric bounds on functionals of a partially identified distribution are known functionals of observed conditional distributions, as in Section. Then "plug-in" estimation is possible, and the computational cost is the same as for estimation and construction of confidence intervals (or confidence bands) for point-identified nonparametric regressions (incurred twice, once for the lower bound and once for the upper bound). Similarly, support function based inference is easy to implement when [math]\idr{\theta}[/math] is convex. Sometimes the extreme points of [math]\idr{\theta}[/math] can be expressed as known functionals of observed distributions. Even if they cannot, level sets of convex functions are easy to compute. But as shown in Section, many problems of interest yield a set [math]\idr{\theta}[/math] that is ''not'' convex. In this case, [math]\idr{\theta}[/math] is obtained as a level set of a criterion function. Because [math]\idr{\theta}[/math] (or its associated confidence set) is often a subset of [math]\R^d[/math] (rather than [math]\R[/math]), even a moderate value of [math]d[/math], e.g., 8 or 10, can lead to extremely challenging computational problems. This is because if one wants to compute [math]\idr{\theta}[/math], or a set that covers it or its elements with a prespecified asymptotic probability (possibly uniformly over [math]\sP\in\cP[/math]), one has to map out a level set in [math]\R^d[/math]. If one is instead interested in confidence intervals for scalar projections or other smooth functions of [math]\vartheta\in\idr{\theta}[/math], one needs to solve complex nonlinear optimization problems, as for example in eq:CI:BCS and eq:KMS:proj. This can be difficult to do, especially because [math]c_{1-\alpha}(\vartheta)[/math] is typically an unknown function of [math]\vartheta[/math] whose gradients are not available in closed form.

Mirroring the fact that computation is easier when the boundary of [math]\idr{\theta}[/math] is a known function of observed conditional distributions, several portable software packages are available to carry out estimation and inference in this case. For example, [1] provide STATA and MatLab packages implementing the methods proposed by [2][3][4][5][6], [7][8], and [9]. [10] provides a STATA package to implement the bounds proposed by [11]. [12] provide a STATA package to implement bounds on treatment effects with endogenous and misreported treatment assignment and under the assumptions of monotone treatment selection, monotone treatment response, and monotone instrumental variables as in [6], [9], [13], [14], and [15]. The code computes the confidence intervals proposed by [16]. In the more general context of inference for a one-dimensional parameter defined by intersection bounds, as for example the one in eq:intersection:bounds, [17] and [18] provide portable STATA code implementing, respectively, methods to test hypotheses and build confidence intervals in [19] and in [20]. [21] provide portable STATA code implementing [22]'s method for estimation and inference for best linear prediction with interval outcome data as in Identification Problem. [23] provide R code implementing [24]'s method for estimation and inference for best linear approximations of set identified functions.
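To illustrate how cheap the "plug-in" case is, the following minimal sketch (in Python) computes worst-case bounds on [math]\E[y][/math] when the outcome is missing for part of the sample and known to lie in [math][0,1][/math], together with a naive percentile band obtained by bootstrapping each endpoint. The simulated data, the missingness rate, and the bootstrap scheme are illustrative assumptions; a properly calibrated confidence interval for the parameter (as in [16]) requires accounting for the joint behavior of the two estimated endpoints.

```python
import numpy as np

def worst_case_bounds(y, d, y_lo=0.0, y_hi=1.0):
    """Plug-in worst-case bounds on E[y] when y is observed only if
    d == 1 and y is known to lie in [y_lo, y_hi]."""
    p_obs = d.mean()                         # P(d = 1)
    m_obs = y[d == 1].mean()                 # E[y | d = 1]
    return (m_obs * p_obs + y_lo * (1.0 - p_obs),
            m_obs * p_obs + y_hi * (1.0 - p_obs))

rng = np.random.default_rng(0)
n = 5_000
y = rng.uniform(size=n)                      # outcome in [0, 1]
d = rng.binomial(1, 0.8, size=n)             # d = 1 iff y is observed

lo, hi = worst_case_bounds(y, d)

# Bootstrap each endpoint: the cost is that of two point-identified
# nonparametric estimation problems, the "incurred twice" cost above.
idx = rng.integers(0, n, size=(999, n))
boot = np.array([worst_case_bounds(y[i], d[i]) for i in idx])
lo_ci, hi_ci = np.percentile(boot[:, 0], 2.5), np.percentile(boot[:, 1], 97.5)
print(f"estimated bounds [{lo:.3f}, {hi:.3f}], "
      f"naive 95% band [{lo_ci:.3f}, {hi_ci:.3f}]")
```

Each endpoint here is a known functional of observed distributions, which is exactly what makes the closed-form case computationally benign.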
On the other hand, there is a paucity of portable software implementing the theoretical methods for inference in structural partially identified models discussed in Section. [25] compute [26] confidence sets for a parameter vector in [math]\R^d[/math] in an entry game with six players, with [math]d[/math] in the order of [math]20[/math] and with tens of thousands of inequalities, through a "guess and verify" algorithm based on simulated annealing (with no cooling) that visits many candidate values [math]\vartheta\in\Theta[/math], evaluates [math]\crit_n(\vartheta)[/math], and builds [math]\CS[/math] by retaining the visited values [math]\vartheta[/math] that satisfy [math]n\crit_n(\vartheta)\le c_{1-\alpha}(\vartheta)[/math], with [math]c_{1-\alpha}[/math] defined to satisfy eq:CS_coverage:point:pw (a schematic sketch of this loop is given below). Given the computational resources commonly available at this point in time, this is a tremendously hard task, due to the dimension of [math]\theta[/math] and the number of moment inequalities employed. As explained in Section An Inference Approach Robust to the Presence of Multiple Equilibria, these inequalities, which in a game of entry with [math]J[/math] players and discrete observable payoff shifters number [math]2^J|\cX|[/math] (with [math]\cX[/math] the support of the observable payoff shifters), yield an outer region [math]\outr{\theta}[/math]. It is natural to wonder what additional challenges one faces in computing [math]\idr{\theta}[/math] as described in Section Characterization of Sharpness through Random Set Theory. A definitive answer to this question is hard to obtain. If one employs ''all'' inequalities listed in Theorem, the number of inequalities jumps to [math](2^{2^J}-2)|\cX|[/math], increasing the computational cost. However, as suggested by [27] and extended by other authors (e.g., [28][29][30][31]), often many moment inequalities are redundant, substantially reducing the number of inequalities to be checked. Specifically, [27] propose the notion of core determining sets: a collection of compact sets such that if the inequality in Theorem holds for these sets, it holds for all sets in [math]\cK[/math]; see Definition and the surrounding discussion in Appendix. This often yields a number of restrictions similar to the one incurred to obtain outer regions. For example, [28](Section 4.2) analyze a four player, two type entry game with pure strategy Nash equilibrium as solution concept, originally proposed by [32], and show that while a direct application of Theorem entails [math]512|\cX|[/math] inequality restrictions, [math]26|\cX|[/math] suffice. In this example, [25]'s outer region is based on checking [math]18|\cX|[/math] inequalities.

A related but separate question is how to best allocate the computational effort. As one moves from partial identification analysis to finite sample considerations, one may face a trade-off between sharpness of the identification region and statistical efficiency. This is because inequalities that are redundant from the perspective of identification analysis might nonetheless be estimated with high precision, and hence improve the finite sample statistical properties of a confidence set or of a test of hypothesis.
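The sketch below (in Python) mimics the guess-and-verify loop on a toy two-dimensional moment inequality model. The two inequalities, the fixed critical value standing in for [math]c_{1-\alpha}(\vartheta)[/math], and the acceptance rule of the annealing chain are all illustrative assumptions, not [25]'s actual implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy moment-inequality model (illustrative): E[theta_j - w_j] <= 0,
# j = 1, 2, so the identified set is {theta : theta <= (E[w_1], E[w_2])}.
w = rng.normal(loc=[1.0, 0.5], scale=1.0, size=(2_000, 2))
n = w.shape[0]

def n_times_Qn(theta):
    """n * Q_n(theta): sum of squared positive parts of the root-n
    standardized sample moment violations (the scaling by n used in
    the test n * Q_n(theta) <= c is built into the t-statistics)."""
    t_stats = np.sqrt(n) * (theta - w.mean(axis=0)) / w.std(axis=0, ddof=1)
    return np.sum(np.maximum(t_stats, 0.0) ** 2)

c_crit = 5.0            # placeholder for c_{1-alpha}(theta)
theta = np.zeros(2)     # starting guess
retained = []

# Simulated annealing with no cooling: propose local moves, always
# accept downhill moves and sometimes uphill ones, and retain every
# visited value that passes the criterion test.
for _ in range(20_000):
    prop = np.clip(theta + rng.normal(scale=0.1, size=2), -2.0, 2.0)
    if n_times_Qn(prop) <= n_times_Qn(theta) or rng.uniform() < 0.1:
        theta = prop
    if n_times_Qn(theta) <= c_crit:
        retained.append(theta.copy())

CS = np.unique(np.array(retained), axis=0)   # visited points kept in CS_n
print("componentwise range of CS:", CS.min(axis=0), CS.max(axis=0))
```

Even in this two-dimensional toy, the chain needs tens of thousands of criterion evaluations to trace the level set; with [math]d\approx 20[/math] and tens of thousands of inequalities, the cost of the same strategy grows dramatically.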
Recent contributions by [33], [34] and [35] provide methods to build confidence sets, respectively, with a continuum of conditional moment inequalities, and with a number of moment inequalities that may exceed sample size. These contributions, however, do not yet answer the question of how to optimally select inequalities to yield confidence sets with the best finite sample properties, according to some specified notion of "best". A different approach, proposed by [36], directly uses a quasi-likelihood criterion function. In the context, e.g., of entry games, this entails assuming that the selection mechanism depends only on observable payoff shifters, using it to obtain the exact model implied distribution as in eq:games_model:pred, and partially identifying an enlarged parameter vector that includes [math]\theta[/math] and the selection mechanism. In an empirical application with discrete covariates, [36] apply their method to a two player entry game with correlated errors, where [math]\theta\in\R^9[/math] and the selection mechanism is a vector in [math]\R^8[/math], for a total of 17 parameters. Another empirical application, to the analysis of trade flows, includes 46 parameters.

In terms of general purpose portable code that can be employed in moment inequality models, I am only aware of the MatLab package provided by [37] to implement the inference method of [38] for projections and smooth functions of parameter vectors in models defined by a finite number of unconditional moment (in)equalities. More broadly, their method can be used to compute confidence intervals for optimal values of optimization problems with estimated constraints. Here I summarize their approach to further highlight why the computational task is challenging even in the case of projections. The confidence interval in eq:def:CI-eq:KMS:proj requires solving two nonlinear programs, each with a linear objective and nonlinear constraints involving a critical value which in general is an unknown function of [math]\vartheta[/math], with unknown gradient. When the dimension of the parameter vector is large, directly solving optimization problems with such constraints can be expensive even if evaluating the critical value at each [math]\vartheta[/math] is cheap ([38] propose a linearization method whereby [math]c_{1-\alpha}[/math] is calibrated through repeatedly solving bootstrap linear programs, hence it is reasonably cheap to compute). Hence, [38] propose an algorithm (called E-A-M, for Evaluation-Approximation-Maximization) to solve these nonlinear programs, which belongs to the family of ''expected improvement'' algorithms (see, e.g., [39][40][41] and references therein). Given a constrained optimization problem of the form

[[math]] \begin{align*} \max_{\vartheta \in \Theta}u^\top\vartheta~\text{s.t. }g_j(\vartheta)\le c(\vartheta),~j=1,\dots,J, \end{align*} [[/math]]

to which eq:KMS:proj belongs,[Notes 1] the algorithm attempts to solve it by cycling over three steps:

  • The true critical level function [math]c[/math] is evaluated at an initial set of points [math]\vartheta^1,\dots,\vartheta^k[/math], drawn uniformly at random from [math]\Theta[/math]. These values are used to compute a current guess for the optimal value, [math]u^\top\vartheta^{*,k}=\max\{u^\top\vartheta:~\vartheta\in\{\vartheta^1,\dots,\vartheta^k\}\text{ and }\bar g(\vartheta)\le c(\vartheta)\}[/math], where [math]\bar g(\vartheta)=\max_{j=1,\dots,J}g_j(\vartheta)[/math]. The "training data" [math](\vartheta^{\ell},c(\vartheta^{\ell}))_{\ell=1}^k[/math] is used to compute an ''approximating surface'' [math]c_k[/math] through a Gaussian-process regression model (kriging), as described in [42](Section 4.1.3);
  • For [math]L\ge k+1[/math], with probability [math]1-\epsilon[/math] the next evaluation point [math]\vartheta^L[/math] for the true critical level function [math]c[/math] is chosen by finding the point that maximizes expected improvement with respect to the approximating surface, [math]\mathbb{EI}_{L-1}(\vartheta)=(u^\top\vartheta-u^\top\vartheta^{*,L-1})_+\{1-\Phi([\bar g(\vartheta)-c_{L-1}(\vartheta)]/[\hat\varsigma s_{L-1}(\vartheta)])\}[/math]. Here [math]c_{L-1}(\vartheta)[/math] and [math]\hat\varsigma^2 s_{L-1}^2(\vartheta)[/math] are estimators of the posterior mean and variance of the approximating surface. To aim for global search, with probability [math]\epsilon[/math], [math]\vartheta^L[/math] is instead drawn uniformly from [math]\Theta[/math]. The approximating surface is then recomputed using [math](\vartheta^{\ell},c(\vartheta^{\ell}))_{\ell=1}^L[/math]. Steps 1 and 2 are repeated until a convergence criterion is met;
  • The extreme point of [math]CI_n[/math] is reported as the value [math]u^\top\vartheta^{*,L}[/math] that maximizes [math]u^\top\vartheta[/math] among the evaluation points that satisfy the true constraints, i.e. [math]u^\top\vartheta^{*,L}=\max\{u^\top\vartheta:~\vartheta\in\{\vartheta^1,\dots,\vartheta^L\}\text{ and }\bar g(\vartheta)\le c(\vartheta)\}[/math].

The only place where the approximating surface is used is in Step 2, to choose a new evaluation point. In particular, the reported extreme points of [math]\CI[/math] in eq:def:CI are the extreme values of [math]u^\top\vartheta[/math] that are consistent with the true surface where this surface was computed, not with the approximating surface. [38] establish convergence of their algorithm and obtain a convergence rate, as the number of evaluation points increases, for constrained optimization problems in which the constraints are sufficiently smooth "black box" functions, building on an earlier contribution of [43]. [43] establishes convergence of an expected improvement algorithm for unconstrained optimization problems where the objective is a "black box" function. The rate of convergence that [43] derives depends on the smoothness of the black box objective function. The rate of convergence obtained by [38] depends on the smoothness of the black box constraints, and is slightly slower than [43]'s rate. [38]'s Monte Carlo experiments suggest that the E-A-M algorithm is fast and accurate at computing their confidence intervals. The E-A-M algorithm also allows for very rapid computation of projections of the confidence set proposed by [44], and for a substantial improvement in the computational time of the profiling-based confidence intervals proposed by [45] ([45]'s method does not require solving a nonlinear program such as the one in eq:KMS:proj; rather, it obtains [math]\CI[/math] as in eq:CI:BCS. However, it approximates [math]c_{1-\alpha}[/math] by repeatedly solving bootstrap nonlinear programs, thereby incurring a very high computational cost at that stage). In all cases, the speed improvement results from a reduced number of evaluation points required to approximate the optimum. In an application to a point identified setting, [46](Supplement Section S.3) use [38]'s E-A-M method to construct uniform confidence bands for an unknown function of interest under (nonparametric) shape restrictions. They benchmark it against gridding and find it to be accurate at considerably improved speed.
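A compact sketch of the three E-A-M steps (in Python) follows. The quadratic constraint, the smooth function standing in for the black-box critical level [math]c[/math], the random candidate set used in the maximization step, and the use of scikit-learn's Gaussian-process regressor in place of the kriging model of [42] are all illustrative assumptions, not [38]'s implementation (for that, see the package by [37]).

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(2)

# Toy problem (illustrative): max u'theta  s.t.  g_bar(theta) <= c(theta).
u = np.array([1.0, 0.0])                             # projection direction
def g_bar(theta):  return np.max(theta ** 2 - 1.0)   # worst constraint
def c_true(theta): return 0.1 + 0.05 * np.sin(theta).sum()  # "black box" c

lo, hi, eps = -2.0, 2.0, 0.2

def feasible_best(X, y):
    """Best objective among visited points satisfying the TRUE constraint."""
    vals = [u @ th for th, c in zip(X, y) if g_bar(th) <= c]
    return max(vals) if vals else -np.inf

# E step: evaluate the true c at initial points (origin added so that
# at least one initial point is feasible).
X = np.vstack([np.zeros((1, 2)), rng.uniform(lo, hi, size=(10, 2))])
y = np.array([c_true(th) for th in X])

for _ in range(40):
    # A step: fit the approximating surface by GP regression (kriging).
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    best = feasible_best(X, y)

    # M step: pick the next evaluation point by maximizing expected
    # improvement over random candidates (a crude stand-in for a real
    # optimizer), or draw it uniformly with probability eps.
    cand = rng.uniform(lo, hi, size=(500, 2))
    mu, sd = gp.predict(cand, return_std=True)   # posterior mean, std
    gbar = np.array([g_bar(th) for th in cand])
    ei = (np.maximum(cand @ u - best, 0.0)
          * (1.0 - norm.cdf((gbar - mu) / np.maximum(sd, 1e-12))))
    nxt = (cand[np.argmax(ei)] if rng.uniform() > eps
           else rng.uniform(lo, hi, size=2))

    X, y = np.vstack([X, nxt]), np.append(y, c_true(nxt))

# Step 3: report using evaluations of the true c only.
print("reported extreme point of CI_n:", feasible_best(X, y))
```

Note that, as in Step 3, only evaluations of the true [math]c[/math] enter the reported value; the fitted surface merely guides where to evaluate next.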

General references

Molinari, Francesca (2020). "Microeconometrics with Partial Identification". arXiv:2004.11751 [econ.EM].

Notes

  1. To see this it suffices to set [math]g_j(\vartheta)=\frac{\sqrt{n}\bar{m}_{n,j}(\vartheta)}{\hat{\sigma}_{n,j}(\vartheta)}[/math] and [math]c(\vartheta)= c_{1-\alpha}(\vartheta)[/math].

References

  1. ber:man00
  2. man89
  3. man90
  4. man94
  5. man95
  6. man97:monotone
  7. hor:man98
  8. hor:man00
  9. man:pep00
  10. tau14
  11. lee09
  12. mcc:mil:roy15
  13. kre:pep07
  14. gun:kre:pep12
  15. kre:pep:gun:jol12
  16. imb:man04
  17. che:kim:lee:ros15
  18. and:kim:shi17
  19. che:lee:ros13
  20. and:shi13
  21. ber:mol:mor10
  22. ber:mol08
  23. cha:che:mol:sch12_code
  24. cha:che:mol:sch18
  25. cil:tam09
  26. che:hon:tam07
  27. gal:hen06
  28. ber:mol:mol08
  29. ber:mol:mol11
  30. che:ros:smo13
  31. che:ros17
  32. ber:tam06
  33. and:shi17
  34. che:che:kat18
  35. bel:bug:che18
  36. che:chr:tam18
  37. kai:mol:sto:thi17
  38. kai:mol:sto19
  39. jon:sch:wel98
  40. sch:wel:jon98
  41. jon01
  42. san:wil:not13
  43. bul11
  44. and:soa10
  45. bug:can:shi17
  46. fre:rev17