As a rule of thumb, the difficulty in computing estimators of identification regions and confidence sets depends on whether a closed form expression is available for the boundary of the set. For example, often nonparametric bounds on functionals of a partially identified distribution are known functionals of observed conditional distributions, as in Section. Then “plug in” estimation is possible, and the computational cost is the same as for estimation and construction of confidence intervals (or confidence bands) for point-identified nonparametric regressions (incurred twice, once for the lower bound and once for the upper bound).
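To illustrate, here is a minimal sketch (in Python; the array names, the binary missingness indicator, and the bounded-outcome assumption [math]y\in[y_{lo},y_{hi}][/math] are all hypothetical) of plug-in estimation of worst-case bounds on a conditional mean with missing outcome data and a discrete covariate:

```python
import numpy as np

def worst_case_bounds(y, d, x, x0, y_lo=0.0, y_hi=1.0):
    """Plug-in worst-case bounds on E[y | x = x0] when y is observed only
    for units with d == 1 and y is known to lie in [y_lo, y_hi]:
    E[y|x] = E[y|d=1,x] P(d=1|x) + E[y|d=0,x] P(d=0|x), where the
    unobservable E[y|d=0,x] is replaced by y_lo (lower) or y_hi (upper)."""
    cell = (x == x0)
    p_obs = d[cell].mean()               # sample analog of P(d = 1 | x = x0)
    e_obs = y[cell & (d == 1)].mean()    # sample analog of E[y | d = 1, x = x0]
    lower = e_obs * p_obs + y_lo * (1 - p_obs)
    upper = e_obs * p_obs + y_hi * (1 - p_obs)
    return lower, upper
```

Each bound is a smooth functional of two point-identified conditional moments, so inference can proceed with standard delta-method or bootstrap arguments, applied once per bound.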
Similarly, support function based inference is easy to implement when [math]\idr{\theta}[/math] is convex. Sometimes the extreme points of [math]\idr{\theta}[/math] can be expressed as known functionals of observed distributions. Even if not, level sets of convex functions are easy to compute.
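For instance, when the convex set is a polytope [math]\{k: Ak\le b\}[/math] (a purely illustrative case; [math]A[/math] and [math]b[/math] are hypothetical primitives), each evaluation of its support function costs a single linear program:

```python
import numpy as np
from scipy.optimize import linprog

def support_function(u, A, b):
    """Support function h_K(u) = max {u'k : A k <= b} of the polytope
    K = {k : A k <= b}; linprog minimizes, so the objective is -u."""
    res = linprog(c=-np.asarray(u, float), A_ub=A, b_ub=b,
                  bounds=[(None, None)] * A.shape[1])
    return -res.fun

# Example: the unit square [0, 1]^2 in direction u = (1, 1) gives h = 2.
A = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]], float)
b = np.array([1, 0, 1, 0], float)
print(support_function([1.0, 1.0], A, b))
```

Tracing [math]u[/math] over a grid of directions then recovers the boundary of the set, which is what makes support-function-based inference computationally light in the convex case.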
But as was shown in Section, many problems of interest yield a set [math]\idr{\theta}[/math] that is not convex. In this case, [math]\idr{\theta}[/math] is obtained as a level set of a criterion function. Because [math]\idr{\theta}[/math] (or its associated confidence set) is often a subset of [math]\R^d[/math] (rather than [math]\R[/math]), even a moderate value for [math]d[/math], e.g., 8 or 10, can lead to extremely challenging computational problems. This is because if one wants to compute [math]\idr{\theta}[/math] or a set that covers it or its elements with a prespecified asymptotic probability (possibly uniformly over [math]\sP\in\cP[/math]), one has to map out a level set in [math]\R^d[/math]. If one is interested in confidence intervals for scalar projections or other smooth functions of [math]\vartheta\in\idr{\theta}[/math], one needs to solve complex nonlinear optimization problems, as for example in eq:CI:BCS and eq:KMS:proj. This can be difficult to do, especially because [math]c_{1-\alpha}(\vartheta)[/math] is typically an unknown function of [math]\vartheta[/math] for which gradients are not available in closed form.
Mirroring the fact that computation is easier when the boundary of [math]\idr{\theta}[/math] is a known function of observed conditional distributions, several portable software packages are available to carry out estimation and inference in this case. For example, [1] provide STATA and MatLab packages implementing the methods proposed by [2][3][4][5][6], [7][8], and [9].
[10] provides a STATA package to implement the bounds proposed by [11]. [12] provide a STATA package to implement bounds on treatment effects with endogenous and misreported treatment assignment and under the assumptions of monotone treatment selection, monotone treatment response, and monotone instrumental variables as in [6], [9], [13], [14], and [15]. The code computes the confidence intervals proposed by [16].
In the more general context of inference for a one-dimensional parameter defined by intersection bounds, as for example the one in eq:intersection:bounds, [17] and [18] provide portable STATA code implementing, respectively, methods to test hypotheses and build confidence intervals in [19] and in [20]. [21] provide portable STATA code implementing [22]'s method for estimation and inference for best linear prediction with interval outcome data as in Identification Problem.
[23] provide R code implementing [24]'s method for estimation and inference for best linear approximations of set identified functions.

On the other hand, there is a paucity of portable software implementing the theoretical methods for inference in structural partially identified models discussed in Section.
[25] compute [26] confidence sets for a parameter vector in [math]\R^d[/math] in an entry game with six players, with [math]d[/math] in the order of [math]20[/math] and with tens of thousands of inequalities, through a “guess and verify” algorithm based on simulated annealing (with no cooling) that visits many candidate values [math]\vartheta\in\Theta[/math], evaluates [math]\crit_n(\vartheta)[/math], and builds [math]\CS[/math] by retaining the visited values [math]\vartheta[/math] that satisfy [math]n\crit_n(\vartheta)\le c_{1-\alpha}(\vartheta)[/math] with [math]c_{1-\alpha}[/math] defined to satisfy eq:CS_coverage:point:pw.
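In stylized form, the approach amounts to the following loop (an illustrative sketch, not [25]'s actual code; the criterion [math]\crit_n[/math], the critical value function, and all tuning constants are user-supplied):

```python
import numpy as np

def guess_and_verify(Q_n, c_crit, theta0, n, temp=1.0, scale=0.1,
                     iters=100_000, seed=0):
    """Metropolis random walk at a fixed temperature (simulated annealing
    with no cooling) over the parameter space. Every visited theta with
    n * Q_n(theta) <= c_crit(theta) is retained in the confidence set."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, float)
    q_cur, kept = Q_n(theta), []
    for _ in range(iters):
        cand = theta + scale * rng.standard_normal(theta.size)
        q_cand = Q_n(cand)
        if n * q_cand <= c_crit(cand):
            kept.append(cand)            # "verify": candidate is retained
        # "guess": accept the move with the usual Metropolis probability
        if rng.random() < np.exp(min(0.0, -(q_cand - q_cur) / temp)):
            theta, q_cur = cand, q_cand
    return np.array(kept)
```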
Given the computational resources commonly available at this point in time, this is a tremendously hard task, due to the dimension of [math]\theta[/math] and the number of moment inequalities employed. As explained in Section An Inference Approach Robust to the Presence of Multiple Equilibria, these inequalities, which in a game of entry with [math]J[/math] players and discrete observable payoff shifters are [math]2^J|\cX|[/math] (with [math]\cX[/math] the support of the observable payoff shifters), yield an outer region [math]\outr{\theta}[/math].
It is natural to wonder what additional challenges one faces to compute [math]\idr{\theta}[/math] as described in Section Characterization of Sharpness through Random Set Theory. A definitive answer to this question is hard to obtain. If one employs all inequalities listed in Theorem, the number of inequalities jumps to [math](2^{2^J}-2)|\cX|[/math], increasing the computational cost. However, as suggested by [27] and extended by other authors (e.g., [28][29][30][31]), often many moment inequalities are redundant, substantially reducing the number of inequalities to be checked. Specifically, [27] propose the notion of core determining sets, a collection of compact sets such that if the inequality in Theorem holds for these sets, it holds for all sets in [math]\cK[/math]; see Definition and the surrounding discussion in Appendix. This often yields a number of restrictions similar to the one incurred to obtain outer regions. For example, [28](Section 4.2) analyze a four player, two type entry game with pure strategy Nash equilibrium as solution concept, originally proposed by [32], and show that while a direct application of Theorem entails [math]512|\cX|[/math] inequality restrictions, [math]26|\cX|[/math] suffice. In this example, [25]'s outer region is based on checking [math]18|\cX|[/math] inequalities.
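To get a sense of the magnitudes involved, a short computation contrasts, per value of [math]x\in\cX[/math], the [math]2^J[/math] inequalities yielding the outer region with the [math]2^{2^J}-2[/math] inequalities from a direct application of the Theorem:

```python
# Inequalities per covariate value: outer region (2^J) versus a direct
# application of Artstein's inequality to all subsets of outcomes.
for J in (2, 3, 4, 5):
    print(f"J={J}: outer {2**J:>3}, direct {2**(2**J) - 2:,}")
# J=2: outer   4, direct 14
# J=3: outer   8, direct 254
# J=4: outer  16, direct 65,534
# J=5: outer  32, direct 4,294,967,294
```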
A related but separate question is how to best allocate the computational effort. As one moves from partial identification analysis to finite sample considerations, one may face a trade-off between sharpness of the identification region and statistical efficiency. This is because inequalities that are redundant from the perspective of identification analysis might nonetheless be estimated with high precision, and hence improve the finite sample statistical properties of a confidence set or of a test of hypothesis.
Recent contributions by [33], [34], and [35] provide methods to build confidence sets, respectively, with a continuum of conditional moment inequalities and with a number of moment inequalities that may exceed the sample size. These contributions, however, do not yet answer the question of how to optimally select inequalities to yield confidence sets with best finite sample properties according to some specified notion of “best”.
A different approach, proposed by [36], directly uses a quasi-likelihood criterion function. In the context of, e.g., entry games, this entails assuming that the selection mechanism depends only on observable payoff shifters, using it to obtain the exact model implied distribution as in eq:games_model:pred, and partially identifying an enlarged parameter vector that includes [math]\theta[/math] and the selection mechanism. In an empirical application with discrete covariates, [36] apply their method to a two player entry game with correlated errors, where [math]\theta\in\R^9[/math] and the selection mechanism is a vector in [math]\R^8[/math], for a total of 17 parameters. Another empirical application, to the analysis of trade flows, includes 46 parameters.
In terms of general purpose portable code that can be employed in moment inequality models, I am only aware of the MatLab package provided by [37] to implement the inference method of [38] for projections and smooth functions of parameter vectors in models defined by a finite number of unconditional moment (in)equalities. More broadly, their method can be used to compute confidence intervals for optimal values of optimization problems with estimated constraints. Here I summarize their approach to further highlight why the computational task is challenging even in the case of projections.
The confidence interval in eq:def:CI-eq:KMS:proj requires solving two nonlinear programs, each with a linear objective and nonlinear constraints involving a critical value which in general is an unknown function of [math]\vartheta[/math], with unknown gradient. When the dimension of the parameter vector is large, directly solving optimization problems with such constraints can be expensive even if evaluating the critical value at each [math]\vartheta[/math] is cheap.[Notes 1] Hence, [38] propose to use an algorithm (called E-A-M for Evaluation-Approximation-Maximization) to solve these nonlinear programs, which belongs to the family of expected improvement algorithms (see e.g. [39][40][41](and references therein)). Given a constrained optimization problem of the form

[math]\max_{\vartheta\in\Theta} u^\top\vartheta \text{ s.t. } g_j(\vartheta)\le c(\vartheta),\quad j=1,\dots,J,[/math]
to which eq:KMS:proj belongs,[Notes 2] the algorithm attempts to solve it by cycling over three steps (a stylized code sketch follows the list):
- The true critical level function [math]c[/math] is evaluated at an initial (uniformly randomly drawn from [math]\Theta[/math]) set of points [math]\vartheta^1,\dots,\vartheta^k[/math]. These values are used to compute a current guess for the optimal value, [math]u^\top\vartheta^{*,k}=\max\{u^\top\vartheta:~\vartheta\in\{\vartheta^1,\dots,\vartheta^k\}\text{ and }\bar g(\vartheta)\le c(\vartheta)\}[/math], where [math]\bar g(\vartheta)=\max_{j=1,\dots,J}g_j(\vartheta)[/math]. The “training data” [math](\vartheta^{\ell},c(\vartheta^{\ell}))_{\ell=1}^k[/math] is used to compute an approximating surface [math]c_k[/math] through a Gaussian-process regression model (kriging), as described in [42](Section 4.1.3);
- For [math]L\ge k+1[/math], with probability [math]1-\epsilon[/math] the next evaluation point [math]\vartheta^L[/math] for the true critical level function [math]c[/math] is chosen by finding the point that maximizes expected improvement with respect to the approximating surface, [math]\mathbb{EI}_{L-1}(\vartheta)=(u^\top\vartheta-u^\top\vartheta^{*,L-1})_+\{1-\Phi([\bar g(\vartheta)-c_{L-1}(\vartheta)]/[\hat\varsigma s_{L-1}(\vartheta)])\}[/math]. Here [math]c_{L-1}(\vartheta)[/math] and [math]\hat\varsigma^2 s_{L-1}^2(\vartheta)[/math] are estimators of the posterior mean and variance of the approximating surface. To aim for global search, with probability [math]\epsilon[/math], [math]\vartheta^L[/math] is drawn uniformly from [math]\Theta[/math]. The approximating surface is then recomputed using [math](\vartheta^{\ell},c(\vartheta^{\ell}))_{\ell=1}^L[/math]. Steps 1 and 2 are repeated until a convergence criterion is met.
- The extreme point of [math]CI_n[/math] is reported as the value [math]u^\top\vartheta^{*,L}[/math] that maximizes [math]u^\top\vartheta[/math] among the evaluation points that satisfy the true constraints, i.e. [math]u^\top\vartheta^{*,L}=\max\{u^\top\vartheta:~\vartheta\in\{\vartheta^1,\dots,\vartheta^L\}\text{ and }\bar g(\vartheta)\le c(\vartheta)\}[/math].
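The loop can be sketched as follows (Python; the scikit-learn Gaussian-process surrogate, the grid-based expected-improvement maximization, and all names and tuning constants are illustrative simplifications, not [38]'s implementation):

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def eam_maximize(u, g_bar, c, lo, hi, k=20, L=200, eps=0.05, seed=0):
    """Stylized E-A-M loop for max u'theta s.t. g_bar(theta) <= c(theta),
    with c an expensive black box. A kriging surrogate of c picks new
    evaluation points by expected improvement; the reported optimum uses
    only points at which the *true* c was evaluated (assumes at least one
    visited point turns out to be feasible)."""
    rng, u = np.random.default_rng(seed), np.asarray(u, float)
    draw = lambda m: rng.uniform(lo, hi, size=(m, len(lo)))
    X = draw(k)                            # E: initial evaluation points
    y = np.array([c(th) for th in X])      # true critical values
    for _ in range(k, L):
        feas = np.array([g_bar(t) <= cv for t, cv in zip(X, y)])
        best = (X[feas] @ u).max() if feas.any() else -1e12
        # A: Gaussian-process (kriging) surface approximating theta -> c(theta)
        gp = GaussianProcessRegressor(kernel=RBF(), normalize_y=True).fit(X, y)
        if rng.random() < eps:             # occasional uniform draw
            new = draw(1)[0]
        else:                              # M: maximize expected improvement
            cand = draw(4096)              # (here crudely, over a random grid)
            mu, sd = gp.predict(cand, return_std=True)
            gap = np.array([g_bar(t) for t in cand]) - mu
            ei = np.maximum(cand @ u - best, 0.0) * (1 - norm.cdf(gap / (sd + 1e-12)))
            new = cand[np.argmax(ei)]
        X, y = np.vstack([X, new]), np.append(y, c(new))  # E: evaluate true c
    feas = np.array([g_bar(t) <= cv for t, cv in zip(X, y)])
    return X[feas][np.argmax(X[feas] @ u)]   # best point verified on the true c
```

The design choice of maximizing expected improvement over a cheap surrogate, while reserving the expensive function [math]c[/math] for a single new evaluation per iteration, is what delivers the speed gains discussed next.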
The only place where the approximating surface is used is in Step 2, to choose a new evaluation point. In particular, the reported extreme points of [math]\CI[/math] in eq:def:CI are the extreme values of [math]u^\top\vartheta[/math] that are consistent with the true surface where this surface was computed, not with the approximating surface. [38] establish convergence of their algorithm and obtain a convergence rate, as the number of evaluation points increases, for constrained optimization problems in which the constraints are sufficiently smooth “black box” functions, building on an earlier contribution of [43]. [43] establishes convergence of an expected improvement algorithm for unconstrained optimization problems where the objective is a “black box” function. The rate of convergence that [43] derives depends on the smoothness of the black box objective function. The rate of convergence obtained by [38] depends on the smoothness of the black box constraints, and is slightly slower than [43]’s rate. [38]'s Monte Carlo experiments suggest that the E-A-M algorithm is fast and accurate at computing their confidence intervals. The E-A-M algorithm also allows for very rapid computation of projections of the confidence set proposed by [44], and for a substantial improvement in the computational time of the profiling-based confidence intervals proposed by [45].[Notes 3] In all cases, the speed improvement results from a reduced number of evaluation points required to approximate the optimum. In an application to a point identified setting, [46](Supplement Section S.3) use [38]'s E-A-M method to construct uniform confidence bands for an unknown function of interest under (nonparametric) shape restrictions. They benchmark it against gridding and find it to be accurate at considerably improved speed.
General references
Molinari, Francesca (2020). "Microeconometrics with Partial Identification". arXiv:2004.11751 [econ.EM].
Notes
1. [38] propose a linearization method whereby [math]c_{1-\alpha}[/math] is calibrated through repeatedly solving bootstrap linear programs, hence it is reasonably cheap to compute.
2. To see this it suffices to set [math]g_j(\vartheta)=\frac{\sqrt{n}\bar{m}_{n,j}(\vartheta)}{\hat{\sigma}_{n,j}(\vartheta)}[/math] and [math]c(\vartheta)= c_{1-\alpha}(\vartheta)[/math].
3. [45]'s method does not require solving a nonlinear program such as the one in eq:KMS:proj. Rather, it obtains [math]\CI[/math] as in eq:CI:BCS. However, it approximates [math]c_{1-\alpha}[/math] by repeatedly solving bootstrap nonlinear programs, thereby incurring a very high computational cost at that stage.
References
1. Beresteanu, A., and C.F. Manski (2000): “Bounds for STATA and Bounds for MatLab” available at http://faculty.wcas.northwestern.edu/cfm754/bounds_stata.pdf.
2. Manski, C.F. (1989): “Anatomy of the Selection Problem” The Journal of Human Resources, 24(3), 343--360.
3. Manski, C.F. (1990): “Nonparametric Bounds on Treatment Effects” The American Economic Review Papers and Proceedings, 80(2), 319--323.
4. Manski, C.F. (1994): “The selection problem” in Advances in Econometrics: Sixth World Congress, ed. by C.A. Sims, vol. 1 of Econometric Society Monographs, pp. 143--170. Cambridge University Press.
5. Manski, C.F. (1995): Identification Problems in the Social Sciences. Harvard University Press.
6. Manski, C.F. (1997b): “Monotone Treatment Response” Econometrica, 65(6), 1311--1334.
7. Horowitz, J.L., and C.F. Manski (1998): “Censoring of outcomes and regressors due to survey nonresponse: Identification and estimation using weights and imputations” Journal of Econometrics, 84(1), 37--58.
8. Horowitz, J.L., and C.F. Manski (2000): “Nonparametric Analysis of Randomized Experiments with Missing Covariate and Outcome Data” Journal of the American Statistical Association, 95(449), 77--84.
9. Manski, C.F., and J.V. Pepper (2000): “Monotone Instrumental Variables: With an Application to the Returns to Schooling” Econometrica, 68(4), 997--1010.
10. Tauchmann, H. (2014): “Lee (2009) treatment-effect bounds for nonrandom sample selection” Stata Journal, 14(4), 884--894.
11. Lee, D.S. (2009): “Training, Wages, and Sample Selection: Estimating Sharp Bounds on Treatment Effects” The Review of Economic Studies, 76(3), 1071--1102.
12. McCarthy, I., D.L. Millimet, and M. Roy (2015): “Bounding treatment effects: A command for the partial identification of the average treatment effect with endogenous and misreported treatment assignment” Stata Journal, 15(2), 411--436.
13. Kreider, B., and J.V. Pepper (2007): “Disability and Employment: Reevaluating the Evidence in Light of Reporting Errors” Journal of the American Statistical Association, 102(478), 432--441.
14. Gundersen, C., B. Kreider, and J. Pepper (2012): “The impact of the National School Lunch Program on child health: A nonparametric bounds analysis” Journal of Econometrics, 166(1), 79--91.
15. Kreider, B., J.V. Pepper, C. Gundersen, and D. Jolliffe (2012): “Identifying the Effects of SNAP (Food Stamps) on Child Health Outcomes When Participation Is Endogenous and Misreported” Journal of the American Statistical Association, 107(499), 958--975.
16. Imbens, G.W., and C.F. Manski (2004): “Confidence Intervals for Partially Identified Parameters” Econometrica, 72(6), 1845--1857.
17. Chernozhukov, V., W. Kim, S. Lee, and A.M. Rosen (2015): “Implementing intersection bounds in Stata” Stata Journal, 15(1), 21--44.
18. Andrews, D.W.K., W. Kim, and X. Shi (2017): “Commands for testing conditional moment inequalities and equalities” Stata Journal, 17(1), 56--72.
19. Chernozhukov, V., S. Lee, and A.M. Rosen (2013): “Intersection Bounds: estimation and inference” Econometrica, 81(2), 667--737.
20. Andrews, D.W.K., and X. Shi (2013): “Inference based on conditional moment inequalities” Econometrica, 81(2), 609--666.
21. Beresteanu, A., F. Molinari, and D.S. Morris (2010): “Asymptotics for Partially Identified Models in STATA” available at https://molinari.economics.cornell.edu/programs/Stata_SetBLP.zip.
22. Beresteanu, A., and F. Molinari (2008): “Asymptotic Properties for a Class of Partially Identified Models” Econometrica, 76(4), 763--814.
23. Chandrasekhar, A., V. Chernozhukov, F. Molinari, and P. Schrimpf (2012): “R code implementing best linear approximations to set identified functions” available at https://bitbucket.org/paulschrimpf/mulligan-rubinstein-bounds.
24. Chandrasekhar, A., V. Chernozhukov, F. Molinari, and P. Schrimpf (2018): “Best linear approximations to set identified functions: with an application to the gender wage gap” CeMMAP working paper CWP09/19, available at https://www.cemmap.ac.uk/publication/id/13913.
25. Ciliberto, F., and E. Tamer (2009): “Market Structure and Multiple Equilibria in Airline Markets” Econometrica, 77(6), 1791--1828.
26. Chernozhukov, V., H. Hong, and E. Tamer (2007): “Estimation and Confidence Regions for Parameter Sets in Econometric Models” Econometrica, 75(5), 1243--1284.
27. Galichon, A., and M. Henry (2006): “Inference in Incomplete Models” available at http://dx.doi.org/10.2139/ssrn.886907.
28. Beresteanu, A., I. Molchanov, and F. Molinari (2008): “Sharp Identification Regions in Games” CeMMAP working paper CWP15/08, available at https://www.cemmap.ac.uk/publication/id/4264.
29. Beresteanu, A., I. Molchanov, and F. Molinari (2011): “Sharp identification regions in models with convex moment predictions” Econometrica, 79(6), 1785--1821.
30. Chesher, A., A.M. Rosen, and K. Smolinski (2013): “An instrumental variable model of multiple discrete choice” Quantitative Economics, 4(2), 157--196.
31. Chesher, A., and A.M. Rosen (2017a): “Generalized instrumental variable models” Econometrica, 85, 959--989.
32. Berry, S.T., and E. Tamer (2006): “Identification in Models of Oligopoly Entry” in Advances in Economics and Econometrics: Theory and Applications, Ninth World Congress, ed. by R. Blundell, W.K. Newey, and T.E. Persson, vol. 2 of Econometric Society Monographs, pp. 46--85. Cambridge University Press.
33. Andrews, D.W.K., and X. Shi (2017): “Inference based on many conditional moment inequalities” Journal of Econometrics, 196(2), 275--287.
34. Chernozhukov, V., D. Chetverikov, and K. Kato (2018): “Inference on causal and structural parameters using many moment inequalities” Review of Economic Studies, forthcoming, available at https://doi.org/10.1093/restud/rdy065.
35. Belloni, A., F.A. Bugni, and V. Chernozhukov (2018): “Subvector inference in partially identified models with many moment inequalities” available at https://arxiv.org/abs/1806.11466.
36. Chen, X., T.M. Christensen, and E. Tamer (2018): “MCMC Confidence Sets for Identified Sets” Econometrica, 86(6), 1965--2018.
37. Kaido, H., F. Molinari, J. Stoye, and M. Thirkettle (2017): “Calibrated Projection in MATLAB” documentation available at https://arxiv.org/abs/1710.09707 and code available at https://github.com/MatthewThirkettle/calibrated-projection-MATLAB.
38. Kaido, H., F. Molinari, and J. Stoye (2019a): “Confidence Intervals for Projections of Partially Identified Parameters” Econometrica, 87(4), 1397--1432.
39. Jones, D.R., M. Schonlau, and W.J. Welch (1998): “Efficient Global Optimization of Expensive Black-Box Functions” Journal of Global Optimization, 13(4), 455--492.
40. Schonlau, M., W.J. Welch, and D.R. Jones (1998): “Global versus Local Search in Constrained Optimization of Computer Models” Lecture Notes-Monograph Series, 34, 11--25.
41. Jones, D.R. (2001): “A Taxonomy of Global Optimization Methods Based on Response Surfaces” Journal of Global Optimization, 21(4), 345--383.
42. Santner, T.J., B.J. Williams, and W.I. Notz (2013): The design and analysis of computer experiments. Springer Science & Business Media.
43. Bull, A.D. (2011): “Convergence rates of efficient global optimization algorithms” Journal of Machine Learning Research, 12(Oct), 2879--2904.
44. Andrews, D.W.K., and G. Soares (2010): “Inference for Parameters Defined by Moment Inequalities Using Generalized Moment Selection” Econometrica, 78(1), 119--157.
45. Bugni, F.A., I.A. Canay, and X. Shi (2017): “Inference for subvectors and other functions of partially identified parameters in moment inequality models” Quantitative Economics, 8(1), 1--38.
46. Freyberger, J., and B. Reeves (2017): “Inference Under Shape Restrictions” available at https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3011474.