Institute of Statistical Science Academia Sinica

Postdoc Seminars

The forest for the trees: a meta-test of model nonlinearity

2019-03-20 (Wed.), 14:30
R6005, Research Center for Environmental Changes Building
The reception will be held at 15:30 in R6005, Research Center for Environmental Changes Building
Prof. HEIKO RACHINGER
Universidad de las Islas Baleares, Departamento de Economía Aplicada, Spain.

Abstract

In observational research, one often seeks to investigate the empirical nexus between two variables, y and x. A technically convenient assumption to adopt from the outset is that of linearity, i.e., that y = a + bx + u, since statistical methods are better developed, and their properties better understood, in this linear context. Not surprisingly, and even though it is unlikely to be generally and strictly valid, the vast majority of observational research relies on the assumption of linearity. Linearity may be approximately valid in a narrow domain of the independent variable, may be difficult to reject for a particular sample of data, or may simply be assumed without further checks on its validity. Yet the true relationship may be nonlinear.

In this paper we show that meta-analysis can be used to detect and inform on the nature of a nonlinear relationship between two variables even if all available estimates are derived assuming linearity. Let us illustrate the problem with an example taken from economics. Suppose that y and x denote monthly household consumption and monthly household disposable income, respectively. In this case, b measures the so-called marginal propensity to consume, a parameter that plays a major role in macroeconomic models. Now assume that many estimates of b are available in the literature, obtained assuming linearity from different samples spread out over a wider domain of disposable income. We then ask: Can we use these linear estimates to paint a broader nonlinear picture of the relationship between consumption and income? Can we test at the meta-analytical level whether, as is often theoretically argued, the marginal propensity to consume actually depends on the levels of consumption or disposable income?

We start by noting that this exercise would be relatively straightforward if the primary studies reported the means of the variables along with the estimates of a and b. A meta-specification test would then simply check whether the estimates of b depend on the sample means of x (or y). Under the null hypothesis of linearity, a regression of the former on the latter should deliver a zero coefficient. In the consumption-income example, the marginal propensity to consume should not depend on the level of income. If, on the contrary, the two variables are nonlinearly related, then the estimates of the slope coefficient b should depend on the support of the variable, increasing with the mean of x if the relationship is convex and decreasing if it is concave. The problem with this test is that, in practice, the means of x and y are often not reported. We thus refer to this test as infeasible.

The main challenge of this paper is then to construct a feasible meta-specification test that requires only information that is routinely reported in the primary studies, meaning the estimates of a and b and their respective standard errors (but not the means of x and y). We show that such a feasible test involves regressing the estimated values of b on the ratio of the standard errors of a and b. Indeed, this ratio implicitly contains information on the means of x, which can be exploited to test whether a nonlinear relationship is actually present.

We conduct Monte Carlo simulations to show that this feasible test controls the probability of erroneously rejecting the null of linearity (i.e., its size) and has considerable power to reject the null of linearity when the true relationship is nonlinear, only slightly less than that of the infeasible test. This means that little information is lost by not knowing the sample means of the variables. We investigate and discuss the performance of this test under a variety of research conditions: degree of nonlinearity, number of available estimates, degree of within-study sample variation, and degree of between-study sample variation.
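
To make the idea of the feasible test concrete, the following is a minimal simulation sketch, not the speaker's actual implementation. It relies on a standard property of simple OLS with an intercept: the ratio se(a)/se(b) equals the square root of the sample mean of x squared, so it grows with the mean of x. The data-generating process, the parameter values, and the function names (`ols_simple`, `simulate_primary_studies`, `feasible_meta_test`) are all assumptions chosen for illustration, and the meta-regression here is plain unweighted OLS.

```python
import numpy as np


def ols_simple(x, y):
    """OLS of y on a constant and x; returns (a, b, se_a, se_b)."""
    n = x.size
    X = np.column_stack([np.ones(n), x])
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ y
    resid = y - X @ coef
    sigma2 = resid @ resid / (n - 2)          # homoskedastic error variance
    se = np.sqrt(sigma2 * np.diag(xtx_inv))
    return coef[0], coef[1], se[0], se[1]


def simulate_primary_studies(rng, n_studies=200, n_obs=100, curvature=0.0):
    """Hypothetical literature: each study fits a linear model on its own sample.

    The true model is y = 1 + 0.5*x + curvature*x**2 + u (illustrative values).
    """
    rows = []
    for _ in range(n_studies):
        center = rng.uniform(1.0, 10.0)            # between-study variation
        x = center + rng.normal(0.0, 1.0, n_obs)   # within-study variation
        u = rng.normal(0.0, 1.0, n_obs)
        y = 1.0 + 0.5 * x + curvature * x**2 + u
        rows.append(ols_simple(x, y))
    return np.array(rows)  # columns: a_hat, b_hat, se_a, se_b


def feasible_meta_test(studies):
    """Regress b_hat on se_a / se_b; returns (slope, t-statistic of the slope)."""
    b_hat = studies[:, 1]
    ratio = studies[:, 2] / studies[:, 3]   # equals sqrt(mean(x**2)) per study
    _, slope, _, se_slope = ols_simple(ratio, b_hat)
    return slope, slope / se_slope


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for c in (0.0, 0.05):
        slope, t = feasible_meta_test(simulate_primary_studies(rng, curvature=c))
        print(f"curvature={c}: meta-slope={slope:.4f}, t={t:.2f}")
```

In this sketch, a convex relationship (positive `curvature`) should yield a clearly positive meta-slope, while a linear one should yield a slope close to zero. Only the reported estimates and standard errors enter the meta-regression, never the sample means.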

Update：2026-01-30 21:40
