WPS6335

Policy Research Working Paper 6335

Structural Change and Cross-Country Growth Empirics

Markus Eberhardt
Francis Teal

The World Bank
Development Economics Vice Presidency
Partnerships, Capacity Building Unit
January 2013

Abstract

One of the most striking features of economic growth is the process of structural change whereby the share of agriculture in GDP decreases as countries develop. The cross-country growth literature typically estimates an aggregate homogeneous production function or convergence regression model that abstracts from this process of structural change. This paper investigates the extent to which assumptions about aggregation and homogeneity matter for inferences regarding the nature of technology differences across countries. Using a unique World Bank dataset, it estimates production functions for agriculture and manufacturing in a panel of 40 developing and developed countries for the period from 1963 to 1992. It empirically models dimensions of heterogeneity across countries, allowing for different choices of technology within both sectors. The paper argues that heterogeneity is important within sectors across countries, implying that an analysis of aggregate data will not produce useful measures of the nature of the technology or productivity. It shows that many of the puzzling elements in aggregate cross-country empirics can be explained by inappropriate aggregation across heterogeneous sectors.

This paper is a product of the Partnerships, Capacity Building Unit, Development Economics Vice Presidency. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The authors may be contacted at markus.eberhardt@nottingham.ac.uk and francis.teal@economics.ox.ac.uk.
The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Structural Change and Cross-Country Growth Empirics

Markus Eberhardt and Francis Teal*

JEL codes: O47, O11, C23

Keywords: dual economy model; cross-country production function; technology heterogeneity; aggregation; common factor model; panel time series econometrics

Sector Board: Economic Policy (EPOL)

* Markus Eberhardt (markus.eberhardt@nottingham.ac.uk, corresponding author) is a lecturer in economics at the University of Nottingham and a research associate at the Centre for the Study of African Economies (CSAE), Department of Economics, University of Oxford. Francis Teal (francis.teal@economics.ox.ac.uk) is a university reader in economics at the University of Oxford, deputy director of CSAE, and a research fellow at the Institute for the Study of Labor (IZA), Bonn. This research was supported by the UK Economic and Social Research Council [grant numbers PTA-031-2004-00345 and PTA-026-27-2048 to M.E.] and the Bill and Melinda Gates Foundation [to M.E.].
The authors thank Anindya Banerjee, Alberto Behar, Steve Bond, Josep Carrion-i-Silvestre, Areendam Chanda, Hashem Pesaran, Måns Söderbom, Ron Smith, Dietz Vollrath, three anonymous referees, and seminar/session attendants at Oxford, Manchester, and Birmingham as well as the CSAE Annual Conference 2009, the 13th Applied Economics Meeting, the 16th International Panel Data Conference, and the 7th Annual Meeting of the Irish Society of New Economists for useful comments and suggestions. The usual disclaimers apply.

The early literature on developing countries distinguished between the processes of economic development and economic growth. Economic development was considered a process of structural transformation by which, in Arthur Lewis’ frequently cited phrase, an economy that was “previously saving and investing 4 or 5 percent of its national income or less, converts itself into an economy where voluntary savings is running at about 12 to 15 percent of national income” (Lewis 1954: 155). An acceleration in the investment rate was only one part of this process of structural transformation; of equal importance was the process by which an economy moves from dependence on subsistence agriculture to one in which a modern industrial sector absorbs an increasing proportion of the labor force (e.g., Jorgensen 1961; Ranis and Fei 1961; Robinson 1971). In contrast to these models of “development for backward economies” (Jorgensen 1961: 309), where duality between the modern and traditional sectors was a key feature of the model, was the analysis of economic growth in developed economies.1 Here, the processes of factor accumulation and technical progress occur in an economy that is already developed, in the sense that it has a modern industrial sector and agriculture has ceased to be a major part of the economy (e.g., Solow 1956; Swan 1956).
Since the early 1990s, the literature on economic development and economic growth has yielded a wide array of models with increasing interaction between theory and empirics (Durlauf and Quah, 1999; Easterly, 2002; Durlauf, Johnson, and Temple, 2005). The applied literature continues to be dominated by an empirical version of the aggregate Solow-Swan model (Temple 2005), with much of the debate focusing on the roles of factor accumulation versus technical progress (Young 1995; Klenow and Rodriguez-Clare 1997a, b; Easterly and Levine 2001; Baier, Dwyer, and Tamura 2006). Although some new theoretical and empirical work has used a dual economy approach (e.g., Vollrath 2009a, b; Lin, 2011; McMillan and Rodrik 2011; Page 2012), this model is largely absent from textbooks on economic growth and has not been the central focus for most empirical analyses (Temple 2005). A primary reason for this focus has been the availability of data. The Penn World Table (PWT) dataset (most recently, Heston, Summers, and Aten 2011) and the Barro-Lee data on human capital (most recently, Barro and Lee 2010) have supplied macrodata that facilitate the estimation of the aggregate human capital-augmented Solow-Swan model. However, a team at the World Bank has developed comparable sectoral data for agriculture and manufacturing (Crego, Larson, Butzer, and Mundlak 1998) that allow for a closer matching between a dual economy framework and the data, which we seek to exploit in this paper. We estimate production functions for the manufacturing and agriculture sectors and contrast the results with those from ‘stylized’ aggregate production functions where we construct all variables by adding up the sectoral values in each country. In addition, we follow the standard approach in the literature using data from the PWT to estimate aggregate functions.
Our findings indicate that technological differences across countries and sectors are important and that aggregate specifications are likely to produce misleading inferences regarding total factor productivity (TFP). The remainder of the paper is organized as follows: section I provides motivations for technology heterogeneity across sectors and countries. In section II, we introduce an empirical specification for our dual economy framework, discuss the data, and briefly review the empirical methods and estimators employed. Section III reports and discusses empirical findings at the sector level. Section IV presents empirical findings from stylized and PWT aggregate data as well as evidence for technology heterogeneity. Summary remarks and conclusions are provided in section V.

TECHNOLOGY HETEROGENEITY

In the following sections, we sketch our theoretical arguments for technology heterogeneity across sectors of production and across countries, building on the dual economy and new growth literature.

Technology Heterogeneity across Sectors

From a technical point of view, an aggregate production function only offers an appropriate construct in a cross-country empirical framework if the economies under investigation do not display large differences in sectoral structure (Temple 2005) because a single production function framework assumes common production technology across all firms facing the same factor prices. Consider two distinct sectors, assuming marginal labor product equalization and capital homogeneity across sectors, and Cobb-Douglas-type production technology. Then, if technology parameters differ between sectors, aggregated production technology cannot be of the (standard) Cobb-Douglas form (Stoker 1993; Temple and Wößmann 2006). Thus, finding different technology parameters across sectoral production functions is potentially a serious challenge to treating production in the form of an aggregated function.
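The aggregation argument can be illustrated numerically. The following is a minimal sketch and not part of the original analysis: it assumes two Cobb-Douglas sectors with hypothetical capital elasticities of 0.3 and 0.6, unit TFP, and fixed sectoral input allocations (it does not impose marginal product equalization). All function names and parameter values are illustrative assumptions. When both sectoral capital stocks grow proportionally, the elasticity of summed output equals the output-share-weighted average of the sectoral elasticities, so it varies with sectoral structure unless the two sectors share the same technology.

```python
import math


def cobb_douglas(capital, labor, alpha, tfp=1.0):
    """Sector output with capital elasticity alpha and constant returns to scale."""
    return tfp * capital ** alpha * labor ** (1.0 - alpha)


def aggregate_capital_elasticity(k_agri, l_agri, k_manu, l_manu,
                                 alpha_agri=0.3, alpha_manu=0.6, step=1e-4):
    """Elasticity of summed output to a proportional rise in both capital stocks.

    The default elasticities 0.3 and 0.6 are hypothetical illustration values.
    """
    def total_output(scale):
        return (cobb_douglas(scale * k_agri, l_agri, alpha_agri)
                + cobb_douglas(scale * k_manu, l_manu, alpha_manu))

    # Finite difference of log output with respect to the log capital scale factor.
    rise = math.log(total_output(1.0 + step)) - math.log(total_output(1.0 - step))
    run = math.log(1.0 + step) - math.log(1.0 - step)
    return rise / run


# Same sectoral technologies, different sectoral structure (hypothetical allocations).
agrarian = aggregate_capital_elasticity(k_agri=90.0, l_agri=90.0, k_manu=10.0, l_manu=10.0)
industrial = aggregate_capital_elasticity(k_agri=10.0, l_agri=10.0, k_manu=90.0, l_manu=90.0)
```

With 90 percent of inputs in the low-elasticity sector, the implied aggregate capital elasticity is about 0.33; with 90 percent in the high-elasticity sector, it is about 0.57. A single aggregate Cobb-Douglas coefficient cannot describe both economies.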
An alternative motivation for focusing on sector-level rather than aggregate growth across countries is the following: it is common practice in applied work to exclude oil-producing countries from any aggregate growth analysis because “the bulk of recorded GDP for these countries represents the extraction of existing resources, not value added” (Mankiw, Romer, and Weil 1992: 413). The underlying argument is that sectoral ‘distortions,’ such as resource wealth, justify the exclusion of these observations. Therefore, it could be argued that given the large share of agriculture in GDP for countries such as Malawi (25 to 50 percent over the period between 1970 and 2000), India (25 to 46 percent), or Malaysia (8 to 30 percent), these countries should be excluded from any aggregate growth analysis because a significant share of their aggregate GDP is derived from a single resource, namely, land.2 A sector-level analysis mitigates this problem because manufacturing and agriculture are clearly more homogeneous sectors than any aggregate construct.

Technology Heterogeneity across Countries

A theoretical justification for heterogeneous technology parameters across countries can be found in the ‘new growth’ literature. This strand of the literature on theories of economic growth argues that production functions differ across countries and seeks to determine the sources of this heterogeneity (Durlauf, Kourtellos, and Minkin 2001). As Brock and Durlauf (2001: 8/9) remark, “… the assumption of parameter homogeneity seems particularly inappropriate when one is studying complex heterogeneous objects such as countries.” Azariadis and Drazen’s (1990) model can be considered the ‘grandfather’ for many of the theoretical attempts to allow countries to possess technologies that differ from one another or over time.
Other theoretical studies lead to interpretations of multiple equilibria as factor parameter heterogeneity in the production function (e.g., Murphy, Shleifer, and Vishny 1989; Durlauf 1993; Banerjee and Newman 1993). The ‘appropriate technology’ literature provides a further challenge to the assumption of a common technology, arguing that different technologies are appropriate for different factor endowments (see Basu and Weil 1998): global R&D leaders develop productivity-enhancing technologies that are suitable for their own capital-labor ratios and that cannot be used effectively by poorer countries; therefore, the latter do not develop. Empirical evidence that lends support to this hypothesis can be found in Clark (2007) and Jerzmanowski (2007). A simpler justification for heterogeneous production functions is offered by Durlauf, Kourtellos, and Minkin (2001: 929), who suggest that the Solow model was never intended to be valid in a homogeneous specification for all countries but that it might be a good way to investigate each country, that is, if we allow for parameter differences across countries. Formal insights for empirical modeling can be gained from the microproduction framework introduced in Mundlak (1988) and applied to macrodata for agriculture in Mundlak, Larson, and Butzer (1999) and Mundlak, Butzer, and Larson (2012). In these studies, the technology of production available to individual firms is a collection of possible techniques, each with its own production function, with optimal output over implemented techniques defined as

(1) $Y^* \equiv F(X^*, s) = \phi(s)$

where $X^*$ and $Y^*$ represent (optimal) inputs and output aggregated over implemented techniques, and $s$ is a vector of state variables determining both optimal input choice $X^*$ and implemented technique $F(\cdot)$.3 In each period,4 firms face the economic problem of choosing inputs and the appropriate production technique.
This joint determination of inputs and technique makes it difficult to identify parameter coefficients in an empirical equivalent of equation (1) unless additional structure is imposed on the problem. Adopting a number of simplifying assumptions, Mundlak, Butzer, and Larson (2012) provide the following approximation for their empirical model of output and inputs (i.e., production/supply and factor demand functions), explicitly including the exogenous state variables $s$

(2) $y_{it} = x_{it}\beta(s) + s_{it}\gamma + m_{0it} + u_{0it}$

(3) $x_{jit} = s_{it}\gamma + m_{0it} + \varepsilon_{jit}$

where subscript $j$ refers to the specific observed input to production $x$, and $y$ is observed output;5 $m_{0it}$ represents a firm-specific productivity shock at time $t$ that is observed by the firm, thus influencing its input choice, but is unknown to the econometrician. A large body of microeconometric literature (for a recent survey, see Eberhardt and Helmers 2010) has attempted to address the resulting ‘transmission bias’ first highlighted by Marschak and Andrews (1944). Mundlak, Butzer, and Larson (2012) simplify this productivity shock by requiring that it be decomposable into firm- and time-specific effects, $m_{0it} = m_{0i} + m_{0t}$ (similarly for the input equations). This setup further highlights two ‘technology shifters’: first, the state variables affect output directly and indirectly through the selection of inputs, acting as input/output shifters; second, the state variables directly influence the technology parameters $\beta$. The state variables act as technology shifters in the sense that, conditional on $s$, (i) different countries might have different $\beta$ coefficients, and/or (ii) at different points in time, the same country might have different $\beta$ coefficients. The presence of the state variables in the equations for $y$ and $x$ prevents the straightforward application of instrumental variables.
6 Following some simplifying assumptions regarding aggregation (see Mundlak, 1988), the above framework is extended to apply at the country level. Empirical testing in the case of the cross-country production function for agriculture is conducted with the following set of state variables: proxies for human capital, level of development, institutions, peak agricultural yield, and a number of indicators for prices and price variability.7 Using the simplifying assumption $\beta(s) = \beta$, where $\beta$ is referred to as a ‘sample-dependent constant,’ the model is estimated using ordinary least squares (OLS) following a within-country-time transformation of the variables (i.e., applying the two-way fixed effects estimator). The authors refer to the results from this regression as ‘core technology.’8 Further empirical analysis in that paper and in a related study by one of the coauthors (Butzer, 2011) investigates parameter constancy over time and parameter heterogeneity across countries by splitting the data into two periods and two country groups. Our own empirical approach discussed below builds on the theoretical model by Mundlak (1988) but allows for more flexibility in the empirical implementation than Mundlak, Butzer, and Larson (2012).

AN EMPIRICAL MODEL OF A DUAL ECONOMY

In the following section, we first present a general empirical specification for our sector-specific analysis of agriculture and manufacturing that shows how recent developments in the econometric modeling of production functions link to the framework proposed by Mundlak. Next, we review a number of empirical estimators, focusing on those arising from the recent panel time series literature, before we briefly discuss the data.

Empirical Specification

Our empirical framework adopts a ‘common factor’ representation for a standard log-linearized Cobb-Douglas production function model. Each sector/level of aggregation is modeled separately.
For ease of notation, we do not identify this multiplicity in our general model. Let

(4) $y_{it} = \beta_i' x_{it} + u_{it}, \qquad u_{it} = \alpha_i + \lambda_i' f_t + \varepsilon_{it}$

(5) $x_{mit} = \pi_{mi} + \delta_{mi}' g_{mt} + \varphi_{1mi} f_{1mt} + \dots + \varphi_{nmi} f_{nmt} + v_{mit}$

(6) $f_t = \tau + \rho' f_{t-1} + \omega_t$ and $g_t = \mu + \kappa' g_{t-1} + \upsilon_t$

for $i = 1, \dots, N$ countries, $t = 1, \dots, T$ time periods, and $m = 1, \dots, k$ inputs.9 Equation (4) represents the production function, with $y$ as sectoral or aggregated value-added and $x$ as a set of inputs: labor, physical capital stock, and a measure for natural capital stock (arable and permanent crop land) in the agriculture specification (all variables are transformed to log values). We consider additional inputs (human capital, livestock, and fertilizer) as robustness checks for our general findings (see supplemental appendix S4, available at http://wber.oxfordjournals.org/). The output elasticities associated with each input ($\beta_i$) are allowed to differ across countries.10 For unobserved TFP, we employ the combination of a country-specific TFP level ($\alpha_i$) and a set of common factors ($f_t$) with country-specific factor loadings ($\lambda_i$). TFP is therefore, in the spirit of a ‘measure of our ignorance’ (Abramowitz 1956), driven by latent processes that are either difficult to measure or that are truly unobservable. Equation (6) provides some structure for these unobserved common processes that are modeled as simple AR(1) processes with drift terms. We do not exclude the possibility of unit root processes ($\rho = 1$, $\kappa = 1$) leading to nonstationary observables and unobservables. Note that the potential for spurious regression results arises in this setup if the empirical equation is misspecified. Equation (5) details the evolution of the set of inputs, that is, the input demand functions. Crucially, some of the same processes determining the evolution of inputs are assumed to drive TFP in the production function equation.
11 Economically, this assumption implies that the processes that make up TFP (e.g., knowledge, innovation, absorptive capacity) affect choices of inputs, including the accumulation of capital stock, the evolution of the labor force, and (in the agriculture equation) the area of land under cultivation, while at the same time affecting the production of output directly. Thus, technical progress affects both production and the choice of productive inputs. Econometrically, this setup leads to endogeneity whereby the regressors are correlated with the unobservables, making it difficult to identify $\beta_i$ separately from $\lambda_i$ and $\varphi_i$ (Kapetanios, Pesaran, and Yamagata 2011). The nature of macroeconomic variables in a globalized world, where economies are strongly connected to each other and latent forces drive all of the outcomes, provides a conceptual justification for the pervasive character of unobserved common factors. The presence of these latent factors makes it difficult to argue for the validity of traditional approaches to causal interpretation of cross-country empirical analyses. Instrumental variable estimation in cross-section growth regressions or Arellano and Bond-type (1991) lag-instrumentation within pooled panel models becomes invalid in the face of common factors and/or heterogeneous equilibrium relationships (Pesaran and Smith, 1995; Lee, Pesaran, and Smith, 1997). This framework can be viewed as an empirical version of the theoretical Mundlak model developed above. Equations (4) and (5) capture the jointness property that is made explicit in the empirical model of Mundlak, Butzer, and Larson (2012) by the inclusion of a set of ‘state variables,’ which affect inputs and output in an identical fashion: $\gamma$ in equations (2) and (3). Conversely, our framework allows underlying unobserved factors to affect inputs and output differentially via the country-specific factor loadings $\lambda_i$.
12 These factors are conceptually similar to the state variables in the Mundlak model; they represent any variable or process that might affect both factor choice and TFP. The empirical implementation of our model differs from that of Mundlak. We allow the data to identify the different choices for the $\beta$ coefficients. The evolution of the factors is fairly general, including nonstationarity, and the setup provides for global shocks (strong factors) as well as local spillovers (weak factors). The productivity shock term $m_{0it}$ is accounted for by a fixed effect $\alpha_i$ ($m_{0i}$) and the common factor structure ($m_{0t} = \lambda f_t$).13 Finally, we allow for technology heterogeneity $\beta_i$ across countries and analyze whether parameter constancy holds over time ($\beta_{it} = \beta_i$). The parameter constancy tests will provide further insights into the ‘core technology’ by highlighting whether technology parameters are likely to be functions of unobservable processes (in our case, $f_t$; in the Mundlak, Butzer, and Larson [2012] notation, $s$). Our empirical implementation is focused on recent panel time series estimators that address nonstationarity, parameter heterogeneity, and cross-section dependence. The following section introduces these methods in more detail.

Empirical Implementation

Our empirical setup incorporates a large degree of flexibility concerning the impact of observable and unobservable inputs on output. Empirical implementation will necessarily lead to different degrees of restrictions on this flexibility, which will then be formally tested: the emphasis is on a comparison of different empirical estimators allowing for or restricting the heterogeneity in the observables and unobservables outlined above. The two-by-two matrix in table 1 indicates the assumptions that are implicit in the various estimators implemented below.14 For the estimators marked with stars, we confine the results to the supplemental appendix to save space.15

TABLE 1. Estimators and Assumptions about the Data Generating Process

                            Impact of Unobservables
Production Technology       COMMON                    IDIOSYNCRATIC
COMMON                      POLS, 2FE, GMM*, PMG*     CCEP, CPMG*
IDIOSYNCRATIC               MG, FDMG                  CMG

The panel time series econometric approach is given particular attention in this study for a number of reasons (for a detailed discussion, see Eberhardt and Teal, 2011a). First, we know that many macrovariables are potentially nonstationary (Nelson and Plosser, 1982; Granger, 1997; Pedroni, 2007), a property that cannot be rejected for the variables in our data (see supplemental appendix S1). When variables are nonstationary, standard regression output must be treated with extreme caution because results are potentially spurious. Provided variables (and unobserved processes) are cointegrated, however, we can establish long-run equilibrium relationships in the data. The practical indication of cointegration is when regressions yield stationary residuals, whereas nonstationary residuals indicate a potentially spurious regression. Panel time series estimators can address this concern over spurious regression, and below, we investigate the residuals of each empirical model using panel unit root tests. Second, panel time series methods allow for parameter heterogeneity across countries, which, as discussed above, is a central interest in our analysis. Third, panel time series methods can address the problems arising from cross-section correlation. Whether this is the result of common economic shocks or local spillover effects, cross-section correlation can potentially induce serious bias in the estimates because the impact assigned to an observed covariate in reality confounds its impact with that of the unobserved processes. Although the panel time series approach does not allow us to quantify their impact, common shocks and local spillovers can be accommodated in the empirical analysis to obtain unbiased technology coefficients for the observable inputs.
Below, we will employ diagnostic tests to analyze each model’s residuals for the presence or absence of cross-section dependence. We introduce the Common Correlated Effects (CCE) estimators developed in Pesaran (2006) and extended to nonstationary variables in Kapetanios, Pesaran, and Yamagata (2011) in some more detail because relatively few applied studies employ these estimators (e.g., Holly, Pesaran, and Yamagata, 2010; Moscone and Tosetti, 2010; Cavalcanti, Mohaddes, and Mehdi, 2011; Eberhardt, Helmers, and Strauss, forthcoming).16 The CCE estimators augment the regression equation with cross-section averages of the dependent ($\bar{y}_t$) and independent variables ($\bar{x}_t$) to account for the presence of unobserved common factors with heterogeneous impact. For the Mean Group version (CMG), the individual country regression is specified as

(7) $y_{it} = a_i + b_i' x_{it} + c_{0i} \bar{y}_t + \sum_{m=1}^{k} c_{mi} \bar{x}_{mt} + e_{it}$

In a second step, the parameter estimates $\hat{b}_i$ are averaged across countries similar to the practice in the Pesaran and Smith (1995) Mean Group (MG) estimator.17 The pooled version (CCEP) is specified as

(8) $y_{it} = a_i + b' x_{it} + \sum_{j=1}^{N} c_{0i} (\bar{y}_t D_j) + \sum_{m=1}^{k} \sum_{j=1}^{N} c_{mi} (\bar{x}_{mt} D_j) + e_{it}$

where $D_j$ represents country dummies.18 The CMG is thus a simple extension to the Pesaran and Smith (1995) MG estimator based on country-specific OLS regressions, whereas the CCEP is a standard fixed effects estimator augmented with additional regression terms. To obtain insight into the mechanics of this approach, consider the cross-section average of our model in equation (4). As the cross-section dimension $N$ increases, given $\bar{\varepsilon}_t = 0$, we obtain

(9) $\bar{y}_t = \bar{\alpha} + \bar{\beta}' \bar{x}_t + \bar{\lambda}' f_t \;\Leftrightarrow\; f_t = \bar{\lambda}^{-1} (\bar{y}_t - \bar{\alpha} - \bar{\beta}' \bar{x}_t)$

This simple derivation provides a powerful insight: working with the cross-sectional means of $y$ and $x$ can account for the impact of unobserved common factors (TFP) in the production process.
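The mechanics of the CMG regression in equation (7) can be sketched in a few lines of code. The example below is an illustrative sketch under strong simplifying assumptions, not the implementation used for the results in this paper: it uses a single regressor, a balanced panel, one common factor following a random walk with drift, and no idiosyncratic error in the output equation. The idiosyncratic input component is demeaned across countries in each period so that the cross-section average of x is an exact proxy for the factor, which in real data holds only approximately. All function names and parameter values are hypothetical.

```python
import random


def solve_linear_system(matrix, vector):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    size = len(vector)
    aug = [list(row) + [vector[r]] for r, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(size):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [aug[r][size] / aug[r][r] for r in range(size)]


def ols(design, outcome):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * y for row, y in zip(design, outcome)) for a in range(k)]
    return solve_linear_system(xtx, xty)


def mean_group_slope(y, x, augment):
    """Average of country-specific OLS slopes on x.

    augment=True adds the cross-section averages of y and x (CMG, eq. 7);
    augment=False gives the plain Mean Group estimator.
    """
    n, periods = len(y), len(y[0])
    y_bar = [sum(y[i][t] for i in range(n)) / n for t in range(periods)]
    x_bar = [sum(x[i][t] for i in range(n)) / n for t in range(periods)]
    slopes = []
    for i in range(n):
        design = [[1.0, x[i][t]] + ([y_bar[t], x_bar[t]] if augment else [])
                  for t in range(periods)]
        slopes.append(ols(design, y[i])[1])
    return sum(slopes) / n


def simulate_panel(n=10, periods=30, seed=0):
    """Hypothetical data: y = alpha_i + beta_i x + lambda_i f, x = phi_i f + noise."""
    rng = random.Random(seed)
    factor = [0.0]  # common factor: random walk with drift
    for _ in range(periods - 1):
        factor.append(factor[-1] + 0.1 + rng.gauss(0.0, 1.0))
    alpha = [rng.gauss(0.0, 1.0) for _ in range(n)]
    beta = [rng.uniform(0.3, 0.7) for _ in range(n)]
    lam = [rng.uniform(0.5, 1.5) for _ in range(n)]
    phi = [rng.uniform(0.5, 1.5) for _ in range(n)]
    noise = [[rng.gauss(0.0, 1.0) for _ in range(periods)] for _ in range(n)]
    for t in range(periods):  # demean the idiosyncratic input component across countries
        col_mean = sum(noise[i][t] for i in range(n)) / n
        for i in range(n):
            noise[i][t] -= col_mean
    x = [[phi[i] * factor[t] + noise[i][t] for t in range(periods)] for i in range(n)]
    y = [[alpha[i] + beta[i] * x[i][t] + lam[i] * factor[t] for t in range(periods)]
         for i in range(n)]
    return y, x, sum(beta) / n
```

In this noise-free setting, the augmented regressions recover the average slope up to numerical precision, whereas the unaugmented MG regressions attribute part of the factor's effect to the observed input.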
19 Given the assumed heterogeneity in the impact of unobserved factors across countries ($\lambda_i$), the estimator is implemented in the manner detailed above, which allows for each country $i$ to have different parameter estimates for $\bar{y}_t$ and the $\bar{x}_t$ and, thus, implicitly for $f_t$. Simulation studies (Pesaran, 2006; Coakley, Fuertes, and Smith, 2006; Kapetanios, Pesaran, and Yamagata, 2011; Pesaran and Tosetti, 2011) have shown that this approach works well even when the cross-section dimension $N$ is small, when variables are nonstationary, cointegrated, or not integrated, in the presence of local spillovers and global/local business cycles and when the relationship is subject to structural breaks.20 In the present study, we implement two versions of the CCE estimators in the sector-level regressions: estimators in a standard form as described above and estimators in a variant form that includes the cross-section averages of the input and output variables from both sectors. This variant specification allows for cross-section dependence across sectors, albeit at the cost of a reduction in degrees of freedom. It is conceivable that the evolution of the agricultural sector in developing countries influences that of the wider economy in general and the manufacturing sector in particular, such that this extension is sensible in the dual economy context. This completes our discussion of the empirical implementation within each sector/level of aggregation. We highlight the direct link between the issues that these estimators seek to address and the problem of identifying the technology parameters of interest raised in the previous section. Heterogeneity in the impact of observables and unobservables across countries can be directly interpreted as differences in the production technology and a differential TFP evolution across countries. The above discussion suggests that, from an economic theory standpoint, there are reasons to prefer a more flexible empirical approach.
Empirically, however, we do not impose this more flexible approach on our data. We compare models with differing degrees of parameter heterogeneity and use established econometric diagnostics (tests for residual stationarity and cross-section independence) to identify the models that are rejected and those that are supported by the data.

Data

Descriptive statistics and a more detailed discussion of the data can be found in the appendix. We conduct our empirical analysis with four datasets: (i) for the agricultural sector, building on the sectoral investment series collected by Crego, Larson, Butzer, and Mundlak (1998) and output from the WDI (World Bank, 2008) as well as sectoral labor and land data from FAO (2007); (ii) for the manufacturing sector, building on the sectoral investment series collected by Crego, Larson, Butzer, and Mundlak (1998), output data from the WDI, and labor data from UNIDO (2004); (iii) for a stylized aggregate economy made up of the aggregated data for the agriculture and manufacturing sectors;21 (iv) for the aggregate economy, building on data provided by the PWT (we use version 6.2, Heston, Summers, and Aten 2006). The capital stocks in the agriculture, manufacturing, and PWT samples are constructed from investment series following the perpetual inventory method (see Klenow and Rodriguez-Clare, 1997b). For the aggregated sample, we simply added up the sectoral capital stocks. A comparison across sectors and with the stylized aggregate sector is possible because of the efforts by Crego, Larson, Butzer, and Mundlak (1998) in providing sectoral investment data for agriculture and manufacturing. All monetary values in the sectoral and stylized aggregated datasets are transformed into US dollar values for the year 1990 (in the capital stock case, this transformation is applied to the investment data), following Martin and Mitra (2002).
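For readers unfamiliar with the perpetual inventory method, a minimal sketch follows. The depreciation rate, the steady-state rule used to initialize the stock, and the timing convention (current investment enters the current stock) are assumptions chosen for illustration; they are not necessarily the values or conventions used to construct the data in this paper, and the function name is hypothetical.

```python
def perpetual_inventory(investment, depreciation=0.05, initial_growth=0.03):
    """Build a capital stock series from an investment series.

    Assumed initialization: K_0 = I_0 / (g + delta), the steady-state stock
    implied by investment growing at rate g. Assumed law of motion:
    K_t = (1 - delta) * K_{t-1} + I_t.
    """
    if not investment:
        raise ValueError("investment series must not be empty")
    stock = investment[0] / (initial_growth + depreciation)
    stocks = [stock]
    for flow in investment[1:]:
        # Depreciate last period's stock, then add this period's investment.
        stock = (1.0 - depreciation) * stock + flow
        stocks.append(stock)
    return stocks


# Hypothetical investment series in constant prices.
capital = perpetual_inventory([8.0, 8.0, 8.0], depreciation=0.05, initial_growth=0.03)
```

Because the initial stock is only a guess, its influence fades at rate (1 - delta) per period; longer investment series therefore make the later capital stock estimates less sensitive to the initialization.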
In light of concerns that the stylized aggregate economy data might not offer a sound representation of true aggregate economy data, we have adopted the PWT data, which measure monetary values in international dollars (purchasing power parity adjusted), as a benchmark for comparison. Despite a number of vocal critics (e.g., Johnson, Larson, Papageorgiou, and Subramanian, 2009), the PWT data are undoubtedly the most popular macrodataset for cross-country empirical analysis.22 Our sample is an unbalanced panel23 for 1963 to 1992, consisting of 40 developing and developed countries with a total of 918 observations (average T=23). Our aim is to compare estimates across the four datasets, which requires us to match the same sample, thus reducing the number of observations to the smallest common denominator. Only eight countries in our sample are in Africa, whereas approximately half are present-day ‘industrialized economies.’ However, these numbers are deceiving if one recalls that structural change and development in many of these industrialized economies has primarily been achieved during our period of study. For example, prior to 1964, GDP per capita was higher in Ghana than in South Korea. In 1970, the share of agricultural value-added in GDP for Finland, Ireland, Portugal, and South Korea amounted to 13 percent, 16 percent, 31 percent, and 26 percent, respectively, whereas the 1992 figures were 5 percent, 8 percent, 7 percent, and 8 percent. This is strong evidence of economies undergoing structural change. A detailed description of our sample is available in table A1 and descriptive statistics for each sample are provided in table A2.

EMPIRICAL RESULTS

Panel unit root and cross-section dependence tests for our data are available in the supplemental appendix (S1, S2) of the paper. We adopt the Pesaran (2007) CIPS panel unit root test to analyze the time series properties of each variable series.
The results provide strong indication that variables in log levels for the agriculture and manufacturing data as well as the two aggregate economy representations are nonstationary. A number of formal and informal tests were conducted to investigate cross-section correlation in the data. The results (see supplemental appendix S2) show very high average absolute correlation coefficients for the data in log levels and in the data represented as growth rates. Formal tests for cross-section dependence (Pesaran, 2004; Moscone and Tosetti, 2009) reject cross-section independence in virtually all variable series tested.
Below, we discuss the empirical results from sectoral production function regressions for agriculture and manufacturing, first assuming technology parameter homogeneity and then allowing for differential technology across countries. For all regression models, we report residual diagnostic tests, including the Pesaran (2007) panel unit root test (we summarize results using I(0) for stationary residuals, I(1) for nonstationary residuals, and I(1)/I(0) for ambiguous results), and the Pesaran (2004) cross-section dependence (CD) test (H0: cross-section independence), which we use to build our judgment for a preferred empirical model. Residual nonstationarity invalidates the inferential tools (for example, t-statistics) employed (Kao, 1999) and indicates that regression results are potentially spurious. In the same way that serial dependence indicates dynamic misspecification, residual cross-section dependence violates the assumption that the error terms are independent and identically distributed (iid). This suggests that the specific model tested fails to adequately address the correlation of inputs, output, and unobservables across different countries, induced by, for example, common shocks or local spillover effects. 24
Note that our empirical regressions express all variables in per-worker terms (in logs).
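To see what the per-worker setup implies for the coefficient on log labor, the following sketch starts from an unrestricted Cobb-Douglas production function. The functional form, the generic productivity term $A_{it}$, and the input symbols $L$ (labor), $K$ (capital), and $N$ (land, agriculture only) are assumptions used for illustration, chosen to match the coefficient labels in the tables.

$$\ln Y_{it} = \beta_L \ln L_{it} + \beta_K \ln K_{it} + \beta_N \ln N_{it} + \ln A_{it}$$

Subtracting $\ln L_{it}$ from both sides and collecting terms gives the per-worker form:

$$\ln\frac{Y_{it}}{L_{it}} = \beta_K \ln\frac{K_{it}}{L_{it}} + \beta_N \ln\frac{N_{it}}{L_{it}} + (\beta_L + \beta_K + \beta_N - 1)\ln L_{it} + \ln A_{it}$$

The coefficient on $\ln L_{it}$ is zero under constant returns, and the implied labor elasticity equals that coefficient plus $1 - \beta_K - \beta_N$. Using only the significant estimates in column [1] of table 2, for instance, $-0.060 + 1 - 0.618 = 0.322$, which matches the implied value reported there.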
The inclusion of the log labor variable therefore indicates the deviation from constant returns to scale (i.e., $\hat{\beta}_L + \hat{\beta}_K + \hat{\beta}_N - 1$): a positive (negative) significant coefficient on log labor indicates increasing (decreasing) returns; an insignificant coefficient indicates constant returns. The coefficient on labor in the regression is thus not the output elasticity with respect to labor, which we also report in a lower panel of each table (‘Implied $\hat{\beta}_L$’) 25 along with the returns to scale (‘Implied RS’). This setup allows for an easy imposition of constant returns (CRS) by dropping the log labor variable from the model. In each table, Panel (A) shows results with no restrictions on returns to scale, whereas Panel (B) imposes CRS.
<>Pooled Models
Table 2 presents the empirical results for agriculture and manufacturing. Beginning with agriculture, the empirical estimates for models [1] and [2] neglecting cross-section dependence are quite similar, with the capital coefficient of about .63 and statistically significant decreasing returns to scale. The land coefficient is insignificant in all pooled specifications, except in the 2FE model, where it carries a negative sign. Diagnostic tests indicate that the residuals in these models are cross-sectionally dependent and that the standard POLS and 2FE models yield nonstationary residuals and, thus, might represent spurious regressions. The two CCEP models yield stationary and cross-sectionally independent residuals, capital coefficients of approximately .5 and insignificant land coefficients. There is no substantial change in these results when CRS (Panel (B)) is imposed, with the exception of the 2FE estimates, where the land variable (previously negative and significant) is now insignificant and the capital coefficient has become further inflated. Land is still insignificant, but in models [3] and [4] it now has a plausible coefficient estimate.
TABLE 2.
Pooled Regression Models for Agriculture and Manufacturing

PANEL (A) UNRESTRICTED RETURNS TO SCALE
Sector | Agriculture | Agriculture | Agriculture | Agriculture | Manufacturing | Manufacturing | Manufacturing | Manufacturing
Model | [1] | [2] | [3] | [4] | [5] | [6] | [7] | [8]
Estimator | POLS | 2FE | CCEP | CCEPb | POLS | 2FE | CCEP | CCEPb
log labor [βL + βK (+ βN) − 1] | −0.060 | −0.199 | −0.266 | −0.142 | 0.043 | 0.081 | 0.082 | 0.002
t-statistic | [7.20]** | [9.60]** | [2.13]* | [0.55] | [3.53]** | [4.35]** | [1.53] | [0.03]
log capital pw [βK] | 0.618 | 0.661 | 0.480 | 0.531 | 0.897 | 0.845 | 0.472 | 0.469
t-statistic | [73.80]** | [43.62]** | [9.87]** | [5.92]** | [55.38]** | [32.69]** | [7.62]** | [5.34]**
log land pw [βN] | 0.011 | −0.160 | −0.165 | 0.052 | | | |
t-statistic | [1.02] | [4.93]** | [0.98] | [0.20] | | | |
implied RS † | DRS | DRS | DRS | CRS | IRS | IRS | CRS | CRS
implied βL ‡ | 0.322 | 0.300 | 0.254 | 0.469 | 0.147 | 0.236 | 0.528 | 0.532
ê integrated ◊ | I(1) | I(1) | I(0) | I(0) | I(1) | I(1) | I(0) | I(0)
CD test p-value # | 0.00 | 0.00 | 0.45 | 0.38 | 0.19 | 0.34 | 0.00 | 0.93
R-squared | 0.94 | 0.86 | 1.00 | 1.00 | 0.84 | 0.67 | 1.00 | 1.00
RMSE | 0.446 | 0.127 | 0.095 | 0.086 | 0.439 | 0.128 | 0.090 | 0.066

PANEL (B) CONSTANT RETURNS TO SCALE IMPOSED
Sector | Agriculture | Agriculture | Agriculture | Agriculture | Manufacturing | Manufacturing | Manufacturing | Manufacturing
Model | [1] | [2] | [3] | [4] | [5] | [6] | [7] | [8]
Estimator | POLS | 2FE | CCEP | CCEPb | POLS | 2FE | CCEP | CCEPb
log capital pw [βK] | 0.644 | 0.725 | 0.496 | 0.526 | 0.919 | 0.860 | 0.490 | 0.500
t-statistic | [85.46]** | [48.87]** | [11.22]** | [6.70]** | [70.80]** | [34.01]** | [13.55]** | [8.38]**
log land pw [βN] | 0.008 | −0.007 | 0.092 | 0.126 | | | |
t-statistic | [0.66] | [0.20] | [1.24] | [1.02] | | | |
implied βL ‡ | 0.356 | 0.275 | 0.504 | 0.474 | 0.081 | 0.140 | 0.510 | 0.500
ê integrated ◊ | I(1) | I(0)/I(1) | I(0) | I(0) | I(1) | I(1) | I(0) | I(0)
CD test p-value # | 0.00 | 0.00 | 0.87 | 0.52 | 0.02 | 0.00 | 0.00 | 0.00
R-squared | 0.94 | 0.85 | 1.00 | 1.00 | 0.84 | 0.66 | 1.00 | 1.00
RMSE | 0.457 | 0.132 | 0.098 | 0.089 | 0.444 | 0.129 | 0.094 | 0.074

Source: Authors’ analysis based on data sources discussed in the text.
Note: N = 40 countries, 918 observations, average T = 23. Dependent variable: value-added per worker (in logs). All variables are suitably transformed in the 2FE equation. Estimators: POLS, pooled OLS; 2FE, Two-way Fixed Effects; CCEP, Common Correlated Effects, Pooled version (see below). We omit reporting the estimates on the intercept term. Absolute t-statistics reported in brackets are constructed using White heteroskedasticity-robust standard errors. For CCEP in [3], [4], [7], and [8], we report results on the basis of bootstrapped standard errors (100 replications). Time dummies are included explicitly in [1] and [5] or implicitly in [2] and [6]. Augmentation with cross-section averages in [3], [4], [7], and [8] (estimates not reported).
b The model includes cross-section averages for both the agricultural and manufacturing sector variables.
† Returns to scale are based on the significance of the log labor estimate.
‡ Based on returns to scale and significant parameter estimates; see the main text.
◊ Order of integration of regression residuals is determined using Pesaran (2007) CIPS (full results available on request), H0: nonstationary residuals.
# Pesaran (2004) CD-test, H0: cross-sectionally independent residuals.
RMSE: root mean squared error.
* significant at the 5 percent level, ** significant at the 1 percent level

In the manufacturing data, the models ignoring cross-section dependence in [5] and [6] yield increasing returns to scale and capital coefficients in excess of .85. Residuals again display nonstationarity; however, the CD tests now imply that they are cross-sectionally independent. Surprisingly, the standard CCEP model in [7], with a capital coefficient of approximately .5 (as in agriculture data), does not pass the cross-section correlation test. However, further accounting for correlations across sectors in [8] yields favorable diagnostics and a similar capital coefficient. Following the imposition of CRS, all models reject cross-section independence, whereas parameter estimates are more or less identical to those in the unrestricted models.
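The CD test p-values reported in table 2 come from the Pesaran (2004) statistic. The sketch below shows how that statistic is computed from a matrix of regression residuals. It assumes a balanced panel for simplicity (the paper's panel is unbalanced), and the simulated residuals and function name are purely illustrative.

```python
import math

import numpy as np


def pesaran_cd(residuals):
    """Pesaran (2004) CD statistic for a balanced (N, T) residual array.

    CD = sqrt(2T / (N(N-1))) * sum over i<j of the pairwise residual
    correlations. Under H0 (cross-section independence) CD is
    approximately standard normal.
    """
    e = np.asarray(residuals, dtype=float)
    n_units, n_periods = e.shape
    corr = np.corrcoef(e)                           # N x N correlation matrix
    pairwise = corr[np.triu_indices(n_units, k=1)]  # each country pair once
    scale = math.sqrt(2.0 * n_periods / (n_units * (n_units - 1)))
    cd = float(scale * pairwise.sum())
    p_value = math.erfc(abs(cd) / math.sqrt(2.0))   # two-sided normal p-value
    return cd, p_value


rng = np.random.default_rng(0)
independent = rng.standard_normal((40, 23))  # N = 40, T = 23, simulated
common_shock = rng.standard_normal(23)       # one shock hitting every country
dependent = independent + common_shock       # broadcast across countries

cd_indep, p_indep = pesaran_cd(independent)
cd_dep, p_dep = pesaran_cd(dependent)
print(f"independent residuals: CD = {cd_indep:.2f}, p = {p_indep:.3f}")
print(f"common-shock residuals: CD = {cd_dep:.2f}, p = {p_dep:.3f}")
```

A small p-value rejects cross-section independence, which is how the 0.00 entries in the table should be read.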
Based on these pooled regression results, the diagnostic tests (stationary and cross-section independent residuals) favor the CRS CCEP results in [3] and [4] for the agriculture data, whereas in the manufacturing data the unrestricted CCEP model in [8], which accounts for cross-sectoral impact, emerges as the preferred specification. Results for the other empirical models cannot be readily interpreted in the standard manner because of the presence of nonstationary and/or correlated residuals. 26
In sum, relying on diagnostic testing, the alternative CCEP estimator emerges as the preferred estimator for both the agriculture and manufacturing samples. For agriculture, the imposition of CRS seems valid, whereas for manufacturing, the data reject this restriction. Across preferred specifications, the mean capital coefficients for agriculture and manufacturing are quite similar, approximately .5. Our shift to heterogeneous technology models, discussed in the next section, will allow us to determine whether these results are representative of the underlying technology. Although the CCEP imposes common technology coefficients, theory and simulations (Pesaran, 2006) have shown that if technology differs, results reflect the mean coefficient across countries. However, outliers might exert undue influence on this mean. Therefore, our heterogeneous parameter models account for this possibility and report outlier-robust average coefficients. 27
<>Averaged Country Regressions
Table 3 presents the robust means for each regressor across N country regressions for the unrestricted (Panel (A)) and CRS models (Panel (B)), respectively. The t-statistics reported for each average estimate test whether the average parameter is statistically different from zero, following Pesaran and Smith (1995).
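The mechanics of averaging country regressions can be sketched as follows. The code runs one OLS regression per country, averages the slopes, and computes a t-statistic from the cross-country dispersion of the estimates, in the spirit of Pesaran and Smith (1995). Setting `cce=True` adds cross-section averages of the dependent variable and regressors to each country regression, which is the basic idea behind the CMG estimator. Several simplifications are assumptions of this sketch: a balanced panel, plain (not outlier-robust) means, no country trends, and simulated data with hypothetical parameter values.

```python
import numpy as np


def country_slopes(y, regressors, n_slopes):
    """OLS of y on an intercept and regressors; return the first n_slopes slopes."""
    design = np.column_stack([np.ones(len(y)), regressors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:1 + n_slopes]


def mean_group(y, x, cce=False):
    """y: (N, T) array, x: (N, T, k) array, balanced panel.

    cce=False: Mean Group (MG).
    cce=True:  each country regression is augmented with the cross-section
               averages of y and x (Common Correlated Effects Mean Group).
    Returns the average slopes and their t-statistics.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n_units, _, k = x.shape
    y_bar = y.mean(axis=0)   # (T,)   cross-section average of y
    x_bar = x.mean(axis=0)   # (T, k) cross-section averages of x
    slopes = np.empty((n_units, k))
    for i in range(n_units):
        regressors = np.column_stack([x[i], y_bar, x_bar]) if cce else x[i]
        slopes[i] = country_slopes(y[i], regressors, k)
    mean = slopes.mean(axis=0)
    # Nonparametric standard error from the dispersion of country estimates.
    se = slopes.std(axis=0, ddof=1) / np.sqrt(n_units)
    return mean, mean / se


# Simulated panel: an unobserved common factor drives both x and y.
rng = np.random.default_rng(42)
n_units, n_periods = 40, 30
factor = rng.standard_normal(n_periods)
beta = 0.5 + rng.uniform(-0.1, 0.1, n_units)   # heterogeneous slopes, mean 0.5
lam = rng.uniform(0.5, 1.5, n_units)           # factor loadings in y
gam = rng.uniform(0.5, 1.5, n_units)           # factor loadings in x
x = gam[:, None] * factor + rng.standard_normal((n_units, n_periods))
noise = 0.1 * rng.standard_normal((n_units, n_periods))
y = beta[:, None] * x + lam[:, None] * factor + noise

mg_mean, mg_t = mean_group(y, x[:, :, None])
cmg_mean, cmg_t = mean_group(y, x[:, :, None], cce=True)
print(f"MG:  {mg_mean[0]:.3f}   CMG: {cmg_mean[0]:.3f}   (true mean slope 0.5)")
```

In this simulated setting, the plain MG average absorbs the omitted common factor, whereas the augmented regressions recover an average slope close to the true mean.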
In addition, we report the share of countries for which the country results rejected CRS as well as the share of countries for which linear country trends are statistically significant (at the 10 percent level).

TABLE 3. Heterogeneous Parameter Models for Agriculture and Manufacturing (Robust Means)

PANEL (A) UNRESTRICTED RETURNS TO SCALE
Sector | Agriculture | Agriculture | Agriculture | Agriculture | Manufacturing | Manufacturing | Manufacturing | Manufacturing
Model | [1] | [2] | [3] | [4] | [5] | [6] | [7] | [8]
Estimator | MG | FDMG | CMG | CMGb | MG | FDMG | CMG | CMGb
log labor [βL + βK (+ βN) − 1] | −1.935 | −0.474 | −0.682 | −0.068 | −0.132 | −0.127 | 0.069 | 0.003
t-statistic | [2.43]* | [0.53] | [1.05] | [0.08] | [0.92] | [1.15] | [0.78] | [0.03]
log capital pw [βK] | −0.084 | 0.133 | 0.496 | 0.360 | 0.195 | 0.179 | 0.525 | 0.284
t-statistic | [0.42] | [0.58] | [2.25]* | [1.37] | [1.32] | [1.12] | [6.46]** | [3.35]**
log land pw [βN] | −0.430 | −0.269 | −0.445 | −0.129 | | | |
t-statistic | [1.46] | [0.96] | [1.44] | [0.50] | | | |
implied RS † | DRS | CRS | CRS | CRS | CRS | CRS | CRS | CRS
implied βL ‡ | n/a | n/a | 0.504 | n/a | n/a | n/a | 0.475 | 0.717
reject CRS (10%) | 0.38 | 0.20 | 0.23 | 0.23 | 0.50 | 0.13 | 0.38 | 0.25
ê integrated ◊ | I(0) | I(0) | I(0) | I(0) | I(0) | I(0) | I(0) | I(0)
CD test p-value # | 0.00 | 0.00 | 0.49 | 0.75 | 0.00 | 0.00 | 0.02 | 0.18
RMSE | 0.081 | 0.094 | 0.069 | 0.059 | 0.080 | 0.077 | 0.068 | 0.047
Observations | 918 | 872 | 918 | 918 | 918 | 872 | 918 | 918
country trend/drift (reported for four of the eight models, in column order): 0.015 [1.55], 0.010 [1.06], 0.015 [2.70]**, 0.018 [3.31]**
sign. trends/drifts (10%) (same four models): 0.40, 0.18, 0.40, 0.20

PANEL (B) CONSTANT RETURNS TO SCALE IMPOSED
Sector | Agriculture | Agriculture | Agriculture | Agriculture | Manufacturing | Manufacturing | Manufacturing | Manufacturing
Model | [1] | [2] | [3] | [4] | [5] | [6] | [7] | [8]
Estimator | MG | FDMG | CMG | CMGb | MG | FDMG | CMG | CMGb
log capital pw [βK] | −0.050 | 0.300 | 0.538 | 0.620 | 0.291 | 0.346 | 0.509 | 0.413
t-statistic | [0.29] | [2.22]* | [4.55]** | [2.98]** | [2.60]** | [3.64]** | [6.19]** | [6.37]**
log land pw [βN] | 0.260 | 0.031 | 0.082 | 0.073 | | | |
t-statistic | [1.03] | [0.20] | [0.47] | [0.38] | | | |
implied βL ‡ | n/a | 0.700 | 0.462 | 0.380 | 0.709 | 0.654 | 0.491 | 0.588
ê integrated ◊ | I(0) | I(0) | I(0) | I(0) | I(0) | I(0) | I(0) | I(0)
CD test p-value # | 0.00 | 0.00 | 0.93 | 0.73 | 0.00 | 0.00 | 0.00 | 0.00
RMSE | 0.087 | 0.096 | 0.076 | 0.068 | 0.088 | 0.078 | 0.080 | 0.059
Observations | 918 | 872 | 918 | 918 | 918 | 872 | 918 | 918
country trend/drift (reported for four of the eight models, in column order): 0.016 [2.71]**, 0.014 [3.09]**, 0.012 [2.72]**, 0.013 [3.61]**
sign. trends/drifts (10%) (same four models): 0.45, 0.13, 0.55, 0.23

Source: Authors’ analysis based on data sources discussed in the text.
Note: N = 40 countries, average T = 23 (21.8 for FDMG). Dependent variable: value-added per worker (in logs). All variables are suitably transformed in the FD equations. Estimators: MG, Mean Group; FDMG, MG with variables in first difference; CMG, Common Correlated Effects, Mean Group version. We report outlier-robust means; estimates on intercept terms are omitted. Absolute t-statistics are in brackets following Pesaran and Smith (1995). Estimates on cross-section averages in [3], [4], [7], and [8] are not reported.
b The model includes cross-section averages for both the agricultural and manufacturing sector variables.
† Returns to scale are based on the significance of the log labor estimate.
‡ Based on returns to scale and significant parameter estimates; see the main text.
‘reject CRS’ and ‘sign. trends/drifts’ report the share of countries where CRS is rejected and where country trends/drifts are statistically significant (in both cases, applying a 10 percent level of significance).
◊ Order of integration of regression residuals, determined using Pesaran (2007) CIPS (full results available on request), H0: nonstationary residuals.
# Pesaran (2004) CD-test, H0: cross-sectionally independent residuals.
RMSE: root mean squared error.
* significant at the 5 percent level, ** significant at the 1 percent level

Beginning with the unrestricted models in Panel (A), we observe that MG and FDMG estimates for the agriculture and manufacturing equations are very imprecise. Furthermore, in the agriculture model, MG yields decreasing returns to scale that are nonsensical in magnitude.
Simulations for nonstationary and cross-sectionally dependent data (Coakley, Fuertes and Smith, 2006; Bond and Eberhardt, 2009) show that MG estimates are severely affected by their failure to account for cross-section dependence, and this is the likely cause of these results. The standard CMG models in agriculture and manufacturing yield similar capital coefficients of approximately .5, whereas the alternative CMG results provide somewhat lower estimates, approximately .3 (these models allow for agriculture sectors to influence manufacturing sectors and vice versa). Diagnostics are sound in the case of the two CMG results in agriculture, but only for the alternative CMG estimator in manufacturing (cross-sectionally dependent residuals in model [7]).
Panel (B) shows how the imposition of constant returns affects the results: MG and FDMG in both sectors are generally more sensible, but the diagnostic tests suggest cross-section correlation in the residuals that might indicate serious misspecification. The two CMG estimates for agriculture are now more similar. Land coefficients are still insignificant, but positive. Manufacturing results for the standard CMG remain virtually unchanged from the unrestricted model; however, diagnostic tests still indicate cross-sectionally dependent residuals. The same caveat applies to the alternative CMG for manufacturing.
In sum, the diagnostic tests support the use of the CRS versions of the CMG estimators for agricultural data and the unrestricted returns to scale version of the ‘alternative’ CMG estimator for the manufacturing data. These preferred models suggest that average technology differs across sectors, with a manufacturing capital coefficient of approximately .3 and an agriculture capital coefficient of approximately .5. 28
The results for the land coefficient, where our preferred estimates indicate a positive, albeit statistically insignificant, average coefficient, warrant additional comment.
Given the relative persistence of the area under cultivation, the short time series dimension of the data might be responsible for this outcome. Any form of land quality adjustment would require time-varying information on land quality, which is not available at an annual rate over a long time horizon. 29 Time-invariant adjustments are accounted for by the country-specific intercepts. Because of the aim of our study, we do not put too much emphasis on providing the best estimate for the ‘true’ sectoral technology coefficients. Instead, we highlight the discrepancy between these sectoral results and the results obtained when analyzing aggregate economy data.
<>AGGREGATION VERSUS HETEROGENEITY
In this section, we provide practical evidence that the use of an aggregate production function will lead to severely biased technology estimates. We then provide some insights into the nature of technology heterogeneity across sectors and countries.
<>Aggregation Bias: Empirical Evidence
To investigate the impact of aggregation across heterogeneous sectors with technology furthermore differing across countries, we create a stylized ‘aggregated economy’ from our data on agriculture and manufacturing. To avoid the suggestion that our results might be critically distorted by this overly simplistic design, we compare them with those obtained from a matched sample of aggregate economy data from the PWT. Pre-estimation testing reveals that both datasets utilized in this section consist of nonstationary series that are cross-sectionally correlated; the results are provided in the supplemental appendix (S1, S2). 30
We begin our discussion with the results for the pooled models in table 4. Across all specifications, the estimated capital coefficients in the stylized aggregated data far exceed those derived from the respective agriculture and manufacturing samples in table 2.
Furthermore, the patterns across estimators are replicated one-to-one in the PWT data, which also yield excessively high capital coefficients across all models. All models suffer from cross-sectional dependence in the residuals. There are also indications that the residuals in the CCEP model for the aggregated data are nonstationary (those in the two other specifications in levels are always nonstationary). We also investigate the impact of human capital (via a proxy variable, average years of schooling attained in the population over 15 years of age) in these aggregate economy data models, but as the results in the supplemental appendix (S4) reveal, the basic bias remains.

TABLE 4. Pooled Regression Models for Aggregated and PWT Data

PANEL (A) UNRESTRICTED RETURNS TO SCALE
Data | Aggregated | Aggregated | Aggregated | PWT | PWT | PWT
Model | [1] | [2] | [3] | [4] | [5] | [6]
Estimator | POLS | 2FE | CCEP | POLS | 2FE | CCEP
log labor [βL + βK (+ βN) − 1] | 0.010 | −0.082 | −0.054 | 0.035 | −0.131 | −0.097
t-statistic | [1.32] | [3.75]** | [0.78] | [7.57]** | [4.57]** | [0.76]
log capital pw [βK] | 0.828 | 0.798 | 0.657 | 0.742 | 0.704 | 0.631
t-statistic | [107.55]** | [66.20]** | [19.43]** | [113.76]** | [51.43]** | [13.71]**
implied RS † | CRS | DRS | CRS | IRS | DRS | CRS
implied βL ‡ | 0.172 | 0.120 | 0.343 | 0.293 | 0.165 | 0.369
ê integrated ◊ | I(1) | I(1) | I(0)/I(1) | I(1) | I(1) | I(0)
CD test p-value # | 0.40 | 0.00 | 0.04 | 0.10 | 0.00 | 0.00
R-squared | 0.96 | 0.89 | 1.00 | 0.96 | 0.82 | 1.00
RMSE | 0.358 | 0.109 | 0.078 | 0.195 | 0.095 | 0.061
observations | 918 | 918 | 918 | 912 | 912 | 912

PANEL (B) CONSTANT RETURNS TO SCALE IMPOSED
Data | Aggregated | Aggregated | Aggregated | PWT | PWT | PWT
Model | [1] | [2] | [3] | [4] | [5] | [6]
Estimator | POLS | 2FE | CCEP | POLS | 2FE | CCEP
log capital pw [βK] | 0.825 | 0.824 | 0.666 | 0.730 | 0.745 | 0.651
t-statistic | [120.48]** | [73.01]** | [20.85]** | [130.30]** | [63.41]** | [19.33]**
implied βL ‡ | 0.175 | 0.176 | 0.334 | 0.270 | 0.255 | 0.349
ê integrated ◊ | I(1) | I(1) | I(0)/I(1) | I(1) | I(1) | I(0)
CD test p-value # | 0.31 | 0.30 | 0.06 | 0.00 | 0.00 | 0.00
R-squared | 0.96 | 0.88 | 1.00 | 0.96 | 0.82 | 1.00
RMSE | 0.358 | 0.109 | 0.086 | 0.202 | 0.097 | 0.069
observations | 918 | 918 | 918 | 912 | 912 | 912

Source: Authors’ analysis based on data sources discussed in the text.
Note: See table 2 for definitions and further details on diagnostic testing.
* significant at the 5 percent level, ** significant at the 1 percent level

TABLE 5. Heterogeneous Parameter Models for Aggregated and PWT Data (Robust Means)

PANEL (A) UNRESTRICTED RETURNS TO SCALE
Data | Aggregated | Aggregated | Aggregated | PWT | PWT | PWT
Model | [1] | [2] | [3] | [4] | [5] | [6]
Estimator | MG | FDMG | CMG | MG | FDMG | CMG
log labor [βL + βK (+ βN) − 1] | −0.154 | −0.079 | 0.117 | −1.152 | −1.681 | −0.389
t-statistic | [0.36] | [0.25] | [0.62] | [1.23] | [2.28]* | [1.03]
log capital pw [βK] | 0.220 | 0.297 | 0.609 | 0.655 | 1.004 | 0.753
t-statistic | [1.17] | [1.66] | [6.11]** | [4.22]** | [5.38]** | [5.26]**
implied RS † | CRS | CRS | CRS | CRS | DRS | CRS
implied βL ‡ | n/a | n/a | 0.391 | 0.345 | n/a | 0.247
reject CRS (10%) | 0.60 | 0.23 | 0.38 | 0.68 | 0.33 | 0.53
ê integrated ◊ | I(0) | I(0) | I(0) | I(0) | I(0) | I(0)
CD test p-value # | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.16
RMSE | 0.081 | 0.094 | 0.051 | 0.080 | 0.077 | 0.041
observations | 918 | 872 | 918 | 918 | 872 | 918
country trend/drift (reported for four of the six models, in column order): 0.025 [2.73]**, 0.020 [2.42]*, 0.010 [0.90], −0.010 [1.88]
sign. trends/drifts (10%) (same four models): 0.55, 0.33, 0.43, 0.18

PANEL (B) CONSTANT RETURNS TO SCALE IMPOSED
Data | Aggregated | Aggregated | Aggregated | PWT | PWT | PWT
Model | [1] | [2] | [3] | [4] | [5] | [6]
Estimator | MG | FDMG | CMG | MG | FDMG | CMG
log capital pw [βK] | 0.293 | 0.202 | 0.725 | 0.619 | 0.923 | 0.811
t-statistic | [1.92] | [1.90] | [10.95]** | [6.36]** | [6.01]** | [12.09]**
implied βL ‡ | n/a | n/a | 0.275 | 0.381 | 0.077 | 0.189
ê integrated ◊ | I(0) | I(0) | I(0) | I(0) | I(0) | I(0)
CD test p-value # | 0.00 | 0.00 | 0.05 | 0.00 | 0.00 | 0.00
RMSE | 0.074 | 0.064 | 0.067 | 0.061 | 0.044 | 0.059
observations | 918 | 872 | 918 | 912 | 866 | 912
country trend/drift (three values reported, in column order): 0.014 [2.93]**, 0.002 [0.50], −0.007 [1.97]*
sign. trends/drifts (10%) (four values reported, in column order): 0.48, 0.28, 0.48, 0.25

Source: Authors’ analysis based on data sources discussed in the text.
Note: See table 3 for definitions and further details on diagnostic testing.
* significant at the 5 percent level, ** significant at the 1 percent level

In the results from averaged country regressions in table 5, the MG and FDMG models indicate differences between the aggregated and PWT data.
The capital coefficients in the aggregated data are estimated very imprecisely but seem to center at approximately .3, whereas in the PWT data, they are considerably higher, approximately .7 to .9. The results for the conceptually superior CMG, however, are very consistent between the two samples and across unrestricted and CRS models, with capital coefficients of approximately .7. Residual testing suggests that all specifications yield stationary residuals. Cross-section correlation tests reject independence in all but the PWT data unrestricted CMG residual series.
For ease of comparison, table 6 provides an overview of the preferred empirical results at the sectoral and aggregate data level, assuming common technology (top panel) or technology differences across countries (bottom panel). 31
Thus, across a large number of empirical specifications, we have found a systematic difference between the results for the sectoral data, on the one hand, and the results for the stylized aggregated and aggregate economy data, on the other hand. Theoretical work by Hsiao, Shen, and Fujiki (2005) provides insight into potential causes of this phenomenon. These authors find that if variable series are nonstationary and cointegrated at the ‘micro unit’ level (in their empirical illustration, in Japanese prefectures), then aggregation will only yield stable macrorelations if all technology parameters are the same across units or if the weights used to construct the aggregate economy series from the micro units stay the same over time. In terms of our empirical question, time-invariant weights would imply the absence of any structural change in the economy over time, which clearly is not the case here.
TABLE 6.
Comparison of Preferred Models

PANEL (A) HOMOGENEOUS TECHNOLOGY
Data | Sectoral: Agri | Sectoral: Manu | Aggregate: Stylized | Aggregate: PWT
Model | [1] | [2] | [3] | [4]
Estimator | CCEPb | CCEPb | CCEP | CCEP
log labor [βL + βK (+ βN) − 1] | | 0.002 | | −0.097
t-statistic | | [0.03] | | [0.76]
log capital pw [βK] | 0.526 | 0.469 | 0.666 | 0.631
t-statistic | [6.70]** | [5.34]** | [20.85]** | [13.71]**
log land pw [βN] | 0.126 | | |
t-statistic | [1.02] | | |
implied βL ‡ | 0.474 | 0.532 | 0.334 | 0.369
ê integrated ◊ | I(0) | I(0) | I(0)/I(1) | I(0)
CD test p-value # | 0.52 | 0.93 | 0.06 | 0.00
RMSE | 0.089 | 0.066 | 0.086 | 0.061
observations | 918 | 918 | 918 | 912

PANEL (B) HETEROGENEOUS TECHNOLOGY
Data | Sectoral: Agri | Sectoral: Manu | Aggregate: Stylized | Aggregate: PWT
Model | [1] | [2] | [3] | [4]
Estimator | CMGb | CMGb | CMG | CMG
log labor [βL + βK (+ βN) − 1] | | 0.003 | | −0.389
t-statistic | | [0.03] | | [1.03]
log capital pw [βK] | 0.620 | 0.284 | 0.725 | 0.753
t-statistic | [2.98]** | [3.35]** | [10.95]** | [5.26]**
log land pw [βN] | 0.073 | | |
t-statistic | [0.38] | | |
implied βL ‡ | 0.380 | 0.717 | 0.275 | 0.247
reject CRS (10%) | | 0.25 | | 0.53
ê integrated ◊ | I(0) | I(0) | I(0) | I(0)
CD test p-value # | 0.73 | 0.18 | 0.05 | 0.16
RMSE | 0.068 | 0.047 | 0.067 | 0.041
observations | 918 | 918 | 918 | 912

Source: Authors’ analysis based on data sources discussed in the text.
Note: See tables 2 and 3 for definitions and further details on diagnostic testing. In the agricultural regressions where the CCEP and CCEPb both had sound diagnostics (and very similar coefficient estimates), we report results for the CCEPb because it allows for greater flexibility.
* significant at the 5 percent level, ** significant at the 1 percent level

<>Technology Heterogeneity
Our empirical analysis has been based on the theoretical model first developed in Mundlak (1988). As in the empirical implementations in Mundlak, Larson, and Butzer (1999) and Mundlak, Butzer, and Larson (2012), we have had to make simplifying assumptions to take this model to the data. By assuming parameter constancy over time, we have had to impose the same restriction on the parameter coefficients in the time series dimension as these studies.
Our empirical model has, however, allowed for more flexibility in the cross-section dimension, where we have allowed for parameter heterogeneity across countries within each of the sectors. In the following, we critically review these modeling choices. First, we discuss our insights into technology heterogeneity across countries, and then, we provide evidence for parameter constancy.
From the empirical results in table 2, all pooled specifications, except for the CCEP estimators, yield residual series that are nonstationary. Therefore, we cannot rule out that the estimated coefficients are spurious. In addition, the unrestricted POLS and 2FE models for agriculture as well as all POLS and 2FE models where the constant returns to scale restriction has been imposed (a restriction rejected by the data) result in cross-sectionally dependent residual series. In contrast, the preferred heterogeneous parameter models for agriculture and manufacturing in table 6 do not suffer from nonstationary or cross-sectionally correlated residuals (or both). In conclusion, it appears that the data for both sectors reject the crucial assumptions underlying a pooled regression model (well-behaved residuals) and cannot reject those underlying a heterogeneous one. We interpret this evidence for misspecification in the pooled models as an indication of heterogeneous production technology within each sector of production. 32
Given this finding for heterogeneity, one would naturally want to investigate the patterns of parameter heterogeneity across countries. With the specific data at our disposal (unbalanced panel, average T = 23), a closer analysis of whether we can identify discernible patterns must be interpreted with caution, and we view our results below as merely indicative.
Previous empirical analysis averaging individual country regressions has frequently observed that although country estimates are widely dispersed and, at times, economically implausible, averages represent very plausible estimates (Boyd and Smith 2002; Baltagi et al. 2003). Pedroni (2007: 440) calls for caution when interpreting the estimates for any individual country because the “long-run signals contained in [limited] years of data may be relatively weak,” whereas the cross-section averages will amplify the signal patterns sufficiently. Abstracting from the presence of common factors, Boyd and Smith (2002) discuss this issue somewhat more formally. Arguing for omitted variable bias in the country regression, assume a simple data generating process
(10) $y_{it} = \beta_i x_{it} + w_{it} + u_{it}$
where w represents all variables omitted from the empirical model. Here, w is assumed to be correlated with the included regressor x in a particular country i and over a particular period of time T, indicated by the parameter subscript iT:
(11) $w_{it} = b_{iT} x_{it} + v_{it}$
In a single country regression of y on x, we obtain
(12) $E(\hat{\beta}_i) = \beta_i + b_{iT}$
If the $w_{it}$ are structural, operating in all time periods and countries, this would cause a systematic bias in the cross-country average estimate $\hat{\beta}_{MG}$. 33 If they are not structural but are only correlated in a particular subsample, they will lead to bias in these countries’ estimates of $\beta_i$. However, averaging estimates across countries in this case yields $E(b_{iT}) = 0$, such that the biases cancel out in the average estimate $\hat{\beta}_{MG}$. The same principle applies to the CMG estimators in the presence of unobserved common factors.
We perform a basic analysis to obtain insight into the patterns of technology heterogeneity across countries. We begin by plotting the country-specific capital coefficients from the preferred agriculture and manufacturing models in table 6 against country mean aggregate income per capita (from PWT, in logs).
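The averaging argument in equations (10) to (12) can be illustrated with a small simulation. The sketch below draws a sample-specific bias term $b_{iT}$ with mean zero across countries, runs one regression of y on x per country while omitting w, and compares the dispersion of the country estimates with their average. All numerical values (200 countries, 23 years, a common slope of 0.5, the standard deviations) are hypothetical choices for illustration; the common slope is a simplification of the heterogeneous $\beta_i$ in equation (10).

```python
import numpy as np

rng = np.random.default_rng(7)
n_countries, n_years, beta = 200, 23, 0.5   # illustrative values only

slopes = np.empty(n_countries)
for i in range(n_countries):
    x = rng.standard_normal(n_years)
    b_iT = rng.normal(0.0, 0.3)             # sample-specific, mean zero across countries
    w = b_iT * x + 0.1 * rng.standard_normal(n_years)       # equation (11)
    y = beta * x + w + 0.1 * rng.standard_normal(n_years)   # equation (10)
    # Country regression of y on x with w omitted: slope is near beta + b_iT.
    xc = x - x.mean()
    slopes[i] = (xc @ (y - y.mean())) / (xc @ xc)

print(f"spread of country estimates: {slopes.std(ddof=1):.3f}")
print(f"average across countries:    {slopes.mean():.3f}")
```

Individual estimates are widely dispersed around the true slope, while their average stays close to it, which is the pattern described by Boyd and Smith (2002) for non-structural omitted variables.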
Figure 1 presents individual country estimates and linear regression lines together with 90 percent confidence intervals for the two sectors. 34 Although the capital coefficients in agriculture appear to rise with income and those in manufacturing appear to fall, the confidence intervals indicate that neither relationship is statistically precise, and (full-sample) robust regressions of the two equations yield statistically insignificant slope coefficients. 35
Figure 2 is somewhat less ambitious than the previous analysis. This figure provides density and distribution plots to highlight the differential distribution of capital coefficients in the agriculture and manufacturing equations. In the density plots on the left, manufacturing coefficients (dashed line) are distributed over a much narrower range than the agriculture coefficients. In other work on the cross-country production function in agriculture (Eberhardt and Teal 2011b), we have argued that this heterogeneity 36 might be due in part to the difference in output structure (wheat vs. rice vs. livestock) and the commercialization of agriculture (subsistence vs. industrialized farming), both of which are functions of the level of development and productive specialization across countries. Manufacturing production, in comparison, represents a more homogeneous undertaking, such that the heterogeneity might be less pronounced. As the cumulative distribution plots on the right of figure 2 indicate, the robust means that we report in our regression results do not distort the underlying relative relationship, namely, that most agriculture coefficients are further to the right and thus larger than those for manufacturing.
The graphs in figures 3 and 4 address the question of slope parameter constancy over time by estimating each model with an increasing number of observations and plotting the resulting estimates.
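The recursive estimation idea can be sketched as follows: the same estimator is applied to a sequence of expanding samples, adding one year at a time either at the end or at the beginning of the period. In this sketch, a simple pooled OLS slope stands in for the CCEP and CMG estimators, and the data are simulated with a constant slope; both are assumptions made to keep the example short.

```python
import numpy as np


def ols_slope(x, y):
    """Slope from a bivariate OLS regression with intercept."""
    xc = x - x.mean()
    return float((xc @ (y - y.mean())) / (xc @ xc))


def recursive_slopes(years, x, y, windows):
    """Re-estimate the slope on each (first_year, last_year) window.

    x and y are (N, T) arrays whose columns line up with `years`.
    """
    estimates = []
    for first, last in windows:
        keep = (years >= first) & (years <= last)
        estimates.append(ols_slope(x[:, keep].ravel(), y[:, keep].ravel()))
    return estimates


years = np.arange(1963, 1993)
rng = np.random.default_rng(1)
x = rng.standard_normal((40, years.size))        # simulated regressor
y = 0.5 * x + 0.1 * rng.standard_normal(x.shape)  # constant true slope of 0.5

# Start from 1963-1979 and add one year at a time at the end of the sample.
forward = [(1963, end) for end in range(1979, 1993)]
# Start from 1976-1992 and add one year at a time at the beginning.
backward = [(start, 1992) for start in range(1976, 1962, -1)]

forward_estimates = recursive_slopes(years, x, y, forward)
backward_estimates = recursive_slopes(years, x, y, backward)
print([round(b, 3) for b in forward_estimates])
```

With a constant underlying slope, both sequences of estimates stay flat as the sample grows; systematic drift in such a plot would instead point toward parameters that vary with the sample.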
37 We plot the estimates for the CCEP (in figure 3) and CMG (figure 4) capital coefficient $\hat{\beta}_K$ from the preferred agriculture, manufacturing and aggregated data models, corresponding to the models presented in columns [1] to [3] of table 6, Panels (A) and (B) for pooled and heterogeneous parameter models, respectively. In each plot, the number of observations increases as we move to the right. In the left plots, all regressions include data from 1963 to 1979. These graphs show the parameter estimates when we add one year of data at a time, at the end of the sample period, until we reach 1992. In the right plots, all regressions include data from 1976 to 1992. These graphs show the parameter estimates when we add one year at a time, at the beginning of the sample period, until we reach 1963. In each case, we begin (on the left of the plot) with a reduced sample, where $T_{i,\min} = 11$ and $T_{i,\max} = 18$, corresponding to n=473 (623 for the right plot) observations from N=34 (38) countries. The solid grey line indicates the results for the aggregated data, and solid and dashed black lines indicate results for agriculture and manufacturing, respectively. In the CCEP plots in the second row of figure 3, we indicate the 90 percent confidence intervals for the agriculture (grey area) and manufacturing (area between the dashed lines) estimates. The estimates for the aggregated data are omitted to improve legibility. In the CMG plots in figure 4, squares indicate that coefficients are statistically insignificant at the 10 percent level.
We use these graphs to provide insight into two specific questions: (i) From an econometric point of view, are the $\hat{\beta}_K$ coefficients on average constant over time? (ii) Following the suggestion in Mundlak, Butzer, and Larson (2012), if the $\beta_K$ parameters are functions of common factors (“state variables,”
in their terminology), implying that any estimated coefficient is a constant associated with the specific sample under analysis, $\hat{\beta}_K(s)$, we would expect results to vary over time given different samples. Do our recursive plots provide evidence for sample dependence in the estimated $\beta_K$ coefficients? The answers to (i) and (ii) are clearly dependent on each other because these questions seek the same information but are motivated from econometric and economic theory, respectively.

In the pooled specification where the preferred CCEP models yield relatively similar capital coefficients of approximately .5 in the full samples, the recursive regressions in figure 3 suggest that the agriculture (manufacturing) capital coefficient decreases (increases) over time as we increase our sample. Because the same pattern results whether we add years at the beginning or the end of the sample, it seems that this result is driven by small sample bias: as more observations become available in each country, the results become more precise. The associated confidence intervals included in the plots in the second row of the figure support this hypothesis. Coefficient estimates in the extreme left of each plot (the reduced sample) are contained within the 90 percent confidence interval of the coefficient estimates at the extreme right of each plot (the full sample).

Turning to the heterogeneous parameter model estimates in figure 4, the robust mean coefficients marked with a square are statistically insignificant. If we eliminate these estimates from the graphs, we find remarkably stable recursive estimates for both the manufacturing and agriculture capital coefficients.
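The window logic behind the recursive plots can be illustrated with a minimal Python sketch. It is not the CCEP or CMG estimator used in the paper: as a stand-in assumption it fits a simple pooled OLS slope, and the function names (`ols_slope`, `recursive_estimates`) and the toy data are hypothetical. The forward pass fixes the start year and extends the end year (left plots); the backward pass fixes the end year and extends the start year (right plots).

```python
def ols_slope(xs, ys):
    """Slope of a simple OLS regression of ys on xs (with intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def recursive_estimates(rows, first_year, last_year, base_end, base_start):
    """Expanding-window slope estimates (illustrative stand-in estimator).

    rows: list of (year, x, y) observations pooled across countries.
    Forward pass: samples first_year..end, for end = base_end..last_year.
    Backward pass: samples start..last_year, for start = base_start..first_year.
    """
    def fit(lo, hi):
        sample = [(x, y) for (yr, x, y) in rows if lo <= yr <= hi]
        return ols_slope([x for x, _ in sample], [y for _, y in sample])

    forward = [(end, fit(first_year, end))
               for end in range(base_end, last_year + 1)]
    backward = [(start, fit(start, last_year))
                for start in range(base_start, first_year - 1, -1)]
    return forward, backward


# Toy data (hypothetical): y = 0.5 * x exactly, two "countries" per year.
rows = []
for year in range(1963, 1993):
    for shift in (0.0, 1.0):
        x = (year - 1963) * 0.1 + shift
        rows.append((year, x, 0.5 * x))

# Left plots: 1963-1979 extended to 1992; right plots: 1976-1992 extended to 1963.
forward, backward = recursive_estimates(rows, 1963, 1992, 1979, 1976)
```

With real data, the slope in each window would be plotted against the window's moving endpoint; a flat path in both passes is what parameter constancy would look like under these assumptions.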
Thus, the answer to question (i) on parameter constancy is a tentative ‘yes.’ The answer to question (ii) on sample dependence is a tentative ‘no.’ The former answer suggests that the assumption $\beta_{it} = \beta_i$ is valid, and the latter answer implies that we find no evidence for a systematic relationship between technology coefficients and unobserved time-varying factors (or state variables).

CONCLUDING REMARKS

In this paper, we employed unique panel data for agriculture and manufacturing sectors to estimate sector-level and aggregate production functions. Our empirical analysis drew on contributions from the recent panel time series econometrics literature and, in particular, emphasized the importance of parameter heterogeneity across countries as well as sectors. In addition, we took the nonstationarity of observable and unobservable factor inputs into account and addressed concerns over cross-sectional dependence commonly found in macropanel data. We draw the following conclusions from our attempts to highlight the importance of structural makeup and change for the empirical analysis of cross-country growth and development.

First, duality matters. The empirical analysis of growth and development across countries benefits significantly from the consideration of the modern and traditional sectors that make up a developing economy. Comparing our analysis of agriculture and manufacturing with that of a stylized aggregated economy suggests that the latter analysis yields severely distorted empirical results with serious implications for estimates of TFP derived from aggregate analysis. An analysis of PWT data in parallel with the aggregated data suggests that this finding is not an artifact of our stylized empirical setup.
Growth accounting exercises at the aggregate economy level thus provide misleading results in that any technology differences across sectors within countries are assumed away, and the constructed TFP series might reflect this misspecification rather than true technological progress.

Second, focusing on technology and TFP within each sector, we find that the data reject empirical specifications that impose common technology, common TFP evolution, and the independence of shocks across countries. Thus, the assumption of common technology in the existing work on the dual economy model using growth accounting methods is not in line with the data. If these restrictions were correct, we should be able to find pooled technology models that satisfy the most basic assumptions of stationary and cross-sectionally independent residuals. In practice, however, we find results that are much more in line with the notion of differential technology across countries, for which we have provided support from economic theory.

Third, the presence of unobserved common factors, both as latent processes driving all observables and as a conceptual framework for TFP, has been shown to have a substantial impact on empirical results. Much of the cross-country empirical literature ignores the presence of global economic shocks with heterogeneous impact and spillovers across country borders. With the experience of the recent global financial crisis, it is now more evident than ever that economic performance in a globalized world is highly interconnected and that domestic markets cannot ‘de-couple’ from the global financial and goods markets. In econometric terms, latent forces drive all of the observable and unobservable variables and processes that we attempt to model. An important implication is that commonly applied instruments in cross-country growth regressions are invalid, a sentiment that is echoed in recent work by Bazzi and Clemens (2009).
We argue that panel time series methods allow us to develop a new type of cross-country empirics that is more informative and more flexible in the problems that it can address than its critics have allowed.

Fourth, we are aware of the serious data limitations for sectoral data from developing economies, particularly regarding the high data requirements of panel time series methods. The Crego, Larson, Butzer, and Mundlak (1998) dataset allowed us to directly compare sectoral analysis between manufacturing and agriculture. However, for alternative research questions, the use of data from one sector or the other might be sufficient. There are at least two existing data sources, FAO data for agriculture and UNIDO data for manufacturing, which are ideally suited to inform this type of analysis at the sector level for a large number of countries and over a substantial period of time.

Cross-country panel data play a crucial role in policy analysis for development. The present work represents a first step toward establishing an empirical version of a dual economy model to inform this literature. From the perspective of dual economy theory, we have only analyzed one aspect of the canon, technology heterogeneity between traditional and modern sectors of production. In future work, we will implement empirical tests to investigate the suggested sources of growth arising from this literature, including marginal factor product differences and heterogeneous TFP levels for growth across sectors.

FIGURES AND NOTES TO FIGURES

FIGURE 1. Investigating Technology Heterogeneity and Income
Note: These graphs investigate the issue of slope heterogeneity across countries. We plot the CMG country estimates for the capital coefficient $\beta_K$ from the preferred heterogeneous agriculture and manufacturing models, corresponding to the models presented in columns [1] and [2] of table 6, Panel (B).
The shaded areas represent the 90 percent confidence intervals of a linear regression of the respective capital coefficients on mean income per capita, where means are computed from aggregate PWT data over the entire 1963 to 1992 time horizon. Robust regression of these relationships yields the following (statistically insignificant) slope parameters (standard errors in square brackets): .108 [.217] and −.079 [.087] for agriculture and manufacturing, respectively. For both plots, we exclude outliers on the basis of weights computed from these robust regressions. Any coefficient with a weight less than .5 is excluded from the graph (for agriculture, five countries; for manufacturing, one country).
Source: Authors’ analysis based on data sources discussed in the text.

FIGURE 2. Investigating Technology Heterogeneity across Sectors
Note: These graphs investigate the issue of slope heterogeneity across sectors. In the density plots on the left, we estimate separate Epanechnikov kernels (using common bandwidth .34) for the agriculture (solid line) and manufacturing (dashed line) capital coefficients from table 6, Panel (B); the right plots chart the cumulative distribution functions of the respective sector coefficients. For both sets of plots, we follow the same strategy as in figure 1 to exclude extreme outliers.
Source: Authors’ analysis based on data sources discussed in the text.

FIGURE 3. Investigating Technology Constancy—Recursive Estimates (i)
Note: These graphs investigate the issue of slope parameter constancy over time by estimating each model with an increasing number of observations and plotting the resulting estimates. We plot the robust estimates for the CCEP capital coefficients from the preferred agriculture, manufacturing, and aggregated data models, corresponding to the results presented in columns [1] to [3] of table 6, Panel (A). In each plot, the number of observations increases as we move to the right.
In the left plots, all regressions include data from 1963 to 1979. The graphs then show the parameter estimates when we add one year of data at a time, at the end of the sample period, until we reach 1992. In the right plots, all regressions include data from 1976 to 1992. The graphs show the parameter estimates when we add one year at a time, at the beginning of the sample period, until we reach 1963. In each case, we begin (on the left of the plot) with a reduced sample where $T_{i,\min} = 11$ and $T_{i,\max} = 18$, corresponding to n = 473 (623 for the right plot) observations from N = 34 (38) countries. In each plot, the grey solid line represents aggregated data; black solid line, agriculture data; and black dashed line, manufacturing data. In the plots in the second row, we indicate the 90 percent confidence intervals for the agriculture (grey area) and manufacturing (area between the dashed lines) estimates. Here, the estimates for the aggregated data are omitted to improve legibility.
Source: Authors’ analysis based on data sources discussed in the text.

FIGURE 4. Investigating Technology Constancy—Recursive Estimates (ii)
Note: These graphs investigate the issue of slope parameter constancy over time by estimating each model with an increasing number of observations and plotting the resulting estimates. We plot the robust estimates for the CMG capital coefficients from the preferred agriculture, manufacturing, and aggregated data models, corresponding to the results presented in columns [1] to [3] of table 6, Panel (B). See figure 3 for further details on how these plots are constructed. Squares indicate coefficients that are statistically insignificant at the 10 percent level.
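The density and distribution plots described in the note to figure 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: it uses the textbook Epanechnikov kernel K(u) = 0.75 (1 − u²) on |u| ≤ 1 (some statistical packages use a rescaled variant, so a bandwidth of .34 need not be directly comparable), and the coefficient values and function names below are hypothetical.

```python
def epanechnikov(u):
    """Textbook Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kernel_density(points, grid, bandwidth):
    """Kernel density estimate of `points`, evaluated at each value in `grid`."""
    n = len(points)
    return [sum(epanechnikov((g - p) / bandwidth) for p in points) / (n * bandwidth)
            for g in grid]


def ecdf(points, grid):
    """Empirical cumulative distribution function evaluated on `grid`."""
    n = len(points)
    return [sum(1 for p in points if p <= g) / n for g in grid]


# Hypothetical capital coefficients (NOT the paper's estimates).
agri = [0.1, 0.4, 0.5, 0.7, 0.9, 1.2]
manuf = [0.3, 0.4, 0.45, 0.5, 0.55, 0.6]

step = 0.01
grid = [i * step - 1.0 for i in range(301)]  # evaluation points from -1.00 to 2.00
h = 0.34                                     # common bandwidth, as in the figure note

dens_agri = kernel_density(agri, grid, h)
dens_manuf = kernel_density(manuf, grid, h)
cdf_agri = ecdf(agri, grid)
cdf_manuf = ecdf(manuf, grid)
```

Using one bandwidth for both sectors, as the note does, keeps the two curves comparable: a narrower and taller manufacturing density then reflects the spread of the coefficients rather than a difference in smoothing.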
APPENDIX

Data construction and descriptive statistics

We use a total of four datasets in our empirical analysis, consisting of data for agriculture and manufacturing (Crego, Larson, Butzer, and Mundlak 1998; UNIDO 2004; FAO 2007), an ‘aggregated dataset’ in which the labor, output, and capital stock values for the two sectors are summed, and a PWT (6.2) dataset (Heston, Summers, and Aten 2006) for comparative purposes. The first three datasets differ significantly in their construction from the last, primarily in the choice of exchange rates and deflation: the first three datasets use international exchange rates for the year 1990, whereas the PWT dataset uses international dollars (purchasing power parity adjusted) with the year 2000 as the comparative base. The first three datasets thus emphasize traded goods, whereas the PWT is generally perceived to better account for nontradables and services. Provided that all monetary values incorporated in the variables for each regression are comparable (across countries and over time) and given that the comparison of sectoral and aggregated data with the PWT is intended for illustration purposes, we have no concerns about presenting results from these two conceptually different datasets.

In all cases, the results presented are for matched observations across datasets, so that the four datasets are identical in terms of country and time period coverage. We prefer this design for direct comparison even though more observations are available for individual data sources, which could improve the robustness of empirical estimates. We provide details on the sample makeup in table A1. The next two subsections describe the data construction. Descriptive statistics for all variables in the empirical analysis are presented in table A2.

TABLE A1.
Descriptive Statistics: Sample Makeup for all Datasets

 #  ISO  COUNTRY             OBS     #  ISO  COUNTRY          OBS
 1  AUS  Australia            20    22  KEN  Kenya             29
 2  AUT  Austria              22    23  KOR  South Korea       29
 3  BEL  Belgium-Luxembourg   22    24  LKA  Sri Lanka         17
 4  CAN  Canada               30    25  MDG  Madagascar        20
 5  CHL  Chile                20    26  MLT  Malta             23
 6  COL  Colombia             26    27  MUS  Mauritius         16
 7  CYP  Cyprus               18    28  MWI  Malawi            23
 8  DNK  Denmark              26    29  NLD  Netherlands       23
 9  EGY  Egypt                24    30  NOR  Norway            22
10  FIN  Finland              28    31  NZL  New Zealand       19
11  FRA  France               23    32  PAK  Pakistan          24
12  GBR  United Kingdom       22    33  PHL  Philippines       24
13  GRC  Greece               28    34  PRT  Portugal          20
14  GTM  Guatemala            19    35  SWE  Sweden            23
15  IDN  Indonesia            22    36  TUN  Tunisia           17
16  IND  India                29    37  USA  United States     23
17  IRL  Ireland              23    38  VEN  Venezuela         19
18  IRN  Iran                 25    39  ZAF  South Africa      26
19  ISL  Iceland              20    40  ZWE  Zimbabwe          25
20  ITA  Italy                21
21  JPN  Japan                28             Total            918

Source: Authors’ analysis based on data sources discussed in the text.
Note: ISO indicates the three-letter ISO code for each country; OBS reports the number of observations (levels regression).

TABLE A2. Descriptive Statistics

Agriculture

PANEL (A) VARIABLES IN UNTRANSFORMED LEVELS TERMS
variable        mean    median  st. dev.      min.      max.
Output       1.8E+10   6.0E+09   3.0E+10   3.5E+07   2.2E+11
Labor        9.6E+06   1.3E+06   3.5E+07   3.0E+03   2.3E+08
Capital      6.5E+10   1.1E+10   1.5E+11   2.9E+07   8.6E+11
Land         1.8E+07   3.5E+06   4.1E+07   6.0E+03   1.9E+08
in logarithms
Output         22.39     22.51      1.73     17.38     26.13
Labor          14.00     14.04      2.02      8.01     19.27
Capital        22.96     23.07      2.28     17.18     27.48
Land           15.11     15.07      1.99      8.70     19.07
in growth rates (percent)
Output           1.7       1.9      10.4     −41.5      53.9
Labor           −0.6      −0.0       3.0     −28.8      13.4
Capital          1.9       1.2       3.6      −5.1      31.4
Land             0.1       0.0       2.2     −23.1      13.6

PANEL (B) VARIABLES IN PER WORKER TERMS
variable        mean    median  st. dev.      min.      max.
Output        12,724     6,644    13,161     44.18    57,891
Capital       52,367     9,925    63,576     13.10   222,397
Land            9.66      3.00     20.34      0.29       110
in logarithms
Output          8.39      8.80      1.83      3.79     10.97
Capital         8.96      9.20      2.71      2.57     12.31
Land            1.11      1.10      1.41     −1.24      4.70
in growth rates (percent)
Output           2.3       2.5      10.5     −43.7      56.0
Capital          2.5       2.0       4.2      −7.8      31.1
Land             0.7       0.5       3.4     −18.4      28.8

Manufacturing

PANEL (A) VARIABLES IN UNTRANSFORMED LEVELS TERMS
variable        mean    median  st. dev.      min.      max.
Output       7.6E+10   8.8E+09   2.1E+11   7.2E+06   1.4E+12
Labor        1.7E+06   4.8E+05   3.4E+06   9.6E+03   2.0E+07
Capital      1.3E+11   2.0E+10   3.0E+11   1.4E+07   1.8E+12
in logarithms
Output         22.84     22.89      2.29     15.79     27.99
Labor          13.10     13.08      1.65      9.17     16.79
Capital        23.64     23.74      2.27     16.46     28.22
in growth rates (percent)
Output           4.4       3.9      10.1     −40.9      84.2
Labor            1.9       1.1       6.8     −38.8      78.1
Capital          4.8       3.6       5.0      −5.1      53.0

PANEL (B) VARIABLES IN PER WORKER TERMS
variable        mean    median  st. dev.      min.      max.
Output        27,093    20,475    22,111       753   101,934
Capital       63,533    43,577    64,557     1,475   449,763
in logarithms
Output          9.74      9.93      1.09      6.62     11.53
Capital        10.54     10.68      1.09      7.30     13.02
in growth rates (percent)
Output           2.5       2.5       9.0     −67.0      73.0
Capital          2.9       2.9       6.6     −71.7      42.4

Aggregated Data

PANEL (A) VARIABLES IN UNTRANSFORMED LEVELS TERMS
variable        mean    median  st. dev.      min.      max.
Output       9.3E+10   1.7E+10   2.3E+11   1.1E+08   1.6E+12
Labor        1.1E+07   2.4E+06   3.6E+07   2.2E+04   2.4E+08
Capital      2.0E+11   2.9E+10   4.3E+11   1.0E+08   2.3E+12
in logarithms
Output         23.50     23.58      2.01     18.55     28.07
Labor          14.66     14.67      1.74     10.01     19.30
Capital        24.10     24.08      2.21     18.44     28.44
in growth rates (percent)
Output           3.1       3.1       7.4     −33.9      42.1
Labor            0.2       0.4       2.6     −11.4      19.3
Capital          3.6       2.7       3.6      −5.0      25.1

PANEL (B) VARIABLES IN PER-WORKER TERMS
variable        mean    median  st. dev.      min.      max.
Output        19,493    11,197    19,212        72    76,031
Capital       49,634    23,140    55,541        53   236,312
in logarithms
Output          8.84      9.32      1.85      4.28     11.24
Capital         9.44     10.05      2.20      3.96     12.37
in growth rates (percent)
Output           3.0       3.3       7.0     −31.0      44.5
Capital          3.4       3.2       3.8     −18.4      22.2

Penn World Table Data

PANEL (A) VARIABLES IN UNTRANSFORMED LEVELS TERMS
variable        mean    median  st. dev.      min.      max.
Output       4.3E+11   1.3E+11   1.0E+12   1.3E+09   8.0E+12
Labor        5.1E+07   1.3E+07   1.2E+08   2.1E+05   8.5E+08
Capital      1.2E+12   3.3E+11   2.9E+12   3.3E+09   2.3E+13
in logarithms
Output         25.44     25.58      1.71     21.02     29.71
Labor          16.49     16.41      1.63     12.27     20.57
Capital        26.38     26.52      1.80     21.92     30.75
in growth rates (percent)
Output           4.0       4.0       5.0     −37.1      26.6
Labor            1.5       1.4       1.1      −1.9       4.8
Capital          4.6       4.2       2.9      −1.3      16.4

PANEL (B) VARIABLES IN PER-WORKER TERMS
variable        mean    median  st. dev.      min.      max.
Output        11,445    10,630     8,193       594    31,074
Capital       37,059    32,981    31,765       661   136,891
in logarithms
Output          8.95      9.27      1.02      6.39     10.34
Capital         9.87     10.40      1.37      6.49     11.83
in growth rates (percent)
Output           2.5       2.6       5.0     −41.2      23.2
Capital          3.1       2.8       2.9      −4.2      14.3

Source: Authors’ analysis based on data sources discussed in the text.
Note: We report the descriptive statistics for value-added (in US dollars for the year 1990 or purchasing power parity-adjusted international dollars for the year 2000), labor (headcount), capital stock (the same monetary values as VA in each respective dataset), and land (in hectares) for the regression sample (levels sample: n = 918; N = 40).

Sectoral and aggregated data

Investment data.
Data for agricultural and manufacturing investment (AgSEInv, MfgSEInv) in constant year 1990 local currency units (LCU), the US$-LCU exchange rate (Ex_Rate, see comment below), and sector-specific deflators (AgDef, TotDef) were taken from Crego, Larson, Butzer, and Mundlak (1998).³⁸ Note that these authors also provide capital stock data, which they produced through their own calculations from the investment data. Following Martin and Mitra (2002), we believe that the use of a single year exchange rate is preferable to the use of annual rates in the construction of real output (see next paragraph) and capital stock (see below).

Output data. For manufacturing, we use data on aggregate GDP in current LCU and the share of GDP in manufacturing from the World Bank WDI (World Bank 2008). For agriculture, we use agricultural value-added in current LCU from the same source. The two sectoral value-added series are then deflated using the Crego, Larson, Butzer, and Mundlak (1998) sectoral deflator for agriculture and the total economy deflator for manufacturing before we use the 1990 US$-LCU exchange rates to make them comparable across countries. The currencies used in the Crego, Larson, Butzer, and Mundlak (1998) data differ from those applied in the WDI data for a number of European countries because of the adoption of the Euro. Therefore, we must use alternative 1990 US$-LCU exchange rates for these economies.³⁹

Labor data. For agriculture, we adopt the variable ‘economically active population in agriculture’ from the FAO’s (2007) PopSTAT. Manufacturing labor is taken from UNIDO’s (2004) INDSTAT.

Additional data. The land variable is taken from ResourceSTAT and represents ‘arable and permanent crop land’ (measured in hectares) (FAO 2007).
For the robustness checks (results available on request), the livestock variable is constructed from the data for the following animals in the ‘live animals’ section of ProdSTAT: asses (donkeys), buffalos, camels, cattle, chickens, ducks, horses, mules, pigs, sheep, goats, and turkeys. Following convention, we use the formula below to convert the numbers for individual animal species into the livestock variable:

livestock = 1.1 camels + buffalos + horses + mules + 0.8 cattle + 0.8 asses + 0.2 pigs + 0.1 (sheep + goats) + 0.01 (chickens + ducks + turkeys).

The fertilizer variable is taken from the ‘fertilizers archive’ of ResourceSTAT and represents ‘agricultural fertilizer consumed in metric tons,’ which includes ‘crude’ and ‘manufactured’ fertilizers. For human capital, we employ years of schooling attained in the population by those aged 25 years and above, from Barro and Lee (2001), interpolated to create an annual series.

Capital stock. We construct capital stock in agriculture and manufacturing by applying the perpetual inventory method described in detail in Klenow and Rodriguez-Clare (1997b), using the investment data from Crego, Larson, Butzer, and Mundlak (1998), which are transformed into US dollars by applying the 1990 US$-LCU exchange rate. For the construction of a sectoral base year capital stock in each country $i$, we employ average sector value-added growth rates $g_{ij}$ (using the deflated sectoral value-added data), the average sectoral investment to value-added ratio $(I/Y)_{ij}$, and an assumed depreciation rate of 5 percent to construct $(K/Y)_{0ij} = (I/Y)_{ij} / (g_{ij} + 0.05)$ for sector $j$ (agriculture, manufacturing). This ratio is then multiplied by the sectoral value-added data for the base year to yield $K_{0j}$. Note that the method deviates from that discussed in Klenow and Rodriguez-Clare (1997b) because they use per capita GDP in their computations and therefore need to account for population growth in the construction of the base year capital stock.
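The base year formula and the perpetual inventory recursion can be made concrete with a short Python sketch. It is a minimal illustration, not the authors' code. Two points are assumptions rather than statements from the text: the average growth rate is taken as the arithmetic mean of annual growth rates, and capital accumulates as K(t+1) = (1 − δ) K(t) + I(t), the timing under which the steady-state ratio K/Y = (I/Y) / (g + δ) holds exactly. The function names and the toy series are hypothetical.

```python
def base_year_capital(value_added, investment, depreciation=0.05):
    """Base year stock: K0 = Y0 * mean(I/Y) / (g + depreciation).

    g is the arithmetic mean of annual value-added growth rates
    (an assumption; the text only says 'average growth rates').
    """
    growth = [value_added[t] / value_added[t - 1] - 1.0
              for t in range(1, len(value_added))]
    g = sum(growth) / len(growth)
    inv_ratio = sum(i / y for i, y in zip(investment, value_added)) / len(value_added)
    return value_added[0] * inv_ratio / (g + depreciation)


def perpetual_inventory(value_added, investment, depreciation=0.05):
    """Capital stock series via K[t] = (1 - depreciation) * K[t-1] + I[t-1]."""
    stock = [base_year_capital(value_added, investment, depreciation)]
    for t in range(1, len(investment)):
        stock.append((1.0 - depreciation) * stock[-1] + investment[t - 1])
    return stock


# Toy sector (hypothetical): value-added grows 5 percent a year and
# investment is always 20 percent of value-added.
value_added = [100.0 * 1.05 ** t for t in range(5)]
investment = [0.2 * y for y in value_added]

capital = perpetual_inventory(value_added, investment)
capital_output_ratio = [k / y for k, y in zip(capital, value_added)]
```

In this toy case the base year ratio is 0.2 / (0.05 + 0.05) = 2, so the initial stock is 200 and the capital-output ratio stays at 2 in every year, which is the balanced-growth logic that motivates the base year formula.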
Aggregated data. We combine the agriculture and manufacturing data to produce a stylized ‘aggregate economy.’ For labor, we simply sum the headcount; for the monetary representations of output and capital stock, the same treatment is applied. Crego, Larson, Butzer, and Mundlak (1998) developed the first large panel dataset that provides data on investment in agriculture for a long span of time, and their work affords us this ability to sum variables for the two sectors.

Penn World Table Data

As a means of comparison, we also provide production function estimates using data from PWT version 6.2. We adopt real per capita GDP in international dollars Laspeyres (rgdpl) as the measure of output and construct capital stock using investment data (derived from the investment share in real GDP, ki, and the output variable, rgdpl) in the perpetual inventory method described above, again adopting 5 percent depreciation (at this point, we must use the data on population from PWT, pop, to compute the average annual population growth rate).

REFERENCES

Abramovitz, Moses. 1956. “Resource and Output Trends in the United States since 1870.” American Economic Review 46 (2): 5-23.
Arellano, Manuel, and Stephen R. Bond. 1991. “Some Tests of Specification for Panel Data.” Review of Economic Studies 58 (2): 277-297.
Azariadis, Costas, and Allan Drazen. 1990. “Threshold Externalities in Economic Development.” The Quarterly Journal of Economics 105 (2): 501-26.
Bai, Jushan, and Chihwa Kao. 2006. “On the Estimation and Inference of a Panel Cointegration Model with Cross-Sectional Dependence.” In Panel data econometrics: Theoretical contributions and empirical applications, ed. Badi H. Baltagi. Amsterdam: Elsevier Science.
Bai, Jushan, Chihwa Kao, and Serena Ng. 2009. “Panel Cointegration with Global Stochastic Trends.” Journal of Econometrics 149 (1): 82-99.
Bai, Jushan, and Serena Ng. 2008. “Large Dimensional Factor Analysis.”
Foundations and Trends in Econometrics 3 (2): 89-163.
Baier, Scott L., Gerald P. Dwyer, and Robert Tamura. 2006. “How Important are Capital and Total Factor Productivity for Economic Growth?” Economic Inquiry 44 (1): 23-49.
Bailey, Natalia, George Kapetanios, and M. Hashem Pesaran. 2012. Exponent of Cross-sectional Dependence: Estimation and Inference. Cambridge University, Faculty of Economics, January 2012.
Baltagi, Badi H., Georges Bresson, James M. Griffin, and Alain Pirotte. 2003. “Homogeneous, Heterogeneous or Shrinkage Estimators? Some Empirical Evidence from French Regional Gasoline Consumption.” Empirical Economics 28 (4): 795-811.
Banerjee, Abhijit V., and Andrew F. Newman. 1993. “Occupational Choice and the Process of Development.” Journal of Political Economy 101 (2): 274-98.
Banerjee, Anindya, Markus Eberhardt, and J. James Reade. 2010. “Panel Econometrics for Worriers.” Oxford University, Department of Economics Discussion Paper Series No. 514.
Barro, Robert J., and Jong-Wha Lee. 2001. “International Data on Educational Attainment: Updates and Implications.” Oxford Economic Papers 53 (3): 541-63.
Barro, Robert J., and Jong-Wha Lee. 2010. “A New Data Set of Educational Attainment in the World, 1950-2010.” NBER Working Paper No. 15902, NBER, Cambridge, MA.
Basu, Susanto, and David N. Weil. 1998. “Appropriate Technology and Growth.” The Quarterly Journal of Economics 113 (4): 1025-1054.
Bazzi, Samuel, and Michael Clemens. 2009. Blunt Instruments: On Establishing the Causes of Economic Growth. Working Paper No. 171, Center for Global Development, Washington, DC.
Binder, Michael, and Christian J. Offermanns. 2007. “International Investment Positions and Exchange Rate Dynamics: A Dynamic Panel Analysis.” Deutsche Bundesbank: Discussion Paper Series 1, Economic Studies No. 2007/23.
Blundell, Richard, and Stephen R. Bond. 1998. “Initial Conditions and Moment Restrictions in Dynamic Panel Data Models.”
Journal of Econometrics 87 (1): 115-43.
Bond, Stephen R. 2002. “Dynamic Panel Data Models: A Guide to Micro Data Methods and Practice.” Portuguese Economic Journal 1 (2): 141-62.
Bond, Stephen R., and Markus Eberhardt. 2009. “Cross-Section Dependence in Nonstationary Panel Models: A Novel Estimator.” Paper presented at the Nordic Econometrics Meeting in Lund, Sweden, October 29-31.
Boyd, Derick A. C., and Ron P. Smith. 2002. “Some Econometric Issues in Measuring the Monetary Transmission Mechanism, with an Application to Developing Countries.” In Monetary Transmission in Diverse Economies, ed. Lavan Mahadeva and Peter Sinclair. Cambridge University Press.
Brock, William A., and Steven N. Durlauf. 2001. “Growth Economics and Reality.” World Bank Economic Review 15 (2): 229-72.
Butzer, Rita. 2011. The Role of Physical Capital in Agricultural and Manufacturing Production. PhD thesis, Department of Economics, University of Chicago, June 2011.
Cavalcanti, Tiago, Kamiar Mohaddes, and Mehdi Raissi. 2011. “Growth, Development and Natural Resources: New Evidence using a Heterogeneous Panel Analysis.” The Quarterly Review of Economics and Finance 51 (4): 305-18.
Clark, Gregory. 2007. A Farewell to Alms: A Brief Economic History of the World. Princeton University Press.
Coakley, Jerry, Ana-Maria Fuertes, and Ron P. Smith. 2006. “Unobserved Heterogeneity In Panel Time Series Models.” Computational Statistics & Data Analysis 50 (9): 2361-2380.
Crego, Al, Donald Larson, Rita Butzer, and Yair Mundlak. 1998. “A New Database on Investment and Capital for Agriculture and Manufacturing.” Policy Research Working Paper Series No. 2013, The World Bank, Washington, DC.
Durlauf, Steven N. 1993. “Nonergodic economic growth.” Review of Economic Studies 60 (2): 349-66.
Durlauf, Steven N., Paul A. Johnson, and Jonathan R. Temple. 2005. “Growth Econometrics.” In Philippe Aghion and Steven N. Durlauf, eds., Handbook of Economic Growth Vol. 1, 555-677. Amsterdam: Elsevier.
Durlauf, Steven N., Andros Kourtellos, and Artur Minkin. 2001. “The Local Solow Growth Model.” European Economic Review 45 (4-6): 928-40.
Durlauf, Steven N., and Danny T. Quah. 1999. “The New Empirics of Economic Growth.” In John B. Taylor and Michael Woodford, eds., Handbook of Macroeconomics Vol. 1, 235-308. Amsterdam: Elsevier.
Easterly, William. 2002. The Elusive Quest for Growth — Economists’ Adventures and Misadventures in the Tropics. Cambridge, MA: MIT Press.
Easterly, William, and Ross Levine. 2001. “It’s Not Factor Accumulation: Stylised Facts and Growth Models.” World Bank Economic Review 15 (2): 177-219.
Eberhardt, Markus. 2012. “Estimating Panel Time Series Models with Heterogeneous Slopes.” Stata Journal 12 (1): 61-71.
Eberhardt, Markus, and Christian Helmers. 2010. “Untested Assumptions and Data Slicing: A Critical Review of Firm-Level Production Function Estimators.” Oxford University, Department of Economics Discussion Paper Series No. 513.
Eberhardt, Markus, Christian Helmers, and Hubert Strauss. Forthcoming. “Do Spillovers Matter when Estimating Private Returns to R&D?” The Review of Economics and Statistics.
Eberhardt, Markus, and Francis Teal. 2011a. “Econometrics for Grumblers: A New Look at the Literature on Cross-Country Growth Empirics.” Journal of Economic Surveys 25 (1): 109-55.
Eberhardt, Markus, and Francis Teal. 2011b. “No Mangos in the Tundra: Spatial Heterogeneity in Agricultural Productivity Analysis.” Unpublished working paper, University of Nottingham, School of Economics.
FAO. 2007. FAOSTAT. Online database, Rome: FAO, United Nations Food and Agriculture Organisation. [http://faostat.fao.org/].
Granger, Clive W. J. 1997. “On Modelling the Long Run in Applied Economics.” Economic Journal 107 (440): 169-77.
Hall, Robert E., and Charles I. Jones. 1999. “Why do Some Countries Produce so Much More Output Per Worker than Others?” The Quarterly Journal of Economics 114 (1): 83-116.
Hamilton, Lawrence C. 1992.
“How Robust is Robust Regression?” Stata Technical Bulletin 1 (2): 21-25.
Heston, Alan, Robert Summers, and Bettina Aten. 2006. Penn World Table version 6.2. Center for International Comparisons of Production, Income and Prices, University of Pennsylvania. [http://pwt.econ.upenn.edu/php_site/pwt_index.php].
———. 2011. Penn World Table version 7.0. Center for International Comparisons of Production, Income and Prices, University of Pennsylvania. [http://pwt.econ.upenn.edu/php_site/pwt_index.php].
Holly, Sean, M. Hashem Pesaran, and Takashi Yamagata. 2010. “A Spatio-Temporal Model Of House Prices in the US.” Journal of Econometrics 158 (1): 160-73.
Hsiao, Cheng, Yan Shen, and Hiroshi Fujiki. 2005. “Aggregate vs. Disaggregate Data Analysis — A Paradox in the Estimation of a Money Demand Function of Japan under the Low Interest Rate Policy.” Journal of Applied Econometrics 20 (5): 579-601.
Jerzmanowski, Michal. 2007. “Total Factor Productivity Differences: Appropriate Technology vs. Efficiency.” European Economic Review 51 (8): 2080-110.
Johnson, Simon, William Larson, Chris Papageorgiou, and Arvind Subramanian. 2009. “Is Newer Better? Penn World Table Revisions and Their Impact on Growth Estimates.” NBER Working Papers No. 15455. NBER, Cambridge, MA.
Jorgenson, Dale W. 1961. “The Development of a Dual Economy.” Economic Journal 71 (282): 309-34.
Kao, Chihwa. 1999. “Spurious Regression and Residual-Based Tests for Cointegration in Panel Data.” Journal of Econometrics 65 (1): 9-15.
Kapetanios, George, M. Hashem Pesaran, and Takashi Yamagata. 2011. “Panels with Nonstationary Multifactor Error Structures.” Journal of Econometrics 160 (2): 326-48.
Klenow, Peter J., and Andres Rodriguez-Clare. 1997a. “Economic growth: A review essay.” Journal of Monetary Economics 40 (3): 597-617.
———. 1997b. “The Neoclassical Revival in Growth Economics: Has it Gone too Far?” NBER Macroeconomics Annual 12: 73-103.
Lee, Kevin, M. Hashem Pesaran, and Ron P. Smith.
1997. “Growth and Convergence in a Multi-Country Empirical Stochastic Solow Model.” Journal of Applied Econometrics 12 (4): 357-92.
Lewis, W. Arthur. 1954. “Economic Development with Unlimited Supplies of Labour.” The Manchester School 22: 139-91.
Lin, Justin Y. 2011. “New Structural Economics: A Framework for Rethinking Development.” World Bank Research Observer 26 (2): 193-221.
Mankiw, N. Gregory, David Romer, and David N. Weil. 1992. “A Contribution to the Empirics of Economic Growth.” The Quarterly Journal of Economics 107 (2): 407-37.
Marschak, Jacob, and William H. Andrews Jr. 1944. “Random Simultaneous Equations and the Theory of Production.” Econometrica 12 (3/4): 143-205.
Martin, Will, and Devashish Mitra. 2002. “Productivity Growth and Convergence in Agriculture versus Manufacturing.” Economic Development and Cultural Change 49 (2): 403-22.
McMillan, Margaret, and Dani Rodrik. 2011. “Globalization, Structural Change and Productivity Growth.” NBER Working Papers No. 17143, NBER, Cambridge, MA.
Moscone, Francesco, and Elisa Tosetti. 2009. “A Review and Comparison of Tests of Cross-Section Independence in Panels.” Journal of Economic Surveys 23 (3): 528-61.
———. 2010. “Health Expenditure and Income in the United States.” Health Economics 19 (12): 1385-403.
Mundlak, Yair. 1988. “Endogenous Technology and the Measurement of Productivity.” In Susan M. Capalbo and John M. Antle, eds., Agricultural Productivity: Measurement and Explanation. Washington, DC: Resources for the Future.
Mundlak, Yair, Donald Larson, and Rita Butzer. 1999. “Rethinking within and between Regressions: The Case of Agricultural Production Functions.” Annales D’Economie et de Statistique 55/56: 475-501.
Mundlak, Yair, Rita Butzer, and Donald Larson. 2012. “Heterogeneous Technology and Panel Data: The Case of the Agricultural Production Function.” Journal of Development Economics 99 (1): 139-49.
Murphy, Kevin M., Andrei Shleifer, and Robert W. Vishny. 1989.
“Industrialization and the Big Push.” Journal of Political Economy 97 (5): 1003-26.
Nelson, Charles R., and Charles R. Plosser. 1982. “Trends and Random Walks in Macroeconomic Time Series: Some Evidence and Implications.” Journal of Monetary Economics 10 (2): 139-62.
Onatski, Alexei. 2009. “Testing Hypotheses about the Number of Factors in Large Factor Models.” Econometrica 77 (5): 1447-79.
Page, John M. 2012. Aid, Structural Change and the Private Sector in Africa. UNU-WIDER Working Paper No. 2012/21, Helsinki, United Nations University.
Pedroni, Peter. 2007. “Social Capital, Barriers to Production and Capital Shares: Implications for the Importance of Parameter Heterogeneity from a Nonstationary Panel Approach.” Journal of Applied Econometrics 22 (2): 429-451.
Pesaran, M. Hashem. 2004. General diagnostic tests for cross section dependence in panels. IZA Discussion Paper No. 1240, Bonn, Institute for the Study of Labor.
———. 2006. “Estimation and Inference in Large Heterogeneous Panels With a Multifactor Error Structure.” Econometrica 74 (4): 967-1012.
———. 2007. “A Simple Panel Unit Root Test in the Presence of Cross-Section Dependence.” Journal of Applied Econometrics 22 (2): 265-312.
Pesaran, M. Hashem, Yongcheol Shin, and Ron P. Smith. 1999. “Pooled Mean Group Estimation of Dynamic Heterogeneous Panels.” Journal of the American Statistical Association 94 (446): 621-34.
Pesaran, M. Hashem, and Ron P. Smith. 1995. “Estimating Long-Run Relationships From Dynamic Heterogeneous Panels.” Journal of Econometrics 68 (1): 79-113.
Pesaran, M. Hashem, and Elisa Tosetti. 2011. “Large Panels with Common Factors and Spatial Correlations.” Journal of Econometrics 161 (2): 182-202.
Ranis, Gustav, and John Fei. 1961. “A Theory of Economic Development.” American Economic Review 51 (4): 533-556.
Robinson, Sherman. 1971. “Sources of Growth in Less Developed Countries: A Cross-Section Study.” The Quarterly Journal of Economics 85 (3): 391-408.
Solow, Robert M. 1956. “A Contribution to the Theory of Economic Growth.” The Quarterly Journal of Economics 70 (1): 65-94.
Stock, James H., and Mark W. Watson. 2002. “Macroeconomic Forecasting Using Diffusion Indexes.” Journal of Business & Economic Statistics 20 (2): 147-62.
Stoker, Thomas M. 1993. “Empirical Approaches to the Problem of Aggregation over Individuals.” Journal of Economic Literature 31 (4): 1827-74.
Swan, Trevor W. 1956. “Economic Growth and Capital Accumulation.” Economic Record 32 (2): 334-61.
Temple, Jonathan R. 2005. “Dual Economy Models: A Primer for Growth Economists.” The Manchester School 73 (4): 435-78.
Temple, Jonathan R., and Ludger Wößmann. 2006. “Dualism and Cross-Country Growth Regressions.” Journal of Economic Growth 11 (3): 187-228.
UNIDO. 2004. UNIDO INDSTAT Industrial Statistics Database. Vienna: UNIDO, United Nations Industrial Development Organisation. [http://www.unido.org/index.php?id=1000327]
Vollrath, Dietrich. 2009a. “The Dual Economy in Long-Run Development.” Journal of Economic Growth 14 (4): 287-312.
———. 2009b. “How Important are Dual Economy Effects for Aggregate Productivity?” Journal of Development Economics 88 (2): 325-334.
Westerlund, Joakim, and Jean-Pierre Urbain. 2011. Cross sectional averages or principal components? Maastricht: METEOR, Maastricht Research School of Economics of Technology and Organization, Working Paper No. 53.
World Bank. 2008. World Development Indicators. Online Database, The World Bank, Washington, DC. [http://data.worldbank.org/data-catalog/world-development-indicators].
Young, Alwyn. 1995. “The Tyranny of Numbers: Confronting the Statistical Realities of the East Asian Growth Experience.” The Quarterly Journal of Economics 110 (3): 641-80.

NOTES
1 We refer to ‘dual economy models’ as representing economies with two stylized sectors of production (agriculture and manufacturing).
‘Technology’ and ‘technology parameters’ refer to the coefficients on capital and labor in the production function model (elasticities with respect to capital and labor), not Total Factor Productivity (TFP) or its growth rate (technical/technological progress).
2 The quoted shares are from the WDI database (World Bank 2008). For comparison, the maximum share of oil revenue in GDP, computed as the difference between ‘industry share in GDP’ and ‘manufacturing share in GDP’ from the same database, yields the following ranges for some of the countries mentioned by Mankiw, Romer, and Weil (1992): Iran (12 to 51 percent), Kuwait (15 to 81 percent), Gabon (28 to 60 percent), and Saudi Arabia (29 to 67 percent).
3 Crucially, all changes in X* are instigated by the state variables, and with the exception of error, it is deemed ‘meaningless’ to think of any other factors driving inputs (Mundlak, Larson, and Butzer, 1999).
4 For simplicity, the exposition in Mundlak, Butzer, and Larson (2012) is limited to a static model.
5 u0it and εjit are white noise.
6 Mundlak, Butzer, and Larson (2012) refer to the presence of state variables in both equations as technology ‘heterogeneity.’ Our use of the term differs from theirs because we refer to βi ≠ β as technology heterogeneity.
7 The between-country regressions further include time-invariant proxies for countries’ physical environment.
8 Between-time and between-country estimates are also provided, but the 2FE results are the focus of attention.
9 Further, f⋅mt is a subset of ft, and the error terms εit, vmit, ωt and υt are white noise.
10 Heterogeneity over time will be addressed in section IV.
11 Others, namely, gt, are specific to the input evolution.
12 A detailed review of the important contribution of factor models to empirical macroeconometrics is beyond the scope of this study. See Stock and Watson (2002), Bai and Ng (2008), and Onatski (2009) for details.
13 The shock can never be truly idiosyncratic; m0it differs for each country i at each point in time t. We consider this assumption reasonable given the interconnectedness of economies.
14 Abbreviations: POLS, Pooled OLS; 2FE, 2-way Fixed Effects; GMM, Arellano and Bond (1991) Difference GMM and Blundell and Bond (1998) System GMM; MG, Pesaran and Smith (1995) Mean Group estimator (with linear country trends); FDMG, the MG estimator with variables in first differences and country drifts; PMG, Pesaran, Shin, and Smith (1999) Pooled Mean Group estimator; CPMG, the PMG estimator augmented with cross-section averages following Binder and Offermanns (2007); CCEP/CMG, Pesaran (2006) Common Correlated Effects estimators. Note that our POLS model is augmented with T-1 year dummies.
15 GMM, PMG, and CPMG estimation was based on an error correction model specification; see Pesaran, Shin, and Smith (1999) for details. Further discussion of the empirical setup and results is available on request.
16 We abstain from discussing the standard panel estimators here in great detail and refer to the articles by Coakley, Fuertes, and Smith (2006), Bond and Eberhardt (2009), and Bond (2002) for more information. We also investigate the Pooled Mean Group (PMG) estimator by Pesaran, Shin, and Smith (1999) as well as a simple extension to the PMG in which we include cross-section averages of the dependent and independent variables (CPMG), as suggested in Binder and Offermanns (2007).
17 Although yt and eit are not independent, their correlation goes to zero as N becomes larger.
18 Thus, in the MG version, we have N individual country regressions with 2k + 2 RHS variables, and in the pooled version, there is a single regression equation with k + N (k + 2) RHS variables.
19 Most conservatively, the CCE estimators require λ ≠ 0: the impact of each factor is, on average, non-zero (Coakley, Fuertes, and Smith 2006).
Alternative scenarios (see Pesaran 2006; Kapetanios, Pesaran, and Yamagata 2011) allow for this assumption to be dropped in certain situations, but for the sake of generality, we maintain it here.
20 An alternative approach to empirically implementing equation (4) is to estimate factors, factor loadings, and slope coefficients jointly, as in the estimators developed in Bai and Kao (2006) and Bai, Kao, and Ng (2009). Computational complexity aside, two recent theoretical contributions support the Pesaran (2006) approach adopted in this study. Theoretical work by Westerlund and Urbain (2011: 17f) compares the two approaches and concludes that “one is unlikely to do better than when using the relatively simple CA [cross-sectional average augmentation] approach.” Similarly, a study by Bailey, Kapetanios, and Pesaran (2012: 25) concludes that the methods used to determine the number of strong factors on which the approach by Bai and co-authors relies are “invalid and will select the wrong number of factors, even asymptotically.”
21 We sum the values for value-added, capital stock (both in per worker terms), and labor and then take logarithms.
22 We are, of course, aware that the difference in deflation between our sectoral and stylized aggregated data, on the one hand, and PWT, on the other hand, makes them conceptually very different measures of growth and development. The aggregated data emphasize tradable goods production, whereas the PWT data equally emphasize tradable and non-tradable goods and services. However, we believe that these differences are comparatively unimportant for the purposes of estimation and inference in comparison to the distortions introduced by neglecting the sectoral makeup and technology heterogeneity of economies at different stages of economic development.
23 We do not account for missing observations in any way.
The preferred empirical specifications presented below are based on heterogeneous parameter models, in which (arguably) the lack of balance (25 percent of observations in the balanced panel are missing) is less relevant than in the homogeneous models because of the averaging of estimates.
24 If the correlation is caused by the same factors as those present in the inputs, the situation is altogether more serious than mere lack of efficiency, namely, that β might be unidentified. Residual diagnostics and their importance for empirical modeling are discussed in more detail in Eberhardt and Teal (2011a) and Banerjee, Eberhardt, and Reade (2010).
25 This computation is based on statistically significant parameters only: $\hat{\beta}_L = 1 - \hat{\beta}_K - \hat{\beta}_N + \hat{\beta}_{RS}$, where $\hat{\beta}_{RS}$ is the log labor coefficient discussed above. If any of $\hat{\beta}_{RS}$, $\hat{\beta}_K$, or $\hat{\beta}_N$ is insignificant, it is omitted from this calculation; if all parameters are insignificant, we report ‘not applicable’ (n/a).
26 The implication is that these empirical results are potentially spurious. We conduct a number of robustness checks adding further covariates in the agriculture equations (livestock per worker, fertilizer per worker) in the pooled regression framework. Results (available on request) do not change from those presented above. We also conduct robustness checks to include human capital in the estimation equation of both sectors. The results are presented in supplemental appendix S4 (see also discussion below).
27 We use robust regression to produce a robust estimate of the mean; see Hamilton (1992) and Eberhardt (2012) for details.
28 We further implement alternative specifications for both sectors that include the level and squared human capital terms (average years of schooling in the adult population) as additional covariates (see supplemental appendix S4). In the agriculture data, augmentation with human capital does not lead to statistically significant results (not reported).
Manufacturing results for the MG and FDMG mirror those in the unaugmented models presented above. For the standard CMG models, we find capital coefficients somewhat below those in the unaugmented models but within each other’s 95 percent confidence intervals (we do not estimate the ‘alternative CMG estimator’ with human capital because we encounter a dimensionality problem due to the large number of covariates). Average education coefficients are significant and indicate high returns to education in manufacturing: 11 percent and 12 percent in the unrestricted and CRS models, respectively.
29 It can be argued that the CCE approach accounts for the induced bias for systematic distortion of the land variable. In Eberhardt, Helmers, and Strauss (forthcoming), we suggest that similar ‘mismeasurement’ of research and development investments leading to ‘expensing’ and ‘double-counting’ bias can be addressed in a common factor approach to the Griliches knowledge production function.
30 The supplemental appendix (S3) also contains details of an extensive simulation exercise in which we formulate a number of production technologies for agriculture and manufacturing, reflecting our insights into the effects of parameter heterogeneity, variable nonstationarity, and cross-section dependence and analyze stylized aggregate data constructed from these two sectors. This exercise suggests that, more than any other feature, the introduction of common factors (even different ones across sectors) creates the largest problems in the aggregate empirical results.
31 As a further robustness check, we ran regressions where, rather than aggregating the data, we forced manufacturing and agriculture production to follow the same technology using cross-equation restrictions. Results (available on request) did not differ qualitatively from the aggregated results presented above.
Additionally, we estimated dynamic pooled models, introducing the PMG and CPMG estimators (for the results, see supplemental appendix S4). All of these results confirm the patterns across the sectoral and aggregated data described above.
32 The importance of correctly specified technology heterogeneity in the presence of nonstationary processes is discussed in detail in Eberhardt and Teal (2011a: 139f).
33 This is akin to ignoring common factors when these drive both y and x; see Eberhardt and Teal (2011a: 137f).
34 We exclude the most extreme outliers from this plot using the following rule: we run a robust regression of the capital coefficients on mean income pc (in logs), reported in the note to figure 1, further computing the weights assigned to each observation by the algorithm. Countries with weights below 0.5 are then excluded (five countries in the agriculture and one country in the manufacturing sample).
35 We also replaced the mean income variable in this analysis with a number of proxies for institutions and ‘social capital,’ provided and investigated by Hall and Jones (1999). The patterns and significance levels for the correlations between sectoral capital coefficients and these alternative variables were very similar to those for the income correlations presented above.
36 Note that whether this refers to true technology heterogeneity or simply greater bias in the country regression for agriculture cannot be determined in this context.
37 Following the example in our main results, we use robust means for the heterogeneous parameter models.
38 Data are available at http://go.worldbank.org/FS3FXW7461. All data discussed in this appendix are linked at http://sites.google.com/site/medevecon/devecondata. Stata code for empirical estimators and tests is available from SSC: pescadf, xtmg, xtcd. See also Eberhardt (2012) on xtmg.
39 In detail, we apply exchange rates of 1.210246384 for AUT, 1.207133927 for BEL, 1.55504706 for FIN, 1.204635181 for FRA, 2.149653527 for GRC, 1.302645017 for IRL, 1.616114954 for ITA, 1.210203555 for NLD, and 1.406350856 for PRT. See table A1 for country codes.

SUPPLEMENTAL APPENDIX
Structural Change and Cross-Country Growth Empirics
World Bank Economic Review
by Markus Eberhardt¹ and Francis Teal

Contents
S1 Time-series properties of the data
S2 Cross-section dependence in the data
S3 Monte Carlo Simulations
S3.1 Data Generating Process
S3.2 Overview of results
S3.3 Detailed results
S4 Additional tables and figures
References

List of Tables
1 Second generation panel unit root tests
2 Cross-section correlation analysis
3 Simulation results
4 Pooled regression models (HC-augmented)
5 Heterogeneous Manufacturing models (HC-augmented)
6 Aggregate & PWT data: Pooled models (HC-augmented)
7 Aggregate & PWT data: Heterogeneous models with HC
8 Alternative dynamic panel estimators

List of Figures
1 Box plots — Simulation results

¹ Corresponding author: School of Economics, University of Nottingham, Room C6, Sir Clive Granger Building, University Park, Nottingham NG7 2RD, UK.
Email: markus.eberhardt@nottingham.ac.uk, Website: http://sites.google.com/site/medevecon

S1 Time-series properties of the data

Table 1: Second generation panel unit root tests

Each cell reports the Ztbar statistic with its p-value in parentheses.

Panel (A): Agriculture data

| lags | Levels: log VA pw | Levels: log Labour | Levels: log Cap pw | Levels: log Land pw | Growth rates: VA pw | Growth rates: Labour | Growth rates: Cap pw | Growth rates: Land pw |
|---|---|---|---|---|---|---|---|---|
| 0 | -0.93 (0.18) | 7.88 (1.00) | 7.14 (1.00) | 9.15 (1.00) | -16.11 (0.00) | 1.01 (0.84) | -1.63 (0.05) | -10.40 (0.00) |
| 1 | -1.25 (0.11) | 5.94 (1.00) | 3.03 (1.00) | 6.34 (1.00) | -10.88 (0.00) | 2.66 (1.00) | -1.10 (0.14) | -3.05 (0.00) |
| 2 | 2.23 (0.99) | 7.65 (1.00) | 4.78 (1.00) | 5.48 (1.00) | -5.82 (0.00) | 5.94 (1.00) | 3.49 (1.00) | -0.17 (0.43) |
| 3 | 4.18 (1.00) | 9.18 (1.00) | 4.80 (1.00) | 3.42 (1.00) | -2.09 (0.02) | 6.64 (1.00) | 4.48 (1.00) | 2.65 (1.00) |

Panel (B): Manufacturing data

| lags | Levels: log VA pw | Levels: log Labour | Levels: log Cap pw | Growth rates: VA pw | Growth rates: Labour | Growth rates: Cap pw |
|---|---|---|---|---|---|---|
| 0 | 0.57 (0.72) | 2.05 (0.98) | 1.61 (0.95) | -18.64 (0.00) | -11.52 (0.00) | -9.27 (0.00) |
| 1 | 1.69 (0.95) | 1.12 (0.87) | 0.28 (0.61) | -9.58 (0.00) | -7.76 (0.00) | -5.71 (0.00) |
| 2 | 1.68 (0.95) | 3.52 (1.00) | 1.62 (0.95) | -4.61 (0.00) | -4.36 (0.00) | -2.94 (0.00) |
| 3 | 3.00 (1.00) | 3.08 (1.00) | 2.75 (1.00) | -1.50 (0.07) | -0.81 (0.21) | 0.23 (0.59) |

Panel (C): Aggregated data

| lags | Levels: log VA pw | Levels: log Labour | Levels: log Cap pw | Growth rates: VA pw | Growth rates: Labour | Growth rates: Cap pw |
|---|---|---|---|---|---|---|
| 0 | 2.29 (0.99) | 5.90 (1.00) | 6.41 (1.00) | -15.30 (0.00) | -5.25 (0.00) | -4.01 (0.00) |
| 1 | 2.28 (0.99) | 3.84 (1.00) | 3.00 (1.00) | -9.45 (0.00) | -2.38 (0.01) | -1.78 (0.04) |
| 2 | 4.43 (1.00) | 4.76 (1.00) | 3.51 (1.00) | -3.90 (0.00) | -0.52 (0.30) | 0.49 (0.69) |
| 3 | 4.89 (1.00) | 4.75 (1.00) | 3.77 (1.00) | -1.24 (0.11) | 1.87 (0.97) | 2.89 (1.00) |

Panel (D): Penn World Table data

| lags | Levels: log VA pw | Levels: log Labour | Levels: log Cap pw | Growth rates: VA pw | Growth rates: Labour | Growth rates: Cap pw |
|---|---|---|---|---|---|---|
| 0 | 5.05 (1.00) | -2.57 (0.01) | 2.27 (0.99) | -14.49 (0.00) | 0.46 (0.68) | -4.73 (0.00) |
| 1 | 5.81 (1.00) | 5.78 (1.00) | 5.26 (1.00) | -7.32 (0.00) | -2.91 (0.00) | -3.19 (0.00) |
| 2 | 6.10 (1.00) | 6.93 (1.00) | 6.26 (1.00) | -4.99 (0.00) | 1.06 (0.86) | -2.48 (0.01) |
| 3 | 7.62 (1.00) | 6.26 (1.00) | 6.74 (1.00) | -1.78 (0.04) | 1.52 (0.94) | -1.20 (0.12) |

Notes: We report test statistics and p-values for the Pesaran (2007) CIPS panel unit root test of the variables in our four datasets. In all cases we use N = 40, n = 918 for the levels data. ‘Lags’ refers to the augmentation with lagged dependent variables (Augmented Dickey-Fuller test).

S2 Cross-section dependence in the data

Table 2: Cross-section correlation analysis

| Variable | Levels: ρ̄ | Levels: \|ρ̄\| | Levels: CD | Levels: (p) | FD: ρ̄ | FD: \|ρ̄\| | FD: CD | FD: (p) |
|---|---|---|---|---|---|---|---|---|
| Agriculture | | | | | | | | |
| log VA pw | 0.33 | 0.51 | 42.42 | 0.00 | 0.05 | 0.23 | 6.32 | 0.00 |
| log Labour | 0.00 | 0.80 | 0.94 | 0.35 | 0.07 | 0.56 | 8.55 | 0.00 |
| log Capital pw | 0.41 | 0.71 | 51.52 | 0.00 | 0.08 | 0.41 | 8.86 | 0.00 |
| log Land pw | 0.02 | 0.67 | 3.57 | 0.00 | 0.02 | 0.29 | 2.91 | 0.00 |
| Manufacturing | | | | | | | | |
| log VA pw | 0.39 | 0.59 | 49.87 | 0.00 | 0.05 | 0.22 | 6.19 | 0.00 |
| log Labour | 0.15 | 0.62 | 18.98 | 0.00 | 0.14 | 0.26 | 17.31 | 0.00 |
| log Capital pw | 0.59 | 0.76 | 74.15 | 0.00 | 0.07 | 0.22 | 8.01 | 0.00 |
| Aggregated | | | | | | | | |
| log VA pw | 0.55 | 0.67 | 69.67 | 0.00 | 0.08 | 0.23 | 10.18 | 0.00 |
| log Labour | 0.04 | 0.71 | 5.50 | 0.00 | 0.07 | 0.32 | 7.93 | 0.00 |
| log Capital pw | 0.76 | 0.85 | 94.70 | 0.00 | 0.07 | 0.29 | 7.78 | 0.00 |
| PWT | | | | | | | | |
| log VA pw | 0.58 | 0.72 | 72.20 | 0.00 | 0.14 | 0.24 | 17.08 | 0.00 |
| log Labour | 0.94 | 0.94 | 114.37 | 0.00 | 0.05 | 0.39 | 6.21 | 0.00 |
| log Capital pw | 0.70 | 0.88 | 87.01 | 0.00 | 0.26 | 0.37 | 31.57 | 0.00 |

Notes: We report the average correlation coefficient across the N(N − 1) variable series, ρ̄, as well as the average absolute correlation coefficient |ρ̄|. CD is the formal cross-section correlation test introduced by Pesaran (2004). Under the H0 of cross-section independence its statistic is asymptotically standard normal. We use our regression sample N = 40, n = 918 for the levels data.
The same sample is used for the first difference data (n = 884) with the exception of the PWT analysis: here we are forced to drop the series for CYP to be able to compute correlation coefficients.

S3 Monte Carlo Simulations

S3.1 Data Generating Process

We run M = 1,000 replications of the following DGP for N = 50 cross-section elements and T = 30 time periods. Our basic setup for the DGP closely follows that of Kapetanios, Pesaran, and Yamagata (2011), albeit with a single regressor rather than two. For notational simplicity we do not identify the different sectors (agriculture and manufacturing) in the following, but all processes and variables are created independently across sectors, unless otherwise indicated.

$$y_{it} = \beta_i x_{it} + u_{it}, \qquad u_{it} = \alpha_i + \lambda^y_{i1} f_{1t} + \lambda^y_{i2} f_{2t} + \varepsilon_{it} \tag{1}$$

$$x_{it} = a_{i1} + a_{i2} d_t + \lambda^x_{i1} f_{1t} + \lambda^x_{i3} f_{3t} + v_{it} \tag{2}$$

for i = 1, ..., N unless indicated below and t = 1, ..., T. The common deterministic trend term ($d_t$) and individual-specific errors for the x-equation are zero-mean independent AR(1) processes defined as

$$d_t = 0.5\, d_{t-1} + \upsilon_{dt}, \qquad \upsilon_{dt} \sim N(0, 0.75), \qquad t = -48, \ldots, 1, \ldots, T, \qquad d_{-49} = 0$$

$$v_{it} = \rho_{vi} v_{i,t-1} + \upsilon_{it}, \qquad \upsilon_{it} \sim N(0, (1 - \rho_{vi}^2)), \qquad t = -48, \ldots, 1, \ldots, T, \qquad v_{i,-49} = 0$$

where $\rho_{vi} \sim U[0.05, 0.95]$. The common factors are nonstationary processes

$$f_{jt} = \mu_j + f_{j,t-1} + \upsilon_{ft}, \qquad j = 1, 2, 3, \qquad \upsilon_{ft} \sim N(0, 1), \qquad t = -49, \ldots, 1, \ldots, T \tag{3}$$

$$\mu^a_j = \{0.01, 0.008, 0.005\}, \qquad \mu^m_j = \{0.015, 0.012, 0.01\}, \qquad f_{j,-50} = 0$$

where we deviate from the Kapetanios et al. (2011) setup by including drift terms. Unless indicated the sets of common factors differ between sectors. Innovations to y are generated as a mix of heterogeneous AR(1) and MA(1) errors

$$\varepsilon_{it} = \rho_{i\varepsilon} \varepsilon_{i,t-1} + \sigma_i \sqrt{1 - \rho_{i\varepsilon}^2}\, \omega_{it}, \qquad i = 1, \ldots, N_1, \qquad t = -48, \ldots, 0, \ldots, T$$

$$\varepsilon_{it} = \frac{\sigma_i}{\sqrt{1 + \theta_{i\varepsilon}^2}} \left( \omega_{it} + \theta_{i\varepsilon} \omega_{i,t-1} \right), \qquad i = N_1 + 1, \ldots, N, \qquad t = -48, \ldots, 0, \ldots, T$$

where $N_1$ is the nearest integer to N/2 and $\omega_{it} \sim N(0, 1)$, $\sigma_i^2 \sim U[0.5, 1.5]$, $\rho_{i\varepsilon} \sim U[0.05, 0.95]$, and $\theta_{i\varepsilon} \sim U[0, 1]$. $\rho_{vi}$, $\rho_{i\varepsilon}$, $\theta_{i\varepsilon}$ and $\sigma_i$ do not change across replications. Initial values are set to zero and the first 50 observations are discarded for all of the above.

Regarding parameter values, $\alpha_i \sim N(2, 1)$ and $a_{i1}, a_{i2} \sim \text{iid } N(0.5, 0.5)$ do not change across replications. To begin with, TFP levels $\alpha_i$ are specified to be the same across sectors. The slope coefficient β can vary across countries and across sectors (see below). In case of cross-country heterogeneity we have $\beta_i = \beta + \eta_i$ with $\eta_i \sim N(0, 0.04)$. If the mean of the slope coefficient β is the same across sectors we specify β = 0.5, otherwise $\beta^a = 0.5$ and $\beta^m = 0.3$ for agriculture and manufacturing respectively. The factor loadings may be heterogeneous and are distributed

$$\lambda^x_{i1} \sim N(0.5, 0.5) \quad \text{and} \quad \lambda^x_{i3} \sim N(0.5, 0.5) \tag{4}$$

$$\lambda^y_{i1} \sim N(1, 0.2) \quad \text{and} \quad \lambda^y_{i2} \sim N(1, 0.2) \tag{5}$$

The above represents our basic DGP for the simulations carried out. We investigate the following ten models (the focus is on those marked with stars):
(1) Cross-country homogeneity (β) and no factors. We set all $\lambda_i$ to zero such that x and y are stationary and cross-sectionally independent; technology is the same across countries and sectors.
(2) As Model (1) but now we have heterogeneous β across countries.
(3) As Model (2) but with substantially larger heterogeneity in TFP levels across countries.
(4) As Model (2) but TFP levels in manufacturing are now 1.5 times those in agriculture. We keep this feature for the remainder of setups.
(5) This sees the introduction of common factors ($f_{2t}$ and $f_{3t}$) albeit with homogeneous factor loadings across countries. Both factors and loadings are independent across sectors. The absence of $f_{1t}$ means there is no endogeneity problem.
(6) As Model (5) but now we have factor loading heterogeneity across countries.
(7) As Model (6) but with factor-overlap between x and y equations: $f_{1t}$ is contained in both of these, inducing endogeneity in a sectoral regression.
(8) As Model (7) but slope coefficients now differ across countries and sectors; for the latter we specify $\beta^m_i = 1 - \beta^a_i$.
(9) As Model (8) except we now have independent slope coefficients across sectors with means $\beta^m = 0.3$ and $\beta^a = 0.5$.
(10) As Model (9) but we now have the same factor $f_{1t}$ contained in y and x-equations of both sectors, although with differential (and independent) factor loadings.

Models (1) to (4) analyse a homogeneous parameter world without common factors, where aggregation should lead to no problems for estimation. Models (5) to (7) show what happens when factors are introduced. Models (8) and (9) introduce parameter heterogeneity across sectors and Model (10) adds factor-overlap between sectors (on top of overlap across variables within sector).

S3.2 Overview of results

Figure 1: Box plots — Simulation results

Notes: We present box plots for the M = 1,000 estimates using various estimators under 4 DGP setups. In all cases the true coefficient is subtracted from the estimates, such that the plots are centred around zero. The estimators are as follows:
‘CMG Agri’ and ‘CMG Manu’: Pesaran (2006) CMG regressions on the sector-level data;
Weighted: this is not an estimator but the weighted average $\beta^a s^a_i + \beta^m s^m_i$, with $\beta^j$ the mean sectoral slope coefficient and $s^j$ the sectoral share of total output;
the remaining four estimators use the aggregated data:
OLS: pooled OLS with T − 1 year dummies;
2FE: OLS with country and time dummies;
FD: OLS with variables in first differences (incl. time dummies);
CMG: Pesaran (2006) CMG.
We omit the results for the Pesaran and Smith (1995) MG estimator as these are very imprecise and would counter the readability of the graphs. The MC setups are described in detail in Section S3.1 of the Appendix.
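To make the data generating process of Section S3.1 concrete, the following Python sketch simulates one sector's panel under the baseline setup (equations (1) to (5)). It is an illustration only, not the authors' simulation code: the function name `simulate_sector` and its arguments are hypothetical, the second argument of each N(·,·) is read as a variance, and country-specific parameters are redrawn on every call rather than held fixed across replications.

```python
import numpy as np

def simulate_sector(N=50, T=30, burn=50, beta_mean=0.5,
                    drifts=(0.01, 0.008, 0.005), seed=0):
    """Simulate one sector's (y, x) panel; arrays have shape (T, N)."""
    rng = np.random.default_rng(seed)
    S = T + burn  # total periods, including the discarded burn-in

    # Common factors f_1, f_2, f_3: random walks with drift, eq. (3).
    f = np.cumsum(np.asarray(drifts) + rng.standard_normal((S, 3)), axis=0)

    # d_t: AR(1), coefficient 0.5, innovation variance 0.75 (assumed variance).
    d = np.zeros(S)
    e_d = rng.normal(0.0, np.sqrt(0.75), S)
    # v_it: AR(1) with rho_vi ~ U[0.05, 0.95], innovation variance 1 - rho^2.
    rho_v = rng.uniform(0.05, 0.95, N)
    v = np.zeros((S, N))
    e_v = rng.standard_normal((S, N)) * np.sqrt(1.0 - rho_v**2)

    # eps_it: AR(1) errors for i <= N1, MA(1) errors for i > N1.
    N1 = round(N / 2)
    sigma = np.sqrt(rng.uniform(0.5, 1.5, N))
    rho_e = rng.uniform(0.05, 0.95, N)
    theta = rng.uniform(0.0, 1.0, N)
    omega = rng.standard_normal((S, N))
    eps = np.zeros((S, N))
    for t in range(1, S):  # initial values stay at zero
        d[t] = 0.5 * d[t - 1] + e_d[t]
        v[t] = rho_v * v[t - 1] + e_v[t]
        eps[t, :N1] = (rho_e[:N1] * eps[t - 1, :N1]
                       + sigma[:N1] * np.sqrt(1.0 - rho_e[:N1]**2) * omega[t, :N1])
    eps[1:, N1:] = (sigma[N1:] / np.sqrt(1.0 + theta[N1:]**2)
                    * (omega[1:, N1:] + theta[N1:] * omega[:-1, N1:]))

    # Country-specific parameters and factor loadings, eqs. (4) and (5).
    alpha = rng.normal(2.0, 1.0, N)
    a1 = rng.normal(0.5, np.sqrt(0.5), N)
    a2 = rng.normal(0.5, np.sqrt(0.5), N)
    beta = beta_mean + rng.normal(0.0, np.sqrt(0.04), N)
    lam_x1 = rng.normal(0.5, np.sqrt(0.5), N)
    lam_x3 = rng.normal(0.5, np.sqrt(0.5), N)
    lam_y1 = rng.normal(1.0, np.sqrt(0.2), N)
    lam_y2 = rng.normal(1.0, np.sqrt(0.2), N)

    # Equation (2) for x, then equation (1) for y.
    x = a1 + a2 * d[:, None] + lam_x1 * f[:, [0]] + lam_x3 * f[:, [2]] + v
    y = beta * x + alpha + lam_y1 * f[:, [0]] + lam_y2 * f[:, [1]] + eps
    return y[burn:], x[burn:], beta
```

Calling `simulate_sector(beta_mean=0.3, drifts=(0.015, 0.012, 0.01), seed=2)` would give a manufacturing-style panel under the same assumptions; because $f_{1t}$ enters both equations, a sectoral regression of y on x that ignores the factors is subject to the endogeneity described for Model (7).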
S3.3 Detailed results

Table 3: Simulation results

| Estimator | Model 1: mean | median | ste• | ste | Model 2: mean | median | ste• | ste |
|---|---|---|---|---|---|---|---|---|
| CMG Agri | 0.4999 | 0.4990 | 0.0318 | 0.0324 | 0.5007 | 0.4996 | 0.0425 | 0.0424 |
| CMG Manu | 0.4999 | 0.4990 | 0.0318 | 0.0324 | 0.5007 | 0.4996 | 0.0425 | 0.0424 |
| Weighted | 0.5000 | 0.5000 | 0.0000 | | 0.5007 | 0.4998 | 0.0289 | |
| POLS | 0.5054 | 0.5064 | 0.0462 | 0.0298 | 0.5058 | 0.5065 | 0.0572 | 0.0304 |
| 2FE | 0.5002 | 0.5005 | 0.0248 | 0.0226 | 0.5014 | 0.5007 | 0.0392 | 0.0232 |
| FD | 0.5000 | 0.5007 | 0.0295 | 0.0257 | 0.5014 | 0.5014 | 0.0441 | 0.0262 |
| CCEP | 0.4996 | 0.4997 | 0.0292 | 0.0271 | 0.5008 | 0.5001 | 0.0424 | 0.0276 |
| MG | 0.4993 | 0.4987 | 0.0276 | 0.0283 | 0.5001 | 0.4993 | 0.0389 | 0.0399 |
| CMG | 0.4999 | 0.4990 | 0.0318 | 0.0324 | 0.5007 | 0.4996 | 0.0425 | 0.0424 |

| Estimator | Model 3: mean | median | ste• | ste | Model 4: mean | median | ste• | ste |
|---|---|---|---|---|---|---|---|---|
| CMG Agri | 0.4999 | 0.4990 | 0.0318 | 0.0324 | 0.4999 | 0.4990 | 0.0318 | 0.0324 |
| CMG Manu | 0.4999 | 0.4990 | 0.0318 | 0.0324 | 0.4999 | 0.4990 | 0.0318 | 0.0324 |
| Weighted | 0.5000 | 0.5000 | 0.0000 | | 0.5000 | 0.5000 | 0.0000 | |
| POLS | 0.5310 | 0.5280 | 0.1968 | 0.1128 | 0.5119 | 0.5112 | 0.0593 | 0.0365 |
| 2FE | 0.5002 | 0.5005 | 0.0248 | 0.0226 | 0.5002 | 0.5005 | 0.0248 | 0.0226 |
| FD | 0.5000 | 0.5007 | 0.0295 | 0.0257 | 0.5000 | 0.5007 | 0.0295 | 0.0257 |
| CCEP | 0.4996 | 0.4997 | 0.0292 | 0.0271 | 0.4996 | 0.4997 | 0.0292 | 0.0271 |
| MG | 0.4993 | 0.4987 | 0.0276 | 0.0283 | 0.4993 | 0.4987 | 0.0276 | 0.0283 |
| CMG | 0.4999 | 0.4990 | 0.0318 | 0.0324 | 0.4999 | 0.4990 | 0.0318 | 0.0324 |

| Estimator | Model 5: mean | median | ste• | ste | Model 6: mean | median | ste• | ste |
|---|---|---|---|---|---|---|---|---|
| CMG Agri | 0.4993 | 0.4987 | 0.0299 | 0.0298 | 0.5005 | 0.5002 | 0.0238 | 0.0233 |
| CMG Manu | 0.5000 | 0.5014 | 0.0311 | 0.0321 | 0.4994 | 0.5004 | 0.0253 | 0.0246 |
| Weighted | 0.5000 | 0.5000 | 0.0000 | | 0.5000 | 0.5000 | 0.0000 | |
| POLS | 0.4936 | 0.4936 | 0.0753 | 0.0432 | 0.4558 | 0.4669 | 0.1059 | 0.0197 |
| 2FE | 0.4563 | 0.4571 | 0.0331 | 0.0266 | 0.4382 | 0.4450 | 0.0588 | 0.0176 |
| FD | 0.4427 | 0.4416 | 0.0418 | 0.0268 | 0.4181 | 0.4224 | 0.0517 | 0.0219 |
| CCEP | 0.4516 | 0.4502 | 0.0327 | 0.0278 | 0.4231 | 0.4326 | 0.0522 | 0.0186 |
| MG | 0.4663 | 0.4687 | 0.3257 | 0.0369 | 0.4305 | 0.4333 | 0.1816 | 0.0496 |
| CMG | 0.4498 | 0.4497 | 0.0362 | 0.0379 | 0.4161 | 0.4226 | 0.0516 | 0.0342 |

| Estimator | Model 7: mean | median | ste• | ste | Model 8: mean | median | ste• | ste |
|---|---|---|---|---|---|---|---|---|
| CMG Agri | 0.5000 | 0.4998 | 0.0448 | 0.0436 | 0.5009 | 0.5020 | 0.0528 | 0.0520 |
| CMG Manu | 0.4979 | 0.4972 | 0.0454 | 0.0445 | 0.4986 | 0.4978 | 0.0550 | 0.0528 |
| Weighted | 0.5000 | 0.5000 | 0.0000 | | 0.5007 | 0.4998 | 0.0289 | |
| POLS | 0.4405 | 0.4469 | 0.1212 | 0.0236 | 0.4459 | 0.4452 | 0.1299 | 0.0248 |
| 2FE | 0.4143 | 0.4161 | 0.0700 | 0.0210 | 0.4217 | 0.4234 | 0.0807 | 0.0220 |
| FD | 0.4027 | 0.4011 | 0.0541 | 0.0238 | 0.4106 | 0.4073 | 0.0635 | 0.0245 |
| CCEP | 0.3956 | 0.3987 | 0.0619 | 0.0227 | 0.4040 | 0.4047 | 0.0702 | 0.0233 |
| MG | 0.6759 | 0.6585 | 0.2510 | 0.0782 | 0.6826 | 0.6644 | 0.2532 | 0.0828 |
| CMG | 0.3897 | 0.3928 | 0.0584 | 0.0496 | 0.3985 | 0.3976 | 0.0650 | 0.0560 |

| Estimator | Model 9: mean | median | ste• | ste | Model 10: mean | median | ste• | ste |
|---|---|---|---|---|---|---|---|---|
| CMG Agri | 0.5009 | 0.5020 | 0.0528 | 0.0520 | 0.5009 | 0.5020 | 0.0528 | 0.0520 |
| CMG Manu | 0.2961 | 0.2972 | 0.0543 | 0.0526 | 0.2961 | 0.2972 | 0.0543 | 0.0526 |
| Weighted | 0.3924 | 0.3928 | 0.0391 | | 0.3939 | 0.3946 | 0.0391 | |
| POLS | 0.3383 | 0.3388 | 0.1324 | 0.0246 | 0.3400 | 0.3415 | 0.1322 | 0.0246 |
| 2FE | 0.3151 | 0.3127 | 0.0814 | 0.0217 | 0.3163 | 0.3144 | 0.0816 | 0.0217 |
| FD | 0.3074 | 0.3053 | 0.0625 | 0.0242 | 0.3086 | 0.3071 | 0.0626 | 0.0242 |
| CCEP | 0.2963 | 0.2973 | 0.0666 | 0.0229 | 0.2976 | 0.2986 | 0.0667 | 0.0229 |
| MG | 0.5793 | 0.5562 | 0.2558 | 0.0814 | 0.5796 | 0.5561 | 0.2558 | 0.0815 |
| CMG | 0.2956 | 0.2962 | 0.0625 | 0.0543 | 0.2970 | 0.2976 | 0.0627 | 0.0544 |

Notes: See Section S3.1 in the Appendix for details on the estimators and the DGP in each of the experiments. ste• marks the empirical standard error and ste the mean standard error from 1,000 replications. ‘CMG Agri’ and ‘CMG Manu’ employ the sector-level data, ‘Weighted’ calculates the aggregate slope coefficient based on the size (output) and slope of the respective sector, the remaining six estimators use the aggregated data.
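As an illustration of how the four summary columns and the ‘Weighted’ benchmark in Table 3 can be obtained from Monte Carlo output, consider the minimal Python sketch below. The function names are hypothetical, the empirical standard error is taken to be the sample standard deviation of the estimates across replications (with `ddof=1`, an assumption), and the two sectoral output shares are assumed to sum to one.

```python
import numpy as np

def summarise_replications(estimates, std_errors):
    """Summarise M replications: mean, median, empirical and mean std. error."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)
    return {
        "mean": float(estimates.mean()),
        "median": float(np.median(estimates)),
        # 'ste•': dispersion of the point estimates across replications
        "ste_empirical": float(estimates.std(ddof=1)),
        # 'ste': average of the standard errors reported in each replication
        "ste_mean": float(std_errors.mean()),
    }

def weighted_slope(beta_a, beta_m, share_a):
    """Output-share-weighted slope; assumes the two shares sum to one."""
    return beta_a * share_a + beta_m * (1.0 - share_a)
```

A mean standard error well below the empirical one, as for POLS, 2FE, FD, and CCEP in Models 6 to 10, indicates that the reported standard errors understate the actual sampling variability of the estimator.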
S4 Additional tables and figures

Table 4: Pooled regression models (HC-augmented)

Cells report the coefficient followed by its t-statistic in brackets. Columns [1] to [5] use the agriculture data, columns [6] to [10] the manufacturing data.

Panel (A): Unrestricted returns to scale

| | [1] POLS | [2] 2FE | [3] CCEP | [4] CCEP | [5] FD | [6] POLS | [7] 2FE | [8] CCEP | [9] CCEP | [10] FD |
|---|---|---|---|---|---|---|---|---|---|---|
| log labour | -0.079 [11.71]∗∗ | -0.151 [4.35]∗∗ | -0.457 [1.54] | -0.557 [1.46] | -0.085 [1.46] | 0.005 [0.62] | 0.029 [0.88] | 0.121 [1.91] | -0.048 [0.47] | 0.162 [4.62]∗∗ |
| log capital pw | 0.471 [61.84]∗∗ | 0.671 [27.20]∗∗ | 0.554 [4.51]∗∗ | 0.676 [4.32]∗∗ | 0.595 [12.60]∗∗ | 0.692 [44.38]∗∗ | 0.851 [22.14]∗∗ | 0.533 [8.00]∗∗ | 0.446 [4.52]∗∗ | 0.654 [14.56]∗∗ |
| log land pw | 0.018 [1.17] | -0.020 [0.48] | -0.154 [0.56] | -0.174 [0.50] | 0.111 [1.14] | | | | | |
| Education | 0.241 [9.95]∗∗ | 0.087 [3.12]∗∗ | 0.007 [0.07] | -0.068 [0.40] | 0.101 [1.30] | 0.226 [11.91]∗∗ | -0.006 [0.21] | 0.152 [2.04]∗ | -0.017 [0.16] | 0.095 [1.53] |
| Education^2 | -0.010 [4.73]∗∗ | -0.007 [4.15]∗∗ | -0.003 [0.49] | 0.005 [0.50] | -0.006 [1.23] | -0.009 [6.22]∗∗ | 0.002 [1.39] | -0.006 [1.32] | -0.004 [0.66] | -0.005 [1.10] |
| Mean Education | 5.82 | 5.82 | 5.82 | 5.82 | 5.94 | 5.82 | 5.82 | 5.82 | 5.82 | 5.94 |
| Returns to Edu [t-statistic] | 13.3% [15.71]∗∗ | 0.7% [0.50] | -2.9% [0.68] | -0.7% [0.11] | 3.0% [0.78] | 12.3% [19.88]∗∗ | 1.9% [1.30] | 8.5% [3.11]∗∗ | -6.6% [1.56] | 4.1% [1.54] |
| ê integrated | I(1) | I(1) | I(0) | I(1)/I(0) | I(0) | I(1) | I(1) | I(0) | I(0) | I(0) |
| CD test p-value | 0.11 | 0.09 | 0.14 | 0.21 | 0.00 | 0.87 | 0.18 | 0.58 | 0.84 | 0.00 |
| R-squared | 0.91 | 0.57 | 1.00 | 1.00 | - | 0.91 | 0.57 | 1.00 | 1.00 | - |
| Observations | 830 | 830 | 830 | 775 | 793 | 860 | 860 | 860 | 775 | 817 |

Implied RS†: CRS CRS CRS CRS IRS CRS CRS CRS IRS
Implied β_L‡: 0.529 0.329 0.446 0.324 0.321 0.308 0.149 0.467 0.508

Panel (B): Constant returns to scale imposed

| | [1] POLS | [2] 2FE | [3] CCEP | [4] CCEP | [5] FD | [6] POLS | [7] 2FE | [8] CCEP | [9] CCEP | [10] FD |
|---|---|---|---|---|---|---|---|---|---|---|
| log capital pw | 0.502 [59.09]∗∗ | 0.720 [33.18]∗∗ | 0.592 [5.32]∗∗ | 0.709 [5.08]∗∗ | 0.611 [13.29]∗∗ | 0.695 [49.18]∗∗ | 0.839 [24.30]∗∗ | 0.472 [8.87]∗∗ | 0.463 [5.59]∗∗ | 0.558 [13.85]∗∗ |
| log land pw | 0.014 [0.71] | 0.078 [2.23]∗ | 0.144 [0.99] | 0.122 [0.69] | 0.124 [1.27] | | | | | |
| Education | 0.278 [11.54]∗∗ | 0.069 [2.48]∗ | -0.003 [0.03] | -0.031 [0.23] | 0.107 [1.38] | 0.226 [11.80]∗∗ | 0.014 [0.71] | 0.234 [3.67]∗∗ | 0.036 [0.38] | 0.220 [3.91]∗∗ |
| Education^2 | -0.012 [6.17]∗∗ | -0.005 [3.19]∗∗ | 0.000 [0.06] | 0.002 [0.28] | -0.006 [1.26] | -0.009 [6.11]∗∗ | 0.001 [0.98] | -0.010 [2.55]∗ | -0.007 [1.22] | -0.010 [2.41]∗ |
| Implied β_L‡ | 0.498 | 0.202 | 0.408 | 0.291 | 0.389 | 0.305 | 0.162 | 0.528 | 0.537 | 0.443 |
| Mean Education | 5.82 | 5.82 | 5.82 | 5.82 | 5.94 | 5.82 | 5.82 | 5.82 | 5.82 | 5.94 |
| Returns to Edu [t-statistic]♠ | 13.9% [16.25]∗∗ | 0.8% [0.52] | -0.7% [0.18] | -0.3% [0.07] | 3.4% [0.90] | 12.3% [20.20]∗∗ | 2.7% [2.30]∗ | 11.7% [5.25]∗∗ | -4.3% [1.18] | 10.5% [4.62]∗∗ |
| ê integrated | I(1) | I(1) | I(0) | I(1)/I(0) | I(0) | I(1) | I(1) | I(0) | I(1)/I(0) | I(0) |
| CD test p-value | 0.29 | 0.23 | 0.07 | 0.23 | 0.00 | 0.88 | 0.04 | 0.08 | 0.02 | 0.00 |
| R-squared | 0.91 | 0.57 | 1.00 | 1.00 | - | 0.91 | 0.57 | 1.00 | 1.00 | - |
| Observations | 830 | 830 | 830 | 775 | 793 | 860 | 860 | 860 | 775 | 817 |

Notes: We include our proxy for education in levels and as a squared term. Returns to Education are computed from the sample mean ($\bar{E}$) as $\hat{\beta}_E + 2\hat{\beta}_{E2}\bar{E}$, where $\hat{\beta}_E$ and $\hat{\beta}_{E2}$ are the coefficients on the levels and squared education terms respectively. ♠ computed via the delta-method. For more details see Notes of Table 1 of the main text.

Table 5: Heterogeneous Manufacturing models (HC-augmented)

Cells report the coefficient followed by its t-statistic in brackets. Columns [1] to [3] are Panel (A): Unrestricted; columns [4] to [6] are Panel (B): CRS imposed.

| | [1] MG | [2] FDMG | [3] CMG | [4] MG | [5] FDMG | [6] CMG |
|---|---|---|---|---|---|---|
| log labour | -0.305 [1.20] | -0.293 [1.50] | 0.097 [0.62] | | | |
| log capital pw | 0.059 [0.22] | 0.144 [0.74] | 0.426 [3.73]∗∗ | 0.352 [3.25]∗∗ | 0.347 [3.66]∗∗ | 0.386 [3.95]∗∗ |
| Education | -0.478 [1.02] | 0.237 [0.81] | 1.248 [2.66]∗ | -0.228 [0.62] | 0.085 [0.29] | 0.668 [2.43]∗ |
| Education squared | 0.050 [1.38] | 0.011 [0.35] | -0.098 [2.67]∗ | 0.005 [0.13] | -0.019 [0.67] | -0.042 [1.95] |
| country trend/drift | 0.016 [1.55] | 0.020 [2.44]∗ | | 0.008 [1.16] | 0.013 [2.23]∗ | |
| reject CRS (10%) | 38% | 8% | 38% | | | |
| Implied β_L‡ | n/a | 0.857 | 0.574 | 0.648 | 0.653 | 0.614 |
| Mean Education | 5.82 | 5.91 | 5.82 | 5.87 | 5.94 | 5.87 |
| Returns to Edu [t-statistic] | -6.3% [1.01] | -1.3% [0.25] | 10.9% [1.89] | -6.2% [1.00] | -2.1% [0.47] | 11.9% [1.70] |
| sign. trends (10%) | 15 | 9 | | 17 | 7 | |
| ê integrated | I(0) | I(0) | I(0) | I(0) | I(0) | I(0) |
| CD-test (p) | 0.00 | 0.00 | 0.71 | 0.00 | 0.00 | 0.27 |
| Obs (N) | 775 (37) | 732 (37) | 775 (37) | 775 (37) | 732 (37) | 775 (37) |

Notes: All averaged coefficients presented are robust means across i. The returns to education and associated t-statistics are based on a two-step procedure: first the country-specific mean education value ($\bar{E}_i$) is used to compute $\hat{\beta}_{i,E} + 2\hat{\beta}_{i,E2}\bar{E}_i$ to yield the country-specific returns to education. The reported value then represents the robust mean of these N country estimates, s.t. the t-statistic should be interpreted in the same fashion as that for the regressors, namely as a test whether the average parameter is statistically different from zero, following Pesaran and Smith (1995). For other details see Notes for Tables 2 (main text) and 4 (above).

Table 6: Aggregate & PWT data: Pooled models (HC-augmented)

Cells report the coefficient followed by its t-statistic in brackets. Columns [1] to [4] use the aggregated data, columns [5] to [8] the Penn World Table data.

Panel (A): Unrestricted returns

| | [1] POLS | [2] 2FE | [3] CCEP | [4] FD | [5] POLS | [6] 2FE | [7] CCEP | [8] FD |
|---|---|---|---|---|---|---|---|---|
| log labour | -0.001 [0.14] | -0.058 [1.97]∗ | 0.566 [4.13]∗∗ | 0.083 [2.50]∗ | 0.040 [8.99]∗∗ | -0.064 [3.27]∗∗ | -0.193 [1.49] | -0.032 [1.11] |
| log capital pw | 0.662 [97.95]∗∗ | 0.782 [31.50]∗∗ | 0.677 [7.25]∗∗ | 0.766 [25.24]∗∗ | 0.725 [72.79]∗∗ | 0.680 [24.79]∗∗ | 0.601 [9.12]∗∗ | 0.676 [18.96]∗∗ |
| Education | 0.243 [16.97]∗∗ | -0.004 [0.15] | 0.086 [1.24] | 0.065 [1.22] | 0.041 [3.42]∗∗ | 0.043 [2.86]∗∗ | 0.032 [0.80] | 0.103 [3.41]∗∗ |
| Education squared | -0.010 [8.05]∗∗ | 0.003 [1.82] | -0.007 [1.57] | -0.003 [0.77] | -0.001 [1.77] | -0.002 [2.97]∗∗ | -0.002 [0.83] | -0.006 [2.94]∗∗ |
| Implied RS† | CRS | DRS | CRS | CRS | CRS | DRS | CRS | CRS |
| Implied β_L‡ | 0.337 | 0.160 | 0.890 | 0.318 | 0.315 | 0.256 | 0.206 | 0.292 |
| Mean Education | 5.824 | 5.824 | 5.824 | 5.885 | 5.822 | 5.822 | 5.822 | 5.883 |
| Returns to Edu [t-statistic] | 12.9% [22.35]∗∗ | 2.5% [1.68] | 1.0% [0.37] | 3.4% [1.40] | 2.4% [6.82]∗∗ | 1.9% [2.02]∗ | 0.9% [0.56] | 3.3% [2.26]∗ |
| ê integrated | I(1) | I(1) | I(0) | I(0) | I(1) | I(1) | I(0) | I(1)/I(0) |
| CD test p-value | 0.00 | 0.02 | 0.59 | 0.00 | 0.34 | 0.22 | 0.01 | 0.00 |
R-squared 0.98 0.87 1.00 - 0.97 0.78 1.00 - Observations 775 775 775 732 769 769 769 726 Panel (B): Constant returns to scale imposed Aggregated data Penn World Table data [1] [2] [3] [4] [5] [6] [7] [8] POLS 2FE CCEP FD POLS 2FE CCEP FD log capital pw 0.662 0.798 0.485 0.744 0.694 0.706 0.611 0.691 [102.10]∗∗ [35.45]∗∗ [7.03]∗∗ [25.48]∗∗ [73.08]∗∗ [27.73]∗∗ [10.05]∗∗ [21.13]∗∗ Education 0.243 -0.016 0.210 0.111 0.043 0.037 0.016 0.092 [16.98]∗∗ [0.62] [3.00]∗∗ [2.21]∗ [3.30]∗∗ [2.44]∗ [0.48] [3.22]∗∗ Education squared -0.010 0.004 -0.013 -0.005 -0.001 -0.002 -0.002 -0.006 [8.17]∗∗ [2.75]∗∗ [2.92]∗∗ [1.37] [0.97] [2.12]∗ [0.95] [2.79]∗∗ Constant 1.586 1.843 [21.62]∗∗ [20.44]∗∗ Implied β L ‡ 0.338 0.203 0.515 0.256 0.306 0.294 0.390 0.309 Mean Education 5.824 5.824 5.824 5.885 5.822 5.824 5.824 5.883 Returns to Edu 12.9% 2.6% 6.5% 5.8% 3.3% 2.0% -0.6% 2.7% [t-statistic] [22.41]∗∗ [1.68] [2.56]∗∗ [2.56]∗∗ [8.62]∗∗ [1.99]∗ [0.42] [1.98]∗ ˆ integrated e I(1) I(1) I(0) I(0) I(1) I(1) I(0) I(0) CD test p-value 0.00 0.00 0.65 0.00 0.25 0.57 0.02 0.00 R-squared 0.98 0.86 1.00 0.97 0.78 1.00 Observations 775 775 775 732 769 769 769 726 Notes: We include our proxy for education in levels and as a squared term. Returns to Education are computed from the sample mean (E ˆ E2 E ˆ E + 2β ¯ ) as β ˆ E2 are the coefï¬?cients on the levels and squared education terms ˆ E and β ¯ where β respectively. computed via the delta-method. For more details see Notes for Tables 3 (in the main text) and (for the education variables) 4 above. 
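The returns-to-education calculation described in the notes to Tables 4 and 6 can be illustrated with a minimal sketch. It evaluates the marginal return at the sample mean of education and attaches a delta-method standard error, treating mean education as fixed. The function names are hypothetical, and the variance and covariance values are invented purely for illustration, since the tables do not report the coefficient covariance matrix. The point estimate uses the rounded coefficients from Table 6, Panel (A), column [1].

```python
import math


def returns_to_education(beta_e, beta_e2, mean_edu):
    """Marginal return to education at the sample mean: beta_E + 2 * beta_E2 * E_bar."""
    return beta_e + 2.0 * beta_e2 * mean_edu


def delta_method_se(var_e, var_e2, cov_e_e2, mean_edu):
    """Delta-method standard error of beta_E + 2 * beta_E2 * E_bar.

    The gradient with respect to (beta_E, beta_E2) is (1, 2 * E_bar);
    mean education is treated as a fixed constant.
    """
    g_e, g_e2 = 1.0, 2.0 * mean_edu
    variance = g_e ** 2 * var_e + g_e2 ** 2 * var_e2 + 2.0 * g_e * g_e2 * cov_e_e2
    return math.sqrt(variance)


# Rounded coefficients from Table 6, Panel (A), column [1].
point = returns_to_education(beta_e=0.243, beta_e2=-0.010, mean_edu=5.824)

# HYPOTHETICAL variance-covariance entries (not reported in the tables).
se = delta_method_se(var_e=4.0e-4, var_e2=3.0e-6, cov_e_e2=-3.0e-5, mean_edu=5.824)
t_stat = abs(point) / se

print(f"return at mean education: {point:.4f}")
print(f"illustrative delta-method t-statistic: {t_stat:.2f}")
```

With the rounded inputs the point estimate is about 12.7%, close to the 12.9% reported in the table; the small gap presumably reflects rounding of the published coefficients. The t-statistic printed here has no empirical meaning because the covariance inputs are made up.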
Table 7: Aggregate & PWT data: Heterogeneous models with HC

Panel (A): Unrestricted returns to scale
Aggregated data [1]-[3]  Penn World Table data [4]-[6]
[1]  [2]  [3]  [4]  [5]  [6]
MG  FDMG  CMG  MG  FDMG  CMG
log labour  -0.066  0.269  -0.428  -1.609  -2.478  -1.324
  [0.16]  [0.57]  [1.22]  [1.97]  [3.76]**  [2.79]**
log capital pw  -0.070  -0.021  0.453  0.963  1.245  1.122
  [0.26]  [0.07]  [2.47]*  [4.44]**  [5.99]**  [5.52]**
Education  0.601  0.637  0.489  0.123  0.004  -0.012
  [1.29]  [1.75]  [0.98]  [0.52]  [0.02]  [0.05]
Education squared  -0.089  -0.065  -0.063  -0.002  0.004  -0.001
  [1.76]  [1.70]  [1.48]  [0.11]  [0.25]  [0.03]
country trend/drift  0.005  0.005  0.021  0.008
  [0.33]  [0.29]  [2.25]*  [0.77]
Implied RS†  CRS  CRS  CRS  CRS  DRS  DRS
Implied β_L ‡  n/a  n/a  0.547  n/a  n/a  n/a
reject CRS (10%)  38%  3%  19%  38%  18%  33%
sign. trends (10%)  44%  32%  44%  10%
Mean Education  5.72  5.84  5.72  5.72  5.84  5.72
Returns to edu  -7.1%  -3.2%  -11.1%  -4.5%  0.5%  1.3%
[t-statistic]  [1.33]  [0.65]  [1.24]  [1.33]  [0.18]  [0.43]
ê integrated  I(0)  I(0)  I(0)  I(0)  I(0)  I(0)
CD-test (p)  7.23(.00)  7.88(.00)  -0.50(.61)  7.59(.00)  9.29(.00)  0.98(.33)

Panel (B): CRS imposed
Aggregated data [1]-[3]  Penn World Table data [4]-[6]
[1]  [2]  [3]  [4]  [5]  [6]
MG  FDMG  CMG  MG  FDMG  CMG
log capital pw  0.093  0.151  0.528  0.779  1.052  0.906
  [0.49]  [0.90]  [4.90]**  [5.75]**  [6.43]**  [5.86]**
Education  0.075  0.260  0.683  -0.215  -0.134  0.089
  [0.18]  [0.99]  [1.73]  [1.25]  [0.84]  [0.42]
Education squared  -0.023  -0.023  -0.075  0.013  0.014  -0.023
  [0.65]  [0.89]  [1.57]  [0.82]  [1.13]  [1.16]
country trend/drift  0.017  0.015  -0.001  -0.010
  [1.96]  [1.33]  [0.21]  [2.08]*
Implied β_L ‡  n/a  n/a  0.472  0.221  n/a  0.094
sign. trends (10%)  37%  32%  37%  34%
Mean Education  5.79  5.84  5.79  5.79  5.84  5.79
Returns to edu  -9.3%  -4.0%  3.2%  -1.4%  0.3%  -0.2%
[t-statistic]  [1.34]  [0.88]  [0.50]  [0.50]  [0.16]  [0.05]
ê integrated  I(0)  I(0)  I(0)  I(0)  I(0)  I(0)
CD-test (p)  8.05(.00)  8.59(.00)  0.11(.92)  9.75(.00)  10.84(.00)  3.12(.00)

Notes: All averaged coefficients presented are robust means across i. The returns to education and associated t-statistics are based on a two-step procedure: first the country-specific mean education value ($\bar{E}_i$) is used to compute $\hat{\beta}_{i,E} + 2\hat{\beta}_{i,E^2}\bar{E}_i$ to yield the country-specific returns to education. The reported value then represents the robust mean of these N country estimates, s.t. the t-statistic should be interpreted in the same fashion as that for the regressors, namely as a test whether the average parameter is statistically different from zero, following Pesaran and Smith (1995). For other details see Notes for Tables 2 (in the main text) and 5 above.

Table 8: Alternative dynamic panel estimators

Panel (A): Agriculture
Dynamic FE  PMG  CPMG  DGMM  SGMM
[1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]  [9]  [10]  [11]  [12]
EC [y_{t-1}]  -0.293  -0.312  -0.300  -0.460  -0.459  -0.624  -0.466  -0.482  -0.503  -0.455  -1.087  -0.432
  [11.80]**  [12.43]**  [11.91]**  [10.63]**  [9.34]**  [14.29]**  [10.44]**  [10.06]**  [9.74]**  [9.34]**  [2.60]**  [5.38]**
capital pw  0.672  0.684  0.582  0.652  0.714  0.036  0.132  0.501  0.464  0.530  1.135  0.776
  [12.47]**  [12.69]**  [7.50]**  [20.16]**  [18.52]**  [0.57]  [3.01]**  [10.78]**  [11.05]**  [10.83]**  [2.85]**  [12.59]**
land pw  0.124  0.121  0.135  0.136  0.367  0.867  0.361  0.247  0.494  0.228  0.083  -0.247
  [1.30]  [1.29]  [1.45]  [2.90]**  [6.43]**  [8.27]**  [8.05]**  [5.03]**  [8.95]**  [4.73]**  [0.35]  [1.17]
trend(s)†  0.001  0.008  0.012
  [1.59]  [3.36]**  [12.26]**
Constant  0.667  0.679  0.896  1.072  0.644  4.273  3.084  1.545  1.402  1.298  0.714
  [5.03]**  [4.75]**  [4.58]**  [10.48]**  [7.53]**  [13.11]**  [10.27]**  [10.38]**  [9.69]**  [9.94]**  [4.21]**
lags [trends]‡  1  2  1 [l-r]  1  2  1 [s-r]  1 [l-r]  1  2  1  i: 2-3  i: 2-3
impl. labour  0.328  0.316  0.418  0.212  -0.081  0.098  0.507  0.253  0.042  0.242  -0.135  0.224
obs  894  857  894  894  857  894  894  894  857  872  857  894

Panel (B): Manufacturing
Dynamic FE  PMG  CPMG  DGMM  SGMM
[1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]  [9]  [10]  [11]  [12]
EC [y_{t-1}]  -0.196  -0.195  -0.195  -0.219  -0.181  -0.543  -0.214  -0.245  -0.194  -0.272  -2.196  -0.041
  [9.40]**  [9.16]**  [9.31]**  [6.59]**  [5.97]**  [4.04]**  [4.13]**  [7.16]**  [6.45]**  [7.33]**  [0.72]  [0.65]
capital pw  0.711  0.708  0.637  1.016  1.044  0.298  1.379  0.598  1.264  0.505  1.866  -1.515
  [12.96]**  [12.34]**  [6.85]**  [29.64]**  [33.09]**  [5.34]**  [26.80]**  [11.58]**  [22.28]**  [9.47]**  [3.25]**  [0.40]
trend(s)†  0.001  0.001  -0.010
  [1.00]  [0.24]  [6.77]**
Constant  0.452  0.456  0.588  -0.212  -0.228  3.493  -0.977  0.225  -0.434  0.372  1.042
  [3.87]**  [3.73]**  [3.29]**  [5.43]**  [4.95]**  [3.87]**  [4.18]**  [5.68]**  [5.77]**  [6.48]**  [1.80]
lags [trends]‡  1  2  1 [l-r]  1  2  1 [s-r]  1 [l-r]  1  2  1  i: 2-3  i: 2-3
impl. labour  0.289  0.292  0.363  -0.016  -0.044  0.702  -0.379  0.402  -0.264  0.495  -0.866  2.515
obs  902  880  902  902  880  902  902  902  880  879  880  902

Panel (C): Aggregated data
Dynamic FE  PMG  CPMG  DGMM  SGMM
[1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]  [9]  [10]  [11]  [12]
EC [y_{t-1}]  -0.172  -0.176  -0.173  -0.279  -0.277  -0.429  -0.284  -0.292  -0.294  -0.317  -0.380  -0.243
  [8.59]**  [8.39]**  [8.59]**  [6.89]**  [7.25]**  [9.55]**  [6.72]**  [6.98]**  [7.38]**  [7.48]**  [0.71]  [4.21]**
capital pw  0.705  0.709  0.668  0.974  1.015  0.128  0.899  0.891  0.949  0.905  0.271  0.896
  [15.25]**  [14.65]**  [8.17]**  [36.86]**  [37.38]**  [1.90]  [21.11]**  [24.84]**  [24.92]**  [27.54]**  [0.27]  [22.80]**
trend(s)†  0.000  0.011  0.004
  [0.54]  [6.07]**  [2.42]*
Constant  0.390  0.393  0.446  -0.100  -0.200  3.061  0.082  -0.062  -0.169  -0.145  0.120
  [4.96]**  [4.62]**  [3.42]**  [3.73]**  [5.18]**  [9.30]**  [4.20]**  [2.53]*  [4.97]**  [4.58]**  [1.44]
lags [trends]‡  1  2  1 [l-r]  1  2  1 [s-r]  1 [l-r]  1  2  1  i: 2-3  i: 2-3
impl. labour  0.295  0.292  0.332  0.026  -0.015  0.872  0.102  0.109  0.051  0.095  0.729  0.104
obs  879  836  879  879  836  879  879  879  836  879  836  879

Panel (D): Penn World Table data
Dynamic FE  PMG  CPMG  DGMM  SGMM
[1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]  [9]  [10]  [11]  [12]
EC [y_{t-1}]  -0.098  -0.101  -0.107  -0.333  -0.138  -0.567  -0.392  -0.338  -0.081  -0.347  0.835  0.031
  [5.82]**  [6.01]**  [6.22]**  [6.70]**  [4.37]**  [12.63]**  [7.88]**  [6.63]**  [2.56]*  [8.24]**  [1.07]  [0.49]
capital pw  0.538  0.553  0.356  0.923  0.916  0.698  0.652  0.903  -0.125  0.731  0.604  0.863
  [8.14]**  [8.66]**  [3.44]**  [130.34]**  [71.72]**  [65.10]**  [67.96]**  [52.90]**  [1.81]  [86.83]**  [0.60]  [1.88]
trend(s)†  0.001  0.002  0.006
  [2.44]*  [2.57]*  [19.84]**
Constant  0.363  0.360  0.567  -0.122  -0.020  1.085  0.935  -0.071  0.456  0.504  0.010
  [5.38]**  [5.29]**  [5.28]**  [4.44]**  [1.63]  [13.05]**  [7.79]**  [3.47]**  [2.99]**  [8.29]**  [0.07]
lags [trends]‡  1  2  1 [l-r]  1  2  1 [s-r]  1 [l-r]  1  2  1  i: 2-3  i: 2-3
impl. labour  0.462  0.447  0.645  0.077  0.084  0.302  0.349  0.097  1.125  0.270  0.396  0.137
obs  914  904  914  914  904  914  914  904  873  904  914

Notes: All results are based on an unrestricted error correction model specification (ECM), which is equivalent to a first order autoregressive distributed-lag model, ARDL(1,1) (see Hendry, 1995, p.231f). We report the long-run coefficients on capital per worker (and in the agriculture equations also land per worker). EC [y_{t-1}] refers to the Error-Correction term (speed of adjustment parameter) with the exception of Models [11] and [12], where we report the coefficient on y_{t-1}; conceptually, these are the same, however in the latter we do not impose common factor restrictions like in all of the former models. Note that in the PMG and CPMG models the ECM term is heterogeneous across countries, while in the Dynamic FE and GMM models these are common across i.
† In model [6] we include heterogeneous trend terms, whereas in [7] a common trend is assumed (i.e. linear TFP is part of cointegrating vector).
‡ ‘lags’ indicates the lag-length of first differenced RHS variables included, with the exception of Models [11] and [12]: here ‘i:’ refers to the lags (levels in [11], levels and differences in [12]) used as instruments. In the models in [8] and [9] the cross-section averages are only included for the long-run variables, whereas in the model in [10] cross-section averages for the first-differenced dependent and independent variables (short-run) are also included.

References

Hendry, D. (1995). Dynamic Econometrics. Oxford University Press.
Kapetanios, G., Pesaran, M. H., & Yamagata, T. (2011). Panels with Nonstationary Multifactor Error Structures. Journal of Econometrics, 160(2), 326-348.
Pesaran, M. H. (2004). General diagnostic tests for cross section dependence in panels. (IZA Discussion Paper No. 1240)
Pesaran, M. H. (2006). Estimation and inference in large heterogeneous panels with a multifactor error structure. Econometrica, 74(4), 967-1012.
Pesaran, M. H. (2007). A simple panel unit root test in the presence of cross-section dependence. Journal of Applied Econometrics, 22(2), 265-312.
Pesaran, M. H., & Smith, R. P. (1995). Estimating long-run relationships from dynamic heterogeneous panels. Journal of Econometrics, 68(1), 79-113.