WPS8097

Policy Research Working Paper 8097

Prices, Engel Curves, and Time-Space Deflation

Impacts on Poverty and Inequality in Vietnam

John Gibson
Trinh Le
Bonggeun Kim

Development Economics Vice Presidency
Operations and Strategy Team
June 2017

Abstract

Many developing countries lack spatially disaggregated price data. Some analysts use “no-price” methods by using a food Engel curve to derive the deflator as that needed for nominally similar households to have equal food shares in all regions and time periods. This method cannot be tested in countries where it is used as a spatial deflator since they lack suitable price data. In this paper, data from Vietnam are used to test this method against benchmarks provided by multilateral price indexes calculated from repeated spatial price surveys. Deflators from a food Engel curve appear to be a poor proxy for deflators obtained from multilateral price indexes. To the extent that such price indexes reliably compare real living standards over time and space, these results suggest that estimates of the level, location, and change in poverty and inequality would be distorted if the Engel method deflator was used in their stead.

This paper is a product of the Operations and Strategy Team, Development Economics Vice Presidency. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The authors may be contacted at jkgibson@waikato.ac.nz.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly.
The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Prices, Engel Curves, and Time-Space Deflation: Impacts on Poverty and Inequality in Vietnam

John Gibson, Trinh Le, and Bonggeun Kim

JEL Codes: D12, E31, O15

Keywords: Deflator bias, Engel curves, inequality, poverty, prices

John Gibson (corresponding author) is a professor of economics at University of Waikato, Hamilton, New Zealand; his email address is jkgibson@waikato.ac.nz. Trinh Le is a research fellow at Motu Economic and Public Policy Research, Wellington, New Zealand; her email address is Trinh.Le@motu.org.nz. Bonggeun Kim is a professor of economics at Seoul National University, Seoul, Korea; his email address is bgkim07@snu.ac.kr. The authors are grateful for assistance from Valerie Kozel, Ian Hinsdale, Nguyen Tam Giang, Tinh Doan, and Geua Boe-Gibson, along with many staff from the General Statistics Office. Helpful comments were received from the editor and three anonymous referees and from seminar audiences at Monash and Otago. All remaining errors are those of the authors.

I. INTRODUCTION

Reliable data on real welfare over time and space in poor countries are rare. Statistical agencies mostly focus on the temporal Consumer Price Index (CPI), which lets one compare changes in, but not levels of, prices over space. Few poor countries have a spatial price index, despite their weak infrastructure and limited market integration permitting large spatial price differences.1 Without consistent time-space comparisons of living standards, it is unclear if reports of rising inequality in some developing countries (e.g., China) reflect spatially diverging prices more than growing disparities in real welfare levels. Debates about where and by how much poverty has fallen also depend critically on accurate cost-of-living comparisons over time and space.

1. Gibson (2013) provides examples of the priority that statistical agencies in poor countries give to collecting nominal living standards data over price data, despite both types of data being needed for measuring real welfare.

Amongst ways to spatially deflate in countries without spatial prices, the most startling results use a food Engel curve to calculate the deflator that lets different nominal incomes have the same real standard of living (based on the same food share). This adapts a method developed for temporal comparisons by Hamilton (2001), who estimated Engel curves to back out the implied true price index and real income growth over time.2 Almås (2012) uses Hamilton’s method for spatial comparisons; assuming a unique food Engel curve for the world, gaps between the food share for a particular country and the base country imply bias in the Purchasing Power Parity (PPP) statistics of the Penn World Table. Correcting for PPP bias raises global inequality by at least one-quarter.

2. Applications of the temporal method include studies of the historical United States (Costa 2001; and Logan 2009) and contemporary Australia (Barrett and Brzozowski 2010), Brazil (Filho and Chamon 2012), Canada (Beatty and Larsen 2005), China (Chamon and Filho 2014), Indonesia (Olivia and Gibson 2013), Japan (Higa 2013), Korea (Chung, Kim and Gibson 2010), Mexico (Filho and Chamon 2012), New Zealand (Gibson and Scobie 2011), Norway (Larsen 2007), and Russia (Gibson, Stillman and Le 2008).
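The Engel curve logic described above can be illustrated with a minimal sketch. It assumes a Working-Leser form in which the food share is linear in log real expenditure, so that region intercepts absorb regional price levels. The functional form, the function name `engel_deflators`, and all numbers are illustrative assumptions, not the specification estimated in this paper.

```python
import math

def engel_deflators(data, base):
    """Back out regional price levels from a food Engel curve.

    data: dict mapping region -> list of (nominal_expenditure, food_share).
    Assumed model (illustrative): share = a + b * (ln y - ln P_region),
    so the region intercept is a - b * ln P_region.
    """
    sxy = sxx = 0.0
    means = {}
    for region, rows in data.items():
        xs = [math.log(y) for y, _ in rows]
        ws = [w for _, w in rows]
        xbar, wbar = sum(xs) / len(xs), sum(ws) / len(ws)
        means[region] = (xbar, wbar)
        # Within-region variation identifies the slope on log expenditure
        sxy += sum((x - xbar) * (w - wbar) for x, w in zip(xs, ws))
        sxx += sum((x - xbar) ** 2 for x in xs)
    beta = sxy / sxx
    intercepts = {r: wbar - beta * xbar for r, (xbar, wbar) in means.items()}
    # Intercept gap relative to the base region, scaled by -1/beta
    deflators = {r: math.exp(-(a - intercepts[base]) / beta)
                 for r, a in intercepts.items()}
    return deflators, beta

# Hypothetical noiseless data: rural prices are 80% of urban prices
true_price = {"urban": 1.0, "rural": 0.8}
incomes = [500.0, 1000.0, 2000.0, 4000.0]
data = {r: [(y, 0.9 - 0.1 * (math.log(y) - math.log(p))) for y in incomes]
        for r, p in true_price.items()}
deflators, beta = engel_deflators(data, base="urban")

# Same prices, but rural food shares shifted up by 0.02 for a non-price reason
shifted = dict(data)
shifted["rural"] = [(y, w + 0.02) for y, w in data["rural"]]
biased, _ = engel_deflators(shifted, base="urban")
print(deflators["rural"], biased["rural"])
```

In the noiseless case the sketch recovers the assumed rural price level of 0.8. Once rural food shares are shifted by 0.02 for reasons unrelated to prices, the implied rural deflator rises to about 0.98, because the method reads any regional shift in the food share as a price difference.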
Almås and Johnsen (2012) apply the same method to China but add a time dimension to uncover spatial bias in real growth rates. China’s CPI seems too low in rural areas and too high in urban areas; the Engel curve deflator shows a 44% rise in the rural cost-of-living from 1995 to 2002 and no change in the urban cost-of-living, versus CPI increases of 8% and 11%.3 Correcting this bias raises the rural cost-of-living from 60% of the urban level in 1995 to 87% by 2002, and one-half of apparent poverty reduction in rural China disappears.4 Likewise, an Engel curve deflator for India gives a fall in rural poverty of just 5% between 2005 and 2010, versus a 20% fall at official poverty lines that are allegedly time-space consistent (Almås, Kjelsrud, and Somanathan 2013). Remarkable gaps occur in the records for some states; for example, the official lines show that the poverty rate fell in rural West Bengal from 38% to 29% while the Engel curve deflator has Bengalese poverty rising from 67% to 70%.

3. Chamon and Filho (2014) use Urban Household Survey data for ten Chinese provinces and estimate an upward bias in the CPI of about one percentage point per year over 1993–2005 but do not consider any spatial deflation.

4. Specifically, for the $1 a day poverty line, the fall in the rural poverty gap between 1995 and 2002 is -0.67 using the CPI deflator but only -0.32 using the food Engel curve deflator.

The record of recent progress in poverty reduction for poor countries may need to be revised if these results based on Engel curve deflators are correct. But it is hard to know how credible these findings are since China and India lack spatially disaggregated price surveys, preventing comparison of Engel curve deflators with multilateral price indexes.5 In this paper we conduct just such a comparison, using high quality data from Vietnam. Specifically, we use the 2010 and 2012 Vietnam Household Living Standards Surveys (VHLSS) and market prices from spatial cost of living surveys fielded in conjunction with the VHLSS. In each year, prices of up to one hundred goods and services were surveyed in sixteen hundred different markets, with surveyors given detailed pictures of the desired specifications to ensure consistency over time and space.

5. Household expenditure surveys in both countries allow unit values to be constructed (mainly for foods), but these are best treated as a proxy for quality rather than for price (Gibson 2013). Unit values have been widely used in India to calculate spatial multilateral price indexes for urban and rural sectors and states, most recently by Deaton and Dupriez (2011) and Majumder, Ray, and Sinha (2012). The most widely used spatial deflator for China is from Brandt and Holz (2006), who used provincial CPI data from 1990 to price national rural and urban expenditure baskets (containing 40–60 items; with 40% of the rural basket using urban prices since rural prices were missing). The annual rate of change in the CPI for each province was then used by Brandt and Holz to extend from the base year back to 1984 and forward to 2004, which likely causes time-space inconsistencies, as demonstrated for the example of Russia by Gluschenko (2006).

The spatial deflators and spatially disaggregated estimates of temporal inflation derived from the food Engel curve are a poor proxy for the deflators obtained from the multilateral price indexes. The Engel curve deflators suggest costs of living in some rural regions exceed those of the capital city. The derived changes in the cost of living from 2010 to 2012 vary widely over space, with the Engel curve suggesting deflation in some regions while the multilateral indexes give regional price changes of between 14% and 26% (the CPI rose 26% between the 2010 and 2012 surveys). These differences matter to conclusions about the location, level, and trend in poverty and inequality.
For example, if the Engel curve deflator is used, the national Gini index rises to 0.47 from the 0.43 calculated in nominal terms; in contrast, using the multilateral price indexes leads to a lower Gini of 0.40 in real terms. The Engel deflator also causes the headcount poverty rate to be ten percentage points higher and skews the poverty profile to finding more rural poverty, especially in some regions that are already poorest.

Our results cast doubt on the Engel curve method, but its proponents may claim that our multilateral price indexes do not get the right cost-of-living and are a poor benchmark. The Engel method relies on food shares falling as income rises, so preferences cannot be homothetic. Thus any benchmark price index consistent with homothetic preferences may be considered an unfair test. But a “fair” test of the Engel curve method is hard to design. Beatty and Crossley (2012) show that this method gives the true cost of living for an unknown household whose expenditure gives zero utility at base period prices. As Nakamura et al. (2015) note, the Engel curve method recovers the change in the cost-of-living for a household that may be anywhere in the income distribution; thus it is hard to see what crucial experiment could compare the Engel curve deflator with another deflator. Even a fixed-weight cost of goods index that does not imply homothetic preferences, like a Laspeyres index, may be a poor benchmark for such a test since one would not know whether to weight it democratically, plutocratically, or at some other point in the income distribution so as to best match the unknown reference household of the Engel curve method.6

Our strategy for testing the Engel method is more pragmatic. The issue of the preference framework that gives a price index consistent with the cost-of-living changes recovered by the Engel curve method is hard to resolve since the reference household is unidentified.
But we note that researchers are not using the Engel curve method as a spatial deflator because they are guided by hypothesis tests that this is a more preference-compatible deflator.7 Instead, this method is used for time-space deflation because spatially disaggregated prices are unavailable. For example, Almås and Johnsen (2012, 2) motivate their food Engel curve deflator study by stating:

“Why is it necessary to produce new price indices? First, data on prices in China are scarce. To our knowledge, there are no official and available price indices that allow for cross-province comparisons, and price data on specific goods are extremely limited.”

6. The plutocratic weights for the CPI treat dollars equally and thus treat people unequally since some people have many more dollars than others. Deaton (1988) shows that in the United States the CPI weights are representative for a household above the 75th percentile of the expenditure distribution, while Ley (2005) shows that higher inequality, differences in consumption patterns by income groups, and greater variance in individual price behavior all contribute to the gap between plutocratic and democratic price indexes.

7. This contrasts with cross-country studies of PPPs for measuring global poverty where, for example, Ackland et al. (2013) test the hypothesis of common homothetic preferences and find that in samples from the 1996 and 2005 ICP they cannot reject homothetic preferences for about 70% of countries.

Hence natural benchmarks are the sort of price indexes that a statistics office would use if price data were available. Typically, this would be a fixed-weight index, like a Laspeyres, for temporal deflation. For spatial deflation a statistics office might use a variable-weight superlative index, like a Törnqvist, since substitution bias is likely a bigger concern over space than over time given that relative prices do not vary much over the short to medium term (Van Veelen and Van der Weide 2008).
Our Weighted Country Product Dummy (WCPD) testing framework allows both fixed-weight and variable-weight price indexes to be calculated (along with their standard errors), and we apply these dual benchmarks to evaluate the performance of the Engel curve method.

On top of the empirical results there are good reasons to doubt the Engel curve method. Anything varying over space and affecting food shares but omitted from Engel curve regressions gets attributed to price differences between areas. For example, calorie needs and food shares are high for hard working rural folk; equally poor but sedentary urbanites seem better off due to their lower food shares (Deaton and Dupriez 2011). Likewise, another Engel curve study for China found a much richer set of covariates than those of Almås and Johnsen (including temperature) were relevant to food shares and were correlated with spatial variables (Gong and Meng 2008). While these factors could be included in Engel curve regressions, they almost never are, yet they reflect long-standing spatial differences that likely vary much more than do short-term changes when food budget shares are compared over time. Thus the omitted variables bias problem is potentially much worse when the Engel curve method is used over space than over time.

Another problem for the Engel curve method is that food shares will vary with relative prices, but if there is no spatial price survey, then relative prices are unobserved. In temporal uses of the Engel curve method, the relative inflation rate for food versus nonfood is used, but this is no help for spatial comparisons. While unit values (surveyed expenditures divided by quantities) are sometimes used to proxy for prices, it is rare for surveys to get quantities (and hence unit values) of most nonfood items.
Moreover, unit values for food will be systematically biased over space because they will average over a different quality mix in net consuming areas compared to net producing areas because of the Alchian-Allen effect that fixed charges for transport, storage, or processing will alter the relative price of quality over time and space (Gibson and Kim 2015). Even allowing for all of these threats to the Engel curve method, a proponent could simply note that the current results show big gaps between what standard price indexes show and what the Engel curve shows. In temporal CPI bias studies these gaps are taken as evidence of problems with the standard price indexes, and results from the Engel curve method are treated as closer to the truth. In the current study, the gaps are treated as evidence that results from the Engel curve method are further from the truth. The authors have all published CPI bias studies (e.g., Gibson et al. 2008 and Chung et al. 2010), so a valid question is why we have switched sides, as it were. There are four reasons: a conventional price index may be more reliable over space than over time; the converse is likely to be true of the Engel curve method; there is more corroborating evidence available for assessing temporal CPI bias than for spatial deflators; and, the point raised by Beatty and Crossley (2012) about the unknown reference household of the Engel curve method potentially raises an important caveat to prior results on CPI bias. In terms of the first point, various sources of bias in price indexes, such as quality change, delayed introduction of new goods, and unaccounted for substitution of outlets and commodities likely are bigger problems for a temporal index than for a spatial one. 
For example, superlative indexes can deal with commodity substitution bias over space since base and current region budget shares are available but not over time (except retrospectively) since contemporaneous budget shares are not available for current period price index calculations. New and improved quality goods may be accessed in different regions and there need not be outlet substitution bias notwithstanding the challenge of finding similar types of outlets in urban and rural areas when surveying prices.

Second, comparing Engel curves over time is likely more reliable than comparing them over space, since household survey design is usually stable over the short term and average characteristics of respondents that might affect measurement error also will be fairly stable over time. In contrast, countries might use different methods to survey urban and rural households, and even if the same method is used it may be de facto different (e.g., diary surveys in illiterate rural areas often degrade to unstructured recall, while they may be truer to design in literate urban areas). This matters since key Engel curve parameters are sensitive to differences in how survey questions are posed and answered.8

8. Gibson et al. (2015) randomly assign eight different consumption surveys to households in Tanzania and find the coefficient on real income in the food Engel curve varies by a factor of three between survey designs. This is one of two coefficients that determines the Engel curve deflators, so this fragility suggests that variation over space in survey design or in characteristics such as respondent’s education, wealth, and food acquisition opportunities, which correlate with measurement errors, may spuriously affect the deflators derived from Engel curves.

Third, Engel curve results on CPI bias are often corroborated by analyses of durables ownership or by comparing subjective reports of well-being over time. In contrast, there is only a diffuse prior about expected patterns of prices over space, except perhaps that prices should be higher in nominally richer areas due to the Balassa-Samuelson effect, although the opposite can be claimed (Muller 2002). Absent corroborating analyses, the burden of proof for relying on the Engel curve deflator for spatial comparisons should be higher than it is for temporal comparisons.9

9. An exception is Almås et al. (2013) who attempt to provide corroborating evidence by comparing patterns of calorie sources and self-reported hunger around the poverty lines based on their Engel curve deflator.

The remainder of the paper is structured as follows. Section II describes the context and data, paying particular attention to the spatially disaggregated prices that are rarely available for large developing countries. The multilateral price indexes and Engel curve methods are set out in section III. Estimation results and comparisons between the various deflators are in section IV, and the impacts of different deflators on poverty and real inequality are described in section V. A limited cost-benefit evaluation is in section VI, while the conclusions are in section VII.

II. CONTEXT AND DATA DESCRIPTION

Over the past two decades Vietnam has conducted eight household living standards surveys that have been widely used to monitor progress in poverty reduction. In the first of these, the 1992/93 Vietnam Living Standards Survey (VLSS), prices in local markets were surveyed to provide one source of information on regional differences in the cost-of-living. The poverty lines calculated in 1993 suggested urban prices were 20% above rural prices, while the highest cost region (of seven then demarcated) had costs of living about 35% above the lowest cost region. The next VLSS, in 1998, fielded a price survey just in rural areas, with poverty line updating in urban areas relying on prices already collected for the CPI.
The next four surveys (the Vietnam Household Living Standards Surveys [VHLSS] of 2002, 2004, 2006, and 2008) relied solely on already-collected CPI prices to update rural and urban poverty lines and spatial deflators.

There were several concerns with using temporal index prices to calculate a spatial index. Vietnam’s CPI is ostensibly national in scope but the prices used to form the spatial deflators were from just forty of Vietnam’s sixty provinces. Also, the outlet sample for the CPI is not spatially representative since outlets need to be easily accessible (some item prices are observed every ten days) and in areas of dense demand so that target specifications are always in stock. Moreover, the CPI changed in 2006 to let provinces pick item specifications that suited the peculiarities of local demand, rather than using nationally-consistent specifications, so reported spatial price differences thereafter may have included quality differences. In general, spatial variation in the cost of living is unlikely to be accurately measured with data collected for a temporal index, and in this regard the situation in Vietnam was similar to other large developing countries.10

In light of these concerns, the Prices Department of the General Statistics Office (GSO) introduced a new spatial cost of living index (SCOLI) in 2010, based on a price survey fielded in 1,588 communes (almost one-fifth of the total).11 Surveying overlapped with the VHLSS, which another GSO department was running in the same communes (and others) at the same time, ensuring that the budget shares needed for the SCOLI relate to the same period as the prices. To maintain consistency over space, the price surveyors were given detailed photographs of each of the sixty-four goods and services that were the specifications to be priced.
The surveyors were required to find examples in the market of similar size and quality to what was pictured, weigh them, and record prices per metric unit (unless they were in standard packaging of known weight or were a service). The sampled prices were to be obtained from three different vendors in each locality; this quota was met in almost 90% of the item-market combinations and the price index calculations described below dealt with the remaining cases of missing data.

10. For example, the main spatial deflator used in China, due to Brandt and Holz (2006), was developed from prices collected for the CPI.

11. Vietnam’s communes are the lowest level administrative unit, averaging about 10,000 people or 2,500 households.

The price surveying for the SCOLI was repeated in 2012, again in conjunction with the VHLSS but with an expanded scope. Specifically, the number of goods and services to be priced increased to 101, with seven food items and thirty nonfood goods and services added to the basket, and the number of communes surveyed increased to 1,644. The prices in approximately one-half of the communes were surveyed in March and for the other half in September, with the subsamples in both rounds being nationally representative and matching rounds 1 and 3 of the four-round VHLSS. In 2010, prices had been surveyed in all communes that were part of the second (September) round of the three-round VHLSS and in a randomly drawn subset of the communes in the third (December) round. Analyses of the prices from both years reveal that spatial patterns in prices do not vary within-year, and for almost all items, the variation in prices over space is much greater than the between-month variation.

The nominal welfare variables and the data for the Engel curve analysis (food budget shares and covariates other than prices) come from the 2010 and 2012 VHLSS.
For both surveys the consumption modules use a thirty-day recall of food purchases and consumption from own-production and gifts, another recall of spending during festive periods on twenty-four food and drink groups, a thirty-day recall for twenty-eight frequently purchased nonfood items and an annual recall for thirty-six other items. The only change in 2012 was that three of the fifty-four food groups from 2010 (rice, cooking oil and lard, and outdoor meals) were split (high and low quality rice, oil separate from lard, and meals by whether household members were at home or away). This slightly finer disaggregation may prompt recall of some forgotten food spending, so food shares in 2012 may be slightly higher than otherwise and people may appear poorer (and so a higher price index will be derived) than if there had been no change in design.

The 2010 and 2012 VHLSS marked a break from prior surveys. A “usual month” format and less comprehensive consumption aggregate than in 2010 were features of the prior surveys, which maintained definitions from 1993.12 This link to the past caused growing understatement of consumption and overstatement of the food share as Vietnam got richer and people diversified away from a food-based budget.13 For example, just 78% of comprehensive consumption in the 2010 VHLSS would be counted under the 1993 definition, and the average food share would be 54% rather than 46% (World Bank 2012).

12. Usual month recall is based on reporting the number of months in which the item is usually consumed by the household, the usual expenditure in those months, and the quantity usually consumed.

Correspondingly, the poverty line was also changed in 2010, raising it to VND 653,000 per person per month (US$2.26 per day in 2005 PPP terms). Under this line, 21% of Vietnam’s population in 2010 was counted as poor, with headcount poverty rates of 27% in rural areas and 6% in urban areas.
The much lower poverty line used previously had seen headcount poverty rates fall to 15% by 2008 (from 58% in 1993).

13. Annex 2.1 of World Bank (2012) summarizes the differences between the comprehensive consumption aggregate and the one which held fast to the 1993 definition of consumption.

With these new baseline measures of consumption and poverty in place, the challenge for statistical authorities in Vietnam is to make consistent time-space comparisons of real living standards, inequality, and poverty in the future. While the SCOLI program may continue, the earlier VLSS experience and the current situation in most developing countries is that spatial price surveys are not fielded, even as part of household living standards surveys. Moreover, the SCOLI in 2010 was donor funded, and absent this support, the GSO may revert to using the CPI to calculate spatial deflators, so there is interest in how “no-price” methods of deriving spatial deflators perform. The experience of Vietnam in 2010 and 2012 where there is a benchmark from a comprehensive, spatially disaggregated price survey therefore gives a rare opportunity to assess how well a “no-price” method, such as the food Engel curve, works in practice.

III. METHODS

When researchers deflate for welfare analysis they typically want an empirical approximation to the true cost-of-living index (COLI): the ratio of minimum expenditure at alternative prices to minimum expenditure at base prices holding the standard of living constant.
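One conventional way to write this definition is given below. The notation is introduced here only for illustration: $e(p, \bar{u})$ is assumed to denote the expenditure function, $p^{0}$ and $p^{1}$ the base and alternative price vectors, $U$ a utility function over quantity bundles $q$, and $\bar{u}$ the reference standard of living.

```latex
\mathrm{COLI}\left(p^{1}, p^{0}; \bar{u}\right)
  = \frac{e\left(p^{1}, \bar{u}\right)}{e\left(p^{0}, \bar{u}\right)},
\qquad
e\left(p, \bar{u}\right) = \min_{q} \left\{\, p \cdot q \;:\; U(q) \ge \bar{u} \,\right\}
```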
There are three broad approaches, according to Dumagan and Mount (1997) and Breur and von der Lippe (2011): use a price index with known biases, such as the Laspeyres, that gives a bound to the COLI; use a superlative index formula such as the Törnqvist, which is closer to the true COLI (due to less substitution bias) if preferences are homothetic but has an income bias if preferences are not homothetic; and, econometrically estimate demand equations for a set of goods, from which the theoretical expenditure functions that are numerator and denominator of the COLI can be derived. While the demand systems approach can handle nonhomothetic preferences and has early examples from developing countries (e.g., Ravallion and van de Walle 1991), it has proved difficult to carry out in practice and is not widely used so we do not consider it further.14 In contrast, the Laspeyres is used in probably all countries for their CPI while there are active debates in some countries about switching from this to a superlative price index, as was recommended by the “Boskin Commission” on the CPI in the United States.

14. Oulton (2012) proposes an algorithm based on principal components to overcome a problem for the econometric approach of too many parameters to estimate for the available data. This enables compensated budget shares to be derived econometrically in order to hold utility constant at some reference level for the nonhomothetic case with only the same data requirements as needed for conventional index numbers.

The known biases in a Laspeyres index are substitution biases from not accounting for consumers moving towards items (or outlets) that are relatively cheaper than in the base period or region, quality change bias when higher prices for improved goods wrongly get treated as inflation, delayed entry of new goods missing the rapid fall in price early in the product lifecycle, and biases due to the formula used to aggregate individual price observations into an index of price relativities. For spatial deflation the quality change and new goods biases should matter less than item substitution bias since new and improved goods are, in principle, available in all regions at the same time. A superlative index allows changes in the basket between the base period or region and the current period/region and so accounts for consumer substitution, while the Laspeyres index continues to price the base period or region basket. But using budget shares from two periods or regions has a potential problem; these may not refer to the same standard of living. In contrast, a fixed-weights Laspeyres index refers to the base period or region standard of living. The potential “income bias” in the superlative index will not happen in the special case of homothetic preferences, with budget shares not changing with income. But observed behavior, such as falling food shares as income rises, is inconsistent with homothetic preferences. The income bias of the superlative index may exceed the Laspeyres substitution bias and may be positive or negative whereas substitution bias only overstates changes in the cost of living between the base period or region and the current one (Dumagan and Mount 1997).

For time-space deflation, a multilateral index method is needed to calculate regional and temporal price levels jointly so as to ensure transitivity (Hill 2004). The two main methods used for PPPs in cross-country studies are Geary-Khamis (GK), used in the Penn World Table, and EKS (Eltetö, Köves and Szulc), used by the World Bank. The GK method lets subindexes add to a total, which is useful for deflating GDP and its components but is less needed for comparing levels of living (Deaton, Friedman, and Alatas 2004). Moreover, the GK index uses plutocratic weights and does not allow for substitution effects; these are undesirable features if comparing household living standards over space.
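The contrast between fixed-weight and superlative formulas discussed above can be made concrete with a small numerical sketch. The two-good prices and quantities are invented for illustration, and the function names are hypothetical; the formulas are the textbook bilateral definitions, not the multilateral indexes computed in this paper.

```python
import math

def laspeyres(p0, p1, q0):
    """Cost of the base basket q0 at comparison prices relative to base prices."""
    return sum(a * q for a, q in zip(p1, q0)) / sum(a * q for a, q in zip(p0, q0))

def paasche(p0, p1, q1):
    """Cost of the comparison basket q1 at comparison prices relative to base prices."""
    return sum(a * q for a, q in zip(p1, q1)) / sum(a * q for a, q in zip(p0, q1))

def fisher(p0, p1, q0, q1):
    """Geometric mean of the Laspeyres and Paasche indexes."""
    return math.sqrt(laspeyres(p0, p1, q0) * paasche(p0, p1, q1))

def tornqvist(p0, p1, q0, q1):
    """Geometric mean of price relatives weighted by average budget shares."""
    e0 = sum(p * q for p, q in zip(p0, q0))
    e1 = sum(p * q for p, q in zip(p1, q1))
    s0 = [p * q / e0 for p, q in zip(p0, q0)]
    s1 = [p * q / e1 for p, q in zip(p1, q1)]
    return math.exp(sum(0.5 * (a + b) * math.log(y / x)
                        for a, b, x, y in zip(s0, s1, p0, p1)))

# Hypothetical data: good 1 doubles in price and consumers shift towards good 2
p0, q0 = [1.0, 1.0], [10.0, 10.0]
p1, q1 = [2.0, 1.0], [5.0, 15.0]

L = laspeyres(p0, p1, q0)
P = paasche(p0, p1, q1)
F = fisher(p0, p1, q0, q1)
T = tornqvist(p0, p1, q0, q1)
print(L, P, F, T)
```

With these numbers the Laspeyres index is 1.5 and the Paasche index is 1.25, while the Fisher and Törnqvist values lie between the two. The Laspeyres value is the highest because it keeps pricing the base basket after consumers have moved away from the good whose price rose.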
The EKS allows substitution because it uses underlying Fisher indexes (geometric means of a Paasche and Laspeyres) which are superlative in the sense of being an exact cost-of-living index for some homothetic utility function that is a flexible functional form, allowing a fully general matrix of price substitution effects (Deaton et al. 2004).15 While less well known than either GK or EKS, another multilateral method is the Weighted Country Product Dummy (WCPD), which allows for substitution effects, for democratic weights, and for reversibility (which matters in spatial comparisons since there is no natural base country or region, unlike for temporal comparisons). Deaton et al. (2004) recommend EKS and WCPD for work on the measurement of living standards.

15. EKS methods impose transitivity in the following way: first, make bilateral comparisons between all possible pairs of countries and then take the nth root of the product of all possible Fisher indices between n countries. Deaton and Dupriez (2011, 4) note that multilateral price indexes required for spatial work are typically not consistent with the inflation rates in local CPIs and so need to be calculated regularly, not just once, and updated by the local CPIs. The repeated implementation of the SCOLI for Vietnam in 2010 and 2012 fits with this requirement.

Weighted Country Product Dummy (WCPD) Method

The Country Product Dummy (CPD) method is a hedonic regression, proposed by Summers (1973) to deal with missing data in international comparisons, where the only characteristic of a commodity is the commodity itself. This humble origin belies a very useful framework for making price comparisons because with appropriate choice of expenditure or quantity weights one can derive several bilateral price indexes, including those of Dutot, Jevons, Törnqvist, and Walsh (Diewert 2005), and also a multilateral system that is an expenditure-share weighted geometric form of the Geary-Khamis index (Rao 2005). This set of price indexes includes both fixed-weight ones and variable-weight superlative indexes. Rao (2004) argues strongly in favor of both weighted and unweighted CPD methods, which also let various regression techniques be used to handle data-related problems and allow standard errors of the PPPs to be calculated.

We use the WCPD framework to provide benchmarks for evaluating the deflators provided by the food Engel curve method. The WCPD works as follows: for J regions, K goods, and T periods, the relationship between the prices of goods in different regions and periods is assumed to follow:

$$p_{k,j,t} = \pi_{j,t}\,\eta_k\,u_{k,j,t} \tag{1}$$

where $\pi_{j,t}$ is the price level in region j and period t relative to the base region/period, $\eta_k$ is the price level of good k relative to the base good, and $u_{k,j,t}$ is a random disturbance term. The price parameters ($\pi_{j,t}$ and $\eta_k$) in equation (1) can be directly estimated in a log-linear regression model, using the KJT prices from a spatially disaggregated price survey:

$$w_{k,j,t}\ln p_{k,j,t} = \hat{\lambda} + \sum_{j=1}^{J}\ln\pi_{j,0}\,w_{k,j,0}D_{j,0} + \sum_{t=1}^{T}\sum_{j=0}^{J}\ln\pi_{j,t}\,w_{k,j,t}D_{j,t} + \sum_{k=1}^{K}\ln\eta_k\,w_{k,j,t}D_k + u_{k,j,t} \tag{2}$$

where the weight $w_{k,j,t}$ for good k in region j and period t is described below, $D_{j,t}$ is a dummy variable for region j and period t, $D_k$ is a dummy for good k, and $\hat{\lambda}$ is the intercept plus the coefficient of the omitted base category dummies. We use two types of weights so as to generate two price indexes for evaluating deflators from the food Engel curve. These represent two of the three broad approaches to approximating a cost-of-living index, with the demand systems approach not used.
First, we use a variable-weight price index, by using as weights $(s_{kj,t} + s_{k0,1})/2$, where $s_{kj,t}$ is the average budget share of item k, in region j, and time t, and $s_{k0,1}$ is the average budget share for item k in the base period in region 0 (which we set to be the urban sector of the Red River region, where Hanoi is). We refer to this price index as WCPD-vw (for variable-weight), which gives estimated time-space deflators:

$$\hat{\pi}_{j,t} = \exp\left[\sum_{k=1}^{K}\left(\frac{s_{kj,t}+s_{k0,1}}{2}\right)\ln\left(\frac{p_{kj,t}}{p_{k0,1}}\right)\right] \tag{3a}$$

The WCPD-vw allows for substitution since it uses budget shares from both the base region and period and also from the current region and period, but it exactly measures the cost of living only for homothetic preferences. Therefore, we also use a fixed-weight index that does not rely on homothetic preferences but is subject to substitution bias, by using $s_{k0,1}$ as the weight for all periods and regions. The time-space deflators for the WCPD-fw (for fixed-weight) index are:

$$\hat{\pi}_{j,t} = \exp\left[\sum_{k=1}^{K}s_{k0,1}\ln\left(\frac{p_{kj,t}}{p_{k0,1}}\right)\right] \tag{3b}$$

Intuitively, WCPD-fw is a Laspeyres-like index but it is not exact. Selvanathan (1991) shows how an appropriately weighted linear regression lets one calculate a Laspeyres; the difference is that our WCPD models are log-linear. Nevertheless, the deflator in equation (3b) gives an alternative testing framework that does not depend on homothetic preferences and is close to the sort of price index that a statistics office would likely report if they had disaggregated price data.
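The difference between the variable-weight and fixed-weight deflators can be sketched with a toy calculation. The example evaluates share-weighted sums of log price relatives directly rather than running the weighted regression, and it assumes the deflator is the exponential of that sum. The three items, their prices, and their budget shares are hypothetical, as is the function name.

```python
import numpy as np

# Hypothetical data for K = 3 items (not from the SCOLI survey).
p_base = np.array([10.0, 20.0, 5.0])    # prices in the base region and period
s_base = np.array([0.50, 0.30, 0.20])   # budget shares in the base region and period
p_jt = np.array([8.0, 22.0, 5.0])       # prices in region j, period t
s_jt = np.array([0.60, 0.25, 0.15])     # budget shares in region j, period t


def wcpd_deflator(p, p0, s, s0, variable_weight):
    """Share-weighted geometric mean of price relatives.

    variable_weight=True averages current and base shares (WCPD-vw style);
    variable_weight=False uses base shares only (WCPD-fw style).
    """
    weights = (s + s0) / 2 if variable_weight else s0
    return float(np.exp(np.sum(weights * np.log(p / p0))))


deflator_vw = wcpd_deflator(p_jt, p_base, s_jt, s_base, variable_weight=True)
deflator_fw = wcpd_deflator(p_jt, p_base, s_jt, s_base, variable_weight=False)
print(f"variable-weight deflator: {deflator_vw:.4f}")
print(f"fixed-weight deflator:    {deflator_fw:.4f}")
```

In this toy case households in region j spend a larger share on the item that is relatively cheap there, so the variable-weight deflator lies below the fixed-weight one; the gap is the kind of substitution effect that the fixed-weight index ignores.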
Engel Curve Method

In the original formulation of Hamilton (2001), the budget share of food at home for household i in region j and time period t, $w_{i,j,t}$, is treated as a linear function of the logarithm of real household income, a relative price term and control variables:

$$w_{i,j,t} = \phi + \gamma\left(\ln P_{F,j,t} - \ln P_{N,j,t}\right) + \beta\left(\ln Y_{i,j,t} - \ln P_{j,t}\right) + X\theta + u_{i,j,t} \tag{4}$$

where $P_{F,j,t}$, $P_{N,j,t}$, and $P_{j,t}$ are the true but unobserved prices of food, nonfood, and all goods, Y is total expenditure (a permanent income proxy), X represents control variables, and u the disturbance. The true cost of living is a geometric weighted average of food and nonfood prices:

$$\ln P_{j,t} = \alpha\ln P_{F,j,t} + (1-\alpha)\ln P_{N,j,t} \tag{5}$$

Hamilton assumed prices of good G (food, nonfood, or all goods) are measured with error,

$$\ln P_{G,j,t} = \ln P_{G,j,0} + \ln\left(1+\Pi_{G,j,t}\right) + \ln\left(1+E_{G,t}\right), \tag{6}$$

where $\Pi_{G,j,t}$ is the cumulative percentage increase in the CPI-measured price of good G from period 0 to period t and $E_{G,t}$ is the period-t cumulative measurement error in the price index since that base period. Inserting equation (6) into equation (4) gives:

$$\begin{aligned} w_{i,j,t} = {} & \phi + \gamma\left[\ln(1+\Pi_{F,j,t}) - \ln(1+\Pi_{N,j,t})\right] + \beta\left[\ln Y_{i,j,t} - \ln(1+\Pi_{j,t})\right] + X\theta \\ & + \gamma\left[\ln(1+E_{F,t}) - \ln(1+E_{N,t})\right] - \beta\ln(1+E_t) \\ & + \gamma\left(\ln P_{F,j,0} - \ln P_{N,j,0}\right) - \beta\ln P_{j,0} + u_{i,j,t}. \end{aligned} \tag{7}$$

An estimable version of equation (7) using a time-series of cross-sectional household budget data and a temporal CPI for food, nonfood, and all consumption is:

$$w_{i,j,t} = \hat{\phi} + \gamma\left[\ln(1+\Pi_{F,j,t}) - \ln(1+\Pi_{N,j,t})\right] + \beta\left[\ln Y_{i,j,t} - \ln(1+\Pi_{j,t})\right] + X\theta + \sum_{t=1}^{T}\delta_t D_t + \sum_{j=1}^{J}\delta_j D_j + u_{i,j,t} \tag{8}$$

where $D_t$ are time dummy variables, $D_j$ are regional dummies, and $\hat{\phi}$ is the intercept from equation (7), plus the coefficients of the omitted time and region dummies.
In the usual time series usage, the coefficients on the time dummy variables, $\delta_t$, are key to the measurement of deflator bias; the calculation of real income should already have put households observed in different years on the same cost-of-living basis so there should be no temporal “drift” in the residual food share. These dummy coefficients capture relative price effects, differential bias for food and nonfood, and overall deflator bias scaled by the coefficient on income:

$$\delta_t = \gamma\left[\ln(1+E_{F,t}) - \ln(1+E_{N,t})\right] - \beta\ln(1+E_t). \tag{9}$$

If the degree of CPI-bias in food and nonfood is approximately equal (or if relative price movements have only small effects on food budget shares) then Hamilton (2001) shows that:

$$\ln(1+E_t) = -\frac{\delta_t}{\beta} \tag{10}$$

with the cumulative CPI bias at period t (the overstatement of the cost of living, $-E_t$) obtained from a simple ratio of coefficients: $1-\exp(-\hat{\delta}_t/\hat{\beta})$.

Adapting this method to time-space deflation in the manner of Almås and coauthors requires three changes to the framework. First, rather than having a vector of spatial dummies, $D_j$, and a separate vector of temporal dummies, $D_t$, time-space dummies $D_{j,t}$, which equal 1 for region j and period t, are needed so temporal patterns can vary across spatial units and spatial patterns can vary over time. Second, the relative price of food has to be measured at a more spatially disaggregated level, which we here call area, a; otherwise $\hat{\gamma}$ is identified from the same regional and temporal variation as the time-space dummies and perfect collinearity will result. The third change is that income and the relative price of food need to be in nominal terms so that what was previously calculated from the dummy variable coefficients as deflator bias is now the estimate of the omitted true cost of living $P_{j,t}$.
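The logic of reading a price level off a dummy coefficient can be sketched with simulated data. When income enters in nominal terms, the omitted term, minus the income coefficient times the log price level, is absorbed by the region dummy, so the implied price level is the exponential of minus the dummy coefficient divided by the income coefficient. The sketch below is an assumption-laden illustration: it uses three hypothetical regions in a single period, drops the relative price term and the controls, and all parameter values are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" price levels; region 0 is the base (price level 1).
true_price = np.array([1.00, 0.85, 1.20])
phi, beta = 0.90, -0.10          # assumed Engel curve intercept and income slope
n_per_region = 2000

region = np.repeat(np.arange(3), n_per_region)
ln_nominal_y = rng.normal(7.0, 0.5, size=region.size)

# Food share depends on real income; only nominal income is observed.
food_share = (phi
              + beta * (ln_nominal_y - np.log(true_price[region]))
              + rng.normal(0.0, 0.01, size=region.size))

# Regress the food share on log nominal income and region dummies (base omitted).
dummies = (region[:, None] == np.arange(1, 3)).astype(float)
X = np.column_stack([np.ones(region.size), ln_nominal_y, dummies])
coef, *_ = np.linalg.lstsq(X, food_share, rcond=None)

beta_hat = coef[1]
delta_hat = coef[2:]

# Implied price level of each non-base region relative to the base region.
price_hat = np.exp(-delta_hat / beta_hat)
print("estimated price levels:", np.round(price_hat, 3))
```

The recovery works here only because the simulated food shares differ across regions for no reason other than prices and income. Any other regional influence on food shares would be attributed to the cost of living, which is the concern the benchmark comparison is designed to test.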
After these changes, the estimating equation is:

$$w_{i,a,j,t} = \hat{\phi} + \gamma\left(\ln P^{*}_{F,a,j,t} - \ln P^{*}_{N,a,j,t}\right) + \beta\ln Y_{i,a,j,t} + X\theta + \sum_{j=1}^{J}\delta_{j,0}D_{j,0} + \sum_{t=1}^{T}\sum_{j=0}^{J}\delta_{j,t}D_{j,t} + u_{i,a,j,t} \tag{11}$$

where the starred terms are nominal price indexes for food and nonfood, and the intercept $\hat{\phi}$ now includes the coefficient of only a single omitted dummy, $D_{0,0}$. The estimated PPP for the price level in region j and time period t is then calculated as:

$$\hat{P}_{j,t} = \exp\left(-\hat{\delta}_{j,t}/\hat{\beta}\right). \tag{12}$$

IV. ESTIMATION RESULTS AND IMPLIED DEFLATORS

The deflators are estimated for Vietnam’s six broad regions (see figure 1), with the cost of living allowed to vary between urban and rural sectors within regions. As a first step, the prices had to map to average budget shares from the VHLSS, which has 120 commodity groups, while prices for fewer groups were surveyed. Budget shares for some groups are combined to match the prices, and the reverse also occurs, with fourteen groups having multiple prices mapping to a single budget share; these prices are first aggregated to budget-share level. In some cases, especially residual categories such as “other vegetables,” the mapping was from the prices of closely related items (e.g., for all of the specific items in the broad group whose residual category was lacking a price). Finally, eleven minor items, which in total accounted for just three percent of the average budget, had no prices available, and these are ignored in the analysis.

Figure 1: The Six Broad Regions of Vietnam

The definition of the items that were priced and the consumption group(s) that they map to is reported in appendix table 1 based on the data available from the 2010 survey. For most of the analysis we use this mapping, since the greater disaggregation afforded by the 2012 price survey cannot be used when working with the pooled data from 2010 and 2012.
But using the more disaggregated items makes almost no difference to the spatial deflators since the additional items priced in 2012 have small budget shares and have regional price relativities that were similar to the relativities for the substitute item(s) that had been used in their stead in 2010. Another modeling choice concerns use of imputed prices for item-market combinations with the target specification missing (13% of all cases). The imputations used regressions of the price of the target specification on prices of alternatives gathered in the survey, controlling for regional fixed effects and brand name fixed effects (for unbranded items, quasi-brands are formed by dividing into intervals based on unit prices). To show the effect of including imputed values (and other modeling choices discussed below) a bilateral Törnqvist index is reported in appendix table 2. This has the advantage of simplicity and also matches what the GSO and World Bank (2012) used in their poverty analysis. The regional deflators in columns (1) and (2) are almost the same with or without imputed values, so the imputed values are used for the remaining analysis. One important price not observed was rents or the user cost of housing services. Instead, econometric analysis of the VHLSS housing module enabled a hedonic house value equation to be estimated. Regional and temporal variation in reported dwelling values (conditioning on over 60 variables) are used as a proxy for prices. Values are used because there is almost no rental activity recorded in the VHLSS, precluding use of either actual rents or hypothetical rents as measures of regional and temporal price relativities for housing. In the third column of appendix table 2, the price index that results from estimating the housing equation on pooled data for 2010 and 2012 is reported, which compares with the index in column (2) where the housing equation is estimated just on VHLSS data for 2010.
This makes almost no difference to the spatial patterns so the pooled housing value equation is used in the results that follow. The final modeling issue is whether the more aggregated mapping from prices to budget shares based on the 2010 survey gives different results than using the finer mapping based on the 37 extra items priced in 2012. To study this issue we first generate a price index for regions in 2012 on a 2010 base (column (4)), create inflation factors for each region (column (5)), and then rebase the 2012 regional price differences. The spatial patterns are very similar to 2010, with a correlation between the two years of 0.97. The final column of appendix table 2 has the spatial price index for 2012 if prices of the 37 more items available that year are used. This is almost identical to what results from keeping the level of aggregation from 2010, with a correlation between the deflators in columns (6) and (7) of 0.997. Thus basing the analysis on the more aggregated mapping of prices to budget shares from 2010 should not be a source of bias. The estimates of the main coefficients for the WCPD and food Engel curve regressions (equations (2) and (11)) are reported in table 1. There are two sets of results, first considering prices for items that cover all consumption (and the food share based on that) and then for an aggregate and food share that excludes housing and durable goods. Housing and durables have a combined budget share of almost one-fifth but are only lightly covered in the price survey, with housing prices from a hedonic regression and just a single specification for durables (a Samsung 21 inch television—although the 2012 survey added a DVD player and a motorcycle). By comparing Engel curve and WCPD deflators with and without housing and durables, we can assess whether any failure of the “no-price” Engel curve method to match the indexes from the WCPD is driven by these major items for which it is difficult to obtain surveyed prices. 
In addition to the coefficients reported, the Engel curve regression includes as covariates household size; four demographic ratios (for children, youth, elderly, and migrants); the gender, age, sector of activity, and education of the household head; and prices for two types of street meals, which are a close substitute for food at home (the numerator of the food share in the dependent variable). Including these prices of close substitutes (and the relative price of food, whose coefficient is reported in table 1) favors the Engel curve method; typically these would be unobserved absent a price survey because unit values are unavailable for street meals and nonfoods given that household surveys usually just obtain reports on the quantities of well-defined food groups. Since the WCPD models include many more covariates (the fixed effects for each item) the summary statistics reported are adjusted-R2, which range from 0.57 to 0.61 for the all-consumption aggregate and from 0.66 to 0.67 when housing and durables are excluded. The adjusted-R2 is lower for the food Engel curve, at 0.58 (and 0.45 without housing and durables).16

16. This exceeds the average R2 in the Engel curves of Almås and coauthors (0.44), so any poor performance of the Engel curve deflators here should not be due to a poorly specified regression model.

The spatial variation in the cost-of-living can be observed from the size and significance of the dummy variable coefficients for each region and sector. According to the WCPD results (regardless of weights), the only area in 2010 without a significantly lower cost of living than the base region of urban Red River (Hanoi) is the urban Southeast, which has Ho Chi Minh City. The region and sector with the lowest cost-of-living is the rural Mekong Delta, which is Vietnam’s rice bowl. Except for Red River and Southeast, the between-region differences in the cost-of-living exceed the urban-rural differences within region; apart from Hanoi and Ho Chi Minh City, most cities are small and not highly differentiated from their rural hinterland. These patterns are quite similar to those found in 1993 with the VLSS, which is consistent with the regional variations in the cost-of-living changing only slowly, since they reflect climate, infrastructure, population density, topography, and other factors that are unlikely to vary much in the short-run.

The time-space price indexes derived from the WCPD and Engel curve methods are reported in table 2. Also reported is a test of the hypothesis that the price index for a particular region, sector, and year from the Engel curve method differs statistically significantly from WCPD-vw (using * to denote significance) or from WCPD-fw (using # for significance). There are 13 (out of 23) such occurrences of significant differences when the full consumption aggregate is used, regardless of which WCPD benchmark is used. If housing and durables are excluded there are 19 (18) significant differences between the Engel curve deflators and those from WCPD-vw (WCPD-fw). It appears that an abbreviated consumption aggregate (dropping two major items with hard to survey prices) does not improve the performance of the Engel curve method in terms of matching benchmark price indexes that are typical of what statistics offices would report if they had a survey of spatially disaggregated prices. Therefore, in the rest of the paper, the comparisons use the “all consumption” results, which also lets us match to the published poverty and inequality estimates for 2010 that are based on this same comprehensive consumption aggregate. The different spatial patterns for the WCPD and Engel curve deflators are illustrated in figures 2a and 2b, using the results for 2010 (based on the first eleven rows and first three columns of table 2).
The Engel curve deflator implies that several rural areas have higher costs of living than in the capital city, markedly so in the case of the Mid-Northern Mountains region. According to the Engel curve, the price level in this region in 2010 is up to 39% above the price level of the urban Red River region.

Figure 2: Comparison of Spatial Deflators (for 2010): Urban Red River=100. Panel (a): WCPD-vw versus Engel Curve Method. Panel (b): WCPD-fw versus Engel Curve Method. [Bar charts for the urban and rural sectors of the six regions; horizontal axis from 70 to 130.]

Poverty maps show poverty is increasingly concentrated in the Northern Mountains (World Bank 2012), which is a region that could aptly be described as Vietnam’s Appalachia given its inaccessibility and topography. It is surprising to consider that such a region could have the highest cost of living in the whole nation (figure 2), especially with housing costs included in the comparison. Also surprising is the position of the rural Mekong Delta as having the second highest cost-of-living, given that this is Vietnam’s rice bowl, with rice moving out of this region to feed the rest of the country. The correlations between the benchmarks and the Engel curve estimates of the spatial price indexes for 2010 are -0.16 for WCPD-vw and -0.22 for WCPD-fw.
The spatial deflators affect estimates of the level and location of poverty and the gap between nominal and real inequality, while estimated cost-of-living changes affect assessment of overall progress in raising living standards and escaping poverty. Once again, the experience of inflation implied by the Engel curve deflator is unrelated to the record of inflation given by the WCPD benchmarks, with zero correlation between benchmarks and Engel estimates (figures 3a and 3b). The cost-of-living increase between the 2010 and 2012 surveys ranges from 15–25% with the WCPD-vw and 14–26% with the WCPD-fw, with the least increase in the urban Southeast and the most in the rural Central Highlands. A much more varied experience of inflation is shown by the Engel curve deflator, with cost-of-living increases ranging from -3% to 45%. Such changes appear unlikely because of the arbitrage opportunities that they imply. For example, the Engel curve has the cost of living in the rural Southeast rising by 45% which is three times faster than the rise for that region’s urban sector and also contrasts with an estimate of an unchanging price level in the neighboring rural Mekong Delta. Such big price rises in the rural Southeast would be expected to attract food out of the rural Mekong Delta and industrial goods out of Ho Chi Minh City in order to moderate the price increases in the rural Southeast. 
Figure 3: Comparison of Inflation Estimates. Panel (a): WCPD-vw versus Engel Curve Method. Panel (b): WCPD-fw versus Engel Curve Method. [Bar charts for the urban and rural sectors of the six regions; horizontal axis: Percentage Change in Price Level (2012 survey relative to 2010), from -10 to 40.]

The Engel curve deflator gives implied price changes that suggest very little linkage between urban and rural sectors within each region. According to the WCPD deflator, the average gap between rural and urban inflation within a region is just 1.6 percentage points using variable weights or two percentage points using fixed weights, suggesting that price changes in urban areas and their hinterland largely move together. But for the Engel curve deflator, the average gap in the inflation experience of the rural and urban sectors within a region is thirteen percentage points, and this seems unlikely to be true given the arbitrage opportunities that it implies.

V. IMPACTS ON INEQUALITY AND POVERTY

The Engel curve deflator interprets the higher average food shares of households living in poor rural areas as evidence of a high cost-of-living in these areas. Consequently, the level of inequality appears higher in real terms than in nominal terms if the Engel curve deflator is used.
In contrast, the WCPD price indexes show real inequality to be less than nominal inequality because regions and sectors that are nominally richer are found to have a higher price level; this positive relationship between nominal incomes and the price level is consistent with the Balassa-Samuelson effect. These differing effects of deflation are illustrated in figure 4 in the form of nominal and real Lorenz curves for Vietnam in 2010, with only the variable-weight version of the WCPD deflator used since the fixed-weight version gives very similar results. The Gini coefficients corresponding to these Lorenz curves are 0.427 for nominal consumption, 0.404 for real consumption when the WCPD deflator is used, and 0.465 if the Engel curve deflator is used. Thus, the use of the Engel curve deflator would introduce a bias of six Gini points into the measurement of real inequality, which is a relatively large effect.

Figure 4: Lorenz Curves for 2010, With and Without Spatial Deflation. [Lorenz curves L(p) against percentiles (p) for four series: perfect equality, nominal, Engel deflated, and WCPD deflated.]

A similar distortion is introduced into estimates of the overall poverty rate and the location of poverty. The existing evidence on poverty in Vietnam is that rural regions are poorer than urban regions (World Bank 2012). The Engel curve deflator exacerbates this effect by suggesting that three rural regions (the Mekong Delta, the Central Highlands, and the Mid-Northern Mountains) have a higher cost-of-living than even in Hanoi. The effects of this deflator are shown in table 3, which describes poverty in 2010 nationally and in the urban and rural sectors using the Engel curve and WCPD deflators from table 2.
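The direction in which spatial deflation moves measured inequality, discussed earlier in this section, can be sketched with a toy two-area example. The consumption figures and price levels are hypothetical and chosen only to show the mechanism: a price index that is lower in the poorer area compresses real inequality, while one that is higher in the poorer area widens it.

```python
import numpy as np


def gini(x):
    """Gini coefficient from the rank-weighted sum of sorted values."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return float(2.0 * np.sum(ranks * x) / (n * x.sum()) - (n + 1) / n)


# Hypothetical nominal consumption per person in a rich urban and a poor rural area.
urban_nominal = np.array([200.0, 300.0, 400.0])
rural_nominal = np.array([50.0, 100.0, 150.0])
nominal = np.concatenate([urban_nominal, rural_nominal])

# Case 1: rural prices below urban prices (benchmark-like pattern).
real_low_rural_prices = np.concatenate([urban_nominal / 1.0, rural_nominal / 0.8])

# Case 2: rural prices above urban prices (pattern implied by a deflator
# that reads high rural food shares as high rural prices).
real_high_rural_prices = np.concatenate([urban_nominal / 1.0, rural_nominal / 1.3])

gini_nominal = gini(nominal)
gini_low = gini(real_low_rural_prices)
gini_high = gini(real_high_rural_prices)
print(f"nominal: {gini_nominal:.3f}")
print(f"rural prices lower:  {gini_low:.3f}")
print(f"rural prices higher: {gini_high:.3f}")
```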
We use the $P_\alpha$ class of poverty measures of Foster, Greer, and Thorbecke (1984):

$$P_\alpha = \frac{1}{n}\sum_{i=1}^{q}\left(\frac{g_i}{z}\right)^{\alpha}$$

where n is the total population, incomes are ordered from i=1 as the poorest and q are poor, z is the poverty line, and $g_i$ the poverty gap, $g_i = z - y_i$ ($y_i$ is per capita consumption in the ith household).17 For $\alpha=0$, $P_0$ is the head-count index, for $\alpha=1$, $P_1$ is the poverty gap index, and for $\alpha=2$, $P_2$ is the squared poverty gap or poverty severity index. The $P_\alpha$ class additively decomposes contributions from each subgroup to the total level of poverty, which is reported in the table as the “share” of poverty. Another useful manipulation of the $P_\alpha$ measures is to calculate the “risk” of poverty, which is the poverty rate for a particular subgroup relative to the overall average, and this is also reported.

17. The poverty line of VND 653,000 used by World Bank (2012) is in national average prices of January 2010. In contrast the deflators used here are based on urban Red River prices, for surveys centred on October 2010, for which the equivalent poverty line is VND 881,000.

If the Engel curve deflator is used to adjust for regional and sectoral differences in the cost-of-living, it makes the national headcount poverty rate appear ten percentage points higher than if either WCPD deflator is used (37% versus 27%). This upward bias comes entirely from the rural sector, where the Engel curve deflator causes poverty to be overstated with proportionate biases of 39% in the headcount, 68% in the poverty gap, and 92% in the poverty severity index. The basic pattern of the poverty profile is not altered by using the Engel curve deflator (poverty is overwhelmingly rural in Vietnam), but the policy response to finding that one-half (49%) of rural dwellers live in households below the poverty line and that the risk of being poor in urban areas is just one-tenth the national risk (for the poverty severity index) is likely to be quite different to finding just over one-third of the rural population poor, which is what is revealed when either a variable-weight or fixed-weight WCPD deflator is used.

Finally, the Engel curve deflator also biases assessment of progress in reducing poverty, in this case showing much faster progress between 2010 and 2012 than is likely (table 4). Recall that the Engel curve implied lower inflation (and even deflation) for most regions and sectors than what the WCPD indexes show (figure 3). In fact, only the rural Southeast and the urban Central Highlands (containing just 8% of Vietnam’s people) had Engel curve inflation higher than WCPD inflation. Consequently, much of the growth in nominal consumption between 2010 and 2012 is treated as real growth, and so the fall in poverty seems faster than it actually was. For example, the headcount poverty rate appears to fall by eleven percentage points between 2010 and 2012, compared with a seven percentage point decline when either WCPD index is used. For the other poverty measures shown in table 4, the change in poverty using the Engel curve deflated consumption is twice as large as the change using the other two deflators. These other poverty measures include the average exit time measure of Morduch (1998), which shows the expected number of years to escape poverty with constant and uniform growth (assumed 3% per annum here).18 Use of the Engel curve deflator would lead one to find a three-year fall between 2010 and 2012 in the average time expected to escape poverty, and such progress may induce a false sense of achievement for Vietnam’s policy makers when compared with the actual record (based on either WCPD deflator) of just over a one-year reduction in expected poverty exit time.

18. The exit time measure has the same properties as the poverty severity index (sensitivity to distribution amongst the poor) but allows an intuitive interpretation. It is calculated as: $T_g = \frac{1}{N}\sum_{j=1}^{q}\frac{\ln(z)-\ln(y_j)}{g}$, where a constant and uniform growth rate g would see person j below the poverty line take $t_g$ years to reach the poverty line (the expected value, $T_g$, includes an exit time of zero for the nonpoor). For the average poor person, it takes $T_g/H$ years to escape poverty, with H the headcount index.

VI. COST-BENEFIT ANALYSIS OF SPATIALLY DISAGGREGATED PRICE SURVEYS

The results in sections IV and V show that deflators from the food Engel curve appear to be a poor proxy for those obtained from the WCPD benchmark price indexes; compared to those benchmarks, estimates of the level, location, and change in poverty would be distorted if the Engel method deflator was used. We also note that researchers may turn to the Engel curve method, in part, because needed prices for time-space deflation are unavailable. In this section we join these two points in a cost-benefit analysis that asks the following question: could an analysis in the absence of prices (and instead using the Engel curve method to get the deflators) be so incorrect, and hence so costly, that it would have been better for a government to spend the money to gather the spatially disaggregated prices needed for the first-best deflators? Cost-benefit analyses of data infrastructure in poor countries are sorely lacking (Jerven 2014), in part because it is difficult to link data to outcomes, and it is not clear if bad policies are any less likely with better data. Despite those caveats, we proceed as follows: we assume that the goal of the price survey is to deflate in order to measure the total poverty gap, so that Vietnam’s authorities can budget the exact amount to eliminate poverty (with costless and perfectly targeted transfers). We obtain this budgetary figure from the product of the poverty gap index, the value of the poverty line, and the size of the population.
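The poverty measures used in section V and the budgetary figure just described can be sketched as follows. The five consumption values, the poverty line of 1,000, and the function names are hypothetical; the 3% growth rate follows the assumption stated in the text for the exit time measure.

```python
import numpy as np


def fgt(y, z, alpha):
    """Foster-Greer-Thorbecke measure: mean of (gap / z) ** alpha over the poor,
    with the nonpoor contributing zero."""
    y = np.asarray(y, dtype=float)
    poor = y < z
    relative_gap = np.clip((z - y) / z, 0.0, None)
    return float(np.mean(np.where(poor, relative_gap ** alpha, 0.0)))


def average_exit_time(y, z, growth=0.03):
    """Average years to reach the poverty line at a constant growth rate,
    counting an exit time of zero for the nonpoor."""
    y = np.asarray(y, dtype=float)
    years = np.where(y < z, (np.log(z) - np.log(y)) / growth, 0.0)
    return float(years.mean())


# Hypothetical real per capita consumption and poverty line.
consumption = np.array([400.0, 600.0, 800.0, 1000.0, 1200.0])
poverty_line = 1000.0

headcount = fgt(consumption, poverty_line, 0)
poverty_gap = fgt(consumption, poverty_line, 1)
severity = fgt(consumption, poverty_line, 2)
exit_time = average_exit_time(consumption, poverty_line)

# Budget to close every poverty gap with perfectly targeted transfers:
# poverty gap index x poverty line x population.
budget = poverty_gap * poverty_line * consumption.size

print(headcount, poverty_gap, severity)
print(f"average exit time: {exit_time:.1f} years; "
      f"for the average poor person: {exit_time / headcount:.1f} years")
print(f"budget to eliminate poverty: {budget:.0f}")
```

Because the budget is linear in the poverty gap index, any deflator-induced overstatement of that index carries through proportionally to the estimated cost of eliminating poverty.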
The results from table 3 show that, if the Engel curve deflator is used, the poverty gap index in 2010 was 0.130, while it was just 0.079 if the WCPD-vw price index is used. The difference in the total value of the poverty gap is US$2.4b (at market exchange rates since World Bank funding to the GSO for the SCOLI survey was also at market exchange rates). Even if we assume just a one percent social loss from paying poverty alleviation funds to nonpoor people, the overstated poverty gap would have a social cost of US$24 million per year. In contrast, the cost of the SCOLI survey was just US$0.25m, and the survey runs only every second year. If mistargeted transfers are treated as more socially costly than one cent in the dollar, the benefit-cost ratio for spending money to get the needed spatial price data is even higher. If we use results from 2012, when the Engel curve does not overstate the poverty gap so much, the difference in the total value of the poverty gap is US$1.3b, and it still greatly exceeds the cost of the survey even at a one percent social loss rate. While these are little more than back of the envelope calculations, they have some basis in the history of the SCOLI surveys in Vietnam. A growing concern about unreliable poverty results due to spatial deflators being formed from inappropriate temporal price indexes caused the World Bank and the GSO to invest in a new program of surveys. This program of work was of such use that Vietnam self-funded the 2012 survey since it also helped answer other policy questions, such as setting cost-of-living adjustments for public sector wages in major cities. Moreover, simple as they are, these calculations give an order of magnitude to the question of how costly it could be for a country if a “no-price” analysis was treated seriously by a government that had the wherewithal to undertake large scale transfer programs.
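The arithmetic behind these figures can be written out as a worked example. It uses only the numbers quoted above (the US$2.4b and US$1.3b differences in the total poverty gap, the one percent social loss rate, and the US$0.25m survey cost); the symbols $C_{2010}$, $C_{2012}$, and BCR are introduced here purely for illustration.

```latex
\begin{align*}
C_{2010} &= 0.01 \times \text{US\$}2{,}400\text{m} = \text{US\$}24\text{m per year},
  & \text{BCR}_{2010} &= \frac{24}{0.25} = 96,\\
C_{2012} &= 0.01 \times \text{US\$}1{,}300\text{m} = \text{US\$}13\text{m per year},
  & \text{BCR}_{2012} &= \frac{13}{0.25} = 52.
\end{align*}
```

These ratios set one year of social cost against the full cost of one survey round. Since the survey runs only every second year, its annualized cost would be about US$0.125m, which would double both ratios under the same assumptions.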
Even if a government did not design transfers based on deflators coming from a food Engel curve, there is a hidden cost when researchers develop and use “no-price” methods. In our opinion, a researcher is implicitly saying “we don’t need price data” when they use “no-price” methods like the food Engel curve method of deflation. This reduces the demand that is placed on the statistics agency to provide higher quality and more extensive price data and price indexes. Governments respond to pressure from constituents, including economists and other researchers. If more demands were made for the right sort of price data, and if the cost of not having such data was shown, better outcomes may result than those that come from using “no-price” methods.

VII. CONCLUSIONS

In this paper we assess the performance of a “no-price” method of deflating for cost-of-living differences over time and space. Such methods are relied upon by some researchers because many large developing countries do not have spatially disaggregated price surveys. Yet such countries are exactly the places where spatial deflation is needed since it is implausible to assume that prices are the same everywhere, with high internal transport costs and an absence of major brands and chain stores setting prices on a national basis. Moreover, for developing countries emerging from a planned economy past, like China and Vietnam, spatial price differences may be growing since urbanization and the development of urban land price differentials are starting from a low base, and so the need for spatial deflation is unlikely to be reduced in the near future.

The method assessed here relies on estimating a food Engel curve and defining the deflator as that needed for nominally similar households to have the same food budget shares in all regions and time periods.
This method has been widely used in the literature examining bias in temporal price indexes, where it generally yields results that concur with what theory and other empirical approaches have suggested, in terms of the CPI being an upwardly biased measure of changes in the true cost of living. But there is much less guidance from either theory or from other empirical approaches about spatial deflators. The Balassa-Samuelson effect leads one to predict that the price level will be higher in regions and sectors where nominal incomes are higher, so spatial deflation should show less inequality, but one also can conceive of pathways by which people living in poor areas face higher costs (Muller 2002). Consequently, with more diffuse priors about spatial patterns in the cost of living, any empirical evidence, including from “no-price” approaches like the Engel curve method, may be quite influential. This makes the experience of Vietnam in 2010 and 2012, where there is a benchmark from comprehensive, spatially disaggregated price surveys, an important opportunity for assessing how well such “no-price” methods work in practice.

Our results show that spatial deflators and spatially disaggregated estimates of temporal inflation derived when the food Engel curve method is applied in Vietnam in 2010 and 2012 are poor proxies for the deflators obtained from two benchmark price indexes that rely on spatially disaggregated prices. Based on these benchmarks, substantial distortion in estimates of the level, location, and change in poverty and inequality would occur if Engel method deflators were used in Vietnam. This scope for potentially wrong inferences leads us to conclude that while Engel curve methods may be a useful tool, amongst several, for examining bias in temporal deflators, they are unlikely to proxy for the multilateral price indexes that would be calculated from spatially disaggregated price surveys.
Even in the temporal context, a concern exists that the Engel curve method is recovering changes in the cost of living for an unknown household that could be anywhere in the income distribution. As such, deflators based on food Engel curves do not appear to provide the reliable evidence needed to account for time-space differences in the cost of living, and there may be no substitute for large developing countries conducting spatially disaggregated price surveys.

REFERENCES

Ackland, R., S. Dowrick, and B. Freyens. 2013. “Measuring Global Poverty: Why PPP Methods Matter.” Review of Economics and Statistics 95 (3): 813–24.
Almås, I. 2012. “International Income Inequality: Measuring PPP Bias by Estimating Engel Curves for Food.” American Economic Review 102 (2): 1093–117.
Almås, I., and A. Johnsen. 2012. “The Cost of Living in China: Implications for Inequality and Poverty.” Working Paper, Department of Economics, Norwegian School of Economics.
Almås, I., A. Kjelsrud, and R. Somanathan. 2013. “A Behaviour-based Approach to the Estimation of Poverty in India.” CESifo Working Paper No. 4122.
Barrett, G., and M. Brzozowski. 2010. “Using Engel Curves to Estimate the Bias in the Australian CPI.” Economic Record 86 (272): 1–14.
Beatty, T., and E. Larsen. 2005. “Using Engel Curves to Estimate Bias in the Canadian CPI as a Cost of Living Index.” Canadian Journal of Economics 38 (2): 482–99.
Beatty, T., and T. Crossley. 2012. “Lost in Translation: What do Engel Curves tell us About the Cost of Living?” Mimeo, University of Minnesota.
Brandt, L., and C. Holz. 2006. “Spatial Price Differences in China: Estimates and Implications.” Economic Development and Cultural Change 55 (1): 43–86.
Breuer, C., and P. von der Lippe. 2011. “Problems of Operationalizing the Concept of a Cost-of-Living Index.” MPRA Paper No. 32902.
Chamon, M., and I. Filho. 2014. “Consumption Based Estimates of Urban Chinese Growth.” China Economic Review 29 (1): 126–37.
Chung, C., J. Gibson, and B. Kim. 2010. “CPI Mis-measurements and Their Impacts on Economic Management in Korea.” Asian Economic Papers 9 (1): 1–15.
Costa, D. 2001. “Estimating Real Income in the United States from 1888 to 1994: Correcting CPI Bias Using Engel Curves.” Journal of Political Economy 109 (6): 1288–310.
Deaton, A., J. Friedman, and V. Alatas. 2004. “Purchasing Power Parity Exchange Rates from Household Survey Data: India and Indonesia.” Princeton Research Program in Development Studies Working Paper.
Deaton, A. 1998. “Getting Prices Right: What Should be Done?” Journal of Economic Perspectives 12 (1): 37–46.
Deaton, A., and O. Dupriez. 2011. “Spatial Price Differences Within Large Countries.” Mimeo, Princeton University.
Diewert, E. 2005. “Weighted Country Product Dummy Variable Regressions and Index Number Formulae.” Review of Income and Wealth 51 (4): 561–70.
Dumagan, J., and T. Mount. 1997. “Re-examining the Cost-of-Living Index and the Biases of Price Indices.” Department of Commerce Working Paper ESA/OPD, 97–95.
Filho, I., and M. Chamon. 2012. “The Myth of Post-reform Income Stagnation: Evidence from Brazil and Mexico.” Journal of Development Economics 97 (2): 368–86.
Gibson, J. 2013. “The Crisis in Food Price Data.” Global Food Security 2 (2): 97–103.
Gibson, J., K. Beegle, J. de Weerdt, and J. Friedman. 2015. “What Does Variation in Survey Design Reveal About the Nature of Measurement Errors in Household Consumption?” Oxford Bulletin of Economics and Statistics 77 (3): 466–74.
Gibson, J., and B. Kim. 2015. “Hicksian Separability Does Not Hold Over Space: Implications for the Design of Household Surveys and Price Questionnaires.” Journal of Development Economics 114 (1): 34–40.
Gibson, J., and G. Scobie. 2010. “Using Engel Curves to Estimate CPI Bias in a Small, Open, Inflation-targeting Economy.” Applied Financial Economics 20 (17): 1327–35.
Gibson, J., S. Stillman, and T. Le. 2008. “CPI Bias and Real Living Standards in Russia During the Transition.” Journal of Development Economics 87 (1): 140–60.
Gluschenko, K. 2006. “Biases in Cross-Space Comparisons Through Cross-Time Price Indexes: The Case of Russia.” BOFIT Discussion Paper No. 9.
Gong, H., and X. Meng. 2008. “Regional Price Differences in Urban China 1986–2001: Estimation and Implication.” Discussion Paper No. 3621, Institute for the Study of Labor (IZA), Bonn.
Hamilton, B. 2001. “Using Engel’s Law to Estimate CPI Bias.” American Economic Review 91 (3): 619–30.
Higa, K. 2013. “Estimating Upward Bias in the Japanese CPI Using Engel’s Law.” Global COE Hi-Stat Discussion Paper Series No. 295, Hitotsubashi University.
Hill, R. 2004. “Constructing Price Indexes Across Space and Time: The Case of the European Union.” American Economic Review 94 (5): 1379–409.
Jerven, M. 2014. “Benefits and Costs of the Data for Development Targets for the Post-2015 Development Agenda.” Working Paper, Copenhagen Consensus Center.
Larsen, E. R. 2007. “Does the CPI Mirror the Cost of Living? Engel’s Law Suggests Not in Norway.” Scandinavian Journal of Economics 109 (1): 177–95.
Ley, E. 2005. “Whose Inflation? A Characterization of the CPI Plutocratic Gap.” Oxford Economic Papers 57 (3): 634–46.
Logan, T. 2009. “Are Engel Curve Estimates of CPI Bias Biased?” Historical Methods 42 (3): 97–110.
Majumder, A., R. Ray, and K. Sinha. 2012. “Calculating Rural-Urban Food Price Differentials from Unit Values in Household Expenditure Surveys: A Comparison with Existing Methods and A New Procedure.” American Journal of Agricultural Economics 94 (5): 1218–35.
Morduch, J. 1998. “Poverty, Economic Growth, and Average Exit Time.” Economics Letters 59 (3): 385–90.
Muller, C. 2002. “Prices and Living Standards: Evidence from Rwanda.” Journal of Development Economics 68 (1): 187–203.
Nakamura, E., J. Steinsson, and M. Liu. 2015. “Are Chinese Growth and Inflation Too Smooth? Evidence from Engel Curves.” American Economic Journal: Macroeconomics (forthcoming).
Olivia, S., and J. Gibson. 2013. “Using Engel curves to Measure CPI Bias for Indonesia.” Bulletin of Indonesian Economic Studies 49 (1): 85–101.
Oulton, N. 2012. “How to Measure Living Standards and Productivity.” Review of Income and Wealth 58 (3): 424–56.
Rao, P. 2004. “The Country-Product-Dummy Method: A Stochastic Approach to the Computation of Purchasing Power Parities in the ICP.” University of Queensland, Australia.
Rao, P. 2005. “On the Equivalence of Weighted Country‐Product‐Dummy (CPD) Method and the Rao‐System for Multilateral Price Comparisons.” Review of Income and Wealth 51 (4): 571–80.
Ravallion, M., and D. Van De Walle. 1991. “Urban-rural Cost-of-Living Differentials in a Developing Economy.” Journal of Urban Economics 29 (1): 113–27.
Selvanathan, E. 1991. “Standard Errors for Laspeyres and Paasche Index Numbers.” Economics Letters 35 (1): 35–38.
Summers, R. 1973. “International Price Comparisons Based Upon Incomplete Data.” Review of Income and Wealth 19 (1): 1–16.
Van Veelen, M., and R. van der Weide. 2008. “A Note on Different Approaches to Index Number Theory.” American Economic Review 98 (4): 1722–30.
World Bank. 2012. Well Begun, Not Yet Done: Vietnam’s Remarkable Progress on Poverty Reduction and the Emerging Challenges, World Bank: Hanoi.

Table 1.
Key Coefficients from WCPD and Food Engel Curve Regressions

Columns 2 to 4 use all consumption; columns 5 to 7 exclude housing and durables. Each cell reports the coefficient followed by its standard error in parentheses.

| Region and year | WCPD-vw (all) | WCPD-fw (all) | Engel (all) | WCPD-vw (excl.) | WCPD-fw (excl.) | Engel (excl.) |
|---|---|---|---|---|---|---|
| Urban Mid-Northern Mountains (2010) | -0.081 (0.028)** | -0.084 (0.030)** | 0.016 (0.007)* | -0.024 (0.019) | -0.018 (0.019) | 0.015 (0.008) |
| Urban North-Central Coast (2010) | -0.147 (0.029)*** | -0.146 (0.030)*** | -0.025 (0.005)*** | -0.095 (0.019)*** | -0.093 (0.019)*** | -0.028 (0.007)*** |
| Urban Central Highlands (2010) | -0.111 (0.029)*** | -0.104 (0.030)*** | -0.040 (0.007)*** | -0.090 (0.019)*** | -0.088 (0.019)*** | -0.054 (0.010)*** |
| Urban Southeast (2010) | -0.025 (0.029) | -0.019 (0.030) | -0.042 (0.005)*** | -0.025 (0.019) | -0.020 (0.019) | -0.071 (0.007)*** |
| Urban Mekong Delta (2010) | -0.180 (0.028)*** | -0.186 (0.030)*** | -0.017 (0.006)** | -0.127 (0.019)*** | -0.125 (0.019)*** | -0.034 (0.008)*** |
| Rural Red River (2010) | -0.148 (0.028)*** | -0.147 (0.030)*** | -0.009 (0.005) | -0.108 (0.019)*** | -0.110 (0.019)*** | 0.012 (0.008) |
| Rural Mid-Northern Mountains (2010) | -0.107 (0.028)*** | -0.123 (0.030)*** | 0.045 (0.005)*** | -0.047 (0.019)* | -0.047 (0.019)* | 0.039 (0.006)*** |
| Rural North-Central Coast (2010) | -0.203 (0.028)*** | -0.228 (0.030)*** | -0.002 (0.006) | -0.136 (0.019)*** | -0.142 (0.019)*** | -0.011 (0.008) |
| Rural Central Highlands (2010) | -0.164 (0.028)*** | -0.179 (0.030)*** | 0.004 (0.006) | -0.110 (0.019)*** | -0.116 (0.019)*** | -0.007 (0.009) |
| Rural Southeast (2010) | -0.157 (0.028)*** | -0.160 (0.030)*** | -0.038 (0.006)*** | -0.103 (0.019)*** | -0.102 (0.019)*** | -0.058 (0.007)*** |
| Rural Mekong Delta (2010) | -0.231 (0.028)*** | -0.252 (0.030)*** | 0.015 (0.005)** | -0.173 (0.019)*** | -0.180 (0.019)*** | 0.000 (0.008) |
| Urban Red River (2012) | 0.191 (0.029)*** | 0.191 (0.030)*** | 0.025 (0.005)*** | 0.231 (0.019)*** | 0.231 (0.019)*** | -0.001 (0.012) |
| Urban Mid-Northern Mountains (2012) | 0.122 (0.029)*** | 0.125 (0.030)*** | 0.011 (0.007) | 0.223 (0.019)*** | 0.229 (0.019)*** | -0.009 (0.012) |
| Urban North-Central Coast (2012) | 0.059 (0.029)* | 0.059 (0.030)* | 0.003 (0.005) | 0.142 (0.019)*** | 0.144 (0.019)*** | -0.026 (0.010)* |
| Urban Central Highlands (2012) | 0.101 (0.029)*** | 0.107 (0.030)*** | -0.005 (0.007) | 0.164 (0.019)*** | 0.163 (0.019)*** | -0.037 (0.012)** |
| Urban Southeast (2012) | 0.112 (0.029)*** | 0.111 (0.030)*** | -0.023 (0.005)*** | 0.154 (0.019)*** | 0.151 (0.019)*** | -0.072 (0.010)*** |
| Urban Mekong Delta (2012) | 0.001 (0.028) | -0.006 (0.030) | 0.000 (0.006) | 0.102 (0.019)*** | 0.101 (0.019)*** | -0.040 (0.009)*** |
| Rural Red River (2012) | 0.050 (0.029) | 0.053 (0.030) | 0.013 (0.005)** | 0.125 (0.019)*** | 0.129 (0.019)*** | 0.014 (0.008) |
| Rural Mid-Northern Mountains (2012) | 0.074 (0.028)** | 0.073 (0.030)* | 0.056 (0.005)*** | 0.175 (0.019)*** | 0.186 (0.019)*** | 0.025 (0.009)** |
| Rural North-Central Coast (2012) | 0.002 (0.028) | -0.012 (0.030) | 0.010 (0.005) | 0.103 (0.019)*** | 0.105 (0.019)*** | -0.018 (0.008)* |
| Rural Central Highlands (2012) | 0.056 (0.028)* | 0.050 (0.030) | 0.030 (0.006)*** | 0.148 (0.019)*** | 0.152 (0.019)*** | -0.007 (0.010) |
| Rural Southeast (2012) | 0.010 (0.028) | 0.007 (0.030) | 0.012 (0.006)* | 0.096 (0.019)*** | 0.094 (0.019)*** | -0.022 (0.009)** |
| Rural Mekong Delta (2012) | -0.063 (0.028)* | -0.085 (0.030)** | 0.015 (0.005)** | 0.046 (0.019)* | 0.044 (0.019)* | -0.022 (0.007)** |
| Log total hhold expenditure | | | -0.138 (0.002)*** | | | -0.137 (0.002)*** |
| Log relative price of food | | | 0.021 (0.006)*** | | | -0.054 (0.041) |
| Constant | 0.051 (0.028) | 0.055 (0.032) | 1.218 (0.052)*** | -0.015 (0.018) | -0.013 (0.020) | 1.163 (0.061)*** |
| Observations | 1,920 | 1,920 | 18,798 | 1,872 | 1,872 | 18,798 |
| Adjusted R-squared | 0.574 | 0.612 | 0.576 | 0.664 | 0.666 | 0.454 |

Note: The WCPD regressions include as unreported covariates seventy-nine fixed effects for each commodity (seventy-seven if excluding housing and durables) and differ according to whether they use fixed weights (fw) or variable weights (vw). The unreported coefficients for the Engel curve regression are on household size and four demographic ratios (for children, youths, elderly and migrants), the gender, age, sector of activity and education of the household head, and prices for foods eaten away from home. Robust standard errors in parentheses, with statistical significance denoted as: *** p<0.001, ** p<0.01, * p<0.05.

Table 2.
Time-Space Price Indexes from WCPD and Food Engel Curves (Urban Red River in 2010=100)

Columns 2 to 4 use all consumption; columns 5 to 7 exclude housing and durables. Each cell reports the index followed by its standard error in parentheses.

| Region and year | WCPD-vw (all) | WCPD-fw (all) | Engel (all) | WCPD-vw (excl.) | WCPD-fw (excl.) | Engel (excl.) |
|---|---|---|---|---|---|---|
| Urban Mid-Northern Mountains (2010) | 92.2 (2.6) | 91.9 (2.7) | 112.0**,## (5.5) | 97.6 (1.8) | 98.2 (1.9) | 111.7*,# (6.5) |
| Urban North-Central Coast (2010) | 86.4 (2.5) | 86.4 (2.6) | 83.7 (3.2) | 91.0 (1.7) | 91.1 (1.7) | 81.2*,# (4.0) |
| Urban Central Highlands (2010) | 89.5 (2.6) | 90.1 (2.7) | 74.7**,### (3.8) | 91.4 (1.7) | 91.6 (1.8) | 67.4***,### (5.0) |
| Urban Southeast (2010) | 97.5 (2.8) | 98.1 (2.9) | 73.5***,### (2.9) | 97.5 (1.8) | 98.0 (1.9) | 59.5***,### (3.2) |
| Urban Mekong Delta (2010) | 83.5 (2.4) | 83.1 (2.5) | 88.1 (3.9) | 88.1 (1.7) | 88.2 (1.7) | 78.2* (4.6) |
| Rural Red River (2010) | 86.2 (2.5) | 86.3 (2.6) | 93.7 (3.3) | 89.7 (1.7) | 89.5 (1.7) | 109.4**,## (6.6) |
| Rural Mid-Northern Mountains (2010) | 89.9 (2.5) | 88.5 (2.6) | 138.8***,### (5.5) | 95.4 (1.8) | 95.4 (1.8) | 133.3***,### (6.4) |
| Rural North-Central Coast (2010) | 81.6 (2.3) | 79.6 (2.4) | 98.6***,## (4.0) | 87.3 (1.6) | 86.8 (1.7) | 92.0 (5.1) |
| Rural Central Highlands (2010) | 84.9 (2.4) | 83.6 (2.5) | 103.1***,## (4.8) | 89.6 (1.7) | 89.0 (1.7) | 95.0 (6.1) |
| Rural Southeast (2010) | 85.5 (2.4) | 85.3 (2.5) | 75.6*,## (3.3) | 90.2 (1.7) | 90.3 (1.7) | 65.6***,### (3.6) |
| Rural Mekong Delta (2010) | 79.4 (2.2) | 77.8 (2.3) | 111.6***,### (4.3) | 84.1 (1.6) | 83.5 (1.6) | 100.3**,## (6.0) |
| Urban Red River (2012) | 121.1 (3.5) | 121.1 (3.6) | 119.6 (4.4) | 125.9 (2.4) | 125.9 (2.4) | 99.4**,## (8.7) |
| Urban Mid-Northern Mountains (2012) | 112.9 (3.2) | 113.4 (3.4) | 108.6 (5.7) | 125.0 (2.4) | 125.7 (2.4) | 93.4***,### (8.5) |
| Urban North-Central Coast (2012) | 106.1 (3.0) | 106.1 (3.1) | 101.8 (3.9) | 115.2 (2.2) | 115.5 (2.2) | 82.9***,### (6.2) |
| Urban Central Highlands (2012) | 110.6 (3.2) | 111.3 (3.3) | 96.4*,## (5.0) | 117.8 (2.2) | 117.7 (2.3) | 76.3***,### (6.6) |
| Urban Southeast (2012) | 111.8 (3.2) | 111.7 (3.3) | 84.8***,### (3.3) | 116.6 (2.2) | 116.3 (2.3) | 59.0***,### (4.3) |
| Urban Mekong Delta (2012) | 100.1 (2.9) | 99.4 (2.9) | 100.1 (4.4) | 110.7 (2.1) | 110.7 (2.1) | 74.5***,### (5.1) |
| Rural Red River (2012) | 105.1 (3.0) | 105.5 (3.1) | 110.0 (3.7) | 113.3 (2.1) | 113.7 (2.2) | 110.4 (6.3) |
| Rural Mid-Northern Mountains (2012) | 107.7 (3.1) | 107.5 (3.2) | 149.9***,### (5.9) | 119.1 (2.3) | 120.4 (2.3) | 119.7 (8.1) |
| Rural North-Central Coast (2012) | 100.2 (2.8) | 98.8 (2.9) | 107.4 (4.0) | 110.8 (2.1) | 111.0 (2.1) | 88.0***,### (4.9) |
| Rural Central Highlands (2012) | 105.7 (3.0) | 105.1 (3.1) | 123.9**,# (5.8) | 116.0 (2.2) | 116.4 (2.2) | 95.0**,## (7.1) |
| Rural Southeast (2012) | 101.1 (2.9) | 100.7 (3.0) | 109.2 (4.5) | 110.0 (2.1) | 109.9 (2.1) | 84.9***,### (5.3) |
| Rural Mekong Delta (2012) | 93.9 (2.7) | 91.8 (2.7) | 111.1***,## (4.3) | 104.7 (2.0) | 104.5 (2.0) | 85.0***,### (4.2) |

Note: Price indexes follow equations (3a) for WCPD-vw and (3b) for WCPD-fw and equation (12) for the food Engel curve method. Robust standard errors in parentheses. Statistically significant differences between the Engel curve price index and the WCPD-vw (WCPD-fw) price index for a region and time period denoted as: *** p<0.001, ** p<0.01, * p<0.05 (### p<0.001, ## p<0.01, # p<0.05).

Table 3. FGT Poverty Measures for the Rural and Urban Sectors in 2010, Comparing Three Deflators

| Deflator | Area | Headcount (α=0): Rate | Share | Risk | Poverty gap index (α=1): Rate | Share | Risk | Poverty severity index (α=2): Rate | Share | Risk |
|---|---|---|---|---|---|---|---|---|---|---|
| WCPD (variable weights) | Vietnam | 0.271 | 1.00 | 1.000 | 0.079 | 1.00 | 1.000 | 0.034 | 1.00 | 1.000 |
| WCPD (variable weights) | Rural | 0.354 | 0.92 | 1.305 | 0.105 | 0.93 | 1.324 | 0.045 | 0.94 | 1.337 |
| WCPD (variable weights) | Urban | 0.075 | 0.08 | 0.277 | 0.019 | 0.07 | 0.233 | 0.007 | 0.06 | 0.202 |
| WCPD (fixed weights) | Vietnam | 0.262 | 1.00 | 1.000 | 0.077 | 1.00 | 1.000 | 0.032 | 1.00 | 1.000 |
| WCPD (fixed weights) | Rural | 0.341 | 0.92 | 1.301 | 0.101 | 0.93 | 1.320 | 0.043 | 0.94 | 1.334 |
| WCPD (fixed weights) | Urban | 0.075 | 0.08 | 0.286 | 0.018 | 0.07 | 0.241 | 0.007 | 0.06 | 0.210 |
| Engel curve deflator | Vietnam | 0.367 | 1.00 | 1.000 | 0.130 | 1.00 | 1.000 | 0.063 | 1.00 | 1.000 |
| Engel curve deflator | Rural | 0.490 | 0.94 | 1.335 | 0.177 | 0.96 | 1.361 | 0.087 | 0.97 | 1.375 |
| Engel curve deflator | Urban | 0.075 | 0.06 | 0.205 | 0.019 | 0.04 | 0.144 | 0.007 | 0.03 | 0.111 |

Table 4.
Poverty Comparisons for 2010 and 2012 (a)

The first three columns are FGT poverty measures; the last two are average exit time measures. Each cell reports the estimate followed by its standard error in parentheses.

| Deflator | Year | H (α=0) | PG (α=1) | PS (α=2) | T3% | T3%/H |
|---|---|---|---|---|---|---|
| WCPD (variable weights) | 2010 | 27.1 (0.6) | 7.9 (0.2) | 3.4 (0.1) | 3.6 (0.1) | 13.2 (0.3) |
| WCPD (variable weights) | 2012 | 20.0 (0.5) | 5.4 (0.2) | 2.1 (0.1) | 2.3 (0.1) | 11.7 (0.3) |
| WCPD (variable weights) | Change | -7.1 (0.6) | -2.6 (0.2) | -1.3 (0.1) | -1.2 (0.1) | -1.5 (0.3) |
| WCPD (fixed weights) | 2010 | 26.2 (0.6) | 7.7 (0.2) | 3.2 (0.1) | 3.4 (0.1) | 13.1 (0.3) |
| WCPD (fixed weights) | 2012 | 19.6 (0.5) | 5.2 (0.2) | 2.1 (0.1) | 2.3 (0.1) | 11.6 (0.3) |
| WCPD (fixed weights) | Change | -6.6 (0.6) | -2.4 (0.2) | -1.2 (0.1) | -1.2 (0.1) | -1.5 (0.3) |
| Engel curve deflator | 2010 | 36.7 (0.6) | 13.0 (0.2) | 6.3 (0.2) | 6.2 (0.2) | 16.9 (0.3) |
| Engel curve deflator | 2012 | 25.5 (0.5) | 7.9 (0.2) | 3.5 (0.1) | 3.6 (0.1) | 14.1 (0.3) |
| Engel curve deflator | Change | -11.2 (0.6) | -5.1 (0.2) | -2.8 (0.1) | -2.6 (0.1) | -2.8 (0.3) |

Note: “H” is headcount index, “PG” is poverty gap index, “PS” is poverty severity index, “T3%” is the average exit time measure of Morduch (1998) at a 3% annual real growth rate, and “T3%/H” is the average exit time amongst the poor.
(a) Standard errors in parentheses are adjusted for the stratification, clustering, and weighting of the data.

Appendix Table 1.
Mapping of Prices and Budget Shares

| Code | Consumption survey group | Avg budget share | Price survey item/specification |
|---|---|---|---|
| 101 | Plain rice | 0.082 | White rice #1 (lower quality); White rice #2 (premium variety) |
| 102 | Sticky rice | 0.005 | Sticky rice |
| 103 | Maize | 0.001 | — |
| 104 | Cassava | 0.000 | — |
| 105 | Potatoes | 0.001 | — |
| 106 | Bread, flour | 0.002 | White bread |
| 107 | Instant noodles | 0.007 | Instant noodles |
| 108 | Fresh rice noodles | 0.002 | Fresh rice noodles |
| 109 | Vermicelli | 0.001 | (a) |
| 110 | Pork | 0.051 | Pork: Rump; Pork: Belly |
| 111 | Beef | 0.009 | Beef; Fresh beef rib |
| 112 | Buffalo meat | 0.001 | (b) |
| 113 | Chicken | 0.024 | Battery chicken meat; Live free range chicken; Free range chicken meat |
| 114 | Duck and other poultry | 0.006 | Whole local duck |
| 115 | Other types of meat | 0.002 | (c) |
| 116 | Processed meat | 0.005 | Pork pie |
| 117 | Cooking oil, lard | 0.009 | Cooking oil; Lard |
| 118 | Fresh shrimp, fish | 0.036 | Carp; Salt-water shrimp; Fresh-water shrimp |
| 119 | Dried shrimp and fish | 0.004 | Dried fish |
| 120 | Other seafood | 0.003 | (d) |
| 121 | Eggs | 0.007 | Chicken eggs |
| 122 | Tofu | 0.005 | Tofu |
| 123 | Peanuts, sesame | 0.001 | — |
| 124 | Beans of various kinds | 0.001 | (e) |
| 125 | Fresh peas | 0.002 | Fresh peas |
| 126 | Water morning glory | 0.004 | Water morning glory |
| 127 | Kohlrabi | 0.001 | (f) |
| 128 | Cabbage | 0.002 | Cabbage |
| 129 | Tomatoes | 0.002 | Tomatoes |
| 130 | Other vegetables | 0.013 | (g) |
| 131 | Oranges | 0.002 | Oranges |
| 132 | Bananas | 0.003 | Bananas |
| 133 | Mangoes | 0.001 | Mangoes |
| 134 | Other fruits | 0.010 | (h) |
| 135 | Fish sauce | 0.005 | Fish sauce |
| 136 | Salt | 0.001 | Salt |
| 137 | MSG | 0.002 | (i) |
| 138 | Glutamate | 0.004 | (i) |
| 139 | Sugar | 0.005 | White sugar |
| 140 | Confectionery | 0.005 | Fruit candies |
| 141 | Condensed milk | 0.007 | Condensed milk |
| 142 | Ice cream, yoghurt, other dairy | 0.002 | (j) |
| 143 | Fresh milk | 0.004 | — |
| 144 | Alcohol | 0.006 | Vodka |
| 145 | Beer | 0.004 | Bottled beer #1 (Northern brand); Bottled beer #2 (Southern brand) |
| 146 | Bottled and canned water, soft drinks | 0.002 | Soft drink; Fruit juice; Bottled water |
| 147 | Instant coffee | 0.001 | (k) |
| 148 | Coffee powder | 0.001 | Powdered coffee |
| 149 | Instant tea powder | 0.000 | (l) |
| 150 | Other dried tea | 0.005 | Dried tea |
| 151 | Cigarettes, waterpipe tobacco | 0.010 | Cigarettes #1 (Northern brand); Cigarettes #2 (Southern brand) |
| 152 | Betel leaves, areca nuts | 0.000 | — |
| 153 | Outdoor meals | 0.074 | Outdoor meals - breakfast; Outdoor meals - lunch/dinner |
| 154 | Other food and drinks | 0.013 | (m) |
| 201 | Pocket money for children | 0.009 | (n) |
| 204 | Petrol | 0.034 | Petrol |
| 205 | Kerosene | 0.001 | Kerosene |
| 212 | Other types of fuel | 0.030 | (o) |
| 213 | Deposit fees for vehicles | 0.002 | — |
| 214 | Matches, candles, fire stones, lighters | 0.001 | (p) |
| 215 | Soap, detergent | 0.007 | Washing detergent |
| 216 | Dish washing liquid | 0.003 | (q) |
| 217 | Shampoo, conditioner | 0.005 | Shampoo |
| 218 | Bath soap, shower gel | 0.002 | Soap |
| 219 | Skin care and cosmetics products | 0.002 | (r) |
| 220 | Tooth paste and brush | 0.004 | Toothpaste |
| 221 | Toilet paper, razor | 0.002 | Toilet paper |
| 222 | Books, newspapers, magazines for adults | 0.001 | Notebook |
| 223 | Books, newspapers for children | 0.000 | Notebook |
| 224 | Fresh, nonworship flowers | 0.000 | — |
| 226 | Regular worship activities | 0.006 | — |
| 227 | Haircut, hairdressing | 0.005 | Men’s haircut; Ladies’ haircut |
| 228 | Other daily expenditures | 0.007 | — |
| 300 | Nonfood, annual spending | 0.058 | Tailoring; Puncture repair |
| 400 | Gifts for special occasions | 0.012 | — |
| dur | Durables (user cost) | 0.088 | DVD player |
| edu | Education-related spending | 0.036 | Notebook; School fee for public high school |
| hlth | Health-related spending | 0.043 | Paracetamol; Flu medicine |
| util | Utilities | 0.023 | Electricity tariffs |
| rent | Rent | 0.161 | Hedonic regression on dwelling values |

Notes: Average budget shares use democratic weights applied to the 2010–2012 pooled VHLSS dataset. For items with multiple prices per consumption survey group, the price relativities are averaged before mapping to the budget shares. The 11 items with “—” have no prices and are ignored in the analysis. Items with ( ) use prices of similar items as follows: (a) fresh rice noodles; (b) beef; (c) beef, pork, chicken, and duck; (d) carp, shrimp, and dried fish; (e) fresh peas; (f) cabbage; (g) peas, water morning glory, cabbage, and tomatoes; (h) oranges, bananas, and mangoes; (i) salt; (j) condensed milk; (k) powdered coffee; (l) dried tea; (m) all foods; (n) instant noodles, candies, beef noodle soup, notebooks, and school fees; (o) petrol and kerosene; (p) cigarettes; (q) washing detergent; and (r) shampoo and soap.

Appendix Table 2. Impact of Various Modeling Assumptions on Spatial Price Indexes and Inflation Rates

| Region | (1) No imputed prices | (2) Imputed prices | (3) Pooled house value equation | (4) 2012 on 2010 base | (5) Inflation since 2010 | (6) Rebased to 2012 | (7) Adding more prices |
|---|---|---|---|---|---|---|---|
| Urban Red River | 100.0 | 100.0 | 100.0 | 118.7 | 18.7 | 100.0 | 100.0 |
| Urban Mid-Northern Mountains | 82.1 | 81.5 | 81.2 | 98.3 | 21.1 | 82.8 | 82.6 |
| Urban North-Central Coast | 77.9 | 77.1 | 77.1 | 94.6 | 22.7 | 79.7 | 77.9 |
| Urban Central Highlands | 86.5 | 86.7 | 86.2 | 104.9 | 21.7 | 88.3 | 88.7 |
| Urban Southeast | 96.2 | 97.6 | 97.8 | 110.6 | 13.1 | 93.1 | 92.2 |
| Urban Mekong Delta | 74.8 | 74.2 | 74.4 | 87.3 | 17.3 | 73.5 | 73.6 |
| Rural Red River | 80.7 | 80.0 | 78.8 | 95.8 | 21.6 | 80.7 | 79.8 |
| Rural Mid-Northern Mountains | 80.5 | 79.5 | 79.0 | 94.7 | 19.9 | 79.8 | 78.9 |
| Rural North-Central Coast | 71.4 | 70.6 | 70.0 | 85.8 | 22.6 | 72.3 | 70.3 |
| Rural Central Highlands | 77.3 | 77.1 | 76.4 | 94.6 | 23.8 | 79.7 | 79.3 |
| Rural Southeast | 77.4 | 77.8 | 77.4 | 91.4 | 18.1 | 77.0 | 75.6 |
| Rural Mekong Delta | 70.5 | 70.0 | 69.8 | 79.7 | 14.2 | 67.1 | 67.2 |

Notes: The inflation factor reported is the change in the average price level for a region and sector from the 2010 survey (centered on October) to the 2012 survey (centered on June), so it is not an annual rate of inflation. The additional prices added in column (7) are for thirty more nonfoods and seven more foods, which were included in the 2012 price survey but not the 2010 survey.
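The arithmetic linking columns (3) to (6) of Appendix Table 2 can be illustrated as follows. This sketch assumes, based only on the reported figures being consistent with it, that column (5) is the percentage change from column (3) to column (4), and that column (6) divides column (4) by the Urban Red River value of 118.7. The function names are hypothetical and the paper does not state these formulas explicitly.

```python
def inflation_since_2010(index_2012_on_2010_base, index_2010):
    """Percentage change in a region's price level between the two surveys."""
    return (index_2012_on_2010_base / index_2010 - 1.0) * 100.0


def rebase_to_2012(index_2012_on_2010_base, reference_2012):
    """Express a 2012 index relative to the 2012 reference region (=100)."""
    return index_2012_on_2010_base / reference_2012 * 100.0


# Values copied from Appendix Table 2: column (3) and column (4).
URBAN_RED_RIVER_2012 = 118.7
regions = {
    "Urban Mid-Northern Mountains": (81.2, 98.3),
    "Rural Mekong Delta": (69.8, 79.7),
}

results = {}
for name, (col3, col4) in regions.items():
    inflation = inflation_since_2010(col4, col3)
    rebased = rebase_to_2012(col4, URBAN_RED_RIVER_2012)
    results[name] = (round(inflation, 1), round(rebased, 1))
    print(f"{name}: inflation {inflation:.1f}%, rebased index {rebased:.1f}")
```

Rebasing removes the common inflation component, so column (6) compares regions within 2012 in the same way that column (3) compares them within 2010.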