Poverty & Equity Global Practice Working Paper 217

EVALUATING THE ACCURACY OF HOMEOWNERS’ SELF-ASSESSED RENT IN METROPOLITAN LIMA

Lidia Ceriani
Sergio Olivieri
Marco Ranzani

August 2019

ABSTRACT

Attributing a rental value to homeowners’ dwellings is essential in different contexts, including poverty and inequality analysis, the compilation of national accounts, consumer price indexes, and estimation of purchasing power parity indexes. The proposed solution is often to use homeowners’ estimates of the market rent they would pay for their dwelling if they were renting it, which is usually referred to as homeowners’ self-assessed rent. The lack of alternative surveys and of up-to-date and complete administrative data about dwellings’ market values typically constrains researchers to test the accuracy of homeowners’ self-assessed rent using only information from household budget surveys. Using 13 years of the Peruvian household budget survey, this paper compares two methods to assess the accuracy of homeowners’ self-assessed rent and finds that the average homeowner in Lima overestimates the market rent of her dwelling by between 8 and 15 percent. However, homeowners’ self-assessment inaccuracy fades away in most years when homeowners are compared with their most observationally similar tenants.

This paper is a product of the Poverty and Equity Global Practice Group. It is part of a larger effort by the World Bank to provide open access to its research and contribute to development policy discussions around the world. The authors may be contacted at mranzani@worldbank.org.

The Poverty & Equity Global Practice Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly.
The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Poverty & Equity Global Practice Knowledge Management & Learning Team

This paper is co-published with the World Bank Policy Research Working Papers.

Lidia Ceriani1 Sergio Olivieri2 Marco Ranzani3

JEL: N16; R21; R31; I32.
Keywords: Imputed Rent; Hedonic Model; Peru.

The authors wish to thank Erich Battistin, Dean Jolliffe, Peter Lanjouw, Kristen Himelein, and Carlos Rodriguez Castelan for their comments on previous versions of the paper.

1 Georgetown University.
2 World Bank Group, Poverty and Equity Global Practice.
3 World Bank Group, Poverty and Equity Global Practice.

1. Introduction

Attributing a value to the flow of services households derive from their dwellings (i.e. rental value) is a recurring knotty issue in different contexts, including the compilation of national accounts (Heston, 1994), the estimation of consumer price indices (Lebow and Rudd, 2003) and purchasing power parity indices (Deaton and Heston, 2010), and the distributional analysis in welfare economics (Balcazar et al., 2017).
The key issue consists in estimating a rental value for those households that do not rent the dwelling they live in, either because they own it or, for instance, because it is provided free of charge by relatives or employers. Despite the prominence of this issue in different disciplines, the jury is still out regarding the best practice for imputing rental values to homeowners. Balcazar et al. (2017) offer a comprehensive review of different imputation methods ranging from parametric to non-parametric models, including rent-to-value and user cost approaches.

One of the solutions proposed is to use the owners’ assessment of the market rent they would pay if they rented their dwelling. For example, half of the 21 economies in the Asia and Pacific region that participated in the International Comparison Program for the computation of 2011 purchasing power parities use self-assessed rental values for imputing rents to homeowners (Asian Development Bank 2014, p.100). The African Development Bank presents equivalent results for countries that include rents in National Accounts: half of the countries use self-assessment (African Development Bank 2013). Moreover, weights of the Consumer Price Index in the United States are computed based on the value of dwellings found in the Consumer Expenditure Survey (CES). The rental value of owner-occupied dwellings in the CES depends on owners’ estimates of how much their residences would rent for (Lebow and Rudd 2003). Finally, according to a World Bank survey, 28 of 70 developing countries include housing in the welfare aggregate and 15 of them use homeowners’ self-assessment.

This method relies on the expectation that owners can give a good estimate of the rental market value of their dwellings, perhaps with the help of interviewers, as suggested in Garner and Kogan (2007).
This information is frequently collected in household budget surveys, where homeowners are asked to estimate the rent they would pay to live in their current dwelling. However, owner-occupiers may not be able to provide an accurate estimate of their dwelling’s rental price, and the direction and magnitude of the inaccuracy are unclear a priori. In countries where rental markets are very thin due to high rates of home ownership, homeowners might merely lack information about rental prices. Furthermore, homeowners might overestimate the true rental value of their dwelling compared to rented ones with similar characteristics because of their attachment to specific features of their residences (particularly if they designed or built their homes themselves) or of their neighborhood (Frick et al. 2010; Heston and Nakamura 2009).

Extensive evidence on the United States suggests that self-assessed rental equivalences may in fact be overestimated. Goodman and Ittner (1992) explore the accuracy of owners’ estimates of house values relative to the sales prices of the same properties, using a national sample of the United States. The authors find that the median homeowner in the mid-1980s overvalued her house by about 6 percent. Garner and Short (2001) find that housing costs based on self-reported rental equivalence resulted in higher estimates (almost 15 percent) than those based on a hedonic model. Garner et al. (2006) state that median reported rental equivalence in the US Consumer Expenditure Survey is higher than that of any other approach (hedonic model, rent-to-value model, payment approach) based on the same data.

Despite the existing empirical evidence on the inaccuracy in homeowners’ self-assessed rental value and the consensus on the necessity of estimating its magnitude before using self-assessed estimates, there is no clear methodological recommendation on how to test it.
Moreover, in developing countries in particular, any analysis is hampered by the lack of up-to-date and complete administrative data and of alternative surveys about dwellings’ market values. Thus, practitioners must rely solely on the data gathered in household budget surveys to test the accuracy of homeowners’ self-assessed rent. In this paper, we shed some light on how to assess the accuracy of homeowners’ self-assessed rent responses based only on information from household budget surveys and relying on two estimation methods.

The paper is organized as follows. Section 2 describes the recent evolution of the housing market in Peru. Section 3 presents the proposed methodology. Section 4 describes the microdata used in the analysis. Section 5 discusses the results, and section 6 presents final remarks.

2. Housing Market in Peru

In this section, we briefly describe the housing market in Peru, and particularly in Lima Metropolitana, over the past 15 years. The housing market has been very dynamic, driven by a rapid urbanization process, changes in policies, and robust growth in the construction sector.

The urbanization process started in Peru in the mid-1950s, when the rural population began moving from the Andes towards the region of the capital city, Lima. As a result, the urban population increased from 47 percent in 1960 to 79 percent in 2013 (World Development Indicators). At the beginning of the urbanization process, the government addressed the growing demand for housing in urban settlements by allowing rural migrants to occupy peripheral land and by recognizing the legal status of existing informal settlements. Efforts to regularize and upgrade such informal settlements (the so-called barriadas) were undertaken during the 1970s but were halted first by the economic crises of the 1980s, and then by the new constitution approved in 1993, which put an end to the recognition of housing rights and dismantled all related institutions.
In the early 2000s, the Ministry of Housing was re-established, and the government established the Mivivienda Fund, the main public institution for the provision of social housing, which was reorganized in 2002 following the directions of the National Housing Plan for 2003-2007. Finally, in 2006, the 2006-2011 National Housing Plan shifted the focus of social housing efforts towards the poorest layers of the population, i.e. those with low and very low income.4

Thanks to these public housing policies, combined with sustained economic growth and an expanding urban population, the years between 2005 and 2013 in particular coincided with a period of high growth of the Peruvian housing market. While the economy was growing at an impressive average annual rate of about 7 percent during the period 2005-2013, the construction sector grew at an average rate above 12 percent. These figures are even more striking given that this period includes the global financial crisis that started in 2008. The contribution of the construction sector to gross domestic product (GDP) increased from 4.1 percent in 2001 to 5.1 percent in 2007 and peaked at 6.8 percent in 2013 and 2014 (Figure 1).

4 The socio-economic levels are defined by the Peruvian Association of Market Research Firms (APEIM - Asociación Peruana de Empresas de Investigación de Mercados). For an extensive discussion on the history of progressive housing in Peru, see Fernandez-Maldonado (2010) and Apeim (2007).

Figure 1. GDP annual growth rate and contribution of the construction sector to GDP, 2001-16
[Figure not reproduced. Series: GDP (left axis, percent) and Construction/GDP (right axis, percent), 2001-2016.]
Source: INEI, 2018.
Note: Data for 2015 and 2016 are provisional.

Among all regions in Peru, the Lima metropolitan area, or Lima Metropolitana, an administrative division formed by the conurbation of the Peruvian cities of Lima and Callao, has been the most dynamic. The population of Metropolitan Lima increased from 900,000 (15 percent of total population) in 1940 to 10.48 million (36 percent of total population) in 2017, the year of the latest available census (Figure 2).

Figure 2. Evolution of the Population in Lima Metropolitana, census years
[Figure not reproduced.]
Source: Based on population census data, INEI.

As a consequence of the progressive public housing policies, the number of real estate units registered in the Lima district cadaster, which corresponds to about 40 percent of all registered units in the country, increased by 160 percent, from 88,715 units in 2001 to 232,682 units in 2016, reaching a peak of 256,271 in 2014 (Figure 3).

Figure 3. Number of real estate units registered in the National Buildings Cadaster, 2001-16
[Figure not reproduced. Series: Lima Metropolitana and Overall Country except Lima, 2001-2016.]
Source: Based on data from the Superintendencia Nacional de los Registros Publicos - Registro Predial Urbano.

The revamp of the public housing policies during the second half of the 2000s led to an expansion in the number of individuals with access to credit for financing the purchase or renovation of a house. In particular, public housing policies in this period were focused on increasing access to credit for the poorest socio-economic stratum of the population, which in 2007 accounted for more than 50 percent of the population of Metropolitan Lima. In Metropolitan Lima, the number of loans sponsored by the Mivivienda Fund alone increased from 2,085 in 2008 to 4,407 in 2016, with a peak of 7,879 in 2013 (Figure 4).
The number of disbursements of Mivivienda Housing Subsidies (Bono Familiar Habitacional) increased from 1,418 in 2004 to 4,351 in 2016 (Fondo Mivivienda).

Figure 4. Disbursements of Mivivienda Funds, Lima Metropolitana 2001-16
[Figure not reproduced. Series: number of disbursements in Lima Metropolitana (left axis) and Mivivienda disbursements in Lima Metropolitana as a share of total disbursements (right axis, percent), 2001-2016.]
Source: Based on data from Fondo Mivivienda S.A.

3. Methodology

In this paper, we test the hypothesis that the self-assessed rental price reported by homeowners is an accurate estimate of the rental market price of similar dwellings. If homeowners’ estimates are inaccurate, then they should not be used as a direct measure of imputed rent in the construction of welfare aggregates, and they will require some form of adjustment in order to accurately reflect the rental market price.

The best way to learn about the direction and size of the inaccuracy would be to compare homeowners’ self-reported values with the rental market price had the dwelling been on the market. This is clearly not possible: being an owner and being a tenant of one's principal dwelling are mutually exclusive statuses. The issue resembles the problem of evaluating the impact of a policy: the ideal counterfactual would be provided by observing the outcome for the same unit both with and without the policy, which of course is not possible in non-experimental studies.

Assume we have a population made of $i = 1, 2, \dots, N$ individuals that differ in their tenure status and belong to either one of two groups: tenants ($T$) and nontenants ($\sim T$).
With tenants, we refer to the set of individuals who rent their dwelling, paying as a price the market rent. With nontenants, we refer to the set of owners and all individuals who receive the dwelling for free or at subsidized rates from employers, government, or other private individuals such as family members and friends.5

5 For this last set of individuals, the reported rent does not correspond to the market rent of the dwellings they live in, as a subsidized rent is, by definition, lower than the market price. Therefore, their reported rent does not correspond to the flow of services received by living in the dwelling based on market prices. Typically, the question about self-assessed rent is administered in household surveys to all non-tenants.

Let $d_i$ be an indicator such that:

$$ d_i = \begin{cases} 1 & \text{if } i \in T \\ 0 & \text{if } i \in \sim T \end{cases} $$

In other words, $d_i = 1$ if individual $i$ is a tenant and $d_i = 0$ if individual $i$ is a nontenant. The outcome variable of interest is the rental value reported by individuals in the survey, $y_i$, which is assumed to be equal to the true market rental value $y_i^{*}$ if the respondent is a tenant. In order to assess the possible inaccuracy of the rental value reported by a nontenant, we would ideally observe the difference in the value each individual would report if she were a tenant and if she were a nontenant. In experimental settings, where individuals are randomly assigned to treatment, one could retrieve the average treatment effect by simply comparing the outcome among treated and untreated individuals. Formally, denoting by $y_i^{1}$ the value individual $i$ would report as a tenant and by $y_i^{0}$ the value she would report as a nontenant,

$$ y_i \mid (d_i = 1) = y_i^{1}, \qquad y_i \mid (d_i = 0) = y_i^{0} $$

The problem is that only one of the two terms can be observed, as $i$ belongs either to the set of tenants or to the set of nontenants. Outside experimental settings, assignment to treatment is not random, and selection into treatment, which determines treatment status, might be affected by information that is not available to the researcher.
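A short worked decomposition may help make the selection problem concrete. It is a standard textbook identity rather than a formula taken from this paper, and the potential-outcome symbols $y_i^{1}$ and $y_i^{0}$ (the value individual $i$ would report as a tenant and as a nontenant, respectively) are notation assumed here for illustration:

```latex
\begin{align*}
E[y_i \mid d_i = 1] - E[y_i \mid d_i = 0]
  &= E[y_i^{1} \mid d_i = 1] - E[y_i^{0} \mid d_i = 0] \\
  &= \underbrace{E[y_i^{1} - y_i^{0} \mid d_i = 1]}_{\text{ATT}}
   + \underbrace{E[y_i^{0} \mid d_i = 1] - E[y_i^{0} \mid d_i = 0]}_{\text{selection bias}}
\end{align*}
```

Under random assignment the second term is zero, so the raw difference in group means recovers the effect of interest. Without random assignment, the raw comparison mixes the effect with pre-existing differences between the two groups.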
Nonrandom selection might take two forms: the error term is correlated with the regressors determining assignment or with the unobservable component in the selection equation. That is, there can be selection on observable and on unobservable characteristics, and the consequence is that the causal relationship between outcome and treatment is not directly observable in the data, since treated and untreated are not comparable.

In order to estimate the average size of the inaccuracy of the self-assessed rental price reported by homeowners, we propose two methodologies that allow us to retrieve the average treatment effect on the treated (ATT): Ordinary Least Squares (OLS) and matching estimators. Different approaches use different assumptions about the form of assignment and the type of impact to identify the parameter of interest. Under the assumption of heterogeneous treatment effects, OLS can retrieve the average treatment effect on the treated if treatment and unobservables in the outcome equation are not correlated.

Matching estimators instead aim at reestablishing experimental conditions in nonexperimental settings by reproducing the treatment group among the untreated. In other words, matching estimators construct “the correct sample counterpart for the missing information on the treated outcomes had they not been treated by pairing each participant with members of the nontreated group” (Blundell and Costa Dias 2009, p. 593). The fundamental assumption to identify the average treatment effect on the treated is that the set of observable characteristics contains all the relevant information about the potential outcome in the absence of treatment that was available to individuals at the time of deciding whether to be treated or untreated. In other words, the researcher has all the information that affects participation and outcomes among the untreated.
This is known as the Conditional Independence Assumption (CIA) and can be formally stated as follows:

$$ y_i^{0} \perp d_i \mid X_i \qquad \text{[M1]} $$

It implies that unobservables in the outcome equation are orthogonal to treatment conditional on observable characteristics, or in other words that there is no selection on unobservables. The CIA implies that treated and untreated individuals are comparable with respect to the nontreated outcome conditional on $X$. What matching estimators do is to look for one (or a set of) untreated observation(s) with the same realization of $X$ for each treated observation. The outcome among the untreated so identified will be a good predictor of the unobserved counterfactual. This is possible only if observable characteristics in $X$ do not predict participation perfectly, i.e. there is space for unobserved factors to affect the treatment status. This means that we observe treated and untreated with similar characteristics. Formally:

$$ P(d_i = 1 \mid X_i) < 1 \qquad \text{[M2]} $$

Under assumptions (M1) and (M2), the matching estimator is obtained by averaging, over the region of the common support, the differences in outcomes among treated and untreated with similar characteristics $X$, using the weights of the distribution of $X$ among the treated. Formally,

$$ ATT^{M} = E_{X \mid d = 1} \left\{ E[y \mid d = 1, X] - E[y^{0} \mid d = 0, X] \right\} $$

The matching estimator is the average difference in outcomes over the common support, weighted by the distribution of $X$ among the treated. As the curse of dimensionality, i.e. the potentially large number of observable characteristics $X$, can be a serious obstacle to the implementation of matching estimators, a common alternative proposed in the literature is to match on a function of $X$ rather than on $X$ itself. We employ the probability of participation given the set of characteristics $X$, or propensity score (Rosenbaum and Rubin 1983), which we obtain by estimating a logit model.
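The matching step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: propensity scores are taken as already estimated (for instance by a logit model), the min/max rule for the common support, the Gaussian kernel, and the bandwidth of 0.06 are choices made here for illustration, all function names are hypothetical, and the toy data are invented.

```python
import math

def common_support(treated, controls):
    """Keep units whose propensity score lies in the overlap of both groups.

    Each unit is a (propensity_score, outcome) pair. The min/max rule used
    here is one simple convention among several (an assumption).
    """
    lo = max(min(p for p, _ in treated), min(p for p, _ in controls))
    hi = min(max(p for p, _ in treated), max(p for p, _ in controls))
    keep = lambda units: [(p, y) for p, y in units if lo <= p <= hi]
    return keep(treated), keep(controls)

def nearest_neighbor_att(treated, controls, k=1):
    """ATT using the mean outcome of the k closest controls (with replacement)."""
    gaps = []
    for p_t, y_t in treated:
        nearest = sorted(controls, key=lambda c: abs(c[0] - p_t))[:k]
        gaps.append(y_t - sum(y for _, y in nearest) / len(nearest))
    return sum(gaps) / len(gaps)

def kernel_att(treated, controls, bandwidth=0.06):
    """ATT using a Gaussian-kernel weighted average of all controls."""
    gaps = []
    for p_t, y_t in treated:
        w = [math.exp(-0.5 * ((p_c - p_t) / bandwidth) ** 2) for p_c, _ in controls]
        y0_hat = sum(wi * y for wi, (_, y) in zip(w, controls)) / sum(w)
        gaps.append(y_t - y0_hat)
    return sum(gaps) / len(gaps)

# Invented toy data: (propensity score, log monthly rent)
treated = [(0.30, 6.2), (0.50, 6.5), (0.70, 6.9), (0.95, 7.4)]
controls = [(0.10, 5.8), (0.32, 6.1), (0.48, 6.4), (0.72, 6.8)]

t_cs, c_cs = common_support(treated, controls)  # drops scores 0.95 and 0.10
att_nn = nearest_neighbor_att(t_cs, c_cs, k=1)
att_kernel = kernel_att(t_cs, c_cs)
```

Both functions average the outcome gaps over the treated units, so the result is weighted by the distribution of the propensity score among the treated, as in the estimator above. Which group plays the role of "treated" is left to the analyst.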
The following step is to establish a metric of proximity between the propensity scores for treated and control observations and a set of weights to associate the selected set of untreated observations to each treated observation. Among the possible algorithms the literature has proposed, we apply two methods: (i) nearest neighbor and (ii) kernel matching.6

How do these two methods, namely OLS and matching algorithms, differ? Matching estimators make the same identifying assumption as OLS but avoid imposing additional ones. First, as a nonparametric method, matching does not impose a linear relationship between outcome and treatment. However, OLS might not be as restrictive in its parametric specification if many interactions among covariates are included. Second, OLS relaxes the common support assumption. In other words, while matching only compares comparable observations, OLS might extrapolate over unobserved regions of the distribution of $X$, unless the regression model is estimated only over the region defined by the common support. In short, while neither method corrects for differences in unobservables, matching estimators correct for differences in the support of $X$ by executing the matching over the region of the common support, and eliminate differences in the distribution of $X$ over the common support by reweighting untreated observations to mimic the distribution of $X$ within the set of treated observations. Finally, matching estimators are very data hungry and are less efficient than OLS if the latter is correctly specified.

For the reasons listed above, we also propose a third way: estimating an OLS regression over the region of the common support. This approach combines the simplicity and efficiency of the OLS estimator with the advantage of restricting the analysis to a set of comparable observations.

The choice of the appropriate matching variables, as well as the choice of the regressors to be included in the OLS estimation, should be dictated by economic theory and, where needed, adjusted based on information available in the data. Rosenbaum and Rubin (1983) argue that only variables measured before the treatment should be included in order to avoid any endogeneity with respect to the exposure to the treatment. Lechner (2008) shows that this condition can be relaxed as long as the effect of the treatment on the covariates is non-systematic; in other words, variables observed after the treatment only induce a measurement error in $X$.

6 Nearest neighbor matching is performed by pairing each treated observation with the closest (or a certain number of the closest) untreated observation in terms of propensity score. Allowing the choice of more than one nearest neighbor is important since the nearest neighbor might be very far in terms of propensity score and might not be a good match. Kernel matching does not restrict the set of matching partners but uses as controls a weighted average of all observations in the comparison group, where a larger weight is given to observations that are closer to the treated observation in terms of propensity score. For a good guide to the implementation of propensity score matching, see Caliendo and Kopeinig (2008).

4. Data

We use data from the 2004-16 rounds of the Peruvian Household Survey on Living Conditions and Poverty (Encuesta Nacional de Hogares sobre Condiciones de Vida y Pobreza, or ENAHO). This long time span includes the period of rapid expansion of the housing market and the implementation of the new progressive housing policies of the second half of the 2000s. Over this period, the survey was conducted using the same methodology, including sampling design (updated following the 2007 population census) and questionnaire.
ENAHO is a continuous survey, in the field since 1995, that provides detailed information on demographics, education, health, labor market status, farm and non-farm income, participation in social programs, and perceptions of governance, democracy, and transparency.7 It is the official source of information for monthly indicators on the evolution of poverty, welfare, and living conditions of Peruvian households. The survey is representative at the department level (24 departments) and at the regional level (3 geographical regions, namely Costa, Sierra, and Selva), plus Lima Metropolitana (defined, as already mentioned, as the conurbation of the two cities of Lima and Callao). Within each region, the survey is representative for urban and rural areas separately.

The first section of the questionnaire collects information on households’ dwelling characteristics, including type of construction, building materials, number of rooms, ownership status, renovations recently undertaken and means of financing, and utilities and other facilities serving the dwelling as well as their cost. The ownership status distinguishes the following six categories: (i) tenants, (ii) outright owners, (iii) owners by invasion, (iv) owners paying a mortgage, (v) individuals living for free in accommodation provided by the employer, and (vi) individuals living for free in accommodation provided by any other individual or institution.8 Tenants report the monthly rent paid, while all other individuals are asked how much they could earn if they were to rent out the dwelling they live in.

7 The Peruvian Statistical Office (Instituto Nacional de Estadistica e Informatica-INEI) conducted a first survey of living standards in 1985, with the support of the World Bank. Subsequently, it has implemented a permanent annual national survey on living conditions and poverty since 1995. In 1997 the methodology was improved with the support of the Inter-American Development Bank, the World Bank and the Economic Commission for Latin America and the Caribbean.
Starting in 2004, the methodology has been updated, including a change in expansion factors following the 2007 population census.

8 Since the beginning of the urbanization process in Peru, in the mid-1950s, the predominant means for rural migrants to house themselves has been massive invasion of peripheral land (see, for instance, Fernandez-Maldonado (2010) and Gwinne (2007)).

Table 1. Number of Observations and Tenancy Rates, 2004-16

| Year | Observations: Overall Country | Observations: Lima Metropolitana | Observations: Rest of the country | Share of Market Tenants: Overall Country | Share of Market Tenants: Lima Metropolitana | Share of Market Tenants: Rest of the country |
|---|---|---|---|---|---|---|
| 2004 | 19,502 | 2,181 | 17,321 | 7.57% | 10.94% | 6.16% |
| 2005 | 19,895 | 2,220 | 17,675 | 7.83% | 11.28% | 6.35% |
| 2006 | 20,577 | 2,468 | 18,109 | 8.33% | 11.73% | 6.84% |
| 2007 | 22,204 | 2,764 | 19,440 | 9.85% | 13.62% | 8.22% |
| 2008 | 21,502 | 2,689 | 18,813 | 10.10% | 15.04% | 7.88% |
| 2009 | 21,753 | 2,646 | 19,107 | 10.56% | 15.85% | 8.17% |
| 2010 | 21,496 | 2,570 | 18,926 | 9.92% | 14.68% | 7.77% |
| 2011 | 24,809 | 2,895 | 21,914 | 9.17% | 14.69% | 6.72% |
| 2012 | 25,091 | 2,888 | 22,203 | 8.95% | 13.73% | 6.84% |
| 2013 | 30,453 | 3,918 | 26,535 | 8.97% | 14.34% | 6.52% |
| 2014 | 30,848 | 4,003 | 26,845 | 8.94% | 13.79% | 6.75% |
| 2015 | 32,179 | 3,964 | 28,215 | 9.29% | 15.17% | 6.58% |
| 2016 | 35,774 | 4,125 | 31,649 | 9.50% | 15.75% | 6.63% |

Source: Based on data from ENAHO, INEI.

The analysis makes use of tenants and owners (both outright owners and those still paying mortgages), leaving out other types of ownership status. The choice is motivated by our goal of assessing the existence (and the extent) of the inaccuracy of homeowners’ self-assessed rental value relative to the rental market price paid by tenants. Therefore, we exclude from the sample homeowners that did not pay for the house they live in as well as subsidized tenants, who get their dwelling for free or at a subsidized price. We further restrict our analysis to Metropolitan Lima since it has the highest share of tenants in all years considered and a relatively more homogeneous housing market. This allows us to have a larger control group when assessing the quality of homeowners’ self-assessed rental value.
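The sample restriction just described can be sketched as a simple filter. This is an illustrative sketch only: the record layout, the field names `region` and `tenure`, and the tenure labels are hypothetical stand-ins for the six ownership categories, not actual ENAHO variable names or codes.

```python
# Hypothetical tenure labels for the categories retained in the analysis:
# market tenants, outright owners, and owners still paying a mortgage.
KEPT_TENURE = {"tenant", "outright_owner", "owner_paying_mortgage"}

def select_sample(households):
    """Keep Metropolitan Lima households that are tenants or paying/paid owners."""
    return [
        h for h in households
        if h["region"] == "Lima Metropolitana" and h["tenure"] in KEPT_TENURE
    ]

def tenant_share(households):
    """Share of market tenants among the given households."""
    return sum(h["tenure"] == "tenant" for h in households) / len(households)

# Invented toy records
households = [
    {"region": "Lima Metropolitana", "tenure": "tenant"},
    {"region": "Lima Metropolitana", "tenure": "outright_owner"},
    {"region": "Lima Metropolitana", "tenure": "owner_by_invasion"},
    {"region": "Lima Metropolitana", "tenure": "free_from_employer"},
    {"region": "Sierra", "tenure": "tenant"},
    {"region": "Lima Metropolitana", "tenure": "owner_paying_mortgage"},
]

sample = select_sample(households)
share = tenant_share(sample)
```

In the toy data, owners by invasion, households housed for free, and households outside Metropolitan Lima are dropped, leaving one tenant and two owners.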
The share of tenants in Metropolitan Lima increased from an average of 11 percent in 2004 to an average of 15 percent in 2016 for the reasons described in section 2 (Table 1). This is a much larger share compared with the rest of the country, where tenants are about 7 percent of total population throughout the whole period.9

The hedonic theory of consumption (Lancaster 1966; Freeman 2003) shows how housing can be thought of as a composite commodity made up of different characteristics. These include the location of the dwelling; structural attributes of the dwelling, such as whether it is a detached home or an apartment, type of construction, age of the dwelling, dimensions and number of rooms, etc.; and neighborhood characteristics, such as quality of schools, accessibility of public transport, proximity of streets, crime rate, poverty rate, traffic congestion, etc. Controlling for dwelling characteristics helps account for the fact that the rental market price is different for dwellings with different attributes, and it also allows for the fact that owners and tenants might self-select into different types of dwellings.

However, even when controlling for all the observable characteristics of a dwelling, two individuals with different socio-economic characteristics, say different educational levels, might report different self-assessed rental values for an observably similar house. For example, an individual with a university degree might have access to more and better information and be able to come up with a more educated estimate of the market rental price of her dwelling relative to an individual with primary or no education at all. In other words, misreporting might be correlated with individuals' socio-economic characteristics. Our set of observables includes the following characteristics of the household head: gender, a third-degree polynomial in age, a set of dummies measuring the educational level (i.e. incomplete primary, complete primary, incomplete secondary, complete secondary, incomplete tertiary and complete tertiary education), and a set of dummies for the occupational category if employed or the labor market status if unemployed or inactive. In addition, we control for the type of household, namely whether it is a single household, a couple with or without children, a single parent with children, a family with extended members or a composite household. The data set is rich in terms of characteristics of the dwellings: number of rooms, material of walls and floors, a dummy for whether the dwelling has access to water and to sanitation facilities, and a set of categorical variables for the type of dwelling that include detached house, apartment in building, apartment in detached house with independent water access and drainage, apartment in detached house with shared water services, hut or other type of dwelling.

9 Further, we exclude all owners for whom the self-assessed rent is missing, accounting for a total of 1,544 observations over 13 years, or about 5 percent of the total sample. Another 158 observations (less than 1 percent of the total sample) are dropped because information on dwelling characteristics was missing. Finally, we restrict the sample to household heads aged between 25 and 64 (inclusive). This is because fewer than 1 percent of owners are aged 24 or younger, and just 10 percent of tenants are aged 65 or above.

Table 2 highlights the differences between tenants and owners in terms of socio-economic characteristics and in terms of characteristics of their dwellings. Tenants tend to be younger (on average, the share of tenants in the 18-30 age cohort is 10 percentage points higher than that of owners), more educated (the share with a tertiary degree is 5 percentage points higher), and more likely to live alone (on average, 10 percent of tenants are single, as opposed to 5 percent of owners).
Although the rate of employment is similar for tenants and owners (about 82 percent), tenants are less likely to be inactive. With respect to the type of dwellings, tenants are more likely to live in apartments than in detached houses, and their dwellings are generally of better quality in terms of floors, walls, and connection to electricity, water and sewerage.

While the socio-economic characteristics of tenants and owners did not change appreciably between 2004 and 2016, there has been a general improvement in the quality of the dwellings, particularly for homeowners. For example, the share of owner-occupied dwellings connected to water increased from 83 percent in 2004 to 94 percent in 2016. The share of tenant-occupied dwellings connected to electricity was already above 90 percent at the beginning of the period and it grew to 98 percent in 2016. As a consequence, the differences in the quality of dwellings by type of tenancy have become smaller over time.

Table 2. Differences in socio-economic and dwelling characteristics between owners and tenants, by year
Columns within each year: Owners, Tenants-Owners, p-value.

Year | 2004 | 2005 | 2006 | 2007 | 2008 | 2009
Household Head Age
18-30 years | 4.4% 0.118 0.00 | 4.4% 0.094 0.00 | 4.1% 0.087 0.00 | 4.2% 0.101 0.00 | 3.4% 0.164 0.00 | 3.9% 0.108 0.00
31-40 years | 20.6% 0.013 0.71 | 19.4% 0.073 0.04 | 18.0% 0.137 0.00 | 20.5% 0.172 0.00 | 18.0% 0.168 0.00 | 18.0% 0.201 0.00
41-50 years | 33.4% -0.024 0.59 | 32.7% -0.007 0.86 | 32.3% -0.035 0.37 | 31.0% -0.050 0.10 | 32.5% -0.044 0.16 | 31.1% -0.076 0.01
51-60 years | 31.0% -0.033 0.48 | 31.8% -0.093 0.01 | 33.7% -0.126 0.00 | 32.2% -0.129 0.00 | 34.4% -0.202 0.00 | 35.5% -0.169 0.00
+60 years | 10.5% -0.074 0.00 | 11.8% -0.067 0.00 | 11.9% -0.063 0.00 | 12.1% -0.094 0.00 | 11.6% -0.086 0.00 | 11.5% -0.064 0.00
Gender
Male | 77.8% -0.053 0.20 | 74.9% -0.065 0.10 | 76.1% -0.015 0.66 | 77.7% -0.030 0.30 | 73.9% -0.011 0.72 | 75.4% -0.009 0.76
Education Level
No education | 1.4% -0.014 0.00 | 1.8% -0.005 0.54 | 1.1% -0.007 0.10 | 0.8% -0.008 0.00 | 1.0% -0.007 0.07 | 1.3% -0.007 0.25
Primary incomplete | 9.3% -0.055 0.00 | 7.7% -0.031 0.06 | 7.7% -0.034 0.03 | 7.4% -0.047 0.00 | 6.6% -0.026 0.06 | 7.4% -0.052 0.00
Primary complete | 10.9% -0.036 0.31 | 10.5% -0.026 0.29 | 10.6% -0.038 0.04 | 8.2% -0.028 0.08 | 9.1% -0.021 0.23 | 7.4% -0.035 0.01
Secondary incomplete | 15.2% -0.070 0.00 | 14.6% -0.043 0.09 | 12.7% -0.037 0.09 | 14.1% -0.029 0.18 | 14.6% -0.035 0.10 | 15.3% -0.050 0.02
Secondary complete | 29.4% 0.024 0.59 | 30.8% 0.010 0.79 | 30.8% -0.001 0.99 | 27.9% 0.007 0.82 | 28.7% 0.033 0.30 | 28.9% 0.045 0.15
Tertiary incomplete | 7.5% 0.000 0.99 | 8.2% 0.032 0.24 | 7.9% 0.082 0.02 | 10.9% 0.051 0.04 | 10.3% 0.066 0.01 | 8.6% 0.082 0.00
Tertiary complete | 26.4% 0.152 0.00 | 25.9% 0.068 0.09 | 29.2% 0.034 0.41 | 30.7% 0.053 0.10 | 29.7% -0.009 0.76 | 31.1% 0.018 0.59
Labor Status
Legislators, sr officials, managers | 2.1% -0.007 0.57 | 1.9% -0.003 0.76 | 1.6% 0.003 0.77 | 1.8% 0.010 0.37 | 3.1% 0.002 0.85 | 2.1% 0.006 0.63
Professionals | 8.3% 0.087 0.05 | 9.1% -0.001 0.97 | 9.4% 0.005 0.86 | 11.8% 0.016 0.50 | 10.1% 0.012 0.60 | 13.1% -0.037 0.08
Technicians, assoc. professionals | 10.6% -0.012 0.65 | 7.3% 0.051 0.08 | 10.2% 0.024 0.39 | 13.4% 0.053 0.04 | 9.7% 0.074 0.00 | 9.5% 0.076 0.00
Clerks | 3.9% 0.041 0.07 | 4.5% 0.022 0.25 | 4.2% 0.022 0.23 | 5.2% 0.025 0.15 | 5.9% 0.013 0.43 | 6.6% 0.008 0.67
Service/sales worker | 11.0% 0.001 0.96 | 8.7% 0.031 0.25 | 10.6% 0.000 1.00 | 8.7% -0.011 0.53 | 12.1% -0.007 0.74 | 12.0% 0.017 0.45
Skilled agri/fishery workers | 0.6% -0.006 0.03 | 0.8% -0.008 0.01 | 0.7% -0.003 0.49 | 0.7% 0.004 0.52 | 0.5% -0.001 0.75 | 0.5% -0.001 0.77
Craft/related trade workers | 10.7% -0.001 0.98 | 9.0% 0.027 0.31 | 10.4% -0.002 0.95 | 13.9% -0.002 0.94 | 11.6% -0.026 0.18 | 11.5% -0.019 0.34
Plant/machine operators/assemblers | 13.6% -0.020 0.45 | 13.5% -0.063 0.01 | 14.0% -0.002 0.94 | 14.2% -0.049 0.02 | 14.0% 0.011 0.64 | 14.9% -0.038 0.07
Elementary Occupation | 22.5% -0.086 0.00 | 20.6% 0.045 0.22 | 20.5% 0.016 0.68 | 17.6% -0.004 0.87 | 19.2% -0.030 0.22 | 16.3% 0.021 0.41
Unemployed | 3.1% 0.011 0.55 | 3.4% -0.007 0.68 | 2.5% 0.001 0.92 | 2.5% -0.009 0.29 | 1.5% 0.006 0.53 | 2.2% -0.001 0.94
Inactive | 12.2% -0.005 0.86 | 16.2% -0.052 0.06 | 15.1% -0.056 0.02 | 9.9% -0.033 0.07 | 12.0% -0.050 0.01 | 11.5% -0.039 0.04
Household Characteristics
Singles | 3.9% 0.058 0.08 | 3.2% 0.027 0.16 | 4.2% 0.022 0.32 | 3.4% 0.066 0.00 | 4.9% 0.045 0.02 | 5.2% 0.065 0.00
Couple | 2.2% -0.012 0.19 | 2.4% -0.008 0.47 | 1.9% 0.016 0.24 | 2.8% 0.010 0.43 | 2.6% 0.019 0.16 | 3.7% 0.023 0.17
Couple with children | 59.7% -0.067 0.17 | 61.0% -0.105 0.01 | 57.1% -0.066 0.12 | 67.3% -0.067 0.04 | 61.5% -0.069 0.04 | 60.1% -0.046 0.18
Single with children | 17.5% 0.004 0.89 | 17.0% 0.094 0.01 | 16.8% 0.048 0.15 | 20.6% -0.031 0.24 | 21.4% -0.008 0.77 | 18.7% -0.009 0.71
Extended members | 13.3% 0.018 0.64 | 12.4% -0.019 0.47 | 15.2% -0.009 0.79 | 4.2% 0.015 0.32 | 7.9% 0.007 0.71 | 9.5% -0.017 0.36
Composite | 3.5% -0.001 0.93 | 4.0% 0.010 0.57 | 4.8% -0.010 0.56 | 1.7% 0.006 0.53 | 1.7% 0.007 0.52 | 2.7% -0.015 0.07
Characteristics of the Dwellings
Number of rooms | 4.12 -1.059 0.00 | 4.09 -1.121 0.00 | 4.23 -1.266 0.00 | 4.24 -1.492 0.00 | 4.15 -1.569 0.00 | 4.30 -1.466 0.00
Quality of walls
Stone and concrete | 89.8% -0.075 0.03 | 90.4% -0.061 0.04 | 89.3% -0.056 0.06 | 87.1% 0.021 0.31 | 86.0% 0.018 0.40 | 85.7% 0.011 0.60
Mud and Straw | 4.5% 0.132 0.00 | 5.0% 0.100 0.00 | 4.4% 0.111 0.00 | 4.3% 0.040 0.02 | 4.2% 0.045 0.01 | 4.1% 0.045 0.01
Wood | 1.7% -0.017 0.00 | 0.6% -0.004 0.30 | 1.4% -0.014 0.00 | 3.9% -0.030 0.00 | 5.6% -0.044 0.00 | 5.4% -0.018 0.13
Other material | 4.1% -0.041 0.00 | 3.9% -0.035 0.00 | 4.9% -0.041 0.00 | 4.7% -0.031 0.00 | 4.2% -0.019 0.06 | 4.8% -0.038 0.00
Quality of roof
Reinforced concrete | 66.7% 0.039 0.35 | 67.6% -0.037 0.36 | 67.4% 0.007 0.85 | 71.5% 0.027 0.37 | 70.8% 0.030 0.31 | 73.2% 0.040 0.14
Wood | 3.2% 0.120 0.00 | 4.4% 0.123 0.00 | 4.2% 0.113 0.00 | 3.0% 0.059 0.00 | 3.1% 0.060 0.00 | 2.0% 0.077 0.00
Tiles/sheets of calamine, cement fiber | 20.2% -0.111 0.00 | 20.0% -0.093 0.00 | 19.0% -0.078 0.00 | 19.2% -0.085 0.00 | 20.7% -0.082 0.00 | 20.1% -0.103 0.00
Cane or mat with mud/straw/palm leaves | 9.9% -0.048 0.01 | 8.1% 0.007 0.74 | 9.4% -0.042 0.02 | 6.3% -0.001 0.97 | 5.4% -0.008 0.56 | 4.7% -0.014 0.21
Quality of surface
Parquet or polished wood | 15.6% 0.010 0.76 | 13.5% 0.009 0.79 | 14.3% 0.070 0.08 | 15.4% 0.012 0.64 | 15.8% 0.021 0.44 | 13.9% 0.029 0.28
Asphaltic films, vinyl films or similar | 2.9% 0.007 0.64 | 5.1% 0.024 0.31 | 4.9% 0.025 0.26 | 5.3% 0.009 0.59 | 5.5% -0.009 0.56 | 5.5% 0.024 0.19
Tiles | 12.3% 0.136 0.01 | 14.0% 0.076 0.03 | 13.9% 0.025 0.42 | 15.2% 0.086 0.00 | 19.3% 0.010 0.72 | 21.4% -0.066 0.01
Wood boards | 0.9% 0.059 0.01 | 1.2% 0.032 0.07 | 1.1% 0.033 0.07 | 0.9% -0.005 0.33 | 0.5% 0.008 0.25 | 0.7% 0.011 0.14
Cement | 55.1% -0.089 0.06 | 55.6% -0.064 0.14 | 54.0% -0.071 0.09 | 52.3% -0.012 0.72 | 49.4% 0.024 0.49 | 49.9% 0.053 0.12
Other material | 13.2% -0.124 0.00 | 10.6% -0.077 0.00 | 11.9% -0.082 0.00 | 10.8% -0.090 0.00 | 9.4% -0.054 0.00 | 8.7% -0.051 0.00
Services
Connected to Water | 82.8% 0.155 0.00 | 84.7% 0.074 0.00 | 83.1% 0.118 0.00 | 86.7% 0.079 0.00 | 86.7% 0.107 0.00 | 87.3% 0.097 0.00
Connected to Electricity | 98.7% 0.005 0.45 | 99.1% -0.012 0.23 | 99.1% 0.009 0.00 | 99.2% 0.007 0.01 | 99.3% 0.007 0.00 | 99.6% 0.001 0.73
Connected to Sewer | 80.8% 0.185 0.00 | 82.2% 0.124 0.00 | 83.3% 0.124 0.00 | 85.1% 0.096 0.00 | 86.2% 0.111 0.00 | 88.5% 0.097 0.00
Type of dwelling
Detached House | 83.9% -0.409 0.00 | 83.2% -0.292 0.00 | 82.9% -0.324 0.00 | 84.9% -0.421 0.00 | 85.8% -0.357 0.00 | 84.7% -0.400 0.00
Apt in Building | 7.2% 0.279 0.00 | 9.7% 0.173 0.00 | 9.7% 0.182 0.00 | 11.1% 0.331 0.00 | 11.4% 0.244 0.00 | 12.4% 0.291 0.00
Apt in Detached House (independent) | 3.1% 0.108 0.00 | 3.5% 0.069 0.00 | 1.9% 0.142 0.00 | 2.7% 0.032 0.02 | 2.1% 0.041 0.01 | 1.7% 0.045 0.00
Apt in Detached House (in common) | 0.6% 0.073 0.00 | 0.5% 0.074 0.00 | 0.4% 0.048 0.00 | 1.1% 0.061 0.00 | 0.5% 0.070 0.00 | 0.8% 0.068 0.00
Other type | 5.1% -0.051 0.00 | 3.1% -0.024 0.00 | 5.0% -0.048 0.00 | 0.2% -0.002 0.08 | 0.2% 0.002 0.69 | 0.3% -0.003 0.09
Source: Based on data from ENAHO, INEI.

Table 2 (cont.)
Columns within each year: Owners, Tenants-Owners, p-value.

Year | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016
Household Head Age
18-30 years | 4.5% 0.107 0.00 | 2.2% 0.138 0.00 | 3.8% 0.067 0.00 | 2.8% 0.141 0.00 | 2.9% 0.092 0.00 | 2.9% 0.117 0.00 | 4.0% 0.123 0.00
31-40 years | 16.7% 0.199 0.00 | 16.6% 0.163 0.00 | 16.5% 0.137 0.00 | 15.3% 0.095 0.00 | 16.2% 0.130 0.00 | 17.9% 0.156 0.00 | 16.3% 0.157 0.00
41-50 years | 31.0% -0.045 0.15 | 30.7% -0.062 0.04 | 28.6% 0.021 0.52 | 30.2% -0.007 0.80 | 30.7% -0.001 0.96 | 27.7% 0.007 0.77 | 29.7% -0.019 0.45
51-60 years | 36.5% -0.191 0.00 | 34.7% -0.117 0.00 | 38.6% -0.169 0.00 | 36.5% -0.138 0.00 | 36.4% -0.143 0.00 | 38.3% -0.220 0.00 | 36.8% -0.176 0.00
+60 years | 11.3% -0.070 0.00 | 15.8% -0.122 0.00 | 12.6% -0.055 0.00 | 15.1% -0.092 0.00 | 13.7% -0.077 0.00 | 13.1% -0.061 0.00 | 13.1% -0.085 0.00
Gender
Male | 73.7% -0.003 0.93 | 74.2% -0.044 0.16 | 74.3% -0.009 0.78 | 70.5% -0.012 0.65 | 69.6% 0.023 0.38 | 69.5% -0.018 0.49 | 70.2% -0.018 0.48
Education Level
No education | 1.4% -0.013 0.00 | 0.8% -0.004 0.29 | 1.6% -0.016 0.00 | 0.8% -0.003 0.57 | 0.8% -0.005 0.20 | 0.9% -0.009 0.00 | 1.1% -0.011 0.00
Primary incomplete | 6.6% -0.033 0.02 | 6.6% -0.049 0.00 | 5.1% -0.018 0.15 | 6.9% -0.022 0.10 | 6.0% -0.042 0.00 | 5.3% -0.026 0.01 | 5.4% -0.033 0.00
Primary complete | 8.3% -0.057 0.00 | 8.2% -0.024 0.13 | 9.2% -0.071 0.00 | 8.5% -0.045 0.00 | 8.2% -0.029 0.05 | 8.1% -0.039 0.00 | 7.1% -0.027 0.04
Secondary incomplete | 11.5% 0.001 0.98 | 14.7% -0.058 0.00 | 11.7% -0.043 0.02 | 14.5% -0.048 0.01 | 12.6% -0.015 0.41 | 13.3% -0.023 0.22 | 12.1% -0.016 0.38
Secondary complete | 32.8% 0.005 0.88 | 30.8% 0.009 0.77 | 31.7% 0.033 0.32 | 32.3% 0.014 0.62 | 34.9% 0.030 0.28 | 33.7% 0.103 0.00 | 36.6% 0.025 0.36
Tertiary incomplete | 10.4% 0.057 0.02 | 10.6% 0.020 0.38 | 9.8% 0.054 0.02 | 8.8% 0.017 0.34 | 11.7% -0.006 0.76 | 8.5% 0.019 0.25 | 8.2% 0.061 0.00
Tertiary complete | 29.0% 0.041 0.21 | 28.2% 0.107 0.00 | 31.0% 0.061 0.08 | 28.1% 0.089 0.00 | 25.7% 0.067 0.01 | 30.2% -0.025 0.32 | 29.5% 0.001 0.97
Labor Status
Legislators, sr officials, managers | 1.7% 0.028 0.05 | 1.9% 0.025 0.07 | 2.4% 0.029 0.08 | 1.4% 0.024 0.02 | 1.0% 0.022 0.01 | 2.0% -0.005 0.44 | 1.4% -0.001 0.91
Professionals | 12.4% -0.044 0.03 | 11.4% 0.015 0.54 | 12.9% -0.011 0.63 | 11.3% -0.002 0.90 | 11.1% -0.019 0.24 | 11.9% -0.044 0.00 | 11.5% -0.015 0.36
Technicians, assoc. professionals | 10.1% 0.079 0.00 | 10.7% 0.010 0.67 | 10.4% 0.053 0.04 | 9.8% 0.030 0.11 | 8.9% 0.061 0.00 | 9.4% 0.030 0.10 | 9.8% 0.036 0.04
Clerks | 5.6% 0.023 0.21 | 5.5% 0.026 0.15 | 6.6% 0.021 0.27 | 5.8% 0.009 0.48 | 7.2% 0.008 0.59 | 6.1% 0.028 0.06 | 6.8% 0.024 0.12
Service/sales worker | 12.2% 0.018 0.45 | 11.7% 0.012 0.58 | 10.5% 0.015 0.51 | 12.9% 0.012 0.54 | 13.5% 0.001 0.97 | 10.8% 0.037 0.06 | 12.9% 0.029 0.14
Skilled agri/fishery workers | 0.2% 0.004 0.31 | 0.3% -0.003 0.07 | 0.5% -0.005 0.03 | 0.5% -0.002 0.44 | 0.6% -0.005 0.01 | 0.4% -0.004 0.03 | 0.2% 0.004 0.34
Craft/related trade workers | 11.2% -0.021 0.31 | 11.4% 0.013 0.56 | 10.4% 0.008 0.72 | 11.1% 0.026 0.20 | 11.9% -0.027 0.12 | 11.2% -0.024 0.15 | 10.7% -0.008 0.63
Plant/machine operators/assemblers | 14.8% -0.045 0.04 | 16.9% -0.056 0.01 | 15.5% -0.064 0.00 | 15.3% -0.046 0.01 | 13.9% 0.009 0.67 | 14.6% 0.004 0.84 | 15.6% -0.050 0.01
Elementary Occupation | 19.0% -0.007 0.78 | 17.3% -0.015 0.53 | 20.0% -0.042 0.10 | 18.4% -0.004 0.85 | 18.5% 0.013 0.58 | 20.0% 0.026 0.28 | 19.0% -0.008 0.71
Unemployed | 2.7% -0.007 0.54 | 1.8% -0.003 0.73 | 1.6% 0.011 0.30 | 1.9% -0.006 0.40 | 1.1% 0.000 0.99 | 2.2% -0.006 0.41 | 2.2% 0.000 0.97
Inactive | 9.9% -0.033 0.08 | 10.6% -0.023 0.25 | 9.0% -0.018 0.31 | 11.4% -0.040 0.01 | 12.2% -0.062 0.00 | 11.3% -0.041 0.01 | 9.7% -0.011 0.50
Household Characteristics
Singles | 5.7% 0.061 0.00 | 5.0% 0.086 0.00 | 6.3% 0.048 0.02 | 6.1% 0.049 0.00 | 6.3% 0.052 0.00 | 5.4% 0.050 0.00 | 5.8% 0.033 0.03
Couple | 3.2% 0.005 0.70 | 3.0% 0.015 0.29 | 4.2% 0.013 0.42 | 2.9% 0.010 0.34 | 4.3% 0.009 0.48 | 3.9% 0.023 0.08 | 3.9% 0.036 0.01
Couple with children | 59.1% 0.000 1.00 | 61.8% -0.096 0.00 | 60.0% -0.037 0.28 | 57.8% -0.019 0.51 | 56.3% 0.006 0.83 | 58.5% -0.050 0.07 | 59.0% -0.046 0.10
Single with children | 20.0% -0.042 0.11 | 19.8% -0.004 0.87 | 19.5% -0.003 0.91 | 21.1% 0.008 0.75 | 21.3% -0.045 0.04 | 20.4% 0.009 0.71 | 18.1% 0.025 0.26
Extended members | 10.5% -0.022 0.28 | 8.7% -0.001 0.95 | 8.0% -0.008 0.66 | 10.6% -0.053 0.00 | 9.7% -0.014 0.37 | 10.8% -0.033 0.04 | 11.3% -0.053 0.00
Composite | 1.6% -0.002 0.77 | 1.8% 0.001 0.89 | 2.1% -0.013 0.09 | 1.4% 0.005 0.52 | 2.1% -0.007 0.31 | 1.0% 0.001 0.91 | 2.0% 0.005 0.56
Characteristics of the Dwellings
Number of rooms | 4.23 -1.284 0.00 | 4.36 -1.312 0.00 | 4.26 -1.336 0.00 | 4.13 -1.266 0.00 | 4.10 -1.240 0.00 | 3.94 -1.293 0.00 | 3.94 -1.185 0.00
Quality of walls
Stone and concrete | 86.1% 0.034 0.11 | 86.3% 0.034 0.08 | 86.4% 0.041 0.04 | 86.4% 0.014 0.46 | 84.9% 0.042 0.03 | 83.9% 0.048 0.01 | 84.3% 0.051 0.01
Mud and Straw | 4.9% 0.037 0.04 | 3.9% 0.023 0.11 | 4.2% 0.028 0.08 | 4.3% 0.047 0.00 | 5.2% 0.031 0.05 | 3.9% 0.040 0.01 | 4.1% 0.033 0.02
Wood | 3.5% -0.020 0.02 | 4.8% -0.023 0.02 | 6.5% -0.048 0.00 | 7.4% -0.045 0.00 | 8.1% -0.063 0.00 | 10.7% -0.077 0.00 | 10.5% -0.073 0.00
Other material | 5.4% -0.051 0.00 | 5.0% -0.035 0.00 | 2.9% -0.022 0.00 | 2.0% -0.017 0.00 | 1.8% -0.010 0.08 | 1.4% -0.011 0.03 | 1.1% -0.011 0.00
Quality of roof
Reinforced concrete | 71.3% 0.087 0.00 | 73.0% 0.055 0.05 | 71.5% 0.107 0.00 | 70.9% 0.057 0.03 | 70.6% 0.076 0.00 | 69.8% 0.046 0.08 | 70.9% 0.076 0.00
Wood | 3.2% 0.041 0.01 | 2.3% 0.032 0.01 | 2.7% 0.044 0.01 | 3.3% 0.054 0.00 | 4.2% 0.050 0.00 | 3.8% 0.074 0.00 | 3.4% 0.036 0.01
Tiles/sheets of calamine, cement fiber | 21.2% -0.129 0.00 | 21.3% -0.074 0.00 | 23.2% -0.141 0.00 | 23.7% -0.113 0.00 | 23.0% -0.128 0.00 | 24.2% -0.117 0.00 | 23.8% -0.122 0.00
Cane or mat with mud/straw/palm leaves | 4.3% 0.000 0.99 | 3.4% -0.012 0.24 | 2.7% -0.010 0.29 | 2.1% 0.003 0.78 | 2.3% 0.002 0.84 | 2.1% -0.002 0.77 | 1.9% 0.009 0.32
Quality of surface
Parquet or polished wood | 12.3% 0.054 0.05 | 11.8% 0.051 0.06 | 11.0% 0.062 0.02 | 11.8% 0.082 0.00 | 9.8% 0.093 0.00 | 11.6% 0.025 0.17 | 10.0% 0.042 0.01
Asphaltic films, vinyl films or similar | 7.4% -0.017 0.31 | 7.2% 0.015 0.46 | 6.8% 0.027 0.20 | 3.9% 0.018 0.14 | 5.8% -0.006 0.63 | 4.9% 0.028 0.04 | 7.4% 0.035 0.03
Tiles | 18.2% 0.067 0.03 | 18.2% 0.031 0.26 | 20.8% 0.055 0.07 | 23.3% -0.009 0.70 | 23.7% 0.019 0.45 | 23.7% 0.021 0.38 | 24.2% 0.024 0.32
Wood boards | 0.5% 0.022 0.02 | 0.6% 0.021 0.02 | 0.4% 0.019 0.03 | 0.9% 0.021 0.02 | 1.0% 0.015 0.07 | 0.6% 0.006 0.38 | 1.3% 0.018 0.06
Cement | 54.0% -0.073 0.04 | 56.1% -0.073 0.03 | 55.5% -0.127 0.00 | 53.2% -0.057 0.05 | 52.8% -0.083 0.00 | 52.2% -0.022 0.44 | 51.5% -0.081 0.00
Other material | 7.6% -0.053 0.00 | 6.2% -0.045 0.00 | 5.6% -0.037 0.00 | 6.9% -0.056 0.00 | 6.8% -0.038 0.00 | 7.1% -0.058 0.00 | 5.5% -0.038 0.00
Services
Connected to Water | 88.9% 0.096 0.00 | 89.7% 0.085 0.00 | 89.8% 0.078 0.00 | 89.6% 0.064 0.00 | 89.9% 0.081 0.00 | 88.8% 0.081 0.00 | 93.8% 0.039 0.00
Connected to Electricity | 99.7% 0.003 0.08 | 99.8% 0.002 0.20 | 99.9% 0.001 0.32 | 99.5% 0.005 0.02 | 99.6% 0.001 0.70 | 99.7% -0.002 0.64 | 99.8% 0.002 0.10
Connected to Sewer | 88.2% 0.104 0.00 | 89.7% 0.087 0.00 | 90.8% 0.070 0.00 | 90.8% 0.069 0.00 | 90.0% 0.087 0.00 | 88.3% 0.089 0.00 | 91.5% 0.061 0.00
Type of dwelling
Detached House | 82.0% -0.433 0.00 | 81.7% -0.392 0.00 | 84.2% -0.436 0.00 | 83.8% -0.422 0.00 | 82.3% -0.386 0.00 | 79.8% -0.404 0.00 | 79.6% -0.355 0.00
Apt in Building | 13.5% 0.335 0.00 | 14.7% 0.305 0.00 | 12.4% 0.328 0.00 | 12.7% 0.309 0.00 | 14.2% 0.309 0.00 | 16.8% 0.284 0.00 | 16.8% 0.296 0.00
Apt in Detached House (independent) | 3.7% 0.033 0.04 | 2.5% 0.047 0.00 | 2.1% 0.063 0.00 | 2.4% 0.059 0.00 | 2.9% 0.032 0.02 | 2.6% 0.064 0.00 | 3.0% 0.024 0.05
Apt in Detached House (in common) | 0.8% 0.065 0.00 | 1.0% 0.039 0.00 | 1.3% 0.045 0.00 | 1.1% 0.054 0.00 | 0.5% 0.045 0.00 | 0.4% 0.059 0.00 | 0.5% 0.037 0.00
Other type | 0.1% 0.000 0.91 | 0.0% 0.000 0.00 | 0.0% 0.000 0.00 | 0.0% 0.000 0.00 | 0.0% 0.000 0.00 | 0.3% -0.003 0.08 | 0.1% -0.001 0.21
Source: Based on data from ENAHO, INEI.

Both self-assessed rent and market rent paid by tenants decreased in the years of the global financial crisis, 2007-2008. Excluding 2007-2008, both self-assessed rent and reported market rent increased by around 3 percent per year. The unconditional difference (that is, not controlling for household characteristics) between average self-assessed and market monthly rental value varies from year to year and ranges between -PEN13 in 2013 and PEN1,795 in 2007 (2009 prices), as Table 3 summarizes.

Table 3. Average Market and Self-Assessed Monthly Rent and their unconditional difference, years 2004-2016, PEN, 2009 prices.
Year | Market Rent (1) | Self-Assessed Rent (2) | Delta (2)-(1)
2004 | 5,430.01 | 5,595.44 | 165.43
2005 | 4,105.20 | 5,855.68 | 1,750.48
2006 | 4,611.96 | 5,960.06 | 1,348.10
2007 | 4,199.83 | 5,995.60 | 1,795.77
2008 | 3,866.37 | 5,080.27 | 1,213.89
2009 | 3,943.10 | 5,277.21 | 1,334.11
2010 | 4,572.61 | 5,101.53 | 528.92
2011 | 4,571.52 | 5,506.92 | 935.39
2012 | 4,702.89 | 5,646.18 | 943.29
2013 | 5,521.34 | 5,507.97 | -13.37
2014 | 5,649.41 | 5,899.54 | 250.13
2015 | 5,532.50 | 6,258.27 | 725.77
2016 | 6,268.59 | 6,416.29 | 147.70
Source: Based on data from ENAHO, INEI.

5. Results

Table 4 reports OLS estimates separately for each year between 2004 and 2016. The coefficient attached to the dummy for being an owner measures the average difference between the logarithm of the self-assessed rental value reported by homeowners and the logarithm of the market rent paid by tenants, ceteris paribus. In other words, it captures the average difference for dwellings with similar observable characteristics and for homeowners and tenants with similar individual characteristics and household composition.

The difference appears to be statistically significant and sizeable in the period 2004-07, suggesting that homeowners provided evaluations of their dwellings’ rental value that differ from the market price. On average, homeowners overestimate the market rent they would have to pay if they rented the dwelling they live in by between 12 and 16 percent. Conditional on observable characteristics, the average difference between homeowners’ self-assessment and the market price reported by tenants is between PEN380 and PEN515 (measured in 2009 prices). Our hypothesis is that the observed difference is ascribable to a lack of information about the rental market in a period when housing markets were not developed in Peru. The Peruvian housing market expanded considerably starting in 2008, thanks to public housing policies combined with sustained economic growth and an expanding urban population.
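Because the dependent variable is in logarithms, a coefficient b on the homeowner dummy corresponds to a gap of exp(b) - 1 in levels. The following is a minimal sketch of that conversion, applied to the 2004-2007 homeowner coefficients reported in Table 4; the function name is hypothetical and chosen here for illustration only, not taken from the authors' code.

```python
from math import exp


def log_gap_to_percent(beta):
    """Convert a coefficient on a dummy in a log-linear model
    into the implied percentage difference in levels."""
    return 100 * (exp(beta) - 1)


# Homeowner coefficients for 2004-2007 as reported in Table 4.
table4_homeowner = {2004: 0.115, 2005: 0.124, 2006: 0.149, 2007: 0.117}

for year, beta in table4_homeowner.items():
    # Print the implied overestimation in percent, rounded to one decimal.
    print(year, round(log_gap_to_percent(beta), 1))
```

For small coefficients the result is close to 100 times the coefficient itself; the exact conversion matters more as the estimated gap grows.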
Between 2008 and 2016, self-assessed rents track market prices more closely: the estimated difference is smaller in magnitude and generally not statistically significant. Exceptions are 2012 and 2015, where homeowners overestimate the market rental value of their dwelling by 13.4 and 8.6 percent, respectively. However, more research is needed to test whether access to more information actually improves the ability of homeowners to estimate the rental value of the dwelling they live in, or whether other factors such as attachment to the property might still affect their self-assessment.

Next, we turn to estimates obtained via nearest neighbor (NN) and kernel matching estimators. First, we implement the common support by trimming 10 percent of the treated observations at which the propensity score density of the untreated observations is the lowest and by imposing a caliper. Second, as we do not condition on all covariates but on the propensity score, we check whether the matching algorithm is able to balance the distribution of the relevant variables in the treated and untreated groups (Table A1). The standardized bias (SB) suggested by Rosenbaum and Rubin (1985) is an indicator commonly used to assess the distance in marginal distributions of observable characteristics. For each covariate, the SB is defined as the difference of sample means in the treated and matched untreated subsamples as a percentage of the square root of the average of sample variances in both groups. Our estimates indicate that, particularly with kernel matching and nearest neighbor matching using 4 neighbors, the median SB falls below 5 percent after matching, a threshold that most empirical studies regard as sufficient (Caliendo and Kopeinig 2008). In addition, as proposed by Sianesi (2004), we re-estimate the propensity score only on treated and matched untreated observations, and compare the pseudo-R2s before and after matching.
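The standardized bias defined above can be sketched in a few lines. This is an illustrative sketch that follows the verbal definition in the text (difference in means over the square root of the average of the two sample variances, times 100); the function names and the toy data are assumptions made here and do not come from the paper.

```python
from math import sqrt
from statistics import mean, median, variance


def standardized_bias(treated, controls):
    """Standardized bias (in percent) for a single covariate."""
    diff = mean(treated) - mean(controls)
    # Square root of the average of the two sample variances.
    scale = sqrt((variance(treated) + variance(controls)) / 2)
    return 100 * diff / scale


def median_abs_bias(covariates):
    """Median absolute standardized bias across covariates.

    `covariates` maps a covariate name to a (treated, controls) pair.
    """
    return median(abs(standardized_bias(t, c)) for t, c in covariates.values())


# Toy data for illustration only (hypothetical values).
example = {
    "rooms": ([1, 2, 3, 4, 5], [2, 3, 4, 5, 6]),
    "age": ([1, 2, 3], [1, 2, 3]),
}
print(median_abs_bias(example))
```

In a balance check of this kind the statistic would be computed once on the full sample and once on the matched sample, and the two medians compared against the 5 percent rule of thumb.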
After matching, there are no systematic differences in the distribution of covariates between the two groups and, as a result, the pseudo-R2 is very low (Table A1).

Our estimates obtained with matching estimators are displayed in Table 5. The inaccuracy of homeowners’ self-assessment vanishes throughout the period of analysis. One exception is 2010, where we find a statistically significant difference with algorithms that make use of a larger number of comparable tenants. Interestingly, in that year homeowners seem to underestimate the price they would pay to rent their dwelling. The fact that the statistical significance fades away in the case of matching estimators is likely ascribable to the ability of the matching process to reduce (or eliminate) differences in the distribution of observable characteristics over the common support by reweighting untreated observations to mimic the distribution of observables within the set of treated observations.

Finally, we re-run OLS regressions for each year restricting the sample to the region of common support constructed for the matching estimators. Our estimates, shown in Table 6, are not qualitatively different from the estimates obtained with the full sample. On average, homeowners overestimate the market price by 12 percent in all four years predating 2008, but only in two of the nine years from 2008 onward. Although the matching estimators allow for a more accurate comparison between similar households and dwellings, thus giving a more precise estimate of the accuracy of homeowners’ self-assessment, they suffer from an important drawback: nothing can be stated about the precision of self-reported rental values outside the common support. The same drawback applies to the OLS estimate on the common support. In the case of Metropolitan Lima, all clues seem to point in the same direction: over the period 2008-2016 homeowners on average provide a precise estimate of rental market prices.

Table 4. Average accuracy of homeowners’ self-assessment: OLS estimates by year, 2004-16.

| (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) | (13)
| 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016
Homeowner | 0.115* | 0.124** | 0.149** | 0.117*** | 0.0117 | 0.0633 | 0.0491 | 0.0684 | 0.126** | 0.0532 | 0.0363 | 0.0823** | 0.0102
| (0.0620) | (0.0567) | (0.0596) | (0.0434) | (0.0460) | (0.0421) | (0.0488) | (0.0466) | (0.0526) | (0.0417) | (0.0360) | (0.0393) | (0.0368)
Constant | 10.25*** | 7.327*** | 8.287*** | 7.958*** | 9.391*** | 7.278*** | 6.626*** | 4.295*** | 8.112*** | 9.727*** | 6.581*** | 6.129*** | 6.692***
| (1.544) | (1.522) | (1.555) | (1.276) | (1.373) | (1.234) | (1.422) | (1.495) | (1.402) | (1.251) | (1.332) | (1.299) | (1.117)
Observations | 1,349 | 1,302 | 1,359 | 1,424 | 1,450 | 1,288 | 1,188 | 1,322 | 1,329 | 1,894 | 1,975 | 1,971 | 2,032
Adjusted R-squared | 0.661 | 0.635 | 0.649 | 0.648 | 0.591 | 0.621 | 0.568 | 0.517 | 0.510 | 0.519 | 0.537 | 0.565 | 0.532
Source: Based on data from ENAHO, INEI.
Note: Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1

Table 5. Average accuracy of homeowners’ self-assessment: matching estimates by year, 2004-16.

Algorithm | Statistic | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) | (13)
| | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016
NN(1) | difference | 680.5 | 525.9 | 651.6 | -374.4 | 60.3 | 947.5 | -1330.6 | -777.0 | -734.4 | -710.4 | -207.0 | -168.2 | -515.4
NN(1) | s.e. | 1366.9 | 512.8 | 605.6 | 665.8 | 521.3 | 506.5 | 698.0 | 542.0 | 634.0 | 657.7 | 595.0 | 808.7 | 650.5
NN(1) | t-statistics | 0.5 | 1.0 | 1.1 | -0.6 | 0.1 | 1.9 | -1.9 | -1.4 | -1.2 | -1.1 | -0.3 | -0.2 | -0.8
NN(2) | difference | 711.5 | 415.7 | 735.8 | -430.4 | 155.4 | 370.9 | -1391.0 | -388.2 | -702.2 | -708.2 | -37.3 | -423.5 | -238.2
NN(2) | s.e. | 1008.4 | 491.1 | 561.5 | 633.5 | 476.8 | 483.0 | 609.6 | 462.4 | 575.9 | 587.6 | 501.2 | 733.3 | 584.2
NN(2) | t-statistics | 0.7 | 0.8 | 1.3 | -0.7 | 0.3 | 0.8 | -2.3 | -0.8 | -1.2 | -1.2 | -0.1 | -0.6 | -0.4
NN(3) | difference | 321.7 | 305.6 | 844.2 | -540.2 | 51.4 | 541.6 | -1187.2 | -256.8 | -734.8 | -530.1 | -86.2 | -222.8 | -115.9
NN(3) | s.e. | 912.0 | 477.4 | 539.0 | 612.3 | 460.7 | 465.0 | 565.9 | 442.7 | 540.2 | 563.5 | 480.2 | 669.5 | 551.2
NN(3) | t-statistics | 0.4 | 0.6 | 1.6 | -0.9 | 0.1 | 1.2 | -2.1 | -0.6 | -1.4 | -0.9 | -0.2 | -0.3 | -0.2
NN(4) | difference | -1271.6 | 402.2 | 830.0 | -648.5 | 116.3 | 477.1 | -1510.2 | -331.6 | -759.2 | -322.3 | -236.6 | -67.6 | -11.2
NN(4) | s.e. | 846.6 | 473.6 | 548.6 | 600.8 | 450.6 | 444.5 | 546.2 | 427.4 | 530.4 | 574.5 | 470.4 | 626.9 | 535.0
NN(4) | t-statistics | -1.5 | 0.8 | 1.5 | -1.1 | 0.3 | 1.1 | -2.8 | -0.8 | -1.4 | -0.6 | -0.5 | -0.1 | 0.0
Kernel | difference | -1567.6 | 334.7 | 743.4 | -683.9 | -21.1 | 444.6 | -1235.7 | 19.2 | -767.3 | -181.7 | -283.8 | -7.9 | -754.2
Kernel | s.e. | 647.3 | 451.4 | 476.6 | 544.0 | 424.5 | 400.9 | 508.7 | 391.5 | 468.5 | 500.7 | 436.4 | 494.6 | 501.3
Kernel | t-statistics | -2.4 | 0.7 | 1.6 | -1.3 | 0.0 | 1.1 | -2.4 | 0.0 | -1.6 | -0.4 | -0.7 | 0.0 | -1.5
Source: Based on data from ENAHO, INEI.

Table 6. Average accuracy of homeowners’ self-assessment: OLS estimates over the region of common support by year, 2004-16.

| (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) | (13)
| 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016
Homeowner | 0.102* | 0.115* | 0.155*** | 0.112** | 0.00445 | 0.0569 | 0.0485 | 0.0631 | 0.114** | 0.0373 | 0.0150 | 0.0736* | 0.00220
| (0.0617) | (0.0589) | (0.0596) | (0.0441) | (0.0467) | (0.0424) | (0.0496) | (0.0468) | (0.0540) | (0.0428) | (0.0359) | (0.0398) | (0.0371)
Constant | 10.78*** | 7.142*** | 8.261*** | 7.577*** | 9.590*** | 7.413*** | 6.603*** | 5.981*** | 7.315*** | 9.738*** | 6.101*** | 6.530*** | 6.660***
| (1.764) | (1.583) | (1.593) | (1.311) | (1.411) | (1.337) | (1.458) | (1.502) | (1.425) | (1.271) | (1.403) | (1.339) | (1.151)
Observations | 1,152 | 1,179 | 1,213 | 1,307 | 1,324 | 1,185 | 1,091 | 1,217 | 1,203 | 1,736 | 1,814 | 1,808 | 1,855
Adjusted R-squared | 0.609 | 0.629 | 0.626 | 0.626 | 0.581 | 0.599 | 0.532 | 0.500 | 0.499 | 0.506 | 0.523 | 0.535 | 0.527
Source: Based on data from ENAHO, INEI.
Note: Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1

6. Conclusions

The use of homeowners’ self-assessments from household budget surveys to estimate the rental value of the dwellings they occupy is a pervasive practice.
However, researchers and practitioners often neglect to carry out preliminary analyses to test the accuracy of homeowners’ assessments. This paper proposes two methods to test the accuracy of such assessments making use only of information from a typical household budget survey. Lack of supplemental sources, such as accessible and up-to-date administrative data from national cadasters, is a very common issue faced by researchers and practitioners working in developing countries.

The two methods we propose to evaluate the average size of the inaccuracy of self-assessed rental values reported by homeowners are (i) OLS (overall and restricted to the region of common support) and (ii) matching estimators. Using data covering the 2004-2016 period, our findings indicate that homeowners in Metropolitan Lima provide largely unbiased estimates of the service value of the dwellings they live in. More precisely, OLS estimates (both for the full sample and for the sample restricted to the region of common support) indicate that homeowners overestimate the rental market value of their dwellings by 12 percent on average during the period 2004 to 2007. Yet, such bias vanishes for the period 2008-2016. This result seems to corroborate the hypothesis that as the rental market develops (in the case of Peru, thanks to public policies, demographic changes, and urbanization), homeowners might have access to more and higher quality information that allows them to estimate rental market values more accurately. Estimates based on matching algorithms, which compare similar households and dwellings, point instead to no difference, even for the period 2004 to 2007. The difference between OLS and matching estimates is likely attributable to the fact that matching allows for a more accurate comparison of rental prices between dwellings and households that share very similar characteristics.

However, in addition to being rather computationally and data demanding, matching algorithms implemented over the common support deliver estimates that are typically restricted to a subset of observations in the sample. Such a restriction is not desirable in cases where the ultimate purpose is the construction of full distributions of rental values, as in the case of welfare aggregates.

This exercise offers a cautionary tale about the suitability of self-assessed rental values as a good approximation of market rents: homeowners might report biased estimates for different reasons, e.g. because the question asked to collect their estimate is misleading, because of sentimental attachment to the property, or because of lack of information about the rental market. Researchers and practitioners should therefore test the data to check whether such a difference exists on average. We propose the following roadmap, which is summarized in Figure 5.

Step 1. Perform an OLS estimate of the difference between market rents reported by tenants and self-assessed rents reported by owners, controlling for all the variables typical of hedonic models for rental values (for details, see Balcazar et al., 2017). Use self-assessed values if no significant difference is detected.

Step 2. Re-estimate the OLS model restricting the sample to the region of common support between owners and tenants. This would rule out the possibility that the estimated difference is due to comparisons between dwellings that are not similar enough. If no difference is detected at this step, self-assessment can be safely used for households that lie within the common support. Outside the common support, instead, it is still possible that self-assessed rents do not correctly approximate market rents.
We believe that self-assessment can be safely 19 used also outside the common support in cases where the rental market is well developed: if there is a thick flow of information about housing prices, it is very unlikely that such information is only limited to homeowners on the common support. The assessment of the level of rental market development could be based for example on the share of market tenants by geographical location or on information external to the survey. On the other hand, if researchers and practitioners believe the opposite to be the case, for example, information about the value of housing is restricted only to urban areas, then self-assessment cannot be safely used to approximate housing services outside the common support. Step 3. Use a matching estimator to refine the comparisons between alike dwellings and households. If no difference is found, follow the same approach discussed at Step 2. If a difference is still observed, it may be due to omitted variables bias (i.e. some important variables about households or dwelling characteristics are missing) or mis-reporting (i.e. homeowners do not report the rental market value for their dwellings). In both cases, we advise against using self-assessment. Survey data should be improved to be an appropriate source for rental values, by including more information, or by reformulating some questions. For example, as discussed in Ceriani et al. (2019), rephrasing the question asked to homeowners as “How much would you receive as a rent if you were to lend your apartment?” as opposed “How much would you pay if you were to rent the dwelling you are currently living in?” could be one way to mitigate self-assessment bias due to mis-reporting. In fact, the theoretical and experimental literature on the difference between willingness to accept (WTA) and willingness to pay WTP for the same good concludes that measures of WTA are usually superior to measures of WTP (among others, Hanemann 1991; Fehr et al. 
2015; Tunçel and Hammit 2014). Self-assessment rental values are often underlying the weighting system for important macroeconomic indicators, such as consumer price indices (used as measures of inflation) and national accounts, as well as welfare aggregates for the analysis of households’ well-being (for example, poverty). Given the importance of housing as a share of total household consumption, its estimate should undergo a fine scrutiny. Further research should concentrate on how household budget surveys, which are the typical source of information for these matters, might be improved to better capture rental market values. Possible ways forward could include over- sampling of unusual housing types (including over-sampling of tenants), expert assessment of each sampled household’s dwelling, or linking the sampled population with administrative data on housing prices. 20 References Altonji, J. G., Elder, T. E., and C. R., Taber. 2005. Selection on Observed and Unobserved Variables: Assessing the Effectiveness of Catholic Schools, Journal of Political Economy, 113(1): 151–184. Asian Development Bank. 2014. Purchasing power parities and real expenditures, Mandaluyong City, Philippines: Asian Development Bank. African Development Bank, Statistical Capacity Building Division. 2013. The Reliability of Economic Statistics in Africa, Focusing on GDP Measurement, African Statistical Journal, Vol. 17. Asociación Peruana de Empresas de Investigación de Mercados. 2007. Niveles Socio Economicos 2007-2008, APEIM, Lima. Balcazar, C.F., L., Ceriani, S., Olivieri, and M., Ranzani. 2017. Rent-Imputation for Welfare Measurement: A Review of Methodologies and Empirical Findings, Review of Income and Wealth, 63(4):881-898. Blundell, R. and M., Costa Dias. 2009. Alternative Approaches to Evaluation in Empirical Microeconomics, Journal of Human Resources, 44(3):565-640. Caliendo, M. and S., Kopeinig. 2008. 
Some Practical Guidance for the Implementation of Propensity Score Matching, Journal of Economic Surveys, 22(1):31–72.
Ceriani, L., S. Olivieri, and M. Ranzani. 2019. Housing, Imputed Rent, and Households’ Welfare, mimeo.
Deaton, A. and A. Heston. 2010. Understanding PPPs and PPP-Based National Accounts, American Economic Journal: Macroeconomics, 2(4):1–35.
Fernandez-Maldonado, A. M. 2010. Recent housing policies in Lima and their effects on sustainability, Paper presented at the 46th ISoCaRP Congress.
Fehr, D., R. Hakimov, and D. Kübler. 2015. The willingness to pay–willingness to accept gap: A failed replication of Plott and Zeiler, European Economic Review, 78(C):120–128.
Freeman, A. M. 1983. The Measurement of Environmental and Resource Values: Theory and Methods, Resources for the Future, Washington, D.C.
Frick, J. R., M. Grabka, T. M. Smeeding, and P. Tsakloglou. 2010. Distributional effects of imputed rents in five European countries, Journal of Housing Economics, 19(3):167–179.
Garner, T. and K. S. Short. 2001. Owner-occupied shelter in experimental poverty measures, Paper prepared for the Annual Meeting of the Southern Economic Association, Tampa, Florida.
Garner, T. I., K. S. Short, and U. Kogan. 2006. What do we know about the value of owner-occupied housing services? Rental equivalence and other approaches, Southern Economic Association Annual Meeting, Charleston, South Carolina.
Garner, T. I. and U. Kogan. 2007. Comparing approaches to value owner-occupied housing using U.S. consumer expenditure survey data, Paper prepared for the Annual Meetings of the Allied Social Sciences Associations, Society of Government Economists, Chicago, Illinois.
Goodman, J. L. and J. B. Ittner. 1992. The accuracy of home owners’ estimates of house value, Journal of Housing Economics, 2(4):339–357.
Gwinne, W. B. 2007. Housing. In Giugale, M. M., Fretes-Cibils, V. and J.
L. Newman, editors, An Opportunity for a Different Peru: Prosperous, Equitable, and Governable, The World Bank, Washington, D.C.
Hanemann, W. M. 1991. Willingness to pay and willingness to accept: how much can they differ?, American Economic Review, 81(3):635–647.
Heston, A. 1994. A brief review of some problems in using national accounts data in level of output comparisons and growth studies, Journal of Development Economics, 44(1):29–52.
Heston, A. W. and A. O. Nakamura. 2009. Questions about the equivalence of market rents and user costs for owner occupied housing, Journal of Housing Economics, 18(3):273–279.
Lancaster, K. J. 1966. A new approach to Consumer Theory, Journal of Political Economy, 74(2):132–157.
Lebow, D. E. and J. B. Rudd. 2003. Measurement error in the consumer price index: Where do we stand?, Journal of Economic Literature, 41(1):159–201.
Lechner, M. 2008. A note on endogenous control variables in causal studies, Statistics and Probability Letters, 78(2):190–195.
Rosenbaum, P. R. and D. B. Rubin. 1983. The Central Role of the Propensity Score in Observational Studies for Causal Effects, Biometrika, 70(1):41–55.
Rosenbaum, P. and D. Rubin. 1985. Constructing a control group using multivariate matched sampling methods that incorporate the propensity score, The American Statistician, 39(1):33–38.
Sianesi, B. 2004. An evaluation of the Swedish system of active labour market programmes in the 1990s, Review of Economics and Statistics, 86(1):133–155.
Tunçel, T. and J. Hammitt. 2014. A new meta-analysis on the WTP/WTA disparity, Journal of Environmental Economics and Management, 68(1):175–187.
Table A 1. Standardized median bias and Pseudo-R2 before and after matching by year, 2004–16.
                                     (1)   (2)   (3)   (4)   (5)   (6)   (7)   (8)   (9)  (10)  (11)  (12)  (13)
Estimator     Statistic    Sample   2004  2005  2006  2007  2008  2009  2010  2011  2012  2013  2014  2015  2016
NN(1) pscore  Median bias  Before   18.9  15.9  13.6  11.1  10.2  11.4  11.0  11.0  12.5  10.0   7.4   8.3   8.6
NN(1) pscore  Median bias  After    11.1   5.9   7.3   8.1   7.9   7.1   5.5   6.3   6.0   5.9   5.2   6.9   4.2
NN(1) pscore  Pseudo R2    Before    0.4   0.3   0.3   0.3   0.4   0.4   0.4   0.3   0.3   0.3   0.3   0.3   0.3
NN(1) pscore  Pseudo R2    After     0.3   0.1   0.1   0.1   0.2   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1
NN(2) pscore  Median bias  Before   18.9  15.9  13.6  11.1  10.2  11.4  11.0  11.0  12.5  10.0   7.4   8.3   8.6
NN(2) pscore  Median bias  After    12.4   5.1   7.7   8.5   8.8   5.2   8.2   7.0   7.0   5.7   5.2   4.7   3.5
NN(2) pscore  Pseudo R2    Before    0.4   0.3   0.3   0.3   0.4   0.4   0.4   0.3   0.3   0.3   0.3   0.3   0.3
NN(2) pscore  Pseudo R2    After     0.2   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.0
NN(3) pscore  Median bias  Before   18.9  15.9  13.6  11.1  10.2  11.4  11.0  11.0  12.5  10.0   7.4   8.3   8.6
NN(3) pscore  Median bias  After    10.7   5.9   6.7   9.3   7.1   4.9   7.3   8.2   6.0   4.9   4.0   4.4   3.2
NN(3) pscore  Pseudo R2    Before    0.4   0.3   0.3   0.3   0.4   0.4   0.4   0.3   0.3   0.3   0.3   0.3   0.3
NN(3) pscore  Pseudo R2    After     0.2   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.0   0.1   0.0
NN(4) pscore  Median bias  Before   18.9  15.9  13.6  11.1  10.2  11.4  11.0  11.0  12.5  10.0   7.4   8.3   8.6
NN(4) pscore  Median bias  After    12.1   5.9   6.5   8.6   7.5   4.7   7.4   6.9   5.6   4.8   4.8   2.5   3.1
NN(4) pscore  Pseudo R2    Before    0.4   0.3   0.3   0.3   0.4   0.4   0.4   0.3   0.3   0.3   0.3   0.3   0.3
NN(4) pscore  Pseudo R2    After     0.2   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.0   0.0   0.0
Kernel        Median bias  Before   18.9  15.9  13.6  11.1  10.2  11.4  11.0  11.0  12.5  10.0   7.4   8.3   8.6
Kernel        Median bias  After     6.6   5.9   4.3   9.0   5.9   6.4   6.4   4.9   4.4   4.0   4.4   3.0   2.9
Kernel        Pseudo R2    Before    0.4   0.3   0.3   0.3   0.4   0.4   0.4   0.3   0.3   0.3   0.3   0.3   0.3
Kernel        Pseudo R2    After     0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.0   0.0   0.0   0.0
Source: Based on data from ENAHO, INEI.
Figure 5. A roadmap to assess the suitability of self-assessed rents as an accurate estimate of the market rental value of housing among homeowners
Source: Authors’ original elaboration.
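As an illustration of the balancing statistic reported in Table A 1, the following minimal sketch computes a standardized percentage bias for each covariate and then takes the median of its absolute value across covariates. It assumes the usual definition attributed to Rosenbaum and Rubin (1985), in which the difference in group means is divided by the square root of the average of the two sample variances. It also assumes that the median bias in the table is the median of the absolute biases. The function names, the covariates, and the toy data are hypothetical and are not taken from ENAHO, and matching weights are ignored for simplicity.

```python
from math import sqrt
from statistics import mean, median, variance


def standardized_bias(owners, tenants):
    """Standardized percentage bias of one covariate between two groups."""
    # Scale the mean difference by the square root of the average of the
    # two sample variances, then express it as a percentage.
    pooled_sd = sqrt((variance(owners) + variance(tenants)) / 2)
    return 100 * (mean(owners) - mean(tenants)) / pooled_sd


def median_abs_bias(owners, tenants):
    """Median absolute standardized bias across covariates.

    owners, tenants: dicts mapping a covariate name to a list of values.
    """
    return median(abs(standardized_bias(owners[k], tenants[k])) for k in owners)


# Hypothetical toy data (assumption for illustration only).
owners = {"rooms": [1, 2, 3, 4, 5], "area_m2": [60, 80, 100, 120, 140]}
tenants = {"rooms": [2, 3, 4, 5, 6], "area_m2": [60, 80, 100, 120, 140]}

print(round(standardized_bias(owners["rooms"], tenants["rooms"]), 1))  # -63.2
print(round(median_abs_bias(owners, tenants), 1))  # 31.6
```

Applying the same calculation to the unmatched and to the matched samples would give the "Before" and "After" rows. A lower value after matching indicates that homeowners and their matched tenants are more alike in observed characteristics.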