THE WORLD BANK ECONOMIC REVIEW
Volume 26 • 2012 • Number 2
ISSN 0258-6770 (PRINT)
ISSN 1564-698X (ONLINE)
www.wber.oxfordjournals.org
Pages 165–349

Conditional Cash Transfers and HIV/AIDS Prevention: Unconditionally Promising?
Hans-Peter Kohler and Rebecca L. Thornton

Just Rewards? Local Politics and Public Resource Allocation in South India
Timothy Besley, Rohini Pande, and Vijayendra Rao

An Axiomatic Approach to the Measurement of Corruption: Theory and Applications
James E. Foster, Andrew W. Horowitz, and Fabio Méndez

How Much of Observed Economic Mobility is Measurement Error? IV Methods to Reduce Measurement Error Bias, with an Application to Vietnam
Paul Glewwe

Inequality of Opportunity in Egypt
Nadia Belhaj Hassine

Can Global De-Carbonization Inhibit Developing Country Industrialization?
Aaditya Mattoo, Arvind Subramanian, Dominique van der Mensbrugghe, and Jianwu He

Trade Liberalization and Investment: Firm-level Evidence from Mexico
Ivan T. Kandilov and Aslı Leblebicioğlu

Editors: Alain de Janvry and Elisabeth Sadoulet, University of California at Berkeley

Assistant to the editor: Marja Kuiper

Editorial board:
Harold H. Alderman, World Bank (retired)
Chong-En Bai, Tsinghua University, China
Pranab K. Bardhan, University of California, Berkeley
Thorsten Beck, Tilburg University, Netherlands
Johannes van Biesebroeck, K.U. Leuven, Belgium
Maureen Cropper, University of Maryland, USA
Asli Demirgüç-Kunt, World Bank
Jean-Jacques Dethier, World Bank
Quy-Toan Do, World Bank
Frédéric Docquier, Catholic University of Louvain, Belgium
Eliana La Ferrara, Università Bocconi, Italy
Francisco H. G. Ferreira, World Bank
Augustin Kwasi Fosu, United Nations University, WIDER, Finland
Caroline Freund, World Bank
Paul Glewwe, University of Minnesota, USA
Philip E. Keefer, World Bank
Justin Yifu Lin, World Bank
Norman V. Loayza, World Bank
William F. Maloney, World Bank
David J. McKenzie, World Bank
Jaime de Melo, University of Geneva
Ugo Panizza, UNCTAD
Nina Pavcnik, Dartmouth College, USA
Vijayendra Rao, World Bank
Martin Ravallion, World Bank
Jaime Saavedra-Chanduvi, World Bank
Claudia Paz Sepúlveda, World Bank
Jonathan Temple, University of Bristol, UK
Dominique Van De Walle, World Bank
Christopher M. Woodruff, University of California, San Diego

The World Bank Economic Review is a professional journal used for the dissemination of research in development economics broadly relevant to the development profession and to the World Bank in pursuing its development mandate. It is directed to an international readership among economists and social scientists in government, business, international agencies, universities, and development research institutions. The Review seeks to provide the most current and best research in the field of quantitative development policy analysis, emphasizing policy relevance and operational aspects of economics, rather than primarily theoretical and methodological issues. Consistency with World Bank policy plays no role in the selection of articles.

The Review is managed by one or two independent editors selected for their academic excellence in the field of development economics and policy. The editors are assisted by an editorial board composed in equal parts of scholars internal and external to the World Bank. World Bank staff and outside researchers are equally invited to submit their research papers to the Review.
For more information, please visit the Web sites of the Economic Review at Oxford University Press at www.wber.oxfordjournals.org and at the World Bank at www.worldbank.org/research/journals. Instructions for authors wishing to submit articles are available online at www.wber.oxfordjournals.org. Please direct all editorial correspondence to the Editor at wber@worldbank.org.

The World Bank Economic Review (ISSN 0258-6770) is published three times a year, in February, June, and October, by Oxford University Press for the International Bank for Reconstruction and Development/THE WORLD BANK.
DISCLAIMER: Statements of fact and opinion in the articles in The World Bank Economic Review are those of the respective authors and contributors and not of the International Bank for Reconstruction and Development/THE WORLD BANK or Oxford University Press. Neither Oxford University Press nor the International Bank for Reconstruction and Development/THE WORLD BANK make any representation, express or implied, in respect of the accuracy of the material in this journal and cannot accept any legal responsibility or liability for any errors or omissions that may be made. The reader should make her or his own evaluation as to the appropriateness or otherwise of any experimental technique described.

PAPER USED: The World Bank Economic Review is printed on acid-free paper that meets the minimum requirements of ANSI Standard Z39.48-1984 (Permanence of Paper).

INDEXING AND ABSTRACTING: The World Bank Economic Review is indexed and/or abstracted by CAB Abstracts, Current Contents/Social and Behavioral Sciences, Journal of Economic Literature/EconLit, PAIS International, RePEc (Research Papers in Economics), and Social Sciences Citation Index.

COPYRIGHT © 2012 The International Bank for Reconstruction and Development/THE WORLD BANK. All rights reserved; no part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise without prior written permission of the publisher or a license permitting restricted copying issued in the UK by the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London W1P 9HE, or in the USA by the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.

Typeset by Techset Composition Limited, Chennai, India; Printed by Edwards Brothers Incorporated, USA.

Conditional Cash Transfers and HIV/AIDS Prevention: Unconditionally Promising?

Hans-Peter Kohler and Rebecca L.
Thornton

Conditional cash transfers (CCTs) have recently received considerable attention as a potentially innovative and effective approach to the prevention of HIV/AIDS. We evaluate a conditional cash transfer program in rural Malawi which offered financial incentives to men and women to maintain their HIV status for approximately one year. The amounts of the reward ranged from zero to approximately 3–4 months' wages. We find no effect of the offered incentives on HIV status or on reported sexual behavior. However, shortly after receiving the reward, men who received the cash transfer were 9 percentage points more likely and women were 6.7 percentage points less likely to engage in risky sex. Our analyses therefore question the “unconditional effectiveness” of CCT programs for HIV prevention: CCT programs that aim to motivate safe sexual behavior in Africa should take into account that money given in the present may have much stronger effects than rewards offered in the future, and any effect of these programs may be fairly sensitive to the specific design of the program, the local and/or cultural context, and the degree of agency an individual has with respect to sexual behaviors. JEL codes: I12, C93, O12

Hans-Peter Kohler (corresponding author) is Frederick J. Warren Professor of Demography, Department of Sociology and Population Studies Center, 3718 Locust Walk, University of Pennsylvania, Philadelphia, PA 19104-6299, USA; Email: hpkohler@pop.upenn.edu. Rebecca L. Thornton is Assistant Professor, Department of Economics, University of Michigan, 213 Lorch Hall, Ann Arbor, MI 48109, USA; Email: rebeccal@umich.edu. We gratefully acknowledge the support for this research through the National Institute of Child Health and Human Development (NICHD grant numbers R21 HD050653, R01 HD044228 and R01 HD053781) and the University of Pennsylvania University Research Foundation. We thank the MDICP team for assistance with data collection. We also thank You-Tyng Luo, Sayeh Nikpay, Giordano Palloni, and Nick Snavely for excellent research assistance. A supplemental appendix to this article is available at http://wber.oxfordjournals.org.

THE WORLD BANK ECONOMIC REVIEW, VOL. 26, NO. 2, pp. 165–190. doi:10.1093/wber/lhr041. Advance Access Publication November 2, 2011. © The Author 2011. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

Since the beginning of the HIV/AIDS epidemic, various strategies have been put in place to curb the spread of the disease and prevent further infections. There is ongoing research focusing on ways to reduce the HIV transmission rate, such as the treatment of other sexually transmitted diseases (STDs), vaccines and microbicides, and male circumcision. The majority of HIV prevention strategies have targeted behavior change, encouraging individuals to shift from risky to less risky sex. These strategies thus promote programs such as education about the disease and how to protect oneself, HIV testing to know one’s own or one’s partner’s status, condom promotion, community, peer, and faith-based group advocacy, HIV destigmatization campaigns, better negotiation of risk such as through condom use or partner selection, and the promotion of abstinence programs (for an in-depth review, see Bertozzi and others 2006). However, despite these prevention efforts, evidence of dramatic behavior changes as a response to these programs in Africa is controversial, and no single intervention has emerged as an established approach (McCoy and others 2010).1

This paper evaluates a new HIV prevention strategy: offering financial incentives for individuals to maintain their HIV status.
Conditional cash transfers (CCTs) have been found to be effective in a variety of settings (Fiszbein and Schady 2009). In the developing world, some of the most well known CCTs have involved incentives for households, parents, or children to engage in healthy behavior or to increase schooling attainment/performance. Important examples include Oportunidades (Progresa) in Mexico (Levy 2006; Lindert and others 2006), the Bolsa Escola Program in Brazil (de Janvry and others 2005; World Bank 2001), the Red de Protección Social program in Nicaragua (Maluccio and Flores 2005), as well as smaller programs in other developing countries (de Janvry and Sadoulet 2006; Lagarde and others 2007). In developed countries, CCTs have also focused on specific health behavior such as stopping smoking (Giné and others 2009; Volpp and others 2009), losing weight (Charness and Gneezy 2009; Volpp and others 2008a), or taking medicine (Volpp and others 2008b). Until recently, there have been no programs that directly incentivized individuals to stay free of sexually transmitted diseases, although several such programs are currently underway, including a program that gave financial rewards for testing negative for non-HIV sexually transmitted diseases every few months in Tanzania (RESPECT) (de Walque and others 2011; World Bank 2010b),2 and a program for adolescents in Mexico (Galarraga and Gertler 2010). Another program in Malawi found that conditional and unconditional cash transfers for adolescent girls were associated with lower rates of marriage (Baird and others 2010) and HIV (World Bank 2010a). Recent press releases have heralded these conditional cash incentive programs as potentially promising and innovative approaches to HIV/AIDS prevention.
The UC Berkeley news release about the RESPECT program, for example, begins, “Giving out cash can be an effective tool in combating sexually transmitted infections in rural Africa” (Yang 2010), and this promise of CCT programs for preventing HIV/AIDS infection has been widely reported in the media (Dugger 2010; Jack 2010; Over 2010; World Bank 2010a).

1. One exception to this may be male circumcision; however, roll-out of these services has been slow.
2. Because the complete manuscript of the de Walque and others (2011) study is under embargo, it cannot be cited and/or discussed in detail at this time; we therefore limit our discussion of this study to information that is currently publicly available.

Our analyses question the “unconditional effectiveness” of such CCT programs for HIV prevention. In particular, CCT programs that aim to motivate safe sexual behavior in Africa need to take into account that money given in the present may have much stronger effects than rewards in the future, and any effect of these programs may be fairly sensitive to the specific design of the program, the local and/or cultural context, and the degree of agency individuals have with respect to sexual behaviors. We derive this conclusion from an evaluation of a conditional cash transfer program that was implemented in 2006 in rural Malawi. In 2006, approximately 1,300 men and women were tested for HIV. They were then offered financial incentives of random amounts ranging from zero to values worth approximately four months’ wages if they maintained their HIV status for approximately one year. Throughout the year, respondents were asked about their sexual behavior three times, through interviewer-administered sexual diaries. Respondents were then tested for HIV, and financial incentives were awarded based on whether they had maintained their HIV status. After the second round of testing, the incentives program stopped.
Using the randomized design, we evaluate the effects of being offered an incentive on reported sexual activity and condom use before the second round of HIV testing. We find no statistical difference in reported behavior between those offered incentives and those who were not over three rounds of data.3 In addition, there were no differential effects by time of the survey, gender, education, expectations, or measures of female empowerment of our respondents.

One important aspect to consider in interpreting our results is whether we should have expected financial rewards to affect changes in sexual behavior at all. Outside of an incentives program, if individuals rationally maximize their lifetime utility, they should optimally choose how much risky or safe sex to engage in. Individuals facing higher risks of infection should adjust their behavior to substitute towards safer sex (Oster 2007; Philipson and Posner 1995).4 Given that there is no cure for HIV, the cost of infection is high, and would be arguably much higher than the four months’ wages offered through the incentives program. On the other hand, there is a growing body of both theoretical and empirical literature in which individuals discount hyperbolically, have difficulty with commitment, have addictive behaviors, substantially underestimate their survival probabilities and overestimate their probabilities of being HIV-positive, and/or fail to adequately update subjective assessments of their HIV status in response to new information such as HIV test results.5 The insights from behavioral economics are important for the evaluation of CCT programs as individuals who want to abstain from having sex or want to use a condom trade off sexual pleasure in the present for future lifetime utility and possible rewards received through CCT programs. If individuals place a higher value on the present, then offering cash incentives could help increase the short-term benefits of engaging in safe sex in the present.

3. It is important to note that our evaluation measures self-reported sexual behavior in response to the incentive. If those who were offered incentives were more likely to overstate safe sexual behavior, our estimates would overstate the true program effects and would thus represent upper bounds. Since our study documents mostly the absence of any effects of the incentive program, our conclusions are conservative and not sensitive to the most likely form of misreporting in which those who were offered incentives were more likely to overstate safe sexual behavior. On the other hand, other research has documented that self-reported sexual behavior does strongly correlate with HIV status and that use of ACASI computer methods of interviewing may not have large effects on the results (Mensch and others 2008).
4. There are, however, a variety of potential non-behavioral reasons for the lack of behavioral change in response to the AIDS epidemic, such as lack of information (about how to prevent infection), poverty or high mortality rates from other diseases (i.e., lower life-time earnings), or lack of bargaining power (i.e., to suggest condom use or abstinence).

There are two other recent studies similar to ours. First, a similar program was implemented in Tanzania (RESPECT study) that randomly offered cash incentives to participants every four months (either 10 U.S. dollars or 20 U.S. dollars) for remaining free of a set of curable sexually transmitted infections, including chlamydia, gonorrhea and syphilis (de Walque and others 2011; World Bank 2010b). In that program, at the end of the trial period, 9 percent of participants eligible for the highest incentive amount tested positive for curable infections compared to 12 percent among the control group.
These results thus suggest that enrollees in this program who were offered a US$20 incentive experienced a 25 percent lower STI prevalence than the control group enrollees after one year (de Walque and others 2011; Yang 2010). A second study in Malawi randomly offered girls and their parents approximately 15 U.S. dollars each month if the girls attended school regularly, as well as additional payments for school fees (given either to the school or the girl herself) and compensation equivalent to the cost of school fees for some of the girls (Baird and others 2010). A year later, girls offered the incentives were 6 percentage points more likely to be in school, as well as less likely to be infected with HIV (1.2 percent versus the control group’s 3 percent) (World Bank 2010b).

In these cases there are some notable differences and similarities with the program we evaluate in this paper. A first difference is that the amounts of cash offered in both programs were substantially larger. In the Tanzania project, the amount of cash offered mattered for their results within the study: the group eligible for the lower incentives had the same infection rate as the control group that was offered no payments. A second difference is that in the Tanzania case, any participant who tested positive for an STI during the study received free medical treatment throughout the program. To the extent that the incentive was offered for treatable STIs, obtaining outside treatment could have biased results towards finding effects on STIs. In the Malawi case, participants either received money with no conditions or received money if they attended school a certain percentage of days; the amount of money that participants and their families received was substantially higher as well.

5. See for instance Anglewicz and Kohler (2009); Delavande and Kohler (2009, forthcoming); Fudenberg and Levine (2006); Gul and Pesendorfer (2001, 2004); Laibson (1997) and O’Donoghue and Rabin (1999, 2001). Research on addiction or self-control includes Bernheim and Rangel (2004); Gruber and Koszegi (2001) and Gul and Pesendorfer (2007).

In the case of the Malawi Incentive Program that is evaluated in the present paper, the failure of the monetary incentive to motivate behavior change may be due to a number of different factors that may be context or program specific. Rural men and women in Malawi may be less likely to respond to financial incentives than higher risk individuals such as urban men and women or individuals who are not in stable marital relationships. It may also be that the amount of money was too small to induce a change in behavior. Other possibilities are that the offer of the financial reward one year in the future was too far away from the present to overcome hyperbolic discounting, or that there were concerns by respondents about the credibility of receiving the incentive payment in about one year conditional on their HIV status.6 In the cases where these particular aspects are important for respondents, conditional incentive payments may not affect short term decisions to engage in safer sexual behavior. These issues are therefore important in thinking through the design of future programs.

Although the conditional offer of money had no impact on reported sexual behavior, we find large effects of receiving money approximately one week after the second (and final) round of HIV testing when the incentive program had ceased. Men who received the money were 12.3 percentage points more likely to have vaginal sex and had approximately 0.5 days more of sex. While they were 5 percentage points more likely to report using a condom, overall there was a 9 percentage point net increase in risky sex.
On the other hand, women were 6.7 percentage points less likely to have engaged in risky sex, a result that is driven by abstinence rather than increased condom use. The findings of the response to receiving the monetary transfer provide further evidence that money matters and can be protective for women. This finding also may have important implications for future CCTs offering financial incentives over time based on sexual behavior or STI status. In particular, the total effect of money may include two potentially asymmetric effects: the effect of the incentive offer and the direct effect of money itself. Importantly, the fact that we find no significant impact of the offered incentives cautions policymakers to take care in considering CCT programs as a panacea for the HIV epidemic.

This paper proceeds as follows: Section I describes the experimental design and the data, and Section II discusses our empirical estimation strategy. Section III presents the estimates of the offered cash incentives on sexual behavior. Section IV presents the effects of receiving the cash reward on sexual behavior. Section V concludes.

6. While we do not have direct evidence, we perceive that the credibility of the promise of a conditional cash transfer one year in the future was relatively high in our project for several reasons. First, the MDICP project has been interviewing respondents in the study villages since 1998, and in many cases, respondents of the Malawi Incentive Project were themselves or had relatives who had been interviewed by the project since 1998. Moreover, during the one year period relevant for the CCT payments, the respondents were visited three times as part of the collection of sexual diaries that are described in more detail below. These repeated visits reminded respondents of their participation in the Malawi Incentive Project and signaled to the respondents the ongoing commitment of the project to the conditional incentive payments at the end of the study period. This is also related to a paper that finds evidence in an experiment that a large proportion of Malawian farmers are time inconsistent (Giné and others 2010).

I. MALAWI INCENTIVES PROJECT

Sample and Survey Data

The Malawi Incentives Project builds upon the Malawi Diffusion and Ideational Change Project (MDICP), a longitudinal study of men and women in three districts of rural Malawi. The original respondents in the MDICP study were randomly selected from 125 villages in 1998 and included ever-married women and their husbands; these individuals were reinterviewed in 2001. In 2004, an additional sample of randomly selected adolescent men and women (ages 14–24) from the same villages was added to the original sample. Each respondent in the original MDICP sample or in the adolescent refresher sample was eligible to be reinterviewed in 2006. It is important to note that while the respondents were representative at the time of the original sampling, some respondents attrited at each subsequent survey wave.

During the surveys in 2004 and 2006, a separate team of nurses offered respondents free tests for HIV through either oral swabs (in 2004) or rapid tests (in 2006) (Anglewicz and others 2009; Obare and others 2009). We do not utilize the panel data from earlier waves of the study, but rather focus on a subsample of respondents who were interviewed and accepted an HIV test in 2006.

Appendix S.1, available in the online supplemental materials for this article at http://wber.oxfordjournals.org, presents a timeline of the incentives program. During the 2006 testing of the MDICP respondents, 92 percent of the respondents who were offered an HIV test accepted the test. Among these respondents, the HIV prevalence rate was 9.2 percent.
To enroll individuals into the Malawi Incentives Project, we randomly selected respondents from the 2006 MDICP survey, with a higher weight on HIV discordant couples (from their 2004 and 2006 HIV results). Of those who were tested for HIV in 2006, a total of 1,402 individuals were invited to participate in the incentives project. Those individuals were approached one to two months after the 2006 survey and HIV testing. A total of 1,307 (or 93 percent) were enrolled into the incentives program.

Table 1 presents summary statistics for the 1,307 individuals analyzed in this paper. Forty-five percent are male, and the average age is 36 years. The majority, 84 percent, were married. The sample is essentially rural and consists of individuals engaged in subsistence agriculture. Moreover, HIV for these individuals is a very salient disease. For example, respondents report knowing approximately eight friends who have died from AIDS, and while only 29 percent believe there is some likelihood of a current infection of HIV, 57 percent believe there is a future likelihood of becoming infected.

TABLE 1. Summary Statistics

                                            Mean     Standard Dev
                                            (1)      (2)
Male                                        0.450    0.498
Age                                         35.78    12.96
Married                                     0.838    0.369
Expenditures                                3130     5781
Subjective Health                           2.065    0.935
Number of lifetime sexual partners          3.108    3.780
Acceptable to use condom                    0.405    0.491
Ever used condom with current partner       0.263    0.440
Fear about HIV                              1.593    0.752
Number friends died of HIV                  8.197    8.045
Some likelihood of HIV infection (current)  0.287    0.453
Some likelihood of HIV infection (future)   0.566    0.496
HIV positive at baseline                    0.087    0.282
Enrolled as a “couple”                      0.238    0.426

Notes: This table presents baseline summary statistics among 1,307 respondents who participated in the incentives program. Expenditures are measured as household expenditures in the past 3 months (on clothes, schooling, medical expenses, fertilizer, agricultural inputs, and funerals).
Subjective health represents self-reported health and was phrased: “In general, would you say your health is: Excellent (1), Very Good (2), Good (3), Fair (4), Poor (5).” Number of lifetime sexual partners includes any partner (long-term or short-term) that the respondent had sex with. Fear about HIV was phrased as: “How worried are you that you might catch HIV/AIDS? Not worried at all (1), Worried a little (2), Worried a lot (3).” Some likelihood of infection was coded one if the respondent answered low, medium, high, or don’t know, and zero otherwise. Each variable was measured before incentives were offered.

Financial Incentives

At the time of the HIV test in 2006, individuals were randomly selected to be offered HIV counseling as either a couple or as an individual.7 The majority, 76 percent, of those involved in the incentives project were tested as an individual. One to two months later, each individual was visited to introduce them to the incentives program. Each individual or couple randomly drew a token out of a bag to determine their incentive amount. The incentive amounts included zero, 500 Kwacha (approximately 4 U.S. dollars), or 2,000 Kwacha (approximately 16 U.S. dollars) for an individual, or zero, 1,000 Kwacha, or 4,000 Kwacha (approximately 32 U.S. dollars) for a couple. Each individual was given a voucher for the monetary amount they randomly drew, and was told that they must maintain their HIV status in order to receive the money approximately one year later.8 Couples were told that both members of the couple must maintain their HIV status in order for the couple to receive the money.9 Couples who divorced, separated, or for whom one member was away would receive one half of the couple’s incentive after one year if the individual who was tested maintained his/her status. Because of the endogeneity of choice or ability to test as a couple or individual, this paper evaluates the effect of the program on individuals, rather than on the couple as a unit. Results are robust to controlling for the type of testing they received and point estimates change very little (results not shown).

7. Only married spouses who were both in the MDICP sample were given the chance to have the couples counseling. If both of the spouses agreed to the couple testing, they would both be tested and both learn the HIV results together. If one of the individuals did not consent, then both members of the couple would receive individual counseling, and only learn their own HIV results.

Respondents viewed the financial incentives as a significant amount. Most of the respondents are subsistence farmers, and based on Whiteside (1998), piecework daily rates (ganyu) for farm workers are approximately 20 Kwacha for men and 5–10 Kwacha for women. Several different experiments in Malawi have found large responses to very small incentive amounts. A program that offered individuals cash incentives to learn their HIV results after testing found that even just 10 Kwacha increased the likelihood of traveling for results by almost 20 percentage points (Thornton 2008). Another study that randomly offered 30 Kwacha to individuals for a day’s work found that 80 percent of individuals showed up for work (Goldberg 2010).

It is important to note that the financial incentive was not specifically tied to being HIV-negative at the second round of testing. In particular, the sample for the Malawi Incentives Project included HIV-negative persons and HIV-positive persons (including, but not exclusively, respondents who were part of a discordant couple) in order to avoid the possibility that an exclusion from enrollment in the study would signal to outsiders information about an MDICP respondent’s HIV status.
If an HIV-positive individual was enrolled as an individual (due to a spouse being away, or a spouse not giving consent to couple counseling), he or she would automatically receive the monetary amount at the end of the study (conditional on participating in the final HIV test and survey). In the analysis we only examine the effect of the incentive among those who were HIV-negative at the beginning of the study, although results are robust to including HIV-positives (results not shown).

8. Due to logistical issues, the second round of HIV testing was conducted several months after that, approximately 15 months after the first round of testing.

9. If the married couple had agreed to the couples’ HIV testing, they were offered to be enrolled into the couples’ incentive program. All of the respondents who had had individual HIV testing, or those whose partner was away or who refused the couples’ incentive program, were offered the individual incentive program.

TABLE 2. Baseline Characteristics by Incentives Offered

                                              Zero        Medium      High        p-value of
                                              Incentive   Incentive   Incentive   joint test
                                              (N = 455)   (N = 420)   (N = 432)
                                              (1)         (2)         (3)         (4)
Male                                          0.446       0.469       0.435       0.59
Age                                           34.80       35.52       37.07       0.03
Married                                       0.844       0.831       0.838       0.87
Expenditures                                  3013        3131        3250        0.84
Subjective Health                             2.031       2.000       2.163       0.03
Number of lifetime sexual partners            2.940       3.349       3.053       0.32
Acceptable to use condom                      0.400       0.392       0.424       0.62
Used condom with current partner              0.261       0.257       0.271       0.89
Fear about HIV                                1.597       1.579       1.603       0.89
Number friends died of HIV                    7.816       8.581       8.222       0.40
Some likelihood of HIV infection (current)    0.294       0.288       0.280       0.92
Some likelihood of HIV infection (future)     0.593       0.557       0.547       0.38
HIV positive at baseline                      0.105       0.088       0.067       0.13
Enrolled as a “couple”                        0.209       0.240       0.266       0.13

Standard errors in parentheses. * significant at 10%; ** significant at 5%; *** significant at 1%.

Notes: This table presents baseline demographic statistics by incentive amounts among 1,307 respondents who participated in the incentives program. Expenditures are measured as household expenditures in the past 3 months (on clothes, schooling, medical expenses, fertilizer, agricultural inputs, and funerals). Subjective health represents self-reported health and was phrased: “In general, would you say your health is: Excellent (1), Very Good (2), Good (3), Fair (4), Poor (5).” Number of lifetime sexual partners includes any partner (long-term or short-term) that the respondent had sex with. Fear about HIV was phrased as: “How worried are you that you might catch HIV/AIDS? Not worried at all (1), Worried a little (2), Worried a lot (3).” Some likelihood of infection was coded one if the respondent answered low, medium, high, or don’t know, and zero otherwise. Each variable was measured before incentives were offered.

The incentives were distributed across the three levels, for both couples and individuals, with an equal probability of receiving each incentive amount. In practice, the realized (ex-post) distribution of the incentives resulted in 35 percent receiving zero, 32 percent receiving a medium-level incentive, and 33 percent receiving a high-level incentive. The distribution of the incentives given out was roughly identical to the theoretical distribution. We cannot reject that the realized and theoretical distributions of incentives are equal using a Kolmogorov-Smirnov test for equality of distributions (p-value of 0.997, not shown).
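
The comparison of realized and design distributions can be illustrated with a short Python sketch. It uses the arm sizes reported in Table 2 and the equal-probability design described in the text. The function name `max_cdf_gap` and the ordering of the arms are assumptions made for illustration. The sketch computes only the largest gap between cumulative shares (the statistic underlying a Kolmogorov-Smirnov comparison) and does not attempt to reproduce the reported p-value.

```python
# Arm sizes come from Table 2; the equal-probability design comes from the text.
realized_counts = {"zero": 455, "medium": 420, "high": 432}
design_probs = {"zero": 1 / 3, "medium": 1 / 3, "high": 1 / 3}
order = ["zero", "medium", "high"]  # assumed ordering of incentive levels


def max_cdf_gap(counts, probs, levels):
    """Largest absolute gap between realized and design cumulative shares."""
    total = sum(counts.values())
    realized_cum = 0.0
    design_cum = 0.0
    gap = 0.0
    for level in levels:
        realized_cum += counts[level] / total
        design_cum += probs[level]
        gap = max(gap, abs(realized_cum - design_cum))
    return gap


total = sum(realized_counts.values())
shares = {level: realized_counts[level] / total for level in order}
d_stat = max_cdf_gap(realized_counts, design_probs, order)

for level in order:
    print(f"{level:>6}: realized {shares[level]:.3f}, design {design_probs[level]:.3f}")
print(f"largest cumulative gap: {d_stat:.4f}")
```

The realized shares round to the 35, 32, and 33 percent quoted in the text, and the largest cumulative gap is about 1.5 percentage points.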
Table 2 presents baseline summary statistics among those offered zero, medium, and high amounts of the incentive. For almost every variable, there is no significant difference across incentive groups. For a few variables (for example, age and self-reported health), the averages differ significantly across groups; these differences are small in magnitude, and we also include these demographic controls in the analysis.

Sexual Diaries and HIV Testing

Approximately three to six months after the incentives were offered and vouchers given out, respondents were interviewed in their homes and asked about their recent sexual behavior. In particular, they were asked about their sexual activities and condom use over the previous nine days. These interviewer-administered diaries were collected three times over the period of the study, which we identify as Round 1, Round 2, and Round 3, respectively. These were unannounced visits that occurred approximately every three months; the same questionnaire was administered each time.

At the end of the third round, respondents were visited by a project nurse and were offered another HIV test. This HIV test was tied to the financial incentives and thus was required in order to be eligible to receive any of the financial incentives. At the end of the study, of the 1,076 HIV-negative individuals who took a test at the follow-up, seven were HIV positive. This is an incidence rate of less than one percent. It is important to note that the study was not originally designed to be powered to detect changes in HIV incidence, which would require a much larger sample size. Instead, we designed the study to examine the effects on sexual activity, including condom use (see also Section IV and footnote 3).
It is also important to note that while discordant couples were initially overrepresented in the sample, few such couples were actually present at baseline and then followed until the end of the study, making it difficult to analyze effects among these couples.

Table 3, Panel A presents attrition statistics across each round of sexual diary and obtaining a follow-up HIV test. Approximately 93 percent of the sample completed round 1 diaries, 89 percent completed round 2 diaries, and 92 percent completed round 3 diaries. Men (who tend to be more mobile in Malawi) were less likely to complete rounds (between 3.0 and 4.2 percentage points less than women; results not shown). Individuals who were HIV positive in 2006 were less likely to complete rounds, and this became more of a factor over time. HIV-positives were 6.6 percentage points less likely to complete round 1 diaries, 9.9 percentage points less likely to complete round 2 diaries, 10 percentage points less likely to complete round 3 diaries, and 20 percentage points less likely to take the follow-up test. Almost all of the respondents (98 percent) completed at least one round of diaries, with an average of 2.7 rounds. At the end of the study, 89 percent of all enrolled respondents obtained a follow-up HIV test after round 3. Panel B presents attrition statistics among HIV-negatives, who form the main sample for the analyses in the remainder of the paper.

TABLE 3. Attrition/Survey Completion Rates

Panel A: Entire Sample
                                  All       Zero        Medium      High        p-value of
                                            Incentive   Incentive   Incentive   joint test
                                  (1)       (2)         (3)         (4)         (5)
Enrolled in Incentives Project    1307      455         420         432         –
Completed Round 1                 0.929     0.921       0.921       0.944       0.31
Completed Round 2                 0.889     0.884       0.902       0.882       0.57
Completed Round 3                 0.916     0.890       0.931       0.928       0.05
Completed at Least One Round      0.979     0.971       0.988       0.977       0.23
Completed Each Round              0.829     0.802       0.845       0.843       0.16
Number rounds completed           2.734     2.695       2.755       2.755       0.29
Follow-up HIV Test                0.884     0.820       0.910       0.928       0.00

Panel B: HIV Negatives
                                  All       Zero        Medium      High        p-value of
                                            Incentive   Incentive   Incentive   joint test
                                  (1)       (2)         (3)         (4)         (5)
Enrolled in Incentives Project    1193      407         383         403         –
Completed Round 1                 0.935     0.929       0.927       0.948       0.41
Completed Round 2                 0.898     0.892       0.914       0.888       0.45
Completed Round 3                 0.925     0.899       0.940       0.935       0.06
Completed at Least One Round      0.983     0.975       0.992       0.983       0.19
Completed Each Round              0.840     0.816       0.856       0.849       0.25
Number rounds completed           2.757     2.720       2.781       2.772       0.33
Follow-up HIV Test                0.902     0.848       0.924       0.935       0.00

Standard errors in parentheses. * significant at 10%; ** significant at 5%; *** significant at 1%.

Notes: This table presents survey completion rates by incentive amounts. The sample includes 1,307 respondents who participated in the incentives program. Panel B excludes 114 individuals who were HIV positive at baseline.

Importantly, completion rates of sexual diary rounds and obtaining a follow-up HIV test are correlated with the incentive offered. Those who were offered incentives (and in some cases, higher levels of incentives) were more likely to complete sexual diaries and were more likely to take the HIV test at the end of the study. But only in a few cases (two out of seven) is the difference in attrition between the different incentive levels statistically significant. Although in each round respondents received a small gift for their participation (soap), those who were not offered a financial reward may have had a lower return in continuing to answer survey questions. They would also have had little potential gain in participating in a follow-up HIV test after having already learned their status one year earlier.
If attritors who were offered the incentive were also more likely to have engaged in riskier sex, then our estimates of the effectiveness of the program on safe sexual behavior would be upwardly biased. We would underestimate the true effect if the pattern of differential attrition was reversed. When we check for differential attrition by interacting indicators of baseline risky sexual behavior (HIV status in 2006, ever using a condom, or number of sexual partners) with the incentive, there is no significant effect on completing any round (not shown). There was also no differential attrition on completing at least one round of sexual diaries.

For our analysis below, we first pool our results across all three rounds of sexual diaries, mitigating the effects of differential attrition at each single round. Another strategy is to use baseline observable characteristics to construct inverse probability weights. This procedure predicts the probability that an individual has not attrited; the inverse of this probability is the weight in each regression. Thus, individuals who are more likely to have missing sexual diary rounds are given more weight. The main results reported below do not differ substantially using these weights (results not shown).

From the information in the diaries, we extract several indicators of risky or safe sexual behavior. These include, at any of the rounds, being pregnant (measured only among women), having any vaginal sex (during the nine days of the sexual diary), the number of days having vaginal sex, whether or not the respondent used a condom during the days of the diary conditional on having sex, and whether condoms were present at home.
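
A minimal sketch of how round-level indicators of this kind might be derived from one nine-day diary is given below. The record layout (`DiaryDay`), the function name `diary_indicators`, and the rule that condom use means a condom on at least one day with sex are all assumptions made for illustration; the questionnaire's actual coding is not described in this section. Pregnancy is omitted because it is a respondent-level item rather than a daily one.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DiaryDay:
    """One reported day of the diary; field names are hypothetical."""
    had_vaginal_sex: bool
    used_condom: bool = False  # only meaningful on days with sex


def diary_indicators(days: List[DiaryDay], condoms_at_home: bool) -> Dict[str, Optional[int]]:
    """Collapse one diary round into round-level indicators."""
    sex_days = [day for day in days if day.had_vaginal_sex]
    any_sex = len(sex_days) > 0
    # Assumption: "used a condom" means a condom on at least one day with sex.
    # The measure is conditional on having sex, so it is left undefined (None)
    # when no sex is reported.
    used_condom = int(any(day.used_condom for day in sex_days)) if any_sex else None
    return {
        "any_vaginal_sex": int(any_sex),
        "days_vaginal_sex": len(sex_days),
        "used_condom": used_condom,
        "condoms_at_home": int(condoms_at_home),
    }


# Hypothetical nine-day diary: sex on two days, a condom used on one of them.
example_days = [DiaryDay(False) for _ in range(7)] + [DiaryDay(True, True), DiaryDay(True, False)]
example = diary_indicators(example_days, condoms_at_home=False)
print(example)
```

Leaving condom use undefined for respondents who report no sex mirrors the smaller number of observations one would expect for a measure that is conditional on sexual activity.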
We also construct a composite variable indicating whether the respondent had safe sex; that is, it is equal to one if the respondent had sex with a condom or had no sex at all.10 Across all three rounds, 8.6 percent of women were pregnant, and 53 percent of respondents engaged in vaginal sex across the nine days of diary collection, with an average of 1.5 days of sex (across the three rounds). Across the rounds, 12.6 percent report using a condom. In each of the rounds, only 12.1 percent reported having condoms at home. Overall, 55.6 percent of respondents practiced “safe sex,” either abstaining or using a condom. We find no evidence of recall bias when comparing the reported number of sexual encounters on the first day of the diary (i.e., “yesterday”) with the last day of the diary (i.e., “day 9”).

10. While multiple partnerships may have been one important indicator of risky sexual behavior, we do not observe a lot of variation in reported multiple partnerships in the sexual diaries. For example, in the first round of sexual diaries, only 3.7 percent of those who reported having one partner also reported having two. Only 0.4 percent report having more than two partners in that week. The majority of those reporting multiple partners are polygamous men. This may be due to the fact that multiple partnerships are infrequent and we do not capture this very well using the diary method, or that there is underreporting. The interviewers used in the data collection were local enumerators and while there is evidence that “insiders” may increase data quality in Africa (Sana and Weinreb 2008), there may have been some reluctance to report multiple partnerships. For this reason, we do not analyze this variable in this paper.

II. EMPIRICAL STRATEGY

Using the fact that the incentives were randomly offered to empirically measure the impact of financial incentives on reported sexual behavior, we estimate the following specification:

$$Y_{i,j} = \alpha + \beta\,(\text{Any Incentive}_i) + \gamma X_i + \varepsilon_{i,j} \qquad (1)$$

Because the incentive project included individuals who were HIV-positive at baseline primarily to protect the confidentiality of HIV status of responders, rather than for the identification of program effects, we estimate in this paper the effects of the incentives only among individuals who were HIV-negative at baseline. We first pool each of the rounds of the sexual diaries together. Y indicates reported sexual behavior for an individual i in round j. “Any Incentive” indicates that the individual was offered a positive (nonzero) incentive. “X” is a vector including indicators of gender, age, marital status, whether the incentive was given as an individual or as a couple, and HIV status in 2006, as well as district and sexual diary round dummies. Standard errors are clustered by individual for the pooled regressions. In addition to pooling rounds, we estimate the above equation separately for each round; for these specifications we cluster by village.11

In a simple comparison of means which does not include any controls or adjustments in standard errors, the results are unchanged (Appendix S.2). In addition to measuring the effect of being offered any incentive, we could also include dummy variables indicating whether respondents were offered medium-valued or high-valued incentives. All of the results are robust to this alternative specification (Appendix S.3). Another approach would be to use the entire sample, including those who were HIV-positive in 2006. The main results do not differ substantially among a pooled sample of HIV-positives and HIV-negatives (not shown).
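
Equation (1), estimated by OLS with standard errors clustered by individual, can be sketched in plain Python as follows. This is a minimal illustration on simulated data, not the estimation code behind the reported results. The helper names `solve` and `ols_cluster`, the simulated covariates, and the finite-sample correction are all assumptions, and the simulated data build in no incentive effect.

```python
import random


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols_cluster(y, X, clusters):
    """OLS coefficients with cluster-robust (sandwich) standard errors."""
    n, k = len(X), len(X[0])
    xtx = [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(k)]
    beta = solve(xtx, xty)
    # Inverse of X'X, built one column at a time.
    inv_cols = [solve(xtx, [1.0 if i == j else 0.0 for i in range(k)]) for j in range(k)]
    # Sum of x_i * residual_i within each cluster.
    scores = {}
    for r, yi, c in zip(X, y, clusters):
        e = yi - sum(bj * xj for bj, xj in zip(beta, r))
        s = scores.setdefault(c, [0.0] * k)
        for a in range(k):
            s[a] += r[a] * e
    g = len(scores)
    adj = (g / (g - 1)) * ((n - 1) / (n - k))  # a common finite-sample correction
    se = []
    for j in range(k):
        # X'X is symmetric, so column j of its inverse equals row j.
        v = sum(sum(inv_cols[j][a] * s[a] for a in range(k)) ** 2 for s in scores.values())
        se.append((adj * v) ** 0.5)
    return beta, se


# Simulated panel: 300 hypothetical individuals observed in three diary rounds.
random.seed(0)
rows, y, ids = [], [], []
for person in range(300):
    incentive = 1.0 if random.random() < 2 / 3 else 0.0
    male = 1.0 if random.random() < 0.45 else 0.0
    taste = random.gauss(0, 0.15)  # person-level component shared across rounds
    for rnd in (1, 2, 3):
        p = min(max(0.5 + 0.1 * male + taste, 0.0), 1.0)
        rows.append([1.0, incentive, male, float(rnd == 2), float(rnd == 3)])
        y.append(1.0 if random.random() < p else 0.0)
        ids.append(person)

beta, se = ols_cluster(y, rows, ids)
print(f"any-incentive coefficient: {beta[1]:.3f} (clustered s.e. {se[1]:.3f})")
```

Clustering by individual allows the three observations of the same respondent to be correlated, which is why the scores are summed within each person before forming the variance.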
Although we run linear specifications, results are also robust to non-linear models when we have a binary outcome variable, as well as to using person-day observations.

Our primary coefficient of interest is β (the coefficient on “Any Incentive” in equation 1), which measures the impact of being offered a financial incentive on reported sexual behavior. While financial status is typically correlated with other omitted variables that also influence sexual behavior, because the incentives were randomly allocated, β is an unbiased estimate of the impact of cash offered on sexual behavior. We present how the baseline characteristics correlate with the incentive amounts in Table 2. In general, there were few significant differences in baseline variables across incentive amounts.

11. The majority of the individuals in the sample are married. A subset of the individuals’ spouses are included in the sample. This is by design when the couples counseling was introduced in 2006. However, given that individuals may report differently from each other, or may have extramarital relationships, we treat each observation as independent.

It is worth remarking on the fact that we report effects on follow-up HIV status and reported sexual behavior. In general, there was no direct benefit for respondents to overstate their safer sexual practices to the interviewers. To the extent that misreporting is not correlated with the randomized incentives, there is no reason to worry about biased estimates. On the other hand, if those offered larger incentive amounts did in fact overstate their safe sexual behavior because they perceived some greater benefit from doing so (even though doing so made them no more likely to earn their incentive money), our estimates would be an upper bound of the true effect of the program.
Because our main findings suggest an absence of important effects of the incentive payments on reported sexual behaviors, our conclusion is conservative with respect to misreporting in which individuals with higher incentives report safer sexual practices.

III. EFFECTS OF OFFERING A FINANCIAL INCENTIVE

Results

This section reports our main results of the effect of being offered financial incentives on reported sexual behavior. We first pool all three rounds of reported sexual behavior and estimate average effects of being offered a financial incentive. Table 4 presents the main results of the impact of being offered any incentive to maintain HIV status, with the control group means of each dependent variable in the last row.

There are no significant effects of the incentive on any measure of reported sexual behavior. Not only are the coefficients not statistically significant, they are also small in size. Figure 1 graphs the coefficients, with 95 percent confidence bars to illustrate the relatively small point estimates. Power calculations suggest that our incentive project, with an enrollment of about 1,200 HIV-negative individuals at baseline of whom 84 percent participated in all three rounds of sexual diaries, would have been able to detect, with a probability of more than 90 percent (with α = 0.1), a 15 percent increase in having any vaginal sex during the nine days prior to each of the sexual diaries in response to receiving any incentive, a 15 percent increase in having safe sex (that is, sex with a condom or no sex at all) during these periods, or a 15 percent decline in the number of days with vaginal sex during these periods.
Our sample size would have allowed us to detect with a probability of more than 75 percent a 25 percent increase in the probability of using condoms (conditional on having sex) or having condoms at home in response to receiving any incentives; with more than 80 percent probability, our study would have detected a 40 percent reduction in the probability of a woman being pregnant.12 The study was therefore adequately powered to detect changes in sexual behaviors at the magnitude that is suggested by other studies about the effect of financial incentives or HIV prevention programs on sexual behaviors, and these power calculations along with the small point estimates and relatively small standard errors for the coefficients (Figure 1) indicate that the incentive project did not substantially change sexual behaviors in response to receiving financial incentives that reward maintaining one’s HIV status.

12. These power calculations are based on an enrollment of 1,200 individuals in the study, of whom 84 percent participate in all three sexual diaries, an α = 0.1, three repeated measures of each sexual behavior with means and standard deviations for the control group as observed in the data, and a correlation of repeated measurements of these sexual behaviors as observed in the data.

TABLE 4. Impact of Incentive Offer on Reported Sexual Behavior, All Rounds, HIV-Negatives

                                  Pregnant    Any Vaginal Days of     Used        Condoms     Safe/No
                                  (Women)     Sex         Vaginal Sex Condom      at Home     Sex
                                  (1)         (2)         (3)         (4)         (5)         (6)
Any Incentive                     -0.003      0.012       0.039       -0.018      0.004       -0.012
                                  [0.015]     [0.021]     [0.089]     [0.018]     [0.014]     [0.021]
Male                                          0.136***    0.455***    0.094***    0.082***    -0.099***
                                              [0.020]     [0.089]     [0.018]     [0.014]     [0.020]
Married                           0.029       0.315***    0.910***    -0.045      0.076***    -0.280***
                                  [0.022]     [0.029]     [0.106]     [0.044]     [0.019]     [0.026]
Age                               -0.005      0.003       0.041**     -0.011**    -0.004      -0.005
                                  [0.003]     [0.005]     [0.018]     [0.005]     [0.003]     [0.004]
Age-squared                       0.000       -0.000      -0.001***   0.000       0.000       0.000*
                                  [0.000]     [0.000]     [0.000]     [0.000]     [0.000]     [0.000]
Some school                       -0.023      0.022       0.097       -0.006      0.025*      -0.025
                                  [0.017]     [0.026]     [0.106]     [0.019]     [0.014]     [0.025]
Number of children                -0.005      0.013***    0.060***    -0.003      -0.003      -0.016***
                                  [0.003]     [0.004]     [0.017]     [0.003]     [0.002]     [0.004]
Rumphi                            0.003       -0.089***   -0.033      0.140***    0.054***    0.172***
                                  [0.018]     [0.025]     [0.116]     [0.022]     [0.018]     [0.025]
Balaka                            0.009       -0.133***   -0.293***   0.020       0.008       0.112***
                                  [0.019]     [0.025]     [0.105]     [0.020]     [0.016]     [0.025]
Round 2                           -0.002      -0.022      -0.078      -0.003      -0.001      -0.037**
                                  [0.013]     [0.018]     [0.067]     [0.016]     [0.012]     [0.018]
Round 3                           -0.002      -0.046**    -0.025      -0.026      -0.016      -0.020
                                  [0.017]     [0.019]     [0.078]     [0.017]     [0.012]     [0.019]
Constant                          0.249***    0.253***    0.046       0.373***    0.124**     0.871***
                                  [0.070]     [0.089]     [0.342]     [0.091]     [0.058]     [0.087]
Observations                      1,777       3,258       3,258       1,748       3,258       3,258
R-squared                         0.036       0.095       0.064       0.089       0.045       0.087
Mean of dep var in control group  0.095       0.535       1.538       0.1294      0.112       0.5471

* significant at 10%; ** significant at 5%; *** significant at 1%.

Note: All columns present OLS regressions. Robust standard errors clustered by individual, in brackets. “Used a condom” is conditional on reported sexual activity. “Safe Sex or No Sex” is equal to one if the respondent abstained from sex or used a condom and zero otherwise.

Figure 1. Effect of Incentive Offer on Reported Sexual Behavior

To the extent that individuals receiving higher incentives may have felt the need to over-report safe sexual behavior to interviewers, these results are overestimates of the true effect of the incentive on actual sexual behavior. Appendix S.3 also reports the effect of different amounts of incentive (low or high) as compared to being offered no incentive.

The covariates in the table are of expected signs and magnitudes. Married individuals are significantly more likely to engage in sex and less likely to use a condom. Men report more sexual activity but more condom use/ownership.
Individuals who are HIV positive are less likely to report having sex and more likely to report using condoms and having condoms at home. There are also sharp differences in reported sexual behavior between districts (Rumphi, Balaka, and Mchinji), which could be due in part to ethnic or geographic differences.

Table 5 presents the estimates separately for each round of data collection. Again, there are no statistically significant effects of being offered the incentive during any round or for any variable. In addition to the main set of variables, we also present the effects on HIV status during round 3. Overall, there appears to be little impact of the offer of the incentive on any sexual behavior.

TABLE 5. Impact of Incentive Offer on Reported Sexual Behavior, Separate Rounds, HIV-Negatives

Dependent variable:               HIV       Pregnant  Any       Days      Used      Condoms   Safe/No
                                  Positive  (Women)   Vaginal   Vaginal   Condom    at Home   Sex
                                  (Round 3)           Sex       Sex
                                  (1)       (2)       (3)       (4)       (5)       (6)       (7)
Round 1
“Any Incentive” coefficient       –         0.020     0.039     0.145     -0.031    0.006     -0.029
                                  –         [0.027]   [0.033]   [0.111]   [0.025]   [0.021]   [0.033]
Observations                      –         601       1,101     1,101     621       1,101     1,101
R-squared                         –         0.040     0.085     0.072     0.133     0.066     0.075
Mean of dep var in control group  –         0.084     0.539     1.493     0.144     0.117     0.576

Round 2
“Any Incentive” coefficient       –         -0.007    -0.011    -0.061    -0.010    -0.004    0.003
                                  –         [0.023]   [0.029]   [0.118]   [0.029]   [0.020]   [0.033]
Observations                      –         579       1,064     1,064     574       1,064     1,064
R-squared                         –         0.037     0.102     0.077     0.122     0.057     0.106
Mean of dep var in control group  –         0.097     0.551     1.562     0.141     0.121     0.521

Round 3
“Any Incentive” coefficient       0.001     -0.022    0.008     0.031     -0.012    0.011     -0.008
                                  [0.005]   [0.022]   [0.028]   [0.117]   [0.028]   [0.017]   [0.028]
Observations                      1,071     597       1,091     1,091     552       1,091     1,091
R-squared                         0.009     0.048     0.116     0.060     0.031     0.033     0.100
Mean of dep var in control group  0.006     0.105     0.515     1.562     0.104     0.097     0.543

* significant at 10%; ** significant at 5%; *** significant at 1%.

Note: Each cell presents a separate OLS regression of the dependent variable on a dummy variable for whether the respondent was offered any incentive, controlling for whether the respondent is male, married, has some schooling, HIV status at baseline, number of living children, and district fixed effects. Robust standard errors clustered by village, in brackets.

Channels

There may be a variety of reasons why we might observe no effect of offering financial incentives on sexual behavior. We explore several of these channels in Appendix S.4 by measuring heterogeneous responses to the incentive offer. One often-cited reason for the lack of observed behavior change in Africa in response to the HIV epidemic is that there is a lack of knowledge or awareness of how to change behavior. In theory, education could be positively correlated with behavior change, either because individuals learn how to protect themselves from infection, or because education raises the return to staying uninfected (de Walque 2006; Oster 2007). While the average number of years of education is quite low overall, at 4.5 years, with 23 percent having never attended school, there is essentially no differential impact of education on the response to being offered any incentive.

Another possible reason why there was no observed behavior change in response to the financial incentive is that the incentive amount was not enough to induce behavior change. Given the levels of poverty in this sample, it is difficult to argue that there were no individuals who would have been affected on the margin and that we would not pick up changes in response to the monetary incentives. However, we can estimate any possible effects among those who are at different levels of income.
While those with higher income are less likely to have sex and less likely to use a condom, there is no consistent pattern to the interaction between income and the incentive in our outcome variables. Similarly, men and women could have responded differently to being offered incentives, either due to preferences or ability to bargain on sexual behavior within the relationship. Again, there is no consistent pattern in the difference in the response to the incentive by gender. To examine whether bargaining power was important for women, we estimate the impact of the incentive among women who were more or less “empowered.� We construct an index by taking an average of a series of attitudinal questions related to gender empower- ment.13 This index ranges from zero to one with an average value of 0.47. Higher values of the index indicates higher levels of empowerment. We 13. The empowerment questions included: “Do you think it is proper for a wife to leave her husband if: He does not support her and the children �nancially?; He beats her frequently?; He is sexually unfaithful?; She thinks he might be infected with HIV?; He does not allow her to use family planning?; He cannot provide her with children?; He doesn’t sexually satisfy her?� “Is it acceptable for you to go to: The local market without informing your husband?; The local health center without informing your husband.� “A woman has the right to refuse sex with her husband when she: Is tired from working hard; Doesn’t feel like it or is not in the mood; During the abstinence period after childbirth; Is no longer attracted to her husband.� “A woman has the right to refuse unprotected sex with her husband when she: Thinks her husband may have HIV/AIDS; Thinks she may have HIV/AIDS; Doesn’t want to risk getting pregnant.� “If a woman often refuses sex with her husband, is it acceptable for the husband to: Refuse to eat her nsima; Sleep with another sexual partner; Sleep with her by force; Stop providing for her.� 
Kohler and Thornton 183 estimate differential effects among females only and again there appears to be no differential response. Overall, the results indicate no response to being offered the monetary in- centive on the sample.14 This may be due to the fact that the monetary reward was too far in the future, that it was not enough money, or that these indivi- duals were already optimizing prior to the incentive scheme. One possibility that we assert is less likely to be a reason for the limited effects is the credibility of receiving the incentives. These individuals were part of a larger longitudinal study which had previously offered—and distributed—�nancial rewards for traveling to a mobile clinic to learn their HIV results (Thornton 2008). It is therefore unlikely that the credibility of the program was the most critical factor. I V. E F F E C T S OF RECEIVING A FINANCIAL INCENTIVE Approximately one week after the third round of sexual diaries, HIV testing, and distribution of the monetary incentives, interviewers returned to each re- spondent to administer the sexual diaries. Again, completion of a survey was correlated to incentive status. Overall, 91.6 percent were interviewed although this varied by no incentive (89.1 percent), medium incentive (93.1 percent) and a high incentive (92.9 percent). Seven individuals, who were HIV-negative at enrollment, tested HIV-positive in at the follow-up test and did not receive the �nancial reward. Table 6, Panel B presents the effects of receiving the monetary incentives on sexual behavior, separately among men and women. These results are intention-to-treat effects as they do not exclude those who did not receive the incentives.15 Among men, those who received any incentive were 12.3 percent- age points to engage in any vaginal sex and had 0.5 additional days of sex. 
Men who received incentives were also significantly more likely to report using a condom during sex (6.9 percentage points more likely), but overall, they were more likely to engage in riskier sex. These results are similar to findings in Luke (2006), who found that wealthier men were more likely to engage in sex but also more likely to use condoms. Women who received the incentive, on the other hand, were less likely to report having any vaginal sex and there was no impact on reported condom use. In some cases, the amount of the incentive also mattered; for example, among women, the largest effect of the incentive was among those who received the largest incentive, rather than the medium-valued incentive (Table 6, Panel C). For men, on the other hand, there is no statistical difference between the response to high incentives and medium incentives.

14. There were also no differential effects by marital status or age of the respondent (not shown).

15. One question is whether there might be large differences in the control-group mean across the pre-program period, round 3, and the post-program period. Because we do not have baseline sexual diaries in our data, we cannot compare pre-program sexual behavior. However, we can test the average across the rounds in the control group. For each of the dependent variables in Tables 4 and 5, there are no significant differences in the average value across rounds, either pooled or disaggregated by gender. In Table 6, there are some differences in the control-group average before and after receiving incentive money, as can be seen from the control-group means presented in Table 6.

TABLE 6. Effect of Receiving Incentive on Reported Sexual Behavior, Round 4, HIV-Negatives

Panel A: Attrition to Round 4 Survey

                                  All       Zero Incentive    Medium Incentive  High Incentive    p-value of joint test
                                  (1)       (2)               (3)               (4)               (5)
Completed Round 4                 0.853     0.828             0.854             0.878             0.13

Panel B: Effects of Receiving an Incentive on Sexual Behavior

                                  Men                                     Women
                                  Any       Days      Used      Safe/     Any       Days      Used      Safe/
                                  Vaginal   Vaginal   Condom    No Sex    Vaginal   Vaginal   Condom    No Sex
                                  Sex       Sex                           Sex       Sex
                                  (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)
Any Incentive                     0.123**   0.514**   0.052*    -0.090**  -0.074*   -0.153    0.000     0.067*
                                  [0.049]   [0.213]   [0.031]   [0.045]   [0.040]   [0.183]   [0.030]   [0.038]
Observations                      447       447       280       447       568       568       332       568
R-squared                         0.135     0.085     0.135     0.183     0.057     0.025     0.017     0.048
Mean of dep var in control group  0.538     1.490     0.078     0.497     0.644     1.892     0.055     0.392

Panel C: Effects of Receiving an Incentive on Sexual Behavior

                                  Men                                     Women
                                  Any       Days      Used      Safe/     Any       Days      Used      Safe/
                                  Vaginal   Vaginal   Condom    No Sex    Vaginal   Vaginal   Condom    No Sex
                                  Sex       Sex                           Sex       Sex
                                  (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)
High Incentive                    0.131**   0.338     0.041     -0.092*   -0.046    -0.056    0.020     0.046
                                  [0.056]   [0.231]   [0.037]   [0.055]   [0.050]   [0.221]   [0.034]   [0.045]
Low Incentive                     0.115**   0.688**   0.063     -0.088*   -0.100**  -0.243    -0.019    0.087*
                                  [0.053]   [0.276]   [0.040]   [0.046]   [0.044]   [0.205]   [0.032]   [0.047]
Observations                      447       447       280       447       568       568       332       568
R-squared                         0.135     0.089     0.136     0.183     0.059     0.026     0.021     0.049
Mean of dep var in control group  0.538     1.490     0.078     0.497     0.644     1.892     0.055     0.392

Notes: All coefficients are from OLS regressions. Control variables not shown. “Vaginal Sex” is a dummy variable equal to one if the respondent reported having had vaginal sex. “Used a Condom” is a dummy variable equal to one if the respondent reported using a condom. “Safe Sex or No Sex” is a dummy variable equal to one if the respondent either reported using a condom or reported not having sex. Each regression includes controls for whether the respondent is male, married, has some schooling, HIV status at baseline, number of living children, and district fixed effects. Robust standard errors, clustered by village, are in brackets.

Researchers and policy makers have identified the lack of financial resources among women as a determinant of riskier sexual behavior because of the monetary transfers women receive from men (Dupas 2009; Hallman 2004; Halperin and Epstein 2004; Robinson and Yeh 2009; Shelton and others 2005; Wines 2004; Wojcicki 2002). Similarly, researchers have hypothesized a positive relationship between male wealth and unsafe sexual behavior because men with higher incomes can afford to purchase riskier sex (e.g., Gertler and others 2005; Luke 2006).16 Evidence quantifying the effects of income transfers on sexual behavior among men and women, however, is limited, and often confounded by omitted variables that bias causal estimates.17

There could be several mechanisms through which receiving the financial incentive affected sexual behavior among both the men and women. First, the money could have been directly used by the men to purchase risky sex, and the money could have been used by the women to substitute for “selling” risky sex. Another possible mechanism for men is that the incentive may have been a signal that the individual was HIV-negative. If everyone in the village knew about the incentives program, a man could use the earning of the incentive as an indication that he was not infected. Ironically, this could have resulted in an increase in risky sex. How exactly the money affected sexual behavior is worth exploring in future research.18

16. The relationship between income and HIV has been studied in other settings. For example, research has found a positive correlation between household assets and HIV or early adult mortality (de Walque 2006; Shelton and others 2005; Yamano and Jayne 2004). Alternatively, wealthier men might have higher returns to safe sex.

17. Several exceptions include Duflo and others (2006), who find that Kenyan girls receiving free school uniforms were less likely to become pregnant; Baird and others (2010), who find that direct payments of secondary school fees lead to significant declines in early marriage, teenage pregnancy, and self-reported sexual activity; and Yeh (2009), who finds that health shocks lead Kenyan prostitutes to engage in riskier sexual behavior.

18. It is worth briefly mentioning that, to the extent that self-reported sexual behavior may introduce measurement error leading to an attenuation of results (consistent with the results in Tables 4 and 5), the fact that we find effects in the last round helps to mitigate the concern that self-reports are dramatically mismeasured or random answers.

V. CONCLUSION

This paper presents the results from a conditional cash transfer program in rural Malawi where individuals were offered financial rewards to maintain their HIV status for approximately one year. We find no overall significant or substantial effects of being offered the reward on subsequent self-reported sexual behavior. Despite the fact that self-reports might be biased towards individuals overreporting safe sexual behavior, we estimate small and statistically insignificant point estimates. Moreover, we find no evidence of possibly heterogeneous effects where incentives would affect sexual behavior among the more educated or the poorer individuals, and there is no evidence for differential effects according to female bargaining power.
There is also no evidence that the largest incentive payments, which are equivalent to approximately four months of income for the average respondent in our study, have an effect on sexual behaviors. These findings are in sharp contrast to some recent press reports and published findings of related conditional cash transfer programs that claim to document the effectiveness of such programs for HIV prevention (e.g., de Walque and others 2011; World Bank 2010b).

The findings of this paper add to the literature on conditional cash transfers and HIV prevention in two important ways. First, the finding of no impact of the financial reward on sexual behavior speaks to the design of future CCTs related to sexually transmitted diseases. The success of CCTs in promoting behavior change in other contexts has varied, often depending on the specific setting, design, and implementation (Filmer and Schady 2009; Fiszbein and Schady 2009). In particular, the effectiveness of a CCT program depends on the particular target population, conditions, enforcement, credibility, and payment levels. The lessons from this evaluation can thus help in the design of future evaluations and CCT programs and in understanding sexual behavior more generally. For example, it seems plausible that rewards offered at more frequent intervals over the year might be more effective in affecting sexual behavior than a one-time reward offered after one year. In addition, it might be useful to target individuals who are in less stable sexual relationships or who are more at risk, such as unmarried adolescents.

Second, our study finds large and significant effects approximately one week after receiving the incentive money. Men who received money were 12 percentage points more likely to have vaginal sex and had approximately 0.5 days more of sex. While condom use among these men increased (by 5 percentage points), on net risky sex increased by 9 percentage points.
On the other hand, women were 6.7 percentage points less likely to have engaged in risky sex. Hence, sexual behaviors are clearly responsive to cash payments, possibly because of the income effect resulting from these payments, and the behavioral responses to receiving cash payments differ between men and women. These findings help to further quantify how men and women respond to money and raise additional important questions. Is there an increase in risky sex among men due to the fact that they purchase risky sex, or is the increase in income a signal to women that the men were HIV-negative? Did the men spend money on items that increased their level of attraction (such as purchasing new clothes or soap)? Additional investigation into the response of men and women to cash transfers is warranted. In particular, if giving men money increases their risky sex, then studies that pay respondents may actually increase the risky sex in the study. This is related to work by Oster (2007), who finds a strong relationship between exports and HIV prevalence rates.

Several arguments have been posed for and against offering incentives for individuals to stay HIV-negative. The arguments against largely fall into two categories: ethics and efficacy. Ethicists have raised concerns about coercion or equity, which are somewhat parallel to arguments against other CCT programs and merit scholarships (Wadman 2008). In this sample, the HIV-positives were also given incentives for maintaining their own status, and given rewards for maintaining their partner’s status. While we saw no large effects of the reward, this particular design might help to get around concerns of ethics. In terms of the effect of the program, while we found no effect, we hope that the lessons learned will guide future program design. To the extent that there are social externalities of HIV, offering rewards might help to reduce the epidemic.
If individuals can be incentivized during risky stages of life, then such a cash transfer program could be cost-effective. The result that money can be protective for women is encouraging; however, the finding that men receiving money practiced riskier sex calls for caution in scaling up these types of programs, as well as other poverty assistance programs.

REFERENCES

Anglewicz, P., J. Adams, F. Obare, H. P. Kohler, and S. Watkins. 2009. “The Malawi Diffusion and Ideational Change Project 2004–06: Data Collection, Data Quality and Analyses of Attrition.” Demographic Research 20(21): 503–540.
Anglewicz, P., and H. P. Kohler. 2009. “Overestimating HIV Infection: The Construction and Accuracy of Subjective Probabilities of HIV Infection in Rural Malawi.” Demographic Research 20(6): 65–96.
Baird, S., E. Chirwa, C. McIntosh, and B. Özler. 2010. “The short-term impacts of a schooling conditional cash transfer program on the sexual behavior of young women.” Health Economics 19(S1): 55–68.
Bernheim, B. D., and B. Rangel. 2004. “Addiction and Cue-Triggered Decision Processes.” American Economic Review 94(5): 1558–1590.
Bertozzi, S., N. S. Padian, J. Wegbreit, L. M. DeMaria, B. Feldman, H. Gayle, J. Gold, R. Grant, and M. T. Isbell. 2006. “HIV/AIDS Prevention and Treatment.” In D. T. Jamison, J. G. Breman, A. R. Measham, G. Alleyne, M. Claeson, D. B. Evans, P. Jha, A. Mills, and P. Musgrove, eds., Disease Control Priorities in Developing Countries, 2nd ed. Oxford: Oxford University Press.
Charness, G., and U. Gneezy. 2009. “Incentives to Exercise.” Econometrica 77(3): 909–931.
de Janvry, A., F. Finan, E. Sadoulet, D. Nelson, K. Lindert, B. de la Brière, and P. Lanjouw. 2005. “Brazil’s Bolsa Escola Program: The Role of Local Governance in Decentralized Implementation.” Social Protection Discussion Paper 0542, World Bank, Washington, D.C.
de Janvry, A., and E. Sadoulet. 2006.
“Making Conditional Cash Transfer Programs More Efficient: Designing for Maximum Effect of the Conditionality.” World Bank Economic Review 20(1): 1–29.
de Walque, D. 2006. “Who gets AIDS and how? The determinants of HIV infection and sexual behaviors in Burkina Faso, Cameroon, Ghana, Kenya and Tanzania.” World Bank Policy Research Working Paper No. 3844.
de Walque, D., W. H. Dow, R. Nathan, and C. A. Medlin. 2011. “Evaluating Conditional Cash Transfers to prevent HIV and other sexually transmitted infections (STIs) in Tanzania.” Paper presented at the Annual Meeting of the Population Association of America, Washington, D.C., March 31–April 2, 2011.
Delavande, A., and H. P. Kohler. 2009. “Subjective Expectations in the Context of HIV/AIDS in Malawi.” Demographic Research 20(31): 817–874.
Delavande, A., and H. P. Kohler. “The impact of HIV Testing on Subjective Expectations and Risky Behavior in Malawi.” Demography (forthcoming).
Duflo, E., P. Dupas, M. Kremer, and S. Sinei. 2006. “Education and HIV/AIDS Prevention: Evidence from a Randomized Evaluation in Western Kenya.” World Bank Policy Research Working Paper No. 4024.
Dugger, C. W. 2010. “African Studies Give Women Hope in H.I.V. Fight.” New York Times, July 19, 2010.
Dupas, P. 2011. “Do Teenagers Respond to HIV Risk Information? Evidence from a Field Experiment in Kenya.” American Economic Journal: Applied Economics 3(34): 1–34.
Filmer, D., and N. Schady. 2009. “Are there diminishing returns to transfer size in conditional cash transfers?” World Bank Policy Research Working Paper No. 4999.
Fiszbein, A., and N. Schady. 2009. Conditional Cash Transfers Reducing Present and Future Poverty. Washington, D.C.: World Bank Publications.
Fudenberg, D., and D. K. Levine. 2006. “A Dual-Self Model of Impulse Control.” American Economic Review 96(5): 1449–1476.
Galarraga, O., and P. Gertler. 2010.
“Cash and a brighter future: The effect of conditional transfers on adolescent risk behaviors: Evidence from urban Mexico.” Unpublished manuscript.
Gertler, P., M. Shah, and S. M. Bertozzi. 2005. “Risky Business: The Market for Unprotected Commercial Sex.” Journal of Political Economy 113(3): 518–550.
Giné, X., J. Goldberg, D. Silverman, and D. Yang. 2010. “Revising Commitments: Time Preference and Time-Inconsistency in the Field.” Unpublished manuscript, Department of Economics, University of Michigan, Ann Arbor, MI.
Giné, X., D. Karlan, and J. Zinman. 2009. “Put Your Money Where Your Butt Is: A Commitment Contract for Smoking Cessation.” World Bank Policy Research Working Paper No. 4985.
Goldberg, J. 2010. “Kwacha gonna do? Experimental Evidence about Labor Supply in Rural Malawi.” Unpublished manuscript, Department of Economics and Ford School of Public Policy, University of Michigan, Ann Arbor, MI.
Gruber, J., and B. Koszegi. 2001. “Is Addiction ‘Rational’? Theory and Evidence.” The Quarterly Journal of Economics 116(4): 1261–1303.
Gul, F., and W. Pesendorfer. 2001. “Temptation and Self-Control.” Econometrica 69(6): 1403–1435.
Gul, F., and W. Pesendorfer. 2004. “Self-Control and the Theory of Consumption.” Econometrica 72(1): 119–158.
Gul, F., and W. Pesendorfer. 2007. “Harmful Addiction.” Review of Economic Studies 74(1): 147–172.
Hallman, K. 2004. “Socioeconomic Disadvantage and Unsafe Sexual Behaviors Among Young Women and Men in South Africa.” Policy Research Division Working Paper No. 190. New York: Population Council.
Halperin, D., and H. Epstein. 2004. “Concurrent sexual partnerships help to explain Africa’s high HIV prevalence: Implications for prevention.” Lancet 364: 4–6.
Jack, A. 2010. “HIV cut in Africa by paying teenagers.” Financial Times, July 19, 2010.
Lagarde, M., A. Haines, and N. Palmer. 2007.
“Conditional cash transfers for improving uptake of health interventions in low and middle-income countries: A systematic review.” Journal of the American Medical Association 298(16): 1900–1910.
Laibson, D. 1997. “Golden Eggs and Hyperbolic Discounting.” Quarterly Journal of Economics 112(2): 443–477.
Levy, S. 2006. Progress Against Poverty: Sustaining Mexico’s PROGRESA-Oportunidades Program. Washington, D.C.: Brookings Institution Press.
Lindert, K., E. Skoufias, and J. Shapiro. 2006. “Redistributing Income to the Poor and the Rich: Public Transfers in Latin America and the Caribbean.” Washington, D.C.: World Bank, SP Discussion Paper No. 0605.
Luke, N. 2006. “Are Wealthy Sugar Daddies Spreading HIV?: Exploring Economic Status, Informal Exchange, and Sexual Risk in Kisumu, Kenya.” Paper presented at the Annual Meetings of the Population Association of America, Los Angeles, CA, March 30–April 1, 2006.
Maluccio, J., and R. Flores. 2005. Impact evaluation of a conditional cash transfer program: The Nicaraguan Red de Protección Social. Washington, D.C.: International Food Policy Research Institute.
McCoy, S. I., R. A. Kangwende, and N. S. Padian. 2010. “Behavior Change Interventions to Prevent HIV Infection among Women Living in Low and Middle Income Countries: A Systematic Review.” AIDS and Behavior 44(3): 469–482.
Mensch, B., P. Hewett, R. Gregory, and S. Helleringer. 2008. “Sexual Behavior and STI/HIV Status Among Adolescents in Rural Malawi: An Evaluation of the Effect of Interview Mode on Reporting.” Studies in Family Planning 39: 321–334.
Obare, F., P. Fleming, P. Anglewicz, R. Thornton, F. Martinson, A. Kapatuka, M. Poulin, S. C. Watkins, and H. P. Kohler. 2009. “Acceptance of Repeat Population-based Voluntary Counseling and Testing for HIV in Rural Malawi.” Sexually Transmitted Infections 85(2): 139–144.
O’Donoghue, T., and M. Rabin. 1999. “Doing It Now or Later.” American Economic Review 89(1): 103–124.
O’Donoghue, T., and M. Rabin. 2001.
“Choice and Procrastination.” Quarterly Journal of Economics 116(1): 121–160.
Oster, E. 2007. “HIV and sexual behavior change: Why not Africa.” NBER Working Paper #13049.
Over, M. 2010. “Incentives Offer New Hope for Preventing HIV Infections (Postcard from Vienna).” Global Health Policy Blog, Center for Global Development, July 21, 2010.
Philipson, T. J., and R. A. Posner. 1995. “A Theoretical and Empirical-Investigation of the Effects of Public-Health Subsidies for STD Testing.” Quarterly Journal of Economics 110(2): 445–474.
Robinson, J., and E. Yeh. 2009. “Transactional Sex as a Response to Risk in Western Kenya.” World Bank Policy Research Working Paper No. 4867.
Sana, M., and A. Weinreb. 2008. “Insiders, Outsiders, and the Editing of Inconsistent Survey Data.” Sociological Methods Research 36(4): 515–541.
Shelton, J. D., M. M. Cassell, and J. Adetunji. 2005. “Is poverty or wealth at the root of HIV?” Lancet 366(9491): 1057–1058.
Thornton, R. L. 2008. “The Demand for Learning HIV Status and the Impact on Sexual Behavior: Evidence from a Field Experiment.” American Economic Review 98(5): 1829–1863.
Volpp, K. G., L. K. John, A. B. Troxel, L. Norton, J. Fassbender, and G. Loewenstein. 2008a. “Financial Incentive-Based Approaches for Weight Loss: A Randomized Trial.” Journal of the American Medical Association 300(22): 2631–2637.
Volpp, K. G., G. Loewenstein, A. B. Troxel, J. Doshi, M. Price, M. Laskin, and S. E. Kimmel. 2008b. “A test of financial incentives to improve warfarin adherence.” BMC Health Services Research 8(1): 272.
Volpp, K. G., A. B. Troxel, M. V. Pauly, H. A. Glick, A. Puig, D. A. Asch, R. Galvin, J. Zhu, F. Wan, J. DeGuzman, E. Corbett, J. Weiner, and J. Audrain-McGovern. 2009. “A randomized, controlled trial of financial incentives for smoking cessation.” New England Journal of Medicine 60(7): 699–709.
Wadman, M. 2008. “Payments in planned HIV trial raise ethical concerns.” Nature Medicine 14(6): 593.
Whiteside, M. 1998.
“When the whole is more than the sum of the parts: The effect of cross-border interactions on livelihood security in southern Malawi and northern Mozambique.” Report for Oxfam GB.
Wines, M. 2004. “Women in Lesotho become easy prey for H.I.V.” New York Times, July 20, 2004.
Wojcicki, J. M. 2002. “Commercial Sex Work or Ukuphanda? Sex-for-Money Exchange in Soweto and Hammanskraal Area, South Africa.” Culture, Medicine and Psychiatry 26(3): 339–370.
World Bank. 2001. “Brazil: An Assessment of the Bolsa Escola Programs.” World Bank Report 20208-BR, Washington, D.C.
World Bank. 2010a. “Malawi and Tanzania Research Shows Promise in Preventing HIV and Sexually-Transmitted Infections.” World Bank News & Broadcast, July 18, 2010. The World Bank, Washington, D.C.
World Bank. 2010b. “The RESPECT study: Evaluating Conditional Cash Transfers for HIV/STI Prevention in Tanzania.” Washington, D.C.: World Bank Results Brief.
Yamano, T., and T. S. Jayne. 2004. “Working-Age Adult Mortality and Primary School Attendance in Rural Kenya.” Tegemeo Institute for Agricultural Development and Policy, Nairobi.
Yang, S. 2010. “Cash rewards and counseling could help prevent STIs in rural Africa.” University of California at Berkeley Press Release, July 18, 2010.

Just Rewards? Local Politics and Public Resource Allocation in South India

Timothy Besley, Rohini Pande, and Vijayendra Rao

What factors determine the nature of political opportunism in local government in South India? To answer this question, we study two types of policy decisions that have been delegated to local politicians: beneficiary selection for transfer programs and the allocation of within-village public goods. Our data on village councils in South India show that, relative to other citizens, elected councillors are more likely to be selected as beneficiaries of a large transfer program. The chief councillor’s village also obtains more public goods, relative to other villages.
These findings can be interpreted using a simple model of the logic of political incentives in the context that we study. JEL codes: R51, H11, H72

Locally elected officials are increasingly responsible for the allocation of local public goods and for selecting beneficiaries for transfer programs in many low-income settings. Yet when it comes to how citizens access and use their political clout as politicians and as voters, our knowledge remains limited. In this paper, we use village and household data on resource allocation by elected village councils in South India to evaluate the nature of political opportunism in a decentralized setting.

In 1993, a constitutional amendment in India instituted village-level self-government, or Gram Panchayats (GP). A typical GP comprises several villages, with the chief village councillor (the Pradhan) resident in one of them. The amendment also required political reservation of a fraction of councillor positions for historically disadvantaged groups (low castes and women).

Timothy Besley (corresponding author) is a Professor of Economics at London School of Economics, Rohini Pande is a Professor of Public Policy at Harvard University, and Vijayendra Rao is a Lead Economist in the Development Research Group of the World Bank. Email: t.besley@lse.ac.uk, rohini_pande@harvard.edu, and vrao@worldbank.org. The authors are grateful to Lupin Rahman, Radu Ban, Siddharth Sharma and Jillian Waid for research assistance, IMRB staff for conducting the survey and numerous seminar audiences, the editor and anonymous referees for comments. The authors thank the World Bank’s Research Committee and the South Asia Rural Development Unit for financial support. The opinions in the paper are those of the authors and do not necessarily reflect the points of view of the World Bank or its member countries.

THE WORLD BANK ECONOMIC REVIEW, VOL. 26, NO. 2, pp. 191–216 doi:10.1093/wber/lhr039 Advance Access Publication October 31, 2011 © The Author 2011.
Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

On political selection, we find that elected councillors are disproportionately drawn from politically and economically advantaged households. This effect is muted among councillors elected from reserved positions. However, irrespective of reservation status, the Pradhan is always more likely to belong to the village with the most electoral clout. Here, we define a village’s electoral clout as the fraction of winning coalitions formed from among all villages in a GP in which that village is decisive to maintaining the coalition’s majority status.

To examine political opportunism we consider two policy outcomes: beneficiary status for an important anti-poverty program (Below the Poverty Line [BPL] card) and allocation of public goods across villages belonging to the same GP. The BPL card program entitles households to buy food below market prices, while the GP oversees selection of beneficiary households.

To identify political opportunism in BPL card allocation we exploit within-village variation in access to political power. Controlling for wealth, education, and asset-based eligibility, a politician is more likely to have a BPL card than a nonpolitician. The effect of being a politician on the likelihood of getting a BPL card is of the same magnitude as the effect of being landless, despite the fact that politicians are significantly more likely to own land and assets that make them officially ineligible for a BPL card. Thus we interpret BPL card ownership by politicians as a prima facie measure of opportunism. Moreover, such opportunism is correlated with worse targeting.
In villages where the Pradhan has a BPL card (and/or reports that s/he decides BPL card targeting), the average landless person is less likely to obtain a BPL card. The use of political office to access BPL cards appears to be limited to nonreserved politicians. However, as reserved politicians are also more likely to be eligible for BPL cards, the likelihood of having a BPL card ends up being similar for reserved and nonreserved politicians. Reserved politicians do, however, appear to do a better job targeting lower castes.

Turning to cross-village resource allocation, we find that, after controlling for a village’s electoral clout, being the Pradhan’s village is correlated with greater access to public goods. This difference in public good provision between the Pradhan’s village and other villages in the GP is absent in census data prior to decentralization. Thus, to the extent electoral clout matters, it appears to do so by determining which village is the Pradhan’s village.

The richness of our household and village data allows us to control for obvious sources of omitted variable bias. In our analysis of BPL card allocation we exploit within-village variation in political power. That said, a causal interpretation of our findings relies on the identifying assumption that access to political power and access to public resources are not jointly determined by unobserved individual characteristics (in the case of BPL cards) or village characteristics (in the case of public goods).

We also relate our findings to political economy models of resource allocation. The observed patterns in the data are consistent with a simple political economy model where politicians have a cost advantage in both accessing public transfer programs and in targeting public goods to their own group.
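The electoral clout measure defined in the introduction (the fraction of winning coalitions in which a village is decisive) can be sketched by brute-force enumeration of coalitions. The sketch below rests on assumptions that the text above does not spell out: villages are weighted by a hypothetical voter count, a coalition wins when it holds a strict majority of all voters, and the denominator is the number of all winning coalitions in the GP. The function name and the example weights are illustrative only.

```python
from itertools import combinations


def electoral_clout(voters):
    """Share of winning coalitions in which each village is decisive.

    voters: dict mapping village name -> number of voters (assumed weights).
    A coalition wins if it holds a strict majority of all voters; a member
    is decisive if the coalition stops winning once that member leaves.
    """
    villages = list(voters)
    half = sum(voters.values()) / 2
    winning = 0
    decisive = {v: 0 for v in villages}
    for size in range(1, len(villages) + 1):
        for coalition in combinations(villages, size):
            total = sum(voters[v] for v in coalition)
            if total <= half:
                continue  # not a winning coalition
            winning += 1
            for v in coalition:
                if total - voters[v] <= half:
                    decisive[v] += 1
    return {v: decisive[v] / winning for v in villages}


# Hypothetical three-village GP; the numbers are made up for illustration.
clout = electoral_clout({"A": 50, "B": 30, "C": 20})
for village, share in clout.items():
    print(village, round(share, 3))
```

In this toy GP the winning coalitions are {A, B}, {A, C}, and {A, B, C}. Village A is decisive in all three, while B and C are each decisive in one. Enumeration grows exponentially in the number of villages, which is unproblematic for a GP with a handful of villages.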
The results on cross-village allocation of public goods are consistent with a model of agenda control in which a minimal winning coalition will prevail with resources allocated favorably within the coalition.

Taken together, our results suggest that local democracy per se does not eliminate rent extraction. However, institutions that influence selection procedures (plurality rule and mandated reservation) change the nature of resource allocation. At the same time, electoral competition appears to have yielded limited incentive effects; while voters state lower satisfaction with opportunistic politicians, political opportunism persists.

Our findings contribute to a growing empirical literature on local government in low-income settings. One strand of this literature examines how local governments represent voter preferences. Foster and Rosenzweig (2004) and Faguet (2004) provide evidence from India and Bolivia that decentralization benefits the median voter. Other studies focus on the role of political reservation. Chattopadhyay and Duflo (2004) and Beaman and others (2009, 2010) show that political reservation for women altered public good allocation in Indian villages. In previous work, we have found that reservation for lower castes improves targeting of lower caste households for home-improvement programs (Besley and others 2004a). In addition, we document the fact that the Pradhan’s village received more public goods.

This paper pushes this research agenda forward by explicitly looking at the nature of political opportunism in Panchayats. We use new data on BPL card allocation to evaluate personal gains to politicians. In the case of cross-village allocation of public goods, we explicitly examine the selection of the Pradhan’s village and whether accounting for the electoral clout of villages mutes the Pradhan-village effect. Our cross-village analysis of public good provision is related to recent work by Chattopadhyay and others (2006).
Using data on public goods allocation across hamlets, they find that low-caste Pradhans provide more public goods in low-caste hamlets. However, unlike this study, they do not find evidence for greater public good provision in the Pradhan village. A possible explanation is the apparently greater entrenchment in our setting; unlike in Chattopadhyay and others (2006), political reservation does not alter the likelihood that the most populous village in the GP will be the Pradhan’s village.

The remainder of the paper is organized as follows: In Section I we describe the institutional setting and in Section II we provide a theoretical framework which motivates our empirical analysis. Section III describes the data, and Section IV the results. Section V concludes.

I. BACKGROUND

A 1993 constitutional amendment made a three-tier elected local government obligatory throughout India. Our focus is on the lowest tier of local self-government, a popularly elected village council called the Gram Panchayat (GP).

We use data from the four South Indian states of Andhra Pradesh, Karnataka, Kerala and Tamil Nadu. Each Indian state separately decided which policies to decentralize to the GP and how to demarcate the physical boundaries of a GP. Apart from Kerala, where each village is mandated as a separate GP, all states in our sample use a population criterion.1 In all cases, a GP is subdivided into wards (the population per ward varies between 300 and 800) and elections occur at the ward level. The GP council consists of elected ward members and is headed by the Pradhan.

The 73rd constitutional amendment mandated political reservation of a certain fraction of Pradhan positions in each state in favor of historically disadvantaged lower castes and women. Only individuals belonging to the group benefiting from reservation can stand for election in a seat reserved for that group.
The law requires that one-third of Pradhan positions in every state be reserved for women, while the extent of caste reservation reflects the group’s population share in the state. In all states, the caste reservation status of a GP is first assigned, and then one-third of the positions in both caste-reserved and caste-unreserved categories are reserved for women. Thus, a significant fraction of positions are reserved for women belonging to lower caste groups. Finally, the amendment also mandated the formation of a village-level supervisory body consisting of all adults registered in the electoral rolls of a GP, the Gram Sabha.

A GP has responsibilities of civic administration with limited independent tax-raising powers.2 It is typically responsible for beneficiary selection of government welfare schemes and the construction and maintenance of village public goods. While Panchayat legislation requires that the Pradhan decide the choice of beneficiaries and public good allocation in consultation with villagers and ward members, final decision-making powers remain vested with the Pradhan.

Since 1997 the Indian government has used a targeted public food distribution system which provides BPL cardholders subsidized food while charging a near-market price for the others. In 2000–01, for our sample states, the annual income gain from having a BPL card was roughly 5 percent of an agricultural labor household’s annual expenditure.3 The cost of the subsidy is borne by the federal government and the cost of surveying households and food disbursement is borne by the state government. Hence, BPL card allocation does not impact the Panchayat budget. However, many GP-administered welfare schemes, for example, employment and housing schemes, restrict eligibility to BPL households.

1. The average population per GP is 1,650 in Andhra Pradesh, 6,500 in Karnataka, over 20,000 in Kerala, and 4,000 in Tamil Nadu. The much higher population of Kerala GPs reflects the high population density in Kerala villages: at 819 people per sq. km, Kerala is roughly thrice as densely populated as the rest of India.
2. On average, roughly 10 percent of a GP’s total revenue comes from own revenues, with the remainder consisting of transfers from higher levels of government.
3. Under the public food distribution system BPL households enjoy a 50 percent subsidy on up to 20 kg of food grains per month. Planning Commission (2005) calculations suggest that the effective annual income gain was Rs. 1025 in Andhra Pradesh, Rs. 520 in Karnataka, Rs. 1414 in Kerala and Rs. 809 in Tamil Nadu. We combine these figures with data from the 1999 National Sample Survey to compute the implied income gain for an agricultural household.

BPL eligibility is determined by a combination of state-specific income and asset criteria. To identify BPL-eligible households, the GP, together with state government officials, conducts a census collecting the relevant information. GP politicians bear substantial responsibility for conducting this survey.4 They choose the village surveyors and, using the survey results, prepare a preliminary ‘BPL’ list of recipients.

The BPL eligibility criteria used by the four states in our sample were broadly similar. A household was typically eligible if the annual household income placed it below the state poverty line and if it did not own land. In addition, households were automatically excluded from BPL eligibility if they owned any of a defined set of assets (Attanasova and others 2010). Our survey contained information on four of these assets: phone ownership, color TV ownership, motorized vehicle ownership, and water pump ownership. We use this information to create an indicator variable noassets.

The preliminary BPL list is supposed to be finalized at a Gram Sabha meeting.
However, in reality politicians enjoy substantial discretion in selecting BPL households, and villager oversight is relatively limited. While 76 percent of the villages we surveyed held a Gram Sabha in the past year, only 20 percent of households report ever having attended a Gram Sabha. Moreover, beneficiary selection was discussed in only 22 percent of Gram Sabha meetings (Besley, Pande and Rao 2005). This is also reflected in politician perceptions: only 9 percent of the 540 politicians whom we surveyed stated that the Gram Sabha decided the final BPL list; by contrast, 87 percent believed that this power lay with a Panchayat official.

Turning to public goods provision, GP officials allocate both the resources raised by taxing households and the funds transferred from the state government. While the category of expenditure for state funds is often specified, the GP has complete discretion over which villages and, within villages, which areas are to benefit from such expenditure.

4. The central government uses the Planning Commission’s poverty estimates to release food grains to each state. Each state government decides the district-wise BPL card quota. Within a district, a BPL quota is determined at the GP level.

II. THEORETICAL ISSUES

In this section, we discuss some background theoretical issues which we use to think about the empirical findings. We consider the implications of a view that GP politicians use their political authority in a self-interested way to influence transfers within and between villages.

The basic structure is to consider $V$ villages in a GP labeled $v = 1, \ldots, V$. Each village comprises a group of citizens, some of whom are poor. We consider spending which can be targeted to villages (public goods) and spending which can be targeted to poor individuals (BPL cards).
Between-village targeting

The GP allocates a budget of size $B$ across the villages, each with a share of population $p_v$, and village public expenditures are denoted by $G_v$, with

\[ \sum_{v=1}^{V} p_v G_v = B. \]

A stylized representation is to think in terms of resource allocation controlled by a village council with a set of representatives, one for each village. Within each GP, one elected representative is the Pradhan and possesses agenda setting power. The public resources in $G_v$ generally take the form of very local public goods, for example, roads and water. That is why the issue of intervillage allocation is so important to villagers.

Suppose that the Pradhan proposes an allocation to other council members and that this must be agreed to by a majority of council members in order to be accepted.5 If the village council cannot agree to a public good allocation, then the status quo is that each district gets at least $G$ and the Pradhan’s village gets $B - G$. This defines a simple bargaining game between the Pradhan and other elected representatives. The Pradhan knows that he can offer $G$ to $(V-1)/2$ of the villages and get $B - G(V-1)/2$ for his own village. The remaining villages get nothing, which exceeds what his village would get in the status quo. While this is simple and extreme, it is indicative of what will happen in a wide variety of circumstances where there is a fixed agenda power.6

Summarizing, resource allocation in the agenda setting model has the feature that the allocation of public spending to village $v$, denoted by $G^*_v$, follows:

\[
G^*_v =
\begin{cases}
B - G\,\dfrac{V-1}{2} & \text{if } v \text{ is the Pradhan's village,} \\
G & \text{if } v \text{ is in the Pradhan's winning coalition,} \\
0 & \text{otherwise.}
\end{cases}
\]

The key empirically relevant observation from the agenda setter model is the resource advantage for the Pradhan’s village. Given this advantage, it is obviously in the interest of every village to capture the Pradhan’s chair. And we would expect the largest village to have an advantage in this process.

5. The classic analysis of agenda setting is by Romer and Rosenthal (1978). Riker (1962) first proposed the importance of minimum winning coalitions in legislative bargaining.
6. Things are more complex in models such as Baron (1991) where agenda setting power varies randomly over time.

However, we should not ignore the possibility of coalition formation during the electoral process. A candidate in one village may withdraw from the race for Pradhan and deliver the votes from his village to another candidate in exchange for belonging to a winning coalition ex post. For example, with three villages of equal size, a candidate from one village could drop out with a coalition of two-thirds of the voters supporting a remaining candidate. This would be credible if the winning candidate could reward the village whose candidate dropped out. A coalition proof equilibrium would then be one where there is no candidate who could drop out of the race and benefit in this way. Following this logic, we should expect each Pradhan to assemble a minimal winning coalition in which he gets (just over) half the support of either the voters or the ward members in a GP.

There are typically many winning coalitions possible for any given allocation of population across villages. For example, in the case of three villages with a third of the population each, there are six possible winning coalitions, each containing two-thirds of the population. A village is the Pradhan’s village in two out of these six coalitions. But there is no obvious reason to expect any one of these coalitions to prevail in practice.

In order to remain agnostic about which coalition will form, we choose an ex ante measure of each village’s “power” by computing the fraction of winning coalitions (i.e., with more than half the population) formed from among all villages in a GP in which that village is decisive in maintaining a coalition containing 50 percent of the GP population.
A coalition with more than half the GP population is assumed to be winning, with the Pradhan being chosen randomly from among the coalition partners. In an ex ante sense, we expect villages with a larger power score of this kind to have a greater chance of being the Pradhan’s village ex post. A village is more powerful if there are more coalitions in which it is decisive. On this basis, any village with more than half the GP population has a power score of one. In a one- or two-village GP a single village is powerful. The interesting cases arise for GPs with more than two villages, in which case the power of a village is a nonlinear function of the vector of village populations. Thus we suppose that the power variable is a determinant of the location of the Pradhan’s village and will explore empirically whether a village’s power score predicts whether it will become the Pradhan’s village. We can also test whether, independent of the pattern of political control, power influences final resource allocation.

Within-village targeting

The members of the elected council also control households’ access to transfers from the state. A key decision which we focus on here is whether or not a household receives a BPL card. Such cards are intended for the poor. But to target them effectively requires (i) that all of the poor can be identified, and (ii) that the village council wants to target only the poor. A benevolent policy maker would target only the poor and mistakes would occur only if there are information costs. Nonbenevolent policy makers may choose to target according to political preference or self-interest, which creates political and agency costs.7 One role of political institutions is to reduce the size of such costs, either by picking more honest politicians or by creating better electoral incentives to help disadvantaged groups.
Within villages, elected politicians play a key role in deciding who receives a transfer, thus political incentives should matter. There are probably good reasons to believe that politicians are fairly well-informed about who is poor in a village, so the main focus is on political and agency costs. When deciding how to allocate BPL cards, we expect two basic components of a politician’s payoff to matter: (i) their basic preference about who should get such cards, and (ii) the incentives and constraints due to the political process.

There are several models of within-village politics which could be used to motivate how the allocation of BPL cards could be affected by politics. First, there may be political distortions due to the use of strategic transfers to gain election, as in a probabilistic voting model of the kind reviewed in Persson and Tabellini (2000) and used to model Panchayats by Bardhan and Mookherjee (2010). These would tend to give a policy advantage to key groups of “swing” voters. Another class of models stresses the possibility of ex post rent-seeking by politicians, as in a political agency framework of the kind reviewed in Besley (2006). These would tend to motivate reasons why politicians themselves would benefit from holding office.

Political reservation could make a difference in either of these frameworks by changing the targeting strategies of politicians who compete for office or by affecting the types of politicians selected (such as their honesty, competence or identity). One important role of reservation in theory is to try to change who holds office with a view toward changing policy outcomes. But reservation could also change incentives, since a reserved politician faces a lower probability of being elected again because their seat may not be reserved in the future. This suggests that the allocation of BPL cards will vary across reserved and unreserved politicians.
We should also test for the possibility that political office is used for personal gain by politicians who reward themselves with BPL cards.

Given that one important role of politicians is to allocate BPL cards, there is an interesting question of whether the politicians are selected from a particular group. In standard Downsian models of political competition, selection does not matter since electoral strategy determines the policy outcome. However, citizen-candidate approaches as developed in Besley and Coate (1997) and Osborne and Slivinski (1996) examine a world where, because of difficulties of committing to policies up front, the identity of the candidate matters. Such models could be used to see whether politicians are drawn from among the village elite. This would depend, in general, on the costs of entry, participation in political networks and the form of electoral coalitions. The first two are more likely to favor educational and income elites. However, how the last influence matters is unclear since it depends on whether the poor can mobilise around specific candidates who serve their interests. We would expect political reservation to affect selection as in Chattopadhyay and Duflo (2004) and Pande (2003).

7. A survey of all households in one village in Uttar Pradesh provides evidence for the idea that such costs depend on the household type. Das Gupta, Hoff and Pandey (2011) find that many low and middle caste households reported that they obtained a ration card with difficulty, if at all. However, 19 percent reported that they did not obtain a card even after making repeated visits to request one. In contrast, most high caste households reported that they obtained a ration card easily; for 63 percent of high caste, compared to 34 percent of SC and OBC, the ration card was delivered to their homes. The survey also found that for non-SC households, the level of wealth had no effect on the probability that it obtains a Below Poverty Line ration card and, in line with the arguments developed here, targeting appears to be based only on political favoritism.

III. DATA

Our analysis uses survey data from over 500 villages which we collected between September and November 2002. The sample villages are distributed across nine boundary districts in the four southern states of India: Andhra Pradesh, Karnataka, Kerala and Tamil Nadu.8 We randomly sampled six GPs in three blocks in each district. In GPs with fewer than four villages, we sampled all villages; otherwise, we sampled the Pradhan’s village and two randomly selected villages.9 In each village we conducted a Participatory Resource Appraisal (PRA) in which we obtained information on community demographics and public good provision, and surveyed an elected Panchayat official. In the Pradhan’s village the Pradhan was interviewed; otherwise, a randomly selected village councillor was interviewed.

In a random subsample of three GPs per block (259 villages) we conducted household interviews in surveyed villages. We surveyed 20 households in each village, of which we required that four be scheduled caste or tribe (SC/ST) households. Household selection was random, and we alternated between male and female respondents. Our final household sample size is 5,180.

Table 1 provides some descriptive statistics. While the average respondent has over four years of education, politicians are significantly more educated. Average land holdings are 2.2 acres; however, among politicians this figure rises to 5.7 acres. Politicians elected from non-reserved seats are significantly more landed than those elected from reserved seats. Only 7 percent of the villager respondents, but 25 percent of the politicians, belong to a family where someone held a political position.
Finally, 21 percent of village households and 25 percent of politician households possess a BPL card. Thus, while for the most part politicians belong to the political and economic elite, it appears that they have a greater chance of having a BPL card than a randomly selected non-politician household. Moreover, respondents are critical of local politicians: less than 40 percent believe the Pradhan looks after village needs or keeps election promises. Less than 10 percent of the respondents believe that their village facilities are better than in neighboring villages.

8. At the time of survey at least one year had lapsed since the last GP election in each state.
9. To account for the higher GP population in Kerala we sampled three GPs per block and six wards per GP: the Pradhan’s ward and five randomly selected wards.

TABLE 1. Descriptive Statistics

Household sample (means, standard deviations in parentheses)

| | Overall mean | Non-politicians | Politicians: All | Politicians: Unreserved | Politicians: Reserved |
|---|---|---|---|---|---|
| Respondent characteristics | | | | | |
| Years of education | 4.49 (4.55) | 4.33 (4.49) | 7.28 (4.36) | 8.00 (3.88) | 6.51 (4.63) |
| Land owned in acres | 2.26 (4.77) | 2.07 (4.38) | 5.71 (8.24) | 6.82 (9.21) | 4.65 (7.05) |
| Family political history (%) | 6.70 (25.00) | 5.70 (23.20) | 25.70 (43.70) | 27.30 (44.63) | 24.54 (43.11) |
| SC/ST (%) | 22.90 (42.00) | 23.00 (42.00) | 22.96 (42.00) | 6.89 (25.38) | 37.99 (48.60) |
| Female (%) | 49.10 (49.90) | 49.80 (50.00) | 35.30 (47.80) | 15.32 (36.09) | 54.12 (49.91) |
| Beneficiary status (% households) | | | | | |
| BPL card (%) | 21.95 (41.30) | 21.60 (41.00) | 25.37 (43.50) | 26.81 (44.38) | 24.01 (42.79) |
| No assets (%) | 68.30 (46.50) | 70.40 (45.60) | 29.60 (45.70) | 16.47 (37.16) | 41.93 (49.43) |

| Perceptions (% non-politicians) | Mean (SD) |
|---|---|
| Pradhan looks after village needs (%) | 38.40 (48.63) |
| Pradhan keeps election promises (%) | 36.10 (48.03) |
| Village facilities better than neighboring villages (%) | 7.40 (26.20) |

| Village sample | Overall mean (SD) |
|---|---|
| Overall GP activism | 0.14 (0.61) |
| Village population | 1524.80 (1339.50) |
| Power | 0.39 (0.35) |
| Pradhan’s village (%) | 38.31 (48.66) |
| Pradhan reserved (%) | 54.40 (49.85) |
| Indirect elections (%) | 58.77 (49.20) |

Notes:
1. Years of education refers to the respondent’s years of education. Land owned is the acres of land owned by the respondent’s household. Family political history = 1 if any household member has held a political position. SC/ST = 1 if the respondent is a scheduled caste or scheduled tribe and female = 1 if the respondent is female. BPL card is a dummy = 1 if the household has a BPL card. No assets is an indicator variable = 1 if the household does not possess any of the following: (i) phone, (ii) color TV, (iii) motorized vehicle, and (iv) water pump.
2. Each perception variable = 1 if the respondent agrees with the statement and zero otherwise.
3. Overall GP activism is the average standardized public good provision, where we average across the following categories: roads, transport, electricity, water, sanitation, irrigation, education, and health. Pradhan reserved = 1 if the position of the Pradhan is reserved for women or low caste. Pradhan’s village = 1 if the Pradhan lives in that village. Power measures the propensity for a village to belong to all the possible voter coalitions which contain more than half the voter population in the GP.
4. Source: Descriptive statistics from survey data described in the text.

Turning to village-level variables, over half of Pradhan positions are subject to some form of reservation. Roughly 30 percent of both the caste-reserved and caste-unreserved Pradhan positions are reserved for women. Within a block, the assignment of reservation status for the Pradhan position is, in effect, random. Consistent with this, in Besley and others (2004a) we show that public good provision in 1991 was statistically indistinguishable in GPs with and without a reserved Pradhan.

To measure public good provision, we collected information on the number of public good investments during the PRA.
We collected data for the following categories: roads, village transport, water, sanitation, irrigation, electricity, education and health. For each category, we construct a count variable denoting how many investments occurred in the village since the last GP election. We then construct a standardized investment measure for each category (z-score) by subtracting the mean for non-Pradhan villages and dividing by the corresponding standard deviation.

To measure the electoral clout of village v in a GP with n villages we consider all coalitions of size less than n with a population greater than half the GP population as winning coalitions. The “own” coalition of village v is the number of winning coalitions which include v and no longer remain a winning coalition when v is removed. For each village we construct a variable which we call “power”, which is the ratio of the own coalition size of v to the total number of winning coalitions in the GP. From this calculation, the average village in our sample belongs to 39 percent of the winning coalitions in the GP. We then measure the electoral clout of a village by whether it is the Pradhan’s village (i.e., the Pradhan lives in it).

IV. EMPIRICAL ANALYSIS

The main hypotheses that we test, following on from the discussion above, are:

• Agenda Setting: The Pradhan’s village will receive a larger share of Panchayat resources than other villages in the GP.
• Self-interest: Politicians are more likely to have a BPL card than other citizens, all else equal.
• Group Targeting: Households are more likely to have a BPL card if a politician from their own group is in office.

As a background, we first examine the correlates of being a politician and of being the Pradhan’s village. We then examine whether the structure of political authority affects individuals’ and villages’ propensity to receive public goods.
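The two village-level constructions described in Section III (the standardized investment measure and the “power” score) can be sketched in a few lines. This is only an illustrative sketch, not the authors’ code: the function names `standardize`, `overall_index` and `power_scores` are hypothetical, the use of the sample standard deviation is an assumption (the text does not say which is used), and the treatment of a one-village GP (power set to one) and of GPs with no winning coalition (power set to zero) are assumptions made so the sketch is well defined.

```python
from itertools import combinations
from statistics import mean, stdev


def standardize(counts, is_pradhan_village):
    """z-score each village's investment count for one public good category,
    using the mean and (sample) standard deviation of non-Pradhan villages."""
    reference = [c for c, p in zip(counts, is_pradhan_village) if not p]
    mu, sd = mean(reference), stdev(reference)  # assumption: sample SD
    return [(c - mu) / sd for c in counts]


def overall_index(z_by_category):
    """Equally weighted average of the category z-scores, village by village.
    z_by_category maps a category name to a list of z-scores (one per village)."""
    return [mean(zs) for zs in zip(*z_by_category.values())]


def power_scores(populations):
    """Share of winning coalitions in which each village is decisive.

    A coalition is any set of fewer than n villages; it is winning when its
    population exceeds half the GP population. A village is decisive when
    removing it turns a winning coalition into a losing one."""
    n = len(populations)
    if n == 1:
        return [1.0]  # assumption: a one-village GP has a single powerful village
    total = sum(populations)

    def wins(pop):
        return 2 * pop > total  # strictly more than half, in integer arithmetic

    decisive = [0] * n
    n_winning = 0
    for size in range(1, n):  # coalitions of size less than n
        for coalition in combinations(range(n), size):
            pop = sum(populations[v] for v in coalition)
            if not wins(pop):
                continue
            n_winning += 1
            for v in coalition:
                if not wins(pop - populations[v]):
                    decisive[v] += 1
    if n_winning == 0:
        return [0.0] * n  # assumption: e.g. two villages of exactly equal size
    return [d / n_winning for d in decisive]
```

For instance, `power_scores([600, 200, 200])` gives the majority village a score of one and the others zero, matching the statement in Section II that any village with more than half the GP population has a power score of one, while three equal-sized villages each score two-thirds.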
Selection of Pradhan Village

We estimate the following village-level linear probability model:

\[ P_{vgb} = \beta_b + \delta_1 X_{vgb} + \eta_{vgb}. \]

$P_{vgb}$ is a dummy variable for village $v$ in GP $g$ in block $b$ which is equal to one if the Pradhan lives in that village. We use $\beta_b$ to denote block fixed effects and $X_{vgb}$ is a vector of village characteristics. GPs in Kerala consist of one village and hence, by definition, each village is a Pradhan’s village. We therefore exclude the Kerala villages from these regressions. We cluster standard errors by GP.

The results are in Table 2. In column (1), the independent variable of interest is log village population. A 1 percent increase in village population increases the probability that the village is the Pradhan’s village by 0.24 percent. In column (2) we include other measures of a village’s political power: whether the village is the GP Headquarters and the number of wards in the village. Both variables are positively correlated with village population and also predict the choice of the Pradhan’s village. That said, the effect of village population is robust to the inclusion of these additional variables.

In columns (3) and (4), we investigate the importance of a village’s relative population share within a GP. We argued above that since GP elections are based on plurality rule, a village’s relative population share should be the relevant determinant of which village captures the Pradhan’s chair. In column (3) we see that a 1 percent increase in the share of GP population living in a village increases its likelihood of being the Pradhan’s village by 0.6 percent (this is the difference in the coefficients on the log of village population and the log of the GP population). In column (4) we measure a village’s population influence within a GP by its ‘power’, the percentage of winning coalitions in the GP that a village belongs to.
This variable positively predicts the Pradhan’s village, and its inclusion renders the effect of a village’s own population variable insignificant. The effect of the power variable is large: a move from a power of one to a power of one-third reduces the probability of being the Pradhan’s village by roughly 25 percent. In column (5) we show that the importance of village demographics in predicting the Pradhan’s village is not influenced by the reservation status of the Pradhan’s position.

Overall, these results demonstrate an important role for the population structure across villages in predicting the location of the Pradhan’s village. They also tell us that, at the very least, it will be important to control for village population when we investigate whether living in the Pradhan’s village yields a benefit in terms of public good provision.

TABLE 2. Selection of Pradhan Villages

| | (1) | (2) | (3) | (4) | (5) |
|---|---|---|---|---|---|
| Village population | 0.247 (0.037) | 0.153 (0.036) | 0.258 (0.042) | 0.063 (0.040) | 0.044 (0.040) |
| Number of wards in village | | 0.059 (0.011) | 0.039 (0.011) | 0.038 (0.013) | 0.057 (0.019) |
| GP Headquarter | | 0.220 (0.083) | 0.156 (0.086) | 0.155 (0.090) | 0.148 (0.097) |
| GP population | | | -0.237 (0.034) | | |
| Power | | | | 0.385 (0.133) | 0.209 (0.115) |
| Village population * Pradhan reserved | | | | | 0.008 (0.017) |
| Number of wards in village * Pradhan reserved | | | | | -0.003 (0.018) |
| GP Headquarter * Pradhan reserved | | | | | -0.094 (0.145) |
| Power * Pradhan reserved | | | | | -0.112 (0.235) |
| N | 394 | 389 | 376 | 389 | 389 |

Notes:
1. OLS regressions reported with robust standard errors, clustered by GP, in parentheses. All regressions include block fixed effects.
2. The dependent variable is a dummy variable = 1 if the Pradhan lives in the village. These regressions exclude Kerala GPs, which are one-village GPs. Village population and GP population are entered in logs.
3. Source: Authors’ analysis based on survey data described in the text.
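As a rough check on the magnitude quoted for the power variable, the reported power coefficient of 0.385 in Table 2 can be combined with the stated change in power from one to one-third. This is back-of-the-envelope arithmetic on the reported point estimate only; reading the result in percentage points is an assumption based on the linear probability specification:

\[
\Delta \Pr(\text{Pradhan's village}) \approx 0.385 \times \left(1 - \tfrac{1}{3}\right) = 0.385 \times \tfrac{2}{3} \approx 0.257,
\]

which corresponds to the "roughly 25 percent" reduction in the probability of being the Pradhan's village described in the text.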
Holding Political Office

We now look at the selection of politicians and investigate whether individual characteristics affect the likelihood that the respondent is an elected politician. We estimate a linear probability model of the form

\[ p_{iv} = \alpha_v + \rho x_{iv} + \epsilon_{iv}, \tag{1} \]

where $p_{iv}$ is a dummy variable for whether respondent $i$ is a politician in village $v$, $\alpha_v$ is a village fixed effect and $x_{iv}$ is a vector of individual and household characteristics. The regression exploits within-village variation to estimate the effect of household and individual characteristics on political selection. Standard errors are clustered at the village level.

TABLE 3. Selection of Politicians

| | Politician (1) | Politician (2) | Politician (3) | Pradhan (4) | Pradhan (5) | Pradhan (6) |
|---|---|---|---|---|---|---|
| Female | -0.004 (0.006) | -0.017 (0.004) | 0.014 (0.004) | 0.146 (0.065) | -0.060 (0.060) | 0.245 (0.059) |
| SC/ST | 0.045 (0.009) | 0.005 (0.006) | 0.042 (0.007) | 0.187 (0.083) | -0.010 (0.064) | 0.232 (0.078) |
| Education | 0.006 (0.001) | 0.003 (0.001) | 0.004 (0.001) | 0.017 (0.009) | 0.018 (0.008) | 0.004 (0.008) |
| Land owned | 0.006 (0.002) | 0.005 (0.001) | 0.002 (0.001) | 0.001 (0.005) | 0.003 (0.005) | -0.001 (0.004) |
| No assets | -0.071 (0.008) | -0.047 (0.006) | -0.027 (0.006) | -0.178 (0.073) | -0.072 (0.063) | -0.137 (0.066) |
| Family political history | 0.118 (0.020) | 0.076 (0.017) | 0.049 (0.016) | 0.073 (0.067) | 0.113 (0.056) | -0.021 (0.065) |
| Sample | Villagers and Politicians | Villagers and Unreserved Politicians | Villagers and Reserved Politicians | All Politicians | Village Councillors and Unreserved Pradhans | Village Councillors and Reserved Pradhans |
| N | 5397 | 5269 | 5261 | 536 | 423 | 452 |

Notes:
1. OLS regressions reported with robust standard errors, clustered by village, in parentheses. All regressions include controls for respondent age and age squared. Regressions in columns (1)-(3) include village fixed effects and in columns (4)-(6) GP fixed effects.
2. The dependent variable in columns (1)-(3) is a dummy = 1 if the respondent is a politician, and in columns (4)-(6) is a dummy = 1 if the respondent is a Pradhan. The explanatory variables are as defined in the notes to Table 1.
3. Source: Authors’ analysis based on survey data described in the text.

Table 3 reports the results. In column (1) we see that two socioeconomic characteristics increase the likelihood that the respondent is a politician: education and owning land. An additional year of education increases the probability of being a politician by 0.6 percent and an additional acre of land by 0.6 percent. Politicians are also 7 percent less likely to lack the assets that make a household eligible for a BPL card. Thus we would be surprised, based on eligibility, to observe politicians being more likely to have a BPL card. Finally, a respondent belonging to a family with a history of political participation is 11 percent more likely to be a politician.10 In columns (2) and (3) we separately examine the propensity of being elected to an unreserved and reserved position respectively.
In both cases we observe positive selection on education and family political history.11 However, reserved politicians are poorer as measured by land ownership. They are also significantly more likely to belong to the population groups that benefit from reservation (women and SC/ST).

In columns (4)-(6) of Table 3, we restrict the sample to Pradhan villages, and the dependent variable to whether the respondent is the Pradhan. We observe very similar patterns of selection. However, the results tend to be less significant, which could simply reflect the much smaller sample size.

These results confirm the impression formed in the raw data (reported in Table 1) that politicians are drawn from a political and economic elite. However, this is somewhat less true for politicians elected from reserved seats.12

10. We have also estimated these regressions including party affiliation variables. A respondent affiliated with the party in power in the state is roughly 7 percent more likely to be a politician.

11. Further disaggregation shows that family political history is positively correlated with selection only for women. The absence of a political history effect for SC/STs reflects the recent entry of these groups in politics on the back of reservation.

12. Village meeting data also show that reservation significantly reduces the likelihood that the Pradhan is an economic or political oligarch.

Between-Village Allocation of Public Goods

To examine resource allocation between villages we estimate a regression of the form

$$Y_{vgk} = \beta_b + \rho P_{vgk} + \theta X_{vgk} + \varepsilon_{vgk}, \qquad (2)$$

where $Y_{vgk}$ is the standardized measure of public good provision for public good $k$ in village $v$ in GP $g$, $\beta_b$ are block fixed effects, $P_{vgk}$ is an indicator variable for the Pradhan’s village, and $X_{vgk}$ are controls for village demographics. We cluster standard errors by GP.

The public good categories are roads, transport, water, education, health, sanitation, electricity, and irrigation. Our standardized measure, the construction of which was discussed in the data section above, allows us to compare results across subcategories. Finally, following Kling and others (2007), we obtain an overall index by taking the average of equally weighted standardized components of these public good measures. To estimate the covariance matrix (for both subcategories and the overall index) we use a seemingly unrelated regression (SUR) model.

The results are reported in Table 4. Column (1) of Table 4 shows that, as predicted by the proposed agenda-setting model, public good provision is 0.2 standard deviations higher in the Pradhan’s village. We obtain a very similar estimate when we control for GP rather than block fixed effects (column 2). The remainder of Table 4 reports alternative specifications to see whether the Pradhan village effect is robust to the inclusion of village characteristics that influence a village’s probability of securing the Pradhan’s position, as observed in Table 2. In column (3) we include other determinants of Pradhan village location within the GP. Supporting the idea that we are picking up the effect of political control, the Pradhan village effect remains positive and significant. It is striking that, although the power variable strongly predicts which village will be the Pradhan’s village, it does not appear to influence policy outcomes.

TABLE 4. Political Power and Public Good Provision

| | Overall provision (1) | Overall provision (2) | Overall provision (3) | Roads (4) | Transport (5) | Water (6) | Electricity (7) | Sanitation (8) | Irrigation (9) | Education (10) | Health (11) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Pradhan Village | 0.206 | 0.168 | 0.130 | 0.285 | 0.180 | 0.217 | 0.057 | 0.118 | -0.011 | 0.100 | 0.090 |
| | (0.048) | (0.047) | (0.047) | (0.107) | (0.113) | (0.101) | (0.082) | (0.091) | (0.101) | (0.094) | (0.074) |
| Village population | | | 0.092 | 0.193 | -0.016 | 0.104 | 0.071 | 0.148 | 0.078 | 0.031 | 0.129 |
| | | | (0.051) | (0.087) | (0.102) | (0.103) | (0.093) | (0.108) | (0.087) | (0.084) | (0.066) |
| Number of wards in village | | | 0.025 | 0.038 | 0.057 | 0.024 | 0.019 | 0.052 | -0.027 | 0.007 | 0.032 |
| | | | (0.019) | (0.029) | (0.037) | (0.029) | (0.036) | (0.045) | (0.020) | (0.031) | (0.020) |
| GP Headquarter | | | 0.078 | 0.251 | 0.184 | 0.007 | 0.270 | 0.002 | -0.018 | 0.074 | -0.150 |
| | | | (0.064) | (0.154) | (0.141) | (0.153) | (0.131) | (0.149) | (0.140) | (0.136) | (0.114) |
| Power | | | 0.0035 | -0.200 | -0.076 | -0.147 | 0.072 | 0.115 | -0.088 | 0.213 | 0.139 |
| | | | (0.101) | (0.194) | (0.255) | (0.228) | (0.173) | (0.250) | (0.195) | (0.182) | (0.153) |
| Fixed effect | Block | GP | Block | Block | Block | Block | Block | Block | Block | Block | Block |
| N | 521 | 521 | 496 | 496 | 496 | 496 | 496 | 496 | 496 | 496 | 496 |

Notes: 1. Overall provision is the equally weighted average of the eight public good outcomes reported in columns (4)-(11). The covariance is estimated within a SUR framework. The standard errors are clustered by GP. 2. Village population is entered in logs. 3. Source: Authors’ analysis based on survey data described in the text.

Columns (4)-(11) of Table 4 report results for different categories of public good provision. The Pradhan village effect is mainly driven by provision of roads and water, two important areas of investment by GPs. In no case does the power variable predict public good provision (nor does being the GP headquarters). However, for roads we observe an effect of village size over and above the Pradhan village effect.

Overall, the results in Table 4 are consistent with the Pradhan’s village enjoying a policy advantage in the GPs that we are studying. Since we only have cross-sectional data, we cannot directly compare public good provision in 2002 with that before the Panchayat system was instituted. However, as a baseline, in Appendix Table 1 we consider a set of 1961 and 1991 village public goods as measured in the censuses taken in these years. For consistency, we construct standardized z-scores for each subcategory following the procedure outlined above and estimate the regressions in a SUR framework. In no case do we find that the Pradhan village is doing better.
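The index construction and the fixed-effects regression in equation (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the z-scores are computed over the full sample (the data section that defines the authors' standardization is not reproduced here), the toy numbers are made up, and neither the SUR covariance nor GP-clustered standard errors are computed.

```python
import numpy as np

def overall_index(goods):
    """Equally weighted mean of column-wise z-scores.

    goods: array of shape (n_villages, n_goods), one column per public good.
    Assumption: each column is standardized over the full sample.
    """
    goods = np.asarray(goods, dtype=float)
    z = (goods - goods.mean(axis=0)) / goods.std(axis=0, ddof=1)
    return z.mean(axis=1)

def demean_by_group(a, groups):
    """Subtract group means (the within transformation that absorbs fixed effects)."""
    a = np.asarray(a, dtype=float)
    out = a.copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] -= a[mask].mean(axis=0)
    return out

def within_ols(y, X, groups):
    """OLS of y on X after sweeping out group (for example, block) fixed effects."""
    y_w = demean_by_group(y, groups)
    X_w = demean_by_group(X, groups)
    coef, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)
    return coef

# Toy data: two blocks of three villages, one Pradhan village per block.
blocks = np.array([0, 0, 0, 1, 1, 1])
pradhan = np.array([1.0, 0.0, 0.0, 0.0, 1.0, 0.0])
block_effect = np.array([5.0, 5.0, 5.0, 9.0, 9.0, 9.0])
y = block_effect + 0.2 * pradhan  # noise-free outcome with rho = 0.2

rho_hat = within_ols(y, pradhan[:, None], blocks)[0]
index = overall_index(np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]))
print(rho_hat, index)
```

Because the toy outcome has no noise, the within estimator returns the assumed coefficient on the Pradhan village indicator; with real data one would add the village controls as further columns of `X`.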
Instead, the main positive predictor of public good provision appears to be village population. This further supports the notion that the Pradhan village effect is picking up something about the contemporary level of government provision.

We have also checked whether the Pradhan village effect is influenced by either Pradhan or village characteristics. We find no evidence that Pradhan characteristics (as measured by whether he/she has a BPL card, years of education, or reservation status) influence public good allocation. Taken together, these results further underpin the proposition that purely agenda-setting power matters for policy.

Table 5 looks at the issue from a different angle and examines whether being the Pradhan’s village is correlated with greater political activism, and whether this, rather than political control, underlies the results. Our survey asked various questions about villagers’ political involvement. If political control is what underlies public good provision, then we would not expect to see higher involvement by residents in the Pradhan’s village. In fact, none of newspaper readership (column 1), party affiliation (column 2), voting in the GP election (column 3), or attending village meetings (column 4) is higher in the Pradhan’s village. Thus political activism appears similar across the Pradhan’s and other villages in a GP. Column (5) confirms that political knowledge is also similar across villages, with the probability of being able to name one’s legislator being no higher in the Pradhan’s village than in other villages. But when it comes to knowing who the Pradhan is, and having seen him/her, the results are quite different (columns 6 and 7). Members of the Pradhan’s village are significantly more likely to be able to name the Pradhan and to have encountered him/her.

TABLE 5. Villager Political Involvement and Pradhan’s Village

| | Reads newspaper (1) | Affiliated with party (2) | Voted in last GP election (3) | Attends Gram Sabha (4) | Knows name of legislator (5) | Knows name of Pradhan (6) | Seen Pradhan (7) |
|---|---|---|---|---|---|---|---|
| Pradhan’s village | 0.014 | 0.007 | 0.017 | 0.007 | 0.023 | 0.238 | 0.240 |
| | (0.012) | (0.013) | (0.013) | (0.013) | (0.017) | (0.023) | (0.020) |
| N | 5133 | 5133 | 5133 | 5133 | 5133 | 5133 | 5115 |
| Mean for non-Pradhan villages | 0.325 | 0.277 | 0.866 | 0.239 | 0.420 | 0.430 | 0.506 |
| | (0.466) | (0.448) | (0.340) | (0.427) | (0.493) | (0.495) | (0.500) |

Notes: 1. OLS regressions reported with robust standard errors clustered by GP in parentheses. All regressions include block fixed effects. 2. The sample consists of all respondents but excludes politicians. All regressions include as additional covariates: female, household size, age and age squared, and the controls listed in column (1) of Table 6. 3. Source: Authors’ analysis based on survey data described in the text.

Taken together, the results in Table 5 provide evidence against the view that the Pradhan village effect proxies for an omitted village-level political activism variable. Rather, it appears that the agenda-setting power conferred on the Pradhan provides an important source of policy advantage to the village in which he or she lives.

Within-Village Allocation of BPL Cards

The basic intent of the BPL card program is to help poor households. The fact that politician households are wealthier than nonpolitician households (Tables 1 and 2) ought, therefore, to imply that politician households are less likely to have a BPL card. To investigate this empirically, we estimate a linear probability model:

$$b_{iv} = \alpha_v + \gamma_1 x_{iv} + \gamma_2 p_{iv} + \eta_{iv}. \qquad (3)$$

$b_{iv}$ is an indicator variable for whether household $i$ in village $v$ has a BPL card.
$x_{iv}$ is a vector of household characteristics that are relevant to whether the household is needy. It also includes a dummy for whether any household member currently or previously held a political position. $p_{iv}$ is an indicator variable for whether the individual is a politician. The influence of village-level characteristics is subsumed in a village fixed effect $\alpha_v$. The regression, therefore, only exploits within-village variation in individual and household characteristics to explain the allocation of BPL cards. Standard errors are clustered at the village level.

The results are in column (1) of Table 6. BPL cards are, on average, targeted towards disadvantaged groups. An SC/ST household is 15 percent more likely to get a BPL card and a landless household 7 percent more likely. Households with a more educated respondent are less likely to get a BPL card. In addition, asset-based eligibility matters. A household which reports none of the assets that make it BPL-ineligible is 6 percent more likely to get a BPL card. Finally, we observe no impact of family political history. Controlling for current economic status, households in which at least one member holds, or has previously held, political office are no more likely to have a BPL card.

TABLE 6. Targeting of BPL Cards

Dependent variable: Household has BPL card

| | (1) | (2) | (3) | (4) | (5) |
|---|---|---|---|---|---|
| SC/ST | 0.152 | 0.148 | 0.149 | 0.150 | 0.123 |
| | (0.019) | (0.019) | (0.020) | (0.019) | (0.020) |
| Landless | 0.063 | 0.065 | 0.065 | 0.064 | 0.062 |
| | (0.015) | (0.015) | (0.015) | (0.015) | (0.016) |
| Landownership | -0.001 | -0.002 | -0.002 | -0.001 | -0.002 |
| | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) |
| Education | -0.004 | -0.005 | -0.005 | -0.004 | -0.004 |
| | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) |
| No assets | 0.066 | 0.073 | 0.074 | 0.076 | 0.068 |
| | (0.014) | (0.014) | (0.014) | (0.013) | (0.013) |
| Family political history | -0.004 | -0.014 | -0.017 | -0.013 | -0.017 |
| | (0.020) | (0.020) | (0.020) | (0.019) | (0.020) |
| Politician | | 0.095 | 0.184 | 0.089 | 0.091 |
| | | (0.033) | (0.047) | (0.083) | (0.033) |
| Reserved politician | | | -0.199 | -0.101 | |
| | | | (0.067) | (0.071) | |
| Reserved politician is SC/ST | | | 0.050 | 0.032 | |
| | | | (0.099) | (0.112) | |
| Politician*years of education | | | | -0.010 | |
| | | | | (0.007) | |
| Politician*No assets | | | | -0.044 | |
| | | | | (0.085) | |
| Politician*Pradhan decides BPL | | | | 0.268 | |
| | | | | (0.076) | |
| Pradhan’s village | | | | | -0.019 |
| | | | | | (0.018) |
| N | 5397 | 5397 | 5397 | 5397 | 5397 |

Notes: 1. OLS regressions with standard errors clustered by village in parentheses. All regressions also include controls for household size, respondent age, and age squared. Regressions in columns (1)-(4) include village fixed effects, and the regression in column (5) includes block fixed effects. 2. The dependent variable is a dummy variable = 1 if the household has a BPL card. The explanatory variables are as defined in Notes to Table 1. Pradhan decides BPL = 1 if the politician states that the final power for selecting BPL households lies with the Pradhan. 3. Source: Authors’ analysis based on survey data described in the text.

Next, we ask whether current political control matters. In column (2), we include as a regressor whether a household member is a currently elected GP politician. Consistent with the view that holding public office reduces the cost of access to such cards for politicians, we find that politician households are roughly 9.5 percent more likely to have a BPL card.

In column (3), we ask whether politicians elected from unreserved and reserved positions differ in their propensity to hold BPL cards. We include two additional indicator variables as explanatory variables: first, a dummy for whether the politician is elected from a reserved seat, and second, a dummy for whether the politician is elected from a seat reserved for SC/ST. We find that the benefits of being a politician (in terms of accessing a BPL card) are limited to unreserved politicians. This effect does not vary significantly across SC/ST-reserved politicians and female-reserved politicians.
It is, however, also the case that our limited sample of reserved politicians implies we lack power to disentangle these effects. An F-test shows that we cannot reject the hypothesis that a reserved and an unreserved politician are equally likely to have access to a BPL card. The reason is demographic (specifically, being SC/ST is a strong predictor of BPL card ownership).

This suggests two explanations for the apparently limited political opportunism among reserved politicians. First, reserved politicians are more likely to be eligible for BPL cards, and this is captured by the demographic controls (the SC/ST dummy). Reserved politicians, therefore, do not need to exert further political influence to get BPL cards (since they are already eligible). Second, it may be that they are less experienced and therefore unable to work the system to their advantage. While we cannot rule out this second explanation, the fact that family political history does not influence BPL card allocation suggests that the main reason may be differential eligibility of reserved and unreserved politicians (and therefore differential use of political power).

In column (4) we examine whether other politician characteristics influence their propensity to get a BPL card. More educated politicians are weakly less likely to have BPL cards. However, a politician’s eligibility for a BPL card (as proxied by asset ownership) does not influence his/her likelihood of having a BPL card. In contrast, the greater access of politicians to BPL cards is concentrated in GPs where the politician reports that the Pradhan (rather than villagers at the village meeting) decides the final BPL card allocation.
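As a reading aid, the column (3) point estimates in Table 6 can be added up to give the implied BPL card differential for each type of politician relative to a nonpolitician household with the same characteristics. The sketch below does this arithmetic in Python. The dictionary keys and the helper name are illustrative labels rather than variable names from the authors' data, and the assignment of the three coefficients to column (3) follows the description in the text. Only point estimates are combined; standard errors for the sums would require the coefficient covariance matrix, which is not reported.

```python
# Point estimates read from column (3) of Table 6 (linear probability model).
# Keys are illustrative labels, not variable names from the authors' data.
COL3 = {
    "politician": 0.184,
    "reserved_politician": -0.199,
    "reserved_politician_is_scst": 0.050,
}

def implied_differential(coefs, terms):
    """Sum the coefficients that switch on for a given type of politician."""
    return sum(coefs[t] for t in terms)

unreserved = implied_differential(COL3, ["politician"])
reserved_other = implied_differential(COL3, ["politician", "reserved_politician"])
reserved_scst = implied_differential(
    COL3, ["politician", "reserved_politician", "reserved_politician_is_scst"]
)

for label, value in [
    ("unreserved politician", unreserved),
    ("reserved politician (seat not reserved for SC/ST)", reserved_other),
    ("reserved politician (seat reserved for SC/ST)", reserved_scst),
]:
    # Coefficients of a linear probability model are probability differences,
    # so multiplying by 100 expresses them in percentage points.
    print(f"{label}: {100 * value:+.1f} percentage points")
```

The sums come to about 18.4, -1.5, and 3.5 percentage points respectively, which is the sense in which the benefit of holding office appears limited to unreserved politicians.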
Finally, in column (5) we show that belonging to the Pradhan’s village does not influence a villager’s likelihood of getting a private transfer.13 This suggests that there is no interaction between the two different aspects of resource allocation that we have been studying: between-village allocation and within-village allocation.

The evidence in Table 6 suggests that while the BPL program does succeed in targeting the relatively disadvantaged households in a village (as measured by SC/ST and landless status), politician households also benefit from this program.

We discussed above how BPL targeting might depend on politicians’ characteristics, either due to a politician’s electoral strategy or to his/her underlying sympathy with particular groups. In Table 7 we investigate this by looking at how village and politician characteristics influence targeting to disadvantaged households. We do so by interacting Pradhan and village characteristics with being either an SC/ST or a landless household in the targeting equation.14

TABLE 7. The Determinants of Targeting

| | Characteristics: Education (1) | Characteristics: Reserved (2) | Opportunism: BPL card (3) | Opportunism: Pradhan decides BPL card allocation (4) | Pradhan’s village (5) |
|---|---|---|---|---|---|
| SC/ST | 0.077 | 0.115 | 0.152 | 0.168 | 0.162 |
| | (0.049) | (0.028) | (0.029) | (0.035) | (0.026) |
| SC/ST*Characteristic | 0.010 | 0.082 | 0.024 | -0.035 | -0.007 |
| | (0.005) | (0.039) | (0.064) | (0.051) | (0.040) |
| Landless | -0.008 | 0.077 | 0.063 | 0.084 | 0.076 |
| | (0.040) | (0.020) | (0.017) | (0.019) | (0.021) |
| Landless*Characteristic | 0.008 | -0.033 | -0.046 | -0.089 | -0.040 |
| | (0.004) | (0.030) | (0.045) | (0.033) | (0.029) |
| N | 4854 | 5133 | 5104 | 4854 | 5133 |

Notes: 1. OLS regressions reported with robust standard errors clustered by GP in parentheses. All regressions include village fixed effects. 2. Regressions include the individual controls included in the regression in column (1), Table 4. All regressions exclude politicians. 3. Source: Authors’ analysis based on survey data described in the text.

Column (1) of Table 7 considers the Pradhan’s education. Both landless and SC/ST households benefit from having a more educated Pradhan. In contrast, having a Pradhan elected from a reserved position benefits SC/STs but not landless households. This is consistent with the idea that individuals benefit when there are politicians in office whose characteristics are more similar to their own. As a significant fraction of caste-reserved positions for Pradhan are also reserved for women, we do not have the ability to statistically distinguish the effects of gender and caste reservation.

In columns (3) and (4) we consider two alternative measures of politician opportunism. The first is whether the Pradhan has a BPL card and the second is whether the Pradhan states that s/he has final discretion on BPL card allocation. Both sets of regressions suggest that landless households are less likely to get a BPL card in these circumstances. The effect is strongly significant when we define opportunism in terms of the Pradhan having control over BPL card allocation (column 4).

Finally, in column (5) we examine Pradhan village effects. Living in the Pradhan’s village leaves a household’s propensity to receive a BPL card unaffected, again confirming the idea that the Pradhan village effect ought not to be important for this level of targeting.

13. Estimating this specification as a probit leaves the results unchanged.

14. It is unclear whether villages face a binding budget constraint for BPL cards. To the extent that there is flexibility in the number of BPL cards that can be allocated at the village level, these results can be interpreted as the consequences of selecting politicians of different quality who care more or less about the poor. The theory could be extended to accommodate this using a political agency model with adverse selection where there is some probability of a politician in group R being a good type who cares about targeting the poor or a self-interested type who does not.

Evidence from Attitudes

Finally, we consider whether household attitudes towards policy directly mirror the findings based on studying resource allocation. Table 8 documents perceptions of village residents on whether the Pradhan looks after village needs and keeps his/her election promises. We also look at villagers’ evaluation of facilities in their own village relative to those in neighboring villages. In order to study the impact of village-level characteristics, our regressions include block fixed effects.

TABLE 8. Pradhan and Village Characteristics and Villager Satisfaction

| Dependent variable | Years of education (1) | Reserved (2) | Pradhan decides BPL card allocation (3) | BPL card (4) | Pradhan’s village (5) |
|---|---|---|---|---|---|
| Pradhan looks after village needs | 0.008 | -0.085 | 0.046 | -0.080 | 0.125 |
| | (0.002) | (0.020) | (0.025) | (0.028) | (0.021) |
| Pradhan keeps election promises | 0.006 | -0.072 | 0.032 | -0.098 | 0.119 |
| | (0.002) | (0.018) | (0.026) | (0.023) | (0.020) |
| Village facilities better than neighboring village | 0.002 | -0.001 | -0.018 | -0.002 | 0.044 |
| | (0.002) | (0.017) | (0.019) | (0.017) | (0.014) |

Notes: 1. OLS regressions reported with robust standard errors clustered by GP in parentheses. All regressions include block fixed effects. 2. Each cell reports the coefficient from a separate regression where the dependent variable is listed in the row and the explanatory variable in the column. The sample in all regressions is the set of household respondents but excludes politician households. Regressions include as controls the set of explanatory variables listed in column (1), Table 4, and controls for being female, household size, age, and age squared. 3. Source: Authors’ analysis based on survey data described in the text.

Formally, let $q_{ivgb}$ be the probability that villager $i$ in village $v$ is satisfied with his GP $g$’s performance.
We model this with the following linear probability model:

$$q_{ivgb} = \alpha_b + \gamma x_{ivgb} + \delta Z_{gb} + \eta_{ivgb}, \qquad (4)$$

where $\alpha_b$ are block fixed effects, $x_{ivgb}$ are individual and household characteristics, and $Z_{gb}$ are GP characteristics. Standard errors are clustered by GP.

Each cell in Table 8 reports the $\delta$ coefficient from a separate regression. In all cases except for column (1), the point estimate can be read as the percent change in attitudes when the Pradhan has the specific characteristic. In column (1) the point estimate is the impact of one additional year of the Pradhan’s education on attitudes. In line with our results above, respondents think well of educated Pradhans. For instance, one additional year of education makes it 0.8 percent more likely that the respondent believes that the Pradhan looks after village needs. In contrast, column (2) shows that reserved politicians are perceived as worse than unreserved politicians in terms of looking after needs and keeping election promises. Given that such politicians seemed less opportunistic than their unreserved counterparts and were equally good (as Pradhans) as policy-makers, this finding is surprising. It could be that this finding reflects more general negative attitudes towards reservation that transcend performance while in office (on this, see also Beaman and others (2009)). In line with this, we also observe no correlation between views about the quality of village public services and having a reserved Pradhan.

Columns (3) and (4) consider measures of Pradhan control over BPL cards and ownership of a BPL card. With regard to ownership of a BPL card, we see that villagers are more dissatisfied with the performance of the Pradhan if he/she has a BPL card. However, BPL card ownership has no bearing on whether survey respondents believed that village facilities were better than those in neighboring villages.
Regarding being in the Pradhan’s village, a consistent pattern emerges across all three attitudinal measures: residents of the Pradhan’s village have a more positive attitude towards the Pradhan and a more favorable perception of village facilities. These results support the idea that the agenda-setting effect underlies greater provision in the Pradhan’s village.

Taken together, our perception-based results reinforce the findings on policy outcomes. Opportunistic politicians are perceived as worse, a finding which goes against the hypothesis that self-dealing politicians are also better at serving their constituents.

V. CONCLUDING COMMENTS

India has far to go in improving the quality of its infrastructure and public service delivery, especially in rural areas (see, for example, Pritchett and others (2006)). The high incidence of poverty in rural India also places a premium on effective targeting of household transfers. In view of this, the 1993 amendment that strengthened local democracy in India promised to deal with both of these issues. Thus it is important to deepen our understanding of how local governments allocate resources in practice.

In this paper, we have examined how political influence is used to allocate public resources in a sample of south Indian villages. The analysis has investigated resource allocation both between and within villages. The patterns that we have found are robust and transparent: political influence is used exactly as one might expect if politicians enjoy considerable discretionary authority and use it to further their broad self-interest. Politicians prove opportunistic in receiving household transfers, and use their agenda-setting power to allocate more resources to their own village.

However, we caution against translating these findings about the importance of self-interest in resource allocation into unremitting cynicism about the Indian experiment with greater powers for local government.
Without a counterfactual, we have no way of evaluating the current system relative to alternatives. Moreover, the analysis does suggest that political institutions have the potential to affect the extent and type of politician opportunism. Greater monitoring of politicians’ use of BPL cards is one possibility.15 But there is also a case for making sure that institutions are designed to rotate the Pradhan’s village so that the advantage evens out over time.

More generally, the paper serves as a reminder that, before grander questions about the merits of decentralization can be sorted out, it is necessary to understand the small-scale details of the resource allocation process in local government. Our findings suggest that institutional design influences the form of political incentives, and a promising research avenue is to understand how local institutions can be restructured in small, focused, and specific ways to make incentives work.

APPENDIX: SAMPLING

Besley and others (2004b) provide a full description of our sampling strategy. Below we describe the main elements of the sampling procedure relevant to our analysis.

For each state pair, two districts (one per state) which shared a common state boundary were selected. Within each pair, the three most linguistically similar block pairs (defined in terms of households’ mother tongue using 1991 census block-level language data) were selected. We purposely sampled three blocks per district, and randomly sampled six GPs per block, except in Kerala, where we sampled three GPs per block. Our sample consists of 201 GPs across 37 blocks. We sampled all villages in GPs with three or fewer villages; otherwise we sampled the Pradhan’s village and two other randomly selected villages. We excluded villages with fewer than 200 persons from our sampling frame and considered hamlets with population over 200 as independent villages.
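The village selection rule described above can be expressed as a short Python sketch. It is illustrative only: the function name, the dictionary layout, and the example GP are hypothetical, and the sketch assumes that the Pradhan's village itself meets the 200-person threshold.

```python
import random

MIN_POPULATION = 200  # villages with fewer persons were left out of the sampling frame

def sample_villages(villages, pradhan_village, rng):
    """Select sample villages within one GP.

    villages: dict mapping village name to population (hypothetical layout).
    pradhan_village: name of the Pradhan's village, assumed to be in the frame.
    rng: a random.Random instance, passed in so that draws are reproducible.
    """
    frame = sorted(name for name, pop in villages.items() if pop >= MIN_POPULATION)
    if len(frame) <= 3:
        # GPs with three or fewer eligible villages: take all of them.
        return frame
    # Otherwise: the Pradhan's village plus two other randomly selected villages.
    others = [name for name in frame if name != pradhan_village]
    return [pradhan_village] + rng.sample(others, 2)

small_gp = {"a": 500, "b": 300, "c": 150}
large_gp = {"a": 500, "b": 300, "c": 250, "d": 900, "e": 400}

small_sample = sample_villages(small_gp, "a", random.Random(0))
large_sample = sample_villages(large_gp, "d", random.Random(0))
print(small_sample, large_sample)
```

In the small GP the village with 150 persons drops out of the frame and the remaining two are both sampled; in the large GP the sample always contains the Pradhan's village and two distinct others.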
In every sampled village we conducted a detailed village meeting and a household survey with one elected Panchayat official. If the Pradhan lived in the village, then he/she was interviewed; otherwise a randomly selected village councillor was interviewed. In a random subsample of three GPs per block, we conducted household interviews in all sample villages (259 villages).16 In Kerala we randomly selected two GPs in one block and one GP in the other block (the selection of which block to sample how many GPs from was also random), and within sampled GPs we conducted household interviews in all sampled wards. Twenty households were sampled per village, of which four were SC/ST.

15. Besley, Pande and Rao (2005) showed that there is better targeting in villages that hold gram sabhas, but as the paper notes holding characteristics (which predict greater local control) may be correlated with holding a Gram Sabha.

16. The survey team leader walked the entire village to map it and identify the total number of households. This determined what fraction of households in the village were to be surveyed. The start point of the survey was randomly chosen, and after that every Xth household was surveyed such that the entire village was covered (going around the village in a clockwise fashion).

APPENDIX TABLE 1. Public Good Provision in 1961 and 1991

Columns (1)-(5): 1961 public good provision. Columns (6)-(10): 1991 public good provision.

| | Overall provision (1) | Primary school present (2) | Medical facility present (3) | Access road present (4) | Village has electricity (5) | Overall provision (6) | Primary school present (7) | Primary health center/dispensary present (8) | Metalled access road (9) | Village has power (10) |
|---|---|---|---|---|---|---|---|---|---|---|
| Pradhan’s village | -0.007 | 0.028 | -0.029 | -0.015 | -0.012 | 0.029 | 0.058 | -0.064 | 0.126 | -0.002 |
| | (0.012) | (0.039) | (0.026) | (0.012) | (0.025) | (0.036) | (0.076) | (0.070) | (0.080) | (0.034) |
| Village population | 0.072 | 0.040 | 0.130 | 0.030 | 0.088 | 0.167 | 0.221 | 0.260 | 0.157 | 0.032 |
| | (0.020) | (0.065) | (0.052) | (0.019) | (0.044) | (0.106) | (0.200) | (0.057) | (0.098) | (0.183) |
| Number of wards in village | 0.009 | -0.003 | 0.035 | -0.002 | 0.005 | -0.004 | -0.034 | 0.061 | 0.023 | -0.068 |
| | (0.005) | (0.015) | (0.011) | (0.002) | (0.013) | (0.029) | (0.049) | (0.024) | (0.021) | (0.052) |
| GP Headquarter | -0.013 | -0.054 | -0.012 | 0.027 | -0.014 | -0.004 | -0.154 | 0.111 | 0.009 | 0.018 |
| | (0.019) | (0.061) | (0.047) | (0.025) | (0.043) | (0.080) | (0.157) | (0.105) | (0.113) | (0.113) |
| Power | -0.032 | -0.016 | -0.106 | -0.055 | 0.049 | 0.147 | 0.195 | 0.160 | 0.130 | 0.101 |
| | (0.029) | (0.099) | (0.082) | (0.030) | (0.055) | (0.141) | (0.285) | (0.134) | (0.210) | (0.248) |
| N | 446 | 446 | 446 | 446 | 446 | 496 | 496 | 496 | 496 | 496 |

Notes: All regressions include block fixed effects, and standard errors clustered by GP are in parentheses. The Overall provision variable is the equally weighted average of the four public good outcomes. The covariance matrix is estimated within a SUR framework. Source: Authors’ analysis based on survey data described in the text.

REFERENCES

Atanassova, Antonia, Marianne Bertrand, Sendhil Mullainathan, and Paul Niehaus. 2010. “Targeting with Agents: Theory and Evidence for India’s Below Poverty Line Cards.” UC San Diego. Processed.

Bardhan, Pranab, and Dilip Mookherjee. 2010. “Determinants of Redistributive Politics: An Empirical Analysis of Land Reforms in West Bengal, India.” American Economic Review 100(4): 1572–1600.

Baron, David. 1991. “Majoritarian Incentives, Pork Barrel Programs and Procedural Control.” American Journal of Political Science 35(1): 57–90.

Beaman, Lori, Esther Duflo, Rohini Pande, and Petia Topalova. 2010. “Political Reservation and Substantive Representation: Evidence from Indian Village Councils.” Forthcoming in India Policy Forum 2010, Brookings Institute.

Beaman, Lori, Raghabendra Chattopadhyay, Esther Duflo, Rohini Pande, and Petia Topalova. 2009. “Powerful Women: Does Exposure Reduce Bias?” Quarterly Journal of Economics 124(4): 1497–1540.

Besley, Timothy. 2006. Principled Agents? The Political Economy of Good Government. Oxford: Oxford University Press.

Besley, Timothy, and Stephen Coate. 1997. “An Economic Model of Representative Democracy.” Quarterly Journal of Economics 112(1): 85–114.

Besley, Timothy, Rohini Pande, Lupin Rahman, and Vijayendra Rao. 2004a. “The Politics of Public Good Provision: Evidence from Indian Local Governments.” Journal of the European Economic Association 2(2–3): 416–426.

Besley, Timothy, Rohini Pande, Lupin Rahman, and Vijayendra Rao. 2004b. “Decentralization in India: A Survey of South Indian Panchayats.” Unpublished typescript.

Besley, Timothy, Rohini Pande, and Vijayendra Rao. 2005. “Participatory Democracy in Action: Survey Evidence from India.” Journal of the European Economic Association 3(2–3): 648–657.

Chattopadhyay, Raghabendra, and Esther Duflo. 2004. “Women as Policy Makers: Evidence from a India-Wide Randomized Policy Experiment.” Econometrica 72(5): 1409–1443.

Chattopadhyay, Raghabendra, Esther Duflo, and Greg Fischer. 2006. “Efficiency and Rent-seeking in Local Governments.” MIT, Cambridge, MA. Processed.

Das Gupta, Monica, Karla Hoff, and Priyanka Pandey. 2011. “Can Democracy Be Imposed in a Highly Unequal Society?” Forthcoming in Journal of Development Economics.

Faguet, Jean Paul. 2004. “Does Decentralization Increase Responsiveness to Local Needs? Evidence from Bolivia.” Journal of Public Economics 88(3–4): 867–894.

Foster, Andrew, and Mark Rosenzweig. 2004. “Democratization, Decentralization and the Distribution of Local Public Goods in a Poor Rural Economy.” Unpublished typescript.

Kling, Jeffrey, Jeffrey B. Liebman, and Lawrence F. Katz. 2007. “Experimental Analysis of Neighborhood Effects.” Econometrica 75(1): 83–119.

Osborne, Martin J., and Al Slivinski. 1996. “A Model of Political Competition with Citizen Candidates.” Quarterly Journal of Economics 111(1): 65–96.

Pande, Rohini. 2003. “Minority Representation and Policy Choices: The Significance of Legislator Identity.” American Economic Review 93(4): 1132–1151.

Persson, Torsten, and Guido Tabellini. 2000. Political Economics: Explaining Economic Policy. Cambridge, MA: MIT Press.

Pritchett, Lant, Rinku Murgai, and Marina Wes. 2006. “Building on Success: Service Delivery and Inclusive Growth.” World Bank India Development Policy Review. New Delhi: Macmillan Press.

Riker, William. 1962. The Theory of Political Coalitions. New Haven, CT: Yale University Press.

Romer, Thomas, and Howard Rosenthal. 1978. “Political Resource Allocation, Controlled Agendas, and the Status Quo.” Public Choice 33(4): 27–43.

Weingast, Barry, Kenneth Shepsle, and Christopher Johnsen. 1981. “The Political Economy of Benefits and Costs: A Neoclassical Approach to Distributive Politics.” Journal of Political Economy 89(4): 642–664.

An Axiomatic Approach to the Measurement of Corruption: Theory and Applications

James E. Foster, Andrew W. Horowitz, and Fabio Méndez

No generally accepted framework exists for constructing and evaluating measures of corruption. This article shows how the axiomatic approach of the poverty and inequality literature can be applied to the measurement of corruption.
A conceptual framework for organizing corruption data is developed, and three aggregate corruption measures consistent with axiomatic requirements are proposed. The article also provides guidelines for empirical applications of corruption measures and discusses data requirements. A brief empirical example illustrates how each of the measures captures a distinct view of corruption that yields a different ranking. To the authors’ knowledge, this article provides the first analysis of corruption measurement using an axiomatic framework. JEL codes: K42, O17, P37

Several recent articles have identified new data sources that allow the measurement of corruption to be based on actual episodes rather than perceptions of corruption. Seligson (2006) uses victim surveys to obtain quantitative data on the prevalence of bribery. Reinikka and Svensson (2006) use public expenditure tracking surveys to quantify embezzlement of public funds and enterprise surveys to quantify bribery at the micro level. Olken (2009) and Ferraz and Finan (2008) rely on external audits to measure fraud in local governments. Gorodnichenko and Sabirianova-Peter (2007) use gaps between the incomes and consumption of public officials for similar purposes. These evolving data sources will allow researchers and policymakers to pose new questions about corruption and create targeted policies to address it.

James E. Foster (fosterje@gwu.edu) is a professor of economics and international affairs at the George Washington University and Research Associate at OPHI, Oxford University. Andrew W. Horowitz (horowitz@uark.edu) is a professor of economics in the Sam M. Walton College of Business at the University of Arkansas. Fabio Méndez (corresponding author; fmendez@uark.edu) is an associate professor of economics in the Sam M. Walton College of Business at the University of Arkansas.
The authors are grateful to participants at the Latin American Econometric Society–Latin American and Caribbean Economic Association conference in 2008, the Brazilian Econometric Society meeting in 2008, the Pontifical Catholic University of Rio de Janeiro, Midwest Economic Association meetings, and the Walton College of Business. Foster gratefully acknowledges research support provided by the Institute for International Economic Policy (IIEP) of the George Washington University.

THE WORLD BANK ECONOMIC REVIEW, VOL. 26, NO. 2, pp. 217–235 doi:10.1093/wber/lhs008 Advance Access Publication March 1, 2012 © The Author 2012. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

But improved data on corruption will not automatically resolve questions or guide policy. There are many ways of translating data into corruption measures, and results from any study could depend on the measurement lens that is being used.1 Moreover, corruption is defined differently in different contexts (Bardhan 2006); this inconsistency of definition appears in both empirical and theoretical applications.2 It would seem to be an appropriate moment for exploring how corruption might be measured.

This article presents a framework for assessing and comparing measures of corruption, enabling more effective use of existing data and providing guidance for data collection. It adapts the axiomatic structure of poverty and inequality measurement to organize corruption data and generate aggregate corruption measures. The axiomatic approach involves defining potentially important properties of corruption measures (axioms) and classifying the measures based on those properties.
Specifying the axiomatic properties of corruption measures provides criteria that researchers and policymakers can use to evaluate and classify corruption measures and interpret empirical findings.

The article is organized as follows. Section I introduces the general conceptual framework and terminology. Section II discusses axioms that could plausibly form the foundations for measuring corruption and shows that some common measures are incompatible with those axioms. Section III presents corruption measures that are compatible with the basic axioms and develops additional properties that classify them further. Section IV defines data requirements for the proposed corruption measures and provides an illustrative example using data from the Business Environment and Enterprise Performance Survey developed by the World Bank and European Bank for Reconstruction and Development.3 Section V concludes and suggests directions for future research.

1. See Méndez and Sepúlveda (2010) for a theoretical model that illustrates the difficulties of using alternative corruption measures to resolve a question.

2. For additional examples, see Wolfers (2006), Svensson (2003), and Clarke and Xu (2004), who all use as their measure the number of corrupt transactions. In contrast, Olken (2009), Gorodnichenko and Sabirianova-Peter (2007), Shleifer and Vishny (1993), and Choi and Thum (2005) use the amount of money involved in corrupt transactions. And Cadot (1987) and Çule and Fulton (2005) use the percentage of government officials who are party to corrupt transactions.

3. Related research, including research that utilizes the survey, is available at www.worldbank.org/wbi/governance/pubs_statecapture.

I. TERMINOLOGY AND CONCEPTUAL FRAMEWORK

There are two types of individuals in the model: officials and clients. Officials are public servants who perform functions such as selling government goods or services, allocating government transfers and funds, issuing permits or levying penalties related to government regulations, and similar tasks. All such functions are called services, and the set of public officials associated with a specific service is called a department. Clients are members of the public who conduct business and use public services directly or indirectly. Services and departments are indexed by s and clients by i. The total number of departments is defined as S and the number of clients as I.

Transactions between departments and clients vary by purpose and size depending on the type of service provided. Examples include a legal payment for a passport application, a bribe paid for a driver’s license, and illegal appropriation of public funds allocated to a department. The analysis focuses on transactions between departments and clients because it is easier to obtain data on corrupt dealings by departments than by individual officials. Still, because the methodology treats officials and departments the same way, it could be applied to individual officials if the data were available.

FIGURE 1. Data Array D. Source: Authors’ construction.

Transactions are recorded in a T × I × S data array D containing T transaction reports, denoted by $d_t$. The data array has two types of entries: $d_{tis}$ can be the monetary value of a transaction observed between client i and department s in report $d_t$, or it can be an empty cell, indicating that no transaction was observed between i and s in report $d_t$. Every transaction report $d_t$ contains at least one transaction among its S × I cells, but can also list several transactions between different clients and departments. Multiple transactions between a specific client and department are recorded in different reports. In general, T and t refer to numbers of transaction reports, not to specific periods of time.
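The array structure described above can be sketched in Python. This is only an illustration under stated assumptions: NaN is used here to mark an empty cell (a representation chosen for this sketch, not by the article), indices are 0-based, and every value except the entry corresponding to $d_{113} = 7$ is invented.

```python
import numpy as np

# Hypothetical 2 x 4 x 4 array: T = 2 reports, I = 4 clients, S = 4 departments.
# NaN marks an empty cell (no transaction observed). A value of 0.0 is a
# recorded transaction with zero payment, which is different from an empty cell.
T, I, S = 2, 4, 4
D = np.full((T, I, S), np.nan)
D[0, 0, 2] = 7.0   # d_113: report 1, client 1, department 3 (0-based indices)
D[0, 1, 0] = 0.0   # invented: a service provided without payment
D[1, 3, 1] = 4.0   # invented: a second report with a single transaction

observed = ~np.isnan(D)                # True where a transaction is recorded
n_transactions = int(observed.sum())   # number of non-empty cells
# Every report must contain at least one transaction among its I x S cells.
every_report_nonempty = bool(observed.any(axis=(1, 2)).all())

print(n_transactions, every_report_nonempty)
```

Keeping empty cells distinct from zero payments matters later, because the aggregate measures average over recorded transactions only.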
As an example, consider the 2 × 4 × 4 array D in figure 1. This array consists of two reports, $d_1$ and $d_2$, each covering four clients and four departments. The entry $d_{113} = 7$ indicates that a transaction with a value of 7 is recorded in report 1 between client 1 and department 3. Other entries are missing; for example, the empty cell $d_{111}$ indicates that report 1 records no transaction between department 1 and client 1.4 Also note that the transaction amount between a client and department might be zero. For example, a police officer might provide protection services to clients without receiving any direct payments from them.

4. It is natural for each transaction report $d_t$ to have only one transaction. The examples use numerous transactions for each report for illustrative purposes.

Two additional aspects of the above framework are noted. First, the proposed framework does not track transactions between private parties. The focus is on corruption involving public servants. Second, the framework is designed to record actual transactions, not implicit or counterfactual ones. Thus it does not address passive corrupt acts such as bribe offers (or demands) that are not accepted. Though such passive acts would ideally be captured by an aggregate measure of corruption, data limitations may make them impractical to assess.

II. AN AXIOMATIC APPROACH TO MEASURING CORRUPTION

The axiomatic approach to measuring poverty was pioneered by Sen (1976, 1983) with early application by Foster, Greer, and Thorbecke (1984). The methodology involves two steps: identification and aggregation. When measuring poverty, identification determines who is poor and aggregation maps the data into an overall level of poverty. Similarly, measuring corruption requires explicit identification criteria for determining when a particular transaction is a corrupt transaction and a method of aggregating the resulting data into an overall, scalar measure of corruption.
Identifying poverty typically involves specifying a poverty line based on income or consumption, or on dimension-specific cutoffs in capability, where persons having attainments below the poverty line are considered poor. Analogously, a transaction can be identified as corrupt if the payment received for the government service exceeds an allowable threshold. The question then becomes: what is the appropriate threshold? In some circumstances, it may be sensible to set the threshold at the legal price of the service, so that a transaction is considered to be prima facie corrupt if the payment to the official exceeds this price by any margin.

In general, though, identifying a corrupt transaction might not be so straightforward. For example, in some countries it is acceptable for officials to accept a gift as long as it can be consumed in one day. In others it is acceptable to give cookies to government clerks on their birthdays, a practice that could be indistinguishable from a small bribe. Whether the payment received or the preferential treatment granted warrants defining a transaction as corrupt depends on several factors including the legal price of the service, local culture and habits, type of service provided, and institutional framework.5 Each context could lead to a different threshold representing the associated tolerance for excess payments.

5. Such examples bear some analogy to Sen’s (1976, 1983) discussion of the absolute and relative nature of poverty.

This paper assumes that thresholds vary across services or departments, but are otherwise fixed, and hence can be represented by a vector Z of tolerance thresholds, one for each department. Threshold value $z_s$ is the payment amount beyond which a transaction for a service s is considered corrupt. Thus, transaction $d_{tis}$ from D is considered corrupt if $d_{tis} > z_s$ but not if $d_{tis} \leq z_s$.6

Using thresholds to identify a corrupt transaction means that only the total amount paid is needed to construct corruption measures. So, when gathering information from clients, investigators do not have to ask how much clients spent on bribes or illegal payments, only how much they spent on specific services (a question that the clients might be more willing to answer truthfully). This approach also allows researchers to use tolerance thresholds of zero ($z_s = 0$) to cover cases where clients had to pay for services that they should have received for free.

Aggregation is the next step in constructing a corruption measure. This step maps transactions in D and thresholds in Z into an aggregate level of corruption. Additional information on client income levels, as contained in a client resource vector Y with individual levels $y_i$, might also be used in the aggregation process. Consequently, in this paper, a corruption measure is represented as a scalar valued mapping C(D; Z, Y). The next question considered is: what are the basic properties that a corruption measure should exhibit?

Axioms for Measuring Corruption

The first task in applying the axiomatic approach is to specify a set of properties for corruption measures. Existing and proposed measures can then be evaluated and classified based on the properties they satisfy. This section begins with a set of basic axioms that all corruption measures might be expected to satisfy. Admittedly, any set of basic axioms could be challenged as being too exacting or too lenient. A primary goal of this article is to foster debate that might lead to consensus on a set of basic axioms for measuring corruption.

Consider a generic corruption measure C(D; Z, Y) that uses tolerance vector Z to convert the data in D and client resources vector Y into a corruption metric.
In addition, assume that all transactions are expressed in the same real monetary units, eliminating concerns about inflation or units of measure. Four definitions lay the groundwork for stating the basic axioms:

• D′ is obtained from D by a reordering of client observations if, for some pair j and k of distinct clients, the following holds for all t and s: $d'_{tjs} = d_{tks}$ and $d'_{tks} = d_{tjs}$, while $d'_{tis} = d_{tis}$ for all $i \neq j, k$. In other words, all observations are the same except for two rows (j, k) whose elements have been switched on all T records.

• D′ is obtained from D by a replication of observations if there is an integer $m \geq 2$ such that T′ = mT and D′ = (D, . . . , D), where D′ is an mT × I × S array. In other words, D′ has m copies of the records in D.

• D′ is obtained from D by an increment if $d'_{tis} > d_{tis}$ for a given index combination (t, i, s) and $d'_{ujr} = d_{ujr}$ for all $(u, j, r) \neq (t, i, s)$. The increment is within tolerance if $z_s \geq d'_{tis}$. It is frequency increasing if $d'_{tis} > z_s \geq d_{tis}$. It is excess payment increasing if $d_{tis} > z_s$. An increment occurs when a single payment is increased and all other entries are unchanged. If the payment begins and ends below the tolerance threshold, it is a within tolerance increment. If the payment begins below the threshold and ends above (and hence is now considered a bribe), the increment is frequency increasing. If the payment begins and ends above the threshold, so the size of the bribe is increased, then it is an excess payment increasing increment.

• (D′; Z′, Y′) is obtained from (D; Z, Y) by a proportionate change if (D′; Z′, Y′) = α(D; Z, Y) for α > 0. A proportionate change scales up or down all observations, incomes, and thresholds by the same factor.

6. The identification threshold can be defined as the difference between a payment and $z_s$ to accommodate cases where a client (such as a relative) pays less than the standard price for a service.
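The three kinds of increment defined above can be told apart with a few comparisons. The sketch below is illustrative only; the function name `classify_increment` and its arguments are hypothetical and do not appear in the article.

```python
def classify_increment(old: float, new: float, z: float) -> str:
    """Classify an increase of a single payment from `old` to `new`,
    given the tolerance threshold `z` of the department involved."""
    if not new > old:
        raise ValueError("an increment requires new > old")
    if new <= z:
        # z >= d': the payment starts and ends within tolerance
        return "within tolerance"
    if old <= z:
        # d' > z >= d: the payment crosses the threshold and becomes a bribe
        return "frequency increasing"
    # d > z: the payment was already a bribe and the bribe grows
    return "excess payment increasing"


# With a threshold of 2, a payment moving from 1 to 3 crosses the threshold.
print(classify_increment(1.0, 3.0, 2.0))
```

The three cases are mutually exclusive and cover every increment, which is why each monotonicity axiom below can refer to exactly one of them.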
Basic Axioms

With those definitions in mind, a set of basic axioms can be defined that all corruption measures would be expected to satisfy:

• Client anonymity. If D′ is obtained from D by a reordering of client observations, then C(D′; Z, Y) = C(D; Z, Y).

Client anonymity assures that the client index number does not impact the corruption measure. That is, it ensures that the identity of the private agents involved in corrupt transactions does not affect the resulting corruption measure, in line with the stated goal of capturing official corruption only. Notice, however, that client anonymity does not preclude the use of departmental weights among different government services.

• Replication invariance. If D′ is obtained from D by a replication of observations, then C(D′; Z, Y) = C(D; Z, Y).

Replication invariance ensures that the measure does not depend on the absolute number of corrupt transactions, but rather on the number of corrupt transactions relative to total transactions. That way, the measure does not treat environments with fewer transactions more favorably, but instead measures corrupt transactions relative to the total number of government services provided.

• Focus. If D′ is obtained from D by a within-tolerance increment, then C(D′; Z, Y) = C(D; Z, Y).

The focus axiom ensures that the measure is unresponsive to payment amounts for transactions not involving corruption.

• Frequency monotonicity. If D′ is obtained from D by using a frequency-increasing increment, then C(D′; Z, Y) > C(D; Z, Y).

The frequency monotonicity axiom requires a corruption measure to increase when the value of a transaction crosses the tolerance threshold.

This simple set of axioms disqualifies many potential measures of corruption.
For example, a department headcount ratio that measures corruption as the percentage of departments (or officials) that had accepted at least one bribe would violate the frequency monotonicity axiom, because an additional corrupt transaction in a department that has already accepted a bribe leaves the ratio unchanged.7 Similarly, measures that merely sum all identifiable corruption, such as the amount of money paid for bribes or the number of transactions identified as corrupt, would violate the replication invariance axiom. If corruption were measured as all payments above the legal price, the focus axiom could be violated, while measures that weight the transactions of different clients differently would violate the client anonymity axiom.

Supplementary Axioms

This section introduces a set of supplementary axioms that may be desirable in certain contexts. The axioms can help distinguish among different corruption measures and define more clearly what each measure is capturing.

• Bribery monotonicity. If D′ is obtained from D by an excess payment increasing increment, then C(D′; Z, Y) > C(D; Z, Y).

The bribery monotonicity axiom ensures that the corruption measure increases when the value of a bribe increases.

• Client enrichment. If D contains at least one transaction with positive excess value and $Y' \gg Y$, then C(D; Z, Y′) < C(D; Z, Y).

The client enrichment axiom implies that measured corruption should fall if clients become unambiguously richer and the values of bribes remain the same.

• Decomposability. Let D′ be a data array with the same number of clients and departments as array D and let E = (D, D′) be the array obtained by combining them. If n(D′), n(D), and n(E) are the respective numbers of (non-missing) transactions they contain, then C(E; Z, Y) = [n(D′)/n(E)] C(D′; Z, Y) + [n(D)/n(E)] C(D; Z, Y).

In some cases it is useful to decompose corruption measures by subsets of transactions. For example, policymakers might want to analyze corruption by regions or departments or based on client characteristics.
The decomposability axiom requires the overall corruption level to be the weighted sum of subset corruption levels, where the weights are the shares of transactions in the subsets.

• Scale invariance. If (D′; Z′, Y′) is obtained from (D; Z, Y) by a proportionate change, then C(D′; Z′, Y′) = C(D; Z, Y).

This property requires the measure to view corruption in relative terms, so that if all thresholds, transaction sizes, and resource levels were doubled, the measured level of corruption would be unchanged.

7. Foster (2009) and Alkire and Foster (2011) describe a similar issue in the measurement of chronic and multidimensional poverty.

III. PROPOSED CORRUPTION MEASURES

The previous section demonstrated that the application of even the most basic axioms disqualifies a number of corruption measures. This section introduces a family of corruption measures consistent with all the basic axioms and reasonably compatible with available cross-sectional data.

The measures are defined by fixing a corruption function $f(d_{tis}; z_s, y_i)$, which indicates the corruption level of a single transaction $d_{tis}$ given the departmental tolerance threshold $z_s$ and client resource level $y_i$. This mapping identifies and evaluates corruption at the transaction level. An associated corruption array $D_f$ replaces each transaction $d_{tis}$ with its associated corruption level $f(d_{tis}; z_s, y_i)$ while leaving untouched the empty cells in array D. A corruption measure $C_f$ can then be constructed by taking the simple unweighted mean value of all the nonempty entries in the corruption array, or $C_f(D; Z, Y) = \mu(D_f)$. In other words, $C_f$ is the sum of corruption levels $f(d_{tis}; z_s, y_i)$ for all transactions divided by the number of transactions. Three possible forms of the corruption function and their associated corruption measures are analyzed below.
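Written out, the verbal definition above amounts to the following expression. It is a restatement rather than a formula printed in the article, and it borrows the count of non-missing transactions n(D) from the decomposability axiom.

```latex
C_f(D; Z, Y) \;=\; \mu(D_f) \;=\; \frac{1}{n(D)} \sum_{(t,i,s)\,:\; d_{tis}\ \text{nonempty}} f(d_{tis};\, z_s,\, y_i)
```

Because the sum runs over recorded transactions only, stacking two arrays D and D′ gives a mean weighted by n(D) and n(D′), which is the decomposability property.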
Frequency Measure of Corruption

The first measure is based on the simplest corruption function, which takes a value of 1 when a transaction is corrupt and 0 when it is not. Formally, define $f_1$ by $f_1(d_{tis}; z_s, y_i) = 1$ if $d_{tis} > z_s$ and $f_1(d_{tis}; z_s, y_i) = 0$ if $d_{tis} \leq z_s$, and denote the associated corruption array by $D_1$. The frequency measure of corruption $C_1(D; Z, Y) = \mu(D_1)$ measures corruption as the fraction of transactions that are corrupt. $C_1$ is clearly bounded between 0 and 1, with higher numbers indicative of greater prevalence; it is analogous to the simple head-count ratio from the poverty literature.

As an example, consider the 2 × 4 × 4 array D in figure 1 and suppose that the threshold vector is Z = [$0, $0, $2, $0]. The corruption function $f_1$ is applied to the transactions in D to obtain the corruption array $D_1$ in figure 2, which contains 1 for every transaction higher than its respective cutoff and 0 for every transaction that is not. The number of corrupt transactions (8) is divided by the overall number of transactions (30) to obtain the frequency of corruption $C_1 = 8/30 \approx 0.27$.

FIGURE 2. Corruption Array $D_1$. Source: Authors’ construction.

Excess Value Measure of Corruption

One disadvantage of using frequency measures of corruption is that they do not account for the amount of resources captured by corruption, or the depth of corruption. For example, an economy where 1 in 10 transactions is corrupt has a frequency measure of $C_1 = 0.1$ regardless of whether the typical corrupt transaction involves a bribe of $10 or $1 million.

The depth of corruption can be incorporated into a measure by letting the corruption function be the excess value of a transaction. Define $f_2(d_{tis}; z_s, y_i) = d_{tis} - z_s$ if $d_{tis} > z_s$, and $f_2(d_{tis}; z_s, y_i) = 0$ if $d_{tis} \leq z_s$, and let $D_2$ be the associated corruption array.
Then the excess value measure of corruption $C_2(D; Z, Y) = \mu(D_2)$ evaluates corruption as the extent to which the average transaction exceeds its threshold level. In other words, $C_2$ is the aggregate amount of money paid in bribes divided by the total number of transactions. This measure takes on nonnegative values and is analogous to a poverty gap measure.

In the example, the corruption function $f_2$ and the threshold vector Z = [$0, $0, $2, $0] yield the corruption array $D_2$ given in figure 3, which contains the excess payments to government officials beyond the tolerance thresholds. $C_2$ is then the sum of the entries (20) divided by the number of entries (30), or $C_2 = 20/30 \approx 0.67$.

FIGURE 3. Corruption Array $D_2$. Source: Authors’ construction.

Relative Burden Measure of Corruption

$C_2$ measures the average depth of corruption, but does not take into account the varying resources of clients or the size of the economy. Arguably, a given-sized bribe imposes a larger burden on a client with fewer resources. Similarly, a country in which 10 percent of GDP is spent on corruption may be viewed as more corrupt than one that spends 2 percent, even if the total amount of bribes is the same. $C_2$ and $C_1$ cannot make this distinction.

To account for the relative burden of corrupt transactions, corruption function $f_3$ measures excess payments relative to client resources: $f_3(d_{tis}; z_s, y_i) = (d_{tis} - z_s)/y_i$ if $d_{tis} > z_s$, and $f_3(d_{tis}; z_s, y_i) = 0$ if $d_{tis} \leq z_s$. $D_3$ is the associated corruption array. The relative burden measure of corruption $C_3$ is defined as $C_3(D; Z, Y) = \mu(D_3)$, or the sum of the relative burdens divided by the total number of transactions. The calculation of this measure is similar to the previous examples, and so a separate illustration is omitted. Note that under reasonable assumptions (for instance, that no excess payment exceeds the paying client’s resources) $C_3$ is bounded between 0 and 1.
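The three measures can be computed together from a data array, a threshold vector, and a resource vector. The following is a minimal sketch, assuming empty cells are stored as NaN; the function name `corruption_measures` and the small data set are invented for illustration and are not the arrays shown in the figures.

```python
import numpy as np

def corruption_measures(D, Z, Y):
    """Return (C1, C2, C3) for a T x I x S array D with NaN for empty cells."""
    D = np.asarray(D, dtype=float)
    z = np.asarray(Z, dtype=float)[None, None, :]   # thresholds vary by department
    y = np.asarray(Y, dtype=float)[None, :, None]   # resources vary by client
    observed = ~np.isnan(D)
    n = observed.sum()
    if n == 0:
        raise ValueError("D contains no transactions")
    # Empty cells are set to -inf so that they contribute zero excess below.
    filled = np.where(observed, D, -np.inf)
    excess = np.maximum(filled - z, 0.0)            # f2: excess over threshold
    C1 = (excess > 0).sum() / n                     # share of corrupt transactions
    C2 = excess.sum() / n                           # mean excess payment
    C3 = (excess / y).sum() / n                     # mean excess relative to resources
    return C1, C2, C3

# Hypothetical data: one report, two clients, two departments.
D = [[[3.0, np.nan],
      [0.0, 5.0]]]
Z = [2.0, 1.0]     # tolerance thresholds by department
Y = [10.0, 20.0]   # client resources
C1, C2, C3 = corruption_measures(D, Z, Y)
print(C1, C2, C3)
```

In this invented example two of the three recorded transactions exceed their thresholds, with excess payments of 1 and 4, so the measures are 2/3, 5/3, and (1/10 + 4/20)/3 = 0.1 respectively.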
Weighted Measures of Corruption

The measures presented so far implicitly consider each department to be equally important for measuring corruption. Thus it does not matter for $C_1$, $C_2$, or $C_3$ whether corruption occurs in the department guarding nuclear materials or in a public library. Arguably, corruption measures should have the potential of weighting some departments more heavily than others based on their relative importance in a country’s institutional hierarchy.8

Departments can be differentially weighted using a vector w that assigns weights $w_s$ based on criteria developed by researchers or policymakers. Weights might be determined by subjective evaluations or by objective indicators, such as the percentage of government workers in a department, the percentage of the government budget allocated to a department, or the percentage of government transactions processed by a department. Each measure developed in this section can be modified to incorporate weights during aggregation as follows: let $C_{kw}(D; Z, Y) = \mu_w(D_k)$, where $\mu_w$ is the weighted mean associated with vector w.

Table 1 indicates the basic and supplementary axioms satisfied by the proposed measures $C_1$, $C_2$, $C_3$, and their weighted counterparts. Each of the measures satisfies the basic axioms. $C_2$ and $C_3$ take the size of bribes into account and thus satisfy the bribery monotonicity axiom, while $C_1$ ignores the extent of excess payments and violates it. Neither $C_1$ nor $C_2$ takes into account client incomes, so both violate the client enrichment axiom, while $C_3$ expresses excess payments as a percentage of client income and so satisfies that axiom. Because $C_1$, $C_2$, and $C_3$ are aggregated as means, they all satisfy the decomposability axiom and can all be used for analysis by subsets of transactions.9 Finally, $C_1$ and $C_3$ satisfy the scale invariance axiom while $C_2$ does not. The weighted versions $C_{1w}$, $C_{2w}$, and $C_{3w}$ satisfy the same properties as their unweighted counterparts.

8. See the analogous discussion in Alkire and Foster (2011). It may also make sense to use different weights for different classes of clients, though that is not done here.

9. This feature suggests that each measure is subgroup consistent. For example, if corruption in each region of a country falls, then this must be reflected in a lower national corruption level. See Foster and Sen (1997) for a related discussion in the context of measuring poverty and inequality.

TABLE 1. Do the Corruption Measures Satisfy the Axioms?

                                      Basic     Supplementary axioms
Measure                               axioms    Bribery monotonicity   Client enrichment   Decomposability   Scale invariance
Corruption frequency, C1              Yes       No                     No                  Yes               Yes
Absolute costs of corruption, C2      Yes       Yes                    No                  Yes               No
Relative costs of corruption, C3      Yes       Yes                    Yes                 Yes               Yes
Weighted                              All       C2w and C3w            C3w                 All               C1w and C3w

Source: Authors’ analysis.

IV. AN EMPIRICAL APPLICATION

This section applies data (described below) to estimate the measures of corruption ($C_1$, $C_2$, and $C_3$) developed in the previous section and to show how divergent results from specific measures can be linked to their underlying axiomatic properties. The goal is to show how a measure’s underlying properties determine the results that it yields, not to defend the country corruption rankings obtained.

When considering these three measures, two questions must be addressed. First, what data are required to construct these measures? Second, does each measure have independent value, or do they all correlate regardless of which axioms they satisfy?

Data Requirements

The framework proposed in this article and the measures defined in the previous section require basic data on individual interactions between clients (the public, including representatives of firms) seeking services from government officials or departments (again, used interchangeably in this analysis), and the government officials associated with those services.
These data include the number of interactions that occurred in a specific period, the amounts paid by clients (if any), and the services involved.

This type of information can be obtained through surveys of the public and of company managers, such as those currently conducted by the World Bank’s Enterprise Surveys and the Latin American Public Opinion Project (Seligson 2006; Reinikka and Svensson 2006). These surveys measure corruption based on personal experience, but the questions are phrased so that respondents can avoid potential self-incrimination. For example, a survey question might ask: “How often do firms like yours have to make unofficial payments to public officials to obtain permits?” This type of question makes it possible to obtain a reasonable proxy for the share of transactions with excess payments. The framework proposed in this article, however, makes clear that it is also desirable for these surveys to collect information on the total number of transactions and interactions with the public service provider.

Beyond that basic level, one could also incorporate data on graft and embezzlement of public funds. While such data are difficult to obtain, public expenditure tracking surveys are a good source of information. These surveys are designed to track flows of resources in bureaucracies and thus are ideal for identifying and quantifying political and bureaucratic capture and leakages of funds (Reinikka and Svensson 2006). In addition, external audits can be used to measure the extent of fraud in local governments (Olken 2009; Ferraz and Finan 2008), as can gaps between the incomes and consumption of public officials (Gorodnichenko and Sabirianova-Peter 2007).

Information on graft, embezzlement, and fraud could easily be incorporated into a data array like D (see figure 1).
These types of corruption do not directly involve specific clients but can still be accounted for in the proposed framework by adding a state auditor to the client vector. A similar approach could be used to incorporate information about corruption uncovered by criminal investigations. As data collection on corruption improves, the quality and variety of the data that can be used to create arrays like the one in figure 1 should expand.

Are Measures of Corruption Correlated?

Data from the Business Environment and Enterprise Performance Survey (World Bank and European Bank for Reconstruction and Development 2000) were used to determine whether measures of corruption correlate with one another. The survey covered 4,000 firms in 26 Central and Eastern European countries in 1999–2000, examining a wide range of interactions between firms and government. Based on interviews with firm managers and owners, the survey is designed to generate comparative measurements on topics such as corruption, state capture, lobbying, and the quality of the business environment. The data can then be linked to specific firm characteristics and performance.

Information from this survey was used to calculate values for $C_1$, $C_2$, and $C_3$. Data from the survey are not ideal for constructing these measures, but the survey did make it possible to reconstruct important elements of the D data array, including information on the frequency and monetary value of corrupt acts. For example, question 28 of the survey asks how often firms like the respondent’s need to make unofficial payments to public officials in relation to seven government functions. These functions are: to get connected to utilities like electricity and telephony, obtain licenses and permits, deal with taxes, be awarded government contracts, manage customs and importing requirements, deal with courts, and influence laws and regulations. These functions are treated as the seven departments of the complete D arrays.
In question 28, respondents were asked to estimate how often they had to make unofficial payments (always, mostly, frequently, sometimes, seldom, and never). For the purposes of this example, numerical values were assigned to these answers: 100 percent to always, 80 percent to mostly, 60 percent to frequently, 40 percent to sometimes, 20 percent to seldom, and 0 for never.

Similarly, question 27 asked businesses what percentage of revenues (on average) firms like theirs typically pay annually in unofficial payments to public officials. Possible answers were 0, less than 1 percent, 1–1.99 percent, 2–9.99 percent, 10–12 percent, 13–25 percent, and more than 25 percent. The answers were given numerical values equal to the medians of each group of ranges (except for more than 25 percent, which was capped at 26 percent).

Together with question 27, questions 29 and 51 provide greater details about the magnitude of unofficial payments. Question 51 asked respondents to estimate their firm’s annual sales, assets, and debt to the nearest range, with possible answers ranging from less than $250,000 to $500 million or more. Again, the medians of these ranges were used. In turn, question 29 asked respondents to estimate the share of unofficial payments made at the different government departments. Combining the information from these questions made it possible to estimate the total amount spent on bribes at each department.

Finally, additional information regarding firms’ behaviors and perceptions was incorporated. In particular, question 24 was used, which asked firms what percentage of senior managers’ time was spent dealing with government officials about the application and interpretation of laws and regulations. Other questions asked respondents to report on the likelihood of finding an honest official, the predictability of public policies, and how much of an obstacle corruption was to doing business.
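The coding of questions 27, 28, 29, and 51 described above can be sketched as lookup tables plus one multiplication. This is an illustration under assumptions: the median of each answer range is taken here to be its midpoint, the dictionary keys and the function name `bribes_by_department` are hypothetical, and the sample firm is invented rather than drawn from the survey.

```python
# Question 28: frequency answers mapped to the shares stated in the text.
FREQUENCY_SHARE = {
    "always": 1.0, "mostly": 0.8, "frequently": 0.6,
    "sometimes": 0.4, "seldom": 0.2, "never": 0.0,
}

# Question 27: unofficial payments as a percentage of revenues.
# Values are range midpoints (an assumption); the top range is capped at 26.
BRIBE_SHARE_PCT = {
    "0": 0.0,
    "less than 1": 0.5,
    "1-1.99": 1.495,
    "2-9.99": 5.995,
    "10-12": 11.0,
    "13-25": 19.0,
    "more than 25": 26.0,
}

def bribes_by_department(annual_sales, q27_answer, q29_shares):
    """Estimate the amount paid in bribes at each department.

    annual_sales: representative value of the question 51 sales range
    q27_answer:   key of BRIBE_SHARE_PCT
    q29_shares:   share of unofficial payments going to each department
    """
    total = annual_sales * BRIBE_SHARE_PCT[q27_answer] / 100.0
    return {dept: total * share for dept, share in q29_shares.items()}

# Invented firm: $1 million in sales, 1-1.99 percent paid unofficially,
# split evenly between two of the seven government functions.
example = bribes_by_department(
    1_000_000.0, "1-1.99", {"licenses and permits": 0.5, "customs": 0.5}
)
print(example)
```

Dividing each estimated amount by the firm’s sales would recover the department-level share of revenue, which is the quantity the relative burden measure needs.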
To derive the proposed corruption measures, information on the number of all transactions for specific pairs of clients and departments is needed. This information is not available in the dataset, so it was assumed to be the same for all pairs of clients and departments. The dataset reports how often corrupt payments were made on average, so by averaging these values for all respondents and departments, it was possible to obtain the measure $C_1$.

A similar process was used to construct $C_2$ and $C_3$. In the case of $C_3$, each surveyed firm reported the total excess payments as a percentage of total revenues and the percentage of total excess payments going to each department. From this, the excess payment to department s as a share of firm i’s revenue can be obtained, which is interpreted as $\sum_t (d_{tis} - z_s)/y_i$. The mean value of these aggregate relative burdens is used as the final estimate of $C_3$.10 The process for computing $C_2$ is identical to that for $C_3$, except that total payments $\sum_t (d_{tis} - z_s)$ are used.

10. The $C_2$ and $C_3$ measures described in the previous section use the mean over all transactions. The empirically constructed values in this section use the mean over I × S aggregates. This simplification does not cause any loss of generality because it was assumed that each pair of clients and departments had the same number of transactions. The calculated values for $C_2$ and $C_3$ use a constant (the number of transactions per pair) multiplied by the original values and so preserve the rankings.

The orderings that result from the calculations of the three corruption measures were then compared. As a benchmark, the comparison used Transparency International’s Corruption Perceptions Index for 1999 and 2000. (For two countries the index was not available for 2000.)
Perception indexes are often used to assess aggregate bureaucratic corruption and so offer a natural comparison to the three proposed measures (footnote 11). The index ranges from 0 to 10, with a higher number indicating less corruption. The analysis here inverted the index, so a higher number indicates more corruption.

A Spearman rank correlation matrix of the resulting country rankings is shown in table 2, with countries ranked from the most to least corrupt. Country rankings were used because the scales of the three proposed measures are not directly comparable, though similar conclusions are obtained if their levels are used instead. The table shows a positive and significant rank correlation between C1 (corruption frequency measure) and C3 (relative burden of corruption), and a negative but insignificant rank correlation between C1 and C2 (absolute costs of corruption measure).

Rankings from the Corruption Perceptions Index are positively and significantly correlated with those of C1 and C3 but negatively and significantly (at the 10 percent level) correlated with those of C2. Given that C2 is the only proposed measure that violates the scale invariance axiom, the negative correlations between C2 and all the other measures might be associated with its scale properties. In addition, the positive correlations between C1, C3, and the Corruption Perceptions Index suggest that corruption perceptions may be processed independently of scale. More generally, table 2 suggests that the three measures provide different perspectives on corruption despite having only minor differences in axiomatic properties.

Complete rankings for the entire country sample are shown in table 3, which splits the sample into four groups based on their rankings on the (inverted) Corruption Perceptions Index: low (0–5.9), lower middle (6.0–6.7), upper middle (6.8–7.6), and high (7.7–10.0).
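The rank comparison reported in table 2 rests on Spearman's rank correlation applied after inverting the perceptions index. A minimal sketch follows; the helper names and the three-country data are hypothetical, and the closed-form formula used here assumes there are no tied values.

```python
def ranks_descending(values):
    """Rank 1 goes to the largest value (most corrupt); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman(x, y):
    """Spearman rank correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1)); no ties."""
    n = len(x)
    rx, ry = ranks_descending(x), ranks_descending(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Invented example: a frequency measure and a 0-10 perceptions index
# (higher index = less corruption), inverted so that higher = more corruption.
c1_values = [0.2, 0.5, 0.3]
cpi = [6.0, 2.5, 4.0]
cpi_inverted = [10.0 - v for v in cpi]
rank_correlation = spearman(c1_values, cpi_inverted)
```

Working with ranks rather than levels is what allows measures on different scales (a frequency, a currency amount, a revenue share) to be compared in one matrix.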
One aspect of table 3 is striking: The three measures reveal different corruption patterns for countries with similar rankings on the perceptions index. Consider Armenia and Romania. Both have upper-middle rankings on the Corruption Perceptions Index, but while Armenia is ranked low by C1 and high by C3, Romania is ranked high by C1 and low by C3. These findings suggest that although the Index's measure of perceived corruption would rank both countries similarly, the types of corruption affecting these countries may differ. A detailed analysis of why corruption perceptions deviate from the findings of the axiom-based measures is beyond the scope of this article. The objective here is simply to note that these measures deviate significantly from each other and from perception-based measures and that axiomatic criteria can illuminate the underlying sources of these discrepancies.

11. Other frequently cited perception indexes include the International Country Risk Guide from Political Risk Services and the Institute for Management Development index of corruption. Both are closely correlated with the Corruption Perceptions Index.

TABLE 2. Spearman Correlations between Corruption Measures and Perceptions

Measure                               C1        C2        C3        CPI
Corruption frequency, C1              1
Absolute costs of corruption, C2      -0.27     1
Relative costs of corruption, C3      0.52**    -0.26     1
Corruption Perceptions Index (CPI)    0.63**    -0.37*    0.67**    1

** Significant at the 5 percent level. * Significant at the 10 percent level.
Source: Authors' analysis based on data from World Bank and EBRD (2000) and Transparency International using data from 1999 and 2000.

But can individual measures provide specific insights on corruption? The findings here suggest that axiom-based measures might contribute to key issues in the literature.
Consider the debate on whether corruption facilitates commerce by enabling businesses to circumvent bureaucratic delays or undermines it by weakening public institutions and worsening delays. Some authors have found support for the second hypothesis. Kaufmann and Wei (1999) find a positive and significant relationship between firms' perceptions of corruption and the amount of time they report wasting with bureaucracy (see Meon and Sekkat 2005 for further support of the second hypothesis). But because the authors used measures of perceived corruption, they cannot identify specific factors that shape managerial decisions about time allocation.

The analysis also finds a significant positive correlation between levels of perceived corruption and the time wasted by firms' managers dealing with government officials, as shown in table 4. But a more detailed picture emerges when reviewing the correlation coefficients for the three measures. Time wasted has a positive and significant correlation with C2 and C3, and their coefficients are similar to that for the Corruption Perceptions Index. However, the correlation coefficient between time wasted and C1 is only 0.07 and is not significantly different from zero.

Thus it appears that the prevalence or frequency of corruption has less influence on managerial time allocation decisions than do measures of the depth of corruption. In other words, corruption related to frequent but petty processes (such as paying utility bills or complying with traffic regulations) may be less harmful than, say, corruption related to rigging contracts. And recall that C2 and C3 satisfy the bribery monotonicity axiom and C1 does not.

TABLE 3.
Rankings for Eastern and Central European Countries Based on Corruption Measures, 1999–2000

Group(a)                    Country                   C1    C2    C3
Low (0–5.9)                 Slovenia                   3    24     9
                            Estonia                    4    22    10
                            Hungary                    2    18     7
                            Belarus                    1     2     6
                            Poland                    13    21     3
                            Lithuania                 17    10    14
Lower middle (6.0–6.7)      Latvia                     8     6     2
                            Croatia                    6    19     4
                            Bosnia and Herzegovina    14     7    15
                            Slovakia                  15    15    19
                            Czech Republic             7    20    16
                            Turkey                    21    26     8
                            Macedonia, FYR            23     4    17
                            Bulgaria                  16    13     5
Upper middle (6.8–7.6)      Kazakhstan                10    25    11
                            Uzbekistan                20    17    25
                            Romania                   24     8    12
                            Moldova                   19    16    23
                            Armenia                    9    12    21
High (7.7–10.0)             Russian Federation        11    23    13
                            Albania                   25    14    18
                            Ukraine                   22    11    26
                            Georgia                   18     9    24
                            Azerbaijan                26     5    20
                            Kyrgyz Republic           12     3    22

Columns: C1 = corruption frequency; C2 = absolute costs of corruption; C3 = relative costs of corruption.
Source: Authors' analysis based on data from Transparency International using data from 1999 and 2000.
Note: Groups are defined based on their rankings in Transparency International's Corruption Perceptions Index.
a. The Corruption Perceptions Index, from 0 to 10, is inverted in this table, so a higher number indicates a greater perception of corruption.

TABLE 4. Correlations between Business Variables and Corruption Measures and Perceptions in Eastern and Central Europe

Indicator                     Corruption       Absolute costs of    Relative burden of    Corruption
                              frequency, C1    corruption, C2       corruption, C3        Perceptions Index
Time wasted on bureaucracy    0.07             0.39*                0.34*                 0.36*
Investment                    -0.35*           0.02                 -0.52*                -0.55*

* Significant at the 10 percent level.
Source: Authors' analysis based on data from World Bank and EBRD (2000) and Transparency International using data from 1999 and 2000.

Another issue often addressed in the literature is whether corruption hinders investment and therefore growth. Mauro (1995), for example, reports a negative relationship between aggregate investment levels and aggregated indexes of corruption perceptions.
In the sample, firm managers were asked to estimate how much their investments had increased over the previous three years. The correlations between their answers and the three corruption measures, as well as the Corruption Perceptions Index, are shown in table 4.

The results confirm a negative correlation between investment and corruption perceptions but provide a nuanced assessment of that relationship. The relative burden of corruption, C3, has a negative and significant correlation with investment, and the magnitude is about the same as that for the Corruption Perceptions Index. The corruption frequency measure, C1, has a smaller (though still significant) correlation with investment decisions than does the index. The absolute costs of corruption measure, C2, has essentially no correlation with investment. The implication is that given two otherwise identical countries with the same Corruption Perceptions Index rating, the country with a higher C3 will experience less investment. Such analyses hold the promise of improving our understanding of the corruption-growth relation, by identifying the specific aspects of corruption that hinder investment, and our understanding of properties of the measures that yield these results.

V. CONCLUSION

To the authors' knowledge, this article is the first application of an axiomatic framework to corruption measurement. The main goal is to initiate debate on the explicit properties that are desirable when measuring corruption. To this end, four such properties (the basic axioms) were proposed: client anonymity, replication invariance, focus, and frequency monotonicity. In addition, four supplementary axioms specify properties that may be desirable in specific contexts: bribery monotonicity, client enrichment, decomposability, and scale invariance. The article proposed three measures of corruption and classified them based on their axiomatic properties.
Available data do not permit exact calculations of these measures, but they do allow approximations. These approximations revealed significant discrepancies between measures with distinct axiomatic properties. Indeed, the empirical exercise in the previous section showed that reasonable corruption measures with distinct axiomatic properties may exhibit negative correlations. Such discrepancies highlight the fact that corruption is a multidimensional phenomenon that may be plausibly measured in several distinct ways.

These results call for theoretical and empirical researchers alike to clearly specify their definitions of corruption, the measurement criteria associated with those definitions, and the robustness of their results using alternative measures, because their findings may depend on the definition of corruption used. As the inventory of corruption data expands, our framework for organizing the data, constructing corruption measures and assessing their axiomatic properties, provides criteria for evaluating alternative measures that do not yet exist. In addition, the framework in this article suggests additional survey questions that can make corruption measures more useful and comparable.

The corruption measures in this article are defined for a given period. They do not focus on trends in corruption for specific clients or departments over time. For example, the data from the Business Environment and Enterprise Performance Survey provide information on the number of bribes paid during the period covered, but no indication of how they are distributed over time. In addition, the time interval of surveys on corruption tends to be relatively short (often a year). Such data make it difficult to differentiate between corruption that appears randomly throughout departments and is eradicated and chronic corruption engrained in institutions. Yet the two types may require different policy responses.
Subsequent work will extend the framework presented here to include measures and axioms that distinguish transient from chronic corruption.

REFERENCES

Alkire, Sabina, and James E. Foster. 2011. “Counting and Multidimensional Poverty Measurement.” Journal of Public Economics 95(7–8): 476–87.
Bardhan, Pranab. 2006. “The Economist’s Approach to the Problem of Corruption.” World Development 34(2): 341–8.
Cadot, Olivier. 1987. “Corruption As a Gamble.” Journal of Public Economics 33(2): 223–44.
Choi, Jay P., and Marcel Thum. 2005. “Corruption and the Shadow Economy.” International Economic Review 46(3): 817–36.
Clarke, George R. G., and Lixin Colin Xu. 2004. “Privatization, Competition, and Corruption: How Characteristics of Bribe Takers and Payers Affect Bribes to Utilities.” Journal of Public Economics 88(9–10): 2067–97.
Çule, Monika, and Murray Fulton. 2005. “Some Implications of the Unofficial Economy–Bureaucratic Corruption Relationship in Transition Countries.” Economics Letters 89(2): 207–11.
Ferraz, Claudio, and Frederico Finan. 2008. “Exposing Corrupt Politicians: The Effects of Brazil’s Publicly Released Audits on Electoral Outcomes.” Quarterly Journal of Economics 123(2): 703–45.
Foster, James E. 2009. “A Class of Chronic Poverty Measures.” In T. Addison, D. Hulme, and R. Kanbur, eds., Poverty Dynamics: Interdisciplinary Perspectives. New York: Oxford University Press.
Foster, James E., and Amartya Sen. 1997. “On Economic Inequality: After a Quarter Century.” Annex to the expanded edition of Amartya Sen, On Economic Inequality. Gloucestershire, UK: Clarendon Press.
Foster, James E., and Anthony F. Shorrocks. 1988. “Poverty Orderings.” Econometrica 56(1): 173–7.
Foster, James, Joel Greer, and Erik Thorbecke. 1984. “A Class of Decomposable Poverty Measures.” Econometrica 52(3): 761–6.
Gorodnichenko, Yuriy, and Klara Sabirianova-Peter. 2007. “Public Sector Pay and Corruption: Measuring Bribery from Micro Data.” Journal of Public Economics 91(5–6): 963–91.
Kaufmann, Daniel, and Shang-Jin Wei. 1999. Does Grease Money Speed Up the Wheels of Commerce? NBER Working Paper 7093. Cambridge, Mass.: National Bureau of Economic Research.
Mauro, Paolo. 1995. “Corruption and Growth.” Quarterly Journal of Economics 110(3): 681–712.
Méndez, Fabio, and Facundo Sepúlveda. 2010. “What Do We Talk About When We Talk About Corruption?” Journal of Law, Economics and Organization 26(3): 493–514.
Meon, Pierre-Guillaume, and Khalid Sekkat. 2005. “Does Corruption Grease or Sand the Wheels of Growth?” Public Choice 122(1–2): 69–97.
Olken, Benjamin A. 2009. “Corruption Perceptions vs. Corruption Reality.” Journal of Public Economics 93(7–8): 950–64.
Reinikka, Ritva, and Jakob Svensson. 2006. “Using Micro-Surveys to Measure and Explain Corruption.” World Development 34(2): 359–70.
Seligson, Mitchel A. 2006. “The Measurement and Impact of Corruption Victimization: Survey Evidence from Latin America.” World Development 34(2): 381–404.
Sen, Amartya. 1976. “Poverty: An Ordinal Approach to Measurement.” Econometrica 44(2): 219–31.
Sen, Amartya. 1983. “Poor, Relatively Speaking.” Oxford Economic Papers 35(2): 153–69.
Shleifer, Andrei, and Robert W. Vishny. 1993. “Corruption.” Quarterly Journal of Economics 108(3): 599–617.
Svensson, Jakob. 2003. “Who Must Pay Bribes and How Much? Evidence from a Cross-section of Firms.” Quarterly Journal of Economics 118(1): 207–30.
Transparency International. Various years. http://www.transparency.org. Berlin.
Wolfers, Justin. 2006. “Point Shaving: Corruption in NCAA Basketball.” American Economic Review 96(2): 279–83.
World Bank and EBRD (European Bank for Reconstruction and Development). 2000. Business Environment and Enterprise Performance Surveys. http://data.worldbank.org/data-catalog/BEEPS.

How Much of Observed Economic Mobility is Measurement Error?
IV Methods to Reduce Measurement Error Bias, with an Application to Vietnam

Paul Glewwe

Research on economic growth and inequality inevitably raises issues concerning economic mobility because the relationship between long-run inequality and short-run inequality is mediated by income mobility; for a given level of short-run inequality, greater mobility implies lower long-run inequality. But empirical measures of both inequality and mobility tend to be biased upward due to measurement error in income and expenditure data collected from household surveys. This paper examines how to reduce or remove this bias using instrumental variable methods, and provides conditions that instrumental variables must satisfy to provide consistent estimates. This approach is applied to panel data from Vietnam. The results imply that at least 15 percent, and perhaps as much as 42 percent, of measured mobility is upward bias due to measurement error. The results also suggest that measurement error accounts for at least 12 percent of measured inequality. JEL codes: I30, D30, C81, J60

I. INTRODUCTION

The distribution of income has attracted the attention of economists for centuries. Yet the distribution of income at one point in time may not be the key issue. Instead, long-run or life-cycle inequality may be the object of primary concern. Long-run income is usually more equally distributed than short-run income because over time individuals or households often change their relative position in the short-run distribution of income. This leads to economic mobility, the topic of this paper.

Paul Glewwe (pglewwe@umn.edu) is a professor in the Department of Applied Economics, University of Minnesota, St. Paul, MN 55108. I would like to thank Angus Deaton, Gary Fields, Andrew Foster, Hanan Jacoby, and seminar participants at Columbia University, Fédération Paris-Jourdan, University College London, University of Wisconsin, and the World Bank for useful discussions and comments.
I am also grateful for comments from Elisabeth Sadoulet and three anonymous reviewers. Finally, I would also like to thank INRA-LEA (Fédération Paris-Jourdan) for hospitality in the fall of 2004. A supplemental appendix to this article is available at http://wber.oxfordjournals.org/.

THE WORLD BANK ECONOMIC REVIEW, VOL. 26, NO. 2, pp. 236–264 doi:10.1093/wber/lhr040 Advance Access Publication November 15, 2011 © The Author 2011. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

Short-run inequality, long-run inequality and mobility are closely related. To see why, consider a common inequality index, the variance of the log of income, and a mobility index that examines income at two time periods: $m(y_1, y_2) \equiv 1 - \rho(\ln(y_1), \ln(y_2))$, where $y_1$ and $y_2$ are income in the first and second periods, respectively, and $\rho(\cdot)$ is the correlation coefficient. The intuition for this mobility index is that high correlation of income over time reduces income mobility. Suppose there are only two time periods, so that $\mathrm{Var}(\ln(y_1 + y_2))$ is long-run inequality. If changes in income over time are modest, so that $y_2/y_1$ is not far from 1, and inequality is fairly stable, so that $\mathrm{Var}(\ln(y_1)) \approx \mathrm{Var}(\ln(y_2))$, then:

$$\frac{\mathrm{Var}(\ln(y_1 + y_2))}{\mathrm{Var}(\ln(y_1))} \approx 1 - m(y_1, y_2)/2.$$

(This is proved in Appendix 1.) In this simple two-period example, the ratio of long-run inequality over short-run inequality is (approximately) a function of mobility, indeed, only of mobility; for any level of short-run inequality, greater mobility reduces long-run inequality. Thus, shifting concern for equity from short-run to long-run inequality leads directly to mobility.

Since economic mobility examines changes in incomes over time, empirical work requires panel data.
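A heuristic sketch of why the two-period approximation linking long-run inequality to mobility holds is given below. The formal proof is in the paper's Appendix 1, which is not reproduced here; the first-order expansion around $y_2/y_1 = 1$ and the shorthand $d$ are assumptions of this sketch and need not match the appendix's route.

```latex
% Sketch only: first-order expansion around y_2 / y_1 = 1 (d is shorthand introduced here).
\begin{align*}
\ln(y_1 + y_2) &= \ln y_1 + \ln\!\left(1 + e^{d}\right),
  \qquad d \equiv \ln y_2 - \ln y_1, \\
\ln\!\left(1 + e^{d}\right) &\approx \ln 2 + \tfrac{d}{2}
  \quad \text{for } d \text{ near } 0, \\
\ln(y_1 + y_2) &\approx \ln 2 + \tfrac{1}{2}\left(\ln y_1 + \ln y_2\right), \\
\mathrm{Var}(\ln(y_1 + y_2)) &\approx \tfrac{1}{4}\left[\mathrm{Var}(\ln y_1)
  + \mathrm{Var}(\ln y_2) + 2\,\mathrm{Cov}(\ln y_1, \ln y_2)\right] \\
&\approx \tfrac{1}{2}\,\mathrm{Var}(\ln y_1)\left[1 + \rho(\ln y_1, \ln y_2)\right]
  \quad \text{using } \mathrm{Var}(\ln y_1) \approx \mathrm{Var}(\ln y_2), \\
&= \mathrm{Var}(\ln y_1)\left[1 - m(y_1, y_2)/2\right]
  \quad \text{since } \rho = 1 - m.
\end{align*}
```

With no mobility ($m = 0$) long-run and short-run inequality coincide, and with uncorrelated log incomes ($m = 1$) long-run inequality is roughly half of short-run inequality.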
A serious problem with empirical work on mobility is that household survey data are likely to measure income with error, which exaggerates the extent of both mobility and inequality at any point in time. This is the case even in developed countries; see Bound and others (2001) for a detailed discussion of the problem, and of potential solutions, that focuses on the U.S. and other developed countries.

While mobility studies a decade ago usually ignored measurement error (Fields and Ok 1999a; Gardiner and Hills 1999; Gottschalk 1997; Gottschalk and Spolaore 2002; Maasoumi and Trede 2001), more recent empirical work does address it (footnote 1). First, there is the earnings dynamics literature. Yet this literature has its own limitations. For example, both Abowd and Card (1989) and Meghir and Pistaferri (2004) assume that measurement error is serially uncorrelated and uncorrelated with earnings and hours. While the latter uses a less restrictive model of earnings dynamics, it obtains only an upper bound of the effect of measurement error on estimated mobility.

Second, a few papers use employer data to assess measurement error in survey data on earnings (Pischke 1995; Dragoset and Fields 2006; Gottschalk and Huynh 2006). But they are limited to U.S. data, and it is unclear whether their results apply to other countries, other types of income, or expenditure data. Indeed, in developing countries many workers are self-employed; for them no employer data exist to assess the extent of measurement error in their reported incomes.

1. A related literature focuses on the impact of measurement error on indicators of inequality and poverty; see Chesher and Schluter (2002) for a theoretical analysis and an application to Indonesian data.

Finally, three very recent papers examine mobility in developing countries.
Two use instrumental variable (IV) methods to address measurement error bias in household survey data from Latin America (Antman and McKenzie 2007 and Fields and others 2007). The third examines transitions in and out of poverty, which is closely related to mobility, using panel data from South Korea (Lee and others 2010).

This paper adds to the literature in four ways. First, it presents IV methods that can be used for correlation-based mobility measures, which are consistent with axioms that mobility measures should satisfy. IV methods do have limitations, but they require neither explicit models of income dynamics nor data that are amenable to validation from employers or other sources. Second, the IV methods presented in this paper can be implemented using only two periods of panel data, while other methods require three or more time periods. For example, the method of Lee and others (2010) requires panel data that have at least four time periods. Third, this paper characterizes the conditions under which IV methods reduce measurement error bias in estimates of mobility and inequality, including bounds on those estimates when some assumptions do not hold. Fourth, this paper provides estimates of mobility for a low income Asian nation: Vietnam.

The paper first discusses how to measure mobility, and then shows how to reduce measurement error bias for mobility indices based on the correlation of income over time. The method is then applied to data from Vietnam. The results suggest that at least 15 percent, and perhaps much more, of measured mobility in expenditures per capita is due to measurement error, and the same is true for at least 12 percent of measured inequality.

II. ECONOMIC MOBILITY: CONCEPTS AND MEASUREMENT

Economic mobility focuses on changes in individual or household incomes over time, yet the term “mobility” has many meanings (Fields and Ok 1999b). This paper focuses on relative income mobility.
Relative mobility indices focus on changes in income shares, not changes in income. Thus, growth in everyone's income at the same rate yields the same (relative) mobility as no change in anyone's income (in each case, shares do not change): no mobility at all.

Shorrocks (1993) presents axioms that relative mobility indices should satisfy. The key axiom has the following intuition: For a group of people observed at two periods of time, mobility increases if one person whose income is higher than another's in both periods switches income with the other person in one of the two periods. (Switching incomes in both periods is pointless; it yields the original situation of one person being richer than the other in both periods.) This axiom, proposed by Atkinson and Bourguignon (1982), focuses on mobility over time, not the distribution of income in one time period; indeed, switching cannot change the income distribution in either period. The intuition for this “Atkinson-Bourguignon condition” is that the switch equalizes the distribution of life cycle income, just as a Pigou-Dalton transfer (from a richer to a poorer person) reduces inequality at one point in time.

Relative mobility indices that satisfy the Atkinson-Bourguignon condition are either derived from inequality indices or social welfare functions or are based on the correlation coefficient of a function of income. Shorrocks (1993) discusses the former. Regarding the latter, relative mobility is defined as $1 - \rho(f(y_1), f(y_2))$, where $\rho(\cdot)$ is the correlation coefficient and $f(\cdot)$ is any function that is increasing in income ($f'(\cdot) > 0$). Examples are one minus the correlation coefficient (that is, $f(y) = y$), the Hart (1981) index ($f(y) = \ln(y)$), and one minus the rank correlation coefficient ($f(y) = \mathrm{rank}(y)$). Any mobility measure defined as $1 - \rho(f(y_1), f(y_2))$, where $f'(\cdot) > 0$, satisfies the Atkinson-Bourguignon condition (see Appendix 1).
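The three correlation-based indices named above, and the income switch behind the Atkinson-Bourguignon condition, can be illustrated with a short Python sketch. The four-person income vectors and all function names are hypothetical, and the rank helper assumes there are no ties.

```python
from math import log, sqrt


def pearson(x, y):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Ascending ranks (1 = poorest); assumes no tied incomes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = float(position)
    return result


def mobility(y1, y2, f):
    """Relative mobility 1 - rho(f(y1), f(y2)) for an increasing transform f."""
    return 1.0 - pearson(f(y1), f(y2))


def levels(y):
    return list(y)               # f(y) = y


def logs(y):
    return [log(v) for v in y]   # f(y) = ln(y), the Hart index


# Invented incomes for four people in two periods.
y1 = [10.0, 20.0, 30.0, 40.0]
y2 = [12.0, 18.0, 33.0, 45.0]

# Person 3 is richer than person 2 in both periods; switch their period-2 incomes.
y2_switched = [12.0, 33.0, 18.0, 45.0]

before = {f.__name__: mobility(y1, y2, f) for f in (levels, logs, ranks)}
after = {f.__name__: mobility(y1, y2_switched, f) for f in (levels, logs, ranks)}
```

In this example the switch raises all three indices, as the condition requires, while multiplying every period-2 income by a common factor leaves the levels-based index at zero because income shares are unchanged.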
Different mobility measures may give different results because they emphasize different aspects of mobility, such as mobility among the poor, or among the rich. To check for robustness in empirical work, one should use several measures, examples of which were given in the previous paragraph. One can also use an “exponential family” of mobility measures: $m(y_1, y_2) = 1 - \rho(y_1^{\alpha}, y_2^{\alpha})$, with $\alpha > 0$. These indices satisfy the Atkinson-Bourguignon condition, since $f'(y) = \alpha y^{\alpha - 1} > 0$, and as $\alpha$ increases, this family of measures is increasingly sensitive to mobility at high incomes (see Appendix S1 in the supplemental material, available at http://wber.oxfordjournals.org/).

A final issue is: How do correlation-based mobility measures differ from the measure used in Antman and McKenzie (2007) and Fields and others (2007)? Those papers use $\beta$ in the equation $y_2 = \alpha + \beta y_1 + u_2$, where $\mathrm{Cov}(y_1, u_2) = 0$. A higher $\beta$ indicates less mobility so, like $\rho(y_1, y_2)$, $\beta$ measures immobility. If $\mathrm{Var}(y_1) = \mathrm{Var}(y_2)$ then $\beta = \rho(y_1, y_2)$, so $\beta$ is equal to the correlation-based measure (when both use the same transform of $y$, e.g., $\ln(y)$). More generally, $\beta = \rho(y_1, y_2)\,\sigma_{y_2}/\sigma_{y_1}$, so for a positive correlation $\mathrm{Var}(y_1) < \mathrm{Var}(y_2)$ implies that $\beta > \rho(y_1, y_2)$, and $\mathrm{Var}(y_1) > \mathrm{Var}(y_2)$ implies that $\beta < \rho(y_1, y_2)$. While $\beta$ does satisfy the Atkinson-Bourguignon condition, it may give misleading results when $\mathrm{Var}(y_1) \neq \mathrm{Var}(y_2)$. The intuition is that, if $y_2 = \alpha + \beta y_1 + u_2$, then for a given $\beta$ (and a given $\mathrm{Var}(y_1)$), an increase in $\mathrm{Var}(u_2)$ will not change measured mobility, since $\beta$ is fixed, even though it reduces $\rho(y_1, y_2)$ and thus implies more off-diagonal observations in transition matrices for income of the type shown below for Vietnam (Table 2). A more formal exposition is in the online Appendix S1.

III. MEASURING MOBILITY IN THE PRESENCE OF MEASUREMENT ERROR

All indices of relative mobility tend to exaggerate mobility if income is measured with error.
Fortunately, one can use IV methods to address this problem for correlation-based measures. This section explains the problem, and how to reduce or remove it.

A. Bias Due to Measurement Error. Studies of economic mobility typically use income or expenditure data from household surveys. Anyone who has seen how such data are collected realizes that they contain many errors, and validation studies (Bound and Krueger 1991; Pischke 1995) have verified this. Intuitively, measurement error causes measured mobility to overestimate true mobility because fluctuations in measured income due to measurement error are treated as actual income fluctuations.

More formally, consider correlation-based mobility indices. The goal is to estimate $m(y_1^*, y_2^*) = 1 - \rho(f(y_1^*), f(y_2^*))$, where asterisks denote “true” income, so the task at hand is to estimate $\rho(f(y_1^*), f(y_2^*))$. For simplicity, let $f(y^*) = y^*$ (footnote 2). The correlation coefficient of two variables is equal to the covariance of those variables divided by the product of their standard deviations. Thus $\rho(y_1^*, y_2^*)$ can be expressed as $\sigma_{y_1^*,y_2^*}/(\sigma_{y_1^*}\sigma_{y_2^*})$, where $\sigma_{y_1^*,y_2^*}$ denotes covariance and $\sigma_{y_1^*}$ and $\sigma_{y_2^*}$ denote standard deviations. Let observed income in time periods 1 and 2 be $y_1 = y_1^* + u + \varepsilon_{y_1}$ and $y_2 = y_2^* + u + \varepsilon_{y_2}$, respectively, where $\varepsilon_{y_1}$, $\varepsilon_{y_2}$ and $u$ are white noise errors; note that $u$ allows the measurement error in $y$ to be correlated over time.
This implies that the correlation of observed income, which can be denoted by $\rho(y_1, y_2)$, satisfies the following approximation:

$$\rho(y_1, y_2) = \frac{\sigma_{y_1^*,y_2^*} + \sigma_u^2}{\sqrt{(\sigma_{y_1^*}^2 + \sigma_u^2 + \sigma_{\varepsilon_{y_1}}^2)(\sigma_{y_2^*}^2 + \sigma_u^2 + \sigma_{\varepsilon_{y_2}}^2)}} \approx \frac{\sigma_{y_1^*,y_2^*} + \sigma_u^2}{\sigma_{y_1^*}^2 + \sigma_u^2 + \sigma_{\varepsilon_{y_1}}^2}. \qquad (1)$$

If $\sigma_u^2 = 0$ (measurement error is uncorrelated over time), the second term in (1) is less than $\rho(y_1^*, y_2^*)$. Intuitively, $\varepsilon_{y_1}$ and $\varepsilon_{y_2}$ add noise to $y_1^*$ and $y_2^*$, raising observed mobility. If $\sigma_u^2 > 0$, so that measurement error is correlated over time, the numerator and the denominator in the second term in (1) exceed their respective terms in $\rho(y_1^*, y_2^*)$, so that $\rho(y_1, y_2)$ can overestimate $\rho(y_1^*, y_2^*)$. Yet this occurs only if measurement errors are more correlated than income itself (see online Appendix S1), which is very doubtful. In fact, U.S. validation studies (Bound and Krueger 1991; Pischke 1995) find that measurement error is much less correlated over time than earnings. Finally, if measurement errors are linearly correlated with unobserved income the correlation of observed income still underestimates the true correlation (online Appendix S1) (footnote 3). Henceforth, all measurement errors are assumed to be uncorrelated with $y_1^*$ and $y_2^*$; this implicitly includes linearly correlated errors.

2. The analysis applies to any function $f(y^*)$ if measurement error in $y^*$ causes observed $f(y^*)$ to equal $f(y^*)$ plus an additive error.

3. Intuitively, the component of the error that is linearly correlated with $y_1^*$ (or $y_2^*$) amounts to multiplying $y_1^*$ ($y_2^*$) by a constant, which does not affect $\rho(y_1^*, y_2^*)$.

B. Instrumental Variable Estimation of $\rho(y_1, y_2)$. IV methods can provide estimates of $\rho(y_1^*, y_2^*)$ that remove, or at least reduce, measurement error bias. To
see how this works, note that, for an ordinary least squares (OLS) regression of a variable $x_1$ on a constant and another variable, $x_2$, the OLS coefficient for $x_2$ is a consistent estimate of the covariance of the two variables divided by the variance of $x_2$, which can be denoted as $\sigma_{x_1,x_2}/\sigma_{x_2}^2$. Similarly, regressing $x_2$ on $x_1$ consistently estimates $\sigma_{x_1,x_2}/\sigma_{x_1}^2$. The product of these two coefficients is $\sigma_{x_1,x_2}^2/(\sigma_{x_1}^2\sigma_{x_2}^2)$, the square of the correlation coefficient. Thus, for any two variables, OLS regression can be used to consistently estimate their correlation coefficient. This implies that $\rho(y_1^*, y_2^*)$ can be estimated using OLS regression, since

$$\mathrm{plim}\left[\sqrt{\beta_{1LS}\,\beta_{2LS}}\right] = \rho(y_1^*, y_2^*) \qquad (2)$$

where $\beta_{1LS}$ is the OLS (slope) coefficient from regressing $y_1^*$ on $y_2^*$, and $\beta_{2LS}$ is the OLS coefficient from regressing $y_2^*$ on $y_1^*$. Of course, OLS regressions using $y_1$ and $y_2$ yield estimates of $\rho(y_1, y_2)$, not $\rho(y_1^*, y_2^*)$, since measurement errors in the observed variables lead to biased estimates of those coefficients. Yet IV methods can be used to remove bias due to measurement error. More precisely, if credible instruments can be found one can use IV methods to consistently estimate $\beta_{1LS}$, $\beta_{2LS}$, and thus consistently estimate $\rho(y_1^*, y_2^*)$.

As explained above, this approach requires two regressions, one of $y_1^*$ on $y_2^*$ and another of $y_2^*$ on $y_1^*$. In fact, if the variance of income is the same in both time periods (that is, if $\sigma_{y_1^*} = \sigma_{y_2^*}$), one regression is sufficient. This is evident from noting that, for an OLS regression of $y_2^*$ on $y_1^*$ and a constant, the coefficient on $y_1^*$ ($\beta_{2LS}$) consistently estimates $\sigma_{y_1^*,y_2^*}/\sigma_{y_1^*}^2$. If the variance of income is unchanged over time, so that $\sigma_{y_1^*}$ equals $\sigma_{y_2^*}$, then this coefficient also consistently estimates $\sigma_{y_1^*,y_2^*}/(\sigma_{y_1^*}\sigma_{y_2^*})$, which is the correlation coefficient for $y_1^*$ and $y_2^*$.
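The sample analogue of the identity behind equation (2) can be checked on any two series. The sketch below uses invented data and hypothetical function names; because the square root discards the sign, it assumes the two series are positively correlated.

```python
from math import sqrt


def ols_slope(y, x):
    """Slope coefficient from regressing y on a constant and x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def correlation_from_slopes(y1, y2):
    """sqrt(b1 * b2), where b1 regresses y1 on y2 and b2 regresses y2 on y1."""
    b1 = ols_slope(y1, y2)
    b2 = ols_slope(y2, y1)
    # The product b1 * b2 equals the squared sample correlation.
    return sqrt(b1 * b2)


# Invented incomes in two periods.
y1 = [10.0, 20.0, 30.0, 40.0]
y2 = [12.0, 18.0, 33.0, 45.0]
rho_hat = correlation_from_slopes(y1, y2)
```

Applied to observed rather than true incomes, the same calculation returns the correlation of observed income, which is why the text turns to instruments for the two slopes.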
Indeed, one regression may be sufficient even if $\sigma_{y_1^*} \neq \sigma_{y_2^*}$; the regression coefficient, which equals $\sigma_{y_1^*,y_2^*}/\sigma_{y_1^*}^2$, can be transformed into the correlation coefficient ($\sigma_{y_1^*,y_2^*}/(\sigma_{y_1^*}\sigma_{y_2^*})$) by multiplying it by $\sigma_{y_1^*}/\sigma_{y_2^*}$. While one cannot estimate $\sigma_{y_1^*}/\sigma_{y_2^*}$ directly, since $y_1^*$ and $y_2^*$ are unobserved, one plausible assumption is that the measurement errors in $y_1^*$ and $y_2^*$ are a fixed proportion of their true variances. In this case $\sigma_{y_1}/\sigma_{y_2} = \sigma_{y_1^*}/\sigma_{y_2^*}$, so $\sigma_{y_1^*}/\sigma_{y_2^*}$ can be consistently estimated using the variances of observed income in each time period.

The IV approach consistently estimates $\rho(y_1^*, y_2^*)$ only if suitable instruments can be found. To see the issues involved in finding suitable instruments, consider estimation of $\beta_1$ and $\beta_2$ in the following two equations:

$$y_1^* = \alpha_1 + \beta_1 y_2^* + u_1 \qquad (3)$$

$$y_2^* = \alpha_2 + \beta_2 y_1^* + u_2 \qquad (4)$$

where, by definition, $\mathrm{Cov}(y_2^*, u_1) = \mathrm{Cov}(y_1^*, u_2) = 0$. Let $z_1$ and $z_2$ be the instrumental variables for $y_1$ and $y_2$ (the observed values of $y_1^*$ and $y_2^*$), respectively. The IV estimate of $\rho(y_1^*, y_2^*)$, denoted $\rho_{IV}(y_1, y_2)$, is $\sqrt{\beta_{1IV}\,\beta_{2IV}}$, where $\beta_{1IV}$ and $\beta_{2IV}$ are the IV estimates of $\beta_1$ and $\beta_2$ (see Bowden and Turkington 1984). But a disturbing result appears if one attempts to estimate the correlation of $z_1$ and $z_2$ using $y_1$ and $y_2$ as the instruments for $z_1$ and $z_2$: it is easy to show that this IV estimate of $\rho(z_1, z_2)$ also equals $\sqrt{\beta_{1IV}\,\beta_{2IV}}$. So does $\rho_{IV}(y_1, y_2)$ estimate $\rho(y_1^*, y_2^*)$ or $\rho(z_1, z_2)$? The answer depends on the nature of instruments. The following subsections consider three distinct types of instruments: second measurements of $y_1^*$ and $y_2^*$, variables that “cause” $y_1^*$ and $y_2^*$, and variables “caused by” $y_1^*$ and $y_2^*$.

C. Instruments that are Second Measurements. Turning to the first possibility, second measurements can be defined as second efforts to measure a variable of interest.
For household survey data, the ideal approach would be to return to the household to administer some or all of the questions in the household questionnaire a second time. If instruments are second measurements, it does not matter that the IV estimates for r(y1*, y2*) and r(z1, z2) are identical: the correlation of z1 and z2 simply reflects the correlation of y1* and y2*.

Consistent IV estimation of r(y1*, y2*) using second measurements requires strong, even unrealistic, assumptions on the measurement errors. Fortunately, less restrictive assumptions can provide informative bounds on r(y1*, y2*). Formally, let z1 and z2 be second measurements, with error, of y1* and y2*. The measurement errors in y1, y2, z1 and z2 can be expressed as follows:

$$y_1 = y_1^* + u_f + u_{t1} + u_{m1} + e_{y1} \qquad (5)$$

$$y_2 = y_2^* + u_f + u_{t2} + u_{m1} + e_{y2} \qquad (6)$$

$$z_1 = y_1^* + u_f + u_{t1} + u_{m2} + e_{z1} \qquad (7)$$

$$z_2 = y_2^* + u_f + u_{t2} + u_{m2} + e_{z2}. \qquad (8)$$

Assume that all nine u and e error terms are white noise. The first, uf, is household specific and does not vary over time or by the type of measurement. There are also two time-specific measurement errors: ut1 is common to both measurements in the first time period, and ut2 is analogously defined for the second period. Moreover, there are measurement errors that pertain to the type of measurement (y is one type and z is the other); um1 affects the first type of measurement in both time periods, and um2 is analogously defined for the second measurement. Finally, the four e terms are purely idiosyncratic errors. This decomposition allows a wide variety of correlation across the (aggregate) measurement errors of these four variables.

The variances and covariances of y1, y2, z1 and z2 can be used to solve for the variances of um1, um2 and the four e terms, but not the variances of y1* and y2* ($s^2_{y_1^*}$ and $s^2_{y_2^*}$), nor for their covariance ($s_{y_1^*,y_2^*}$).4

4. See Appendix 2 for the proof.

Indeed, one can show that the IV estimator removes these “solved” terms:

$$\operatorname{plim}[r_{IV}(y_1, y_2)] = \frac{s_{y_1^*,y_2^*} + s^2_{uf}}{\sqrt{(s^2_{y_1^*} + s^2_{uf} + s^2_{ut1})(s^2_{y_2^*} + s^2_{uf} + s^2_{ut2})}} \approx \frac{s_{y_1^*,y_2^*} + s^2_{uf}}{s^2_{y_1^*} + s^2_{uf} + s^2_{ut1}}. \qquad (9)$$

The middle expression in (9) shows how different assumptions regarding $s^2_{uf}$, $s^2_{ut1}$, and $s^2_{ut2}$ lead to bias in rIV(y1, y2). If $s^2_{uf} = 0$, that is, there is no “fixed” measurement error that is constant across both time and the two measurements, and if $s^2_{ut1} = s^2_{ut2} = 0$, that is, there is no common error for measurements at the same time, then rIV(y1, y2) consistently estimates r(y1*, y2*). Alternatively, if $s^2_{uf} = 0$ but there is an error that is specific to one or both time periods ($s^2_{ut1} > 0$ and/or $s^2_{ut2} > 0$), then rIV(y1, y2) underestimates r(y1*, y2*). Conversely, if $s^2_{uf} > 0$ but there are no time-specific errors ($s^2_{ut1} = s^2_{ut2} = 0$), then rIV(y1, y2) overestimates r(y1*, y2*). Finally, if $s^2_{uf}$, $s^2_{ut1}$, and $s^2_{ut2}$ are all positive, then rIV(y1, y2) could underestimate or overestimate r(y1*, y2*). This last case is most plausible. Some households may underreport (or exaggerate) income or expenditure at any time for any measurement ($s^2_{uf} > 0$), and real economic or social conditions in any year could affect all measurements in that year, which implies $s^2_{ut1} > 0$ and $s^2_{ut2} > 0$.

In fact, rIV(y1, y2) is very likely to underestimate r(y1*, y2*), but perhaps not by as much as r(y1, y2) underestimates r(y1*, y2*). To see why underestimation is likely, consider the alternative possibility, that rIV(y1, y2) overestimates r(y1*, y2*). From equation (9), this occurs only if $s^2_{uf}/(s^2_{ut1} + s^2_{uf}) > r(y_1^*, y_2^*)$.
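The overestimation condition just stated can be derived from the final expression in (9) in a few steps. The derivation below is a sketch that assumes, as that expression does, that the variances are the same in both periods, and that all variances are positive; the shorthand V, C, f and t is introduced here only for readability and does not appear in the paper.

```latex
\begin{align*}
&\text{Let } V = s^2_{y_1^*},\quad C = s_{y_1^*,y_2^*},\quad
  f = s^2_{uf},\quad t = s^2_{ut1},
  \quad\text{so that } r(y_1^*, y_2^*) = C/V. \\[4pt]
&\frac{C + f}{V + f + t} > \frac{C}{V}
  \;\Longleftrightarrow\; V(C + f) > C(V + f + t) \\
&\qquad\Longleftrightarrow\; V f > C(f + t) \\
&\qquad\Longleftrightarrow\; \frac{f}{f + t} > \frac{C}{V} = r(y_1^*, y_2^*).
\end{align*}
```

In words, the IV estimate can exceed the true correlation only when the household-fixed error accounts for a share of the fixed-plus-time-specific error variance that is larger than the true correlation itself.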
Recall that the correlation coefficient for observed income almost certainly underestimates the true correlation, that is, r(y1*, y2*) > r(y1, y2). Thus overestimation occurs only if $s^2_{uf}/(s^2_{ut1} + s^2_{uf}) > r(y_1, y_2)$, which implies (after rearranging terms) that $s^2_{uf} > [r(y_1, y_2)/(1 - r(y_1, y_2))]\, s^2_{ut1}$. U.S. earnings data suggest that the autocorrelation of the aggregate error is < 1/3; Pischke’s (1995) validation study finds a value of only 0.094 for two periods four years apart. Combining this with the assumptions that the variances of ut1 and ut2 are approximately equal, and that the same is true of the variances of ey1 and ey2,5 implies that $s^2_{uf} < s^2_{ut1}/2 + s^2_{ey1}/2 - s^2_{um1}$. Both $s^2_{um1}$ and $s^2_{ey1}$ are identified, and one can estimate r(y1, y2) from the data. Thus one can plot both inequalities, $s^2_{uf} > [r(y_1, y_2)/(1 - r(y_1, y_2))]\, s^2_{ut1}$ and $s^2_{uf} < s^2_{ut1}/2 + s^2_{ey1}/2 - s^2_{um1}$, in $s^2_{uf}$–$s^2_{ut1}$ space to see the values of $s^2_{uf}$ and $s^2_{ut1}$ that satisfy them. It may be that no values of $s^2_{uf}$ and $s^2_{ut1}$ satisfy them (as seen below, for Vietnam they hold only for very small values), in which case rIV(y1, y2) cannot overestimate r(y1*, y2*). This check is simple to apply. If it indicates that rIV(y1, y2) is unlikely to overestimate r(y1*, y2*), and if r(y1, y2) < rIV(y1, y2), then rIV(y1, y2) provides a lower bound on r(y1*, y2*) that is higher than that given by (the sample estimate of) r(y1, y2).

5. Assuming that $s^2_{ut1} \approx s^2_{ut2}$ and $s^2_{ey1} \approx s^2_{ey2}$, and that the autocorrelation of the aggregate measurement error is < 1/3, implies that $(s^2_{uf} + s^2_{um1})/(s^2_{uf} + s^2_{ut1} + s^2_{um1} + s^2_{ey1}) < 1/3$. Rearranging terms gives the result in the text.

Finally, the second measurement approach has another advantage, which is that it can be used to remove at least part of the measurement error in observed inequality at a given point in time.
To see how, note that equations (5) and (6) imply that $\mathrm{Var}(y_1) = s^2_{y_1^*} + s^2_{uf} + s^2_{ut1} + s^2_{um1} + s^2_{ey1}$ and $\mathrm{Var}(y_2) = s^2_{y_2^*} + s^2_{uf} + s^2_{ut2} + s^2_{um1} + s^2_{ey2}$. More importantly, equations (5)–(8) imply that $\mathrm{Cov}(y_1, z_1) = s^2_{y_1^*} + s^2_{uf} + s^2_{ut1}$, and that $\mathrm{Cov}(y_2, z_2) = s^2_{y_2^*} + s^2_{uf} + s^2_{ut2}$. Comparing these two sets of equations, Var(y1) and Var(y2) overestimate Var(y1*) and Var(y2*) by the sum of the variances of all four components of the measurement error, yet the covariance terms exclude the contribution to this bias of the last two of these four components. The variance of (the log of) income (or expenditures) is a useful inequality index,6 so these covariance terms provide an upper bound estimate on inequality that is lower than the (over)estimate of inequality obtained from measuring the variance of (the log of) observed income.

D. Instruments that “Cause” Income: A Dubious Choice. Now turn to the second case, where z1 “causes” y1* and z2 “causes” y2*. Suppose y1* and y2* are generated by y1* = g1 + d1 z1 + v1 and y2* = g2 + d2 z2 + v2, where v1 is uncorrelated with z1 and v2 is uncorrelated with z2. Using these z variables as instruments is very unlikely to lead to a consistent estimate of r(y1*, y2*).7 Indeed, adding the fairly innocuous assumption that Cov(v1, z2) = Cov(v2, z1) = 0 implies that rIV(y1, y2) estimates r(z1, z2), not r(y1*, y2*), and nothing is gained by relaxing this covariance assumption. See Appendix 2 for details. The intuition here is that v1 and v2 in the processes that generate y1* and y2* add to the variance, and perhaps the covariance, of y1* and y2*, but neither the variances nor the covariance of v1 and v2 is captured by z1 and z2. Thus all variables that “cause” y1* and y2* lack crucial information that is needed to estimate r(y1*, y2*).

6. While the variance of the log of income is a convenient measure of inequality, it does have one theoretical flaw, which is that it does not satisfy the Pigou-Dalton condition (that transfers from one person with a high income to another with a lower income reduce inequality) in the upper tail of the income distribution. Nevertheless, in practice it yields results very similar to those based on measures that do not have this flaw, and so it is a useful indicator of inequality that is often used in the earnings dynamics literature.

7. This is similar to a result in Antman and McKenzie (2007). They also show that using instruments obtained from a process in which instruments “cause” income (see their equation (8)) yields inconsistent estimates of mobility.

E. Instruments that Are “Caused” by Income. Finally, consider the third case, where the instruments z1 and z2 are “caused by” y1* and y2*. Assume that the process generating the data is

$$z_1 = k_1 + p_1 y_1^* + w_1 \qquad (10)$$

$$z_2 = k_2 + p_2 y_2^* + w_2 \qquad (11)$$

where w1 is uncorrelated with y1* and w2 is uncorrelated with y2*. If Cov(y1*, w2) = Cov(y2*, w1) = 0, then

$$\operatorname{plim}[r_{IV}(y_1, y_2)] = \frac{\sqrt{p_1 p_2}\; s_{y_1^*,y_2^*}}{\sqrt{p_1 p_2}\,\sqrt{s^2_{y_1^*}\, s^2_{y_2^*}}} = r(y_1^*, y_2^*). \qquad (12)$$

Appendix 2 proves this, and Appendix S2 shows that the zero correlation assumptions imply that z1 and z2 in (10) and (11) meet the requirement that instruments be uncorrelated with the error in the equation of interest: Cov(u1, z2) = Cov(u2, z1) = 0. The intuition here is that, unlike variables that cause y1* and y2*, variables caused by y1* and y2* are potentially valid instruments because they include all the variation and covariation of y1* and y2*.

Of course, other variables may also cause z1 and z2; they are in the error terms w1 and w2. These “omitted variables” may be correlated with y1* and y2*; if so, this leads to biased estimates of r(y1*, y2*).
Yet if the correlation of w1 and y1* (and w2 with y2*) is linear, omitting these variables does not cause any bias; it simply changes the estimates of p1 and p2, which cancel out in the estimate of r(y1*, y2*), as seen in (12).

The assumption that Cov(y1*, w2) = 0 and Cov(y2*, w1) = 0 is difficult to test. Yet if the impacts of y1* and y2* on z1 and z2, respectively, do not persist over time, it may be that Cov(y1*, w2) = 0, since lack of persistence implies that w2 does not reflect past values of y2* (one of which is y1*). In contrast, if there is persistence it follows that Cov(y1*, w2) > 0, which causes overestimation of b1, and so of r(y1*, y2*). Even so, such persistence does not imply that Cov(y2*, w1) ≠ 0 unless z1 has a “causal” effect on y2*.

A final problem with variables caused by y1* and y2* as instruments is that (10) and (11) are linear. What if equation (11) were, say, quadratic in y2*? Then IV estimates of b1 in (3) using z2 in (11) to instrument y2* are inconsistent if, in (3), Cov(u1, (y2*)²) ≠ 0 (see Appendix S2). Similarly, if equation (10) is non-linear, then IV estimates of b2 in (4) using z1 to instrument y1* are inconsistent if Cov(u2, (y1*)²) ≠ 0. Thus if E[y1* | y2*] is non-linear in y2* and the causal impact of y2* on z2 is non-linear, or if E[y2* | y1*] is non-linear in y1* and the causal impact of y1* on z1 is non-linear, then plim[rIV(y1, y2)] ≠ r(y1*, y2*). This implies that all four relationships must be checked for non-linearity. If both of the first pair, that is, equations (3) and (11), are non-linear, then z2 should be transformed so that equation (11) becomes linear, and if both of the second pair, that is, equations (4) and (10), are non-linear, then z1 should be transformed so that (10) is linear.

How can one check for non-linearity, since y1*, (y1*)², y2* and (y2*)² are unobserved, and using their observed counterparts leads to attenuation bias?
Fortunately, under certain conditions one can check linearity using observed variables. Specifically, if the coefficient on (y1*)² ((y2*)²) in a regression of any variable on y1* and (y1*)² (y2* and (y2*)²) is zero, then regressing that variable on the observed values y1 and y1² (y2 and y2²) will yield a zero coefficient on y1² (y2²) if both the measurement error ey1 (ey2) and y1* (y2*) are symmetric. Also, regardless of whether ey1 (ey2) and y1* (y2*) are symmetric, if the coefficient on (y1*)² ((y2*)²) is not zero then the coefficient on y1² (y2²) is also nonzero; and if ey1 (ey2) and y1* (y2*) are symmetric and the coefficient on (y1*)² ((y2*)²) is not zero, then the coefficient on y1² (y2²) has the same sign as the coefficient on (y1*)² ((y2*)²). These symmetry conditions can be checked via y1 (y2), since if y1* and ey1 (y2* and ey2) are symmetric, so is y1 (y2). Of course, symmetry of y1 (y2) does not guarantee that both y1* and ey1 (y2* and ey2) are symmetric, but it is difficult to imagine a scenario where neither of the unobserved variables is symmetric but their sum is symmetric.

To summarize, instrumental variables should be either second measurements of the income variable or variables caused by income.8 Instruments that cause income yield inconsistent estimates. Estimates using second measurements may be biased, but such bias is very likely to underestimate the correlation of income in the two time periods, that is, underestimate r(y1*, y2*), and so is very likely to overestimate mobility. When using instruments caused by income, if the impact of income on the instrument persists over time the IV estimate is likely to overestimate r(y1*, y2*), and thus it would underestimate mobility.
Finally, when using instruments caused by income one should also check for non-linearity in the key relationships; it leads to inconsistency, but this problem can be addressed by transforming the instrument to generate a more linear relationship in the equations that show how income “causes” that instrument (i.e., equations (10) and (11)).

8. Lewbel (1997) proposed a method to generate instruments to address problems of measurement error bias, but his method works only for estimating structural relationships, and equations (3) and (4) are not structural.

IV. MOBILITY IN VIETNAM IN THE 1990’S

A. Background. Vietnam provides a good opportunity to study mobility. It was one of the world’s poorest countries in the 1980s. In the 1990s, its high rate of GDP growth (8 percent) reduced its poverty rate from 58 percent in 1992–93 to 37 percent in 1997–98 (Glewwe, Agrawal and Dollar 2004). Yet Vietnam also experienced a small increase in inequality; the Gini coefficient on per capita expenditure rose from 0.33 to 0.35.

Another reason to study Vietnam is data availability. The 1992–93 Vietnam Living Standards Survey (VLSS) covered a nationwide sample of 4,800 households. The 1997–98 VLSS surveyed 6,000 households, including 4,305 of those in the 1992–93 VLSS. Both collected extensive data on many topics. This paper uses the consumption expenditure data to measure welfare, as Deaton (1997) recommends. The VLSS also collected income data, but such data tend to be less accurate, and microeconomic theory measures utility in terms of consumption, not income. Finally, the VLSS also collected data on the height and weight of all household members. For more details on the VLSS, see World Bank (1995, 2000).

One concern is attrition bias. All but 96 (2.0 percent) of the 4,800 households surveyed in 1992–93 were to be included in 1997–98 (Table 1).
In 1997–98, interviewers returned to the dwellings of these 4,704 households and interviewed households that remained within their villages. They did not attempt to reinterview those that left their villages. Of the 4,704 households, 4,305 were reinterviewed in 1997–98, a retention rate of 91.5 percent. However, some of these 4,305 have weak links to the original household. For example, for 21 dwellings, the head of the household in 1992–93 was not a member in 1997–98 and the head in 1997–98 was not a member in 1992–93. These are excluded from the panel, yielding a retention rate of 91.1 percent. A stricter definition of panel households is that at least half the people who were members in either year were members in both years; this removes 436 households, yielding an 81.8 percent retention rate. The second panel of Table 1 shows that, for the variables used in the analysis, the means (in 1992–93) for the panel households are very similar to the means for all households (including attriters), which suggests that attrition bias is unlikely to be a serious problem.

B. Mobility Uncorrected for Measurement Error. Mobility indices summarize in one number the joint distribution of income or expenditure at two points in time. Their values are not intuitive, so Table 2 shows (relative) transition matrices for Vietnam from 1992–93 to 1997–98. It groups households by per capita expenditure quintiles (poorest 20 percent, next poorest 20 percent, etc.). For robustness, both VLSS panel data samples (that is, using the two different definitions of a panel household) are shown.

Table 2 seems to show substantial mobility. For each sample, only 41 percent of the population did not change quintiles after five years; 40 percent moved up or down by one quintile and 19 percent moved two or more quintiles.
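The shares quoted above can be recomputed directly from the counts in the first panel of Table 2 (“head is the same” sample). The following sketch copies those counts from the table; the function name and dictionary keys are illustrative choices, not part of the paper.

```python
import numpy as np

# Individuals by 1992-93 quintile (rows) and 1997-98 quintile (columns),
# "head is the same" panel, copied from Table 2.
counts = np.array([
    [2186, 1148,  689,  332,   45],
    [1069, 1366, 1182,  613,  146],
    [ 501,  936, 1169, 1244,  501],
    [ 163,  569, 1038, 1471, 1073],
    [  48,  153,  440,  937, 2544],
])

def shares_by_quintile_move(matrix):
    """Share of individuals whose quintile changed by 0, 1, or 2+ steps."""
    rows, cols = np.indices(matrix.shape)
    step = np.abs(rows - cols)
    total = matrix.sum()
    return {
        "stayed": float(matrix[step == 0].sum() / total),
        "moved_one": float(matrix[step == 1].sum() / total),
        "moved_two_plus": float(matrix[step >= 2].sum() / total),
    }

shares = shares_by_quintile_move(counts)
# Share of the poorest 1992-93 quintile found in a higher quintile in 1997-98.
left_poorest = float(1 - counts[0, 0] / counts[0].sum())

for name, value in shares.items():
    print(f"{name}: {value:.1%}")
print(f"left poorest quintile: {left_poorest:.1%}")
```

The same function can be applied to the second panel of Table 2 by replacing the `counts` array.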
Thus, ignoring measurement error, one could argue that Vietnam’s modest rise in inequality in the 1990’s is of less concern because low expenditure levels seem to be temporary for many households. Indeed, half of those in the poorest quintile in 1992–93 appear to have left that quintile by 1997–98.

Table 3 quantifies the apparent mobility in Table 2 with mobility indices based on correlations of functions of per capita expenditure.9 If expenditure is positively correlated over time, measured mobility will lie between 0 (no mobility) and 1 (complete mobility; expenditure is uncorrelated over time). All but one of the indices give similar results, from 0.278 to 0.331. Recalling the transition matrices, this range indicates substantial mobility.

9. In Tables 3, 4 and 5, the household is the unit of observation.

TABLE 1. Panel Attrition from 1992–1993 to 1997–1998

| | Households | Individuals |
|---|---|---|
| 1992–93 households | 4,800 | 23,838 |
| Excluded from 1997–98 survey | 96 (2.0%) | 421 (1.8%) |
| All household members moved | 399 (8.3%) | 1,769 (7.4%) |
| Remaining households | 4,305 (89.7%) | 21,648 (90.8%) |
| Among remaining 4,305 households: | | |
| Head is the same in both years | 4,284 (89.3%) | 21,571 (90.5%) |
| 50% or more members are the same in both years, plus 6 “natural” cases | 3,848 (80.2%) | 19,145 (80.3%) |

Means (in 1992–93) of Variables Used in the Analysis

| | All households | Panel households only: head same | Panel households only: 50+% members same |
|---|---|---|---|
| Per capita expenditures | 1393.6 | 1355.6 | 1340.7 |
| Log(per capita expenditures) | 7.051 | 7.033 | 7.022 |
| Household size | 4.97 | 5.04 | 4.98 |
| Average body mass index (adults) | 19.46 | 19.44 | 19.45 |

Notes:
1. The 96 households excluded from the 1997–98 survey were all from the Red River Delta region. Those households were dropped because the 1997–98 survey oversampled some regions, but not the Red River Delta, so the 1997–98 survey required slightly fewer households from that region than did the 1992–93 survey.
2. The six natural cases refer to households in which no one moved in or out of the household in the past five years, but death or birth led to cases where the number of household members present in both years was less than 50 percent of the individuals who were members in either year. Examples are a household with three adults in 1992–93 of whom two had died by 1997–98, and a household with a married couple in 1992–93 that had had three children by 1997–98.
3. The individual figures are the number of household members in 1992–93 in the associated groups of households. When individuals who were no longer household members in 1997–98 are excluded, the number of individuals who were members in the 3,848 households in both years is 16,750, which is 70.3 percent of the individuals originally surveyed in all 4,800 households in 1992–93.

TABLE 2. Transition Matrix of Per Capita Expenditures: Vietnam, 1992–93 to 1997–98

Head Is the Same (rows: 1992–93 quintile; columns: 1997–98 quintile)

| 1992–93 quintile | 1 | 2 | 3 | 4 | 5 | Row total |
|---|---|---|---|---|---|---|
| 1 | 2186 (10.2%) | 1148 (5.3%) | 689 (3.2%) | 332 (1.5%) | 45 (0.2%) | 4400 (20.4%) |
| 2 | 1069 (5.0%) | 1366 (6.3%) | 1182 (5.5%) | 613 (2.9%) | 146 (0.7%) | 4376 (20.3%) |
| 3 | 501 (2.3%) | 936 (4.4%) | 1169 (5.4%) | 1244 (5.8%) | 501 (2.3%) | 4351 (20.2%) |
| 4 | 163 (0.8%) | 569 (2.6%) | 1038 (4.8%) | 1471 (6.8%) | 1073 (5.0%) | 4314 (20.0%) |
| 5 | 48 (0.2%) | 153 (0.7%) | 440 (2.0%) | 937 (4.3%) | 2544 (11.8%) | 4122 (19.1%) |
| Column total | 3967 (18.4%) | 4172 (19.3%) | 4513 (21.0%) | 4597 (21.3%) | 4309 (20.0%) | 21,563 (100.0%) |

50% or More of Members Are the Same (rows: 1992–93 quintile; columns: 1997–98 quintile)

| 1992–93 quintile | 1 | 2 | 3 | 4 | 5 | Row total |
|---|---|---|---|---|---|---|
| 1 | 2007 (10.5%) | 1058 (5.5%) | 620 (3.3%) | 242 (1.3%) | 33 (0.2%) | 3960 (20.7%) |
| 2 | 909 (4.8%) | 1302 (6.8%) | 1088 (5.7%) | 566 (3.0%) | 113 (0.6%) | 3978 (20.8%) |
| 3 | 463 (2.4%) | 874 (4.6%) | 1077 (5.6%) | 1127 (5.9%) | 402 (2.1%) | 3943 (20.6%) |
| 4 | 131 (0.7%) | 492 (2.6%) | 924 (4.8%) | 1333 (6.9%) | 876 (4.6%) | 3756 (19.6%) |
| 5 | 36 (0.2%) | 111 (0.6%) | 385 (2.0%) | 800 (4.2%) | 2168 (11.3%) | 3500 (18.2%) |
| Column total | 3546 (18.6%) | 3837 (20.0%) | 4094 (21.4%) | 4068 (21.2%) | 3592 (18.8%) | 19,137 (100.0%) |

Notes:
1. All numbers and percentages are in terms of individuals, not households. The number of individuals refers to household members in 1992–93, which is slightly smaller than the number in Table 1 because three households are missing expenditure data either in 1992–93 or 1997–98.
2. Column and row totals are not exactly 20 percent because the quintile classification is defined with respect to all households, not just the panel households.

TABLE 3. Estimated Mobility in Per Capita Expenditures, Ignoring Measurement Error

| Mobility index | Head same sample | 50% threshold sample |
|---|---|---|
| 1 − r(y1, y2) | 0.309 | 0.299 |
| 1 − r(√y1, √y2) | 0.292 | 0.278 |
| 1 − r(y1², y2²) | 0.395 | 0.394 |
| 1 − r(rank(y1), rank(y2)) | 0.331 | 0.316 |
| 1 − r(ln(y1), ln(y2)) | 0.298 | 0.282 |
| Number of households | 4281 | 3845 |

TABLE 4. Estimated Mobility, Ignoring Measurement Error, Using an Expenditure Variable for which Two Measurements Are Available

Mobility index: 1 − r(ln(y1), ln(y2))

| Expenditure variable (log of per capita expenditure) | 1992–93 mean | 1992–93 variance | 1997–98 mean | 1997–98 variance | Mobility, head same sample | Mobility, 50% threshold sample |
|---|---|---|---|---|---|---|
| Full measurement (sum over all items) | 6.868 | 0.300 | 7.532 | 0.281 | 0.341 | 0.327 |
| 1st measurement only (half of items) | 6.824 | 0.397 | 7.445 | 0.345 | 0.435 | 0.432 |
| 2nd measurement only (other items) | 6.839 | 0.380 | 7.540 | 0.331 | 0.428 | 0.414 |
| Number of households | | | | | 4281 | 3845 |

Note: The expenditure variable used here differs from that in Tables 1, 2 and 3 in that it excludes housing, utilities, health, education, and in kind wages.

As explained below, the second measurement method to reduce measurement error bias cannot be applied to a few components of expenditure (housing, utilities, health, education, and in kind wages). These components account for about 20 percent of total expenditures. When using the second measurement approach to assess how much r(y1, y2) underestimates r(y1*, y2*), the estimates of r(y1, y2) and m(y1, y2) must use an expenditure variable that excludes these components. Information on this version of the expenditure variable is shown in the first line of Table 4, which for simplicity shows only the results based on the log functional form.10 The estimated mobility, 0.341 (“head same” sample), is 14 percent higher than that in Table 3 based on “full” total expenditure (0.298); this small difference implies that the excluded components have somewhat lower mobility. Table 4 also shows mobility estimates using the two separate measurements (the details of which are explained below). They show much higher mobility; intuitively, they are noisier estimates of (log) household expenditure.
The variance of the (log) expenditure variable (Table 4) is also of interest since it is an inequality index. It almost certainly overestimates inequality, so it is useful to see how much this bias can be reduced.

10. Note that in Table 3, the log functional form gave results similar to those of most of the other functional forms. The log functional form also has the advantage that its distribution is almost symmetric, which is useful for checking the linearity of equations (3), (4), (10), and (11). Henceforth, the analysis will focus on this functional form.

C. Corrected Correlation Coefficients. The mobility estimates in Tables 2, 3, and 4 ignore measurement error and so very likely overestimate mobility. This subsection uses instrumental variable (IV) methods to minimize this bias for the mobility index 1 − r(ln(y1), ln(y2)), using two types of instruments, second measurements and adult body mass index (BMI).11

How can one construct a second measurement of expenditure? The VLSS expenditure variable is the sum of ten separate components. Five of these ten components are themselves the sum of many items: food expenditures on 18 items during major holidays, food expenditures on 45 items during the rest of the year, nonfood expenditures on 14 small items (past two weeks), nonfood expenditures on 50 larger items (last year), and estimated rental values for 26 types of durable goods. The other five components, which together constitute only 20 percent of total expenditure, are housing, utilities, health, education, and in kind wages. The method for obtaining a second measurement can be used only on components that are the sum of many items, so the latter five components are excluded from the expenditure variable for the analysis based on IVs that are second measurements.

The procedure to obtain two distinct measurements of expenditures for four of the five categories was, for each category, to divide all items into two comparable subgroups.
Intuitively, each subgroup is a (noisy) measurement of expenditure for that category (after inflating expenditure on items in each subgroup by the inverse of the ratio of spending on that subgroup over spending on both subgroups). Subgroups were created as follows. Within each category, rank all items by mean expenditure. Assign the item with the most expenditure to subgroup 1, assign the next two (second and third highest) to subgroup 2, assign the next two (fourth and fifth) to subgroup 1, and so on. The intuition for why these constitute separate measurements is that, in each category, each of the two subgroups is a “sample” of the expenditures for that category.

11. One shortcoming of using adult body mass index as an instrument is that most household surveys do not collect height and weight data. Yet more surveys now do so, including some of the World Bank’s LSMS surveys, the Rand Corporation’s Family Life Surveys, and Oxford University’s Young Lives surveys.

The one exception to this procedure, for “non-holiday” food expenditure, reflects both a problem applying it to Vietnam and additional information available in the VLSS. For non-holiday food, two questions were asked: 1) Amount spent, by item, since the first interview (about two weeks earlier); and 2) A set of questions (how many months out of the last 12 months the item was purchased, frequency of purchase in those months, and value of a typical purchase) that approximate the past 12 months’ expenditure. Thus two separate measurements already exist for non-holiday food expenditure. This is useful since rice is by far the dominant food staple in Vietnam, so assigning it to one measurement yields a far higher variance in the other measurement. Thus for non-holiday food purchases the two measurements are the two-week and 12-month recalls.
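The item-splitting rule described above (rank items by mean expenditure, then assign them in the order 1, 2, 2, 1, 1, 2, 2, ...) can be sketched in a few lines. This is an illustration only: the item names and spending data are hypothetical, and the inflation factor is computed from sample-wide spending shares, which is an assumption about how the ratio in the text is measured rather than something the text specifies.

```python
import numpy as np

def split_items(mean_spending):
    """Assign items to subgroups in the order 1, 2, 2, 1, 1, 2, 2, ...

    mean_spending maps item name -> mean expenditure; items are ranked
    from highest to lowest mean expenditure before assignment.
    """
    ranked = sorted(mean_spending, key=mean_spending.get, reverse=True)
    groups = {1: [], 2: []}
    for position, item in enumerate(ranked):
        group = 1 if ((position + 1) // 2) % 2 == 0 else 2
        groups[group].append(item)
    return groups

def two_measurements(spending, items, groups):
    """Return two inflated category totals per household.

    spending is a households-by-items array whose columns follow `items`.
    Each subgroup total is scaled by the inverse of that subgroup's share
    of sample-wide spending (an assumed reading of the inflation step).
    """
    total = spending.sum()
    measurements = []
    for g in (1, 2):
        cols = [items.index(name) for name in groups[g]]
        sub = spending[:, cols]
        inflate = total / sub.sum()
        measurements.append(sub.sum(axis=1) * inflate)
    return measurements

# Hypothetical category with seven items and five households.
example_means = {"a": 70, "b": 60, "c": 50, "d": 40, "e": 30, "f": 20, "g": 10}
example_groups = split_items(example_means)

rng = np.random.default_rng(0)
items = list(example_means)
spending = rng.gamma(2.0, 10.0, size=(5, len(items)))
m1, m2 = two_measurements(spending, items, example_groups)
print(example_groups)
```

With this scaling, each of the two measurements sums to the same sample total as the full category, so they are comparable in level while being built from disjoint sets of items.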
This was not done for food items consumed from own production; this has only a 12-month recall, so such consumption was divided into two subgroups, using the method described above.12

IV mobility estimates using the second measurement approach are in the top half of Table 5. To estimate b1 in equation (3), the observed values of y1 and y2 are those for one type of measurement, and the other measurement for y2 is used as the instrument for y2 in that specification; b2 in equation (4) is estimated analogously. (Recall from subsection III.B that reversing which measurement is used in the equations and which is used as the instrument does not change estimated mobility.) Estimated mobility is 0.291 for the head same sample; this is 15 percent lower than the estimate in Table 4 using observed expenditure (0.341). The result for the stricter definition of panel households is similar. Such estimates are very likely upper bounds on true mobility,13 so more than 15 percent of observed mobility in Vietnam is just measurement error.

As explained in Section III, the variances and covariances of the two measurements of expenditures also provide upper bounds on inequality. Applying this to Vietnam, observed inequality in Table 4 (“full” measurement) was 0.300 in 1992–93 and 0.281 in 1997–98. Yet Cov(y1, z1) is 0.265 and Cov(y2, z2) is 0.248. The former is 12 percent less than 0.300 and the latter is 12 percent less than 0.281, so at least 12 percent of observed inequality, and perhaps much more, is upward bias due to measurement error.

The second instrument, the body mass index (BMI) of adults age 18 and older, is defined as weight (kilograms) over height (meters) squared. It is arguably caused by expenditure since it measures a person’s weight given his or her height; wealthier people eat more and so are heavier. In the VLSS data, 65–70 percent of adults are of normal weight or overweight, and another 25–30 percent are moderately underweight.
Only 4 percent are severely underweight. This suggests little causal feedback from BMI to current expenditure.14 Moreover, any measurement error in BMI is unlikely to be correlated with expenditure measurement errors. Indeed, the VLSS height and weight data were collected by a person different from the one who filled out the household questionnaire. Yet if BMI is a stock of health, y1* could be positively correlated with w2 (because w2 would include past BMI, which is partly determined by y1*), leading to overestimation of b1 in equation (3).

12. The two week vs. 12 month approach was not used for expenditure on infrequently purchased non-food items since about one third of the households report not purchasing such items in the last 2 weeks, and no non-food item dominates the way rice dominates food expenditures. Also, the division of all items into two subgroups was repeated using a different assignment rule (highest expenditure to group 1, second highest to group 2, third highest to group 1, etc.). The results were very similar to those presented in Tables 4 and 5.

13. The two inequalities presented in subsection III.C, combined with estimates of 0.023 for $s^2_{um1}$ and 0.104 for $s^2_{ey1}$, imply that rIV(y1, y2) overestimates r(y1*, y2*) only if $s^2_{uf} < 0.033$ and $s^2_{ut1} < 0.016$. This is not only a small range for those variances but also implies very small measurement errors.

14. Feedback from weight to income could occur if low nourishment lowers adults’ work capacity (the efficiency wage hypothesis). This claimed lack of feedback does not rule out a causal effect of adult height on household income. Height reflects nutrition in early childhood, while BMI reflects current nutritional status.

TABLE 5. Estimated Mobility of Per Capita Expenditures Using Instrumental Variables

| Instrumental variable | Head same sample | 50% threshold sample |
|---|---|---|
| Second measurements of expenditure: | | |
| b1 | 0.770 (0.030) | 0.802 (0.031) |
| b2 | 0.653 (0.031) | 0.657 (0.030) |
| √(b1 b2) | 0.709 (0.022) | 0.726 (0.022) |
| m(y1, y2) | 0.291 [0.853] | 0.274 [0.838] |
| Sample size | 4281 | 3845 |
| Body Mass Index (BMI): | | |
| b1 | 1.001 (0.047) | 1.006 (0.046) |
| b2 | 0.801 (0.093) | 0.836 (0.100) |
| √(b1 b2) | 0.895 (0.053) | 0.917 (0.056) |
| m(y1, y2) | 0.105 [0.352] | 0.083 [0.294] |
| Sample size | 4274 | 3834 |

Notes:
1. All results set f(y) = ln(y), so the mobility index is 1 − r(ln(y1), ln(y2)).
2. Numbers in parentheses are standard errors (delta method used for √(b1 b2)).
3. Numbers in brackets show estimated mobility as a fraction of the estimate of mobility obtained when measurement error is ignored (as given in Table 3 for the BMI results and Table 4 for the second measurement results).
4. For the estimates based on Body Mass Index (BMI), BMI has strong predictive power in the first-stage regressions. In particular, in the regression of 1992–93 log expenditures on the 1993 BMI variable (and a constant), the t-statistic on the BMI variable is 5.42, which implies an F-statistic of 29.4. For the analogous regression for 1998, the t-statistic for 1998 BMI is 11.02, which implies an F-statistic of 121.5.

A final issue is the linearity of equations (3), (4), (10), and (11). Recalling Section III, consistent estimation of b1 using causal instruments requires that either (3) or (11) be linear, and to estimate b2 consistently either (4) or (10) must be linear. Observed log per capita expenditure is close to symmetric in both years (see Figure 1), which implies that linearity can be checked by using observed values of expenditure. Unfortunately, adding squared terms yields statistically significant coefficients for equations (4), (10), and (11). Thus z1 (BMI in 1992–93) in (10) must be transformed to obtain a linear relationship. A nonparametric regression of that equation yields a slightly convex function (see the upper graph in Figure 2).
It was made more linear by transforming BMI in a way that reduces all values above its median.15 Regressing the transformed BMI on log per capita expenditure and its square yielded a t-statistic of 1.83 for the squared term, indicating a more linear function. Figure 2 shows kernel (nonparametric) regressions of BMI, and transformed BMI, on log per capita expenditures. Both graphs show that BMI gradually increases with per capita expenditures, with transformed BMI (the lower graph) showing a somewhat more linear relationship.

15. This was done in two steps. First, BMI was rescaled as 19.5 + 0.2*(BMI − 19.5) if BMI > 19.5. Then this new variable was also rescaled as 20 + 0.5*(BMI − 20) if the new variable was > 20. This rescaling was based on the shape of the nonparametric regression of BMI on per capita expenditures.

FIGURE 1. Density of Observed Log Per Capita Expenditures, 1992–93 and 1997–98

Table 5 presents estimates of $\beta_1$, $\beta_2$, $\sqrt{\beta_1 \beta_2}$, and mobility using BMI (averaged over adults) as an instrument. The log function implies that $\mathrm{Var}(y_1^*)$ should be close to $\mathrm{Var}(y_2^*)$, so both $\beta_1$ and $\beta_2$ should be approximately equal to $\rho(y_1^*, y_2^*)$. Yet the estimate of $\beta_1$, 1.00, is doubtful; indeed, it implies zero mobility. Recall from subsection III.C that this may reflect a persistent effect of $y_1^*$ on BMI, which implies that $\mathrm{Cov}(y_1, w_2) > 0$, which in turn leads to overestimation of $\beta_1$. Ignoring this, mobility is estimated at 0.105 for the “head same” sample and 0.083 for the “50 percent threshold” sample, so that two thirds of measured mobility is measurement error.

The finding that most of measured mobility is measurement error is doubtful, as is the result that $\beta_1$ is much larger than $\beta_2$. It is likely that past household expenditures affect current BMI since a person’s weight is a stock that reflects previous weight; this implies upward bias in $\beta_1$ (recall the discussion in subsection III.B). Yet it does not imply upward bias in estimating $\beta_2$, which estimates $\rho(y_1^*, y_2^*)$ if $\mathrm{Var}(y_1^*) \approx \mathrm{Var}(y_2^*)$. The estimates of $\beta_2$ suggest that $\rho(y_1^*, y_2^*)$ is between 0.801 and 0.836, so mobility is between 0.164 and 0.199. This implies that 33 percent to 42 percent of estimated mobility from observed expenditure (as given in Table 3) is measurement error.

FIGURE 2. Kernel Regressions of Body Mass Index on the Log of Per Capita Expenditures, 1992–93

Instruments caused by $y_1^*$ and $y_2^*$ also yield an upper bound on inequality. First, note that equations (5) and (6) imply that $\mathrm{Cov}(y_1, y_2) = \mathrm{Cov}(y_1^*, y_2^*) + \mathrm{Var}(u_f) + \mathrm{Var}(u_{m1})$. Dividing both sides by $\rho(y_1^*, y_2^*)$ implies that $\mathrm{Cov}(y_1, y_2)/\rho(y_1^*, y_2^*)$ is equal to $\sqrt{\mathrm{Var}(y_1^*)\mathrm{Var}(y_2^*)} + (\mathrm{Var}(u_f) + \mathrm{Var}(u_{m1}))/\rho(y_1^*, y_2^*)$. While this expression still overestimates inequality (averaged over both years), $\mathrm{Cov}(y_1, y_2)/\rho(y_1^*, y_2^*)$ for the “head same” sample is 0.287, which is about 12 percent less than the estimate of 0.328 based on $\sqrt{\mathrm{Var}(y_1)\mathrm{Var}(y_2)}$. The result for the “50 percent threshold” sample indicates a 14 percent reduction. Thus 12–14 percent of measured inequality, and perhaps much more, is due to measurement error.

In summary, estimates that use second measurements as instruments find that at least 15 percent of observed mobility, and at least 12 percent of measured inequality, in Vietnam is due to measurement error.
Estimates using BMI as an instrument, which require somewhat stronger assumptions, indicate that 33–42 percent of observed mobility, and at least 12–14 percent of observed inequality, is measurement error. These findings are not in conflict, but if they were, one should probably place more confidence in the second measurement results, since that method requires weaker assumptions.

V. CONCLUSION

Vietnam’s rapid economic growth in the 1990s coincided with a modest rise in inequality. One could downplay the higher inequality by noting that panel data show substantial economic mobility, so that Vietnam’s long-run distribution of expenditure is more equal than its distribution in any year. Yet household survey data almost certainly overestimate mobility due to substantial measurement error in observed income or expenditures. This paper shows how instrumental variable methods can provide estimates of economic mobility that reduce bias due to such measurement error. Application to Vietnamese data shows that at least 15 percent, and perhaps even a third, of observed mobility is due to measurement error. While these reduced estimates of mobility may disappoint those concerned about long-run inequality in Vietnam, there is one encouraging result: measurement error also implies that observed inequality overestimates actual inequality. Indeed, analysis of the Vietnamese data suggests that at least 12 percent, and perhaps much more, of observed inequality is only measurement error.

While the instrumental variable methods proposed here can be used to estimate the impact of measurement error on measured mobility and inequality, any estimates are only as reliable as the assumptions required for valid instruments. For example, some may disagree that BMI is caused by per capita expenditures in the simple way shown in equations (10) and (11).
Future work should develop better methods for assessing the quality of instruments “caused by” income, and future surveys should attempt to collect data that are second measurements of household income or expenditure.

Finally, it is worth noting that more accurate measurement of mobility provides important information for policy formulation. If mobility is very low, then the poor have low incomes every year, and many of them may be caught in a “poverty trap.” This implies that attempts to reduce poverty should focus on policies that increase the economic assets of the poor, two examples of which are land and human capital. In contrast, if mobility is high, the incomes of the poor (and of the nonpoor) fluctuate widely from year to year, which would reduce their welfare if they are unable to smooth their consumption over those years. In this case policies to reduce poverty should focus on reducing income fluctuations, such as designing interventions that could improve the operation of insurance and credit markets (with particular attention to the poor’s access to those markets), and providing “safety nets” for households that experience unanticipated negative shocks in their incomes.

APPENDIX 1: PROOFS OF PROPOSITIONS OF RELATIVE MOBILITY INDICES

Proposition 1. $\mathrm{Var}(\ln(y_1 + y_2))/\mathrm{Var}(\ln(y_1)) \approx 1 - m(y_1, y_2)/2$

If $\mathrm{Var}(\ln(y_1 + y_2)) \approx \mathrm{Var}(\ln(y_1))/4 + \mathrm{Var}(\ln(y_2))/4 + \mathrm{Cov}(\ln(y_1), \ln(y_2))/2$, and $\mathrm{Var}(\ln(y_1)) \approx \mathrm{Var}(\ln(y_2))$, then the following holds:

\begin{align*}
\frac{\mathrm{Var}(\ln(y_1 + y_2))}{\mathrm{Var}(\ln(y_1))}
&\approx \frac{\mathrm{Var}(\ln(y_2))/4 + \mathrm{Var}(\ln(y_1))/4 + \mathrm{Cov}(\ln(y_1), \ln(y_2))/2}{\mathrm{Var}(\ln(y_1))} \\
&\approx (1/2)\,[1 + \rho(\ln(y_1), \ln(y_2))] \\
&= (1/2)\,[2 - (1 - \rho(\ln(y_1), \ln(y_2)))] \\
&= 1 - m(y_1, y_2)/2
\end{align*}

To show that $\mathrm{Var}(\ln(y_1 + y_2)) \approx \mathrm{Var}(\ln(y_1))/4 + \mathrm{Var}(\ln(y_2))/4 + \mathrm{Cov}(\ln(y_1), \ln(y_2))/2$, define $p = y_2/y_1$.
Then:

\begin{align*}
\mathrm{Var}(\ln(y_1 + y_2)) &= \mathrm{Var}(\ln(y_1(1 + p))) = \mathrm{Var}(\ln(y_1) + \ln(1 + p)) \\
&= \mathrm{Var}(\ln(y_1)) + \mathrm{Var}(\ln(1 + p)) + 2\,\mathrm{Cov}(\ln(y_1), \ln(1 + p)) \\
&= \mathrm{Var}(\ln(y_1)) + \mathrm{Var}(\ln(1 + p) - \ln(2)) + 2\,\mathrm{Cov}(\ln(y_1), \ln(1 + p) - \ln(2)) \\
&= \mathrm{Var}(\ln(y_1)) + \mathrm{Var}(\ln((1 + p)/2)) + 2\,\mathrm{Cov}(\ln(y_1), \ln((1 + p)/2)) \\
&\approx \mathrm{Var}(\ln(y_1)) + \mathrm{Var}((p - 1)/2) + 2\,\mathrm{Cov}(\ln(y_1), (p - 1)/2) \\
&= \mathrm{Var}(\ln(y_1)) + \mathrm{Var}(p - 1)/4 + \mathrm{Cov}(\ln(y_1), (p - 1)) \\
&\approx \mathrm{Var}(\ln(y_1)) + \mathrm{Var}(\ln(p))/4 + \mathrm{Cov}(\ln(y_1), \ln(p)) \\
&= \mathrm{Var}(\ln(y_1)) + \mathrm{Var}(\ln(y_2) - \ln(y_1))/4 + \mathrm{Cov}(\ln(y_1), \ln(y_2) - \ln(y_1)) \\
&= \mathrm{Var}(\ln(y_1)) + \mathrm{Var}(\ln(y_2))/4 + \mathrm{Var}(\ln(y_1))/4 - \mathrm{Cov}(\ln(y_1), \ln(y_2))/2 \\
&\qquad + \mathrm{Cov}(\ln(y_1), \ln(y_2)) - \mathrm{Var}(\ln(y_1)) \\
&= \mathrm{Var}(\ln(y_2))/4 + \mathrm{Var}(\ln(y_1))/4 + \mathrm{Cov}(\ln(y_1), \ln(y_2))/2.
\end{align*}

The third line holds because adding a constant to a variable affects neither its variance nor its covariance. The fifth and seventh lines use the approximation that for any number $r$ close to zero $\ln(1 + r) \approx r$. Assume that $y_2/y_1$ ($= p$) is close to 1; then $(p - 1)/2$ is close to zero.

Proposition 2. The mobility measure $1 - \rho(f(y_1), f(y_2))$, where $\rho$ is the correlation coefficient and $f$ is a monotonically increasing function, satisfies the Atkinson-Bourguignon condition.

Consider $N$ persons with positive incomes in each of two time periods, 1 and 2. Let $y_{i1}$ and $y_{i2}$ denote the income of person $i$ ($i = 1, 2, \ldots, N$) in time periods 1 and 2, respectively. The Atkinson-Bourguignon condition states that for any two persons, $i$ and $j$, such that the income of one is greater than the income of the other in both time periods, that is, $(y_{i1} - y_{j1})(y_{i2} - y_{j2}) > 0$, mobility increases if individuals $i$ and $j$ switch incomes in one of the two time periods. More formally, $m(y_1', y_2') > m(y_1, y_2)$ if (i) $y_{k1} = y_{k1}'$ and $y_{k2} = y_{k2}'$ for all $k \neq i, j$; and (ii) either ($y_{i1} = y_{j1}'$, $y_{j1} = y_{i1}'$, $y_{i2} = y_{i2}'$, $y_{j2} = y_{j2}'$), i.e. a switch in income in the first period, or ($y_{i1} = y_{i1}'$, $y_{j1} = y_{j1}'$, $y_{i2} = y_{j2}'$, $y_{j2} = y_{i2}'$), an income switch in the second period.
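Before turning to the proof, the condition can be illustrated numerically. The sketch below is not part of the original analysis: the four incomes per period are hypothetical, $f$ is taken to be the natural log, and the helper names (`covariance`, `correlation`, `mobility`) are introduced only for this example. It switches the first-period incomes of two persons who are ranked the same way in both periods and compares the mobility index before and after.

```python
import math


def covariance(a, b):
    """Population covariance (divides by N)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n


def correlation(a, b):
    return covariance(a, b) / math.sqrt(covariance(a, a) * covariance(b, b))


def mobility(y1, y2, f=math.log):
    """Mobility index 1 - rho(f(y1), f(y2)) for a monotonically increasing f."""
    return 1.0 - correlation([f(v) for v in y1], [f(v) for v in y2])


# Hypothetical incomes of four persons in two periods (illustration only).
y1 = [10.0, 20.0, 35.0, 50.0]
y2 = [12.0, 18.0, 40.0, 45.0]

# Persons i and j are ranked the same way in both periods,
# so the premise (y_i1 - y_j1)(y_i2 - y_j2) > 0 holds.
i, j = 1, 3
assert (y1[i] - y1[j]) * (y2[i] - y2[j]) > 0

# Switch the two persons' incomes in period 1 only.
y1_switched = list(y1)
y1_switched[i], y1_switched[j] = y1[j], y1[i]

m_before = mobility(y1, y2)
m_after = mobility(y1_switched, y2)

# The covariance of the transformed incomes falls by
# (f(y_i1) - f(y_j1)) * (f(y_i2) - f(y_j2)) / N, since covariance() divides by N.
f1 = [math.log(v) for v in y1]
f1_switched = [math.log(v) for v in y1_switched]
f2 = [math.log(v) for v in y2]
cov_drop = covariance(f1, f2) - covariance(f1_switched, f2)
expected_drop = (f1[i] - f1[j]) * (f2[i] - f2[j]) / len(y1)

print(f"mobility before switch: {m_before:.4f}")
print(f"mobility after switch:  {m_after:.4f}")
```

Because the switch leaves both marginal distributions unchanged, only the covariance term moves, which is why the change in the index can be read off from `cov_drop` alone.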
Without loss of generality, assume the switch is in time period 1, so $y_2 = y_2'$ and the only difference between $y_1$ and $y_1'$ is that $y_{i1} = y_{j1}'$, $y_{j1} = y_{i1}'$. The correlation of $f(y_1)$ and $f(y_2)$ is:

\[
\rho = \frac{\mathrm{Cov}(f(y_1), f(y_2))}{\sqrt{\mathrm{Var}(f(y_1))\,\mathrm{Var}(f(y_2))}}
\]

Clearly, $\mathrm{Var}(f(y_1)) = \mathrm{Var}(f(y_1'))$ and $\mathrm{Var}(f(y_2)) = \mathrm{Var}(f(y_2'))$ since the distributions of $y_1$ and $y_2$ are unchanged. Thus one need compare only $\mathrm{Cov}(f(y_1), f(y_2))$ and $\mathrm{Cov}(f(y_1'), f(y_2))$. Let $\bar{f}(y_t)$ denote the mean of $f(y_{kt})$ over all persons in period $t$. The only difference between $\mathrm{Cov}(f(y_1), f(y_2))$ and $\mathrm{Cov}(f(y_1'), f(y_2))$ due to the income switch is that the term $(f(y_{i1}) - \bar{f}(y_1))(f(y_{i2}) - \bar{f}(y_2)) + (f(y_{j1}) - \bar{f}(y_1))(f(y_{j2}) - \bar{f}(y_2))$ is in the former while $(f(y_{j1}) - \bar{f}(y_1))(f(y_{i2}) - \bar{f}(y_2)) + (f(y_{i1}) - \bar{f}(y_1))(f(y_{j2}) - \bar{f}(y_2))$ is in the latter. Therefore, up to the positive normalizing factor in the definition of the covariance,

\begin{align*}
\mathrm{Cov}(f(y_1), f(y_2)) &- \mathrm{Cov}(f(y_1'), f(y_2)) \\
&= [(f(y_{i1}) - \bar{f}(y_1))(f(y_{i2}) - \bar{f}(y_2)) + (f(y_{j1}) - \bar{f}(y_1))(f(y_{j2}) - \bar{f}(y_2))] \\
&\quad - [(f(y_{j1}) - \bar{f}(y_1))(f(y_{i2}) - \bar{f}(y_2)) + (f(y_{i1}) - \bar{f}(y_1))(f(y_{j2}) - \bar{f}(y_2))] \\
&= f(y_{i1})f(y_{i2}) - \bar{f}(y_1)f(y_{i2}) - f(y_{i1})\bar{f}(y_2) + f(y_{j1})f(y_{j2}) - \bar{f}(y_1)f(y_{j2}) - f(y_{j1})\bar{f}(y_2) \\
&\quad - f(y_{j1})f(y_{i2}) + \bar{f}(y_1)f(y_{i2}) + f(y_{j1})\bar{f}(y_2) - f(y_{i1})f(y_{j2}) + \bar{f}(y_1)f(y_{j2}) + f(y_{i1})\bar{f}(y_2) \\
&= (f(y_{i1}) - f(y_{j1}))(f(y_{i2}) - f(y_{j2})) > 0.
\end{align*}

The inequality holds since the Atkinson-Bourguignon condition implies $(y_{i1} - y_{j1})(y_{i2} - y_{j2}) > 0$, and monotonic transformations of $y_1$ and $y_2$ do not alter the signs of $y_{i1} - y_{j1}$ and $y_{i2} - y_{j2}$. This then implies that $\rho(f(y_1), f(y_2)) > \rho(f(y_1'), f(y_2))$, and so $1 - \rho(f(y_1), f(y_2)) < 1 - \rho(f(y_1'), f(y_2))$.

APPENDIX 2: PROOFS OF PROPOSITIONS REGARDING INSTRUMENTAL VARIABLE ESTIMATION

Proposition 1. (Second Measurements as Instruments).
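If the instrumental variables $z_1$ and $z_2$ are second measurements of $y_1^*$ and $y_2^*$, and the measurement errors have the following structure, allowing them to be correlated over time and across measurements:

\begin{align}
y_1 &= y_1^* + u_f + u_{t1} + u_{m1} + \varepsilon_{y1} \tag{A.1} \\
y_2 &= y_2^* + u_f + u_{t2} + u_{m1} + \varepsilon_{y2} \tag{A.2} \\
z_1 &= y_1^* + u_f + u_{t1} + u_{m2} + \varepsilon_{z1} \tag{A.3} \\
z_2 &= y_2^* + u_f + u_{t2} + u_{m2} + \varepsilon_{z2} \tag{A.4}
\end{align}

where all the components of each measurement error are uncorrelated with $y_1^*$, $y_2^*$ and all other components, then the variances of $u_{m1}$, $u_{m2}$, $\varepsilon_{y1}$, $\varepsilon_{y2}$, $\varepsilon_{z1}$ and $\varepsilon_{z2}$ are all identified, but the remaining variances, and $\rho(y_1^*, y_2^*)$, are not identified.

Equations (A.1)–(A.4) have 11 unobserved variables ($y_1^*$, $y_2^*$, $u_f$, $u_{t1}$, $u_{t2}$, $u_{m1}$, $u_{m2}$, $\varepsilon_{y1}$, $\varepsilon_{y2}$, $\varepsilon_{z1}$, and $\varepsilon_{z2}$) and four observed variables ($y_1$, $y_2$, $z_1$ and $z_2$). All measurement errors are uncorrelated with each other and with $y_1^*$ and $y_2^*$, so there are 11 unobserved variances but only one nonzero unobserved covariance, $\mathrm{Cov}(y_1^*, y_2^*)$. The four observed variances and six observed covariances are related to the unobserved variances and covariances as

\begin{align}
\mathrm{Var}(y_1) &= \sigma^2_{y_1^*} + \sigma^2_{u_f} + \sigma^2_{u_{t1}} + \sigma^2_{u_{m1}} + \sigma^2_{\varepsilon_{y1}} \tag{A.5} \\
\mathrm{Var}(y_2) &= \sigma^2_{y_2^*} + \sigma^2_{u_f} + \sigma^2_{u_{t2}} + \sigma^2_{u_{m1}} + \sigma^2_{\varepsilon_{y2}} \tag{A.6} \\
\mathrm{Var}(z_1) &= \sigma^2_{y_1^*} + \sigma^2_{u_f} + \sigma^2_{u_{t1}} + \sigma^2_{u_{m2}} + \sigma^2_{\varepsilon_{z1}} \tag{A.7} \\
\mathrm{Var}(z_2) &= \sigma^2_{y_2^*} + \sigma^2_{u_f} + \sigma^2_{u_{t2}} + \sigma^2_{u_{m2}} + \sigma^2_{\varepsilon_{z2}} \tag{A.8} \\
\mathrm{Cov}(y_1, z_1) &= \sigma^2_{y_1^*} + \sigma^2_{u_f} + \sigma^2_{u_{t1}} \tag{A.9} \\
\mathrm{Cov}(y_2, z_2) &= \sigma^2_{y_2^*} + \sigma^2_{u_f} + \sigma^2_{u_{t2}} \tag{A.10} \\
\mathrm{Cov}(y_1, y_2) &= \sigma_{y_1^*, y_2^*} + \sigma^2_{u_f} + \sigma^2_{u_{m1}} \tag{A.11} \\
\mathrm{Cov}(y_1, z_2) &= \sigma_{y_1^*, y_2^*} + \sigma^2_{u_f} \tag{A.12} \\
\mathrm{Cov}(z_1, z_2) &= \sigma_{y_1^*, y_2^*} + \sigma^2_{u_f} + \sigma^2_{u_{m2}} \tag{A.13} \\
\mathrm{Cov}(z_1, y_2) &= \sigma_{y_1^*, y_2^*} + \sigma^2_{u_f}. \tag{A.14}
\end{align}

Equations (A.12) and (A.14) are equal, so there are only nine independent equations and 11 independent variables.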
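The moment conditions (A.11)–(A.14) can be checked by simulation. The sketch below is illustrative only: the variances, the correlation of 0.8 between $y_1^*$ and $y_2^*$, the normal distributions, and the sample size are all assumptions chosen for the example, not values from the paper. Subtracting (A.12) from (A.11) isolates the variance of $u_{m1}$, and subtracting (A.12) from (A.13) isolates the variance of $u_{m2}$, while (A.12) and (A.14) should agree up to sampling noise.

```python
import random


def covariance(a, b):
    """Population covariance (divides by N)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n


random.seed(0)
N = 50_000
RHO = 0.8  # assumed correlation between y1* and y2* (both with unit variance)
# Assumed variances of the error components (illustrative values only).
VAR = {"uf": 0.05, "ut": 0.02, "um1": 0.04, "um2": 0.03, "eps": 0.10}


def draw(name):
    """Draw one mean-zero normal error with the assumed variance."""
    return random.gauss(0.0, VAR[name] ** 0.5)


y1, y2, z1, z2 = [], [], [], []
for _ in range(N):
    y1_star = random.gauss(0.0, 1.0)
    y2_star = RHO * y1_star + (1.0 - RHO ** 2) ** 0.5 * random.gauss(0.0, 1.0)
    uf = draw("uf")                       # household-specific error, all four measures
    ut1, ut2 = draw("ut"), draw("ut")     # period-specific errors
    um1, um2 = draw("um1"), draw("um2")   # method-specific errors
    y1.append(y1_star + uf + ut1 + um1 + draw("eps"))  # (A.1)
    y2.append(y2_star + uf + ut2 + um1 + draw("eps"))  # (A.2)
    z1.append(y1_star + uf + ut1 + um2 + draw("eps"))  # (A.3)
    z2.append(y2_star + uf + ut2 + um2 + draw("eps"))  # (A.4)

cov_y1z2 = covariance(y1, z2)  # (A.12)
cov_z1y2 = covariance(z1, y2)  # (A.14), same population value as (A.12)
var_um1_hat = covariance(y1, y2) - cov_y1z2  # (A.11) minus (A.12)
var_um2_hat = covariance(z1, z2) - cov_y1z2  # (A.13) minus (A.12)

print(f"Cov(y1,z2) = {cov_y1z2:.3f}, Cov(z1,y2) = {cov_z1y2:.3f}")
print(f"estimated Var(um1) = {var_um1_hat:.3f} (assumed {VAR['um1']})")
print(f"estimated Var(um2) = {var_um2_hat:.3f} (assumed {VAR['um2']})")
```

Note that `cov_y1z2` recovers only the sum of $\mathrm{Cov}(y_1^*, y_2^*)$ and the variance of $u_f$ (0.85 under the assumed values), which is the sense in which those two components cannot be separated.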
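The solutions for the variances that are identified are

\begin{align}
\sigma^2_{u_{m1}} &= \mathrm{Cov}(y_1, y_2) - [\mathrm{Cov}(y_1, z_2) \text{ or } \mathrm{Cov}(z_1, y_2)] \tag{A.15} \\
\sigma^2_{u_{m2}} &= \mathrm{Cov}(z_1, z_2) - [\mathrm{Cov}(y_1, z_2) \text{ or } \mathrm{Cov}(z_1, y_2)] \tag{A.16} \\
\sigma^2_{\varepsilon_{y1}} &= \mathrm{Var}(y_1) - \mathrm{Cov}(y_1, z_1) - \mathrm{Cov}(y_1, y_2) + [\mathrm{Cov}(y_1, z_2) \text{ or } \mathrm{Cov}(z_1, y_2)] \tag{A.17} \\
\sigma^2_{\varepsilon_{y2}} &= \mathrm{Var}(y_2) - \mathrm{Cov}(y_2, z_2) - \mathrm{Cov}(y_1, y_2) + [\mathrm{Cov}(y_1, z_2) \text{ or } \mathrm{Cov}(z_1, y_2)] \tag{A.18} \\
\sigma^2_{\varepsilon_{z1}} &= \mathrm{Var}(z_1) - \mathrm{Cov}(y_1, z_1) - \mathrm{Cov}(z_1, z_2) + [\mathrm{Cov}(y_1, z_2) \text{ or } \mathrm{Cov}(z_1, y_2)] \tag{A.19} \\
\sigma^2_{\varepsilon_{z2}} &= \mathrm{Var}(z_2) - \mathrm{Cov}(y_2, z_2) - \mathrm{Cov}(z_1, z_2) + [\mathrm{Cov}(y_1, z_2) \text{ or } \mathrm{Cov}(z_1, y_2)]. \tag{A.20}
\end{align}

One cannot solve for $\sigma_{y_1^*, y_2^*}$, $\sigma^2_{y_1^*}$, $\sigma^2_{y_2^*}$, $\sigma^2_{u_f}$, $\sigma^2_{u_{t1}}$ and $\sigma^2_{u_{t2}}$, because these always appear as the following three sums, $\sigma^2_{y_1^*} + \sigma^2_{u_f} + \sigma^2_{u_{t1}}$, $\sigma^2_{y_2^*} + \sigma^2_{u_f} + \sigma^2_{u_{t2}}$ and $\sigma_{y_1^*, y_2^*} + \sigma^2_{u_f}$, in the equations in which they appear, and knowledge of these sums does not allow one to solve for any of these six components.

Proposition 2. (Instruments that “Cause” $y_1^*$ and $y_2^*$). If $z_1$ causes $y_1^*$ in the sense that $y_1^* = \gamma_1 + \delta_1 z_1 + v_1$, where $z_1$ is uncorrelated with $v_1$, and $z_2$ causes $y_2^*$ in the sense that $y_2^* = \gamma_2 + \delta_2 z_2 + v_2$, where $z_2$ is uncorrelated with $v_2$, and $\mathrm{Cov}(z_1, v_2) = \mathrm{Cov}(z_2, v_1) = 0$, then $\delta_1$ and $\delta_2$ are identified, but the variances of $y_1^*$, $y_2^*$, $u$, $\varepsilon_{y1}$, $\varepsilon_{y2}$, $v_1$ and $v_2$ are all not identified, $\mathrm{Cov}(y_1^*, y_2^*)$ and $\mathrm{Cov}(v_1, v_2)$ are not identified, and $\mathrm{plim}[\rho_{IV}(y_1, y_2)] = \rho(z_1, z_2)$.

The basic equations in the case where the instruments cause $y_1^*$ and $y_2^*$ are:

\begin{align}
y_1 &= y_1^* + u + \varepsilon_{y1} \tag{A.21} \\
y_2 &= y_2^* + u + \varepsilon_{y2} \tag{A.22} \\
y_1^* &= \gamma_1 + \delta_1 z_1 + v_1, \text{ which implies } z_1 = -\gamma_1/\delta_1 + y_1^*/\delta_1 - v_1/\delta_1 \tag{A.23} \\
y_2^* &= \gamma_2 + \delta_2 z_2 + v_2, \text{ which implies } z_2 = -\gamma_2/\delta_2 + y_2^*/\delta_2 - v_2/\delta_2. \tag{A.24}
\end{align}

Assume that $u$, $\varepsilon_{y1}$ and $\varepsilon_{y2}$ have constant variances and are uncorrelated with all the other variables, and that $\mathrm{Cov}(z_1, v_1) = \mathrm{Cov}(z_2, v_2) = \mathrm{Cov}(z_1, v_2) = \mathrm{Cov}(z_2, v_1) = 0$.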
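Then the four observable variables $y_1$, $y_2$, $z_1$ and $z_2$ have the following four variances and six covariances:

\begin{align}
\mathrm{Var}(y_1) &= \sigma^2_{y_1^*} + \sigma^2_u + \sigma^2_{\varepsilon_{y1}} \tag{A.25} \\
\mathrm{Var}(y_2) &= \sigma^2_{y_2^*} + \sigma^2_u + \sigma^2_{\varepsilon_{y2}} \tag{A.26} \\
\mathrm{Var}(z_1) &= (1/\delta_1^2)\sigma^2_{y_1^*} + (1/\delta_1^2)\sigma^2_{v_1} - (2/\delta_1^2)\mathrm{Cov}(y_1^*, v_1) = (1/\delta_1^2)\sigma^2_{y_1^*} - (1/\delta_1^2)\sigma^2_{v_1} \tag{A.27} \\
\mathrm{Var}(z_2) &= (1/\delta_2^2)\sigma^2_{y_2^*} + (1/\delta_2^2)\sigma^2_{v_2} - (2/\delta_2^2)\mathrm{Cov}(y_2^*, v_2) = (1/\delta_2^2)\sigma^2_{y_2^*} - (1/\delta_2^2)\sigma^2_{v_2} \tag{A.28} \\
\mathrm{Cov}(y_1, y_2) &= \sigma_{y_1^*, y_2^*} + \sigma^2_u \tag{A.29} \\
\mathrm{Cov}(y_1, z_1) &= (1/\delta_1)\sigma^2_{y_1^*} - (1/\delta_1)\mathrm{Cov}(y_1^*, v_1) = (1/\delta_1)\sigma^2_{y_1^*} - (1/\delta_1)\sigma^2_{v_1} \tag{A.30} \\
\mathrm{Cov}(y_1, z_2) &= (1/\delta_2)\sigma_{y_1^*, y_2^*} - (1/\delta_2)\mathrm{Cov}(y_1^*, v_2) = (1/\delta_2)\sigma_{y_1^*, y_2^*} - (1/\delta_2)\sigma_{v_1, v_2} \tag{A.31} \\
\mathrm{Cov}(y_2, z_1) &= (1/\delta_1)\sigma_{y_1^*, y_2^*} - (1/\delta_1)\mathrm{Cov}(y_2^*, v_1) = (1/\delta_1)\sigma_{y_1^*, y_2^*} - (1/\delta_1)\sigma_{v_1, v_2} \tag{A.32} \\
\mathrm{Cov}(y_2, z_2) &= (1/\delta_2)\sigma^2_{y_2^*} - (1/\delta_2)\mathrm{Cov}(y_2^*, v_2) = (1/\delta_2)\sigma^2_{y_2^*} - (1/\delta_2)\sigma^2_{v_2} \tag{A.33} \\
\mathrm{Cov}(z_1, z_2) &= (1/(\delta_1 \delta_2))[\sigma_{y_1^*, y_2^*} - \mathrm{Cov}(y_1^*, v_2) - \mathrm{Cov}(y_2^*, v_1) + \sigma_{v_1, v_2}] = (1/(\delta_1 \delta_2))[\sigma_{y_1^*, y_2^*} - \sigma_{v_1, v_2}]. \tag{A.34}
\end{align}

Combining (A.27) and (A.30) solves for $\delta_1$, and combining (A.28) and (A.33) yields $\delta_2$:

\[
\delta_1 = \mathrm{Cov}(y_1, z_1)/\mathrm{Var}(z_1), \qquad \delta_2 = \mathrm{Cov}(y_2, z_2)/\mathrm{Var}(z_2). \tag{A.35}
\]

One cannot solve for anything else as there are 9 unknowns ($\sigma^2_{y_1^*}$, $\sigma^2_{y_2^*}$, $\sigma^2_u$, $\sigma^2_{\varepsilon_{y1}}$, $\sigma^2_{\varepsilon_{y2}}$, $\sigma^2_{v_1}$, $\sigma^2_{v_2}$, $\sigma_{y_1^*, y_2^*}$, $\sigma_{v_1, v_2}$) but only 6 independent equations. There are only 6 independent equations because once $\delta_1$ and $\delta_2$ are known, (A.30) repeats (A.27), (A.32) and (A.34) repeat (A.31) and (A.33) repeats (A.28). Thus the only independent equations are (A.25)–(A.29) and (A.31), and all but one, (A.29), contain a variable not found in any of the other equations, along with at least one other unknown variable, which precludes solving for any subset of equations in the system (all possible subsets will have more unknowns than equations).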
Finally, inserting (A.30)–(A.33) into equation (6) in the text yields

\[
\mathrm{plim}[\rho_{IV}(y_1, y_2)] = \sqrt{\frac{(1/\delta_2)[\sigma_{y_1^*, y_2^*} - \sigma_{v_1, v_2}]\,(1/\delta_1)[\sigma_{y_1^*, y_2^*} - \sigma_{v_1, v_2}]}{(1/\delta_2)[\sigma^2_{y_2^*} - \sigma^2_{v_2}]\,(1/\delta_1)[\sigma^2_{y_1^*} - \sigma^2_{v_1}]}} = \frac{\mathrm{Cov}(z_1, z_2)}{\sqrt{\mathrm{Var}(z_1)\mathrm{Var}(z_2)}} \tag{A.36}
\]

Proposition 3. (Instruments that are “Caused By” $y_1^*$ and $y_2^*$). If the instrument $z_1$ is caused by $y_1^*$ in the sense that $z_1 = \kappa_1 + \pi_1 y_1^* + w_1$, where $y_1^*$ is uncorrelated with $w_1$, and the instrument $z_2$ is caused by $y_2^*$ in the sense that $z_2 = \kappa_2 + \pi_2 y_2^* + w_2$, where $y_2^*$ is uncorrelated with $w_2$, and $\mathrm{Cov}(y_1^*, w_2) = \mathrm{Cov}(y_2^*, w_1) = 0$, then $\mathrm{plim}[\rho_{IV}(y_1, y_2)] = \rho(y_1^*, y_2^*)$.

The measurement error structure for $y_1$ and $y_2$ is the same as in (A.21) and (A.22). The causal equations are

\begin{align}
z_1 &= \kappa_1 + \pi_1 y_1^* + w_1 \tag{A.37} \\
z_2 &= \kappa_2 + \pi_2 y_2^* + w_2. \tag{A.38}
\end{align}

Assume that $u$, $\varepsilon_{y1}$ and $\varepsilon_{y2}$ are uncorrelated with each other and with $y_1^*$, $y_2^*$, $w_1$, and $w_2$. There are seven unobserved variables, so there are seven unobserved variances but only two unobserved covariances, $\mathrm{Cov}(y_1^*, y_2^*)$ and $\mathrm{Cov}(w_1, w_2)$. The relationships between the observed variances and covariances and the unobserved variances and covariances are:

\begin{align}
\mathrm{Var}(y_1) &= \sigma^2_{y_1^*} + \sigma^2_u + \sigma^2_{\varepsilon_{y1}} \tag{A.39} \\
\mathrm{Var}(y_2) &= \sigma^2_{y_2^*} + \sigma^2_u + \sigma^2_{\varepsilon_{y2}} \tag{A.40} \\
\mathrm{Var}(z_1) &= \pi_1^2 \sigma^2_{y_1^*} + \sigma^2_{w_1} \tag{A.41} \\
\mathrm{Var}(z_2) &= \pi_2^2 \sigma^2_{y_2^*} + \sigma^2_{w_2} \tag{A.42} \\
\mathrm{Cov}(y_1, y_2) &= \sigma_{y_1^*, y_2^*} + \sigma^2_u \tag{A.43} \\
\mathrm{Cov}(y_1, z_1) &= \pi_1 \sigma^2_{y_1^*} \tag{A.44} \\
\mathrm{Cov}(y_1, z_2) &= \pi_2 \sigma_{y_1^*, y_2^*} \tag{A.45} \\
\mathrm{Cov}(y_2, z_1) &= \pi_1 \sigma_{y_1^*, y_2^*} \tag{A.46} \\
\mathrm{Cov}(y_2, z_2) &= \pi_2 \sigma^2_{y_2^*} \tag{A.47} \\
\mathrm{Cov}(z_1, z_2) &= \pi_1 \pi_2 \sigma_{y_1^*, y_2^*} + \sigma_{w_1, w_2} \tag{A.48}
\end{align}

Using (A.44)–(A.47), one can estimate $\rho(y_1^*, y_2^*)$.
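A small simulation can make the role of (A.44)–(A.47) concrete. The sketch below is a hedged illustration, not the paper's estimation code. It assumes that the IV correlation takes the form implied by (A.36), namely the square root of the product of two simple IV slopes, $\mathrm{Cov}(y_2, z_1)/\mathrm{Cov}(y_1, z_1)$ and $\mathrm{Cov}(y_1, z_2)/\mathrm{Cov}(y_2, z_2)$. All parameter values, the normal distributions, and the correlation between $w_1$ and $w_2$ are assumptions made for the example.

```python
import math
import random


def covariance(a, b):
    """Population covariance (divides by N)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n


def correlation(a, b):
    return covariance(a, b) / math.sqrt(covariance(a, a) * covariance(b, b))


random.seed(1)
N = 50_000
RHO = 0.8             # assumed true correlation of y1* and y2* (unit variances)
PI1, PI2 = 1.0, 1.5   # assumed slopes in the causal equations (A.37)-(A.38)
SD_U = math.sqrt(0.05)    # common measurement error u in (A.21)-(A.22)
SD_EPS = math.sqrt(0.30)  # idiosyncratic measurement errors

y1, y2, z1, z2 = [], [], [], []
for _ in range(N):
    y1_star = random.gauss(0.0, 1.0)
    y2_star = RHO * y1_star + math.sqrt(1.0 - RHO ** 2) * random.gauss(0.0, 1.0)
    u = random.gauss(0.0, SD_U)
    # w1 and w2 are allowed to be correlated with each other, but not with y*.
    w1 = random.gauss(0.0, 1.0)
    w2 = 0.5 * w1 + math.sqrt(0.75) * random.gauss(0.0, 1.0)
    y1.append(y1_star + u + random.gauss(0.0, SD_EPS))
    y2.append(y2_star + u + random.gauss(0.0, SD_EPS))
    z1.append(0.5 + PI1 * y1_star + w1)
    z2.append(-0.2 + PI2 * y2_star + w2)

naive = correlation(y1, y2)                   # correlation of the noisy measures
b1 = covariance(y2, z1) / covariance(y1, z1)  # (A.46) / (A.44)
b2 = covariance(y1, z2) / covariance(y2, z2)  # (A.45) / (A.47)
r_iv = math.sqrt(b1 * b2)

print(f"true rho = {RHO}, naive correlation = {naive:.3f}, IV correlation = {r_iv:.3f}")
```

Under these assumptions the naive correlation is pulled well below 0.8 by the measurement errors, while the ratio of covariances cancels the unknown slopes and stays close to the true value.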
Inserting (A.44)–(A.47) in equation (6) in the text shows that $\mathrm{plim}[\rho_{IV}(y_1, y_2)] = \rho(y_1^*, y_2^*)$. This result is serendipitous since none of the unobserved terms can be solved for without further assumptions. For example, if $\sigma^2_u = 0$, or $\sigma_{w_1, w_2} = 0$, then one can solve for all of the unobserved variances and covariances, yet neither of these assumptions is credible.

REFERENCES

Abowd, John, and David Card. 1989. “On the Covariance Structure of Earnings and Hours Changes.” Econometrica 57(2): 411–445.
Antman, Francisca, and David McKenzie. 2007. “Earnings Mobility and Measurement Error: A Pseudo-Panel Approach.” Economic Development and Cultural Change 56(1): 125–161.
Atkinson, Anthony, and François Bourguignon. 1982. “The Comparison of Multidimensional Distributions of Economic Status.” Review of Economic Studies 49(2): 183–201.
Bound, John, Charles Brown, and Nancy Mathiowetz. 2001. “Measurement Error in Survey Data.” In J. Heckman and E. Leamer, eds., Handbook of Econometrics: Volume 5. Amsterdam: Elsevier.
Bound, John, and Alan Krueger. 1991. “The Extent of Measurement Error in Longitudinal Earnings Data: Do Two Wrongs Make a Right?” Journal of Labor Economics 9(1): 1–24.
Bowden, Roger, and Darrell Turkington. 1984. Instrumental Variables. New York: Cambridge University Press.
Chesher, Andrew, and Christian Schluter. 2002. “Welfare Measurement and Measurement Error.” Review of Economic Studies 69(2): 357–378.
Deaton, Angus. 1997. The Analysis of Household Surveys: A Microeconometric Approach to Development Policy. Baltimore, MD: Johns Hopkins University Press.
Dragoset, Lisa, and Gary Fields. 2006. “U.S. Earnings Mobility: Comparing Survey-Based and Administrative-Based Estimates.” ECINEQ Working Paper 2006-55. Society for the Study of Economic Inequality.
Fields, Gary, and Efe Ok. 1999a. “The Measurement of Income Mobility: An Introduction to the Literature.” In J. Silber, ed., Handbook of Inequality Measurement. New York: Springer.
Fields, Gary, and Efe Ok. 1999b. “Measuring Movement of Incomes.” Economica 66: 455–471.
Fields, Gary, Robert Duval Hernandez, Samuel Freije Rodríguez, and Maria Laura Sánchez Puerta. 2007. “Earnings Mobility in Argentina, Mexico and Venezuela: Testing the Divergence of Earnings and the Symmetry of Mobility Hypotheses.” ILR School. Cornell University, Ithaca, NY.
Gardiner, Karen, and John Hills. 1999. “Policy Implications of New Data on Economic Mobility.” Economic Journal 109(453): F91–F111.
Glewwe, Paul, Nisha Agrawal, and David Dollar. 2004. Economic Growth, Poverty, and Household Welfare in Vietnam. The World Bank, Washington, D.C.
Gottschalk, Peter. 1997. “Inequality, Income Growth and Mobility: The Basic Facts.” Journal of Economic Perspectives 11(2): 21–40.
Gottschalk, Peter, and Minh Huynh. 2010. “Are Earnings Inequality and Mobility Overstated? The Impact of Non-Classical Measurement Error.” Review of Economics and Statistics 92(2): 302–315.
Gottschalk, Peter, and Enrico Spolaore. 2002. “On the Evolution of Economic Mobility.” Review of Economic Studies 69(1): 191–208.
Hart, Peter. 1981. “The Statics and Dynamics of Income Distributions: A Survey.” In N. Klevmarken and J. Lybeck, eds., The Statics and Dynamics of Income. Tieto: Clevedon.
Lee, Nayoung, Geert Ridder, and John Strauss. 2010. “Estimating Poverty Transition Matrices with Noisy Data.” Department of Economics. University of Southern California.
Lewbel, Arthur. 1997. “Constructing Instruments for Regressions with Measurement Error when no Additional Data are Available, with an Application to Patents and R & D.” Econometrica 65(5): 1201–1214.
Maasoumi, Esfandiar, and Mark Trede. 2001. “Comparing Income Mobility in Germany and the United States Using Generalized Entropy Mobility Measures.” Review of Economics and Statistics 83(3): 551–559.
Meghir, Costas, and Luigi Pistaferri. 2004. “Income Variance Dynamics and Heterogeneity.” Econometrica 72(1): 1–32.
Pischke, Jörn-Steffen. 1995. “Measurement Error and Earnings Dynamics: Some Estimates from the PSID Validation Study.” Journal of Business and Economic Statistics 13(3): 305–314.
Shorrocks, Anthony. 1993. “On the Hart Measure of Income Mobility.” In M. Casson and J. Creedy, eds., Industrial Concentration and Economic Inequality. Cheltenham, U.K.: Edward Elgar.
World Bank. 1995. “Vietnam Living Standards Survey: Basic Information Document.” Development Research Group. The World Bank, Washington, D.C. http://www.worldbank.org/lsms/lsmshome.html.
World Bank. 2000. “1997-98 Vietnam Living Standards Survey: Basic Information Document.” Development Research Group. The World Bank, Washington, D.C. http://www.worldbank.org/lsms/lsmshome.html

Inequality of Opportunity in Egypt

Nadia Belhaj Hassine

The article evaluates the contribution of inequality of opportunity to earnings inequality in Egypt and analyzes its evolution across three time periods and different population groups. It provides parametric and nonparametric estimates of a lower bound for the degree of inequality of opportunity for wage and salary workers. On average, the contribution of opportunity-shaping circumstances to earnings inequality declined from 22 percent in 1988 to 15 percent in 2006. Levels of inequality of opportunity were fairly stable while earnings differentials widened markedly, leading to a decline in the share of inequality attributable to opportunities. Father’s background and geographic origins had the largest effect on earnings, although the impact of mother’s education has risen in recent years. The degree of inequality of opportunity did not differ significantly by gender or rural–urban area, although the incidence was lower for men and for rural areas. The results indicate an increase in inequality of opportunity across age groups, but there is some evidence that opportunity differentials have been declining for the oldest generation.
JEL codes: D31, D63

Political demands for greater equity in Arab societies reach beyond poverty reduction to the entire spectrum of income and wealth distribution. Popular concerns for fairness and justice are generally about inequality of outcomes more than inequality of opportunity, with social inequalities often measured by examining the degree of income inequality. However, strategies for directly equalizing outcomes may come at the cost of weakening incentives for individual effort, investment, and innovation.

Inequality of outcomes, such as in income or education, reflects differences in effort and circumstances. Inequality stemming from circumstances, such as gender, ethnicity, family background, and place of birth, is widely considered unfair and deserving of attention from policymakers (Roemer 1998; Roemer and others 2003; Peragine 2004).1 Constraints on access to basic services and resources that are beyond an individual’s control perpetuate the lack of capabilities and opportunities for large parts of society (Bourguignon, Ferreira, and Walton 2007; Elbers and others 2008). Such disparities in opportunity may discourage effort by individuals, waste productive potential, and contribute to social instability and institutional frailty, possibly dampening economic growth prospects (Ali 2007).

Nadia Belhaj Hassine (nbelhaj@erf.org.eg) is senior economist at the Economic Research Forum, Cairo, Egypt. The author thanks Elisabeth Sadoulet and three anonymous referees for insightful and helpful comments and suggestions.

1. According to Roemer (1998), outcomes are a consequence of at least two sets of factors: “circumstances,” which are factors beyond a person’s control, and “effort,” which are the actions a person takes and can be held accountable for. Inequality of opportunity occurs when the distribution of outcomes depends on the individual’s circumstances.

THE WORLD BANK ECONOMIC REVIEW, VOL. 26, NO. 2, pp. 265–295. doi:10.1093/wber/lhr046. Advance Access Publication October 28, 2011. © The Author 2011. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

The pursuit of greater equity through greater equality of opportunities could enhance economic efficiency. Equality of opportunity is broadly concerned with equal rewards for individual effort irrespective of prior circumstances and could lead to more efficient use of human and physical resources, improve social cohesion, and contribute to sustainable development (Roemer 1998).

Empirical work on inequality of opportunity, though comparatively recent, is developing rapidly. Several parametric and nonparametric techniques have been proposed to measure it. This literature, initially concerned mainly with Western Europe and Latin America, has been extended recently to sub-Saharan Africa and Turkey (see, for example, Bourguignon, Ferreira, and Menéndez 2007; Lefranc, Pistolesi, and Trannoy 2008; Cogneau and Mesple-Somps 2008; Barros and others 2009; Checchi and Peragine 2010; Checchi, Peragine, and Serlenga 2010; and Ferreira, Gignoux, and Aran 2011). These studies helped establish empirically the extent to which people in a given society face different opportunities. However, empirical applications of the concept of inequality of opportunity are scarce, and there is little or no research addressing inequalities of opportunity in Arab countries. This lack of research is attributable in large part to the limited availability of household income and expenditure surveys and to the paucity of observations on individuals’ circumstances.
To fill some of this knowledge gap, this article assesses the contribution of inequality of opportunity to earnings inequality in Egypt, drawing on data from the 1988 Labor Force Sample Survey, the Egypt Labor Market Survey of 1998, and the Egypt Labor Market Panel Survey of 2006. These are among the few surveys in the Arab region with information on family background.

While World Bank estimates show that inequality is moderate in Egypt compared with other Arab countries, it is persistent.2 Unevenness in the distribution of opportunities across regions, professional categories, or socioeconomic classes could contribute to the inequality and explain some of its persistence. Analysis of inequality of opportunity in Egypt can improve understanding of the institutional and economic mechanisms underpinning inequality and thus inform public actions to compensate for circumstances-based disadvantages, eliminate inequality traps, and foster development. Reducing inequality of opportunity would contribute to both social improvement and greater equality in income and wealth distribution.

2. See Povcal website: http://iresearch.worldbank.org/PovcalNet/povcalNet.html.

The study applies the parametric model proposed by Bourguignon, Ferreira, and Menéndez (2007) and the nonparametric methodology suggested by Checchi and Peragine (2010) to measure the contribution of inequality of opportunity to earnings inequality in Egypt. Inequality of opportunity indices are computed for Egyptian wage and salary workers for 1988, 1998, and 2006. The sample is also split by gender, area, and age. The results reveal that the share of earnings inequality attributable to circumstances fell, on average, from 22 percent in 1988 to 15 percent in 2006. The decline reflects stability in the levels of inequality of opportunity combined with rising total inequality over 1988–2006. These are lower bound estimates of the true share of opportunity inequality.
They would likely be much higher if data for more circumstance variables were available and if other indicators of economic welfare, such as household consumption or income, were used as a base. The analysis of area differences reveals a somewhat lower incidence of inequality of opportunity in rural than in urban areas. Disaggregation by gender suggests similar opportunity inequality for men and women, but with a higher incidence for women. Younger cohorts experienced a higher incidence of opportunity inequality than did the oldest ones in recent years, but a lower incidence in the 1990s. The article is organized as follows: section I describes the empirical model and the procedures used to infer inequality of opportunity. Section II provides an overview of the data. Section III reports the main results, and section IV summarizes the essential �ndings and conclusions. I. THE EMPIRICAL MODEL Estimation of the degree of inequality of opportunity associated with a given distribution of earnings (outcomes) is based on the frameworks of Bourguignon, Ferreira, and Mene ´ ndez (2007) and Checchi and Peragine (2010). The determinants of an individual’s earnings, yi, are separated into a set of circumstance variables, denoted by the vector Ci; efforts variables, denoted by the vector Ei; and unobserved factors vi. The earnings function can be speci�ed as ð1Þ yi ¼ f ðCi ; Ei ; vi Þ i : 1 . . . N: The circumstance variables are economically exogenous since they are outside an individual’s control, but effort factors may be endogenous to circumstances since an individual’s actions may be influenced by ethnicity, parental back- ground, and so on. Equality of opportunity occurs, in Roemer’s (1998) sense, when earnings are independently distributed from circumstances. This independence implies that circumstances have no direct causal effect on earnings and no causal impact on 268 THE WORLD BANK ECONOMIC REVIEW effort. 
The degree of opportunity inequality can therefore be determined by the extent to which the conditional distribution of earnings on circumstances, $F(y \mid C)$, differs from $F(y)$.

Nonparametric Method

Parametric and nonparametric methods can be used to estimate inequality of opportunity indexes. The nonparametric approach, suggested by Checchi and Peragine (2010), relies on two alternative partitions of the total population, corresponding to two alternative ways of computing inequality of opportunity.3 The first partition divides the population into groups by circumstance categories, with the members of each group, called a type, endowed with similar circumstances. The second partition, based on effort, splits the population into subsets (tranches) of individuals who exert the same degree of effort. Since effort cannot be observed, a person’s effort is measured, following Roemer (1998), by his or her quantile in the income or earnings distribution of the individual’s type. Thus, all individuals at the same quantile of their type’s earnings distribution are considered to be exerting the same level of effort.

Although both methods are plausible for modeling equality of opportunity, they can yield different results. Since there is no obvious reason for preferring one approach over the other, estimates are provided using both methods.4

The nonparametric approach has substantial advantages for estimating the share of inequality due to opportunities, including its computational simplicity and its flexibility, since no functional form has to be specified. Its main drawback is that it requires large data sets for accuracy. The larger the set of circumstances, the higher the number of cells in the partition and the higher the number of cells with zero or few observations. Moreover, this approach does not permit estimating partial effects of circumstances, holding all else constant (Ferreira and Gignoux forthcoming; Checchi, Peragine, and Serlenga 2010).

TYPES.
In the first partition, inequality of opportunity is given by inequality between types.5 This inequality can be assessed by applying a smoothing transformation using a constant reference value of effort $\bar{E}$, namely, $f(C_i, \bar{E})\ \forall i$. The smoothed distribution can be represented by the average income, $\{\mu_c\}$, of a given type, identified by $c$. All within-type inequality is eliminated in the smoothed distribution $\{\mu_c\}$ by replacing each individual’s earnings with type-specific mean earnings $\mu_c$. Thus the inequality in $\{\mu_c\}$ captures the inequality due to circumstances only. Then, given an inequality measure $I$, the opportunity share of earnings inequality can be defined as:

(2) $\theta^{d}_{\text{types}} = \dfrac{I(\{\mu_c\})}{I(F(y))}.$

3. In this approach, the unobservable term $v$ is confounded with $E$, and the individual is considered responsible for any random component that is not included in the vector of circumstances that may affect his/her outcome (Checchi and Peragine 2010).

4. See Checchi and Peragine (2010) and Checchi, Peragine, and Serlenga (2010) for details on the types and tranches approaches.

5. This measure is related to the ex ante view of equality of opportunity, which focuses on the differences between the outcome prospects of individuals with similar circumstances, as opposed to the second method, which is related to the ex post view of equality of opportunity and focuses on outcome inequalities among individuals who exert the same effort (Fleurbaey and Peragine 2009; Checchi, Peragine, and Serlenga 2010).

Inequality of opportunity can also be measured indirectly using a standardized distribution obtained by replacing each person’s earnings $y_i^{c}$ with $z_i^{c} = \dfrac{\mu}{\mu_c}\, y_i^{c}$, where $y_i^{c}$ is the earnings of individual $i$ in type $c$ and $\mu$ is overall mean earnings. The standardization removes all between-types inequality, leaving only within-type inequality, or inequality due to effort.
Hence, the share of inequality due to unequal opportunities can be computed residually by $\theta^{r}_{\text{types}} = 1 - I(\{z_i^{c}\})/I(F(y))$.

The direct and residual methods can yield different results; the only inequality measure for which the two methods give the same results is the mean log deviation (GE(0)), which has a path-independent decomposition when the arithmetic mean is used as the reference income or earnings (Foster and Shneyerov 2000).

TRANCHES. In the second partition, inequality of opportunity can be assessed by focusing on inequality within groups with similar effort levels. As previously, a smoothing transformation is applied to eliminate all inequality within tranches. The part of inequality due to unequal opportunities can be expressed as

(3) $\theta^{r}_{\text{tranches}} = 1 - \dfrac{I(\{\mu_e\})}{I(F(y))}$

where $\{\mu_e\}$ is a smoothed distribution in which each individual’s earnings is replaced by tranche-specific mean earnings. Inequality of opportunity can also be computed directly by suppressing all between-tranches inequality. As previously, a standardized distribution is obtained by reweighting all tranche distributions to equalize the means of the different effort groups. Each person’s earnings within a tranche $e$ of a type $c$, $y_i^{e,c}$, is replaced by $z_i^{e,c} = \dfrac{\mu}{\mu_e}\, y_i^{e,c}$. Inequality of opportunity can then be captured directly by $\theta^{d}_{\text{tranches}} = I(\{z_i^{e,c}\})/I(F(y))$.

As previously stated, when the mean log deviation inequality index is used, the residual and direct methods yield the same opportunity inequality measures.

Parametric Method

The parametric method, which is less data-demanding, can be used to measure inequality of opportunity and the effect of individual circumstances. Evaluating the extent of inequality of opportunity using parametric and nonparametric decompositions and comparing the estimates allows checking the consistency of the results and also indicates the plausible range of true opportunity inequality.
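As a point of comparison for the parametric model, the types and tranches decompositions of equations (2) and (3) can be illustrated with a minimal Python sketch. It is not the author's code: the function names, the synthetic data, and the rank-based assignment of tranches within each type are assumptions made purely for illustration. The sketch uses the mean log deviation, so the direct and residual shares should coincide up to floating-point error.

```python
import numpy as np


def mld(y):
    """Mean log deviation, GE(0), of a strictly positive earnings vector."""
    y = np.asarray(y, dtype=float)
    return float(np.mean(np.log(y.mean() / y)))


def group_means(y, labels):
    """Smoothing: replace each value with the mean of its group."""
    out = np.empty_like(y)
    for g in np.unique(labels):
        mask = labels == g
        out[mask] = y[mask].mean()
    return out


def tranche_labels(y, types, n_tranches=10):
    """Assumed effort proxy: quantile bin of earnings within one's own type."""
    labels = np.empty(len(y), dtype=int)
    for c in np.unique(types):
        idx = np.flatnonzero(types == c)
        order = np.argsort(y[idx], kind="stable")
        ranks = np.empty(len(idx), dtype=int)
        ranks[order] = np.arange(len(idx))
        labels[idx] = ranks * n_tranches // len(idx)
    return labels


def opportunity_shares(y, types, n_tranches=10):
    """Direct and residual opportunity shares for both partitions."""
    y = np.asarray(y, dtype=float)
    types = np.asarray(types)
    total, mu = mld(y), y.mean()

    # Types: inequality between type means is attributed to circumstances.
    mu_c = group_means(y, types)
    types_direct = mld(mu_c) / total
    types_residual = 1.0 - mld(y * mu / mu_c) / total

    # Tranches: inequality within effort groups is attributed to circumstances.
    mu_e = group_means(y, tranche_labels(y, types, n_tranches))
    tranches_residual = 1.0 - mld(mu_e) / total
    tranches_direct = mld(y * mu / mu_e) / total

    return {
        "types_direct": types_direct,
        "types_residual": types_residual,
        "tranches_direct": tranches_direct,
        "tranches_residual": tranches_residual,
    }


# Synthetic illustration only: four hypothetical types with different mean earnings.
rng = np.random.default_rng(0)
types = rng.integers(0, 4, size=2000)
earnings = np.exp(5.0 + 0.2 * types + rng.normal(0.0, 0.5, size=2000))
shares = opportunity_shares(earnings, types)
for name, value in shares.items():
    print(f"{name}: {value:.3f}")
```

With an inequality index other than GE(0), the direct and residual figures printed by such a sketch would generally differ, which is the reason the article reports results for the mean log deviation.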
The parametric analysis follows the work of Bourguignon, Ferreira, and Menéndez (2007), estimating opportunity inequality as the difference between observed earnings inequality and the inequality that would prevail if there were no differences in circumstances. Let $\tilde{F}(\tilde{y})$ be the counterfactual earnings distribution when circumstances are identical for all individuals. The opportunity share of earnings inequality can be defined as

(4) $\Theta_P = 1 - \dfrac{I(\tilde{F}(\tilde{y}))}{I(F(y))}.$

The first step in computing $\Theta_P$ is to estimate a specific model of equation (1). Following Bourguignon, Ferreira, and Menéndez (2007), the earnings function is expressed in the following log-linear form:

(5) $\ln(y_i) = C_i \alpha + E_i \beta + v_i, \qquad E_i = A C_i + \varepsilon_i$

where $\alpha$ and $\beta$ are vectors of coefficients, $A$ is a matrix of coefficients specifying the effects of the circumstance variables on effort, and $\varepsilon_i$ is an error term. Model (5) can be expressed in reduced form as

(6) $\ln(y_i) = C_i \delta + \eta_i$

where $\delta = \alpha + \beta A$ and $\eta_i = v_i + \varepsilon_i \beta$.

Inequality of opportunity can be measured using equation (4), where the counterfactual distribution is obtained by replacing $y_i$ with its estimated value from equation (6), which can be expressed as $\tilde{y}_i = \exp(\bar{C}\hat{\delta} + \hat{\eta}_i)$.6

6. Checchi, Peragine, and Serlenga (2010) computed parametric counterparts for the nonparametric measures of opportunity inequality calculated using both the types and tranches approaches. Here the analysis is limited to estimation of the parametric alternatives to inequality of opportunity indexes measured by the types approach, implicitly treating all unobserved variance in $\eta$ as the only true source of effort.

The parametric approach allows estimation of the partial effects of one or some circumstance variables on earnings, while controlling for the others, by simulating distributions such as

$\tilde{y}_i^{\,j} = \exp\left(\bar{C}^{j}\hat{\delta}^{j} + C_i^{h \neq j}\hat{\delta}^{h \neq j} + \hat{\eta}_i\right),$

where $\tilde{F}(\tilde{y}^{\,j})$ is the counterfactual earnings distribution obtained by keeping circumstance $C^{j}$ constant.
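The reduced-form steps in equations (4) to (6) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the author's implementation: the reduced form is fitted by ordinary least squares with NumPy, "identical circumstances" is implemented by setting circumstance columns to their sample means, the data are synthetic, and the name `parametric_share` and its `hold` argument are hypothetical.

```python
import numpy as np


def mld(y):
    """Mean log deviation, GE(0), of a strictly positive earnings vector."""
    y = np.asarray(y, dtype=float)
    return float(np.mean(np.log(y.mean() / y)))


def parametric_share(y, C, hold=None):
    """Opportunity share 1 - I(counterfactual) / I(actual), as in equation (4).

    y    : positive earnings, shape (N,)
    C    : circumstance variables, shape (N, k), without an intercept column
    hold : column indices held constant (assumed: at their sample means);
           None equalizes all circumstances, a single index gives a partial effect.
    """
    y = np.asarray(y, dtype=float)
    C = np.asarray(C, dtype=float)
    X = np.column_stack([np.ones(len(y)), C])

    # Reduced form, equation (6): ln(y_i) = C_i * delta + eta_i, fitted by OLS.
    delta_hat, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)
    eta_hat = np.log(y) - X @ delta_hat

    # Counterfactual: selected circumstances equalized, residuals retained.
    cols = range(C.shape[1]) if hold is None else hold
    X_cf = X.copy()
    for j in cols:
        X_cf[:, j + 1] = X[:, j + 1].mean()
    y_cf = np.exp(X_cf @ delta_hat + eta_hat)

    return 1.0 - mld(y_cf) / mld(y)


# Synthetic illustration only: two hypothetical binary circumstances.
rng = np.random.default_rng(1)
n = 3000
C = rng.integers(0, 2, size=(n, 2)).astype(float)
y = np.exp(5.0 + 0.5 * C[:, 0] + 0.1 * C[:, 1] + rng.normal(0.0, 0.5, size=n))

total_share = parametric_share(y, C)          # all circumstances equalized
share_c0 = parametric_share(y, C, hold=[0])   # partial effect of circumstance 0
share_c1 = parametric_share(y, C, hold=[1])   # partial effect of circumstance 1
print(f"total: {total_share:.3f}, C0: {share_c0:.3f}, C1: {share_c1:.3f}")
```

Because the residuals are kept in the counterfactual, everything not explained by the observed circumstances is treated as effort, which is one reason such shares are read as lower bounds.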
The inequality share specific to circumstance $j$ can be computed by:

$\Theta_P^{j} = 1 - \dfrac{I(\tilde{F}(\tilde{y}^{\,j}))}{I(F(y))}.$

II. DATA

The empirical analysis uses data from the Egypt Labor Force Sample Survey of 1988 (LFSS 88), the Egypt Labor Market Survey of 1998 (ELMS 98), and the Egypt Labor Market Panel Survey of 2006 (ELMPS 06), carried out by the Economic Research Forum and the Central Agency for Public Mobilization and Statistics.7 The surveys were conducted on nationally representative samples of households, and methodology and data were selected to ensure comparability.

The surveys include information on household characteristics; individual earnings, education, and employment status; and parents’ education, occupation, and employment status. The outcome variable is individuals’ current earnings, measured as real monthly earnings from all occupations. The analytical sample was restricted to individuals aged 15–65 with positive earnings.

Computing the opportunity share of earnings inequality for the entire country is important to the design of equal-opportunity policies, but it fails to capture the differential intensity of opportunity inequality across areas and population groups. Disparities in labor market participation between gender and age groups and differences in labor market conditions between rural and urban areas influence the distribution of earnings and could affect inequality of opportunity measures.

Table 1 reports the labor market participation and the rates of employment with positive earnings by area, gender, and age group for each survey round. Large gender differences are observed in labor force participation. Men’s participation greatly exceeds women’s. Labor force participation is also larger in urban areas and for the mid-age cohort.8 The percentage of individuals of working age in employment with positive earnings is much higher in urban than in rural areas and is much larger for men than for women.
The labor market participation rate increases between 1998 and 2006, particularly for women, and there is a slight rise in the employment rate with positive earnings for women and a decline for the other population subgroups.

7. For more details about the surveys, see Assaad (2002).

8. Labor market participation decisions may depend on an individual’s circumstances. Sample selection bias, especially likely for women, is not addressed here because of the complexity of the procedure for correcting for this bias.

TABLE 1. Rates of Labor Force Participation and Employment (percent)

| Survey year | Subgroup | Labor force participation rate | Employment rate with positive earnings |
| --- | --- | --- | --- |
| 1988 | Rural | na | 36.2 |
| | Urban | na | 72.2 |
| | Women | na | 28.4 |
| | Men | 78.9 | 62.8 |
| | Ages 15–29 | na | 53.2 |
| | Ages 30–44 | na | 56.7 |
| | Ages 45–65 | na | 39.2 |
| | Total | na | 50.8 |
| 1998 | Rural | 48.1 | 42.7 |
| | Urban | 50.3 | 71.9 |
| | Women | 22.1 | 27.4 |
| | Men | 76.2 | 70.3 |
| | Ages 15–29 | 41.2 | 54.0 |
| | Ages 30–44 | 61.1 | 59.2 |
| | Ages 45–65 | 49.2 | 47.4 |
| | Total | 49.1 | 54.0 |
| 2006 | Rural | 54.1 | 42.0 |
| | Urban | 53.9 | 71.7 |
| | Women | 27.8 | 28.9 |
| | Men | 80.9 | 66.8 |
| | Ages 15–29 | 44.9 | 53.1 |
| | Ages 30–44 | 68.2 | 58.0 |
| | Ages 45–65 | 55.3 | 46.4 |
| | Total | 54.0 | 53.1 |

na is not available; only data on labor force participation rate with extended market definition are available.
Source: Author’s calculation based on data from the Egypt Labor Force Sample Survey of 1988, the Egypt Labor Market Survey of 1998, and the Egypt Labor Market Panel Survey of 2006.

Because heterogeneity in population composition and in labor force participation may distort the aggregate picture of inequality of opportunity, opportunity inequality indices are also computed for population subgroups. Each survey sample is partitioned by area of birth (urban, rural), gender, and age (15–29, 30–44, and 45–65).9

Parametric and nonparametric decompositions are applied for each population subgroup and for the entire population in each survey year.
The data from the three surveys are also pooled to form a single dataset, and the same procedures are applied to the entire sample. This helps clarify the extent and evolution of inequality of opportunity and its importance in shaping earnings differences across population subgroups and time.

Sample sizes are 4,258 for LFSS 88, 4,740 for ELMS 98, and 7,501 for ELMPS 06. Missing information on father’s occupation and mother’s employment status reduced the samples to 4,135, 4,048, and 6,499 economically active individuals who are representative of the Egyptian workforce.

9. Region of birth is included because region of residence might be endogenous. However, the number of migrants is quite small, suggesting a limited potential bias. The number of individuals living in a different region than they were born in was 134 for LFSS 88, 323 for ELMS 98, and 503 for ELMPS 06.

Nonresponses on family background are likely to be nonrandom and therefore to introduce sample selection bias. The rate of missing information on parental employment and occupation status was 7–9 percent. It was lower for the oldest age group and for LFSS 88. The effect of selective nonresponse is investigated by comparing the composition of the final sample with the sample including individuals with missing information. Although nonresponse was found to be statistically selective, the two samples were highly similar in almost every respect.

As a robustness check, an earnings regression on all the circumstance variables for which there are no missing observations was run on both the full sample and the final sample used in the empirical analysis, and the coefficient estimates were compared. The results, reported in table 2, suggest that selective nonresponse did not introduce large biases, since the coefficients do not differ statistically (at the 95 percent confidence level) between the final and full sample in each survey year and in the pooled survey data.
The circumstance variables available in the three surveys are father’s and mother’s education and employment, father’s occupation status when the individual was age 15, and region of birth (Metropolitan, Lower Egypt, or Upper Egypt).10 Using all these variables in the nonparametric analysis is problematic because of an insufficient number of observations, which would result in a large number of empty or small cells. The quality of the nonparametric inequality of opportunity measures depends on the quality of the estimates for the type/tranche-specific means. The sampling variance of these means could be very large for cells with few observations and would cause an upward bias in the nonparametric estimates of opportunity inequality (Ferreira and Gignoux forthcoming).

Therefore, both the parametric and the nonparametric decompositions consider only father’s and mother’s education, father’s occupation status, and the individual’s region of birth. The number of categories for each variable was restricted to three or fewer in order to reduce the number of circumstance groups. Gender is also used as a circumstance variable when the sample is not subdivided by gender.

Father’s education is coded into three categories (none, primary and preparatory, and secondary and tertiary) and mother’s into two (none, and primary and more).11 Father’s occupation is coded as skilled agricultural workers and other. Region of birth is coded as Metropolitan (Greater Cairo, Alexandria, and Suez), Lower Egypt, and Upper Egypt.12

In the tranches approach, based on the hypothesis that individuals at the same quantile of the earnings distribution have expended the same degree of effort, the distribution of earnings, conditional on circumstances, was divided into deciles.

10. Information on mother’s occupation is also available, but this variable was disregarded because of the large number of missing entries.

11.
Only two categories were used for mother’s education because of the small number of observations in the category secondary and tertiary.

12. Place of birth may be capturing effort as well as circumstances for individuals who migrate since the age of migration is unknown.

TABLE 2. Effects of Selective Nonresponse on Earnings Regression Coefficients

| Variable | 1988, sample used | 1988, full sample | 1998, sample used | 1998, full sample | 2006, sample used | 2006, full sample | All years, sample used | All years, full sample |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Male dummy variable | 0.398*** (0.030) | 0.398*** (0.030) | 0.238*** (0.021) | 0.2445*** (0.020) | 0.352*** (0.023) | 0.362*** (0.021) | 0.334*** (0.014) | 0.338*** (0.014) |
| Age | 0.021*** (0.001) | 0.021*** (0.001) | 0.021*** (0.001) | 0.021*** (0.001) | 0.020*** (0.001) | 0.021*** (0.001) | 0.021*** (0.001) | 0.021*** (0.000) |
| Mother’s years of education | 0.031*** (0.004) | 0.031*** (0.004) | 0.025*** (0.003) | 0.023*** (0.002) | 0.030*** (0.002) | 0.032*** (0.002) | 0.029*** (0.002) | 0.029*** (0.002) |
| Father’s years of education | 0.003 (0.009) | 0.003 (0.009) | 0.004 (0.003) | 0.004 (0.003) | 0.001 (0.003) | 0.002 (0.003) | 0.000 (0.002) | 0.000 (0.002) |
| Region of birth (omitted = Lower Egypt) | | | | | | | | |
| Metropolitan | 0.179*** (0.025) | 0.180*** (0.025) | 0.253*** (0.022) | 0.238*** (0.020) | 0.241*** (0.022) | 0.216*** (0.020) | 0.226*** (0.013) | 0.214*** (0.013) |
| Upper Egypt | −0.084*** (0.027) | −0.085*** (0.026) | −0.065*** (0.021) | −0.060*** (0.020) | −0.028 (0.020) | −0.037 (0.019) | −0.051*** (0.013) | −0.053*** (0.012) |
| Constant | 4.853*** (0.043) | 4.851*** (0.043) | 4.781*** (0.039) | 4.782*** (0.036) | 4.960*** (0.039) | 4.934*** (0.036) | 4.903*** (0.025) | 4.899*** (0.024) |
| Sample size | 4135 | 4258 | 4048 | 4740 | 6499 | 7501 | 14682 | 16499 |
| Adjusted R-squared | 0.20 | 0.21 | 0.24 | 0.23 | 0.17 | 0.17 | 0.21 | 0.21 |

* Significant at the 10 percent level; ** significant at the 5 percent level; *** significant at the 1 percent level.
Note: The dependent variable is the logarithm of real monthly earnings. Numbers in parentheses are bootstrapped standard errors based on 100 replications.
Source: Author’s calculation based on data from the Egypt Labor Force Sample Survey of 1988, the Egypt Labor Market Survey of 1998, and the Egypt Labor Market Panel Survey of 2006.

Since using more circumstance variables and a finer partition of categories would better capture the contribution of unequal opportunities to earnings inequality, inequality of opportunity indices were also computed parametrically, exploiting the richness of the dataset on family background, and the results were compared with the previous estimates. The parametric and nonparametric measures based on comparable circumstance variables are, therefore, complemented with a parametric decomposition using additional circumstances and refining the categories for each circumstance.

The variables used in this decomposition are father’s and mother’s education, measured by the number of years of schooling; employment for both parents, grouped into three categories (wage worker, employer, and self-employed and work for family); father’s occupation status, grouped into four categories (high status, medium status, low status, and skilled agricultural worker); and dummy variables for region of birth.13 Birth region is coded into three regions as before, and urban and rural areas in Lower and Upper Egypt are captured by a dummy variable. Gender and age are also used as circumstance variables when the sample is not subdivided by gender or age.

Table 3 presents descriptive statistics for all survey years combined. Earnings are higher for male subsamples, for the oldest cohort, and for urban areas. Father’s mean number of years of schooling is higher in urban areas of birth and significantly higher for women than for men. This suggests that women with educated parents are more likely to enter the labor force, which points to possible selection bias.

Table 4 shows the mean and standard deviation of real monthly earnings for each survey year.
Earnings declined slightly between 1988 and 1998 but then increased between 1998 and 2006. While the changes in mean earnings are not very large, dispersion in earnings increased considerably in 2006, especially for women and the mid-age cohort. Dispersion was significantly lower in rural areas than in urban areas between 1988 and 1998, but it increased considerably in 2006, when it was slightly higher in rural areas.

13. Parents’ education, reported in the surveys in discrete levels, is converted into years as follows: illiterate (0 years); read and write (2); primary (6); preparatory (9); general or vocational secondary (12); postsecondary (14); university four years (16); university five years (17); and postgraduate (18). The nine categories of father’s occupational position (based on the occupational classification used by the Central Agency for Public Mobilization and Statistics) were recoded into four groups: high status (senior officers and managers, professionals, and professors); medium status (clerks, service and market sales, and craft); low status (plant and machine operators and elementary occupations); and skilled agricultural workers.

TABLE 3. Descriptive Statistics, All Survey Years

| Statistic | Rural | Urban | Women | Men | Ages 15–29 | Ages 30–44 | Ages 45–65 | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Mean monthly earnings (Egyptian pounds) | 444.4 (1,155.0) | 653.8 (1,084.3) | 481.0 (1,377.2) | 572.3 (1,051.5) | 411.6 (640.2) | 590.2 (1,422.8) | 721.2 (1,125.3) | 554.4 (1,123.3) |
| Mean father’s years of schooling | 1.5 (2.9) | 4.1 (5.0) | 4.5 (5.1) | 2.5 (4.0) | 3.1 (4.5) | 2.9 (4.3) | 2.5 (4.1) | 2.9 (4.3) |
| Mean mother’s years of schooling | 1.1 (2.9) | 2.4 (4.1) | 2.4 (4.2) | 1.6 (3.5) | 2.4 (4.2) | 1.7 (3.5) | 0.9 (2.5) | 1.8 (3.6) |
| Father’s employment (%) | | | | | | | | |
| Wage worker | 82.7 | 85.5 | 85.3 | 83.9 | 81.5 | 85.9 | 85.1 | 84.1 |
| Employer | 14.9 | 12.7 | 12.6 | 14.0 | 14.0 | 13.1 | 14.3 | 13.7 |
| Self-employed | 2.4 | 1.9 | 2.1 | 2.2 | 4.5 | 1.0 | 0.6 | 2.2 |
| Father’s occupation status (%) | | | | | | | | |
| High status | 11.2 | 33.6 | 35.1 | 19.9 | 19.6 | 24.4 | 25.1 | 22.87 |
| Medium status | 24.6 | 41.6 | 35.2 | 33.0 | 37.7 | 32.8 | 28.4 | 33.4 |
| Low status | 6.9 | 12.2 | 9.1 | 9.8 | 12.1 | 9.5 | 6.5 | 9.7 |
| Skilled agricultural worker | 57.3 | 12.6 | 20.6 | 37.3 | 30.6 | 33.3 | 40.0 | 34.0 |
| Mother’s employment (%) | | | | | | | | |
| Wage worker | 18.4 | 15.5 | 14.1 | 17.6 | 32.1 | 11.4 | 5.5 | 16.9 |
| Employer | 8.6 | 2.7 | 3.4 | 6.1 | 11.9 | 3.1 | 0.9 | 5.55 |
| Self-employed | 73.0 | 81.8 | 82.5 | 76.4 | 55.9 | 85.5 | 93.6 | 77.6 |
| Region of birth (%) | | | | | | | | |
| Metropolitan | 0.0 | 61.5 | 41.9 | 30.0 | 31.6 | 32.5 | 33.2 | 32.3 |
| Lower Egypt | 59.5 | 22.5 | 39.6 | 40.2 | 39.7 | 40.4 | 40.0 | 40.06 |
| Upper Egypt | 40.6 | 16.0 | 18.5 | 29.9 | 28.7 | 27.2 | 26.8 | 27.65 |
| Number of groups in nonparametric approach | | | | | | | | |
| Observed number of groups | 39.0 | 56.0 | 26.0 | 30.0 | 55.0 | 56.0 | 52.0 | 56.0 |
| Mean number of observations per group | 144.0 | 174.0 | 125.0 | 403.0 | 93.0 | 111.0 | 74.0 | 274.0 |
| Number of observations | 5947 | 10552 | 3476 | 13023 | 6039 | 6592 | 3868 | 16499 |

Note: Numbers in parentheses are standard deviations. Results are weighted by appropriate sampling weights to reflect the characteristics of the Egyptian population.
Source: Author’s calculation based on data from the Egypt Labor Force Sample Survey of 1988, the Egypt Labor Market Survey of 1998, and the Egypt Labor Market Panel Survey of 2006.

TABLE 4. Descriptive Statistics, Real Monthly Earnings (Egyptian pounds)

| Subgroup | 1988 mean | 1988 standard deviation | 1998 mean | 1998 standard deviation | 2006 mean | 2006 standard deviation |
| --- | --- | --- | --- | --- | --- | --- |
| Rural | 414.6 | 280 | 334.7 | 222.3 | 543.1 | 1694.1 |
| Urban | 640.5 | 652.8 | 500.7 | 482.3 | 787 | 1543.6 |
| Women | 419.4 | 395.8 | 377.7 | 551.4 | 595.4 | 2021.9 |
| Men | 581.5 | 571.9 | 429.4 | 337.4 | 682.2 | 1512.5 |
| Ages 15–29 | 408.6 | 321.7 | 318.7 | 221.1 | 481 | 918 |
| Ages 30–44 | 588.5 | 495.5 | 416.4 | 304.4 | 732.5 | 2120.2 |
| Ages 45–65 | 754.8 | 838.3 | 559.3 | 584.6 | 848.2 | 1518 |
| Total | 548 | 544.1 | 419.7 | 387.4 | 665.2 | 1625 |

Source: Author’s calculation based on data from the Egypt Labor Force Sample Survey of 1988, the Egypt Labor Market Survey of 1998, and the Egypt Labor Market Panel Survey of 2006.

III. ESTIMATION RESULTS

The parametric and nonparametric methods were applied to measure the degree of inequality of opportunity for earnings in Egypt. For the entire population, for each population subgroup, and for each survey year, table 5 displays the estimates of overall earnings inequality and of the degree of inequality of opportunity using the mean log deviation, GE(0), which is the only inequality measure with a path-independent decomposition.

Aggregate Analysis

The level of overall earnings inequality in Egypt, measured with mean log deviation, averaged 34 percent for the entire period 1988–2006. The parametric and nonparametric decompositions suggest that 11–20 percent of this inequality can be attributed to unequal opportunities associated with only five circumstance variables: gender, father’s and mother’s education, father’s occupation, and individual’s region of birth. The types nonparametric analysis and the parametric analysis yield broadly similar results for the entire period, while the tranches nonparametric analysis yields higher opportunity inequality shares.

Regardless of the decomposition employed, the results should be viewed as lower-bound estimates of the share of inequality due to all circumstances. Despite the richness of the circumstance variables in the datasets, many relevant circumstance variables, such as family wealth, quality of parents’ education, and innate ability, remain unobserved.
Adding more circumstance variables, or further refining the subdivision of categories within each circumstance variable, would increase (but cannot reduce) the share of inequality arising from circumstance inequality.

There is a clear pattern of increasing earnings inequality in recent years. Overall earnings inequality declined slightly, from 26.7 percent in 1988 to 21.9 percent in 1998, before increasing substantially to 42.3 percent in 2006. All the population subgroups experienced an increase in earnings inequality. Inequality was higher in urban areas, among women, and among the oldest cohort in 1988, but these differences declined over time.

TABLE 5. Estimates of Earnings Inequality and Inequality of Opportunity

| Survey year | Variable | Overall earnings inequality | Tranches approach (nonparametric): opportunity inequality | Tranches approach (nonparametric): opportunity share | Types approach (nonparametric): opportunity inequality | Types approach (nonparametric): opportunity share | Parametric estimate: opportunity inequality | Parametric estimate: opportunity share |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1988 | Rural | 0.179*** (0.011) | 0.040*** (0.007) | 0.224*** (0.024) | 0.021** (0.007) | 0.117*** (0.034) | 0.011 (0.007) | 0.059 (0.039) |
| | Urban | 0.288*** (0.012) | 0.074*** (0.006) | 0.257*** (0.015) | 0.045*** (0.006) | 0.156*** (0.019) | 0.035*** (0.006) | 0.122*** (0.017) |
| | Men | 0.246*** (0.011) | 0.055*** (0.005) | 0.225*** (0.013) | 0.035*** (0.005) | 0.141*** (0.018) | 0.029*** (0.005) | 0.119*** (0.019) |
| | Women | 0.300*** (0.019) | 0.077*** (0.013) | 0.256*** (0.029) | 0.042*** (0.013) | 0.140*** (0.035) | 0.029*** (0.009) | 0.097*** (0.026) |
| | Ages 15–29 | 0.270*** (0.014) | 0.067*** (0.008) | 0.247*** (0.020) | 0.035*** (0.006) | 0.130*** (0.019) | 0.020*** (0.005) | 0.075*** (0.018) |
| | Ages 30–44 | 0.190*** (0.011) | 0.064*** (0.008) | 0.337*** (0.026) | 0.048*** (0.009) | 0.251*** (0.034) | 0.041*** (0.007) | 0.218*** (0.026) |
| | Ages 45–65 | 0.258*** (0.025) | 0.094*** (0.012) | 0.363*** (0.034) | 0.073*** (0.012) | 0.282*** (0.037) | 0.060*** (0.012) | 0.232*** (0.049) |
| | Total | 0.267*** (0.010) | 0.071*** (0.005) | 0.268*** (0.011) | 0.045*** (0.005) | 0.170*** (0.017) | 0.037*** (0.004) | 0.140*** (0.014) |
| 1998 | Rural | 0.178*** (0.009) | 0.022*** (0.002) | 0.121*** (0.013) | 0.009*** (0.002) | 0.051*** (0.012) | 0.007*** (0.002) | 0.039*** (0.011) |
| | Urban | 0.222*** (0.013) | 0.039*** (0.004) | 0.173*** (0.013) | 0.028*** (0.004) | 0.126*** (0.014) | 0.023*** (0.003) | 0.105*** (0.015) |
| | Men | 0.210*** (0.007) | 0.033*** (0.003) | 0.158*** (0.012) | 0.026*** (0.004) | 0.121*** (0.015) | 0.025*** (0.003) | 0.117*** (0.013) |
| | Women | 0.240*** (0.039) | 0.049*** (0.011) | 0.203*** (0.023) | 0.031* (0.012) | 0.128*** (0.026) | 0.021** (0.006) | 0.086*** (0.019) |
| | Ages 15–29 | 0.176*** (0.011) | 0.033*** (0.004) | 0.187*** (0.017) | 0.027*** (0.004) | 0.153*** (0.018) | 0.021*** (0.004) | 0.116*** (0.016) |
| | Ages 30–44 | 0.182*** (0.008) | 0.038*** (0.004) | 0.208*** (0.019) | 0.031*** (0.005) | 0.170*** (0.021) | 0.027*** (0.004) | 0.147*** (0.019) |
| | Ages 45–65 | 0.223*** (0.024) | 0.060*** (0.009) | 0.269*** (0.025) | 0.047*** (0.013) | 0.209*** (0.034) | 0.036*** (0.006) | 0.161*** (0.026) |
| | Total | 0.219*** (0.011) | 0.039*** (0.003) | 0.177*** (0.011) | 0.029*** (0.003) | 0.130*** (0.012) | 0.026*** (0.003) | 0.117*** (0.012) |
| 2006 | Rural | 0.404*** (0.061) | 0.081*** (0.019) | 0.201*** (0.029) | 0.055* (0.024) | 0.136** (0.042) | 0.015*** (0.003) | 0.038*** (0.008) |
| | Urban | 0.423*** (0.028) | 0.072*** (0.011) | 0.170*** (0.022) | 0.053*** (0.012) | 0.125*** (0.023) | 0.024*** (0.007) | 0.057*** (0.017) |
| | Men | 0.412*** (0.031) | 0.069*** (0.008) | 0.167*** (0.014) | 0.046*** (0.010) | 0.110*** (0.020) | 0.028*** (0.006) | 0.069*** (0.014) |
| | Women | 0.445*** (0.069) | 0.084*** (0.018) | 0.189*** (0.027) | 0.044 (0.030) | 0.100* (0.050) | 0.022*** (0.004) | 0.049*** (0.012) |
| | Ages 15–29 | 0.345*** (0.042) | 0.077*** (0.010) | 0.224*** (0.013) | 0.054** (0.020) | 0.157*** (0.036) | 0.034*** (0.008) | 0.099*** (0.021) |
| | Ages 30–44 | 0.453*** (0.047) | 0.103*** (0.024) | 0.227*** (0.034) | 0.092*** (0.027) | 0.203*** (0.042) | 0.03 (0.019) | 0.066 (0.045) |
| | Ages 45–65 | 0.381*** (0.047) | 0.092*** (0.015) | 0.242*** (0.027) | 0.061*** (0.018) | 0.161*** (0.035) | 0.027** (0.008) | 0.070** (0.025) |
| | Total | 0.423*** (0.030) | 0.077*** (0.007) | 0.181*** (0.012) | 0.049*** (0.012) | 0.116*** (0.021) | 0.023** (0.008) | 0.055** (0.019) |
| All years | Rural | 0.320*** (0.034) | 0.064*** (0.011) | 0.199*** (0.017) | 0.032* (0.015) | 0.099** (0.036) | 0.021 (0.011) | 0.065 (0.033) |
| | Urban | 0.341*** (0.013) | 0.065*** (0.005) | 0.190*** (0.008) | 0.038*** (0.006) | 0.112*** (0.015) | 0.039*** (0.004) | 0.115*** (0.010) |
| | Men | 0.329*** (0.017) | 0.060*** (0.005) | 0.182*** (0.007) | 0.035*** (0.006) | 0.105*** (0.016) | 0.039*** (0.003) | 0.119*** (0.009) |
| | Women | 0.362*** (0.045) | 0.070*** (0.009) | 0.193*** (0.010) | 0.02 (0.011) | 0.056* (0.023) | 0.019** (0.006) | 0.052** (0.020) |
| | Ages 15–29 | 0.295*** (0.023) | 0.065*** (0.006) | 0.220*** (0.007) | 0.038*** (0.010) | 0.127*** (0.023) | 0.038*** (0.005) | 0.129*** (0.013) |
| | Ages 30–44 | 0.332*** (0.032) | 0.075*** (0.009) | 0.225*** (0.016) | 0.061*** (0.015) | 0.184*** (0.031) | 0.049*** (0.009) | 0.148*** (0.029) |
| | Ages 45–65 | 0.315*** (0.031) | 0.079*** (0.010) | 0.251*** (0.019) | 0.044*** (0.008) | 0.141*** (0.022) | 0.042*** (0.006) | 0.134*** (0.021) |
| | Total | 0.341*** (0.017) | 0.068*** (0.005) | 0.200*** (0.007) | 0.036*** (0.005) | 0.106*** (0.012) | 0.039*** (0.004) | 0.113*** (0.011) |

* Significant at the 10 percent level; ** significant at the 5 percent level; *** significant at the 1 percent level.
Note: Numbers in parentheses are bootstrap standard deviations based on 100 replications.
Source: Author’s calculation based on data from the Egypt Labor Force Sample Survey of 1988, the Egypt Labor Market Survey of 1998, and the Egypt Labor Market Panel Survey of 2006.
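The opportunity shares in table 5 are ratios of opportunity inequality to overall inequality, so they can be approximately recomputed from the published levels. The short Python check below does this for the Total rows of the tranches approach. The dictionary layout and the function name are illustrative assumptions, and small discrepancies are expected because the published levels are rounded to three decimals.

```python
# (overall GE(0), opportunity GE(0), reported share): "Total" rows of table 5,
# tranches approach. Values are copied from the table; the layout is illustrative.
TRANCHES_TOTAL = {
    1988: (0.267, 0.071, 0.268),
    1998: (0.219, 0.039, 0.177),
    2006: (0.423, 0.077, 0.181),
}


def opportunity_share(overall, opportunity):
    """Share of overall inequality attributed to unequal opportunities."""
    return opportunity / overall


recomputed = {
    year: opportunity_share(overall, opportunity)
    for year, (overall, opportunity, _reported) in TRANCHES_TOTAL.items()
}

for year, (_overall, _opportunity, reported) in TRANCHES_TOTAL.items():
    print(f"{year}: recomputed {recomputed[year]:.3f}, reported {reported:.3f}")
```

The check also makes the mechanics of the decline visible: between 1988 and 2006 the numerator barely moves (0.071 to 0.077) while the denominator rises from 0.267 to 0.423, so the share falls even though the level of opportunity inequality does not.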
Many factors could have contributed to the increase in earnings inequality, including changes in the education composition of the Egyptian labor force, increasing returns to education and experience, and a transition to a more decentralized and market-oriented economy, but it seems likely that much of the rise was driven by variable and accelerating inflation.14 Inflation was very unstable over this 19-year period: high and volatile over 1988–91, steadily declining over 1991–2002, and steadily rising after 2002, especially in 2004.15 Many studies have found inequality-increasing effects of inflation (see, for example, Ferreira, Leite, and Litchfield 2008).16

The trend in inequality of opportunity levels is similar to that of overall inequality, declining between 1988 and 1998 and then increasing from 1998 onwards. Nevertheless, the variations in inequality of opportunity levels over the entire period are much less pronounced than those in overall inequality. The nonparametric measures posted a slight increase over 1988–2006, while the parametric estimates fell slightly. The differences in the levels of inequality of opportunity at the beginning and end of the period are barely statistically significant. Since overall earnings inequality increased significantly over the entire period, while the levels of inequality of opportunity were generally stable, the opportunity share of inequality declined sharply between the late 1980s and the mid-2000s.

Figure 1 reveals a similar downward trend for both parametric and nonparametric estimates of the proportion of earnings inequality attributable to unequal opportunities. The contribution of opportunity to inequality fell from 14–27 percent in 1988 to 6–18 percent in 2006, depending on the measure used. The parametric estimates are systematically lower than the nonparametric estimates.
14. A Mincerian regression was run to see whether improvements in the educational attainment of the labor force contributed to the rise of inequality. The evidence suggests that returns to education increased between 1988 and 2006, while returns to experience declined. However, the results show a substantial increase of returns to education and experience over 1988–2006 in rural areas, suggesting that the greater dispersion of earnings, particularly in rural areas, was caused by the increasing returns to education and experience.

15. See World Bank (http://data.worldbank.org/country/egypt-arab-republic) or International Monetary Fund data (www.imf.org/external/data.htm).

16. The measures of inequality for Egypt provided by the World Bank range from a mean log deviation of 16.9 percent in 1990/91 to 17.8 percent in 2004/05, with a slight drop to 15.5 percent in 1995/96. Although the current study finds a broadly similar pattern of variability, the inequality estimates are higher and suggest a much greater increase in recent years. The World Bank inequality estimates are based on consumption and expenditure data, which are often considerably lower than estimates based on income and labor earnings data. Moreover, earnings tend to be more volatile and more sensitive to macroeconomic fluctuations than consumption and expenditures, which are likely to be closer to permanent income (Barros and others 2009).

FIGURE 1. Parametric and Nonparametric Estimates of the Share of Inequality of Opportunity for the Egyptian Labor Force (with confidence intervals at 95 percent)
Source: Author’s calculations based on data from the Egypt Labor Force Sample Survey of 1988, the Egypt Labor Market Survey of 1998, and the Egypt Labor Market Panel Survey of 2006.

The opportunity inequality shares for the entire country for each survey year measured by the tranches method dominate the types and parametric measures. Although the types method yields higher results than the parametric decomposition, the difference turns out to be (borderline) significant only for 2006 and insignificant for the other survey rounds. These differences are likely the result of small-sample biases that raise spurious sampling variation in nonparametric decomposition. The problem is particularly acute in the tranches approach, where each cell is subdivided into deciles. Another plausible explanation is the ability of the tranches approach to assess in a finer way the individual earnings gaps attributable to circumstances (Aaberge, Mogstad, and Peragine 2011). The parametric and types methods focus on the inequality between social groups identified by their circumstances and are less sensitive than the tranches method to inequalities between individuals within the same social group. These parametric and types methods depend on group-specific mean income and fail to capture the effect on inequality of opportunity of Pigou-Dalton redistribution within social types, while the tranches approach does (Checchi and Peragine 2010; Checchi, Peragine, and Serlenga 2010).

Between 1988 and 1998, when the economic adjustment program was implemented, the level of overall earnings inequality fell 18 percent and that of opportunity inequality fell by 30–45 percent.
The earnings gaps between social groups and among individuals at the same effort level have narrowed, presumably because of the expansion of education for underprivileged children, moderate macroeconomic stability, and the resumption of growth during this period.17 Inequality among individuals at the same effort level appears to have declined more rapidly than inequality between social types, explaining the larger decline observed in the opportunity share measured by the tranches method.

From 1998 onward, earnings gaps widened again, with the gaps widening less between social groups than within effort groups. The economic policy reforms implemented in Egypt since 2000, which include privatization, deregulation, and progressive trade liberalization, facilitated the transition to a more market-oriented economy. Despite the positive payoffs of these reforms, they might have brought about distributional changes that widened income differentials. Thus the increase in inequality during this period might be associated with the market-oriented reforms and rising inflation.

While inequality of opportunity levels rose nearly to their 1988 level, overall earnings differentials widened even more, leading to a decrease in the share of inequality attributed to opportunities. It follows, then, that earnings inequality was due increasingly to differential effort. However, the decline in the share of inequality attributable to opportunities does not necessarily mean that the true opportunity share of inequality has fallen or that differences in effort have increased substantially. The possibility of underestimation problems attributable to omitted unobserved circumstance variables or to transitory earnings components cannot be ruled out.
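The discussion above contrasts gaps between social groups (the types approach) with gaps among individuals at the same effort level (the tranches approach). The sketch below shows one common way to compute the two shares with the mean log deviation; it follows the ex ante and ex post logic described in Checchi and Peragine (2010) only loosely and is not necessarily the author's implementation. The data are synthetic, the function names are hypothetical, and effort is proxied by the within-type earnings rank, which is an assumption of this sketch.

```python
import numpy as np

def mld(y):
    """Mean log deviation: ln(mean(y)) - mean(ln(y)) for positive y."""
    y = np.asarray(y, dtype=float)
    return float(np.log(y.mean()) - np.log(y).mean())

def types_share(y, types):
    """Between-type share: each person is assigned the mean earnings of their type."""
    y = np.asarray(y, dtype=float)
    types = np.asarray(types)
    smoothed = np.empty_like(y)
    for t in np.unique(types):
        mask = types == t
        smoothed[mask] = y[mask].mean()
    return mld(smoothed) / mld(y)

def tranches_share(y, types, n_tranches=10):
    """Within-tranche share: rank within type proxies effort (assumption);
    each tranche is rescaled to the overall mean to remove between-tranche gaps."""
    y = np.asarray(y, dtype=float)
    types = np.asarray(types)
    tranche = np.empty(y.size, dtype=int)
    for t in np.unique(types):
        idx = np.flatnonzero(types == t)
        ranks = y[idx].argsort().argsort()          # 0 .. n_t - 1 within the type
        tranche[idx] = ranks * n_tranches // idx.size
    standardized = np.empty_like(y)
    for q in np.unique(tranche):
        mask = tranche == q
        standardized[mask] = y[mask] * y.mean() / y[mask].mean()
    return mld(standardized) / mld(y)

# Synthetic example: three hypothetical types with different mean log earnings.
rng = np.random.default_rng(1)
types = rng.integers(0, 3, size=3000)
type_effect = np.array([4.6, 5.0, 5.4])
earnings = np.exp(type_effect[types] + rng.normal(0.0, 0.6, size=3000))

share_types = types_share(earnings, types)
share_tranches = tranches_share(earnings, types)
print(f"types share = {share_types:.3f}, tranches share = {share_tranches:.3f}")
```

Because the types measure uses only type means, a transfer between two people within the same type leaves it unchanged, whereas the tranches measure responds to earnings differences among people placed at the same rank.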
A more market-oriented economy rewards individuals’ skills and education quality more highly, increasing the variation in earnings within the educational attainment categories of workers and consequently within the education categories of their parents. If the quality of parental education plays a growing role over time in shaping earnings inequality in Egypt, and since this variable is omitted, estimates of inequality of opportunity that accounted for it would be expected to be higher in 2006.

On the other hand, using current earnings to measure opportunity inequality might distort the assessment of the extent to which circumstances affect the distribution of outcomes, because of measurement error and idiosyncratic shocks to earnings. The transitory earnings components add to the dispersion of earnings not explained by circumstances, confounding the variance of transitory earnings with the part of earnings inequality due to effort, which could lead to overestimating the degree of effort inequality (Barros and others 2009; Aaberge, Mogstad, and Peragine 2011).

Population Subgroup Analysis

Because the aggregate analysis presented above could mask area, gender, and age disparities in opportunity inequality within Egypt, inequality of opportunity measures were also computed for each population subgroup (see table 5 and figures 2, 3, and 4).

17. See Cogneau and Gignoux (2009) for the contribution of educational expansion to earnings and opportunity (in)equality in Brazil.

FIGURE 2. Parametric and Nonparametric Estimates of the Share of Inequality of Opportunity by Area (with confidence intervals at 95 percent)
Source: Author’s calculation based on data from the Egypt Labor Force Sample Survey of 1988, the Egypt Labor Market Survey of 1998, and the Egypt Labor Market Panel Survey of 2006.

AREA.
The parametric and nonparametric estimates revealed somewhat higher inequality of opportunity levels and shares in urban than in rural areas, but the difference is statistically significant only in 1998. Between 1988 and 1998, overall inequality was fairly stable in rural areas and declined in urban areas; inequality of opportunity levels dropped in both areas, though they dropped more in rural areas (see table 5 and figure 2). From 1998 onwards, overall and opportunity inequality levels rose in both urban and rural areas. Inequality of opportunity measures regained their late 1980s levels in urban areas but rose much more in rural areas. Although overall inequality in rural earnings rose significantly, from 18 percent to 40 percent, the opportunity share also increased, from 12 percent to 20 percent (tranches approach), showing a larger contribution of unequal opportunities to the variance in rural earnings.

The parametric estimates of the opportunity inequality share, while not significantly different from those obtained by the types method, show no variations over time in rural areas. This may be explained either by the omission of circumstance variables, which could cause a meaningful drop in the parametric measures of inequality of opportunity, or by large sampling variation, which could induce an upward bias in the nonparametric measures. This possibility is explored later by estimating inequality of opportunity indices parametrically using a richer set of circumstances and a more refined partition.

FIGURE 3. Parametric and Nonparametric Estimates of the Share of Inequality of Opportunity by Gender (with confidence intervals at 95 percent)
Source: Author’s calculation based on data from the Egypt Labor Force Sample Survey of 1988, the Egypt Labor Market Survey of 1998, and the Egypt Labor Market Panel Survey of 2006.
In addition to rising inflation and the transition to a more liberal market, increasing labor market returns to education and to experience contributed to the substantial increase in the rural earnings differential between 1998 and 2006. There have been substantial improvements in the education of the rural labor force, an accelerating movement of labor out of agriculture as structural reforms took hold, and a shift to high- and medium-status occupations. All these factors might have increased the influence of family background on income differentials between social types and, even more, among individuals at similar levels of effort.

GENDER. The contribution of unequal opportunities to earnings inequality is higher for women by the tranches method, while it is higher for men by the parametric and types methods (figure 3). However, the difference in inequality of opportunity between men and women is barely significant for the tranches method and insignificant for the others.

The difference in results can be explained by the fact that the omitted circumstances are more important to the distribution of earnings for women. Large sampling variance in small cells can also be involved, since sample sizes are smaller for women than for men. Again, the tranches method produces higher estimates while the types and parametric measures are quite close to each other.

FIGURE 4. Nonparametric Estimates of the Share of Inequality of Opportunity by Age Groups (with confidence intervals at 95 percent)
Source: Author’s calculation based on data from the Egypt Labor Force Sample Survey of 1988, the Egypt Labor Market Survey of 1998, and the Egypt Labor Market Panel Survey of 2006.

The levels of overall and opportunity inequality in earnings seem to follow a similar historical path for both genders, including a decline in the late 1990s and a subsequent increase.
Nevertheless, opportunity inequality rose slightly for men from the beginning to the end of the period and was more stable for women, while overall earnings inequality rose substantially for both men and women. Therefore, the share of opportunity inequality declined more over time for women’s earnings.

The growing dispersion in earnings in recent years could be associated with the rise in the educational attainment of the Egyptian labor force and the increase in returns to schooling. Although improvements in education levels and returns to education were more pronounced for women, earnings inequality increased slightly more for men, suggesting that inflation and policy changes had a greater effect on men’s earnings.

The estimates of inequality of opportunity for women should be treated with caution since they suffer from potential underestimation. Women’s much lower labor force participation rates (30 percent, compared with almost 80 percent for men; see tables 1 and 3) raise concerns about sample selection bias. Moreover, parents’ education level and father’s occupational status are considerably higher for women, suggesting that women’s labor force participation decisions are influenced by their family background, so that women who expect to be discriminated against due to their circumstances might be less likely to enter the labor force. In that case, the true opportunity inequality measures for women would be expected to be much larger than those obtained here.

AGE. Figure 4 displays the evolution of the shares of opportunity inequality in earnings inequality over time for each age cohort; for readability, only the nonparametric estimates are plotted. The types and parametric results are quite close. Except in 1988, when the share of opportunity inequality was notably lower for the youngest cohort, there is no statistically significant difference between age groups.
Although the tranches method yields higher estimates, the difference between the estimates and those obtained by the types method is insignificant, which strengthens confidence in the results.

These findings suggest that the increase in educational attainment across successive age cohorts over 1988–2006 was accompanied by a slight, but borderline statistically significant, increase in the levels and shares of inequality of opportunities. Between 1988 and 1998 overall earnings inequality and opportunity earnings inequality (both levels and shares) increased across age cohorts, from the youngest to the oldest, while in 2006 overall inequality and opportunity inequality were higher for the mid-age cohort (see table 5 and figure 4).18

The decline in earnings differential already noted between 1988 and 1998 for all population subgroups was modest for Egyptian wage and salary workers ages 30–44, while the large increase in the gap after 1998 was much more pronounced for this cohort than for the others. This age cohort also experienced a larger increase in inequality of opportunity levels during the 2000s and therefore an increase in the share of unequal opportunities (from 17–21 percent in 1998 to 20–23 percent in 2006, depending on the measure used). The gap between social groups increased more than the gap between people at the same effort level for the mid-age cohort. Despite the increase, this age cohort had a lower incidence of opportunity inequality in 2006 than in 1988, when the share was greater than 25 percent. The upward trend in inequality of opportunity shares between 1998 and 2006 is not apparent in the parametric decomposition, however.

On the other hand, from 1998 to 2006, only a slight rise in the levels of inequality of opportunity was noted for the youngest cohort and an even slighter rise for the oldest.
So the share of opportunity inequality in earnings inequality increased for the youngest, from 19 percent to 22 percent, and decreased for the oldest, from 27 percent to 24 percent by the tranches approach. When measured using the types and parametric methods, the increase in the opportunity shares for the youngest cohorts appears to be smaller, while the decrease for the oldest age group appears greater.

These results suggest that during the 1990s the contribution of unequal opportunities to earnings inequality was lower for the younger cohorts than for older cohorts and that it was highest for the mid-age cohort in recent years. The rise in inequality of opportunities for the youngest cohort likely reflects the increasing effect of parental education on their earnings. Returns to parents’ education in Egypt were found to increase with age, contributing to the greater dispersion in earnings within older cohorts. However, while returns to parents’ education increased over time for the mid-age and youngest groups, they fell for the oldest cohort. Although this explanation sounds plausible, we cannot reject the possibility of underestimation due to unobserved circumstances, transitory earnings components, or the incidence of part-time employment. Taking these elements into account might change the finding that the younger cohort is suffering from a higher incidence of opportunity inequality than the older one today.

18. Inequality of opportunity may be underestimated for the youngest cohort due to the possible importance of part-time work in this age group (Bourguignon, Ferreira, and Menéndez 2007).

Parametric Decomposition

Although informative, these results capture only a part of the contribution of circumstances to an individual’s earnings inequalities.
To check the robustness of these results to the introduction of additional exogenous circumstances, a parametric decomposition was conducted that added additional family background characteristics to the previous analysis (father’s and mother’s employment status; rural or urban area of birth) and used a finer partition of the circumstance categories. Age is also used as a circumstance variable when the sample is not split by age groups.

To begin, reduced form earnings equation (6) is estimated by ordinary least squares (OLS) for the entire population and separately for each population subgroup by survey year and for all years combined. Because of space limitations, regression results are presented in table 6 only for the entire population for each survey year and using the whole sample. The estimates, globally significant at the 10 percent level or lower, support the view that circumstances have an important influence on outcomes and, as shown in the descriptive statistics in table 3, that being male and being older are associated with higher income.

Differences in region of birth are found to contribute to wage differences. With Lower Egypt as the reference, people born in Upper Egypt have lower incomes, while those born in Metropolitan regions do better. Likewise, individuals born and working in urban areas earn more than those in rural areas. The parental background variables are also found to affect earnings. Father’s and mother’s years of education have a significant positive influence. Father’s and mother’s employment also has a positive influence, but the effect was rarely significant in 1988. Father’s occupational status does not appear to be important. Nonetheless, with medium status as the reference, having a father in an agricultural occupation has a strong and statistically significant negative effect on earnings.

TABLE 6. Regression of Earnings on Circumstances

| Variable | 1988 | 1998 | 2006 | All years |
|---|---|---|---|---|
| Urban dummy variable | 0.086*** | 0.097*** | 0.070*** | 0.081*** |
| | (0.028) | (0.023) | (0.022) | (0.014) |
| Male dummy variable | 0.410*** | 0.276*** | 0.377*** | 0.360*** |
| | (0.030) | (0.020) | (0.023) | (0.014) |
| Age | 0.021*** | 0.019*** | 0.017*** | 0.019*** |
| | (0.001) | (0.001) | (0.001) | (0.001) |
| Father’s years of school | 0.024*** | 0.016*** | 0.020*** | 0.020*** |
| | (0.005) | (0.003) | (0.003) | (0.002) |
| Mother’s years of school | 0.008 | 0.015*** | 0.014*** | 0.013*** |
| | (0.009) | (0.003) | (0.003) | (0.002) |
| Father’s employment status (omitted = wage worker) | | | | |
| Employer | –0.002 | –0.061** | 0.014*** | –0.015 |
| | (0.029) | (0.027) | (0.027) | (0.016) |
| Self-employed | –0.213 | 0.001 | –0.219*** | –0.149*** |
| | (0.202) | (0.119) | (0.051) | (0.045) |
| Father’s occupation (omitted = medium status) | | | | |
| High status | 0.033 | 0.042* | 0.038 | 0.038** |
| | (0.033) | (0.025) | (0.025) | (0.016) |
| Low status | –0.026 | –0.002 | 0.036 | 0.012 |
| | (0.040) | (0.032) | (0.030) | (0.020) |
| Skilled agricultural worker | –0.085*** | –0.080*** | –0.092*** | –0.083*** |
| | (0.027) | (0.024) | (0.024) | (0.014) |
| Mother’s employment (omitted = self-employed) | | | | |
| Wage worker | –0.137*** | –0.179*** | –0.237*** | –0.192*** |
| | (0.053) | (0.030) | (0.028) | (0.019) |
| Employer | –0.094 | –0.005 | –0.130*** | –0.068*** |
| | (0.067) | (0.049) | (0.038) | (0.026) |
| Region of birth (omitted = lower Egypt) | | | | |
| Metropolitan | 0.102*** | 0.185*** | 0.166*** | 0.154*** |
| | (0.029) | (0.024) | (0.025) | (0.014) |
| Upper Egypt | –0.086*** | –0.074*** | –0.024 | –0.054*** |
| | (0.027) | (0.021) | (0.020) | (0.014) |
| Constant | 4.872*** | 4.858*** | 5.103*** | 4.966*** |
| | (0.048) | (0.046) | (0.047) | (0.029) |
| Number of observations | 4135 | 4048 | 6499 | 14682 |
| Adjusted R-square | 0.213 | 0.258 | 0.185 | 0.219 |

* Significant at the 10 percent level; ** significant at the 5 percent level; *** significant at the 1 percent level.
Note: The dependent variable is the logarithm of real monthly earnings. Numbers in parentheses are bootstrapped standard errors based on 100 replications. Urban dummy variable, male dummy variable, and age are not included among the regressors when the model is estimated separately by area, gender, and age cohort, respectively.
Source: Author’s calculation based on data from the Egypt Labor Force Sample Survey of 1988, the Egypt Labor Market Survey of 1998, and the Egypt Labor Market Panel Survey of 2006.

TABLE 7. Parametric Estimates of Inequality of Opportunity

| Variable | 1988 | 1998 | 2006 | All years |
|---|---|---|---|---|
| Overall inequality of opportunity | 0.067*** | 0.058*** | 0.064*** | 0.063*** |
| | (0.005) | (0.006) | (0.012) | (0.006) |
| Share of inequality of opportunity | 0.253*** | 0.266*** | 0.151*** | 0.184*** |
| | (0.015) | (0.014) | (0.029) | (0.020) |
| Partial shares associated with circumstances | | | | |
| Gender | 0.030*** | 0.019 | 0.01 | 0.015 |
| | (0.007) | (0.012) | (0.012) | (0.008) |
| Mother’s education | 0.006 | 0.002 | 0.018* | 0.017** |
| | (0.008) | (0.004) | (0.008) | (0.005) |
| Father’s education | 0.055*** | 0.01 | 0.034*** | 0.041*** |
| | (0.011) | (0.007) | (0.008) | (0.006) |
| Father’s employment | 0.001 | 0.001 | 0.010*** | 0.008*** |
| | (0.001) | (0.002) | (0.002) | (0.001) |
| Mother’s employment | 0.007* | 0.005 | 0.018* | 0.015** |
| | (0.003) | (0.007) | (0.009) | (0.006) |
| Father’s occupation status | 0.020** | 0.002 | 0.022*** | 0.022*** |
| | (0.007) | (0.006) | (0.006) | (0.003) |
| Birth area (a) | 0.022*** | 0.004 | 0.015** | 0.017*** |
| | (0.006) | (0.006) | (0.005) | (0.004) |
| Birth region (b) | 0.035*** | 0.044*** | 0.024** | 0.030*** |
| | (0.009) | (0.010) | (0.008) | (0.006) |

* Significant at the 10 percent level; ** significant at the 5 percent level; *** significant at the 1 percent level.
Note: Numbers in parentheses are bootstrapped standard errors based on 100 replications.
a. Urban or rural.
b. Metropolitan, Lower Egypt, or Upper Egypt.
Source: Author’s calculation based on data from the Egypt Labor Force Sample Survey of 1988, the Egypt Labor Market Survey of 1998, and the Egypt Labor Market Panel Survey of 2006.
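The parametric procedure behind tables 6 and 7 has two steps: an OLS regression of log earnings on circumstances, followed by counterfactual earnings distributions from which an overall share and partial shares are computed. The sketch below illustrates that logic under explicit assumptions. Equation (6) is not reproduced in this section, so a generic log-linear specification stands in for it; the data are synthetic; the variable names (`male`, `father_school`, `urban_birth`) and the coefficient values are hypothetical; and the partial-share formula (equalize one circumstance at its mean, keep everything else) is one common choice rather than necessarily the author's.

```python
import numpy as np

def mld(y):
    """Mean log deviation: ln(mean(y)) - mean(ln(y)) for positive y."""
    y = np.asarray(y, dtype=float)
    return float(np.log(y.mean()) - np.log(y).mean())

# Synthetic circumstances and earnings (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n = 5000
male = rng.integers(0, 2, n)
father_school = rng.integers(0, 13, n)
urban_birth = rng.integers(0, 2, n)
log_y = (5.0 + 0.35 * male + 0.02 * father_school + 0.08 * urban_birth
         + rng.normal(0.0, 0.6, n))
y = np.exp(log_y)

# Step 1: reduced-form OLS of log earnings on circumstances.
names = ["male", "father_school", "urban_birth"]
X = np.column_stack([np.ones(n), male, father_school, urban_birth]).astype(float)
beta, *_ = np.linalg.lstsq(X, log_y, rcond=None)
resid = log_y - X @ beta

# Step 2a: overall share. Counterfactual earnings vary only with circumstances.
y_smoothed = np.exp(X @ beta)
share = mld(y_smoothed) / mld(y)

# Step 2b: partial share of one circumstance. Equalize that column at its mean,
# keep the other circumstances and the residual, and measure the drop in MLD.
def partial_share(col):
    X_cf = X.copy()
    X_cf[:, col] = X[:, col].mean()
    y_cf = np.exp(X_cf @ beta + resid)
    return (mld(y) - mld(y_cf)) / mld(y)

partial = {name: partial_share(j + 1) for j, name in enumerate(names)}

print(f"opportunity share = {share:.3f}")
for name, value in partial.items():
    print(f"partial share, {name}: {value:.3f}")
```

Because omitted circumstances end up in the residual, a share computed this way is best read as a lower bound, which is consistent with the paper's finding that adding circumstance variables raises the parametric estimates.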
The estimation results for each population subgroup, available on request, show similar effects of the circumstance variables on earnings.

Next, the counterfactual earnings distribution is simulated using the coefficient estimates to compute the share of earnings inequality arising from unequal opportunities and the contribution of individual circumstance variables.

The parametric estimates of inequality of opportunity using a richer set of circumstance variables, reported in table 7, are significantly higher than the parametric measures reported in table 5. The opportunity shares are 15–27 percent over 1988–2006, compared with 6–14 percent when fewer circumstance groups and coarser partitions are used, suggesting that father’s and mother’s employment and the area of birth played a significant role in accounting for opportunity disparities in Egypt.

FIGURE 5. Contribution of Individual Circumstance Variables to Earnings Inequality for the Egyptian Labor Force
Source: Author’s calculation based on data from the Egypt Labor Force Sample Survey of 1988, the Egypt Labor Market Survey of 1998, and the Egypt Labor Market Panel Survey of 2006.

The disaggregation by area reveals that the contribution of opportunities to earnings inequality decreases over time in rural areas, despite the rise in the level of inequality of opportunity.19 This result suggests that the increase in the share of rural opportunity inequality using the nonparametric methods (see table 5 and figure 2) might be due to sampling variance in small cells.

The analysis by gender shows that women suffer a significantly greater incidence of opportunity inequality than men. This finding, similar to that reported in table 5 and figure 3 using the tranches approach, suggests a higher sensitivity of the parametric estimates to the omitted circumstance variables for women than for men.
The contribution of father’s and mother’s employment to earnings inequality is as important as that of the other circumstance variables for women but less important for men.

19. The results for the population subgroups are available from the author.

Figure 5 depicts the evolution over 1988–2006 of the contribution of individual circumstance variables to earnings inequality for the entire population. Of all observed circumstance variables, father’s education and region of birth are associated with the largest shares of earnings inequality. Inequality of opportunity related to father’s education declined from 6 percent in 1988 to 1 percent in 1998, before rising again to nearly 4 percent in 2006. Inequality of opportunity resulting from region of birth was fairly stable between 1988 and 1998, at around 4 percent, and declined to 2 percent in 2006.

Gender, father’s occupation, and mother’s employment also play an important role in determining earnings inequality, accounting for 1–3 percent of total inequality for the entire population. Gender’s importance in shaping opportunity declines over time.

Mother’s education and father’s employment make a limited contribution to reducing earnings inequality when area, gender, and other family background variables are controlled for. However, mother’s education has a growing influence on earnings in the recent period.

For population subgroups, parental education was found to be a more important determinant of opportunity for women in 1988, while father’s occupation status and mother’s employment accounted for the largest share of earnings variations in recent years. In rural areas, inequality was shaped mainly by gender until 1998, while father’s occupation and employment had the largest role during the recent period.
There is also some evidence that mother’s education contributes more to reducing earnings inequality for the youngest cohort than do the other family background variables. The share of inequality associated with parental education increases across age cohorts and declines significantly over time for the oldest cohort. This result is consistent with the previous finding of a declining contribution of unequal opportunities to earnings inequality for Egyptian wage and salary workers ages 45–65 and with the possibility that the decline was driven by the weakening effect of parental education on earnings for this cohort.

These findings suggest that policies aimed at reducing the earnings effect of father’s education and skills and of regional origins would help reduce inequality of opportunities in Egypt.

IV. CONCLUDING REMARKS

It is increasingly argued that inequality of opportunity arising from individual circumstances contributes to the persistence of social and economic inequalities and constrains economic development and, therefore, that society should compensate for this sort of inequality. In the interests of equity, it is thus important to distinguish inequalities due to unequal opportunities from inequalities due to individual choices. Doing so could help identify policy measures and institutional arrangements that favor more egalitarian distribution of opportunities.

To assess the extent to which unequal opportunity affects the distribution of earnings in Egypt, parametric and nonparametric measures of the lower bound for inequality of opportunity were calculated, over time and by population subgroups. The results are consistent with findings of previous studies. Individual circumstances, captured by gender, region of birth, father’s and mother’s education, and father’s occupation status, accounted on average for 11–20 percent of the mean logarithmic deviation index, depending on the estimation procedure.
There was little change in the levels of inequality of opportunity between 1988 and 2006 and a modest decline in 1998; total earnings inequality increased considerably over this period. The opportunity share of earnings inequality therefore declined from 14–27 percent in 1988 to 6–18 percent in 2006, depending on the measure used. Although the causes of the sharp increase in earnings inequality and the decline in the opportunity share cannot be established with certainty, some explanations may be ventured.

Egypt’s transition to a more market-oriented economy since the early 2000s, together with rising inflation, might have contributed to widening income differentials. Expansion of intermediate and higher education between 1988 and 1998, followed by slower expansion from 1998 onward, especially affecting underprivileged social groups, might have contributed to equalizing opportunities during the first period and limited the increase in earnings gaps between circumstance groups in the second period.

A parametric decomposition using a richer set of family background variables and a more refined partition to check the robustness of the results to omitted circumstances resulted in a drop in opportunity shares from 27 percent in 1988 to 15 percent in 2006, compared with a decline from 14 percent to 6 percent when fewer circumstance groups are considered. Father’s education and occupation status and spatial factors (measured by rural or urban area and region of birth) accounted for around 30 percent and 20 percent, respectively, of the total effect of circumstances.

The analysis by population subgroups reveals a lower incidence of inequality of opportunity in rural areas than in urban areas.
Although estimates for rural areas might be biased because of the imprecision of earnings measurement or large sampling variance within small cells, the possibility cannot be ruled out that unobserved circumstances and institutional measures (such as family composition, parents’ financial situation, supply and quality of schooling, and labor market institutions) significantly shape the opportunity sets for rural Egyptian wage and salary workers. This is supported by the weak influence of father’s education and occupation status on rural earnings. Although they play a large role in determining inequality compared with the observed circumstances, their role is very weak in rural areas, where more than 87 percent of workers have parents with an education level of two years or less and some 57 percent have fathers in agricultural employment.

The disaggregation by gender reveals similar findings for opportunity inequality for men and women, with a somewhat higher incidence for women. The estimates for women are likely to be biased by participation in the labor market: if women’s participation decisions are negatively influenced by circumstances, inequality of opportunity would be underestimated.

The analysis by age group suggests that inequality of opportunity accounted for a lower share of earnings inequality for younger cohorts than for older cohorts in the 1990s, while it accounted for a higher share for mid-age and younger groups in 2006. The decline in the contribution of unequal opportunities to earnings inequality for individuals ages 45–65 is likely due to the declining importance of parental education in determining their opportunity sets.

Regardless of the estimation method used, this study suggests that the share of measured earnings inequality in Egypt attributable to circumstances alone varied from one-tenth to one-third over 1988–2006.
The true fraction of opportunity inequality would likely be higher if additional circumstance variables were included or if the analysis were based on other measures of economic welfare, such as household income and consumption. Using current earnings may give a misleading picture of the extent of inequality of opportunity because of measurement errors and transitory earnings components.

Ferreira and Gignoux (forthcoming) estimated inequality of opportunity for labor earnings, household income, and household consumption for Colombia, Panama, and Peru and obtained roughly similar opportunity inequality shares of earnings inequality to those found here for Egypt. Their analysis reveals that inequality of opportunity accounts for a larger portion of overall inequality when household income is used rather than labor earnings. The estimates of inequality of opportunity shares tended to be even higher when based on consumption rather than on income or earnings. For Latin American countries for which results are close to those reported here, parametric estimates of the opportunity shares of consumption inequality ranged from 24 percent for Colombia to 39 percent for Panama, compared with 17 percent for earnings inequality. Barros and others (2009) also found that earnings-based measures tend to underestimate inequality of opportunity for long-term welfare because measurement errors and transitory components add to the non-circumstance-driven variance in the earnings and income measures.

Bourguignon, Ferreira, and Menéndez (2007) also report similar results for Brazil. They found that around 23 percent of earnings inequality among Brazilian men in urban areas in 1996 could be attributed to unequal opportunities, as measured by the Theil index, indicating a lower incidence of inequality of opportunity than in Egypt for the same period (around 30 percent of the Theil index, using the parametric method, for 1988–1998).
Although the results of this analysis would help in designing effective public policies for equalizing opportunities, recommending such policies requires further investigation.

REFERENCES

Aaberge, Rolf, Magne Mogstad, and Vito Peragine. 2011. “Measuring Long-term Inequality of Opportunity.” Journal of Public Economics 95(3–4): 193–204.
Ali, Ifzal. 2007. “Inequality and the Imperative for Inclusive Growth in Asia.” Asian Development Review 24(2): 1–16.
Assaad, Ragui. 2002. “The Transformation of the Egyptian Labor Market: 1988–1998.” In Ragui Assaad, ed., The Egyptian Labor Market in an Era of Reform. Cairo: American University in Cairo Press.
Barros, Ricardo P., Francisco H. G. Ferreira, Jose R. Molinas Vega, Jaime S. Chanduvi, Mirela de Carvallo, Samuel Franco, Samuel Freije-Rodríguez, and Jérémie Gignoux. 2009. Measuring Inequality of Opportunities in Latin America and the Caribbean. Washington D.C.: World Bank.
Bourguignon, François, Francisco H. G. Ferreira, and Marta Menéndez. 2007. “Inequality of Opportunity in Brazil.” Review of Income and Wealth 53(4): 585–618.
Bourguignon, François, Francisco H. G. Ferreira, and Michael Walton. 2007. “Equity, Efficiency and Inequality Traps: A Research Agenda.” Journal of Economic Inequality 5(2): 235–56.
Cogneau, Denis, and Jérémie Gignoux. 2009. “Earnings Inequalities and Educational Mobility in Brazil Over Two Decades.” In S. Klasen and F. Nowak-Lehmann, eds., Poverty, Inequality and Policy in Latin America. CESifo Seminar Series. Cambridge, MA: Massachusetts Institute of Technology Press.
Cogneau, Denis, and Sandrine Mesplé-Somps. 2008. “Inequality of Opportunity for Income in Five Countries of Africa.” In John Bishop, ed., Inequality and Opportunity: Papers from the Second ECINEQ Society Meeting, Research on Economic Inequality, vol. 16. Bingley, UK: Emerald Group Publishing Limited.
Checchi, Daniele, and Vito Peragine. 2010. “Inequality of Opportunity in Italy.” Journal of Economic Inequality 8(4): 429–50.
Checchi, Daniele, Vito Peragine, and Laura Serlenga. 2010. “Fair and Unfair Income Inequalities in Europe.” IZA Discussion Paper 5025. Institute for the Study of Labor, Bonn, Germany.
Elbers, Chris, Peter Lanjouw, Johan A. Mistiaen, and Berk Özler. 2008. “Reinterpreting Between-Group Inequality.” Journal of Economic Inequality 6(3): 231–45.
Ferreira, Francisco H. G., and Jérémie Gignoux. Forthcoming. “The Measurement of Inequality of Opportunity: Theory and an Application to Latin America.” The Review of Income and Wealth.
Ferreira, Francisco H. G., Jérémie Gignoux, and Meltem Aran. 2011. “Measuring Inequality of Opportunity with Imperfect Data: The Case of Turkey.” Journal of Economic Inequality DOI:10.1007/s10888-011-9169-0.
Ferreira, Francisco H. G., Phillippe G. Leite, and Julie A. Litchfield. 2008. “The Rise and Fall of Brazilian Inequality: 1981–2004.” Macroeconomic Dynamics 12: 199–230.
Fleurbaey, Marc, and Vito Peragine. 2009. “Ex ante versus Ex post Equality of Opportunity.” ECINEQ Working Paper 141. Society for the Study of Economic Inequality, Palma de Mallorca, Spain.
Foster, James E., and Artyom A. Shneyerov. 2000. “Path Independent Inequality Measures.” Journal of Economic Theory 91(2): 199–222.
Lefranc, Arnaud, Nicolas Pistolesi, and Alain Trannoy. 2008. “Inequality of Opportunities vs. Inequality of Outcomes: Are Western Societies All Alike?” Review of Income and Wealth 54(4): 513–46.
Peragine, Vito. 2004. “Measuring and Implementing Equality of Opportunity for Income.” Social Choice and Welfare 22(1): 187–210.
Roemer, John E. 1998. Equality of Opportunity. Cambridge, MA: Harvard University Press.
Roemer, J. E., R. Aaberge, U. Colombino, J. Fritzell, S. P. Jenkins, I. Marx, M. Page, E. Pommer, J. Ruiz-Castillo, M. J. S. Segundo, T. Traanes, G. Wagner, and I. Zubiri. 2003. “To What Extent Do Fiscal Regimes Equalize Opportunities for Income Acquisition among Citizens?” Journal of Public Economics 87(3–4): 539–65.

Can Global De-Carbonization Inhibit Developing Country Industrialization?

Aaditya Mattoo, Arvind Subramanian, Dominique van der Mensbrugghe, and Jianwu He

Most economic analyses of climate change have focused on the aggregate impact on countries of mitigation actions. We depart first in disaggregating the impact by sector, focusing particularly on manufacturing output and exports. Second, we decompose the impact of a modest agreement on emissions reductions (17 percent relative to 2005 levels by 2020 for industrial countries and 17 percent relative to business-as-usual for developing countries) into three components: the change in the price of carbon due to each country’s emission cuts per se; the further change in this price due to emissions tradability; and the changes due to any international transfers (private and public). Manufacturing output and exports in low carbon intensity countries such as Brazil are less affected. In contrast, in high carbon intensity countries, such as China and India, even a modest agreement depresses manufacturing output by 3–3.5 percent and manufacturing exports by 5.5–7 percent. The increase in the carbon price induced by emissions tradability hurts manufacturing output most while the real exchange rate effects of transfers hurt exports most. JEL codes: F13, F18, H23, Q56

The focus of discussions on climate change mitigation has been on how much emissions should be cut and how developing countries should be compensated for any cuts they make. Accordingly, much of the literature has focused on the aggregate costs to countries of mitigation actions, and the transfers that would be necessary to maintain welfare in the poorer parts of the world. However, the structural implications of these actions have received less attention.
Aaditya Mattoo (corresponding author, Amattoo@worldbank.org) is Research Manager, Trade and International Integration, World Bank. Arvind Subramanian (Asubramanian@piie.com) is Senior Fellow, Peterson Institute for International Economics. Dominique van der Mensbrugghe (Dvandermensbrugghe@worldbank.org) is Lead Economist, Development Prospects Group, World Bank. Jianwu He (Jhe3@worldbank.org) is Associate Research Fellow at the Development Research Center of the State Council, China. The views represent those of the authors and not of the institutions with which the authors are affiliated. We would like to thank Nancy Birdsall, Bill Cline, Meera Fickling, Anne Harrison, Gary Hufbauer, Jakob Kierkegaard, Jisun Kim, Kseniya Lvovsky, Will Martin, Mike Mussa, Caglar Ozden, Jairam Ramesh, Dani Rodrik, John Williamson and, especially, Hans Timmer and Michael Toman for helpful discussions, and three anonymous referees and the editor for insightful comments. Thanks also to Michelle Chester and Jolly La Rosa for excellent assistance with compiling the paper.

THE WORLD BANK ECONOMIC REVIEW, VOL. 26, NO. 2, pp. 296–319 doi:10.1093/wber/lhr047 Advance Access Publication November 9, 2011 © The Author 2011. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

In this paper, we seek to make a twofold contribution. First, on policy, we isolate the impact of three distinct actions: emissions reductions per se, emissions tradability, and transfers. The policy disaggregation is useful because each dimension of policy may have different effects and, moreover, affect different countries differently. For example, the impact of emissions reductions varies across countries depending on the carbon intensity of their production.
Furthermore, the transfers that arise from tradability themselves have structural consequences and need to be evaluated.1 Second, on outcomes, we focus on manufacturing exports as well as on manufacturing output both in the aggregate and in selected sectors. We focus on the manufacturing sector and subsectors because policy-makers in developing countries may, for political and other reasons, seek to preserve output and export capacity in these sectors.

The literature on the costs of climate change mitigation is voluminous and includes a number of important contributions (Cline 2007, Nordhaus 2007, Stern 2007, UNDP 2007, World Bank 2009). This literature recognizes that a regime that favors static efficiency through uniform global prices can be inequitable and therefore typically recommends financial and technology transfers to alleviate the adverse effects on developing countries (Stern 2007 and World Bank 2009). Hardly explored is the potential tension between static efficiency and dynamic effects, stemming from changes in the composition of output and exports in developing countries as a result of uniform global prices. The fact that transfers can themselves accentuate this tension through real exchange rate effects, while acknowledged (Strand 2009), has also not been fully explored.

Hence, for many of the vital policy questions that are the subject of this paper, there are today no good answers based on empirical research. An econometric approach seems handicapped by the absence of past events and our inability to construct experiments which are comparable with the policy changes of greatest interest. We therefore use a multicountry, multisector CGE model to derive our quantitative estimates.
In situations of simultaneous policy changes of the kind that we consider in this paper, in which there could be significant interaction among the policies of different countries, and where we are interested in quantifying the effects of policy change on output and trade in different sectors of the economy, a computable general equilibrium (CGE) approach seems appropriate.

We focus on the case where developing countries cut their emissions by 17 percent by 2020 relative to projected business-as-usual (BAU) levels (China already plans a 20 percent cut in energy intensity), and industrial countries cut their emissions by 17 percent in 2020 relative to 2005 levels (reflecting the EU’s plans). We also consider a broad range of other scenarios.

1. The relevant literature relating to public external transfers is Elbadawi (1999), Prati and Tressel (2006), and Rajan and Subramanian (2011). The analogue for private capital inflows is Prasad and others (2008).

Our main empirical findings, which come with a number of important caveats we discuss in Section IV, are the following. Some currently high carbon intensity countries/regions (such as China, India, Eastern Europe and Central Asia, and the Middle East and North Africa) will experience substantial reductions in manufacturing output and exports from emissions reductions per se.2 For a subset of these countries, especially China and India, these effects will be aggravated by emissions tradability (especially on manufacturing output) and transfers (especially on manufacturing exports). For this subset, the negative effects will be substantial not just for carbon-intensive manufacturing but also other manufacturing sectors.3 These effects would be aggravated if these developing countries pursued more ambitious emissions targets. There could also be transitional dislocation costs as resources are reallocated across sectors.
In contrast, the manufacturing sector in low carbon intensity countries (such as Brazil and Latin America) will be less affected by actions related to climate change. In the case of sub-Saharan Africa, effects might even be positive, although any boost to manufacturing exports could be reduced through transfers and the consequent real exchange rate effects.

These findings could have implications for the positions that countries will adopt in international negotiations. Amongst economists there is a strong consensus that the best way forward is to get a uniform global carbon price (either via a common global tax or international emissions trading), supplemented with financial transfers to address the equity dimension of climate change. This article of faith in the policy community was captured by the Financial Times in its leader of November 3, 2009, when it asserted that the price of carbon, “. . . must be high and the same everywhere. . . . In the actual world, a global scheme of tradable emissions quotas is the best solution.” However, policymakers in high-carbon intensity countries may resist this prescription because it would imply a contraction in manufacturing output and exports.

This paper is organized as follows. In section II, we describe the emissions reductions scenarios that we believe have the greatest relevance for policy, and briefly discuss the positions that the United States and European Union (EU) have taken on a key issue, the international tradability of emissions rights. In section III, we present the results of our quantitative simulations of each of the scenarios. Section IV provides a concluding assessment of the implications of our results.

2.
The different country groupings in the model are EU27 with EFTA, United States, Japan, Republic of Korea, Rest of high income Annex 1, Rest of high income, Brazil, China, India, Russia, Rest of East Asia, Rest of South Asia, Rest of Europe and Central Asia, Middle East and North Africa, sub-Saharan Africa, Rest of Latin America and the Caribbean.

3. The different sectors in the model are Crops, Livestock, Forestry, Coal, Crude oil, Natural gas, Other mining, Processed food, Refined oil, Chemicals rubber and plastics, Paper products, publishing, Mineral products n.e.s., Ferrous metals, Metals n.e.s., Transport equipment, Motor vehicles and parts, Transport equipment n.e.s., Other manufacturing, Electricity, Gas distribution, Construction, Transport services, Transport n.e.s., Sea transport, Air transport, Other services.

I. THE SCENARIOS

Our basic scenario is one where high income countries cut their emissions by 17 percent by 2020 relative to levels in 2005, and developing countries cut their emissions by 17 percent by 2020 relative to levels in business-as-usual.4 These modest cuts reflect broadly the pledges made by the major countries after Copenhagen to the UN’s Conference of the Parties (COP-15) as part of the Copenhagen accord. Our assumption is exactly the U.S. commitment. The EU’s unconditional offer is 20–30 percent reduction below 1990 levels by 2020, of which about 7.7 percent has been achieved already. Its more ambitious 30 percent offer is conditional on other countries raising their emissions targets. China’s and India’s COP-15 commitments are difficult to quantify but should not be too far off our assumption.5 We also consider a range of cuts by developing countries to test the robustness of our results.

In addition to emissions cuts, insofar as recent initiatives can be projected forward, they envisage international tradability of emissions rights.
In the United States, bills in the House and Senate differ slightly. The House version would limit the amount of total emissions rights that are internationally tradable to a maximum of one-half of the 2 billion tons of CO2 that can be traded, with the remaining half being traded domestically. In the Senate version, a maximum of one-quarter of the 2 billion can be traded internationally.6

The Council of the European Union has also moved in favor of international tradability. It would like to see “preferably by no later than 2015, a robust OECD [Organisation for Economic Co-operation and Development]-wide carbon market through the linking of cap-and-trade systems which are comparable in ambition and compatible in design, to be extended to economically more advanced developing countries by 2020.”7

In order to capture the effects of both emissions cuts and tradability, we consider four variants of the basic scenario (see Table 1).

4. This would entail agreeing on a hypothetical baseline for emissions. However, what matters most is the binding of emissions themselves at some level that would yield a positive carbon tax.

5. China’s commitment is this: “China will endeavour to lower its carbon dioxide emissions per unit of GDP by 40-45 percent by 2020 compared to the 2005 level, increase the share of non-fossil fuels in primary energy consumption to around 15 percent by 2020 and increase forest coverage by 40 million hectares and forest stock volume by 1.3 billion cubic meters by 2020 from the 2005 levels.” India’s commitment is: “India will endeavour to reduce the emission intensity of its GDP by 20 to 25 percent by 2020 in comparison to the 2005 level. The emissions from the agriculture sector will not form part of the assessment of emissions intensity.” See http://www.climateactiontracker.org/ for details of each country’s commitment.

6. The Senate version also has a stipulation that, after 2018, 1.25 international offset credits would be required to equal one allowance of domestic offset credit.

7. See http://www.consilium.europa.eu/uedocs/cms_Data/docs/pressdata/en/envir/106429.pdf.

TABLE 1. Scenarios for Cooperative Emissions Reductions

Scenario   Description of scenario                        Target emissions cuts,    Target emissions cuts,     Transfers
                                                          high income               low and middle income
NTER       Both industrial and developing countries       17% relative to 2005      17% relative to            No
           reduce emissions but emissions rights are      emission levels           business-as-usual levels
           not tradable
TER1       Both industrial and developing countries       17% relative to 2005      17% relative to            No
           reduce emissions; emissions rights are         emission levels           business-as-usual levels
           tradable; but we abstract from private
           transfers
TER        Both industrial and developing countries       17% relative to 2005      17% relative to            Yes, through emissions
           reduce emissions and emissions rights are      emission levels           business-as-usual levels   trading
           tradable
TERWMT     Both industrial and developing countries       17% relative to 2005      17% relative to            Yes, through public
           reduce emissions; emissions rights are         emission levels           business-as-usual levels   transfers and
           tradable; and transfers offset welfare                                                              emissions trading
           loss from emissions reductions

In the first variant, cuts are implemented but emissions are not tradable and there are no international transfers (NTER). In the second, cuts are complemented with emissions tradability, which leads to a uniform global carbon price, but we abstract from the implied private transfers (TER1); this scenario is equivalent to a uniform global carbon tax regime where the taxes are retained domestically rather than being transferred across countries. The third scenario differs from the second in allowing for private transfers (TER), and this represents what will actually happen with full tradability of emissions; this scenario is equivalent to a uniform global carbon tax regime with revenues transferred across countries.
Finally, we consider a scenario where supplementary public transfers are made to compensate developing countries so that they attain the same welfare levels as in the business-as-usual case (TERWMT). This last scenario might not seem realistic given the political infeasibility of generating support for large public transfers to countries such as China and India. But we use this scenario primarily as an illustrative benchmark and also to see the impact on some of the poorer countries in sub-Saharan Africa, for whom large public transfers remain politically feasible.

II. QUANTIFYING ECONOMIC EFFECTS UNDER COOPERATIVE REDUCTIONS

The quantitative results presented in this paper rely on a specific CGE model that has been developed at the World Bank, known as the Environmental Impact and Sustainability Applied General Equilibrium Model, or the ENVISAGE model.8 The primary purpose of the ENVISAGE model is to assess the growth and structural impacts for developing countries from climate change itself and policies to address climate change, either unilaterally or in an international agreement.

Any quantitative analysis in this field will be conditional on assumptions regarding exogenous developments (for example, the future cost of alternative technologies), key parameter values (for example, intra-fuel substitution elasticities) and model specification (for example, carbon tax revenue recycling). Our quantitative exercise is meant to be illustrative of the signs and broad magnitudes of effects rather than to be taken as definitive in any way. The reader should nonetheless keep in mind certain caveats regarding the model and its results.
First, and foremost, the model is not equipped to quantify any of the welfare benefits from emissions reductions per se and does not account for emissions related to forestry.9

Second, the modeling does not take into account any preexisting subsidies or other distortions in developing country energy markets whose elimination could provide opportunities for emission abatement. The OECD (2009) has calculated the fuel subsidies in a number of developing economies. Most of these are consumption rather than production subsidies and, although they vary across fuel types and income groups, their average value is relatively low (for example, less than 3 percent for China). But eliminating these subsidies will have positive welfare consequences (because distortions rise proportionally to the square of the subsidy rate) that our results do not incorporate. The IEA has estimated in its recent World Energy Outlook that eliminating subsidies alone could reduce global emissions by 10–15 percent.

Third, the model is not able to represent the full range of available alternative technologies, and so may tend to exaggerate the output and trade responses as energy prices rise with emission limits. But some features of the model may limit the biases on this score. We allow for exogenous improvements in manufacturing energy efficiency through the accumulation of more advanced capital stock. Also, the current version of the ENVISAGE model does allow for limited substitution between technologies. For example, it allows for switching to alternative (and cleaner) technologies in the power sector, albeit in limited fashion.10 The model also allows for some substitution to natural gas in the transportation sector, but not to biofuels and only to a limited extent to electricity (to the extent some modes of public transportation already rely on electricity).

8. The model has several distinguishing features: a focus on developing countries and significant sectoral disaggregation, an integrated climate module that generates changes in global mean temperature based on emissions of four greenhouse gases, and economic damage functions linked to changes in temperature. A description of the model and the key assumptions are provided in van der Mensbrugghe (2010).

9. The analysis similarly ignores co-benefits that could arise from a reduction in carbon intensity, notably a decline in local emissions such as particulate matter, sulfur, etc.

The limited possibilities for technological substitution may not be unrealistic given that our horizon is relatively short-term: we are projecting economic magnitudes for 2020, about 10 years out from today. Also, the emission taxes and the consequential price changes in our model are relatively small. For example, in the most extreme scenario, when both high and low income countries reduce emissions, the overall price of energy rises by 41 percent in China and 26 percent in India. These price increases are not large enough to induce large technology switching responses. For example, Birdsall and Subramanian (2009) find that it took the oil price shock of the 1970s (which involved a quadrupling of energy prices) to induce a small response in energy efficiency in production and an even more modest response on the consumption side.

Fourth, the numerical analysis presented below focuses exclusively on carbon emissions. These emissions alone are estimated to be responsible for some 71 percent of total greenhouse gas emissions (in units of carbon equivalent) without deforestation and around 60 percent including deforestation.11 The results are likely to vary if other greenhouse gases are taken into account in any emission reduction scenario.
A voluminous report in the Energy Journal concluded that taxes on carbon equivalent emissions would be lower in a multi-gas environment, with the mean tax on carbon equivalent emissions some 48 percent lower in 2025 than when considering carbon-only emissions.12 Marginal abatement curves for the other gases are considerably different and it may not necessarily be the case that they are significantly less steep in developing countries than in developed countries.

10. The current electricity technologies include five activities: coal, oil and gas, hydro, nuclear, and other (essentially renewable). The five activities are aggregated together to ‘generate’ a single electricity commodity distributed to households and producers. The ‘aggregator’ (for example the electricity distribution sector) chooses the least cost supplier subject to a CES aggregation function (that is calibrated to base year shares). Thus the coal producer will see a decline in demand relative to other producers, particularly hydro, nuclear and other, when subject to the carbon tax. The amount of the shift will depend on both the overall demand elasticity as well as the base year share. In the current baseline, these shares are fixed at base year levels. It is clear that there are nonprice factors that are pushing these shares in one direction or another and we are witnessing rapid rises (from a very low base) in renewable technologies (notably wind and solar). In the model, and in reality, expansion of hydro is limited to physical potential. We make no effort to model changes in the share of nuclear power. In addition, the model ignores one potentially significant change in power generation and that is the introduction of carbon capture and storage (CCS) for coal and gas powered thermal plants. However, CCS is unlikely to become a major technology before 2020 (though its anticipation could affect investment decisions in the near term).
CCS may also be a feasible technology in some other fossil fuel–dependent sectors such as cement and iron and steel production.

11. See Metz and others 2007.

12. See Weyant and others 2006.

Analytical Intuition

To spell out the intuition for some of the results, it may help to imagine a world where there are just two countries, one rich and one poor. Assume that initially, the price of emissions is zero in both countries, and in each country the equilibrium is where the marginal benefit of emissions equals the zero price of emissions.

The implications of a global agreement to cut emissions in each country can be considered in steps. First, a cut implemented either through a cap or through taxes will result in an increase in the price of emissions in each country, with a larger increase in the price in the rich country, if we assume that it is required to make bigger cuts than the poor country and that emissions are not traded. These emissions reductions would imply certain cuts in output in each country.

Next, we allow international emissions tradability, which leads to arbitrage and the establishment of a uniform global price of emissions. Given our assumptions about the relative size of initial cuts, the result of tradability per se will be a rise in the price of emissions in the poor country and a decline in the rich country. In the poor country, the sale of emission rights implies that output falls further whereas in the rich country, the purchase of emission rights means that output expands, so that the overall reduction in output is less than that due to emissions reductions per se. In other words, tradability accentuates the emission reduction-induced output cuts in the poor country and ameliorates the output cuts in the rich country. Of course, world output and welfare expand because the gains in the rich country outweigh the losses in the poor country.
Tradability ensures a more efficient allocation of world resources. Furthermore, tradability ensures that the loss in output for the poor country will be more than compensated by the financial transfers that will automatically occur as a result of tradability.

In addition to the negative impact on aggregate economic activity, emission cuts and the tradability of emissions can also affect the composition of economic activity. Consider an economy that produces manufactured goods (M) and other tradable goods (A) and, in the initial equilibrium, exports M and imports A. We have seen above that emission cuts combined with tradable permits lead to a contraction in the “emission endowments” of developing countries. If the production of M is energy intensive relative to the production of A (as is suggested by Table 6, which provides data on carbon intensity of the different sectors across the world) then, with unchanged world prices, this contraction is likely to lead to a cut in M output and increase in A output (which follows from the Rybczynski Theorem). But international prices are likely to change. If M were to become relatively more expensive (because at the global level, supply of relatively energy-intensive goods will decline), then the change in prices is likely to encourage the country’s production of M. If the country’s production of M is energy intensive relative to the rest of the world, then it is likely that the Rybczynski effect will dominate the relative price effect, and the share of M in total output will contract.

In this new equilibrium, without any transfers, the economy attains a lower level of welfare. Transfers from the rest of the world, in the form of payments for emission rights, could of course lead to an outward shift in the economy’s “budget constraint” and compensate partially or entirely for the loss in welfare.
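The two-country intuition above can be illustrated with a small numerical sketch. It is not the ENVISAGE model: it assumes quadratic abatement cost curves, and the cost slopes and abatement targets are hypothetical numbers chosen only so that the rich country has the steeper cost curve and the larger required cut.

```python
# Illustrative two-country permit-trading sketch.
# Assumptions (not from the paper): quadratic abatement costs and made-up numbers.

def total_cost(slope, abatement):
    """Abatement cost 0.5 * slope * a**2, so marginal cost is slope * a."""
    return 0.5 * slope * abatement ** 2

def no_trade(slope, target):
    """Each country meets its own target; its emissions price equals marginal cost."""
    return {"price": slope * target,
            "abatement": target,
            "net_cost": total_cost(slope, target)}

def with_trade(slopes, targets):
    """Permit trading equalizes marginal costs at one price while the
    combined abatement still equals the sum of the national targets."""
    total_target = sum(targets.values())
    price = total_target / sum(1.0 / s for s in slopes.values())
    result = {}
    for name, slope in slopes.items():
        abatement = price / slope
        permits_sold = abatement - targets[name]   # negative: permits bought
        transfer = price * permits_sold            # payment received
        result[name] = {"price": price,
                        "abatement": abatement,
                        "transfer_received": transfer,
                        "net_cost": total_cost(slope, abatement) - transfer}
    return result

SLOPES = {"rich": 4.0, "poor": 1.0}     # hypothetical cost-curve slopes
TARGETS = {"rich": 30.0, "poor": 10.0}  # hypothetical required cuts

autarky = {k: no_trade(SLOPES[k], TARGETS[k]) for k in SLOPES}
trade = with_trade(SLOPES, TARGETS)

for country in SLOPES:
    print(country, autarky[country]["price"], trade[country]["price"],
          trade[country]["abatement"], trade[country]["net_cost"])
```

With these assumed numbers the no-trade prices are 120 in the rich country and 10 in the poor country, and trading produces a single price of 32. The poor country abates more than its target and sells the difference, so its abatement rises while its net cost falls once permit revenue is counted, which mirrors the argument in the text.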
Emissions cuts and tradability of emission rights will also have dynamic effects through transfers and changes in the composition of economic activity. If a significant proportion of the transfers is invested in enhancing the economy’s productive capacity, then it is conceivable that any contraction could be eventually offset. But if transfers are mostly devoted to maintaining consumption, then the economy would suffer a durable contraction in productive capacity.

The Role of Abatement Costs

Even within developing countries, the impact of emission reductions is likely to differ across regions because abatement costs will vary. Marginal abatement costs are best captured by linking the implied carbon tax to different levels of emissions reductions effort. For any given level of emissions reductions, carbon taxes will depend on three factors. First, the greater the technological substitution possibilities between inputs including energy, the lower will be the carbon tax. Second, the greater the carbon intensity of production, the greater the possibility of achieving further efficiency and hence the lower the carbon tax will have to be. If a country is already green, it will be more difficult to squeeze out efficiency gains; put differently, higher carbon prices will be required to do so. Finally, the higher the initial price of energy, the higher the tax to achieve a given emissions reduction. This last point follows from the fact that emissions reductions are measured in percent, and the required outcome is some percent increase in the price of carbon. The greater the initial price, the larger the tax will have to be to achieve this given percent increase in prices.
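The first and third factors can be made concrete with a stylized calculation. The sketch below assumes a constant-elasticity energy demand curve, which is an illustrative assumption and not the ENVISAGE specification; the prices, the elasticity values, and the function name `required_tax` are hypothetical. It does not attempt to capture the second factor (carbon intensity).

```python
# Stylized tax needed for a given percent cut in energy use.
# Assumption (not from the paper): demand E = A * price**(-elasticity).

def required_tax(initial_price, cut_share, elasticity):
    """Per-unit tax that lowers energy use by `cut_share` (e.g. 0.17).

    Under constant-elasticity demand, cutting use by a share r requires the
    price to rise by the factor (1 - r)**(-1/elasticity); the tax is the
    difference between the new and the initial price.
    """
    if not 0.0 < cut_share < 1.0:
        raise ValueError("cut_share must lie strictly between 0 and 1")
    if elasticity <= 0.0:
        raise ValueError("elasticity must be positive")
    price_ratio = (1.0 - cut_share) ** (-1.0 / elasticity)
    return initial_price * (price_ratio - 1.0)

# Hypothetical inputs: same 17 percent cut, different starting prices.
tax_low_price = required_tax(initial_price=100.0, cut_share=0.17, elasticity=0.5)
tax_high_price = required_tax(initial_price=150.0, cut_share=0.17, elasticity=0.5)

# Hypothetical input: easier substitution, modeled as a higher elasticity.
tax_flexible = required_tax(initial_price=100.0, cut_share=0.17, elasticity=1.0)

print(tax_low_price, tax_high_price, tax_flexible)
```

Under this assumption the required percent price increase is the same wherever the starting price, so the tax in absolute terms scales one-for-one with the initial energy price, and a larger elasticity (more substitution possibilities) lowers the tax needed for the same cut.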
To see the role of abatement costs in determining the impact of emissions reductions, it is convenient to divide countries into three categories depending on their carbon intensity (Table 6). China and India represent the high carbon intensity group, which also includes Russia and the rest of Eastern Europe and Central Asia (all with economy-wide carbon intensities higher than 500 tons per million U.S. dollars), and possibly the Middle East and North Africa (at 380 tons per million U.S. dollars).13 Brazil represents the relatively low carbon intensity group (which also includes the rest of Latin America) with economy-wide carbon intensities lower than 200 tons per million U.S. dollars. Finally, there is an intermediate group, which includes sub-Saharan Africa (SSA), the rest of South Asia (SA), and the rest of East Asia (EA), with economy-wide carbon intensities between 280 and 332 tons per million U.S. dollars.

13. More detailed data on carbon intensities is available in Mattoo and others (2009). Production could be relatively carbon intensive in developing countries for these broad GTAP categories both because individual products are produced more carbon-intensively and because the broad product categories include more carbon-intensive products.

Marginal abatement costs are greatest for Brazil, which has the lowest carbon intensity of production (largely a result of its efficient sources of energy) and lowest for China, which has the highest carbon intensity of production. But it is worth noting that India, despite being relatively dirty in terms of carbon intensity, has in fact higher abatement costs than Sub-Saharan Africa. The reason is that energy prices on average are higher in India than in Sub-Saharan Africa.
Base year data, which are largely composed of information from the International Energy Agency and other sources, indicate that the average prices of refined oil and electricity are some 42 and 80 percent higher, respectively, in India. All else equal, this would tend to require a higher carbon tax for the same mitigation effort. These differences are at the heart of understanding the changes in output, exports, and their composition between the different groups of countries. The actual results will, of course, incorporate not just static differences but also dynamic changes to these various economies that will see changes in average carbon intensity, energy prices, and economic ‘flexibility.’

Category 1: High Carbon Intensity Countries (China, India, Russia, Eastern Europe and Central Asia, and the Middle East and North Africa)

We consider the impact on this group of countries in each of the four scenarios described in Section II. In the first scenario, when cuts are implemented without the possibility of international trade in emission rights (the “NTER” scenario), our simulations suggest that the average carbon price rises to US$260 per ton of carbon in high income countries and to US$40 per ton of carbon in low and middle income countries (LMICs) (Table 2).14 Aggregate welfare would fall by 1.0 percent relative to the baseline in all LMICs, and by about 0.6 percent in the high income countries (Table 3). The net impact on measured welfare will depend on a number of elements, but for some countries a key factor will be the change in the price of primary energy. For net importers of primary energy, the price of which declines in the presence of a carbon tax, the impact is reflected in a net positive terms of trade gain. The reverse effect occurs for net exporters of primary energy.
For example, for China and India, the reduction in the import bill is between US$1 billion and US$2 billion in 2020 when taking only the price impact into account.15 For these two economies, this represents between 0.02 and 0.08 percent of their baseline real income, with the lower percent gain prevailing for China. Thus the terms of trade impact counteracts to some extent the real income loss suffered by raising the carbon tax. Net energy exporters witness a terms of trade loss, for example, between US$3 billion and US$5 billion for Russia (depending on the scenario). This translates into a significant welfare loss for Russia of between 0.23 and 0.35 percent, because its economy is reliant on energy exports.

14. All prices are measured in terms of 2004 U.S. dollars per ton of carbon. The price per ton of CO2 can be obtained by dividing the carbon price by approximately 4 (or, more precisely, by 44/12 ≈ 3.67).

15. The terms of trade impact in dollar terms is calculated as the change in the world price (relative to the baseline) multiplied by the average of the ex ante and ex post trade volumes.

TABLE 2. Impact on Emissions Reductions

| Scenario | World Total | High Income Countries | United States | EU27 with EFTA | Low and Middle Income Countries | China | India | Brazil | Sub-Saharan Africa |
|---|---|---|---|---|---|---|---|---|---|
| % Change in Emissions Relative to Business as Usual (BAU) in 2020 | | | | | | | | | |
| NTER | -21.5 | -30.0 | -33.5 | -29.9 | -17.0 | -17.0 | -17.0 | -17.0 | -17.0 |
| TER1 | -21.5 | -10.5 | -11.8 | -8.1 | -27.5 | -33.8 | -23.8 | -8.3 | -28.8 |
| TER | -21.5 | -10.5 | -11.9 | -8.2 | -27.5 | -33.8 | -23.8 | -8.3 | -28.8 |
| TERWMT | -21.5 | -10.7 | -12.0 | -8.4 | -27.4 | -33.8 | -23.7 | -8.3 | -28.7 |
| % Change in Emissions Relative to 2005 | | | | | | | | | |
| NTER | 35.6 | -13.4 | -17.0 | -17.0 | 82.6 | 132.9 | 110.8 | 12.7 | 47.1 |
| TER1 | 35.6 | 10.7 | 10.0 | 8.9 | 59.4 | 85.8 | 93.7 | 24.6 | 26.2 |
| TER | 35.6 | 10.6 | 9.9 | 8.7 | 59.5 | 85.9 | 93.6 | 24.5 | 26.2 |
| TERWMT | 35.6 | 10.5 | 9.8 | 8.4 | 59.7 | 85.9 | 93.7 | 24.5 | 26.4 |
| Implied Emissions Tax (dollars per ton of carbon) | | | | | | | | | |
| NTER | 109.0 | 259.6 | 250.2 | 339.1 | 40.4 | 25.3 | 41.6 | 157.3 | 33.5 |
| TER1 | 62.2 | 62.2 | 62.2 | 62.2 | 62.2 | 62.2 | 62.2 | 62.2 | 62.2 |
| TER | 62.5 | 62.5 | 62.5 | 62.5 | 62.5 | 62.5 | 62.5 | 62.5 | 62.5 |
| TERWMT | 63.0 | 63.0 | 63.0 | 63.0 | 63.0 | 63.0 | 63.0 | 63.0 | 63.0 |

Notes: NTER: Both industrial and developing countries reduce emissions but emission rights are not tradable. TER1: Both industrial and developing countries reduce emissions; emission rights are tradable, but we abstract from private transfers. TER: Both industrial and developing countries reduce emissions and emission rights are tradable. TERWMT: Both industrial and developing countries reduce emissions; emission rights are tradable; and transfers offset welfare loss from emissions.

Manufacturing exports decline by 2.1 percent in China and 3.5 percent in India. The corresponding declines in manufacturing output are 1.3 percent and 1.6 percent, respectively (Table 3).16 The main reason for these declines is that manufacturing is the most carbon-intensive sector, after the energy sector itself, and so is worst hit by carbon price increases.

16. Russia is an exception in this group of countries because its manufacturing output and exports increase in the NTER scenario (see supplementary results online at www.worldbank.org/trade). The reason is that when all countries cut their emissions, there is a significant contraction in global demand for energy. Because energy accounts for a large share of the Russian economy (53 percent of its exports and 24 percent of its output, as shown in supplementary results online at www.worldbank.org/trade), the contraction in demand induces a significant shift in resources away from Russia’s energy sector and towards other sectors, including manufacturing.

In the second scenario (TER1), tradability leads to a uniform global carbon price of US$62 per ton (Table 2), but we abstract from the private transfers that would result from tradability. Recall that this scenario is equivalent to a uniform global carbon tax regime where the taxes are retained domestically rather than being transferred across countries. In this case, welfare losses increase substantially, especially for China, from 0.8 percent to 1.7 percent, and for India, from 0.6 percent to 0.9 percent (see scenario TER1 in Table 3). Manufacturing output declines further, by 2.9 and 2.6 percent, respectively, in China and India.

TABLE 3. Impact on Welfare, Manufacturing Output, and Exports

| Scenario | World Total | High Income Countries | United States | EU27 with EFTA | Low and Middle Income Countries | China | India | Brazil | Sub-Saharan Africa |
|---|---|---|---|---|---|---|---|---|---|
| % Change in Welfare | | | | | | | | | |
| NTER | -0.7 | -0.6 | -0.6 | -0.6 | -1.0 | -0.8 | -0.6 | -0.5 | -1.0 |
| TER1 | -0.4 | -0.1 | -0.1 | 0.0 | -1.2 | -1.7 | -0.9 | -0.2 | -0.9 |
| TER | -0.4 | -0.2 | -0.3 | -0.2 | -0.8 | -0.8 | -0.6 | -0.3 | -0.7 |
| TERWMT | -0.3 | -0.5 | -0.5 | -0.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| % Change in Output of Total Manufacturing | | | | | | | | | |
| NTER | -0.8 | -0.7 | -1.0 | -0.5 | -1.0 | -1.3 | -1.6 | -0.4 | 1.1 |
| TER1 | -0.8 | 0.1 | 0.0 | 0.3 | -2.0 | -2.9 | -2.6 | 0.2 | 0.4 |
| TER | -0.7 | 0.3 | 0.4 | 0.3 | -2.2 | -3.2 | -2.7 | 0.2 | 0.3 |
| TERWMT | -0.7 | 0.6 | 0.7 | 0.6 | -2.5 | -3.5 | -3.0 | -0.1 | 0.1 |
| % Change in Exports of Total Manufacturing | | | | | | | | | |
| NTER | -1.5 | -1.6 | -1.8 | -1.4 | -1.4 | -2.1 | -3.5 | -1.0 | 3.6 |
| TER1 | -0.9 | 0.2 | 0.1 | 0.7 | -2.0 | -2.9 | -4.4 | 0.8 | 1.5 |
| TER | -0.9 | 0.7 | 1.3 | 1.3 | -2.7 | -4.3 | -5.3 | 1.0 | 1.0 |
| TERWMT | -0.9 | 1.8 | 2.3 | 3.0 | -3.9 | -5.6 | -7.3 | -0.3 | -0.5 |

Notes: Changes are expressed relative to business-as-usual in 2020. NTER: Both industrial and developing countries reduce emissions but emission rights are not tradable. TER1: Both industrial and developing countries reduce emissions; emission rights are tradable, but we abstract from private transfers. TER: Both industrial and developing countries reduce emissions and emission rights are tradable. TERWMT: Both industrial and developing countries reduce emissions; emission rights are tradable; and transfers offset welfare loss from emissions.

Allowing private transfers (along with tradability), as expected, alleviates the welfare declines seen in the non-tradability scenario (see scenario TER in Table 3).17 However, it magnifies the impact, especially on manufacturing exports, via exchange rate appreciation. For example, China’s manufacturing exports fall by 4.3 percent, and India’s by 5.3 percent. The pure effect of the private transfers (the difference between the TER1 and TER scenarios) is to induce a further decline in exports amounting to 1.4 percent for China and 0.9 percent for India.18 It must be emphasized here that while these changes appear small, they are small in large part because the emissions reductions effort by all countries is very modest, reflected in the fact that the global price of CO2 increases by about US$17 per ton. (In Mattoo and others 2009, we show that declines are substantially larger with more ambitious emissions reductions of 30 percent for both groups instead of the 17 percent assumed here.)

17. The magnitude of this effect depends, of course, on the quota allocation scheme.

18. In our model, exchange rate appreciation effects from transfers arise mainly from the condition that the external accounts must be balanced, which is a plausible description of long run equilibrium. Are these effects from transfers plausible? In the case of China, for example, the results suggest that a transfer of about 1.8 percent of GDP would depress manufacturing export growth by about 0.5 percent. This is well within the range obtained from econometric estimates: Rajan and Subramanian (2011) find that a 1 percent increase in the aid-to-GDP ratio tends to reduce overall manufacturing growth by close to 1 percent.

The other high carbon intensity countries in regions such as the Middle East and North Africa and Eastern Europe and Central Asia suffer output and export reductions due to the emissions reductions, just as in China and India. But the former group does not suffer much from emissions tradability and the implied private transfers. The magnitude of transfers will depend on the wedge between the domestic carbon price prevailing after emission cuts and the uniform global price that will prevail with tradability. For the Middle East and North Africa and Eastern Europe and Central Asia, the former is close to the latter, so that tradability leads to a small price change and hence also to a small private transfer.

If developing countries were to receive additional official transfers to compensate for the loss of welfare caused by emissions reductions, then the exchange rate appreciation effects would be even stronger (see scenario TERWMT in Table 3). Manufacturing exports would decline by 5.6 percent and 7.3 percent, respectively, for China and India. The corresponding figures for manufacturing output are 3.5 percent and 3.0 percent, respectively. As we mentioned earlier, these transfers are unlikely to materialize for the larger developing countries but cannot be ruled out for poorer countries. To maintain welfare, the EU, Japan, and the United States would be required to make total transfers (public and private) equal to about 0.5 percent of their GDP.

In sum, emission limits with tradability create a dilemma for this group of countries: tradability leads to a contraction in the manufacturing sector, and the more the country seeks to maintain welfare, the higher the price it will pay in terms of further contraction of this sector.
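Two pieces of arithmetic from the footnotes to this section (the conversion from a price per ton of carbon to a price per ton of CO2, and the dollar terms of trade impact) can be sketched as follows. This is a minimal illustration rather than the model's code: the function names are hypothetical, and the price change and trade volumes passed to the second function are made-up numbers.

```python
# Mass of CO2 per unit mass of carbon (molecular weight 44 over atomic weight 12).
CARBON_TO_CO2 = 44.0 / 12.0


def co2_price(carbon_price):
    """Convert a price per ton of carbon into a price per ton of CO2."""
    return carbon_price / CARBON_TO_CO2


def terms_of_trade_impact(price_change, volume_before, volume_after):
    """Change in world price times the average of ex ante and ex post volumes.

    The sign follows the price change; whether it is a gain or a loss depends
    on whether the volumes are imports or exports (an interpretive assumption).
    """
    return price_change * (volume_before + volume_after) / 2.0


# The uniform carbon price in the TER scenario (Table 2) is US$62.5 per ton of
# carbon, which works out to roughly US$17 per ton of CO2.
print(round(co2_price(62.5), 1))

# Hypothetical inputs: the world price falls by 2 dollars per unit and the
# traded volume moves from 100 to 90 units.
print(terms_of_trade_impact(-2.0, 100.0, 90.0))
```

Averaging the ex ante and ex post volumes keeps the measure from depending on which of the two equilibria is used as the weight.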
Generalizing the Results to Other Scenarios

Are these results unique to the assumptions we have made about the extent of emissions reductions by developing countries? In Figures 1–3, we show the consequence of replicating the analysis described above for a range of emissions reductions by developing countries, from no emissions reduction (relative to BAU) to a 40 percent cut, keeping the emissions reduction by high income countries fixed at 17 percent below 2005 levels. For China, for example, we find results consistent with the findings described earlier.

Several features are noteworthy about these figures. First, as expected, the greater the emissions reductions by these countries, the greater the decline in their manufacturing exports and output. More interesting are the respective consequences of tradability and transfers, which are captured by the gap between the different lines in the graph.

FIGURE 1. Impact on China’s Manufacturing Exports and Output of Emissions Reductions by All Developing Countries (percentage change relative to BAU in 2020)
Source: Authors’ analysis based on data described in the text.
Note: Emissions reductions by high income countries are fixed at 17 percent below 2005 levels.

For exports, significant adverse impacts arise from the exchange rate effects of transfers (see in Figure 1 the difference between TER1, which involves no transfers, and TER, which allows private transfers, or TERWMT, which also allows public transfers). The incremental effect of private transfers increases with the level of emissions reductions (the gap between the TER and TER1 scenarios widens).19 Note that a 40 percent emissions reduction relative to BAU still represents an increase in emissions relative to 2005.
If developing countries had to start ensuring even stabilization of carbon emissions by 2020, the implied effects on manufacturing exports, based on extrapolating the trends shown in Figure 1, would be enormous.

19. The magnitude of transfers for any country is the product of the international price of carbon and its own sales/purchases of emissions. The international price rises with deeper emissions cuts by developing countries. The sales/purchases will depend on the wedge between the domestic and international price of carbon. In the case of China, this wedge narrows more gradually (and hence the volume of its emissions sales declines gradually) because of its greater carbon intensity.

For output, the significant adverse effects arise from the economy-wide carbon price-increasing effects of tradability (see in Figure 1 the difference between NTER, which assumes emissions are not tradable, and TER1, which assumes emissions are tradable). In fact, even if China made no cuts in emissions but bound emission levels at BAU levels and allowed international tradability, it would see a decline in manufacturing output of about 1.0 percent.

Category 2: Low Carbon Intensity Countries (Brazil and Latin America)

The effects on the manufacturing sector of low carbon intensity countries from policy actions related to climate change are likely to be different from those on high carbon intensity countries. There are two counteracting factors. On the one hand, any change in the price of carbon affects manufacturing output and competitiveness less in these countries because of their low carbon intensity. Brazil’s carbon intensity in manufacturing, for example, at 168 tons per million U.S. dollars (Table 6), is about one-quarter and one-third of China’s and India’s, respectively. On the other hand, as discussed above in the section on abatement costs, reductions in emissions require progressively higher carbon price increases in these countries, in large part because their production is already relatively clean and it is harder for them to squeeze out deeper and deeper reductions. For example, to achieve a 5 percent reduction, Brazil’s carbon price would need to be US$43; but to achieve a 17 percent reduction in emissions, Brazil’s carbon price would need to increase to US$157 per ton of carbon, about four times the required level in India and six times the required level in China (Table 2).

FIGURE 2. Impact on Brazil’s Manufacturing Exports and Output of Emissions Reductions by All Developing Countries (percentage change relative to BAU in 2020)
Source: Authors’ analysis based on data described in the text.

When only small cuts are made in emissions by developing countries, the positive effect on Brazil’s manufacturing sector of its relatively low carbon intensity dominates the negative effect due to its higher carbon price (see NTER in Figure 2). But when larger cuts are made, the converse is true: the large increases in carbon price overwhelm the benefits of low relative carbon intensity, so that Brazil’s manufacturing exports and output decline.

If trade in emissions rights is allowed, Brazil enters the market at low levels of emissions reductions as a seller but at higher levels of emission reductions as a buyer, like the high income countries. The result in the latter situations is a decline in the carbon price toward the global uniform price and private outflows, both of which benefit the manufacturing sector (see TER1 and TER in Figure 2).

Category 3: Intermediate Carbon Intensity Countries (Sub-Saharan Africa, Rest of South Asia, and Rest of East Asia)

The impact on sub-Saharan Africa, South Asia, and East Asia is broadly intermediate between those on the high and low carbon intensity economies, and we focus here on sub-Saharan Africa.20 The sub-Saharan African manufacturing sector actually expands if all developing countries cut their emissions by 30 percent (see NTER in Figure 3). The reason is primarily sub-Saharan African countries’ low carbon intensity in manufacturing, which, combined with the lower emissions tax consequent upon emissions reductions, actually improves competitiveness relative to other countries.

20. East Asia resembles Brazil in that emissions reductions require a high carbon price due to their already relatively clean production. Therefore, emissions trading leads to a decline in the carbon price, which benefits manufacturing.

FIGURE 3. Impact on sub-Saharan Africa’s Manufacturing Exports and Output of Emissions Reductions by All Developing Countries (percentage change relative to BAU in 2020)
Source: Authors’ analysis based on data described in the text.

However, if sub-Saharan African countries receive large public transfers to compensate for loss in welfare (1.5 percent), then they could experience an adverse export effect from an exchange rate appreciation. The negative effect of public transfers (the gap between TER and TERWMT in Figure 3) on manufacturing exports could be close to 1.5 percent, unless these transfers were successfully invested in ways that enhanced productivity in manufacturing or reduced trade costs.

Changes in the Composition of Manufacturing

It is clear that the bigger impacts are in energy intensive manufacturing, but countries may also be interested in the impacts on other manufacturing sectors (which include clothing, electronics, and transport equipment).21 The trade-off between carbon and long-run growth effects could be different between these sectors. For example, if the dynamic growth effects are less strong in energy intensive sectors than in other manufacturing sectors, and if the latter are not substantially affected by emissions reductions and emissions tradability, international commitments on emissions reductions should raise fewer growth concerns.

21. While we make a distinction between energy-intensive and other manufacturing, we could also distinguish between heavy and light manufacturing.

In countries like China and India, the impact of emissions reductions and tradability on the category “other manufacturing” will also be substantial (Tables 4 and 5). Output will decline by 2.2 percent and 1.5 percent, respectively, for the two countries, and exports by close to 3 percent for both countries. For other countries such as Brazil, East Asia, and sub-Saharan Africa, the impact on the output of other manufacturing sectors will be relatively modest. It is noteworthy that exchange rate effects will remain strong for exports of other manufacturing sectors in China, India, and sub-Saharan Africa. For China and India, the effect of private transfers per se is to induce a decline in exports of 1.5 and 1.0 percent, respectively. For sub-Saharan Africa, the effect of private and public transfers is to induce a 2.3 percent decline in other manufacturing exports.

Overall, the preceding results suggest that the effects on manufacturing output and exports may differ across developing countries. High carbon intensity countries (China, India, Eastern Europe and Central Asia, and the Middle East and North Africa) will see a larger negative impact on both manufacturing output and exports than low carbon intensity countries (Brazil, Latin America, East Asia). Some high carbon intensity countries (especially China and India) may also be resistant to emissions tradability because of the further negative impact on output and of the impact of the resulting private transfers on manufacturing exports. Low carbon intensity countries will not be averse to emissions tradability on these grounds.
For sub-Saharan African countries, a potential negative effect could stem from the effect of public transfers on manufacturing exports, unless these transfers could be successfully invested in ways that enhanced productivity in manufacturing or reduced trade costs.

TABLE 4. Change in Output by Sector (percent)

| Scenario | Sector | High Income Countries | United States | EU27 with EFTA | Low and Middle Income Countries | China | India | Brazil | Sub-Saharan Africa |
|---|---|---|---|---|---|---|---|---|---|
| NTER | Agriculture | -1.3 | -2.2 | -1.1 | 0.2 | 0.5 | 0.0 | -1.6 | 0.3 |
| NTER | All energy | -9.6 | -14.8 | -7.5 | -6.2 | -5.3 | -5.8 | -5.3 | -4.0 |
| NTER | All manufacturing | -0.7 | -1.0 | -0.5 | -1.0 | -1.3 | -1.6 | -0.4 | 1.1 |
| NTER | Energy intensive manufacturing | -1.3 | -3.3 | -0.5 | -0.8 | -0.6 | -1.7 | -1.0 | 3.7 |
| NTER | Other manufacturing | -0.4 | 0.1 | -0.6 | -1.1 | -1.7 | -1.6 | -0.1 | -0.1 |
| NTER | Other industries | -0.7 | -1.3 | -0.3 | -0.8 | -0.8 | -0.6 | -0.4 | -0.8 |
| NTER | Services | -0.3 | -0.3 | -0.5 | -0.4 | -0.6 | -0.5 | -0.5 | -0.1 |
| NTER | All goods and services | -0.8 | -1.2 | -0.7 | -1.2 | -1.3 | -1.4 | -1.0 | -0.2 |
| TER1 | Agriculture | -1.3 | -2.1 | -0.8 | 0.3 | 0.7 | -0.1 | -1.6 | 0.6 |
| TER1 | All energy | -3.5 | -5.9 | -1.9 | -8.3 | -11.3 | -9.0 | -2.6 | -4.5 |
| TER1 | All manufacturing | 0.1 | 0.0 | 0.3 | -2.0 | -2.9 | -2.6 | 0.2 | 0.4 |
| TER1 | Energy intensive manufacturing | 1.1 | 0.1 | 1.5 | -4.1 | -4.9 | -4.8 | 0.4 | 0.6 |
| TER1 | Other manufacturing | -0.3 | 0.0 | -0.4 | -1.0 | -1.9 | -1.4 | 0.1 | 0.3 |
| TER1 | Other industries | -0.2 | -0.4 | 0.0 | -1.4 | -2.1 | -1.2 | -0.2 | -1.2 |
| TER1 | Services | -0.1 | -0.1 | -0.1 | -0.7 | -1.4 | -0.8 | -0.3 | -0.2 |
| TER1 | All goods and services | -0.2 | -0.3 | -0.1 | -2.0 | -2.8 | -2.2 | -0.4 | -0.5 |
| TER | Agriculture | -1.1 | -1.5 | -0.7 | 0.3 | 0.7 | -0.1 | -1.3 | 0.6 |
| TER | All energy | -3.5 | -5.8 | -1.9 | -8.3 | -11.2 | -8.9 | -2.5 | -4.5 |
| TER | All manufacturing | 0.3 | 0.4 | 0.3 | -2.2 | -3.2 | -2.7 | 0.2 | 0.3 |
| TER | Energy intensive manufacturing | 1.2 | 0.4 | 1.6 | -4.3 | -5.1 | -5.0 | 0.4 | 0.5 |
| TER | Other manufacturing | -0.2 | 0.4 | -0.3 | -1.2 | -2.2 | -1.5 | 0.1 | 0.3 |
| TER | Other industries | -0.3 | -0.5 | -0.1 | -0.9 | -1.3 | -1.0 | -0.2 | -1.0 |
| TER | Services | -0.1 | -0.1 | -0.2 | -0.6 | -1.0 | -0.7 | -0.3 | -0.2 |
| TER | All goods and services | -0.2 | -0.3 | -0.1 | -1.9 | -2.8 | -2.2 | -0.4 | -0.5 |
| TERWMT | Agriculture | -0.6 | -0.8 | -0.5 | 0.3 | 0.7 | 0.0 | -1.4 | 0.5 |
| TERWMT | All energy | -3.4 | -5.8 | -1.9 | -8.2 | -11.1 | -8.9 | -2.6 | -4.2 |
| TERWMT | All manufacturing | 0.6 | 0.7 | 0.6 | -2.5 | -3.5 | -3.0 | -0.1 | 0.1 |
| TERWMT | Energy intensive manufacturing | 1.5 | 0.7 | 1.9 | -4.6 | -5.3 | -5.4 | 0.1 | 0.2 |
| TERWMT | Other manufacturing | 0.1 | 0.7 | 0.0 | -1.5 | -2.5 | -1.8 | -0.1 | 0.1 |
| TERWMT | Other industries | -0.5 | -0.7 | -0.5 | -0.2 | -0.5 | -0.5 | 0.1 | -0.4 |
| TERWMT | Services | -0.2 | -0.2 | -0.3 | -0.3 | -0.7 | -0.5 | -0.2 | 0.0 |
| TERWMT | All goods and services | -0.1 | -0.3 | 0.0 | -1.9 | -2.7 | -2.1 | -0.4 | -0.4 |

Notes: Changes are expressed relative to business-as-usual in 2020. NTER: Both industrial and developing countries reduce emissions but emission rights are not tradable. TER1: Both industrial and developing countries reduce emissions; emission rights are tradable, but we abstract from private transfers. TER: Both industrial and developing countries reduce emissions and emission rights are tradable. TERWMT: Both industrial and developing countries reduce emissions; emission rights are tradable; and transfers offset welfare loss from emissions.

TABLE 5. Change in Exports by Sector (percent)

| Scenario | Sector | High Income Countries | United States | EU27 with EFTA | Low and Middle Income Countries | China | India | Brazil | Sub-Saharan Africa |
|---|---|---|---|---|---|---|---|---|---|
| NTER | Agriculture | -4.1 | -4.4 | -7.7 | -1.7 | 2.5 | 2.0 | -4.2 | 1.1 |
| NTER | All energy | -12.1 | -23.3 | -8.4 | -8.2 | -8.1 | 4.8 | -2.3 | -4.4 |
| NTER | All manufacturing | -1.6 | -1.8 | -1.4 | -1.4 | -2.1 | -3.5 | -1.0 | 3.6 |
| NTER | Energy intensive manufacturing | -3.3 | -8.4 | -0.2 | 0.2 | 2.0 | -2.2 | -3.2 | 9.2 |
| NTER | Other manufacturing | -0.9 | 0.9 | -2.0 | -1.9 | -2.9 | -4.0 | 0.0 | -0.5 |
| NTER | Other industries | -4.5 | -1.2 | -2.6 | 0.6 | -0.7 | -0.7 | -3.8 | 0.0 |
| NTER | Services | -1.9 | -2.6 | -2.8 | 2.9 | 3.6 | 3.1 | -6.6 | 2.8 |
| NTER | All goods and services | -2.2 | -2.9 | -2.2 | -1.7 | -1.8 | -1.8 | -2.4 | -0.3 |
| TER1 | Agriculture | -4.5 | -4.4 | -6.4 | -1.7 | 11.8 | 4.1 | -4.3 | 2.2 |
| TER1 | All energy | -5.0 | -10.5 | -0.8 | -4.1 | -8.1 | -0.2 | -4.2 | -2.3 |
| TER1 | All manufacturing | 0.2 | 0.1 | 0.7 | -2.0 | -2.9 | -4.4 | 0.8 | 1.5 |
| TER1 | Energy intensive manufacturing | 3.2 | 0.6 | 5.9 | -7.5 | -12.5 | -11.0 | 1.7 | 1.4 |
| TER1 | Other manufacturing | -1.1 | -0.1 | -1.8 | -0.4 | -1.1 | -2.0 | 0.4 | 1.6 |
| TER1 | Other industries | -2.7 | -1.6 | -2.6 | -1.3 | -1.4 | -4.1 | -1.6 | -3.3 |
| TER1 | Services | -1.5 | -1.5 | -2.1 | 2.2 | 4.8 | 6.2 | -3.0 | 0.9 |
| TER1 | All goods and services | -0.6 | -1.0 | -0.3 | -1.7 | -2.4 | -2.3 | -1.4 | -0.5 |
| TER | Agriculture | -3.4 | -3.2 | -5.3 | -1.3 | 8.6 | 2.4 | -3.6 | 1.6 |
| TER | All energy | -4.6 | -9.7 | -0.9 | -4.3 | -9.1 | -0.2 | -3.6 | -2.3 |
| TER | All manufacturing | 0.7 | 1.3 | 1.3 | -2.7 | -4.3 | -5.3 | 1.0 | 1.0 |
| TER | Energy intensive manufacturing | 3.8 | 1.6 | 6.3 | -8.1 | -13.8 | -11.5 | 1.8 | 1.0 |
| TER | Other manufacturing | -0.5 | 1.1 | -1.1 | -1.1 | -2.6 | -3.0 | 0.6 | 1.0 |
| TER | Other industries | -2.1 | -0.6 | -2.1 | -1.3 | -2.4 | -4.2 | -1.2 | -3.4 |
| TER | Services | -1.2 | -0.8 | -1.9 | 1.7 | 3.4 | 5.3 | -3.0 | 0.3 |
| TER | All goods and services | -0.1 | 0.1 | 0.1 | -2.3 | -3.9 | -3.1 | -1.0 | -0.8 |
| TERWMT | Agriculture | -1.6 | -1.7 | -2.5 | -2.1 | 6.6 | -0.8 | -3.6 | -0.2 |
| TERWMT | All energy | -4.0 | -9.0 | -0.1 | -4.6 | -9.4 | -0.4 | -3.8 | -2.1 |
| TERWMT | All manufacturing | 1.8 | 2.3 | 3.0 | -3.9 | -5.6 | -7.3 | -0.3 | -0.5 |
| TERWMT | Energy intensive manufacturing | 4.8 | 2.5 | 7.9 | -9.3 | -15.0 | -12.9 | 0.5 | -0.2 |
| TERWMT | Other manufacturing | 0.6 | 2.1 | 0.7 | -2.4 | -3.9 | -5.3 | -0.7 | -0.7 |
| TERWMT | Other industries | -1.0 | 0.5 | -0.9 | -1.8 | -3.0 | -4.7 | -1.7 | -3.9 |
| TERWMT | Services | -0.6 | -0.2 | -0.8 | 0.5 | 2.2 | 3.5 | -4.4 | -1.1 |
| TERWMT | All goods and services | 0.9 | 1.0 | 1.7 | -3.4 | -5.1 | -4.9 | -1.8 | -1.4 |

Notes: Changes are expressed relative to business-as-usual in 2020. NTER: Both industrial and developing countries reduce emissions but emission rights are not tradable. TER1: Both industrial and developing countries reduce emissions; emission rights are tradable, but we abstract from private transfers. TER: Both industrial and developing countries reduce emissions and emission rights are tradable. TERWMT: Both industrial and developing countries reduce emissions; emission rights are tradable; and transfers offset welfare loss from emissions.

The Implications of Endogenous Technological Change

In all the estimates thus far, we have assumed a worldwide autonomous energy efficiency improvement (AEEI) of 1 percent per annum in each of the scenarios: business-as-usual, as well as the different policy scenarios. But it is possible, indeed quite likely, that some endogenous technological improvements will be induced by the emissions reductions actions undertaken pursuant to any future climate change agreement. In order to assess the robustness of our results, we assume that the climate mitigation actions described in Table 1 will induce a further improvement in energy efficiency of 50 percent relative to the business-as-usual scenario (i.e., the AEEI parameter is set at 1.5 percent per annum) in developing countries, starting in the first policy year, 2012.22 We do not consider any costs that may be associated with the higher AEEI.

22. In principle, some energy efficiency will also be induced in industrial countries. But the emissions reductions will imply a much greater price increase in developing countries, which, combined with the greater possibilities of adaptation and of reducing x-inefficiency over the time frame we are considering, implies that efficiency increases are likely to be greater in developing than in industrial countries.

As expected, energy efficiency improvements do mitigate some of the adverse impacts on manufacturing output and exports. For example, in the case of China, manufacturing output declines by 2.2 percent (compared with 3.2 percent without the improvement in energy efficiency) and manufacturing exports decline by 3.5 percent (compared with 4.3 percent without the energy efficiency improvement).
For India, the magnitudes are similar: output and exports decline by 1.3 percent and 3.7 percent (compared with 2.7 and 5.3 percent, respectively, in the absence of energy efficiency improvements).23 Thus, while there is some amelioration, the overall adverse impact of climate change actions remains substantial. It is worth noting that the magnitude of relief will actually be lower than these numbers if we bear in mind that improvements in technology are themselves not costless.

23. Details of the estimates under this scenario of endogenous technological improvement are available from the authors upon request.

Cost of Dislocation

Thus far, we have focused on the impact of emissions reductions on the composition of output and exports. There are also likely to be dislocation costs as resources are reallocated across sectors, and the nature of these dislocations will differ between high and low income countries. For example, in the US and EU, all nine manufacturing sectors in our model are likely to expand as a result of international tradability of emissions; in contrast, in China, eight of the nine sectors are expected to see a decline in output (refined oil, chemicals, rubber and plastics, paper products and publishing, mineral products, ferrous metals, other metals, transport equipment, and other manufacturing). In India, seven out of the nine sectors are likely to see a decline in output. If some factors are sector-specific and imperfectly mobile (which is assumed away in the model), then the transition to any new equilibrium could lead to at least temporary unemployment. The irony is that high income countries, which typically have better social protection mechanisms, are less likely to need to deal with the contraction of tradable sectors.

III. CONCLUSION

With an increasing number of countries accepting the need for action to address climate change, both the prospects for and the impact of cooperative emissions reductions are receiving significant attention. This paper has attempted to quantify, and has provided a methodological tool for quantifying, the impact of cooperative policy actions related to climate change on the manufacturing sector in developing countries. We have departed from the existing work on climate change in two ways. First, we have disaggregated the policy actions into emission reductions per se, international emissions tradability, and international transfers. Second, in terms of outcomes, instead of focusing on aggregate output, we quantify the effects on manufacturing output and exports.

Our key findings, which come with all the caveats that we have noted, are the following. Some currently high carbon intensity countries/regions (such as China, India, Eastern Europe and Central Asia, and the Middle East and North Africa) will experience substantial reductions in manufacturing output and exports from emissions reductions per se. For a subset of these countries, especially China and India, these effects will be aggravated by emissions tradability (especially on manufacturing output) and transfers (especially on manufacturing exports). For this subset, the negative effects will be substantial not just for carbon-intensive manufacturing but also for other manufacturing sectors.

In contrast, the manufacturing sector in low carbon intensity countries (such as Brazil and Latin America) will be minimally affected by the actions related to climate change. In the case of sub-Saharan Africa, effects might even be positive, although any boost to manufacturing exports could be reduced through transfers and the consequent exchange rate effects. Of course, if private and public transfers are able to raise productivity and reduce trade costs, then these effects could be offset.
If countries, for political or other reasons, want to preserve manufacturing capacity, what are their options in a world with emissions reductions? For low carbon intensity countries, there is not much of a dilemma because the results suggest that there is limited impact of climate change actions on the manufacturing sector. For sub-Saharan Africa, there might be a tension related to transfers which would need to be addressed. But for high carbon intensity countries (especially China, India, Eastern Europe and Central Asia, and the Middle East and North Africa), whose manufacturing exports and output will be substantially affected, the choice may be more difficult.

One solution might then be to tax the carbon externality appropriately (by taking on international obligations on emissions reductions and tradability) while addressing the manufacturing objective through a combination of production subsidies (if the objective is to preserve manufacturing output) or export subsidies (if the objective is to maintain manufacturing exports). For developing countries, this solution could encounter two problems. First, World Trade Organization rules prohibit the use of export subsidies, and production subsidies can be legally countervailed by trading partners. A second, arguably bigger, problem is the difficulty of implementing subsidies: the experience with industrial policies and “picking winners” has highlighted the demanding requirements for successfully doing so. Thus, if implementation capacity is limited and countries find themselves in a second-best world, the reconciliation of the two objectives becomes more difficult.

In this case, an alternative approach would be for countries to use one instrument but to strike a balance between the two objectives.
So, if countries cannot implement subsidies to attain the manufacturing objective, they may choose to allow some increase in carbon prices (consequent upon, say, domestic emissions reductions) but not to allow any further increase (resulting from emissions tradability). This suggests that selection from the menu of options within the climate change regime itself could be a possibility for high carbon intensity developing countries. Of course, any policy choices made by developing countries that depart from fully taxing the carbon externality could lead to international differences in carbon prices and make developing countries vulnerable to trade action.

TABLE 6. Carbon Intensity of Different Sectors (tons per million U.S. dollars of output, 2004)

                        World  High Income  United  EU27 with    Low and Middle                        Sub-Saharan
Sector                  Total    Countries  States       EFTA  Income Countries  China  India  Brazil       Africa
Direct carbon intensity
Agriculture                46           45      57         40                47     82      0      63           15
All energy                758          642     879        485               886   2067   1501     116          727
All manufacturing          46           25      36         18               107    127    111      53           65
  Energy intensive        116           62      90         43               279    326    283     141          159
  Other manufacturing      14            8      12          7                31     35     36       8           17
Other industries           13            7       3          6                30     29     13      18           38
Services                   36           26      32         21                97     82     68      63           67
Total                      92           55      74         40               228    303    250      78          149
Direct plus indirect carbon intensity
Agriculture               168           98     141         74               223    350    301     129           72
All energy                928          729    1016        541              1147   2800   1749     186          823
All manufacturing         187           99     159         62               449    681    518     168          273
  Energy intensive        330          172     272        107               811   1163    888     286          505
  Other manufacturing     122           66     111         42               289    459    354     107          156
Other industries          132           60      69         46               342    561    287      89          240
Services                   92           67      94         46               242    340    231     101          161
Total                     187          109     153         74               479    772    535     149          281
This cost could be minimized if developing countries could persuade their industrial country partners, as part of a comprehensive agreement on climate change, to take either no trade actions or limited actions (described in a companion paper, Mattoo and others 2009).

REFERENCES

The word processed describes informally reproduced works that may not be commonly available through libraries.

Birdsall and Subramanian. 2009. “Energy, not Emissions: Equitable Burden-Sharing on Climate Change.” Washington D.C. Processed.
Cline, William R. 2007. “Global Warming and Agriculture: Impact Estimates by Country.” Center for Global Development and Peterson Institute for International Economics, Washington, D.C.
Elbadawi, Ibrahim. 1999. “External Aid: Help or Hindrance to Export Orientation in Africa?” Journal of African Economies 8(4): 578–616.
Mattoo, Aaditya, Arvind Subramanian, Dominique van der Mensbrugghe, and Jianwu He. 2009. “Climate Change and Trade Actions.” World Bank Working Paper No. 5121. Washington D.C.
Metz, Bert, Ogunlade Davidson, Peter Bosch, Rutu Dave, and Leo Meyer, eds. 2007. Climate Change 2007: Mitigation of Climate Change. Contribution of Working Group III to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge: Cambridge University Press.
Nordhaus, William. 2007. “The Challenge of Global Warming: Economic Models and Environmental Policy in the DICE-2007 Model.” New Haven, CT. Processed.
Prasad, Eswar, Raghuram G. Rajan, and Arvind Subramanian. 2008. “Foreign Capital and Economic Growth.” Brookings Papers on Economic Activity (1): 153–230.
Prati, Alessandro, and Thierry Tressel. 2006. “Aid Volatility and Dutch Disease: Is There a Role for Macroeconomic Policies?” IMF Working Paper 06/145.
Rajan, Raghuram, and Arvind Subramanian. 2011. “Aid, Dutch Disease and Manufacturing Growth.” Journal of Development Economics 94(1): 106–118.
Stern, Nicholas Herbert. 2007. The Economics of Global Climate Change: The Stern Review. Cambridge: Cambridge University Press.
Stern, Nicholas Herbert. 2009. “Transatlantic Perspective on Climate Change and Trade Policy.” Keynote address, Peterson Institute for International Economics, March 4. Washington, D.C.
Strand, Jon. 2009. “Revenue Management Effects Related to Financial Flows Generated by Climate Policy.” Policy Research Paper 5053. World Bank, Policy Research Department: Washington D.C.
United Nations. 2007. World Population Prospects: The 2006 Revision Population Database. United Nations Population Division, New York. http://esa.un.org/unpp/.
United Nations Development Programme. 2007. “Fighting climate change: Human solidarity in a divided world.” Human Development Report, New York.
van der Mensbrugghe, Dominique. 2006. “Linkage Technical Reference Document.” World Bank, Washington, D.C. Processed.
van der Mensbrugghe, Dominique. 2010. “The Environmental Impact and Sustainability Applied General Equilibrium (ENVISAGE) Model, Version 7.1.” World Bank, Washington, D.C. Processed.
Weyant, John P., Francisco C. de la Chesnaye, and Geoff J. Blanford. 2006. “Overview of EMF-21: Multigas Mitigation and Climate Policy.” The Energy Journal. Special Issue 3 (November 2006): 1–32.
World Bank. 2009. World Development Report 2010. “Development and Climate Change.” The World Bank: Washington D.C.

Trade Liberalization and Investment: Firm-level Evidence from Mexico

Ivan T. Kandilov and Aslı Leblebicioğlu

Plant-level panel data from Mexico’s Annual Industrial Survey is employed to evaluate the impact of reductions in tariffs and import license coverage on final goods, as well as intermediates, on firms’ investment decisions. Using data from 1984 to 1990, a period during which a large-scale trade liberalization occurred, a dynamic investment equation is estimated using the system-GMM estimator developed by Arellano and Bover (1995) and Blundell and Bond (1998).
Consistent with theory, the empirical analyses show that a reduction in import protection on final goods leads to lower plant-level investment, whereas reductions in tariffs and import license coverage on intermediate inputs result in higher investment. Also, firms with larger import costs experience a larger increase in investment following a reduction in import protection. On the other hand, higher markup firms lower investment more aggressively following reductions in tariffs and import license coverage on final goods. JEL codes: E22, F13, O16, O24, D92

Economic theory emphasizes the importance of free trade for increasing market efficiency and stimulating investment in new technologies. In the last 30 years, trade liberalization has been an important policy tool for many governments around the globe. A large number of developing countries and emerging economies have abandoned protectionist policies in an attempt to boost economic growth (e.g., Brazil, Chile, Colombia, and Mexico).1 Wacziarg and Welch (2008) report that by 2000, more than 70 percent of the world’s countries were open to trade, as defined by Sachs and Warner (1995). To date, however, only a few studies have investigated the impact of trade liberalization on investment, which is the focus of this article. In particular, the empirical analysis here estimates the effect of the Mexican trade liberalization on firm-level investment in the manufacturing sector.

Ivan T. Kandilov is an associate professor of agricultural and resource economics at North Carolina State University; his e-mail address is ivan_kandilov@ncsu.edu. Aslı Leblebicioğlu is an assistant professor of economics at North Carolina State University; her e-mail address is alebleb@ncsu.edu. The authors thank the journal editor, Elisabeth Sadoulet, two anonymous referees, and the editorial board for useful comments.

1. As Goldberg and Pavcnik (2004) note, the large literature on trade and growth has not reached a consensus on the impact of trade on economic growth (see also Rodriguez and Rodrik 2000).

THE WORLD BANK ECONOMIC REVIEW, VOL. 26, NO. 2, pp. 320–349 doi:10.1093/wber/lhr048 Advance Access Publication November 13, 2011 © The Author 2011. Published by Oxford University Press on behalf of the International Bank for Reconstruction and Development / THE WORLD BANK. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

In 1985, the Mexican government launched a large-scale trade liberalization program as the existing protectionist trade policies were deemed counterproductive after a foreign exchange crisis and meager growth. Prior to the liberalization, the most restrictive component of Mexico’s import policy was not the extensive system of tariffs, but rather the high import license coverage (i.e., high ratio of industry output covered by import licenses). By the end of the liberalization, the incidence of (both input and output) import licenses decreased drastically from about 90 percent in early 1985 to below 20 percent in 1988 (see Figure 1 and Table 1). Over that relatively short period of three years, output tariffs were also aggressively cut from about 40 percent to less than 15 percent, while input tariffs were cut from about 20 percent to about 10 percent.

The empirical analysis here employs plant-level data from the Mexican manufacturing sector to estimate the effects of the wide-sweeping trade reforms that occurred in Mexico in the mid-1980s on firms’ investment decisions. Recent contributions that assess the impact of trade liberalization on productivity (Amiti and Konings 2007; Topalova and Khandelwal 2011) point out that both input and output tariffs have important, individual effects.
Also, the theoretical framework suggests that access to cheaper inputs via lower input tariffs increases firm profitability, and therefore investment, while lower output tariffs bring about more intense import competition, which results in lower profits and investment. Hence, theoretically, a decrease in input tariffs has the opposite effect on firm investment from a decrease in output tariffs. To capture these differences, in our empirical investigation, we separate the impact of input tariffs from that of output tariffs.

Using data on manufacturing plants from Mexico’s Annual Industrial Survey for the seven-year period from 1984 to 1990, a reduced-form dynamic investment equation is estimated employing panel data techniques developed by Arellano and Bover (1995) and Blundell and Bond (1998).2 One advantage of using plant-level panel data is that it allows us to control for unobservable plant effects that influence investment, sales, cash flow, and foreign exposure simultaneously.

Consistent with the theoretical framework, the empirical analysis shows that the decrease in input tariffs, as well as import license coverage, resulted in higher investment in Mexican manufacturing establishments. Also, in line with the theory, the results reveal that the drop in output tariffs and license coverage led to a decrease in plant-level investment. The estimated effects are economically and statistically significant, and they suggest that the impacts of input tariffs and license coverage are larger than the impacts of output tariffs and license coverage, respectively. These results are consistent with recent findings in the trade liberalization and productivity literature discussed earlier, for example, Amiti and Konings (2007), as well as Topalova and Khandelwal (2011).

2. This is the same plant-level panel data from the Mexican manufacturing sector used by Tybout and Westbrook (1995) and Grether (1996), who analyzes the impact of the Mexican trade liberalization on price-cost margins.

This article’s contribution to the literature is twofold. To the best of our knowledge, this is the first study to consider the effects of trade liberalization on establishment-level investment. While a few country-level and industry-level studies such as Ibarra (1995) and Wacziarg and Welch (2008) assess the impact of trade liberalization on investment, the use of plant-level panel data allows the empirical analysis here to control for plant-specific time-invariant unobservables that might affect investment and bias the estimated impacts. Moreover, country-level data and industry-level data can hide substantial heterogeneity in the effects of lower trade barriers for different firms, for example, those with large market power (high markups) versus those with little market power (low markups). Indeed, the results show that firm-specific factors, such as international trade positions and market power, determine the sensitivity of investment to changes in tariffs and import license coverage. Note that Wacziarg and Welch (2008), as well as Ibarra (1995), find that aggregate investment in Mexico was negatively affected in the years after the trade liberalization. The second contribution of this article to the literature is that it improves upon previous work, such as Wacziarg and Welch (2008), by considering the impact of both output and input tariffs (and import license coverage) on investment and by employing cross-industry variation in actual tariffs (both output and input tariffs, as well as license coverage), rather than indicator variables for years of pre- and post-liberalization.
This article is related to the broader literature on the economic impacts of trade liberalization, and especially the work investigating the effects of lower trade barriers on firm-level productivity. Some examples in this large and growing area of research include Tybout et al. (1991), Tybout and Westbrook (1995), Pavcnik (2002), Muendler (2004), Amiti and Konings (2007), Fernandes (2007), as well as Topalova and Khandelwal (2011). All of these studies suggest that lower trade protection has a positive impact on economic efficiency. Amiti and Konings (2007) and Topalova and Khandelwal (2011) find positive effects of both lower input and output tariffs on productivity in Indonesia and India, respectively. Pavcnik (2002), Muendler (2004), and Fernandes (2007) show that tariff liberalization leads to higher firm-level productivity in Chile, Brazil, and Colombia, respectively. The Tybout and Westbrook (1995) findings suggest that average costs fell in most industries following the Mexican trade liberalization. Similarly, Tybout et al. (1991) find evidence that Chilean industries that experienced relatively large reductions in protection also experienced relatively large improvements in average efficiency levels.3

3. There exists a small literature that relates exporting opportunities to investment. For example, using data from Mexico during 1994–2004, Iacovone and Javorcik (2008) show that future exporters increase product quality (unit value) and investment before they start servicing the foreign market. Similarly, Alvarez and Lopez (2005) use data from Chile and present evidence that exporters invest more, perhaps to upgrade product quality, even before they enter the foreign market compared to firms that supply to the domestic market alone.
Although different in nature, the empirical analysis here is also related to that of Bustos (2011), who incorporates a technology choice in a Melitz (2003)-type model of trade and heterogeneous firms. In her model, both trading partners are identical and trade costs (tariffs) are symmetric. Hence, when these costs decline, a firm in the home country faces increased competition from abroad due to the decrease in tariffs imposed by the home country, but it also sees its exports rise following the decrease in tariffs imposed by the foreign country.4 For exporting firms, such trade integration results in an increase in total revenue and can lead to technology upgrading. The empirical example that Bustos (2011) considers is that of the regional trade agreement MERCOSUR and its effect on Argentinean firms. She shows that after the trade agreement took effect and Brazilian tariffs fell, Argentinean firms responded by increasing both exports to Brazil and their spending on technology upgrading.

The rest of the article is organized as follows. Section I describes the details of the Mexican trade liberalization that occurred in the mid-1980s. Section II discusses the theoretical framework which illustrates how lower input and output tariffs affect the firm’s investment choice. Section III presents the empirical specification of the investment equation that is estimated, and it also discusses the econometric issues. Section IV describes the Mexican plant-level data employed in the empirical analysis and presents the summary statistics. Potential endogeneity issues of the trade policies with respect to investment are discussed in section V. The results are presented in section VI. The last section concludes.

I. TRADE LIBERALIZATION IN MEXICO

Through the 1970s, domestic producers in Mexico enjoyed fairly high rates of import protection.
As a result of the oil boom in the late 1970s, Mexico’s economy grew steadily during this period. Mexican real GDP per capita increased rapidly and topped $5,400 (constant 2000 U.S. dollars) in 1981 according to the World Bank’s World Development Indicators (WDI), a per capita income similar to that of low-income industrial countries such as Portugal, where per capita income in 1981 was $6,300 (WDI). After a weakening of the oil market in the early 1980s, the Mexican economy faced a number of problems, including a foreign exchange crisis in 1982. While Mexico’s foreign debt was successfully restructured and the high inflation rate declined by 1984, lack of growth was still considered a major issue. Considering that the protectionist trade policies were counterproductive, in 1985 the Mexican government initiated a large-scale trade liberalization program.

4. Note that theoretically, this is not the relevant comparative static exercise for the case of the Mexican trade liberalization of the mid-1980s that we analyze here. The Mexican trade liberalization of the mid-1980s was a unilateral trade liberalization, so export revenues of Mexican producers would not be affected following the decline in Mexico’s tariffs.

Prior to the liberalization, the Mexican government had imposed a number of different import restrictions. The two most prominent were a system of quantity restrictions in the form of quotas or licensing and an ad valorem import tariff scheme.5 The set of quotas and licenses is considered to have been the most restrictive component of Mexico’s import policy (see Ten Kate, 1992).6 The trade liberalization started in 1985 and proceeded in a number of rounds.7 First, in 1985, a large number of import licenses were removed, which decreased the license coverage from about 90 percent to about 50 percent of domestic production.8 The large majority of goods affected in this first round were intermediates and capital goods.
Also, to compensate for the reduction in licensing requirements, the Mexican government slightly increased tariffs.

In 1986, the trade liberalization continued. In the second round, the focus was on tariffs. The highest tariff rate of 100 percent was eliminated, leaving the highest rate at 50 percent. Further, a four-step across-the-board tariff reduction was initiated to decrease tariffs ranging from 0 to 50 percent down to 0 to 30 percent by the end of 1988.9 Finally, in 1987, the government removed the official tariff surcharge of 5 percent, leaving tariff rates ranging from 0 to 20 percent. This completed the Mexican trade liberalization. For the rest of our sample years until 1990, no other major changes occurred.

Figure 1 shows the evolution of the average (production-weighted) output and input tariffs, as well as the average (production-weighted) output and input import license coverage in the Mexican manufacturing sector from 1984, which is the first year in our sample, to 1990.10 The figure shows that both tariffs and the license coverage were quite high at the beginning of the sample in 1984 and declined by 50 to 80 percent by the end of the sample in 1990.

5. Official minimum prices for customs valuation of imports were also in place (see Ten Kate, 1992). When they were set in the domestic currency, their impact was sometimes greatly reduced because of large devaluations. On the other hand, when they were set much higher than transaction prices, they raised the effective level of the tariffs well above their nominal levels.

6. Established after WWII, its incidence had grown over the years, and in 1982, all imports were subject to licensing. Before the trade reform that was initiated in the second half of 1985, changes in import licensing were primarily driven by changes in the position of Mexico’s balance of payments. For example, many of the license requirements initiated in 1982 were removed in 1984 as a result of improvements in the balance of payments. However, license requirements were removed only for goods that had no domestic competition.

7. This was a unilateral trade liberalization that predated NAFTA, which started in 1994.

8. The remaining licenses covered about 40 percent of imports in 1984.

9. Mexico joined GATT in August of 1986 and agreed to eliminate all official minimum prices by the end of 1987, an agreement which was executed as scheduled. Other commitments upon the accession to GATT were already realized or even surpassed with the unilateral liberalization program at that time. The accession mostly served to bolster Mexico’s credibility to fully implement the trade reforms.

10. See the Data section for details on the construction of the industry input tariffs and license coverage.

FIGURE 1. Average Tariffs and Licenses
Source: Unpublished data provided by Adrian Ten Kate, SECOFI.

Table 1 provides details of the tariff and license requirements changes across eight aggregate manufacturing industries.11 For the majority of industries, output tariffs start off between 30 and 40 percent and decline to 10 to 20 percent by the end of the liberalization. Input tariffs start off somewhat lower, between 20 and 30 percent, and drop down to just over 10 percent by 1989. Both output and input license coverage are between 80 and 100 percent in the beginning of the sample and both precipitously drop to near 0 percent by 1989 for most of the manufacturing industries. The figures presented in Table 1 convincingly show that Mexico experienced a large-scale, across-the-board trade liberalization between 1985 and 1988.

II. INVESTMENT AND TARIFFS

In order to motivate the empirical specification, and to illustrate how input and output tariffs affect investment decisions differently, this section outlines a simple model of investment.
Since the model considered yields a standard investment Euler equation augmented with tariffs, the discussion here focuses on the intuition behind the different effects of input and output tariffs on investment. The full set-up of the model and the derivation of the investment equation can be found in the Theoretical Appendix, which is available from the authors upon request. To keep the discussion tractable, we only consider the effects of lower tariffs. Theoretically, the impact of input import license coverage is analogous to that of input tariffs and the effect of output import license coverage is analogous to that of output tariffs, which are discussed here.

11. The plant-level data are classified into 129 4-digit Mexican Census industries, which can be aggregated into 8 main industries roughly corresponding to 2-digit ISIC industries.

TABLE 1. Tariff and License Coverage Rates

                                                 Output Tariff       Output License         Input Tariff        Input License
Industry                                    1985   1987   1989   1985   1987   1989   1985   1987   1989   1985   1987   1989
Food, Beverages, Tobacco                      41     30     16    100     45     22     19     19     11     55     59     39
Textile, Apparel and Leather Products         40     37     17     93     37      1     29     30     13     84     11      3
Wood and Paper Products                       36     33     11     97      9      3     26     25     10     90     16      5
Chemicals and Plastics                        30     29     14     87     10      0     21     20     11     81     16     10
Non-metallic Products                         32     28     13     95      0      0     22     21     12     73     12      8
Basic Metals                                   8     15     10     95      0      0     15     16     11     75      2      2
Metal Products, Machinery and Equipment       45     28     16     93     55     45     22     26     13     91     13      4
Other Manufacturing                           37     36     18    100      0      0     27     25     12     88     18      9
Total Manufacturing                        37.09  28.86  14.79  93.78  30.78  18.92  21.64  22.18  11.78  77.50  23.47  13.46

Note: All figures are percentages. The output and input tariffs are exclusive of the 5 percent official tariff surcharges.
Source: Unpublished data provided by Adrian Ten Kate, SECOFI.
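As a rough arithmetic check on the scale of the liberalization, the Total Manufacturing row of Table 1 can be converted into proportional declines between 1985 and 1989. The sketch below is illustrative only: the function names are hypothetical, and the figures are copied from Table 1 rather than recomputed from the underlying SECOFI data.

```python
# Total Manufacturing row of Table 1 (percent) for 1985, 1987 and 1989.
TOTAL_MANUFACTURING = {
    "output tariff": (37.09, 28.86, 14.79),
    "output license": (93.78, 30.78, 18.92),
    "input tariff": (21.64, 22.18, 11.78),
    "input license": (77.50, 23.47, 13.46),
}


def proportional_decline(start, end):
    """Return the fall from start to end as a percentage of start."""
    return 100.0 * (start - end) / start


def declines_1985_to_1989(rows):
    """Map each protection measure to its 1985-1989 decline, rounded to 0.1."""
    return {
        name: round(proportional_decline(values[0], values[2]), 1)
        for name, values in rows.items()
    }


if __name__ == "__main__":
    for name, pct in declines_1985_to_1989(TOTAL_MANUFACTURING).items():
        print(f"{name}: {pct} percent lower in 1989 than in 1985")
```

On these figures, the output tariff falls by about 60 percent, output license coverage by about 80 percent, the input tariff by about 46 percent, and input license coverage by about 83 percent.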
Consider the investment problem of a monopolistically competitive firm that imports some of its variable inputs of production and sells its output in the domestic market, where it faces foreign competition. The optimal investment decision implies that the firm will choose to invest up to the level where the marginal cost of investing in a new unit of capital is equal to the present discounted value of the marginal return to capital. The higher the marginal profitability of capital, the more incentives the firm will have to undertake investment. The marginal profitability of capital in turn depends on expected sales as well as expected costs of domestic and foreign variable inputs.

Trade liberalization can affect investment through marginal profitability of capital by altering expected sales and costs of imported inputs. While reductions in input tariffs can increase marginal profitability of capital through changes in the prices of imported inputs, reductions in output tariffs can decrease marginal profitability of capital through changes in foreign competitors’ prices and hence through changes in firm sales. A reduction in input tariffs lowers the cost of using imported inputs, and thereby raises the marginal profitability of capital and investment. Hence, we expect trade liberalization to increase investment through lower input tariffs. Moreover, we expect the increase in investment to be stronger for firms with a higher volume of imported inputs, since lower input tariffs would lead to a larger increase in the marginal profitability of their capital.

Changes in output tariffs affect the marginal profitability of capital through changes in the foreign competition the firm faces. Assuming that the firm sells its product in the imperfectly competitive domestic market, the demand for its product will be affected by changes in domestic and foreign competitors’ prices, as well as aggregate demand conditions.
A reduction in output tariffs lowers the effective price individuals pay on foreign varieties, and thereby reduces the demand for the firm’s product. As a result, a reduction in an output tariff can lower the marginal profitability of capital and investment. Hence, we expect trade liberalization to decrease investment through lower output tariffs that lead to intensified foreign competition. An important factor that determines the sensitivity of investment to changes in output tariffs is the firm’s markup, which is closely linked to the degree of competition the firm faces, as well as the industry structure. A firm with more monopoly power, hence a higher markup, may be affected more adversely by a reduction in output tariffs and the subsequent increase in import competition. On the other hand, the reduction in output tariffs may not affect a low markup firm as much, since it has already been exposed to ample competition.

In the next section, we describe how the empirical investment equation we adopt incorporates the relationship between investment and input and output tariffs, as well as the sensitivity of this relationship to the firm’s markup and the level of imported inputs.

III. EMPIRICAL INVESTMENT EQUATION AND ESTIMATION

The theoretical framework in section II motivates the relationship between investment and tariffs, illustrates how input and output tariffs can affect investment differently, and also suggests other firm-specific determinants of investment (such as sales and costs). One can further generalize this framework to accommodate a richer set of investment costs and additional constraints imposed on the firm. Each new assumption would give rise to a different structural relationship.
Because the main goal of this study is to estimate the impact of trade liberalization on investment, instead of focusing on the structural process, we estimate a standard reduced form investment equation.12 In their review of the empirical literature that uses firm- or plant-level data to estimate an investment equation, Bond and Van Reenen (2008) note that this type of reduced form model can be interpreted as representing an empirical approximation to the underlying investment process.

We start by estimating the following baseline specification, which focuses on the main effect of tariffs and import licenses on investment:

$$
\frac{I_{ijt}}{K_{ijt-1}} = \alpha_1 \frac{I_{ijt-1}}{K_{ijt-2}} + \alpha_2 \frac{S_{ijt}}{K_{ijt-1}} + \alpha_3 \frac{C_{ijt}}{K_{ijt-1}} + \alpha_4 \tau^{OT}_{jt} + \alpha_5 \tau^{OL}_{jt} + \alpha_6 \tau^{IT}_{jt} + \alpha_7 \tau^{IL}_{jt} + \upsilon_i + \eta_t + \varepsilon_{ijt}, \tag{1}
$$

where $I_{ijt}/K_{ijt-1}$ is the investment rate for plant $i$ in industry $j$ in year $t$, and $S_{ijt}/K_{ijt-1}$ and $C_{ijt}/K_{ijt-1}$ are the plant’s total sales and cash flow, respectively, normalized by its capital stock.13 The terms $\tau^{OT}_{jt}$ and $\tau^{OL}_{jt}$ denote the output tariff and license measure for industry $j$ in year $t$, respectively.14 Similarly, $\tau^{IT}_{jt}$ and $\tau^{IL}_{jt}$ denote the input tariff and license measure, respectively.

12. In the Theoretical Appendix (available from the authors upon request), we provide a model of investment with quadratic adjustment costs that can be used as the basis of the reduced form empirical investment equation.

13. The normalization by capital stock naturally arises in a model with quadratic adjustment costs, and it allows us to control for the size of the firm.

14. We have information on tariffs and import license coverage for the months of June and December of each year. In the baseline specification, we use the data on trade protection in June of year t as the relevant measure affecting investment in year t. Later, we check the robustness of our results to using an alternative timing of the protection measures, where we employ equally-weighted data on trade protection in December of year t-1, June of year t, and December of year t to explain investment in year t.

TABLE 2. Summary Statistics

Panel A: Descriptive Statistics
Variable                                       Mean   St. Dev.        Min        Max
Investment Rate ($I_{ijt}/K_{ijt-1}$)          0.15       0.38      -1.88      10.85
Total Sales ($S_{ijt}/K_{ijt-1}$)              9.12      25.81       0.00   1,031.08
Cash Flow ($C_{ijt}/K_{ijt-1}$)              -13.61     823.84 -80,041.88     267.85
Average Markup                                 2.25       5.68     -36.89     154.26
Imports ($IM_{ijt}/K_{ijt-1}$)                 1.06      21.18       0.00   1,892.37
Exports ($EX_{ijt}/K_{ijt-1}$)                 0.51       7.46       0.00     385.31
Foreign Share                                  0.19       0.36       0.00       1.00
Output Tariff ($\tau^{OT}_{jt}$)              21.68      10.64       0.20      45.00
Output License ($\tau^{OL}_{jt}$)             12.32      27.93       0.00     100.00
Input Tariff ($\tau^{IT}_{jt}$)               15.90       8.16       0.70      41.00
Input License ($\tau^{IL}_{jt}$)              16.16      23.75       0.20      99.50

Panel B: Correlations of the Trade Liberalization Measures
                   Output Tariff  Output License    Input Tariff   Input License
Output Tariff              1.000
Output License             0.362           1.000
Input Tariff               0.770           0.456           1.000
Input License              0.090           0.275          -0.135           1.000

Note: The total number of observations is N = 11,834. The output and input tariffs are exclusive of the 5 percent official tariff surcharges.
Source: INEGI and SECOFI; see text for details.

First, note that we include industry-specific input and output tariffs and import license coverage as measures of protection in the baseline specification (1) simultaneously. It is important to include all four of these measures together in the model because they are positively correlated (see Panel B of Table 2). If we exclude one or more from the specification, for example, if we only include the tariffs or the license coverage ratios, omitted variable bias will likely be an issue.15 Note that the positive correlations between the tariffs and the license coverage ratios are not high enough to raise multicollinearity concerns. Second, in order to control for marginal profitability of capital, we include the sales-to-capital ratio.
Third, following Fazzari and others (1988), we include cash flow as a proxy for �nancing constraints, which arise due to capital market imperfections. Cash flow can be an important determinant of investment for Mexican �rms, since it might be dif�cult for �rms to smooth in- vestment behavior via external capital markets.16 Empirically, cash flow is con- structed as the difference between sales and total costs, adjusted for taxes and depreciation.17 Because costs and cash flow are highly correlated, we include only cash flow in the speci�cation in order to minimize collinearity problems.18 15. For instance, the sample correlation between the input tariff and the output license ratio is 0.456. Given that we expect to �nd a positive coef�cient on the output license (and tariff ) and a negative coef�cient on the input tariff (and license), if we do not include both measures in the same speci�cation but only one of them, its estimated coef�cient will likely be attenuated and may even have the wrong sign because of the omitted variable with which it is positively correlated. 16. Examples of previous work that have shown the importance of �nancing constraints for investment in developing countries include Jaramillo and others (1996), Harrison and McMillan (2004), and Love (2003). 17. Total costs include domestic and imported material costs, as well as labor costs, costs of industrial and non-industrial services. 18. The results including costs in addition to sales and cash flow are similar to those reported in the following sections, and they are available upon request. 330 THE WORLD BANK ECONOMIC REVIEW We introduce the cost of imported inputs in a subsequent speci�cation that augments equation (1). Fourth, we include the lagged investment rate to control for the autocorrelation that may arise due to adjustment costs. 
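Footnote 14 describes two timing conventions for the protection measures: the June reading of year t (baseline) and an equally weighted average of December of year t-1, June of year t, and December of year t (robustness check). The following is a minimal sketch of that arithmetic only; the dictionary layout, function names, and tariff values are hypothetical and are not taken from the authors' data or code.

```python
def june_measure(protection, year):
    """Baseline timing: the June observation of the given year."""
    return protection[(year, "Jun")]


def averaged_measure(protection, year):
    """Alternative timing: equal weights on December of t-1, June of t,
    and December of t."""
    readings = [
        protection[(year - 1, "Dec")],
        protection[(year, "Jun")],
        protection[(year, "Dec")],
    ]
    return sum(readings) / len(readings)


# Hypothetical output tariff readings (in percent) for one industry.
output_tariff = {
    (1984, "Dec"): 30.0,
    (1985, "Jun"): 25.0,
    (1985, "Dec"): 14.0,
}

baseline_1985 = june_measure(output_tariff, 1985)
alternative_1985 = averaged_measure(output_tariff, 1985)
print(baseline_1985, alternative_1985)
```

When protection falls quickly within a year, as in this made-up series, the two conventions can assign noticeably different protection levels to the same investment year, which is why the alternative timing is a useful check.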
The specification in equation (1) also includes plant-specific fixed effects, $\upsilon_i$, that capture time-invariant, plant-specific determinants of investment, as well as year effects, $\eta_t$, that capture aggregate, economy-wide fluctuations. Macroeconomic factors common to all firms, such as changes in interest rates and exchange rates, will be absorbed in these time effects. However, firms in different industries might face different economic conditions due to, for example, different working capital and borrowing needs. In order to allow for heterogeneous effects of the economy-wide fluctuations, in some specifications, we additionally include interaction terms between the year effects and a full set of eight aggregate industry dummies. Moreover, in some specifications, we include interaction terms between the year effects and a full set of six region dummies in order to control for temporal shocks that affect the various regions differently.19 Finally, we assume that the error term, $\varepsilon_{ijt}$, is i.i.d. with $E(\varepsilon_{ijt}) = 0$.

Based on the implications of our theoretical framework, and following the empirical example of Amiti and Konings (2007), we augment the baseline specification (1) in two important ways. First, we recognize that reductions in import protection can increase investment by making imported inputs more readily available. In order to capture this channel, we include the cost of imported inputs (normalized by the size of capital stock, $K_{ijt-1}$), as well as its interaction with the input tariff and the input import license measure. We expect a firm with larger imports to benefit more from the reductions in input tariffs and import licenses. Secondly, to check how the impact of trade liberalization on investment depends on the firm's markup, we include an interaction term between the markup and all four protection measures.
As discussed in section II, a reduction in the output tariff (and, similarly, in the output license coverage) can reduce investment more in high markup firms, as they begin to face stiffer competition and a decrease in marginal profitability. Hence, we expect the interaction terms with the markup to reinforce the negative effects of reductions in the output tariff and the license coverage.

19. We group the Mexican states into six regions. Region 1 (Northwest) includes the states of Baja California Northern and Baja California Southern; region 2 (Northeast) includes the states of Coahuila, Chihuahua, Durango, Nuevo Leon, Sinaloa, Sonora, and Tamaulipas; region 3 (Central) includes the states of Aguascalientes, Guanajuato, Queretaro, San Luis Potosi, and Zacatecas; region 4 (Southwest) includes the states of Colima, Chiapas, Guerrero, Jalisco, Michoacan, Nayarit, and Oaxaca; region 5 (Southeast) includes Distrito Federal and the states of Hidalgo, Mexico, Morelos, Puebla, Tlaxcala, and Veracruz; finally, region 6 (East) includes Campeche, Quintana Roo, Tabasco, and Yucatan.

We estimate the dynamic investment equation (1) and the augmented specifications using the system-GMM estimator of Arellano and Bover (1995) and Blundell and Bond (1998). This estimator for panel data sets with a short time dimension addresses the potential biases that arise from the correlation between the plant fixed effects, $\upsilon_i$, and the lagged dependent variable, $I_{ijt-1}/K_{ijt-2}$, as well as the endogeneity of sales, $S_{ijt}/K_{ijt-1}$, and cash flow, $C_{ijt}/K_{ijt-1}$.
The system-GMM estimator combines the first-difference equations, whose regressors are instrumented by their lagged levels, with equations in levels, whose regressors are instrumented by their first-differences.20 All of the plant-specific variables are treated as endogenous, and lagged values dated t-2 and t-3 are used as the GMM-type instruments.21 To this instrument set, we add lagged advertisement costs as an outside instrument in order to help identification.22 The full set of instruments can be found at the end of each table. We employ and report the Sargan-Hansen tests of overidentification to test for the validity of our instruments.23

IV. DATA

To identify the impact of trade liberalization on investment, we use Mexico's Annual Industrial Survey, which includes annual plant-level data from 1984 to 1990. The seven-year time span includes the period of the broad trade liberalization that took place starting from the second half of 1985 until 1988. As already mentioned, this balanced panel was originally used by Tybout and Westbrook (1995) to assess the impact of trade liberalization on productivity. The data represent all industries in the Mexican manufacturing sector and were collected by Mexico's National Institute of Statistics, Geography, and Information (INEGI). They were originally provided by Mexico's Secretariat of Commerce and Industrial Development (Secretaría de Comercio y Fomento Industrial, SECOFI), which is currently the Secretariat of Economy (Secretaría de Economía, SE). On average, the sample plants represent about 80 percent of output in each industry; the smallest plants are excluded from the survey.24 For each establishment, the survey collects data on sales, employment, inputs, investment, wages, exports of output, imports of intermediate goods, inventories, and a small number of other plant characteristics. Information on industry affiliation is also available, and a unique plant identifier is assigned to each establishment, which makes it possible to track plants over time.25

20. The system-GMM estimator builds on the difference-GMM estimator of Arellano and Bond (1991), which uses only the differenced equations, instrumented by the lagged levels of the regressors. If the regressors are persistent, then their lagged levels are shown to be weak instruments. See Arellano and Bover (1995) and Blundell and Bond (1998) for more details. To avoid this drawback of the difference-GMM estimator, we opted for the system-GMM estimator.
21. In some of the specifications, lagged values dated t-2 were shown to be invalid instruments using the Sargan-Hansen tests of overidentification. In those cases, only the lagged values dated t-3 are used as instruments. The results look similar if we also include lagged values dated t-4 and t-5 in the instrument set.
22. We note that excluding the advertisement costs from the set of instruments does not change the results. We have verified the suitability of advertisement costs as an exogenous variable with the difference-in-Hansen test. In all of our specifications, we failed to reject the null hypothesis of exogeneity of advertisement costs.
23. All the estimations and tests were done using the xtabond2 command in Stata 9.2.
24. Note that the maquiladora plants, which are considered producers of a service (processing of intermediates), are excluded from the analysis because they do not report values for gross output or intermediate inputs.

To construct plant-level capital stocks, we follow Tybout and Westbrook (1995). Each plant's capital stock is computed by adding the replacement cost (at the end of the year) of five different types of capital: machinery and equipment, buildings (construction and installation), land, transportation equipment, and other assets.
Deflators (capital formation price indices) for the different types of capital at the two-digit industry level were provided by SECOFI.26 Cash flow is calculated as the after-tax operating profits plus depreciation.

All plants are classified into 129 four-digit industries of the Mexican Census Classification of manufacturing industries (Clase Censal 1975), a classification that roughly corresponds to the four-digit International Standard Industrial Classification (ISIC). The data on Mexico's commercial policy, including four-digit industry output and input tariffs as well as license coverage ratios, were originally constructed and provided by Adrian Ten Kate of SECOFI (see Ten Kate and de Mateo Venturini 1989; Tybout and Westbrook 1995) and are based on unpublished data from SECOFI. The tariff rates and license coverage ratios were aggregated according to a classification scheme compatible with that of Mexico's plant-level Annual Industrial Survey. In particular, each industry output tariff was computed by aggregating the relevant product-level tariffs, i.e., tariffs for products manufactured by that industry, using domestic production weights. The input tariff for each industry was constructed as a weighted average of the output tariffs for all inputs that the industry used, that is, $\text{Input Tariff}_{jt} = \sum_s \theta_{js}\, \text{Output Tariff}_{st}$, where $\theta_{js}$ is the share of input $s$ in the value of output in industry $j$. The output license coverage ratio represents the share of goods subject to import licensing as a percentage of the value of the industry's production. The input license coverage ratio is computed similarly to the input tariffs.

As discussed in the previous section, a firm's market power is an important factor that determines the sensitivity of investment to changes in input and output import protection. On one hand, firms with high market power can be less sensitive to reductions in input tariffs and import licenses, as they can better absorb cost fluctuations in their markup.
On the other hand, they can be more sensitive to reductions in output tariffs and licenses as they feel the pressures of competition brought about by increased imports. In order to check these predictions empirically, we construct plant-level markups, which proxy for an establishment's market power, using the information provided in the panel. Following Campa and Goldberg (1999), the average markup, $\psi_i$, for plant $i$ (average over the sample period from 1984 to 1990) is defined as27

\[
\psi_i = \frac{\text{value of sales}_i + \Delta\text{inventories}_i}{\text{payroll}_i + \text{cost of materials}_i}. \tag{2}
\]

Similarly, to test how a firm's foreign exposure affects investment, we use each establishment's exports of output (normalized by capital) and imports of intermediate goods (also normalized by capital) to expand the baseline specification (1). Table 2 presents the summary statistics for our dependent variable, the investment rate, and all of the right-hand side variables, including the output and input tariffs as well as import license coverage ratios.

25. Price indices at the industry level for output and intermediate inputs were provided by SECOFI.
26. Similarly, total investment is also the sum of investment in each of the five different types of capital.

V. ENDOGENEITY OF TRADE POLICY

It has long been recognized that trade policy may be endogenously determined by policy-makers; for example, governments may choose to offer more import protection for industries with low productivity levels or low investment rates in order to help firms grow their capital stock or protect jobs (see, for example, Hillman 1982). The political economy of trade literature has recognized that both domestic and foreign organized groups of firms or workers can influence local governments when import protection decisions are made (see, for example, Grossman and Helpman 1994; Grether and others 2001).
For example, the fact that import license requirements in Mexico were eliminated starting in 1984 only for goods with no domestic competition may suggest that political economy forces were important in driving the trade liberalization.

Therefore, there are two potential issues which may affect the reliability of the estimates of the effect of import protection measures on firm-level investment. The first is the possibility that policy-makers in Mexico chose import protection measures in response to industry-level investment rates. The second concern is that some of the factors that affect both import tariffs (as well as import license coverage) and investment rates, such as foreign direct investment (FDI), are omitted from the baseline specification, which can bias the estimates.

Before the results from the baseline specification are presented, we show that the first issue is not relevant in the Mexican context; that is, we demonstrate that the Mexican government did not adjust any of the four trade protection measures (input and output tariffs, as well as input and output import licenses) in response to industries' investment rates. Then, in the robustness checks, it is also shown that additionally controlling for industry FDI, one of the potential determinants of import protection policy that can also affect a firm's productivity and therefore investment, leaves the estimates essentially unchanged.

27. This markup measure is a positive transformation of the markup measure of Domowitz and others (1986), $mk$, where the measure defined in (2) is equal to $1/(1 - mk)$. We use the average markup rather than a contemporaneous measure in order to avoid endogeneity issues in the estimation.
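Two of the constructed variables in the Data section are simple ratios and weighted sums: the industry input tariff (a share-weighted sum of output tariffs) and the plant markup in equation (2). The sketch below illustrates both with made-up numbers. The function names and all values are hypothetical. The price-cost margin formula used for the Domowitz and others (1986) measure is an assumption on our part, chosen because it reproduces the $1/(1 - mk)$ relationship stated in footnote 27.

```python
def input_tariff(input_shares, output_tariffs):
    """Share-weighted sum of the output tariffs of the inputs an industry uses.

    input_shares maps each input industry s to its share in the value of
    output of the using industry j; output_tariffs maps s to its output tariff.
    """
    return sum(share * output_tariffs[s] for s, share in input_shares.items())


def average_markup(sales, d_inventories, payroll, materials):
    """Markup as in equation (2): (sales + change in inventories) / variable costs."""
    return (sales + d_inventories) / (payroll + materials)


def price_cost_margin(sales, d_inventories, payroll, materials):
    """Assumed form of the price-cost margin mk (not spelled out in the text)."""
    revenue = sales + d_inventories
    return (revenue - payroll - materials) / revenue


# Hypothetical industry using two inputs.
shares = {"steel": 0.3, "plastics": 0.2}
tariffs = {"steel": 20.0, "plastics": 10.0}
industry_input_tariff = input_tariff(shares, tariffs)

# Hypothetical plant accounts.
psi = average_markup(120.0, 5.0, 40.0, 60.0)
mk = price_cost_margin(120.0, 5.0, 40.0, 60.0)
print(industry_input_tariff, psi, 1.0 / (1.0 - mk))
```

Under the assumed margin definition, `1 / (1 - mk)` and `psi` coincide, so ranking plants by either measure gives the same ordering.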
If the Mexican government did adjust any of the four trade protection measures in response to industries' investment rates, one would expect either the cumulative changes in trade protection during the liberalization period (1985–1990) to depend on the initial industry investment rates, or current period industry investment rates to predict future period trade protection measures. To examine this, first industry investment rates are constructed as the sales-weighted average of firms' investment rates.28 Then, two regression models are estimated. First, the changes in import protection over the period 1985–1990 are regressed on the initial level of the industry investment rate. Second, employing industry panel data from 1985 to 1990, industry output and input tariffs, as well as industry output and input import license coverage ratios, in period t + 1 are regressed on the industry investment rate in period t. In this latter specification, industry and year fixed effects are included, and the regression is weighted by the number of firms in each industry-year cell.29

The results, which are presented in Table 3, show that none of the four import protection measures depends on the industry investment rates. Panel A of Table 3 shows the results from the four cross-sectional regressions of the 5-year (1985–1990) changes in import protection measures on initial (as of year 1985) industry investment. None of the four estimated coefficients is statistically significant, and their signs are mixed. Similarly, the industry-level panel regressions in Panel B demonstrate that future input and output import protection measures do not depend on current investment rates. Again, none of the estimated coefficients is statistically significant, with one positive and three negative estimates.

TABLE 3. Trade Policy Endogeneity: Current Trade Policy and Past Investment

                            (1)               (2)                (3)               (4)
Panel A: Cross-section Estimates
Dependent Variable          Output Tariff     Output License     Input Tariff      Input License
Investment Rate             25.51 (19.91)     73.05 (71.79)      -8.17 (9.59)      -57.83 (35.32)
Number of Observations      126               126                126               126
R2                          0.0011            0.0086             0.0009            0.0015
Panel B: Panel Estimates
Dependent Variable          Output Tariff     Output License     Input Tariff      Input License
Investment Rate             -10.41 (8.01)     -35.92 (40.21)     0.43 (4.57)       -21.68 (17.41)
Number of Observations      634               634                634               634
R2                          0.8934            0.6651             0.9273            0.9403

Note: Panel A presents the cross-section regressions of changes in the corresponding trade policy tool over the sample period (1984-1990) on initial (1984) investment. The regressions are weighted by the number of firms in each four-digit industry. Robust standard errors are reported. Panel B presents the panel regressions of current trade policy tool on lagged investment rate. Estimations include firm fixed effects, year effects and aggregate industry × year interaction dummies, and are weighted by the number of firms in each four-digit industry in each particular year. Standard errors are robust and they are clustered at the four-digit industry level.
Source: INEGI and SECOFI; see text for details.

28. Alternatively, we also constructed the industry investment rate as the ratio of aggregate industry investment over aggregate industry capital. The results with this alternative measure are very similar to the estimates reported using the sales-weighted measure in the text.
29. We also include aggregate (corresponding to 2-digit ISIC) industry dummies interacted with year dummies to capture aggregate, industry-specific, time-varying shocks (such as changing financing conditions).

VI. RESULTS

The results from the baseline specification (1), which estimates the impact of input and output tariffs and import licenses on plant-level investment in the Mexican manufacturing sector, are reported in Table 4. This first set of results evaluates the average impact of the trade liberalization on investment in manufacturing establishments, and shows how changes in input and output protection measures affect investment differently, just as the theoretical framework in section II suggests. The estimates are shown to be robust to the inclusion of both industry-specific and region-specific controls. This section also reports the results from a number of robustness checks and presents evidence of the heterogeneity in the impact of the trade liberalization across firms of different size. It further shows how trade liberalization is especially beneficial for investment in firms that import intermediate inputs. This section also documents the importance of the firm's market power, as proxied by the size of the firm's markup, in mediating the effects of trade liberalization on investment. Finally, the overall impact of Mexico's trade liberalization on investment is quantified at the end of section VI.

TABLE 4. Baseline Results

Dependent variable: $I_{ijt}/K_{ijt-1}$
                                                          (1)                (2)                (3)                (4)
Lagged investment rate ($I_{ijt-1}/K_{ijt-2}$)            0.275** (0.104)    0.263** (0.105)    0.284** (0.104)    0.280** (0.103)
Sales/100 ($S_{ijt}/(100 \times K_{ijt-1})$)              0.087 (0.076)      0.073 (0.073)      0.078 (0.074)      0.071 (0.072)
Cash flow/100 ($C_{ijt}/(100 \times K_{ijt-1})$)          0.001 (0.004)      0.0001 (0.001)     0.001 (0.004)      0.0001 (0.004)
Output tariff/100 ($\tau^{OT}_{jt}/100$)                  0.049 (0.065)      0.044 (0.063)      0.053 (0.065)      0.045 (0.062)
Output license/100 ($\tau^{OL}_{jt}/100$)                 0.029** (0.015)    0.037** (0.016)    0.031** (0.015)    0.039** (0.016)
Input tariff/100 ($\tau^{IT}_{jt}/100$)                   -0.201** (0.079)   -0.196** (0.086)   -0.233** (0.078)   -0.227** (0.086)
Input license/100 ($\tau^{IL}_{jt}/100$)                  -0.063** (0.014)   -0.036** (0.017)   -0.063** (0.014)   -0.038** (0.017)
Industry × Year Effects                                   No                 Yes                No                 Yes
Region × Year Effects                                     No                 No                 Yes                Yes
Number of observations                                    11,834             11,834             11,706             11,706
Hansen test (p-value)                                     0.318              0.472              0.378              0.629
1st order serial correlation (p-value)                    0.000              0.000              0.000              0.000
2nd order serial correlation (p-value)                    0.393              0.445              0.378              0.406

Note: Two-step coefficients and robust standard errors with the Windmeijer (2005) small sample correction are reported. ** and * denote significance at 5% and 10%, respectively. A set of year dummies is included in all specifications. The instruments for the first-differenced equations are: $S_{ijt-2}/K_{ijt-3}$, $S_{ijt-2}/K_{ijt-3}$, $S_{ijt-3}/K_{ijt-4}$, $C_{ijt-3}/K_{ijt-4}$, $\text{AdvertisementCosts}_{ijt-1}/K_{ijt-2}$. The instruments for the equations in levels are: $\Delta I_{ijt-2}/K_{ijt-3}$, $\Delta C_{ijt-2}/K_{ijt-3}$, $\Delta C_{ijt-2}/K_{ijt-3}$. The p-values for the Hansen test of overidentifying restrictions (where the null hypothesis is that the instruments are valid, i.e., uncorrelated with the error term) are reported. The Arellano-Bond (1991) serial correlation tests are applied to the first-differenced residuals.
Source: INEGI and SECOFI; see text for details.

Main Effects of Trade Liberalization on Investment

Column (1) of Table 4 presents the results from our baseline specification (1), which includes both firm and year fixed effects. Confirming the theoretical predictions, the estimates show that the decrease in output and input protection measures during the Mexican trade liberalization affected plant-level investment differently.
The estimated coefficients on the output tariffs and import licenses are respectively 0.049 (with a standard error of 0.065) and 0.029 (0.015), with the latter being significant at the conventional 5 percent level.30 This is consistent with the theoretical framework discussed in section II: reductions in output tariffs and import licenses lower the marginal profitability of capital due to intensified foreign competition, which in turn leads to lower investment. The estimated coefficient on the output license coverage indicates that a 10 percentage point reduction in the output license coverage leads to a 0.29 percentage point decrease in investment, or 1.93 percent at the mean investment rate of 0.15.

The estimated coefficients of the input tariffs and the import license coverage, on the other hand, are both negative and statistically significant at the 5 percent level. The estimated coefficient of -0.201 (0.079) on the input tariffs implies that a 10 percentage point reduction in input tariffs raises investment by 2.01 percentage points. Similarly, the estimated coefficient of -0.063 (0.014) on the input import license coverage implies a 0.63 percentage point increase in investment given a 10 percentage point decline in input license coverage.31 These results are consistent with the discussion in section II, which describes how a decrease in the protection measures imposed on inputs stimulates investment by lowering the cost of imported (intermediate) inputs, and thereby increases the marginal profitability of capital. These findings are also consistent with Amiti and Konings (2007) and Khandelwal and Topalova (2011), who, in a similar vein, show that reductions in input tariffs increase firm-level productivity in the Indonesian and Indian manufacturing sectors, respectively.

In order to check if the main results are robust to aggregate industry-specific, time-varying shocks (such as changing financing conditions), as well as region-specific, time-varying shocks (capturing, for example, dynamic productivity differences across regions in Mexico), aggregate industry-specific year dummies, as well as region-specific year dummies, are further included.32 These specifications are presented in columns (2)-(4) of Table 4. The estimates of the impact of the four import protection measures on plant-level investment remain largely unchanged. In the most general specification with both aggregate industry- and region-specific year dummies (column (4)), the direction of the impact and the significance of the estimates of the four measures of protection are the same as in the baseline case in column (1). While the impact of the output import license coverage is estimated to be about 30 percent larger in column (4) than in column (1) (0.039 vs. 0.029), the impact of the input import license coverage is about 40 percent smaller (-0.038 vs. -0.063).33

In all four columns, lagged investment is positive and statistically significant, as expected. The other firm-specific determinants, sales and cash flow, are positive as expected; however, they are small in magnitude and not precisely estimated.

30. We compute robust standard errors using the Windmeijer (2005) small sample correction.
31. Note that when the sizes of the impacts of the output and input protection measures are compared, the investment-stimulating effect of lower input tariffs and license coverage (at 2.01 and 0.63 percentage points given equal reductions of 10 percentage points, respectively) dominates the adverse effects of lower output license coverage (at 0.29 percentage points given a reduction of 10 percentage points) and output tariffs (insignificant effect).
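The magnitudes quoted above follow from simple arithmetic on the column (1) coefficients of Table 4, where the protection measures enter divided by 100 and the investment rate is a fraction of the capital stock. The sketch below reproduces that arithmetic; the function names are ours, and the only inputs are the coefficients and the mean investment rate reported in Tables 2 and 4.

```python
MEAN_INVESTMENT_RATE = 0.15  # Table 2, Panel A


def change_in_pp(coef, reduction_pp=10.0):
    """Change in the investment rate, in percentage points, implied by cutting
    a protection measure by `reduction_pp` percentage points."""
    delta_regressor = -reduction_pp / 100.0  # regressors are measure / 100
    return coef * delta_regressor * 100.0    # convert rate change to pp


def change_relative_to_mean(coef, reduction_pp=10.0):
    """Same change, expressed as a percent of the mean investment rate."""
    return change_in_pp(coef, reduction_pp) / (MEAN_INVESTMENT_RATE * 100.0) * 100.0


# Column (1) coefficients from Table 4.
coefficients = {
    "output license": 0.029,
    "input tariff": -0.201,
    "input license": -0.063,
}

for name, coef in coefficients.items():
    print(name, round(change_in_pp(coef), 2), round(change_relative_to_mean(coef), 2))
```

A negative value means investment falls after the reduction (output license), and a positive value means it rises (input tariff and input license).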
All specifications in Table 4 are supported by the tests of over-identifying restrictions, for which the Hansen test statistic fails to reject the validity of the instrument sets (the p-values are 0.318, 0.472, 0.378 and 0.629, respectively). Moreover, the tests for serial correlation, which are applied to the residuals in the first-differenced equations ($\Delta\varepsilon_{ijt}$), show that the null hypothesis of no first-order serial correlation can be rejected, but the null hypothesis of no second-order serial correlation cannot be rejected.34 The fact that the errors only have first-order autocorrelation conforms to the choice of instruments dated t-2 and t-3.

Robustness Checks

The first robustness check demonstrates that the results and conclusions in the previous subsection do not change when the standard errors are clustered at the 4-digit industry level (still using the Windmeijer (2005) small sample correction), instead of being clustered at the firm level. Clustering at the industry level is of interest because the trade liberalization measures vary at the industry level. The first column of Table 5 shows that the results discussed in the previous section are robust to such clustering.35

As discussed in the previous section, it is possible that a variable which affects both import protection levels and firm-level investment is omitted from the baseline specification. Consequently, the estimates of the impact of trade protection on firm-level investment can be biased. The political economy of trade literature has identified a number of factors that may affect protection levels: the industry share of foreign direct investment (FDI), industry concentration, industry labor intensity, and so on. One that can feasibly affect both import protection levels and firm-level investment is industry FDI.36 In the Mexican context, Grether, de Melo, and Olarreaga (2001) have shown that industry-level FDI did affect trade policy during the earlier liberalization years (1985–1988). While there is not much evidence, it is widely speculated that the industry share of FDI can bring about positive productivity spillover effects on domestic establishments, and hence increase investment (see, for example, Javorcik 2004). Therefore, omitting the industry-level share of FDI from the baseline specification may potentially lead to inconsistent estimates of the effect of trade policy on firm-level investment. In order to check if that is the case, the baseline specification is re-estimated additionally including the share of FDI (as a share of total industry output, or total industry capital) on the right-hand side of the regression.37 Columns (2) and (3) of Table 5 show that additionally controlling for industry-level FDI does not change the estimated effects of import protection on firm-level investment.

32. Eight aggregate industry dummies (equivalent to two-digit ISIC manufacturing industry aggregates) and 6 region dummies (into which 32 Mexican states are geographically grouped) are used; see footnote 19.
33. We have also estimated the main model after trimming extreme observations by "winsorizing". We followed Angrist and Krueger (1999) and "winsorized" the data within each year for all of our main variables (including investment, sales, and cash flow) by setting all values below the 0.5th percentile to the value at the 0.5th percentile and all values above the 99.5th percentile to the value at the 99.5th percentile. The estimates using the "winsorized" data, which are available upon request, are quite similar, both economically and in statistical significance, to those reported in Table 4.
34. Assuming that the residuals, $\varepsilon_{ijt}$, in equation (1) are i.i.d., we expect $\Delta\varepsilon_{ijt}$ in the first-differenced equations to have first-order autocorrelation.
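The within-year winsorizing described in footnote 33 can be sketched as follows. This is an illustration only: the percentile rule (linear interpolation between order statistics) is one of several common conventions and is an assumption here, since the text does not say which one was used, and the function names and data layout are hypothetical.

```python
from collections import defaultdict


def percentile(sorted_values, q):
    """Percentile for q in [0, 1] using linear interpolation between order
    statistics (an assumed convention)."""
    position = q * (len(sorted_values) - 1)
    lower_index = int(position)
    upper_index = min(lower_index + 1, len(sorted_values) - 1)
    weight = position - lower_index
    return sorted_values[lower_index] + weight * (
        sorted_values[upper_index] - sorted_values[lower_index]
    )


def winsorize_by_year(rows, lower=0.005, upper=0.995):
    """rows is a list of (year, value) pairs. Values below the lower percentile
    of their own year are set to that percentile, and likewise for values
    above the upper percentile. Order of rows is preserved."""
    values_by_year = defaultdict(list)
    for year, value in rows:
        values_by_year[year].append(value)

    bounds = {}
    for year, values in values_by_year.items():
        ordered = sorted(values)
        bounds[year] = (percentile(ordered, lower), percentile(ordered, upper))

    return [
        (year, min(max(value, bounds[year][0]), bounds[year][1]))
        for year, value in rows
    ]


# Hypothetical investment rates: 201 plants in 1985, three plants in 1986.
sample = [(1985, float(i)) for i in range(201)] + [(1986, 1000.0)] * 3
winsorized = winsorize_by_year(sample)
```

Because the cut-offs are computed separately for each year, a value that is extreme relative to the pooled sample but typical for its own year is left unchanged.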
35. Since the two-step coefficients from the system-GMM are presented, clustering the standard errors at the industry level affects the coefficients as well as the standard errors. However, the coefficients obtained with industry level clustering are very similar to the baseline estimates.
36. To the best of our knowledge, there is no previous theoretical or empirical work that relates any of the other determinants of trade protection to firm-level investment.
37. On theoretical grounds, one would want to include FDI as a share of industry imports, not as a share of industry output, on the right-hand side of the regression (see Grether, de Melo, and Olarreaga 2001). However, industry imports in the same (1975 Mexican Census) classification at the detailed four-digit industry level could not be located. Instead of aggregating the FDI data, FDI as a share of industry output at the original, detailed four-digit industry level was used.

TABLE 5. Robustness Checks

Dependent variable: $I_{ijt}/K_{ijt-1}$
                                                          (1)                (2)                (3)                (4)
Lagged investment rate ($I_{ijt-1}/K_{ijt-2}$)            0.270** (0.128)    0.274** (0.105)    0.275** (0.104)    0.276** (0.104)
Sales/100 ($S_{ijt}/(100 \times K_{ijt-1})$)              0.105 (0.095)      0.088 (0.076)      0.087 (0.076)      0.087 (0.075)
Cash flow/100 ($C_{ijt}/(100 \times K_{ijt-1})$)          -0.0005 (0.006)    0.0001 (0.004)     0.0001 (0.004)     0.0001 (0.004)
Output tariff/100 ($\tau^{OT}_{jt}/100$)                  0.049 (0.067)      0.035 (0.067)      0.058 (0.064)      0.009 (0.051)
Output license/100 ($\tau^{OL}_{jt}/100$)                 0.032** (0.016)    0.029* (0.015)     0.029* (0.015)     0.035** (0.015)
Input tariff/100 ($\tau^{IT}_{jt}/100$)                   -0.183* (0.109)    -0.205** (0.079)   -0.205** (0.078)   -0.155** (0.071)
Input license/100 ($\tau^{IL}_{jt}/100$)                  -0.061** (0.024)   -0.069** (0.015)   -0.062** (0.014)   -0.056** (0.015)
Foreign output (share of total industry output)                              -0.022 (0.018)
Foreign output (fraction of total industry capital)                                             0.003 (0.007)
Number of observations                                    11,834             11,834             11,834             11,834
Hansen test (p-value)                                     0.469              0.317              0.321              0.320
1st order serial correlation (p-value)                    0.000              0.000              0.000              0.000
2nd order serial correlation (p-value)                    0.454              0.397              0.396              0.391

Note: The first column reports the baseline estimates with standard errors clustered by the four-digit industries. The last column reports the estimates with the tariff and license coverage measures that are averages over the previous year's December and corresponding year's June and December rates. See Table 4 for additional notes.
Source: INEGI and SECOFI; see text for details.

For the next robustness check, the baseline specification is estimated using the alternative timing of the four import protection measures. As discussed earlier, data on input and output tariffs as well as import licenses are available for June and December of each year in the sample. In the baseline specification, the data on trade protection in June of year t are used as the relevant measure affecting investment in year t; that is, the investment rate in 1985 was regressed on trade protection in June of 1985. In this robustness check, equally-weighted data on trade protection in December of year t-1, June of year t, and December of year t are used to explain investment in year t. The results with this alternative timing of the protection measures are presented in the last column of Table 5. These estimates are quite similar to their counterparts in the baseline specification in column (1) of Table 4.
With the alternative timing, the only difference from the baseline is in the coefficient on the output tariff, which is smaller than the baseline estimate but remains positive and not statistically significant.38

Heterogeneity in the Impact of the Trade Liberalization

The estimates presented in Table 6 shed light on the heterogeneity of the impact of Mexico's trade liberalization on firm-level investment. Building on the work of Melitz (2003), Bustos (2011) shows both theoretically and empirically that when a multilateral regional trade agreement, such as MERCOSUR, is implemented, firms in Argentina will have an incentive to upgrade technology given the expanded export opportunities that result from lower tariffs in Brazil. In particular, Bustos (2011) emphasizes that this incentive is not the same for all firms: it varies with productivity. Her model predicts that MERCOSUR will induce technology adoption for firms in the middle range of the productivity distribution. On the other hand, the trade agreement will not affect the least efficient producers, who do not export even after the agreement is in place, or the most productive firms, who already employed the upgraded technology before the agreement took effect.

Similar effects of the unilateral trade liberalization in Mexico on firm-level investment are also likely to exist. For example, as input tariffs fall, firms in the middle of the productivity distribution are most likely to experience the largest investment incentive due to the lower prices of imported intermediates. Such firms were previously likely on the margin, and the lower input tariffs provided enough incentive for them to increase investment. On the other hand, the incentive was not enough for the least efficient firms, for which the marginal profitability of capital was quite low before and after the fall in tariffs.
Moreover, the most productive establishments would also not increase their investment by much because they had likely already achieved a high investment rate based on the high expected level of sales before the trade liberalization.

To test empirically for heterogeneity in the impact of Mexico's trade liberalization on firm-level investment, following Bustos (2011), all firms were divided into four groups (the four quartiles of the initial firm size distribution), where initial size is a proxy for initial productivity.39 Consequently, the following expanded version of the baseline specification (1) was estimated:

$$
\begin{aligned}
\frac{I_{ijt}}{K_{ijt-1}} ={}& \alpha_1 \frac{I_{ijt-1}}{K_{ijt-2}} + \alpha_2 \frac{S_{ijt}}{K_{ijt-1}} + \alpha_3 \frac{C_{ijt}}{K_{ijt-1}} + \sum_{r=1}^{4} \gamma^{r}_{\tau^{OT}} \left(\tau^{OT}_{jt} \times Q^{r}_{ij}\right) \\
&+ \sum_{r=1}^{4} \gamma^{r}_{\tau^{OL}} \left(\tau^{OL}_{jt} \times Q^{r}_{ij}\right) + \sum_{r=1}^{4} \gamma^{r}_{\tau^{IT}} \left(\tau^{IT}_{jt} \times Q^{r}_{ij}\right) \\
&+ \sum_{r=1}^{4} \gamma^{r}_{\tau^{IL}} \left(\tau^{IL}_{jt} \times Q^{r}_{ij}\right) + \upsilon_i + \eta_t + \varepsilon_{ijt},
\end{aligned}
\tag{3}
$$

where r indexes the four quartiles of the size distribution and $Q^{r}_{ij}$ is the indicator variable equal to one when firm i belongs to quartile r.40 The estimates are presented in Table 6. In general, the results are consistent with expectations: the impact of lower tariffs and licenses tends to increase with the size quartile and, for both output tariffs and input licenses, falls at the top of the distribution, going from the third to the fourth quartile. Only for the input licenses are the estimates for each quartile statistically significant at the five percent level, and the coefficients follow the expected pattern. The impact of the reduction of input licenses is largest for firms in the third quartile, and the point estimate of −0.087 (0.020) is about 40 percent larger than the average impact of −0.063 (0.014) that is estimated for all firms in the baseline specification (1) (see Table 4). The effects of lower output tariffs and licenses are less precisely estimated, and in the case of output licenses, the coefficients imply that the impact increases with the size quartiles.41

TABLE 6. Heterogeneity of the Impacts across the Size Groups

| | Output Tariff | Output License | Input Tariff | Input License |
|---|---|---|---|---|
| First Quartile | 0.108 (0.152) | −0.042 (0.043) | −0.162 (0.171) | −0.052** (0.024) |
| Second Quartile | 0.029 (0.156) | 0.035 (0.035) | −0.030 (0.206) | −0.064** (0.028) |
| Third Quartile | 0.045 (0.064) | 0.041* (0.021) | −0.243** (0.104) | −0.087** (0.020) |
| Fourth Quartile | 0.001 (0.007) | 0.062** (0.020) | −0.282** (0.103) | −0.055** (0.021) |

Note: The reported coefficients are the interaction terms between the corresponding liberalization measure and initial size dummy variables for the four quartiles. The initial size measure is constructed as the initial employment of the firm normalized by the employment in the corresponding four-digit industry.
Source: INEGI and SECOFI; see text for details.

38. In addition to the aforementioned robustness analyses, we also confirmed that the baseline results are robust to excluding observations with negative investment rates. Since there are only 372 negative observations (out of 11,834), it is not surprising that excluding them does not affect the estimates.

39. As in Bustos (2011), initial firm-level (log) employment relative to the four-digit-industry average was used as a measure of the firm's initial size.

40. Note that the size indicator dummies are not included in the regression on their own, as they categorize firms based on their initial size, which is time-invariant and is therefore absorbed by the firm-specific effect.

41. This monotonic pattern is not fully consistent with expectations; it may be due to the fact that firm size is not a perfect measure of productivity.

The Imported Inputs Channel

The discussion in section II illustrates how trade liberalization can increase investment by lowering the cost of imported inputs and making them more accessible.
Hence, a firm that requires the use of imported inputs should benefit more from a reduction in input tariffs and import license coverage. To test this prediction explicitly, the baseline specification (1) is augmented with two interaction terms: one between the input tariff and the firm's import costs (normalized by the capital stock) and a second between the input import license coverage and the firm's import costs.

The results are presented in column (1) of Table 7. The main effects of the input tariff and import licenses are again negative and statistically significant at −0.158 (0.076) and −0.048 (0.014), respectively. As expected, both interaction terms are negative, with the input import license coverage interaction term statistically significant at the 5 percent level. The estimated coefficient of −0.008 (0.004) on this term implies that a firm facing the average amount of import costs (1.06) would increase investment by 0.56 percentage points given a 10 percentage point reduction in input import license coverage. At 2.26 percentage points, the effect is about four times as large for a firm facing import costs that are one standard deviation above the mean import costs (22.24 = 1.06 + 21.18). This finding highlights the additional benefits of the reductions in input tariffs and import licenses for firms that import (intermediate) inputs, and it is consistent with previous work by Amiti and Konings (2007), who have shown that such firms enjoy larger productivity gains from a reduction in input tariffs. Moreover, similar to their results, the coefficient on imports itself in Table 7 is positive and statistically significant, showing that a 10 percentage point increase in imports is associated with a 2.22 percentage point increase in investment.

The second column of Table 7 presents the results when, in addition to imports, two other (time-varying) firm-level characteristics that can potentially affect investment behavior, exports and foreign ownership, are included in the model.
Either higher exports or foreign ownership can imply higher investment profiles, since such firms are typically more productive and larger in size. Contrary to what one would expect, both higher foreign ownership and higher exports appear to be associated with a lower investment rate, although inferences are problematic since neither of the coefficients is statistically significant.42

42. Amiti and Konings (2007) also estimate a negative and insignificant effect of exports on firm-level productivity in Indonesian plants.

TABLE 7. Imported Inputs and Markup

Dependent variable: $I_{ijt}/K_{ijt-1}$

| Variable | (1) | (2) | (3) |
|---|---|---|---|
| Lagged investment rate ($I_{ijt-1}/K_{ijt-2}$) | 0.324** (0.103) | 0.353** (0.110) | 0.276** (0.105) |
| Sales/100 ($S_{ijt}/(100 \times K_{ijt-1})$) | 0.087 (0.059) | 0.084 (0.064) | 0.089 (0.077) |
| Cash flow/100 ($C_{ijt}/(100 \times K_{ijt-1})$) | 0.001 (0.003) | 0.002 (0.002) | 0.001 (0.004) |
| Output tariff/100 ($\tau^{OT}_{jt}/100$) | 0.036 (0.066) | 0.015 (0.070) | 0.020 (0.068) |
| Output tariff × markup/100 ($\tau^{OT}_{jt} \times \Psi_i/100$) | | | 0.016** (0.007) |
| Output license/100 ($\tau^{OL}_{jt}/100$) | 0.025* (0.015) | 0.024 (0.015) | 0.018 (0.017) |
| Output license × markup/100 ($\tau^{OL}_{jt} \times \Psi_i/100$) | | | 0.005 (0.003) |
| Input tariff/100 ($\tau^{IT}_{jt}/100$) | −0.158** (0.076) | −0.158** (0.080) | −0.126 (0.085) |
| Input tariff × imports/100 ($\tau^{IT}_{jt} \times IM_{ijt}/(100 \times K_{ijt-1})$) | −0.003 (0.004) | −0.005 (0.005) | |
| Input tariff × markup/100 ($\tau^{IT}_{jt} \times \Psi_i/100$) | | | −0.036** (0.014) |
| Input license/100 ($\tau^{IL}_{jt}/100$) | −0.048** (0.014) | −0.051** (0.014) | −0.075** (0.017) |
| Input license × imports/100 ($\tau^{IL}_{jt} \times IM_{ijt}/(100 \times K_{ijt-1})$) | −0.008** (0.004) | −0.009** (0.004) | |
| Input license × markup/100 ($\tau^{IL}_{jt} \times \Psi_i/100$) | | | 0.005 (0.005) |
| Imports/100 ($IM_{ijt}/(100 \times K_{ijt-1})$) | 0.222** (0.101) | 0.268** (0.119) | |
| Exports/100 ($EX_{ijt}/(100 \times K_{ijt-1})$) | | −0.112 (0.122) | |
| Foreign share | | −0.021 (0.017) | |
| Number of observations | 11,834 | 11,834 | 11,834 |
| Hansen test (p-value) | 0.559 | 0.546 | 0.327 |
| 1st order serial correlation (p-value) | 0.000 | 0.000 | 0.000 |
| 2nd order serial correlation (p-value) | 0.244 | 0.203 | 0.395 |

Note: In columns (1) and (2), we treat imports and the interaction terms as endogenous and use lags 2 and 3 of imports and the interaction terms as GMM-type instruments, in addition to the set of instruments listed in Table 4. In column (2), we further add lags 2 and 3 of exports and foreign share (both treated as endogenous) to the instrument set used in column (1). See Table 4 for additional notes.
Source: INEGI and SECOFI; see text for details.

The Markup Channel

Finally, this subsection considers the differential effects of input and output protection measures for firms with various levels of market power. The theoretical framework in section II illustrates how the effect of output tariffs and licenses can be increasing in the size of the firm's markup. A firm with substantial market power, that is, with a high markup, can be affected more by lower output tariffs and import license coverage because of the heightened import competition that erodes its marginal profitability. At the opposite end of the spectrum, an already competitive firm with a low markup will not be affected considerably by the additional competition. While there is no direct theoretical prediction about the marginal impact of input tariffs and license coverage for high markup firms, one could expect that high markup firms would be affected less by changes in input tariffs and import licenses, as they can adjust their markups and absorb some of the cost fluctuations in their profit margins without any significant changes in their investment behavior.

To test these predictions, the baseline specification is augmented with markup interactions, where a time-invariant measure of the firm's markup, as described earlier, is employed. These results are presented in the last column of Table 7.
As expected, the interaction terms between the markup and the output tariff and import license are both positive, implying that a reduction in output protection lowers investment more in high markup firms. While the main effect of the output tariff is not significant, as in the baseline specification, its interaction is statistically significant at the 5 percent level.43 The estimated coefficients imply that a reduction in output tariffs of 10 percentage points lowers investment by 0.56 percentage points for a firm with the average markup (2.25), and by 1.47 percentage points for a firm with a markup that is one standard deviation above the mean (7.93 = 2.25 + 5.68). While both the coefficient on the main output import license coverage term and its interaction with the markup are positive, neither is statistically significant.

Finally, the interaction term between the markup and the input tariff measure and the interaction between the markup and the input import license measure have a negative and a positive sign, respectively. As expected, the interaction term between the input import license coverage and the markup is positive, albeit insignificant, implying a mitigating role for the markup when it comes to the effect of input licenses on investment. The coefficient on the input license coverage itself is negative and statistically significant, as in the baseline results. While the main effect of input tariffs is negative and insignificant, the interaction term between the input tariff and the markup is negative and statistically significant. The negative interaction term implies that higher markup firms increase investment more following a reduction in input tariffs. This could reflect the fact that high markup firms tend to be larger and import more intermediates, so their marginal profitability increases by more following a reduction in input tariffs.
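The marginal-effect arithmetic quoted for the markup interactions (and, earlier, for the import-cost interactions) can be reproduced with a short Python sketch. It assumes that the effect of a protection measure equals the main coefficient plus the interaction coefficient times the moderator, with coefficients taken from Table 7; the function name `investment_response` is ours, not the authors'.

```python
def investment_response(main_coef, interaction_coef, moderator, reduction_pp):
    """Change in the investment rate (percentage points) after a reduction of
    `reduction_pp` percentage points in a protection measure, for a firm whose
    moderator (markup, or import costs normalized by capital) equals `moderator`.

    Marginal effect of the measure = main_coef + interaction_coef * moderator;
    a reduction in the measure flips the sign of that effect.
    """
    marginal_effect = main_coef + interaction_coef * moderator
    return -marginal_effect * reduction_pp


# Output tariff and markup, column (3) of Table 7.
at_mean_markup = investment_response(0.020, 0.016, 2.25, 10)
at_high_markup = investment_response(0.020, 0.016, 7.93, 10)

# Input license coverage and import costs, column (1) of Table 7.
at_mean_imports = investment_response(-0.048, -0.008, 1.06, 10)
at_high_imports = investment_response(-0.048, -0.008, 22.24, 10)

print(round(at_mean_markup, 2), round(at_high_markup, 2))
print(round(at_mean_imports, 2), round(at_high_imports, 2))
```

Under these assumptions the sketch gives roughly −0.56 and −1.47 percentage points for the output tariff at the mean and high markup, and roughly +0.56 and +2.26 percentage points for input license coverage at the mean and high import costs, in line with the figures quoted in the text.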
43. The main effect and the interaction are jointly significant at the 10 percent level.

Overall Impact of the Trade Liberalization on the Investment Rate in Mexico's Manufacturing Sector

Finally, this section analyzes the overall impact of Mexico's trade liberalization in the 1980s on the investment rate ($I_{ijt}/K_{ijt-1}$) in the manufacturing sector. Additionally, the respective contributions of the four major trade barriers that declined substantially as part of the trade liberalization process (the output tariff and the output license coverage, as well as the input tariff and the input license coverage) are separated and compared. First, note that while both output and input tariffs fell significantly, by about 22 percentage points and 10 percentage points respectively, the drop in both the output and input license coverage ratios was even more dramatic (see Figure 1). In 1984, the average output license coverage was 93 percent and the average input license coverage was 77 percent. By the end of our sample period in 1990, the average output license coverage had fallen 74 percentage points to 19 percent and the average input license coverage had dropped 66 percentage points to 11 percent.

Given the overall decrease in these trade barriers, the baseline estimates in column (1) of Table 4 imply that the 22 percentage point decline in the output tariffs led to a 1.08 percentage point decline in the investment rate and the 10 percentage point decrease in the input tariffs led to a 2.01 percentage point increase in the average investment rate. Hence, the overall change in the investment rate due to the drop in tariffs was an increase of 0.93 percentage points.
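The tariff arithmetic above can be checked with a few lines of Python. This is a back-of-the-envelope sketch: the declines and investment effects are the figures quoted in the text, while the "implied" coefficients are simply effect divided by change (an assumption on our part, since Table 4 is not reproduced here), and the names are hypothetical.

```python
# Declines in average protection (percentage points) and the resulting change
# in the investment rate (percentage points), as quoted in the text.
tariff_effects = {
    "output tariff": {"decline_pp": 22, "investment_change_pp": -1.08},
    "input tariff": {"decline_pp": 10, "investment_change_pp": 2.01},
}


def implied_coefficient(decline_pp, investment_change_pp):
    """Coefficient implied by: investment change = coefficient * (-decline)."""
    return investment_change_pp / (-decline_pp)


# Net effect of the two tariff reductions on the investment rate.
net_tariff_effect = sum(
    entry["investment_change_pp"] for entry in tariff_effects.values()
)

for name, entry in tariff_effects.items():
    coef = implied_coefficient(entry["decline_pp"], entry["investment_change_pp"])
    print(f"{name}: implied coefficient {coef:.3f}")
print(f"net tariff effect: {net_tariff_effect:.2f} percentage points")
```

The implied coefficients (about 0.049 for the output tariff and −0.201 for the input tariff) have the signs the theory predicts: output protection raises investment, input protection lowers it.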
Further, the estimation results in column (1) of Table 4 also imply that the 74 percentage point decline in the output license coverage resulted in a 2.15 percentage point drop in the investment rate, while the 66 percentage point decrease in the input license coverage led to a 4.16 percentage point increase in the investment rate. The net effect of the lower output and input import license coverage on the investment rate is therefore positive at 2.01 percentage points. Overall, the impact of the decrease in all four trade barriers is positive, implying that Mexico's trade liberalization led to a 2.94 percentage point increase in the investment rate in the Mexican manufacturing sector. It is also worth noting that the largest single (positive) impact on the investment rate came from the substantial decrease in the import license coverage on inputs.

Naturally, the net impact of the trade liberalization on the investment rate differs across four-digit manufacturing industries, driven by differences in the decline in the four import restriction measures. While the net impact is positive for the large majority of the industries, 14 (out of 129) actually experienced a decrease in the investment rate as a result of the trade liberalization, with the largest decrease being 8.88 percentage points (“Milling of Wheat”) and the largest increase being 7.94 percentage points (“Nonwoven Fabrics”). Among the industries that witnessed the largest decline in their investment rates is also “Milling of Corn.” On the opposite end of the spectrum, some of the industries that experienced the largest increase in their investment rates as a consequence of the fall in tariffs and license coverage ratios include “Manufacturing of Fiberglass” and “Manufacturing of Paper.”

Conclusion

Much research has been done on the impact of trade liberalization on firm productivity (e.g., Amiti and Konings 2007; Topalova and Khandelwal 2011).
However, few studies have evaluated the effects of lower trade protection on investment. Apart from a few country- and industry-level studies (e.g., Ibarra 1995; Wacziarg and Welch 2008), to the best of our knowledge, this study is the first to have estimated the impact of trade liberalization on firm-level investment.

Employing plant-level data from the Mexican manufacturing sector, this study evaluates the impact of lower input and output tariffs as well as import license coverage on plants' investment decisions. It improves upon previous work with aggregate data as it controls for establishment-specific, time-invariant unobservables that may affect investment and bias the estimated impacts if not included in the empirical analysis. Also, it shows that employing aggregate country-level or industry-level data hides substantial heterogeneity in the effects of lower trade barriers on firms with different market power and international trade positions. Finally, as some of the recent studies on trade liberalization and productivity highlight the predominant importance of lower input tariffs (increased access to foreign inputs) compared to lower output tariffs (increased product market competition) for productivity, it evaluates the effect of both input and output protection measures on plant-level investment in the Mexican manufacturing sector.

In the case of investment, theory implies that access to cheaper inputs via lower input tariffs increases firms' profitability, and therefore investment, while lower output tariffs bring about more intense import competition, which results in lower profits and investment. This is exactly what the analysis finds. Using data that cover a period of broad trade liberalization in Mexico in the mid-1980s, it is shown that the decrease in input tariffs as well as import license coverage resulted in higher investment in Mexican manufacturing establishments.
Also, in line with theory, the drop in output tariffs and import license coverage led to a decrease in plant-level investment. The estimated effects are economically and statistically significant and consistent with previous work on plant-level productivity (Amiti and Konings 2007; Topalova and Khandelwal 2011). Altogether, they suggest that the impacts of lower input tariffs and import license coverage (increased access to foreign inputs) on plant-level investment are larger than the impacts of lower output tariffs and import license coverage (increased product market competition). Consequently, the results show that Mexico's trade liberalization led to an increase in the average investment rate in the Mexican manufacturing sector.

REFERENCES

Alvarez, Roberto, and Ricardo A. Lopez. 2005. “Exporting and Firm Performance: Evidence from Chilean Plants.” Canadian Journal of Economics 38(4): 1384–1400.

Amiti, Mary, and Jozef Konings. 2007. “Trade Liberalization, Intermediate Inputs, and Productivity: Evidence from Indonesia.” American Economic Review 97(5): 1611–1638.

Angrist, Joshua D., and Alan B. Krueger. 1999. “Empirical Strategies in Labor Economics.” In O. C. Ashenfelter and D. Card, eds., Handbook of Labor Economics Vol. 3A. New York: Elsevier Science.

Arellano, Manuel, and Stephen Bond. 1991. “Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations.” Review of Economic Studies 58: 277–297.

Arellano, Manuel, and Olympia Bover. 1995. “Another Look at the Instrumental Variable Estimation of Error-Components Models.” Journal of Econometrics 68: 29–51.

Blundell, Richard, and Stephen Bond. 1998. “Initial Conditions and Moment Restrictions in Dynamic Panel Data Models.” Journal of Econometrics 87: 115–143.

Bond, Stephen, and John Van Reenen. 2008. “Microeconometric Models of Investment and Employment.” In James J. Heckman and Edward E. Leamer, eds., Handbook of Econometrics. Basingstoke, U.K.: Palgrave Macmillan.

Bustos, Paula. 2011. “Multilateral Trade Liberalization, Exports and Technology Upgrading: Evidence on the Impact of MERCOSUR on Argentinean Firms.” American Economic Review 101(1): 304–340.

Campa, Jose Manuel, and Linda S. Goldberg. 1999. “Investment, Pass-through, and Exchange Rates: A Cross-Country Comparison.” International Economic Review 40(2): 287–314.

Domowitz, Ian, Robert G. Hubbard, and Bruce Petersen. 1986. “Business Cycles and the Relationship Between Concentration and Price-Cost Margins.” Rand Journal of Economics 17: 1–17.

Fazzari, Steven M., Robert G. Hubbard, and Bruce C. Petersen. 1988. “Financing Constraints and Corporate Investment.” Brookings Papers on Economic Activity: 141–195.

Fernandes, Ana. 2007. “Trade Policy, Trade Volumes and Plant-Level Productivity in Colombian Manufacturing Industries.” Journal of International Economics 71(1): 52–71.

Goldberg, Pinelopi K., and Nina Pavcnik. 2004. “Trade, Inequality, and Poverty: What Do We Know? Evidence from Recent Trade Liberalization Episodes in Developing Countries.” Brookings Trade Forum: 223–269.

Grether, Jean-Marie, Jaime de Melo, and Marcelo Olarreaga. 2001. “Who Determines Mexican Trade Policy?” Journal of Development Economics 64: 343–370.

Grossman, Gene M., and Elhanan Helpman. 1994. “Protection for Sale.” American Economic Review 84: 833–850.

Hillman, Arye L. 1982. “Declining Industries and Political-support Protectionist Motives.” American Economic Review 72: 1180–1187.

Iacovone, Leonardo, and Beata S. Javorcik. 2008. “Shipping Good Tequila Out: Investment, Domestic Unit Values and Entry into Exports.” Technical Report.

Ibarra, Luis A. 1995. “Credibility of Trade Policy Reform and Investment: The Mexican Experience.” Journal of Development Economics 47: 39–60.

Jaramillo, Fidel, Fabio Schiantarelli, and Andrew Weiss. 1996. “Capital Market Imperfections Before and After Financial Liberalization: An Euler Equation Approach to Panel Data for Ecuadorian Firms.” Journal of Development Economics 51(2): 367–386.

Javorcik, Beata S. 2004. “Does Foreign Direct Investment Increase the Productivity of Domestic Firms? In Search of Spillovers through Backward Linkages.” American Economic Review 94(3): 605–627.

Love, Inessa. 2003. “Financial Development and Financial Constraints: International Evidence from the Structural Investment Model.” Review of Financial Studies 16(3): 135–161.

Love, Inessa, Ann E. Harrison, and Margaret S. McMillan. 2004. “Global Capital Flows and Financing Constraints.” Journal of Development Economics 75: 269–301.

Melitz, Marc. 2003. “The Impact of Trade on Aggregate Industry Productivity and Intra-Industry Reallocations.” Econometrica 71(6): 1695–1725.

Muendler, Marc-Andreas. 2004. “Trade, Technology and Productivity: A Study of Brazilian Manufacturers.” Policy Research Working Paper Series 1148, CESifo.

Pavcnik, Nina. 2002. “Trade Liberalization, Exit and Productivity Improvements: Evidence from Chilean Plants.” Review of Economic Studies 69(1): 245–276.

Rodriguez, Francisco, and Dani Rodrik. 2000. “Trade Policy and Economic Growth: A Skeptic's Guide to the Cross-National Evidence.” In Ben Bernanke and Kenneth Rogoff, eds., NBER Macroeconomics Annual. Cambridge, MA: MIT Press.

Sachs, Jeffrey D., and Andrew Warner. 1995. “Economic Reform and the Process of Global Integration.” Brookings Papers on Economic Activity 1: 1–118.

Ten Kate, Adriaan, and Fernando de Mateo Venturini. 1989. “Apertura Comercial y Estructura de la Proteccion en Mexico.” Comercio Exterior 39: 313–329.

Topalova, Petia, and Amit Khandelwal. 2011. “Trade Liberalization and Firm Productivity: The Case of India.” Review of Economics and Statistics 93(3): 995–1009.

Tybout, James R., and M. Daniel Westbrook. 1995. “Trade Liberalization and the Dimensions of Efficiency Change in Mexican Manufacturing Industries.” Journal of International Economics 39(1): 53–78.

Tybout, James R., M. Daniel Westbrook, Jaime De Melo, and Vittorio Corbo. 1991. “The Effects of Trade Reforms on Scale and Technical Efficiency: New Evidence from Chile.” Journal of International Economics 31(3): 231–250.

Wacziarg, Romain, and Karen Horn Welch. 2008. “Trade Liberalization and Growth: New Evidence.” World Bank Economic Review 22(2): 187–231.

Windmeijer, Frank. 2005. “A Finite Sample Correction for the Variance of Linear Efficient Two-step GMM Estimators.” Journal of Econometrics 126: 25–51.