WPS7458

Policy Research Working Paper 7458

Exploring the Sources of Downward Bias in Measuring Inequality of Opportunity

Gabriel Lara Ibarra
Adan L. Martinez Cruz

Poverty Global Practice Group
October 2015

Abstract

This study analyzes the extent of downward bias in the calculation of inequality of opportunity for continuous outcomes such as income. A typically recognized source of bias is the unobserved circumstances as there is a limited set of variables available in household and labor force surveys. Another previously overlooked source is the likely unobservable nature of top incomes. Using Monte Carlo simulations where the underlying inequality of opportunity is predetermined at various levels, the study presents three key findings. First, the omission of a relevant circumstance can bias the inequality of opportunity estimate by as much as 80 percent, depending on how much variation of the outcome such circumstance explains. Second, not observing the top 5 percent of the income distribution can lead to downward biases of anywhere between 12 and 35 percent, and the combination of missing the most favored population and even one relevant circumstance exacerbates the bias of the empirical estimates. The third key result is that the estimated inequality of opportunity is strongly correlated with the amount of variation in the outcome variable explained by the combination of circumstances (measured by the $R^2$). This result suggests that in empirical applications, the inequality of opportunity estimate can be roughly (and quickly) approximated using simple econometric techniques.

This paper is a product of the Poverty Global Practice Group. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org.
The authors may be contacted at glaraibarra@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Exploring the Sources of Downward Bias in Measuring Inequality of Opportunity

Gabriel Lara Ibarra, The World Bank
Adan L. Martinez Cruz, ETH Zurich

JEL codes: D63, C15
Keywords: Inequality of opportunity, mean log deviation, Monte Carlo, income distribution, top incomes.

Introduction

Income inequality has become firmly placed at center stage of economic and policy debate. The reasons for this are the current discussions of how long-term welfare disparities have evolved over time and may continue on such a path (Piketty, 2014), the evidence on large concentrations of wealth in few individuals (OXFAM, 2014), and inequality’s role in fueling discontent in several contemporaneous social movements (Los indignados in Spain, the Occupy protests in the USA, and the events in the Middle East that became known as the Arab Spring).1 However, understanding inequality in income, consumption or other such outcomes2 and the role of policy in addressing it is itself subject to debate because not all inequality can be unambiguously deemed objectionable.
On the one hand, if we take two individuals who exert different levels of effort, whoever exerts more effort (demonstrated by attaining higher educational levels or working more) should be able to reap higher economic rewards, for instance in the form of higher incomes. Inequality of this kind is necessary to produce the right incentives to promote economic development. On the other hand, from an egalitarian ethical point of view, inequalities originating in factors beyond individuals’ responsibility are inequitable and must be compensated by society (Peragine, 2004). Moreover, this type of inequality may reduce economic growth, as it favors human capital accumulation by individuals with better social origins rather than by individuals with more talent or skills.3

This reasoning is not only a theoretical artifact. People do make a distinction between circumstances and efforts when judging the distribution of outcomes such as income (Ramos and Van de Gaer, 2012). Arguably, policy makers aiming to reduce inequality should not focus on the inequality caused by choices that individuals can be held responsible for, but should instead address the inequality due to circumstances that prevent a “level playing field”: circumstances into which an individual happens to be born but which ultimately affect her available life paths. Roemer (1993, 1998) coined the term inequality of opportunity (IOO) to distinguish between inequality due to differences in circumstances beyond an individual’s control and inequality of effort (IOE). IOE is inequality caused by choices an individual can be held responsible for.
1 People participating in these social movements share the perceptions of rising inequality and decreasing economic mobility: “when Occupy Wall Street sprang up in parks and under tents, one of the many issues the protesters pressed was economic inequality” (see http://www.nytimes.com/2013/09/14/opinion/blow-occupy-wall-street-legacy.html). Statements about the income inequality perceptions of the Occupy movement’s participants can be found at http://topics.nytimes.com/top/reference/timestopics/organizations/o/occupy_wall_street/index.html. Documentation of the Los indignados movement can be found at http://elpais.com/tag/movimiento_indignados/a/.
2 Other types of inequality, such as in access to reliable health services, are clearly important for the development of the individual. An example of the consequences of this type of inequality has been documented in the USA, where infant mortality has been linked to inequality at the time of birth: infants born to non-white, non-college-educated, non-married US mothers have higher probabilities of post-neonatal mortality (Chen et al., 2014).
3 From an economic point of view, a multiple state framework with borrowing constraints can be used to show that this type of inequality reduces economic growth as it favors human capital accumulation by individuals with better social origins rather than by individuals with more talent or skills. Following a similar reasoning, income inequality among people exerting different efforts rewards unequal effort and/or unequal talent, incentivizing people to work, which in turn stimulates growth (Marrero and Rodriguez, 2013).

If the focus is on IOO instead of inequality in outcomes, then public policies can be reframed.
That is, instead of aiming for a policy that equalizes outcomes, policy makers may put their efforts into designing a policy that nullifies, to the greatest extent possible, the effect of circumstances on outcomes, while allowing outcomes to be sensitive to individuals’ effort (Roemer and Trannoy, 2013). By distinguishing between unchosen circumstances and individual choices, equal opportunity theorists have shifted the focus to the individuals’ responsibility. Implicit in this approach is the view that an optimal public policy must find an equilibrium between re-allocating initial resources and respecting individuals’ free will. Thus the goal of a policy must be to equalize what is available to individuals at the beginning of their journey, not at the end. Where and how individuals end up is their responsibility, and society should not interfere in this realm. Notice that, from this point of view, inequality in outcomes is neither desirable nor undesirable; that is, the inequality of opportunity theory has no aversion to inequality in outcomes (see Roemer and Trannoy, 2013).

Although interest in inequality of opportunity has been fueled by the recent debate on whether large concentrations of wealth in few individuals are due to overcompensation of effort (e.g. Piketty, 2014), Roemer and Trannoy (2013) have highlighted that the theory of equal opportunity is not intended as a theory of distributive justice. They offer two reasons. The first points to the pragmatism behind this theory: it does not provide general rules to decide what people are responsible for. Practitioners infer the circumstances, and implicitly the effort, according to what they think a particular society rewards or punishes. The second reason is that this theory does not provide a view on what the proper rewards to effort consist of.
That is, the theory of equal opportunity takes no stand on the debate on whether individuals are compensated too much or too little.

After decades of being mostly theoretical, the literature on IOO has become very empirical. IOO has been analyzed in a wide array of realms such as income distribution, income taxation, health conditions, health care, educational achievement, and anti-poverty policy.4 This rapidly growing empirical literature is based on the work by Bourguignon et al. (2007), Paes de Barros et al. (2009), and Ferreira and Gignoux (2011). In short, for a given outcome, IOO can be obtained as the ratio of the inequality between groups or “types” of individuals to the overall observed inequality. Between-group inequality is calculated as the inequality across types of individuals, where a type results from intersecting categorized variables that capture circumstances beyond an individual’s control. The number of types in specific applications depends on the number of categories into which each circumstance is divided. When the types of individuals have been correctly defined (i.e. all relevant circumstances and categories have been taken into account), individuals belonging to a type are treated as homogeneous in their circumstances, and differences in outcomes across types are imputable to differences in circumstances. Thus, the share of total inequality that is related to the inequality due to differences in circumstances is interpreted as the IOO.

4 Extensive recent reviews of the literature are provided by Roemer and Trannoy (2013), Ferreira and Peragine (2015), Pignataro (2012), and Ramos and Van de Gaer (2012).

Unfortunately, this empirical strategy suffers from a drawback: it provides at best lower bounds of the true IOO.
One well-recognized factor leading to this downward bias is that only a subset of all relevant circumstances is observable in available datasets (Ferreira and Gignoux, 2011). Imprecision in the measurement of the outcome variable is another reason, recently pointed out by Chavez-Juarez (2015).

An additional factor possibly adding to the downward bias of IOO estimates is the lack of information about a portion of the population under study. Available data are usually gathered through surveys that likely do not reach the most favored population. If this is the case, sample distributions of outcome variables miss the rightmost section of the right tail of the population distribution, and therefore look more homogeneous than the true population distribution. Downward bias in IOO estimates is then a consequence of the smaller variation in the sample distribution. The magnitude of the downward bias depends on how much variation is lost. The loss in variation is likely large because the most favored individuals usually reach large outcome values.5

The possible additional bias from not observing top income populations has largely been overlooked in the IOO literature.6 For instance, Roemer and Trannoy (2013) provide a lengthy, thorough discussion of the limitations stemming from poor quality data. They emphasize the need to improve surveys, particularly in developing countries. They point out the relevance of gathering physical information such as body mass index and psychological information such as mental health indicators and IQ measures. They also highlight how important it is to measure achievements of children around the age of consent. But these recommendations aim to improve the gathering of circumstances, overlooking the possible consequences of not observing subpopulations at all.

In this context, a question with very practical implications naturally arises: is there a way to quantify the downward bias of IOO estimates?
This paper answers this question by means of Monte Carlo simulations that experimentally manipulate three factors: i) not observing the most favored population; ii) not observing all relevant circumstances; and iii) the interaction between observing neither the most favored population nor all relevant circumstances.

5 The fact that information about the most favored population is not available in household survey datasets has been discussed before in studies of inequality of outcomes (e.g. Korinek et al. (2006) for the case of the U.S., and Hlasny and Verme (2013) for the case of Egypt), but its implications for the estimation of IOO measures have been overlooked so far.
6 As part of a companion project, we study the impact of an additional empirical issue: living standards across regions. So far, researchers focusing on income or wages have overlooked differences in costs of living across regions of a country. As a consequence, total inequality in income is likely to be overestimated. This is the case because high income individuals usually live in locations with higher costs of living than low income individuals. By not adjusting for differences in purchasing power, high income individuals appear richer than they actually are. Ongoing research is focusing on i) the implications of not adjusting for purchasing power on IOO estimates, and ii) the impact of not observing the most favored population on IOO estimates in this context. Implications for the estimation of Growth Incidence Curves of not adjusting for purchasing power have been discussed in other studies (e.g. Skoufias and Olivieri, 2013), but the issue is overlooked in the IOO literature.

While strategies to handle the lack of information about all relevant circumstances already have been proposed (e.g.
Niehues and Peichl, 2012), the interaction between this source of downward bias and the lack of information about the most favored population has not been studied.

Results from the Monte Carlo simulations are straightforward: i) loss of information about the most favored population produces negligible downward bias when IOO is large (e.g. 0.978); ii) the magnitude of the downward bias, however, increases as the true IOO decreases (e.g. it represents 22% of the true value when the true IOO is 0.299); iii) loss of information due to not observing a relevant circumstance may or may not be negligible, as the magnitude of the downward bias depends on the variation of the outcome that the circumstance can explain by itself; and iv) observing neither the most favored population nor a relevant circumstance increases the magnitude of the downward bias, i.e. there are interactive effects between the two sources of downward bias.

By compiling results from the Monte Carlo simulations carried out in this paper, we are able to observe a strong positive correlation between the estimated IOO and the amount of variation in the outcome variable explained by the combination of circumstances (measured by the $R^2$). This association holds both for estimates reported in published studies and under a variety of Monte Carlo scenarios carried out in this paper. This result suggests that in empirical applications, the IOO estimate can be roughly (and quickly) approximated using simple econometric techniques. More importantly, it highlights the steep data requirements of an exercise such as the IOO: a higher IOO estimate will be obtained only to the extent that the variables found in a given survey explain a larger share of the variation of the outcome of interest.

The rest of this document is organized as follows. Section 2 describes the estimation of inequality of opportunity. Section 3 describes the experimental designs behind our Monte Carlo and bootstrapping simulations.
Section 4 presents results. Section 5 presents conclusions.

Measurement of Inequality of Opportunity

The estimation of Inequality of Opportunity (IOO) for continuous outcomes followed in this paper relies on the methodology described in Ferreira and Gignoux (2011). Borrowing Roemer’s (1998) model of advantages, assume desirable economic outcomes are defined by three types of characteristics: circumstances (C), effort (E), and luck (u). Circumstances are all variables beyond an individual’s control. Effort is captured by variables over which individuals have control and may also be correlated with circumstances. Luck refers to the completely random variables affecting economic outcomes. Thus, the individual outcomes we observe can be written as

$$y = f(C, E, u) \qquad (1)$$

As noted by Ferreira and Gignoux (2011), equality of opportunity in Roemer’s sense implies that while outcomes can vary by effort (individuals who exert more effort should be rewarded with higher incomes) and luck (situations outside the control of the individual or policy), circumstances should not matter in how the outcomes are distributed. That is, equality of opportunity requires that $F(y \mid C) = F(y)$, where $F(\cdot)$ is the cumulative distribution function of the outcome of interest. Measuring inequality of opportunity is then equivalent to measuring the extent to which $F(y \mid C) \neq F(y)$.

To measure this difference, this paper employs the ex-ante non-parametric approach to equality of opportunity.7 Generally speaking, this approach consists of five steps. First, we define the outcome variable of interest in the survey. In our case, we will focus on individuals’ wage earnings. Second, define the set of circumstances that are believed to be relevant to the individuals’ observed outcomes. These circumstances include gender, education of the parents, region of birth, etc.8 Third, allocate individuals into groups or “types” that result from combining circumstances across all their categories.
Fourth, calculate the inequality of outcomes from a smoothed distribution (Foster and Shneyerov, 2000). For this distribution, each individual’s outcome (y) is replaced by the group-specific mean for her type. Finally, compute the inequality index for both the original distribution of incomes and the smoothed distribution, and take the following ratio:

$$\widehat{IOO} = \frac{I(\{\mu_i^k\})}{I(\{y_i\})} \qquad (2)$$

where $y_i$ refers to the earnings of individual i, $\mu_i^k$ represents the average outcome of individuals who belong to type k, and I() is an inequality index. While any inequality index could be used, it is preferable that the chosen index satisfies the following axiomatic properties: symmetry (anonymity), transfer principle, scale invariance, population replication, and additive decomposability. In turn, for any inequality index satisfying these properties, $\widehat{IOO}$ satisfies: i) the principle of population, i.e. the index is invariant to a replication of the population; ii) scale invariance, i.e. the index is invariant to the multiplication of all outcomes by a positive scalar; iii) normalization, i.e. if the smoothed distribution is degenerate then the index takes a value of zero; and iv) within-type symmetry, i.e. the index is invariant to any permutation of two individuals within a type. Our estimate of IOO ($\widehat{IOO}$) is bounded by 0 and 1 and can be roughly interpreted as the share of total inequality that is explained by circumstances.9

The inequality index we use here is the mean log deviation (MLD). MLD is defined as

$$MLD = \frac{1}{N} \sum_{i=1}^{N} \ln\left(\frac{\bar{y}}{y_i}\right),$$

where i stands for individual i, N is the number of individuals, $y_i$ is individual i’s outcome, and $\bar{y}$ is the overall mean of the outcome variable.10

7 An ex-post approach identifies individuals with the same level of effort; then estimates inequality in outcome; and finally measures how outcomes vary by types. See Ferreira and Gignoux (2009) for details. The authors also propose a parametric approach based on an OLS regression and simple functional assumptions. The non-parametric approach is implemented here.
8 Continuous variables are broken into categories.
9 In the numerator, all inequality within types is eliminated, and thus only inequality across circumstance groups is taken into account.

The reasoning behind estimating IOO in this way assumes that i) the sample distribution of the outcome variable resembles the population distribution, and ii) all relevant circumstances beyond an individual’s control (those used to create the types that partition the population) impact his/her chances of economic development. If this is the case, the grouping strategy creates groups of individuals facing identical circumstances. Thus, differences in outcome across these homogeneous groups reflect differences attributable to differences in circumstances.

Our strategy to explore potential sources of downward bias in IOO estimates is based on a series of Monte Carlo simulations seeking to quantify the consequences of i) not observing all relevant circumstances (i.e. circumstances that affect one’s income); ii) not observing individuals at the top of the income distribution (those most favored, the top incomes); and iii) the interaction from observing neither. We describe this approach in detail next.

Studying Inequality of Opportunity in a Simulated Environment

i. Pseudo-population

The experimental design is implemented on a simulated population of 200,000 pseudo-individuals. In this population, there are five characteristics that define the individuals. We label these characteristics in the same way we would find relevant variables in a household or individual survey: gender, urban/rural setting, region of birth, father’s education, and mother’s education. Individuals can be either male or female, live in an urban or rural community, and have been born in one of three regions. Each individual’s father’s and mother’s education fall into one of three possible categories: illiterate, literate, or completed primary or above.
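With types defined by categorical circumstances such as these, the ex-ante ratio in equation (2) with the MLD index reduces to a group-mean computation. The following is a minimal Python sketch, not the authors' code: the function names `mld` and `ioo_ratio` and the toy data are hypothetical and introduced only for illustration, and outcomes are assumed to be strictly positive.

```python
import math
from collections import defaultdict


def mld(values):
    """Mean log deviation: (1/N) * sum(ln(mean / y_i)); values must be positive."""
    mean = sum(values) / len(values)
    return sum(math.log(mean / v) for v in values) / len(values)


def ioo_ratio(outcomes, types):
    """MLD of the smoothed distribution divided by MLD of the observed one."""
    by_type = defaultdict(list)
    for y, t in zip(outcomes, types):
        by_type[t].append(y)
    type_mean = {t: sum(ys) / len(ys) for t, ys in by_type.items()}
    # Smoothed distribution: every outcome is replaced by the mean of its type.
    smoothed = [type_mean[t] for t in types]
    return mld(smoothed) / mld(outcomes)


# Hypothetical toy data: (female, urban) pairs define four types.
incomes = [80.0, 90.0, 70.0, 80.0, 95.0, 105.0, 85.0, 95.0]
female = [0, 0, 1, 1, 0, 0, 1, 1]
urban = [0, 0, 0, 0, 1, 1, 1, 1]

ioo_all = ioo_ratio(incomes, list(zip(female, urban)))  # both circumstances observed
ioo_urban = ioo_ratio(incomes, urban)                   # gender omitted
```

On this toy sample, omitting gender coarsens the partition into types and the ratio falls, which illustrates in miniature why unobserved circumstances push the estimate downward.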
Taken together, these five circumstances create (2x2x3x3x3=) 108 mutually exclusive groups or types to which each pseudo-individual belongs. To keep things simple, we assign approximately the same number of individuals to each group.11

10 Note that in the smoothed distribution, $y_i$ will be replaced by the type mean $\mu_i^k$.
11 Each group has approximately 0.92% of the population.

To define the characteristics of each individual in the pseudo-population, we apply the following rules. A pseudo-individual is female with probability 0.52. To assign individuals to other circumstances we considered an “ordering” of groups in which group 1 contains the most disadvantaged individuals and group 108 contains the most favored individuals. Thus, for example, an individual in group $g$ is born to an illiterate father with probability $0.0092 \times (109 - g)$, where $g = 1, 2, \ldots, 108$. This pseudo-individual is assigned with probability $(1 - 0.0092 \times (109 - g)) \times 2/3$ to a father who reads and writes, and with probability $(1 - 0.0092 \times (109 - g)) \times 1/3$ to a father with elementary school or above. A similar assignment is carried out for mother’s education. By assigning a weight of 1/3 to category 3, the assignment rule aims to resemble a realistic situation in which a smaller portion of the whole population has completed elementary school in comparison to the portion that can only read and write.

Regarding the region of birth, a pseudo-individual is assigned to region 1 (the most advantageous, like the capital region of a country) with probability $(1 - 0.0092 \times (109 - g)) \times 1/3$, to region 2 with probability $(1 - 0.0092 \times (109 - g)) \times 2/3$, and to region 3 (the least advantageous) with probability $0.0092 \times (109 - g)$. In this way, pseudo-individuals with larger probabilities of living in the least advantageous region also face the largest probability of having illiterate parents.
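The assignment rules for parents' education and region can be sketched in a few lines of Python. This is an illustrative reading rather than the authors' implementation: it assumes the probability of the least favorable category is 0.0092 × (109 − g) for a group index g running from 1 to 108, and the names `SLOPE`, `N_GROUPS`, `p_disadvantaged`, `parent_education_probs` and `region_probs` are hypothetical.

```python
SLOPE = 0.0092    # per-group step in the probability, as quoted in the text
N_GROUPS = 108    # number of types


def p_disadvantaged(g):
    """Probability of the least favorable category for group g (1 = most disadvantaged)."""
    return SLOPE * (N_GROUPS + 1 - g)


def parent_education_probs(g):
    """(illiterate, reads and writes, elementary or above) for one parent."""
    p = p_disadvantaged(g)
    return p, (1 - p) * 2 / 3, (1 - p) * 1 / 3


def region_probs(g):
    """(region 1, region 2, region 3), with region 3 the least advantageous."""
    p = p_disadvantaged(g)
    return (1 - p) * 1 / 3, (1 - p) * 2 / 3, p


most_disadvantaged = parent_education_probs(1)   # illiterate with probability 0.0092 * 108
most_favored = parent_education_probs(108)       # illiterate with probability 0.0092
```

Under this reading the three category probabilities sum to one for every group, and the illiteracy probability declines linearly from the most disadvantaged group to the most favored one.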
Finally, assignment to an urban setting follows a reasoning similar to the one used for assignment to regions. The probability that an individual is born in an urban context is $(1 - 0.0092 \times (109 - g)) \times 2/3$. This assignment implies that individuals in the first regions are most likely urban.

To obtain a general idea of the composition of our pseudo-population, Table 1 reports the percentage of pseudo-individuals by circumstances. Percentages in the main diagonal refer to the entire pseudo-population (i.e. 200,000 observations). Percentages outside the diagonal refer to the individuals with the circumstance listed in the corresponding column. For instance, the elements in the diagonal report that 52% of pseudo-individuals are female, 50% live in an urban setting, 17% live in region 1, and so on. If we follow column 1, we find that just under 50% of females live in an urban context, and that 33.29% have a father who can read and write. To learn the percentage of pseudo-individuals whose mother and father are both illiterate, we look up the corresponding intersection in Table 1: 66.64%. Similar calculations can be carried out to learn the number of individuals with a given set of circumstances.

Using the pseudo-population distribution, we define the income generating process of individuals as follows:

$$y_i = 85 - 7\,female_i + 3\,urban_i + 9\,region1_i - 3\,region2_i - 4\,region3_i + 0\,father\_illiterate_i + 1\,father\_literate_i + 2\,father\_primary_i + 0\,mother\_illiterate_i + 3\,mother\_literate_i + 4\,mother\_primary_i \qquad (3)$$

According to equation (3), average income in the pseudo-population is 85 units in the omitted category. Being female is associated with lower incomes (8.2% lower). Being born in an urban context is a favoring circumstance, increasing average income by 3.5%. Region 1 is the most advantageous region, increasing income by 10.5%. In contrast, pseudo-individuals born in region 3 receive income 4.7% lower than average income, and pseudo-individuals born in region 2 receive 3.5% lower income.
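The percentages quoted above can be checked against the coefficients of equation (3) with a short Python sketch. It is an illustration under stated assumptions: the coefficient magnitudes (7, 3, 9, 3, 4 and the education values 0, 1, 2 and 0, 3, 4) are read from equation (3), while the signs, the variable labels and the function name `baseline_income` are assumptions inferred from the surrounding discussion, and no random component is included.

```python
BASE = 85  # income in the reference category, before any adjustment

# Assumed signs: region 1 raises income, regions 2 and 3 lower it.
REGION = {1: 9, 2: -3, 3: -4}
FATHER = {"illiterate": 0, "literate": 1, "primary": 2}
MOTHER = {"illiterate": 0, "literate": 3, "primary": 4}


def baseline_income(female, urban, region, father, mother):
    """Deterministic part of income for one pseudo-individual."""
    return (BASE - 7 * female + 3 * urban
            + REGION[region] + FATHER[father] + MOTHER[mother])


# Least and most favored combinations of circumstances.
lowest = baseline_income(1, 0, 3, "illiterate", "illiterate")   # 85 - 7 - 4
highest = baseline_income(0, 1, 1, "primary", "primary")        # 85 + 3 + 9 + 2 + 4

# Effects expressed as a percentage of the base income of 85.
female_pct = round(100 * 7 / BASE, 1)
urban_pct = round(100 * 3 / BASE, 1)
region3_pct = round(100 * 4 / BASE, 1)
```

Expressed relative to the base of 85, the gender, urban and region 3 coefficients reproduce the 8.2%, 3.5% and 4.7% effects mentioned in the text.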
In terms of parents’ educational attainment, equation (3) reflects a monotonic increase in income with parents’ education. Resembling typical findings from the empirical literature, the impacts of parents’ education in equation (3) differ by parent. An empirical regularity with respect to whether father’s or mother’s education has the larger impact has not been determined, but differences have been documented in case studies (see Dickson et al., 2013). In this simulation, the variable labeled mother’s education is associated with larger improvements in income than the variable labeled father’s education.

Table 2 presents average income by circumstance across the pseudo-population. Differences across circumstances are evident and consistent with equation (3). For instance, the largest variation is observed across regions (from 80.93 in region 3 to 95.42 in region 1). This distribution of income determines our baseline scenario.

To explore how variations in total inequality impact our simulations, we also simulate an inequality enhanced scenario. The inequality enhanced scenario generates a pseudo-population whose income-generating process resembles a context in which, in addition to circumstances, a second inequality inducing process is at play. The relative position of an individual at birth can further affect her income either positively or negatively. If an individual is born into a high income group, the individual’s network tends to increase her income. If born in a low income environment (rural setting and the most disadvantageous region), her expected average income further decreases. This stylized description of the negative effect could be consistent with the presence of poverty traps, particularly geographical poverty traps. According to these theories, being born in a poverty context reinforces poverty because it permanently limits the decisions available to the poor individual (Kraay and McKenzie, 2014).
The inequality inducing adjustments are introduced by adding non-zero-mean, normally distributed terms to the data-generating process described in equation (3), according to the following rule.12 Individuals who belong to the bottom 1% experience an income decrease equal to the absolute value of a normally distributed random draw with mean 15 and standard deviation 10. The income of individuals between the bottom 1% and the 10th percentile is decreased by a random draw from a normal distribution with mean and standard deviation of 5. If pseudo-individuals are between the 10th and 50th percentiles, their income is changed (either increased or reduced) by a normally distributed draw with mean either 5 (10th to 25th) or 15 (25th to 50th). The standard deviations of these draws are both 10. Individuals above the 50th percentile increase their income. Individuals between the 50th and 75th percentiles experience an income increase equal to the absolute value of a draw from a normal distribution with mean and standard deviation of 5. For individuals between the 75th and 90th percentiles, the draw is taken from a normal distribution with mean 15 and standard deviation 10; for individuals between the 90th and 95th percentiles, the mean is 20 and the standard deviation is 10; between the 95th and 99th percentiles, the mean is 40 and the standard deviation is 5; and finally, for individuals in the top 1%, the draw is obtained from a normal distribution with mean 60 and standard deviation 2. Overall, the final distribution of incomes yields a higher inequality than under the first income-generating process.

12 All percentiles used are based on the original income distribution.

ii. Experimental design

The experimental section of the Monte Carlo simulations uses as its basis the two income-generating processes described above.
For each of these, a series of scenarios is developed to study the potential downward bias found in empirical applications of the IOO estimate. Scenarios are created along three possible dimensions: the population effectively observed in the data, the number of observed circumstances, and the true share of inequality of opportunity. Three observed population scenarios are analyzed: a) the entire pseudo-population is observed, b) the top 1% of the income distribution is unobserved, and c) the top 5% of the income distribution is unobserved. Six observed circumstances scenarios are studied: all five circumstances are observed in one scenario, and in each of the remaining five we exclude one circumstance at a time.

Variation in the IOO share is created by adding a zero-mean, normally distributed error term to the income-generating process (baseline or inequality enhanced). By varying the dispersion of the error added, we obtain different underlying “true” IOO values. An error term with standard deviation of 1 produces an IOO of 0.978 in the first data-generating process, and an IOO of 0.326 in the second data-generating process. Standard deviations of 5, 7, 10, 15, and 20 produce IOO values of 0.635, 0.468, 0.299, 0.156, and 0.091, respectively, in the first data-generating process. The same standard deviations produce IOO shares of 0.272, 0.233, 0.176, 0.109, and 0.069 in the second data-generating process.

Table 3 describes true inequality measures under each data-generating/error distribution scenario when the entire distribution is observed. Scenarios have been labeled as baseline (panel A) and inequality enhanced (panel B) to highlight the assumptions behind each data-generating process. In the first panel of Table 3, the true IOO ranges from 0.978 to 0.104, while the corresponding Gini Index ranges from 0.081 to 0.252. An increase in the IOO share is associated with a decrease in the Gini Index.
While the inverse relationship between IOO and the Gini index may appear counterintuitive, it follows from the mechanical increase in the relevance of the random component, or “luck”, when adding zero-mean terms with larger standard deviations. Incidentally, column (4), labeled “All”, shows the R-squared from an OLS regression with income as the dependent variable and all relevant circumstance categories as regressors. The R-squared values obtained when including only one circumstance at a time are also presented in Table 3 (columns [5]-[9]). These R-squared values show how much variation each variable can explain by itself. Region can explain up to 0.564 of the variation when the IO ratio is 0.978. As we will see below, this feature becomes relevant when analyzing the magnitude of the downward bias.

As shown in panel B of Table 3, the addition of zero-mean errors results in larger Gini index measures, ranging from 0.17 to 0.34. In this case, for a given distribution of luck, IOO values are smaller in the inequality enhanced scenario. For instance, when the error is distributed with a standard deviation of 10, the true IOO is 0.299 in the baseline scenario (panel A) and only 0.176 in the inequality enhanced scenario (panel B).

Figure 1 illustrates the distribution of income under each data-generating/luck scenario. Two features are worth highlighting. First, with the exception of one distribution, all of them resemble realistic distributions. The exception corresponds to the baseline scenario in which the error component is distributed with unit standard deviation. In this case, the distribution is multimodal, reflecting the fact that the random component (or one’s “luck”) is relatively small in this scenario and groups of individuals are clearly identifiable when considering their circumstances (the R-squared under this scenario is 0.979). A second feature of Figure 1 is the increase in the variation of income under the inequality enhanced scenarios.
The larger upper tails under such scenarios are evidence of the larger variation induced by our modeled “poverty traps/networking” effects. This increase in variation is what allows for larger Gini indexes.

Taken together, the experimental design generates 108 study cases for each data-generating process.13 The combination of observed-population scenarios and observed-circumstances scenarios generates a range of cases that go from a close-to-ideal case to arguably more realistic ones. The ideal case corresponds to the scenario in which researchers have access to all relevant information: the entire pseudo-population is observed and all five circumstances are observed. A layer of realism can then be added by excluding one circumstance at a time. In these scenarios, researchers observe the entire pseudo-population but cannot observe all relevant circumstances. Another layer of realism is added by truncating the observed population. Thus a more realistic scenario may correspond to the case in which researchers observe neither the individuals at the top of the income distribution nor at least one circumstance. Finally, we also explore whether differences in the underlying (or “true”) IOO affect the expected bias of IOO estimates.

13 There are three distinct shares of “observed” data (all, truncated top 1%, and truncated top 5%), six sets of observed circumstances, and six error distributions. Together these yield a total of 3 x 6 x 6 = 108 cases.

Results

Monte Carlo simulations

Table 4 reports results for the baseline data-generating scenario. The table shows, in percentage terms, differences between the true IOO (which we defined implicitly based on the error distribution) and the median IOO estimate from 1,000 simulations.14 Each panel of Table 4 reports differences for one IOO level, i.e. 0.978, 0.635, 0.468, 0.299, 0.156, and 0.091. The first row of each panel reports differences obtained when the entire pseudo-population is observed. The second row reports differences obtained when observations above the 99th percentile are excluded. The third row reports the results for the case when the top 5% is not observed. We also present results when the number of circumstances available to the researcher varies: column (1) shows the case when all variables are observed, while subsequent columns present results when one of the relevant circumstances (noted in the column header) is not observed in the data.

14 Remember that for each simulation we calculate the IOO. The median of all 1,000 IOO estimates is used to calculate the difference shown in the table of results.

A first feature to underscore from Table 4 is that as the true IOO decreases, the downward bias associated with a portion of the population being unobserved increases. This increase is monotonic for the case in which all circumstances are observed but not necessarily monotonic under scenarios in which one circumstance is not observed. Figure 2 illustrates the pattern just described. The two lines closest to the horizontal axis illustrate how the downward bias increases when all circumstances are observed but a portion of the population is not. When observations above the 99th percentile are not observed, the downward bias is close to 0% when the true IO share is 0.978 and closer to 10% when the true IOO is relatively low at 0.091. Under the scenario in which the top 5% is unobserved, the downward bias appears to be very small (close to 0%) when the true IOO is 0.978. However, the bias is just under 20% when the IOO is 0.47 and reaches 26% when the true IOO is 0.091. That is, missing information from the most favored population appears to be a concern at IOO levels of 0.5 and below.
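The effect of truncating the top of the distribution when all circumstances are observed can be sketched in Python as follows. This is an illustrative single-draw simplification, not the 1,000-replication design behind Table 4: the circumstances, coefficients, population size, and function names (truncation_bias, bias_table) are assumptions made here for illustration and do not reproduce equation (3).

```python
import numpy as np

def mld(y):
    """Mean log deviation of a strictly positive income vector."""
    y = np.asarray(y, dtype=float)
    return float(np.log(y.mean()) - np.log(y).mean())

def ioo_share(y, type_id):
    """MLD of type means (smoothed distribution) divided by total MLD."""
    _, idx = np.unique(type_id, return_inverse=True)
    idx = idx.ravel()
    type_mean = np.bincount(idx, weights=y) / np.bincount(idx)
    return mld(type_mean[idx]) / mld(y)

def truncation_bias(y, type_id, top_share):
    """Percent gap between the IOO of the truncated data and the full-data IOO."""
    keep = y <= np.quantile(y, 1.0 - top_share)  # drop the top incomes
    truncated = ioo_share(y[keep], type_id[keep])
    return 100.0 * (truncated / ioo_share(y, type_id) - 1.0)

def bias_table(sigmas=(1, 10), shares=(0.01, 0.05), n=50_000, seed=1):
    """Map (sigma, top_share) to the percent bias from truncation."""
    rng = np.random.default_rng(seed)
    female = rng.integers(0, 2, n)
    urban = rng.integers(0, 2, n)
    region = rng.integers(0, 3, n)
    # Hypothetical coefficients chosen for illustration, NOT equation (3).
    base = 80.0 - 7.0 * female + 6.0 * urban + np.array([12.0, 3.0, 0.0])[region]
    type_id = female * 6 + urban * 3 + region
    out = {}
    for sigma in sigmas:
        y = base + rng.normal(0.0, sigma, n)
        for share in shares:
            out[(sigma, share)] = truncation_bias(y, type_id, share)
    return out

for (sigma, share), bias in bias_table().items():
    print(f"sd={sigma:>2}  top {share:.0%} dropped  bias={bias:6.2f}%")
```

In this sketch a small error standard deviation corresponds to a high true IOO and a large one to a low true IOO, so comparing the two rows for a given truncation share mirrors the comparison across panels of Table 4.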
The pattern of larger truncation bias at lower IOO levels seems intuitive: if we observe all relevant circumstances, and these circumstances explain a good deal of the variation in income, missing 1% or even 5% of the top incomes does not create a large bias. As other factors explain more of the variation in incomes and circumstances explain a lower share, more information is lost when we miss some portions of the population.

A second feature of Table 4 is that the downward bias increases when missing information from a portion of the population interacts with not observing a circumstance. An example of this is illustrated in Figure 2, where we plot lines referring to the scenario in which the gender characteristic is not observed. When this information is not observed but the entire population is observed, the downward bias remains practically constant at around 28% across different levels of the true IOO. However, missing 1% of the most favored population is enough to increase the downward bias to 36% under the scenario in which the true IOO is 0.091. Missing the top 5% of incomes increases the bias to around 50% even when the IOO is at a mid-level (0.47). Similar patterns can be inferred when studying the downward bias that originates in missing other circumstances, as reported in Table 4. In our simulated pseudo-population, the circumstance explaining the most variation is region (see column [7] of Table 3). Accordingly, missing this circumstance produces the largest downward bias, starting at 39% when the entire population is observed and the true IOO is 0.978, and reaching a maximum of 53% when the 5% most favored population is unobserved and the IOO is at its lowest (0.091).

A third feature to highlight from Table 4 is that the magnitude of the downward bias seems to increase quickly when the top 5% is missing, even at “medium range” true IOO levels of 0.64 and 0.47. For instance, not observing the father’s (mother’s) education produces a bias of around 15% (19%) when the true IOO is 0.635.
Not observing either of these variables leads to a bias of approximately 20% when the IOO is still as high as 0.468. The bias is notable as it comes from variables that explain less than 10% of the variation in income.15

Table 5 shows the results for the inequality enhanced scenario.16 Focusing on the cases where circumstances explain more than 25% of the variation in income (i.e. the top two panels, with true IOO higher than 0.25), we see that the biases tend to be larger than in the baseline scenario. Not observing the top 5% of the population leads to a 27% downward bias even if we observe and take into account all the relevant circumstances. Larger biases are also present when we fail to observe at least one of the circumstances. Not observing mother’s education produces a 14% bias when the top 1% is missing, and over 33% when the top 5% is unobserved. The largest biases occur when the regional circumstance is not included in the calculation. Even when we observe the full population, excluding the region of birth circumstance yields a median IOO estimate that is 40% lower than the true IOO when the true IOO is 0.326 or 0.272. Missing the top incomes further exacerbates the problem.

15 See columns (8) and (9) of Table 3.

16 In this scenario, lower income individuals had their incomes reduced randomly while higher income individuals experienced a random increase in their income. Under this scenario we have higher inequality (as measured by the Gini index), but individuals’ circumstances explain on average a lower share of overall inequality.

Results from the Monte Carlo simulations can be summarized as follows: i) loss of information about the most favored population produces negligible downward bias when IOO is very high (e.g. 0.978); ii) the magnitude of the downward bias, however, increases when the true IOO decreases, e.g. representing 22% of the true value when the true IOO is 0.299; iii) loss of information due to not observing a relevant circumstance may or may not be negligible: the magnitude of the downward bias depends on the variation the circumstance can explain by itself and on whether the observed circumstances are correlated with the unobserved circumstance; iv) in contrast to the effect of not observing the most favored population, the magnitude of the downward bias originating in not observing a circumstance does not vary significantly across true IOO values; and v) the interaction between not observing the most favored population and not observing a circumstance increases the magnitude of the downward bias.

Simulating with “real” parameters

In the previous section we made an effort to provide scenarios that resemble real life: income that is related to the circumstances of the individual to varying degrees, in the context of varying levels of overall inequality. Here we take an additional step and reproduce the simulation exercise above under conditions that follow income differentials obtained from an actual dataset, as a way to establish whether the previous findings carry over to a more ‘realistic’ context. We carry out Monte Carlo simulations following similar reasoning as in the previous section, but the baseline data-generating process follows equation (4).
[Equation (4): the regressor labels did not survive extraction. Recoverable terms, in order: a leading 5, followed by the multiplicative coefficients 0.35, 0.15, 2, 0.10, 0.17, 0, 0.05, 0.12, 0, 0.13, and 0.18.]   (4)

where the coefficients in equation (4) closely resemble those obtained from fitting an OLS regression on the natural log of wages observed in the 2012 Egypt Labor Force Survey. Assuming these are all the relevant circumstances, we can add a normally distributed error term to equation (4) and determine what the IOO would be in this hypothetical country. As before, we construct a baseline scenario following equation (4) and an inequality enhanced scenario in which lower-income individuals are randomly assigned an even lower income and higher-income individuals can get even higher incomes. Table 6 reports the parameterization of each error term. When adding an error term with a standard deviation of 0.1, the corresponding IOO is 0.984. When adding error terms with standard deviations of 0.3, 0.7, 1.0, 1.2, and 1.5, the corresponding IOO values are 0.869, 0.543, 0.358, 0.270, and 0.183, respectively.

Table 7 and Table 8 report the magnitude of the downward bias under the baseline and inequality enhanced scenarios. Using as guidance the most recent estimates of the Gini index for Egypt (around 0.30), we focus on the cases where the true IOO is 0.270 in the baseline scenario (next-to-last panel in Table 7) and 0.267 in the inequality enhanced scenario (next-to-last panel in Table 8). If we believe that such values are reasonably accurate for the Egyptian case, Table 7 and Table 8 indicate that we may still find ourselves greatly underestimating IOO when we miss the top 1% of the distribution: by around 9% in the case where we have all the relevant variables, and by a substantial 80% when the region of birth is not taken into account.17 That is, if the true IOO is around 0.30, not controlling for region as a relevant circumstance would lead to an (under-)estimated IOO of 0.06, since 0.30 x (1 - 0.80) = 0.06.

Robustness checks

Two additional sets of Monte Carlo simulations are carried out to check the robustness of our results.
So far, in both the baseline and inequality enhanced scenarios, IOO decreases as the Gini index increases (see Table 3 and Table 6). This is a mechanical consequence of increasing total inequality through an increase in the variation explained by the error term. Instead, total inequality can be increased by increasing the variation due to a specific variable in equations (3) and (4). In this way, in contrast to what happens in the scenarios analyzed before, IOO increases at the same time as the Gini index increases. The first set of additional Monte Carlo simulations studies the downward bias of IOO estimates by varying the coefficient of the dichotomous variable urban (u) among six values: 0, 10, 15, 25, 35, and 50. We label these scenarios as increase-in-IOO scenarios. Table 9 describes the data-generating process and corresponding inequality measures. All six scenarios include a normally distributed error term with mean zero and standard deviation of 10. Figure 3 illustrates the distribution of simulated income under the studied scenarios. Table 10 reports the downward bias under scenarios excluding top incomes and circumstances. Results are similar to those reported in Tables 4, 5, 7 and 8.

17 If we run a simple OLS regression of income on a series of regional dummies, the R2 is about 80%.

A second set of simulations modifies the assumption that the same number of pseudo-individuals fall in each type. Instead, individuals are allocated to types according to a normal distribution that assigns a larger number of individuals to middle-income types and a relatively small number of individuals to low-income and high-income types. IOO estimates are computed under the baseline, enhanced and increase-in-IOO scenarios. Table 11 reports the data-generating processes with their corresponding inequality measures.
Table 12, Table 13, and Table 14 report the downward bias under scenarios excluding top incomes and circumstances. Results are similar, with one nuance: the impact of not observing the most favored population is larger than in the case in which individuals are allocated uniformly across types.

The relationship between IOO and R2

As noted above, one of the main takeaways of the Monte Carlo exercise is that there is a positive correlation between the amount of variation explained by a given circumstance and the bias that an empirical IOO estimate would suffer with respect to the true value of IOO. A natural question that then arises is how the combined variation explained by the circumstances observed in the data correlates with the IOO estimate that we can expect to obtain. To explore this question, we compile information from this study as well as previous studies that have calculated the IOO in other countries. Ferreira and Gignoux (2011) produced a series of IOO estimates for countries in Latin America, focusing on two related outcomes: labor earnings and household income. In turn, World Bank (2015) produced IOO estimates for labor earnings for a few countries in the MENA region. Finally, we use the data generated in our Monte Carlo simulations and show how the IOO estimate varies with the R2 of a simple OLS regression of the outcome of interest on all the circumstances used in the estimation of IOO.

Figure 4 presents the results of this compilation and shows a clear pattern. There is a strong positive correlation between the estimated IOO and the amount of variation in earnings (wages, or household income) explained by the combination of circumstances (measured by the R2): plotting the IOO estimates against the corresponding R2 places almost all the data points along the 45-degree line. This result could imply that in empirical applications, the IOO estimate can be roughly (and quickly) approximated using simple econometric techniques.
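The comparison between the IOO share and the R2 of a simple OLS regression can be sketched in Python as follows. This is a hedged illustration on simulated data, not a reproduction of Figure 4: the circumstances, coefficients, and function names (r_squared, compare) are hypothetical, and the regression uses numpy's least-squares solver on circumstance dummies.

```python
import numpy as np

def mld(y):
    """Mean log deviation of a strictly positive income vector."""
    y = np.asarray(y, dtype=float)
    return float(np.log(y.mean()) - np.log(y).mean())

def ioo_share(y, type_id):
    """MLD of type means (smoothed distribution) divided by total MLD."""
    _, idx = np.unique(type_id, return_inverse=True)
    idx = idx.ravel()
    type_mean = np.bincount(idx, weights=y) / np.bincount(idx)
    return mld(type_mean[idx]) / mld(y)

def r_squared(y, X):
    """R-squared of an OLS fit of y on X (X includes a constant column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2))

def compare(sigmas=(1, 5, 10), n=50_000, seed=2):
    """Return (sigma, IOO, R2) for each error standard deviation."""
    rng = np.random.default_rng(seed)
    female = rng.integers(0, 2, n)
    urban = rng.integers(0, 2, n)
    region = rng.integers(0, 3, n)
    # Hypothetical coefficients chosen for illustration only.
    base = 80.0 - 7.0 * female + 6.0 * urban + np.array([12.0, 3.0, 0.0])[region]
    type_id = female * 6 + urban * 3 + region
    # Constant plus circumstance dummies (region 2 is the omitted category).
    X = np.column_stack(
        [np.ones(n), female, urban, region == 0, region == 1]
    ).astype(float)
    rows = []
    for sigma in sigmas:
        y = base + rng.normal(0.0, sigma, n)
        rows.append((sigma, ioo_share(y, type_id), r_squared(y, X)))
    return rows

for sigma, ioo, r2 in compare():
    print(f"sd={sigma:>2}  IOO={ioo:.3f}  R2={r2:.3f}")
```

In this simulated setting the two numbers track each other closely at every error dispersion, which is the sense in which the R2 can serve as a quick approximation to the IOO estimate.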
More importantly, the close link between IOO and R2 highlights the steep data requirements of an exercise such as the IOO: a higher IOO estimate will be reached only to the extent that the variables found in a given survey can explain a larger share of the variation in the outcome of interest.

Conclusions

Results from the Monte Carlo simulations carried out in this paper are straightforward: i) loss of information about the most favored population produces negligible downward bias when IOO is large (e.g. 0.978); ii) the magnitude of the downward bias, however, increases when the true IOO decreases, e.g. representing 22% of the true value when the true IOO is 0.299; iii) loss of information due to not observing a relevant circumstance may or may not be negligible: the magnitude of the downward bias depends on the variation the circumstance can explain by itself; iv) in contrast to the effect of not observing the most favored population, the magnitude of the downward bias originating in not observing a circumstance does not vary significantly across true IOO values; and v) the interaction between not observing the most favored population and not observing a circumstance increases the magnitude of the downward bias, i.e. there are interactive effects between the two sources of downward bias. These results imply that strategies proposed to handle the lack of information about all relevant circumstances (e.g. Niehues and Peichl, 2012) may not fully take into account the downward bias originating in the lack of information about top income populations. Importantly, when pseudo-individuals are not allocated uniformly across types, loss of information about the most favored population has a larger impact than in the scenario in which they are allocated uniformly. This is relevant because a non-uniform distribution is more likely to be observed in real-world applications.
As a by-product of the Monte Carlo simulations carried out in this paper, we are able to observe a strong positive correlation between estimated IOO and the amount of variation in the outcome variable explained by the combination of circumstances (measured by the R2). This association holds both for estimates reported in published studies and under a variety of Monte Carlo scenarios carried out in this paper. This result suggests that in empirical applications, the IOO estimate can be roughly (and quickly) approximated using simple econometric techniques. More importantly, this highlights the steep data requirements of an exercise such as the IOO: a higher IOO estimate will be obtained only to the extent that the variables found in a given survey can explain a larger share of the variation in the outcome of interest.

References

Bourguignon, François, Francisco HG Ferreira, and Marta Menendez. "Inequality of opportunity in Brazil." Review of Income and Wealth 53, no. 4 (2007): 585-618.
Chen, Alice, Emily Oster, and Heidi Williams. "Why is infant mortality higher in the US than in Europe" (2014). NBER Working paper 20525. Available at http://www.nber.org/papers/w20525.
Dickson, Matthew, Paul Gregg, and Harriet Robinson. "Early, late or never? When does parental education impact child outcomes?" IZA Discussion Paper No. 7123 (2013). Available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2203273
Dworkin, Ronald. "What is equality? Part 1: Equality of welfare." Philosophy & Public Affairs (1981): 185-246.
Dworkin, Ronald. "What is equality? Part 2: Equality of resources." Philosophy & Public Affairs (1981): 283-345.
Elbers, Chris, Peter Lanjouw, Johan A. Mistiaen, and Berk Ozler. "Re-interpreting Sub-group Inequality Decompositions." World Bank Policy Research Working Paper No. 3687 (2005). Available at SSRN: http://ssrn.com/abstract=786626
Ferreira, Francisco HG, and Jérémie Gignoux. "The measurement of inequality of opportunity: Theory and an application to Latin America." Review of Income and Wealth 57, no. 4 (2011): 622-657.
Ferreira, Francisco HG, and Vito Peragine. "Equality of Opportunity: Theory and evidence." World Bank Policy Research Working Paper No. 7217 (2015). Available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2584375.
Friedman, Milton, and Leonard J. Savage. "The utility analysis of choices involving risk." The Journal of Political Economy (1948): 279-304.
Friedman, Milton. "Choice, chance, and the personal distribution of income." The Journal of Political Economy (1953): 277-290.
Hlasny, Vladimir, and Paolo Verme. "Top incomes and the measurement of inequality in Egypt." World Bank Policy Research Working Paper No. 6557. Available at http://elibrary.worldbank.org/doi/abs/10.1596/1813-9450-6557.
Kanbur, Ravi. "How Useful is Inequality of Opportunity as a Policy Construct?" (July 1, 2014). World Bank Policy Research Working Paper No. 6980. Available at SSRN: http://ssrn.com/abstract=2475067
Korinek, Anton, Johan A. Mistiaen, and Martin Ravallion. "Survey nonresponse and the distribution of income." The Journal of Economic Inequality 4, no. 1 (2006): 33-55.
Kraay, Aart, and David McKenzie. "Do poverty traps exist? Assessing the evidence." Journal of Economic Perspectives 28 (2014): 127-148.
Marrero, Gustavo A., and Juan G. Rodríguez. "Inequality of opportunity and growth." Journal of Development Economics 104 (2013): 107-122.
Niehues, Judith, and Andreas Peichl. "Bounds of Unfair Inequality of Opportunity: Theory and Evidence for Germany and the US" (May 15, 2012). CESifo Working Paper Series No. 3815. Available at SSRN: http://ssrn.com/abstract=2060014
OXFAM. "Even it up: Time to end extreme inequality" (2014). Available at http://policy-practice.oxfam.org.uk/publications/even-it-up-time-to-end-extreme-inequality-333012.
Paes de Barros, Ricardo, Francisco H.G. Ferreira, José R. Molinas Vega, and Jaime Saavedra Chanduvi. "Measuring Inequality of Opportunities in Latin America and the Caribbean". World Bank (2009).
Peragine, Vito. "Ranking Income Distributions According to Equality of Opportunity." Journal of Economic Inequality 2 (2004): 11-30.
Pignataro, Giuseppe. "Equality of opportunity: Policy and measurement paradigms." Journal of Economic Surveys 26, no. 5 (2012): 800-834.
Piketty, Thomas. "Capital in the 21st Century." Cambridge: Harvard Uni (2014).
Roemer, John E., and Alain Trannoy. "Equality of opportunity." (2013). Available at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2345357.
Ramos, Xavier, and Dirk Van de Gaer. "Empirical Approaches to Inequality of Opportunity: Principles, Measures, and Evidence". IZA Discussion Paper No. 6672 (2012). Available at SSRN: http://ssrn.com/abstract=2096802
Roemer, John E. Equality of opportunity. Harvard University Press (1998).
Tiwari, Sailesh, Gabriel Lara Ibarra, and Ambar Narayan. "How unfair is the inequality of wage earnings in Russia? Estimates from panel data" (2015). Policy Research Working Paper No. WPS 7291. Washington, D.C.: World Bank Group.
Sen, Amartya. Development as freedom. Oxford University Press (1999).
Skoufias, Emmanuel, and Sergio Olivieri. "Inequality and the distribution of gains from growth in Thailand between 2000 and 2009" (2013), mimeo.
Velez, Carlos E., Sherine Al-Shawarby, and Heba El-laithy. "Equality of Opportunity for Children in Egypt, 2000-2009: Achievements and Challenges" (August 1, 2012). World Bank Policy Research Working Paper No. 6159. Available at SSRN: http://ssrn.com/abstract=2127059
Wendelspiess Chávez Juárez, Florian. "Measuring Inequality of Opportunity with Latent Variables." Journal of Human Development and Capabilities (2015): 1-14.

Table 1.
Percentage of pseudo-individuals by circumstance

                                             (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)    (10)    (11)
(1) Female                                    52
(2) Urban                                  49.90      50
(3) Region 1                               16.59   21.89      17
(4) Region 2                               33.39   44.22       0      33
(5) Region 3                               50.09   33.53       0       0      50
(6) Father is illiterate                   50.15   33.66   33.04   34.15   66.71      50
(7) Father reads and writes                33.29   43.92   42.95   44.48   22.23       0      33
(8) Father completed elementary school     16.62   22.07   21.22   22.29   11.39       0       0      17
(9) Mother is illiterate                   49.96   33.25   32.65   33.69   66.66   66.64   33.41   33.25      50
(10) Mother reads and writes               33.33   43.99   42.70   44.81   22.40   22.47   44.56   42.97       0      33
(11) Mother completed elementary school    16.78   22.40   21.85   22.43   11.28   11.38   50.78   21.78       0       0      17

Notes: Columns (1)-(11) refer to the same circumstances, in the same order, as rows (1)-(11). Percentages in the diagonal refer to the entire pseudo-population (200,000). Non-diagonal percentages refer to the population with the circumstance listed in the column.

Table 2. Average income by circumstance across observed population scenarios

Circumstance                  Average   Std. Dev.     Sample
Male                            87.79        5.79     95,931
Female                          80.82        5.79    104,069
Rural                           81.25        5.64    100,356
Urban                           87.09        6.52     99,644
Region 1                        95.42        4.42     33,051
Region 2                        83.44        4.41     66,614
Region 3                        80.93        4.38    100,335
Father is illiterate            81.83        5.91    100,487
Father is literate              86.19        6.74     66,193
Father completed primary        87.15        6.73     33,320
Mother is illiterate            81.13        5.57     99,994
Mother is literate              86.83        6.44     66,488
Mother completed primary         87.9        6.48     33,518

Notes: Descriptive statistics refer to the 200 thousand pseudo-individuals generated according to equation (3).

Table 3.
Data-generating processes and inequality measures

                   Inequality measures                   R-squared by observed circumstance
Error               Gini Total MLD       IOO       All    Female     Urban    Region  Father's  Mother's
distribution                                                                         education education
                     (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)
Panel A. Baseline scenario
N(0,1)             0.081     0.003     0.978     0.979     0.260     0.183     0.564     0.120     0.199
N(0,5)             0.100     0.005     0.635     0.647     0.172     0.122     0.372     0.080     0.132
N(0,7)             0.116     0.007     0.468     0.483     0.128     0.091     0.279     0.058     0.099
N(0,10)            0.144     0.011     0.299     0.314     0.086     0.058     0.180     0.038     0.064
N(0,15)            0.196     0.020     0.156     0.170     0.045     0.031     0.099     0.019     0.034
N(0,20)            0.252     0.035     0.091     0.104     0.027     0.019     0.059     0.012     0.020
Panel B. Inequality enhanced scenario
N(0,1)             0.165     0.014     0.326     0.354     0.094     0.067     0.206     0.045     0.074
N(0,5)             0.180     0.017     0.272     0.298     0.080     0.056     0.172     0.037     0.062
N(0,7)             0.194     0.020     0.233     0.258     0.068     0.049     0.150     0.032     0.055
N(0,10)            0.219     0.026     0.176     0.199     0.053     0.039     0.116     0.025     0.043
N(0,15)            0.274     0.043     0.109     0.130     0.034     0.024     0.075     0.016     0.027
N(0,20)            0.335     0.069     0.069     0.087     0.021     0.017     0.051     0.012     0.018

Notes: The data-generating process is presented in equation (3). Error terms listed in the first column are randomly added to equation (3) to induce variation in incomes across individuals. R-squared is obtained from an OLS regression using income as the dependent variable and the specified circumstance as regressors. MLD is the mean log deviation; IOO is the ratio of the smoothed distribution MLD and total MLD. All estimates are based on observing the full income distribution (i.e. no truncation).

Table 4.
Difference between true IOO and median estimated IOO in percentage terms: Baseline scenario

                                     Excluded circumstances
Observed                None    Gender     Urban    Region  Father's  Mother's
population                                                 education education
                         (1)       (2)       (3)       (4)       (5)       (6)
True IO share = 0.978
All                     0.00    -27.88     -4.14    -39.82     -1.07     -5.35
Top 1% truncated       -0.14    -29.65     -4.56    -40.91     -1.27     -5.82
Top 5% truncated       -0.82    -37.14     -6.41    -42.76     -2.25     -8.02
True IO share = 0.635
All                     0.00    -27.99     -4.29    -39.83     -1.18     -5.46
Top 1% truncated       -3.25    -32.14     -7.47    -41.66     -4.34     -8.76
Top 5% truncated      -12.81    -45.35    -17.71    -45.07    -14.05    -19.20
True IO share = 0.468
All                     0.00    -27.95     -4.12    -39.77     -1.10     -5.32
Top 1% truncated       -5.03    -33.58     -9.35    -42.03     -6.24    -10.55
Top 5% truncated      -18.01    -48.01    -22.50    -47.24    -19.14    -23.76
True IO share = 0.299
All                     0.00    -28.03     -4.36    -40.01     -1.42     -5.53
Top 1% truncated       -7.15    -32.14     -7.47    -41.66     -4.34     -8.76
Top 5% truncated      -22.55    -45.35    -17.71    -45.07    -14.05    -19.20
True IO share = 0.156
All                     0.00    -28.44     -4.76    -39.99     -1.80     -6.09
Top 1% truncated       -9.28    -36.05    -13.22    -44.20    -10.34    -14.63
Top 5% truncated      -25.45    -49.94    -29.18    -52.60    -26.27    -30.33
True IO share = 0.091
All                     0.00    -28.79     -5.56    -40.51     -2.26     -6.88
Top 1% truncated      -10.42    -36.77    -14.21    -45.31    -11.36    -15.73
Top 5% truncated      -26.63    -49.83    -30.21    -53.74    -28.26    -31.25

Notes: Results based on 1,000 simulations. Baseline scenario refers to the data-generating process described by equation (3).

Table 5.
Difference between true IOO and median estimated IOO in percentage terms: Inequality enhanced scenario

                                     Excluded circumstances
Observed                None    Gender     Urban    Region  Father's  Mother's
population                                                 education education
                         (1)       (2)       (3)       (4)       (5)       (6)
True IO share = 0.326
All                     0.00    -28.28     -4.59    -39.69     -1.28     -5.69
Top 1% truncated       -8.10    -36.19    -12.59    -44.08     -9.18    -13.65
Top 5% truncated      -27.73    -56.20    -32.14    -51.21    -28.73    -33.20
True IO share = 0.272
All                     0.00    -28.16     -4.54    -39.58     -1.17     -5.64
Top 1% truncated       -8.84    -36.80    -13.13    -43.72     -9.85    -14.20
Top 5% truncated      -28.10    -54.67    -32.22    -52.36    -28.99    -33.31
True IO share = 0.232
All                     0.00    -28.15     -4.44    -39.50     -1.02     -5.43
Top 1% truncated       -9.24    -36.91    -13.54    -43.67    -10.19    -14.63
Top 5% truncated      -27.90    -53.61    -31.90    -52.76    -29.00    -32.95
True IO share = 0.176
All                     0.00    -27.91     -4.02    -39.33     -0.84     -5.01
Top 1% truncated       -9.06    -36.55    -13.40    -43.74    -10.21    -14.52
Top 5% truncated      -27.28    -52.02    -31.22    -52.92    -28.42    -32.17
True IO share = 0.109
All                     0.00    -29.13     -5.47    -40.05     -2.27     -6.77
Top 1% truncated      -10.68    -36.91    -14.78    -45.03    -11.75    -16.17
Top 5% truncated      -27.99    -51.22    -31.81    -54.39    -28.82    -32.75
True IO share = 0.069
All                     0.00    -28.55     -5.26    -39.86     -1.53     -6.21
Top 1% truncated      -10.13    -36.80    -14.30    -45.03    -11.23    -15.82
Top 5% truncated      -27.22    -49.95    -30.50    -54.19    -28.65    -31.51

Notes: Results based on 1,000 simulations. Inequality enhanced scenario refers to the data-generating process described by equation (3), modified as explained in the section describing the pseudo-population under analysis.

Table 6.
Data-generating processes and inequality measures, using coefficients resembling 2012 Egypt Labor Force Survey

                   Inequality measures                   R-squared by observed circumstance
Error               Gini Total MLD  IO Ratio       All    Female     Urban    Region  Father's  Mother's
distribution                                                                         education education
                     (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)
Panel A. Baseline scenario
N(0,0.1)           0.165     0.012     0.984     0.987     0.041     0.075     0.923     0.060     0.075
N(0,0.3)           0.174     0.013     0.869     0.890     0.037     0.068     0.833     0.054     0.068
N(0,0.7)           0.212     0.022     0.543     0.600     0.024     0.045     0.562     0.036     0.045
N(0,1.0)           0.252     0.033     0.358     0.422     0.018     0.033     0.393     0.026     0.033
N(0,1.2)           0.282     0.043     0.270     0.336     0.015     0.025     0.314     0.020     0.026
N(0,1.5)           0.332     0.064     0.183     0.246     0.011     0.019     0.230     0.015     0.018
Panel B. Inequality enhanced scenario
N(0,0.1)           0.170     0.013     0.973     0.978     0.041     0.074     0.915     0.059     0.075
N(0,0.3)           0.178     0.014     0.860     0.884     0.037     0.067     0.826     0.053     0.067
N(0,0.7)           0.217     0.023     0.535     0.594     0.024     0.045     0.558     0.036     0.046
N(0,1.0)           0.259     0.035     0.353     0.420     0.017     0.032     0.392     0.025     0.032
N(0,1.2)           0.290     0.046     0.267     0.335     0.014     0.026     0.315     0.021     0.025
N(0,1.5)           0.338     0.067     0.179     0.243     0.010     0.018     0.230     0.015     0.019

Notes: The data-generating process is presented in equation (4). Error terms listed in the first column are randomly added to equation (4) to induce variation in incomes across individuals. R-squared is obtained from an OLS regression using ln(income) as the dependent variable and the circumstances included in equation (4) as explanatory variables. MLD is the mean log deviation; IOO is the ratio of the smoothed distribution MLD and total MLD. All estimates are based on observing the full income distribution (i.e. no truncation).

Table 7.
Difference between true IOO and median estimated IOO in percentage terms, scenario following parameters from 2012 Egypt Labor Force Survey

                                          Excluded circumstances
Observed population         None    Gender     Urban    Region  Father's  Mother's
                                                               education education
                             (1)       (2)       (3)       (4)       (5)       (6)
True IO share = 0.984
  All                       0.00     -5.10     -0.76    -80.15     -0.27     -0.77
  Top 1% truncated         -0.10     -5.47     -0.90    -82.07     -0.38     -0.91
  Top 5% truncated         -0.64     -7.49     -1.68    -86.10     -1.00     -1.69
True IO share = 0.869
  All                       0.00     -5.12     -0.79    -80.15     -0.29     -0.77
  Top 1% truncated         -0.90     -6.22     -1.68    -81.74     -1.17     -1.69
  Top 5% truncated         -5.41    -12.16     -6.42    -85.12     -5.75     -6.44
True IO share = 0.543
  All                       0.00     -5.42     -1.07    -80.17     -0.59     -1.07
  Top 1% truncated         -4.84    -10.16     -5.67    -81.54     -5.21     -5.68
  Top 5% truncated        -22.99    -29.34    -23.91    -84.33    -23.37    -23.87
True IO share = 0.358
  All                       0.00     -5.03     -0.78    -80.14     -0.36     -0.79
  Top 1% truncated         -7.77     -6.22     -1.68    -81.74     -1.17     -1.69
  Top 5% truncated        -32.91    -12.16     -6.42    -85.12     -5.75     -6.44
True IO share = 0.270
  All                       0.00     -4.80     -0.45    -80.04      0.05     -0.54
  Top 1% truncated         -9.29    -14.74    -10.42    -81.74     -9.57    -10.14
  Top 5% truncated        -35.88    -41.14    -36.65    -85.46    -36.22    -36.75
True IO share = 0.183
  All                       0.00     -4.85     -0.38    -80.09      0.00     -0.46
  Top 1% truncated        -11.73    -16.73    -12.06    -81.94    -11.77    -12.40
  Top 5% truncated        -37.34    -42.30    -38.15    -85.92    -37.53    -38.23

Notes: Results based on 1,000 simulations. Baseline scenario refers to the data-generating process described by equation (4).

Table 8.
Difference between true IOO and median estimated IOO in percentage terms, inequality enhanced scenario and following parameters from 2012 Egypt Labor Force Survey

                                          Excluded circumstance
Observed population         None    Gender     Urban    Region  Father's  Mother's
                                                               education education
                             (1)       (2)       (3)       (4)       (5)       (6)
True IO share = 0.973
  All                      -0.01     -5.12     -0.78    -80.11     -0.28     -0.78
  Top 1% truncated         -0.17     -5.55     -0.98    -82.04     -0.45     -0.98
  Top 5% truncated         -1.05     -7.90     -2.10    -86.11     -1.41     -2.11
True IO share = 0.860
  All                       0.00     -5.18     -0.84    -80.11     -0.32     -0.82
  Top 1% truncated         -1.01     -6.36     -1.80    -81.69     -1.28     -1.81
  Top 5% truncated         -5.84    -12.60     -6.87    -85.03     -6.18     -6.88
True IO share = 0.535
  All                       0.00     -4.88     -0.50    -80.01      0.02     -0.48
  Top 1% truncated         -4.32     -9.71     -5.19    -81.40     -4.70     -5.18
  Top 5% truncated        -22.81    -29.17    -23.74    -84.22    -23.15    -23.68
True IO share = 0.353
  All                       0.00     -4.94     -0.63    -80.05     -0.20     -0.63
  Top 1% truncated         -7.66     -6.36     -1.80    -81.69     -1.28     -1.81
  Top 5% truncated        -32.88    -12.60     -6.87    -85.03     -6.18     -6.88
True IO share = 0.267
  All                       0.00     -4.81     -0.42    -79.97      0.00     -0.52
  Top 1% truncated         -9.49    -14.73    -10.29    -81.76     -9.84    -10.45
  Top 5% truncated        -35.83    -41.28    -36.75    -85.54    -36.17    -36.79
True IO share = 0.179
  All                       0.00     -4.22      0.34    -79.81      0.88      0.24
  Top 1% truncated        -10.71    -16.04    -11.57    -81.87    -11.02    -11.69
  Top 5% truncated        -36.96    -41.88    -37.51    -85.75    -37.54    -37.57

Notes: Results based on 1,000 simulations. Inequality enhanced scenario refers to the data-generating process described by equation (4), modified as explained in the section describing the simulation with “real” parameters.

Table 9.
Data-generating processes and inequality measures under increase-in-IOO scenarios

                         Inequality measures              R-squared by observed circumstance
Coefficient in            Gini       MLD       IOO       All    Female     Urban    Region  Father's  Mother's
Urban indicator                                                                            education education
                           (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)
  0                      0.143     0.010     0.271     0.285     0.089     0.014     0.172     0.025     0.046
  10                     0.153     0.012     0.422     0.438     0.068     0.230     0.197     0.069     0.098
  15                     0.163     0.014     0.523     0.540     0.055     0.367     0.201     0.088     0.118
  25                     0.191     0.019     0.680     0.699     0.037     0.586     0.191     0.108     0.136
  35                     0.222     0.026     0.779     0.799     0.024     0.724     0.179     0.117     0.139
  50                     0.268     0.038     0.860     0.880     0.014     0.835     0.164     0.121     0.140

Notes: Data generating process is presented in equation (3). All six scenarios include a normally distributed error term with mean zero and standard deviation of 10. R-squared is obtained from an OLS regression using income as the dependent variable and the specified circumstance as regressors. MLD is the mean log deviation, IOO is the ratio of the smoothed distribution MLD and total MLD. All estimates are based on observing the full income distribution (i.e. no truncation).

Table 10.
Difference between true IOO and median estimated IOO in percentage terms: increase-in-IOO scenarios

                                          Excluded circumstance
Observed population         None    Female     Urban    Region  Father's  Mother's
                                                               education education
                             (1)       (2)       (3)       (4)       (5)       (6)
True IO share = 0.27
  All                       0.00    -33.10     -1.05    -47.31     -2.31     -7.23
  Top 1% truncated         -8.85    -40.75     -8.90    -50.23    -10.14    -15.00
  Top 5% truncated        -25.18    -55.82    -25.24    -56.52    -26.46    -31.16
True IO share = 0.42
  All                       0.00    -15.92    -26.14    -22.40     -0.30     -2.76
  Top 1% truncated         -3.88    -20.23    -30.89    -24.77     -4.56     -7.07
  Top 5% truncated        -14.34    -30.75    -41.44    -31.30    -14.99    -17.55
True IO share = 0.52
  All                       0.00    -11.17    -39.82    -15.32     -0.70     -2.35
  Top 1% truncated         -2.88    -13.84    -43.36    -16.79     -3.34     -5.02
  Top 5% truncated        -10.00    -21.13    -51.31    -21.85    -10.47    -12.18
True IO share = 0.68
  All                       0.00     -5.88    -55.38     -7.83     -0.48     -1.33
  Top 1% truncated         -1.23    -20.23    -30.89    -24.77     -4.56     -7.07
  Top 5% truncated         -4.39    -30.75    -41.44    -31.30    -14.99    -17.55
True IO share = 0.77
  All                       0.00     -3.56    -63.19     -4.63     -0.28     -0.80
  Top 1% truncated         -0.56     -3.95    -64.93     -4.72     -0.69     -1.21
  Top 5% truncated         -2.08     -5.51    -68.96     -5.82     -2.21     -2.74
True IO share = 0.86
  All                       0.00     -1.98    -69.19     -2.51     -0.11     -0.40
  Top 1% truncated         -0.17     -2.11    -70.47     -2.49     -0.25     -0.55
  Top 5% truncated         -0.82     -2.77    -73.52     -2.92     -0.90     -1.19

Notes: Results based on 1,000 simulations. Increase-in-IOO scenarios refer to the data-generating process described by equation (3), modified as described in the robustness checks section.

Table 11.
Data-generating processes and inequality measures under scenarios in which pseudo-individuals are allocated to each type according to a normal distribution

                         Inequality measures              R-squared by observed circumstance
Error distribution        Gini       MLD  IO Ratio       All    Female     Urban    Region  Father's  Mother's
                                                                                           education education
                           (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)
Panel A. Baseline Scenario
  N(0,0.1)               0.076     0.003     0.974     0.975     0.302     0.074     0.522     0.026     0.094
  N(0,0.3)               0.095     0.005     0.602     0.614     0.189     0.047     0.329     0.016     0.059
  N(0,0.7)               0.111     0.006     0.431     0.444     0.137     0.034     0.238     0.013     0.042
  N(0,1.0)               0.140     0.010     0.270     0.283     0.088     0.021     0.152     0.008     0.026
  N(0,1.2)               0.193     0.020     0.136     0.148     0.044     0.011     0.081     0.004     0.014
  N(0,1.5)               0.249     0.035     0.080     0.091     0.028     0.007     0.049     0.003     0.009
Panel B. Inequality Enhanced Scenario
  N(0,0.1)               0.198     0.020     0.786     0.665     0.155     0.083     0.334     0.031     0.105
  N(0,0.3)               0.206     0.022     0.730     0.623     0.145     0.078     0.313     0.029     0.098
  N(0,0.7)               0.212     0.023     0.678     0.584     0.136     0.072     0.294     0.027     0.093
  N(0,1.0)               0.225     0.027     0.593     0.518     0.121     0.064     0.260     0.024     0.082
  N(0,1.2)               0.255     0.036     0.446     0.405     0.094     0.050     0.203     0.019     0.065
  N(0,1.5)               0.290     0.048     0.326     0.309     0.071     0.037     0.156     0.015     0.049
Panel C. Increase-in-IOO Scenario (first column: Urban indicator's coefficient)
  0                      0.141     0.010     0.256     0.269     0.089     0.000     0.155     0.006     0.026
  10                     0.146     0.011     0.378     0.391     0.075     0.167     0.135     0.008     0.027
  15                     0.155     0.012     0.479     0.493     0.064     0.305     0.117     0.010     0.026
  25                     0.181     0.017     0.651     0.668     0.041     0.545     0.081     0.009     0.022
  35                     0.211     0.023     0.761     0.778     0.027     0.696     0.057     0.009     0.018
  50                     0.256     0.035     0.851     0.870     0.016     0.823     0.037     0.007     0.014

Notes: Data generating process is presented in equation (3), and modified as described in the robustness checks section.

Table 12.
Difference between true IOO and median estimated IOO in percentage terms: Baseline scenario

                                          Excluded circumstance
Observed population         None    Female     Urban    Region  Father's  Mother's
scenarios                                                      education education
                             (1)       (2)       (3)       (4)       (5)       (6)
True IO share = 0.97
  All                       0.00    -31.99     -5.87    -49.55     -1.46     -7.52
  Top 1% truncated         -0.18    -34.18     -6.42    -50.97     -1.72     -8.17
  Top 5% truncated         -0.92    -42.50     -8.78    -52.50     -2.88    -10.96
True IO share = 0.60
  All                       0.00    -32.01     -5.92    -49.58     -1.52     -7.58
  Top 1% truncated         -3.56    -36.71     -9.67    -51.28     -5.08    -11.37
  Top 5% truncated        -13.76    -50.54    -20.58    -53.64    -15.45    -22.45
True IO share = 0.43
  All                       0.00    -31.46     -5.19    -49.17     -0.77     -6.88
  Top 1% truncated         -4.77    -37.61    -10.83    -51.22     -6.30    -12.53
  Top 5% truncated        -18.15    -52.07    -24.44    -54.95    -19.75    -26.21
True IO share = 0.27
  All                       0.00    -31.74     -5.54    -49.38     -1.15     -7.23
  Top 1% truncated         -7.09    -36.71     -9.67    -51.28     -5.08    -11.37
  Top 5% truncated        -22.50    -50.54    -20.58    -53.64    -15.45    -22.45
True IO share = 0.14
  All                       0.00    -31.03     -4.58    -48.86     -0.16     -6.33
  Top 1% truncated         -7.39    -38.65    -13.30    -52.08     -9.08    -14.96
  Top 5% truncated        -23.92    -51.92    -29.22    -58.68    -25.37    -30.70
True IO share = 0.08
  All                       0.00    -32.88     -7.19    -50.26     -2.94     -8.86
  Top 1% truncated        -10.27    -40.25    -16.04    -53.76    -12.03    -17.59
  Top 5% truncated        -26.46    -52.55    -31.47    -60.63    -28.00    -32.96

Notes: Results based on 1,000 simulations. Baseline scenario refers to the data-generating process described by equation (3), modified as described in the robustness checks section.

Table 13.
Difference between true IOO and median estimated IOO in percentage terms: Enhanced inequality scenario

                                          Excluded circumstance
Observed population         None    Female     Urban    Region  Father's  Mother's
scenarios                                                      education education
                             (1)       (2)       (3)       (4)       (5)       (6)
True IO share = 0.78
  All                       0.00    -39.39    -20.56    -40.60     -6.61    -23.73
  Top 1% truncated         -2.51    -43.63    -24.69    -41.52     -9.41    -28.00
  Top 5% truncated         -9.66    -55.95    -36.58    -36.95    -17.47    -40.10
True IO share = 0.73
  All                       0.00    -39.47    -20.66    -40.68     -6.73    -23.83
  Top 1% truncated         -3.29    -44.12    -25.31    -41.89    -10.14    -28.59
  Top 5% truncated        -11.72    -56.96    -37.96    -38.58    -19.34    -41.37
True IO share = 0.67
  All                       0.00    -39.28    -20.43    -40.50     -6.46    -23.60
  Top 1% truncated         -3.59    -44.38    -25.57    -41.95    -10.42    -28.84
  Top 5% truncated        -13.08    -57.60    -38.81    -39.66    -20.54    -42.16
True IO share = 0.59
  All                       0.00    -39.47    -20.66    -40.68     -6.75    -23.84
  Top 1% truncated         -4.96    -44.12    -25.31    -41.89    -10.14    -28.59
  Top 5% truncated        -16.00    -56.96    -37.96    -38.58    -19.34    -41.37
True IO share = 0.44
  All                       0.00    -39.38    -20.57    -40.62     -6.64    -23.73
  Top 1% truncated         -6.80    -46.46    -28.21    -43.05    -13.40    -31.32
  Top 5% truncated        -19.81    -60.07    -42.70    -45.07    -26.42    -45.60
True IO share = 0.32
  All                       0.00    -39.33    -20.46    -40.55     -6.54    -23.66
  Top 1% truncated         -8.28    -47.31    -29.38    -43.57    -14.81    -32.41
  Top 5% truncated        -22.25    -60.33    -43.71    -47.25    -28.51    -46.41

Notes: Results based on 1,000 simulations. Enhanced inequality scenario refers to the data-generating process described by equation (3), modified as described in the robustness checks section.

Table 14.
Difference between true IOO and median estimated IOO in percentage terms: Increase-in-IOO scenario

                                          Excluded circumstance
Observed population         None    Female     Urban    Region  Father's  Mother's
scenarios                                                      education education
                             (1)       (2)       (3)       (4)       (5)       (6)
True IO share = 0.25
  All                      -0.12    -34.75     -0.20    -53.74     -1.78     -8.35
  Top 1% truncated         -8.03    -42.27     -8.10    -56.15     -9.65    -16.18
  Top 5% truncated        -24.20    -56.73    -24.22    -61.10    -25.75    -31.91
True IO share = 0.38
  All                       0.00    -19.75    -39.84    -30.50     -1.05     -4.76
  Top 1% truncated         -4.90    -24.49    -44.89    -32.96     -5.82     -9.55
  Top 5% truncated        -15.52    -34.73    -54.80    -38.93    -16.47    -20.11
True IO share = 0.48
  All                       0.00    -13.35    -59.06    -20.46     -0.97     -3.41
  Top 1% truncated         -3.31    -16.36    -62.85    -22.13     -3.95     -6.43
  Top 5% truncated        -10.69    -23.67    -70.02    -27.36    -11.35    -13.82
True IO share = 0.65
  All                       0.00     -6.78    -78.83    -10.25     -0.65     -1.86
  Top 1% truncated         -1.44    -24.49    -44.89    -32.96     -5.82     -9.55
  Top 5% truncated         -4.67    -34.73    -54.80    -38.93    -16.47    -20.11
True IO share = 0.76
  All                       0.00     -3.80    -87.11     -5.82     -0.21     -0.92
  Top 1% truncated         -0.45     -4.19    -88.83     -5.90     -0.63     -1.33
  Top 5% truncated         -1.94     -5.68    -92.16     -7.11     -2.13     -2.83
True IO share = 0.85
  All                       0.00     -2.01    -92.48     -3.11     -0.03     -0.42
  Top 1% truncated         -0.06     -2.12    -93.64     -3.08     -0.15     -0.54
  Top 5% truncated         -0.64     -2.69    -95.89     -3.51     -0.74     -1.13

Notes: Results based on 1,000 simulations. Increase-in-IOO scenario refers to the data-generating process described by equation (3), modified as described in the robustness checks section.

Figure 1. Distribution of simulated incomes
Panel A. Distribution in the Baseline scenario
Panel B. Distributions in the Inequality enhanced scenario
Notes: Income distributions are generated according to equation (3) (panel A), plus enhanced shocks as described in the experimental design section (panel B).

Figure 2.
Downward bias (in percentage terms) across observed circumstances scenarios
Notes: Values are taken from tables 4 and 5.

Figure 3. Distribution of simulated incomes under increase-in-IOO scenarios

Figure 4. Correlation between overall variation explained and the inequality of opportunity estimate
Source: Authors’ compilation using data from this study (LIMC, 2015), Ferreira and Gignoux (2011), and the World Bank (2015) study on inequality in the MENA region.
Notes: The results from LIMC (2015) are obtained from the Monte Carlo simulations in Table 3’s Baseline scenarios (labeled MC – B) and Inequality enhanced scenarios (labeled MC – IE) and Table 6’s baseline scenario (labeled MC - EGY).
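
The quantities reported in the tables and in Figure 4 can be illustrated with a minimal Python sketch. It computes the mean log deviation (MLD), an IOO share defined as the MLD of a smoothed distribution over the total MLD, and the R-squared of an OLS regression of ln(income) on circumstances. This is not the authors' code. The data-generating process, coefficients, sample size, and function names (mld, fit_log_income, ioo_share, pct_diff) are hypothetical, and the smoothed distribution is assumed to be the exponentiated OLS fitted values, in the spirit of the parametric approach of Ferreira and Gignoux (2011).

```python
import numpy as np


def mld(y):
    """Mean log deviation: average of ln(mean(y) / y_i); requires y > 0."""
    y = np.asarray(y, dtype=float)
    return float(np.mean(np.log(y.mean() / y)))


def fit_log_income(income, circumstances):
    """OLS of ln(income) on circumstances; returns fitted values and R-squared."""
    X = np.column_stack([np.ones(len(income)), circumstances])
    log_y = np.log(income)
    beta, *_ = np.linalg.lstsq(X, log_y, rcond=None)
    fitted = X @ beta
    ss_res = np.sum((log_y - fitted) ** 2)
    ss_tot = np.sum((log_y - log_y.mean()) ** 2)
    return fitted, float(1.0 - ss_res / ss_tot)


def ioo_share(income, circumstances):
    """IOO share = MLD(smoothed distribution) / MLD(income).

    Assumption: the smoothed distribution is exp(OLS fitted ln(income)).
    """
    fitted, r2 = fit_log_income(income, circumstances)
    return mld(np.exp(fitted)) / mld(income), r2


def pct_diff(estimate, benchmark):
    """Percentage difference of an estimate relative to a benchmark."""
    return 100.0 * (estimate - benchmark) / benchmark


# Hypothetical pseudo-population: three binary circumstances and a normal shock.
rng = np.random.default_rng(0)
n = 20_000
C = rng.integers(0, 2, size=(n, 3)).astype(float)  # columns: female, urban, region
coefs = np.array([-0.3, 0.2, 0.8])                 # illustrative values only
income = np.exp(1.0 + C @ coefs + rng.normal(0.0, 0.7, size=n))

# Benchmark: all circumstances observed, full income distribution.
ioo_all, r2_all = ioo_share(income, C)

# One relevant circumstance (region, the last column) is not observed.
ioo_no_region, r2_no_region = ioo_share(income, C[:, :2])

# The top 5 percent of incomes is not observed.
keep = income <= np.quantile(income, 0.95)
ioo_top5, r2_top5 = ioo_share(income[keep], C[keep])

print(f"all circumstances, full sample: IOO={ioo_all:.3f}  R2={r2_all:.3f}")
print(f"region excluded:                IOO={ioo_no_region:.3f}  R2={r2_no_region:.3f}  "
      f"diff={pct_diff(ioo_no_region, ioo_all):.2f}%")
print(f"top 5% truncated:               IOO={ioo_top5:.3f}  R2={r2_top5:.3f}  "
      f"diff={pct_diff(ioo_top5, ioo_all):.2f}%")
```

Comparing ioo_all with ioo_no_region and ioo_top5 mimics, on a single simulated sample rather than 1,000 replications, the "excluded circumstance" and "top 5% truncated" cells of the tables. Setting each IOO share against its R-squared gives a single-sample analogue of the relationship plotted in Figure 4.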