WPS5718

Policy Research Working Paper 5718

Heterogenous Peer Effects, Segregation and Academic Attainment

Maria Ana Lugo

The World Bank
Latin America and the Caribbean Region
Office of the Chief Economist
June 2011

Abstract

Socioeconomic segregation is often decried for denying to poorer children the benefits of positive ‘peer effects’. Yet standard, linear-in-means models of peer effects (a) implicitly assume that segregation is zero sum, with gains and losses to rich and poor perfectly offsetting, and (b) rule out theories of ‘social distance’ whereby peer effects are strongest among similar pairings. The paper exploits the random assignment of pupils between classes to identify more general peer effects in Argentine test-score data. Estimates violate both assumptions (a) and (b), and provide micro foundations for the correlations between school segregation, average test-scores, and test-score inequality in municipality-level data.

This paper is a product of the Office of the Chief Economist, Latin America and the Caribbean Region. It is part of a larger effort by the World Bank to provide open access to its research and make a contribution to development policy discussions around the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org. The author may be contacted at mlugo1@worldbank.org.

The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those of the authors.
They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.

Produced by the Research Support Team

Heterogenous peer effects, segregation and academic attainment

María Ana Lugo∗

JEL classification: D30, I20, O15
Keywords: Test scores, economic segregation, peer-group effects

∗ Department of Economics, University of Oxford and the Office of the Chief Economist, Latin America and the Caribbean Region, World Bank. I am most grateful to Tony Atkinson, Stefan Dercon, Francisco Ferreira, Julian Messina, Daniel Ortega, Rafael Rofman and Justin Sandefur for their comments and suggestions. I am also indebted to Marta Kisilevsky from the Argentine Ministry of Education (DINIECE) for providing data. This research was carried out in part with financial support from the UK Economic and Social Research Council (PTA-026-27-1826), while at the University of Oxford. The views expressed in this paper are those of the author, and do not necessarily reflect those of the World Bank or its Boards of Directors. All errors are my own.

1 Introduction

Should government school systems strive to integrate pupils from different socioeconomic or ethnic backgrounds? The United States Supreme Court ruled almost sixty years ago that racial segregation will invariably harm the most disadvantaged and, as a consequence, separate can never be equal (Coleman 1966). Yet de facto segregation on both racial and socioeconomic grounds remains widespread in much of the world, the U.S. included.

The large empirical literature on this de facto segregation laments its potential effect on academic performance. While other channels may connect segregation to test scores – e.g., resource allocation or teaching strategies – the primary focus in the economic literature has been on peer effects.
Despite this interest in the role of social interactions on pupil performance, the standard functional form used to test peer effects renders most of this literature inherently silent with respect to key questions about the effect of segregation. First, equations that relate pupil performance to a linear function of mean peer characteristics are incapable of finding any relationship between segregation and average performance. Second, the assumption of homogenous effects – i.e., that rich and poor are equally and symmetrically susceptible to peers’ characteristics – may obscure important asymmetries. For instance, rich peers may exert a strong positive effect on poor pupils (or vice versa) without the converse being true, as assumed in homogenous models.

The purpose of this paper is to demonstrate that a more general model of peer effects – abandoning the implicit assumption of homogeneous, linear effects – can provide micro-foundations that explain the aggregate relationship between socioeconomic segregation and the distribution of test scores in Argentina. I focus both on the mean and the gap in attainment between the most-advantaged and the least-advantaged groups. Recent evidence shows that average education quality, though not necessarily quantity, has a significant effect on growth rates (Hanushek & Woessmann 2008). But the distribution of education also matters for growth (Gradstein & Justman 2002, Judson 1998, Lopez, Thomas & Wang 1998) and, quite crucially – as mechanisms and social objectives themselves – for future income inequality (Mayer 2002, Jenkins, Micklewright & Schnepf 2008), social cohesion (Jenkins et al. 2008, Gradstein & Justman 2000), and democracy (Castelló-Climent 2008). Therefore, the two social objectives in the paper will be equality in academic attainment and average achievement, without necessarily prioritizing one over the other.
Economic segregation can be thought to affect educational attainment through two distinct channels (Mayer 2002): through the distribution of resources between schools, and through the composition of students within schools. In this paper, I use a pupil-level data set for sixth graders in Argentina to explore the second mechanism. The effect of the composition of classmates will be identified using the variation across classes within schools, so that the first set of factors – related to the political economy of school financing, and quality of monitoring and institutions – will be controlled for rather than tested. The identification strategy follows closely that of Ammermueller and Pischke (2009). The aim is to understand the impact of increased socioeconomic segregation on both the achievement gap between poor and non-poor and the average academic achievement of children at school. To do this, I need to depart from the standard peer-group effects approach, allowing for heterogeneity in the effects both in terms of the socioeconomic background of the recipient student and in terms of the distribution of socioeconomic characteristics within the classroom.

I find that, first, peer effects are indeed heterogenous; wealthier pupils are more sensitive to peer effects overall. Second, poor children do best with peers who are richer, but not too rich. This result is consistent with hypotheses of social distance and competition; people who are too far apart in the social spectrum have lower (or even negative) effects on others (Akerlof 1997).

The combined prediction of these two effects implies a strong equity-efficiency tradeoff from school integration. Increasing socioeconomic segregation raises both the overall average test score and the gap between the rich and poor pupils, where the latter is appreciably more sensitive to the allocation of students than the former.

The paper proceeds as follows.
The next section describes the data employed and presents basic stylized facts on the relation between test scores and segregation within provinces and localities. Section 3 presents the standard approach to estimating peer-group effects and the alternative model proposed. Section 4 describes the empirical specification and the estimation strategy chosen. Section 5 presents the results from the estimations, and Section 6 uses simulations to show the effects of increased segregation on average test scores and on the gap in test scores between the wealthiest and the poorest groups. Section 7 concludes.

2 Argentina

Argentina is one of the wealthiest countries in Latin America and is well positioned in terms of educational indicators. In 2000 (the year of the data used in this paper), per capita GDP was around PPP$ 10,000, the literacy rate was nearly 98%, primary net enrollment was close to 100%, and secondary school enrolment was just above 80%.[1] However, as in much of Latin America, wealth was distributed extremely unequally and poverty was far from eradicated. According to official figures, the Gini index of per capita household income circa 2000 was 0.50, 14% of people lived below the 2.5-dollar-a-day poverty line, and 29% below the national poverty line.

[1] Source: World Development Indicators and Censo Nacional de Población y Vivienda, INDEC (2001).

In the last couple of decades, the composition of primary school pupils in Argentina has changed (Llach 2006). Enrolment into primary schooling has risen significantly since 1980, particularly among less well-off students, and drop-out rates have fallen appreciably (Rivas 2010b). Along with these changes, the quality of education received has become much less equal. The difference in achievement between private and public school pupils has broadened in the areas of the country with weak technical capacities following the school decentralization process in the early nineties (Galiani, Gertler & Schargrodsky 2008), and the inequality of achievement remains sizeable in the poorest provinces (Etchart, Gasparini, Bohorquez, Curia, Ferroni & Hontakly 2004).[2]

[2] The ratio between the first decile and the tenth decile of performance is above four for the poorest provinces, and only slightly better for the other ones.

Disparities in test scores are driven by differences at three levels: the school, the class, and the pupil. Schools differ in their physical, human and ‘social’ resources. Spending per student may vary greatly across schools. As an example, the private sector in Buenos Aires province spends 30 percent more per pupil than the public sector (Rivas 2010b). There are also large disparities between spending in the public sector across provinces – for instance, in 2009 Tierra del Fuego public schools spent five times more per pupil than Salta (Rivas 2010b).[3] In addition, teacher quality, pedagogy and management may vary across schools within and across provinces. Less experienced teachers and school principals are more likely to be placed in schools with fewer resources and a higher proportion of poor children (CIPPEC 2004, Llach 2006).[4] At the class level, the sorting of students across schools will determine the potential pool of classmates (social capital) that children can be exposed to. Lastly, differences in performance are due to the students’ own characteristics – such as ability, effort and parental background. Of these forces, I will focus on the second, that is, the extent to which the composition of classmates explains differences in pupils’ achievement, by exploiting differences within schools across classes. All other differences will be controlled for.

[3] Public schools are mostly funded by the provincial governments (approximately 70% of the consolidated expenditure in education, Rivas et al 2011). The federal government gives some restricted resources to especially disadvantaged schools and students, which can only be used to pay for infrastructure work, provision of didactic material, support of special school initiatives, and training (Fiszbein 1999, p. 11). In a few cases, the local government may also contribute to school financing, by funding special construction work or extra-curricular activities (Veleda 2005). Parents contribute voluntarily with a very small monthly fee to cover repairs and maintenance. Private schools are mainly funded by fees charged to students and by subsidies from the central government. Subsidies are in the form of salaries for teachers, and vary between 20% and 100% of the total salary, the latter mainly in the case of schools run by churches.

[4] Within the public sector, teachers can choose where to be posted, with priority given according to a points system based on their tenure, training, and evaluations from school directors. Some authors have argued that schools with better-off pupils are able to capture more resources from the State (better infrastructure, more computers and books, better teachers) because high- and middle-income families can exert greater pressure on the educational system (Llach 2006, Veleda 2005).

The composition of peers – and by extension, the degree of segregation across schools – is the result of choices made by families and the recipient schools. Parents choose which school to apply to, and school authorities choose the pupils they accept, based on their preferences and certain policy restrictions. These choices may be determined by observed and unobserved characteristics of all participants.

Students are segmented in a first stage when choosing between private and public schools. The ability to pay fees, in a context of scarce scholarships, clearly contributes to the disparity in incomes across public and private schools. In urban Argentina 65% of children in the upper quintile of the income distribution attended private establishments, while only 7% of children in the lowest quintile did so.[5]

[5] Source: Encuesta de Calidad de Vida 2001, carried out by the National Ministry of Social Development (SIEMPRO). The sample is representative of urban areas with populations above 5,000 inhabitants across the whole country.

Within the pool of public schools, there is still some degree of segregation of students. According to the 1994 Federal Education Law children should be allowed to apply to any school, irrespective of their place of residence. The selection of pupils by the school should be on a first-come-first-served basis, with priority given to children with siblings or parents in the same school. The new regulation represents a change from the previous rules whereby children were usually given priority by residence, although it formally constitutes a recommendation to the provinces rather than a binding rule.[6] In practice, however, parents are still more likely to send their children to the neighbourhood school, and school authorities do manage to apply some level of discretion in the selection of students on a non-reasonable basis (Fiszbein 1999, Veleda 2003, Veleda 2005). Together, these practices lead to a particular composition of students according to their socioeconomic status and to school segmentation within the state system.

Private schools, on the other hand, have several (legal) ways of selecting students. Family background, psychometric tests, recommendations, or interviews are among the common criteria. Naturally, the ability to pay the fee as a requirement is in itself enough to determine a relatively homogeneous composition of students according to their family background.
As mentioned previously, the paper will identify peer effects using the variation of class composition within schools, across classes, and hence is robust to the endogenous sorting described above.

2.1 Data: ‘Operativo Nacional de Evaluación’

The Operativo Nacional de Evaluación (ONE) is a standardized test set up by the Argentine Ministry of Education in 1993. The test covers Mathematics and Spanish at different levels of the educational system, two periods during primary school and two periods in secondary school. Tests are multiple-choice, and build on basic knowledge and capacities previously agreed among all provincial offices. In the year 2000, all schools in the country were covered by the test. I restrict my analysis to the students of the sixth grade (approximately 10 years of age) and their test score for Spanish.

All students present on the day of the survey in the chosen class were tested. Unfortunately, it is not possible to know exactly the proportion of absentees on that day in each school, so there is a possible bias if, for instance, teachers advised students of lower ability not to come on the day of the test. However, given that the results of the tests are not linked with remuneration or otherwise, and that the results are only released at the provincial level (and not to the school), there are no clear incentives for the school or teachers to try to alter the results.

[6] Provincial governments maintain autonomy in regard to these matters. Therefore, the Federal Law can only be interpreted as a ‘recommendation’ from the central government. In some provinces school choice prevailed before 1994, while other provinces changed the admission system in response to the law. Still others, notably the City of Buenos Aires, continue to select pupils according to neighbourhood of residence.
Once the test is completed, students are asked about their personal characteristics (gender, age, educational history) and a set of questions related to their family background. In particular, the questionnaire asks the maximum level of education that each parent attained, the possession of more than a dozen assets, and access to basic infrastructure services. Of these, I will base most of the analysis on a question on the number of books at home as a proxy of family socioeconomic status. Among the family background questions, this is the one with the best response rate. In addition, this variable is highly correlated with parental income and education, and reading skills in various international surveys (Hanushek & Woessmann 2008, Ammermueller & Pischke 2009).

While parental education is generally perceived to be a reasonable measure of overall family background, I chose to exclude it from the main analysis because more than a third of the students in the survey did not report it. This is not unusual in these sorts of surveys; the proportion of missing values is similar to that in PIRLS (Progress in International Reading Literacy Study, for developed countries plus others), and in SERCE (Segundo Estudio Regional Comparativo y Explicativo, for seventeen Latin American countries). Still, the pupils who have missing values in parental education tend to come from relatively less well-off families (lower number of books at home, durable assets, and services) and to perform worse than those with complete information (in terms of test scores and grade repetition). For robustness, I will replicate the analysis using parental education and an index of assets.

Of the total number of schools surveyed, for the purpose of estimation I exclude schools with only one class per school year.
This exclusion allows me to identify the peer-group effects in a convincing way.[7] The resulting sample includes over 7,000 schools (half of the schools in the country), 20,000 classes and almost 400,000 students distributed across the country.

[7] See section 4 for a detailed explanation of the estimation and identification strategy employed.

Table 1: Sample sizes and average test scores

| | Spanish test score | Number of cases: Students | Classes | Schools |
|---|---|---|---|---|
| All | 61.6 | 574,322 | | |
| Sample with no missing | 62.9 | 467,304 | 26,077 | 13,795 |
| Schools with > 1 class | 63.1 | 389,516 | 19,662 | 7,439 |

Source: ONE 2000, Argentina

Table 2: Students’ characteristics, according to own family background

| Own background (no. books) | Total | Lower (0-10) | Lower Middle (11-50) | Upper Middle (51-100) | Upper (>100) |
|---|---|---|---|---|---|
| No. students | 399,378 | 128,321 | 156,673 | 60,557 | 53,827 |
| Share of students (%) | 100.0 | 32.1 | 39.2 | 15.2 | 13.5 |
| Spanish test score | 63.0 | 55.9 | 64.2 | 69.5 | 69.3 |
| Spanish test score (sd) | 19.1 | 18.2 | 18.3 | 18.1 | 19.0 |
| Maths test scores | 59.4 | 51.9 | 60.5 | 65.9 | 66.1 |
| Parents education (yrs) | 10.8 | 8.8 | 10.6 | 12.4 | 13.5 |
| Assets index (prices) | 26.4 | 20.6 | 27.2 | 31.1 | 32.3 |
| Grade repeated | 0.17 | 0.28 | 0.14 | 0.08 | 0.09 |
| Private school | 0.23 | 0.09 | 0.24 | 0.37 | 0.41 |
| Class size | 26.8 | 25.7 | 27.0 | 27.6 | 27.6 |
| Age | 11.6 | 11.7 | 11.5 | 11.4 | 11.4 |

Source: ONE 2000, Argentina

Table 2 presents basic summary statistics for all students, according to the socioeconomic background of the students. The classification is based on the reported number of books at home, divided into four groups: lower (ten or fewer books), lower middle (between 11 and 50 books), upper middle (between 51 and 100 books) and upper (more than 100 books). Roughly a third of the students (32.1%) belong to the lower group and a further 39.2% to the lower-middle group. The proxy chosen for family background performs relatively well; on average, parental education and assets increase with the number of books held at home.
As expected, wealthier students on average perform better in both Spanish and Maths, are less likely to repeat any grade and attend private institutions significantly more often. The gap in test scores between the lower group and the upper group is 14 points, statistically different from zero and close to one standard deviation in test scores observed in the overall distribution.

Students tend to share their classes with disproportionately more peers from the same socioeconomic group (figure 1). Children from the wealthiest group have, on average, three times as many peers from the same upper group as those from the lowest group. And the reverse is also true; pupils from the lower social background share the classroom with almost three times as many classmates from the lower group as those in the upper group. It is precisely the effect of this observed class composition on pupils’ test scores that this paper estimates.

Figure 1: Proportion of classmates in each family background group, according to own family background
[Stacked bar chart, 0% to 100%. Horizontal axis: own background (lower, lower middle, upper middle, upper). Stacks: peers’ background (lower, lower middle, upper middle, upper).]
Source: ONE 2000, Argentina

The following graphs (figure 2) present the unconditional relationship between economic segregation and average test scores (Spanish), in the left panel, and the gap in test scores between the top and bottom socioeconomic groups, in the right panel. Socioeconomic segregation is measured here as the proportion of the total variance in family background that is ‘explained’ by the variance between schools. At the national level, a third of the overall disparities in family background are due to differences across schools. Within provinces, the degree of segregation varies from 14 percent in Santa Cruz to 32 percent in Salta (upper panel).
Within localities, the proportion of the variance explained by the variation between schools fluctuates from 0 to 84 percent (lower panel).

Provinces and localities that are highly segregated report not only a larger achievement gap between rich and poor pupils, but they also post higher average test scores overall. The relation appears to be stronger between segregation and test score gaps than with test score mean. A one standard deviation change in the level of segregation raises the locality mean test scores by .78 points while it will increase the gap between the wealthiest and the poorest groups by 2.6 points, i.e. the change in the gap is three times as large as the change in average test scores. Similar effects are found using provincial data. The model of peer effects estimated in this paper is able to account for the observed pattern.

Figure 2: Segregation and school achievement
[Four scatter plots with fitted regression lines; horizontal axis: segregation index. Upper panels, by province (points labelled with province codes): mean test score, slope = 24.4 (p-value = .16); gap in test scores, slope = 30.15 (p-value = .19). Lower panels, by localities: mean test score, slope = 6.91 (p-value = .013); gap in test scores, slope = 22.8 (p-value = .001).]
Note: OLS regressions, with no controls. Standard errors are clustered at the provincial level. Gap in (Spanish) test scores is the difference between the average achievement among the upper group of socioeconomic status and the lower group. Segregation index is the proportion of between school variance in number of books over total variance.
Source: ONE 2000, Argentina.
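The segregation index described in the note to figure 2 can be made concrete with a short computation. The following is a minimal sketch rather than the paper's own code: it assumes the variance decomposition is pupil-weighted (each school's squared deviation counts once per pupil), and the function name `segregation_index`, the record layout and the toy background codes are hypothetical.

```python
from collections import defaultdict


def segregation_index(records):
    """Share of total variance in a background score that lies between schools.

    `records` is an iterable of (school_id, score) pairs, one per pupil.
    Assumption: sums of squares are pupil-weighted.
    Returns 0 when every school has the same mean and 1 when all pupils
    within each school share the same score.
    """
    by_school = defaultdict(list)
    for school_id, score in records:
        by_school[school_id].append(score)

    all_scores = [s for scores in by_school.values() for s in scores]
    grand_mean = sum(all_scores) / len(all_scores)

    total_ss = sum((s - grand_mean) ** 2 for s in all_scores)
    if total_ss == 0:
        return 0.0  # no variation at all, so nothing to attribute to schools

    between_ss = sum(
        len(scores) * (sum(scores) / len(scores) - grand_mean) ** 2
        for scores in by_school.values()
    )
    return between_ss / total_ss


# Toy data: background coded 1 (lower) to 4 (upper), two schools of two pupils.
integrated = [("A", 1), ("A", 4), ("B", 1), ("B", 4)]
segregated = [("A", 1), ("A", 1), ("B", 4), ("B", 4)]

print(segregation_index(integrated))  # 0.0
print(segregation_index(segregated))  # 1.0
```

The two toy allocations contain the same pupils; only their assignment to schools differs, which is exactly the dimension the index is meant to capture.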
3 Beyond linear-in-means peer-group effects

This section briefly presents the standard approach to estimating peer-group effects and the alternative formulation proposed, which accounts for heterogeneity in the effects. The more flexible structure will turn out to be important in explaining the stylized facts presented above.

In the context of schools, peer-group effects refer to spill-overs between students within a classroom that affect student performance. There are several ways in which the characteristics of others are assumed to affect a pupil’s performance. Interaction with peers who perform well at school can lead to higher quality of discussion in the class, and faster or better development of children’s knowledge. Conversely, disruptive behavior may have a negative impact on other students’ capacity to learn. Additionally, a richer classmate can provide alternative role models, access to home tutoring or technology, and social networks that the child would not otherwise be exposed to. Furthermore, it is argued that more affluent parents are able to effectively monitor the school and ensure a better provision of education (Wilson 1997, Mayer 2002).

An alternative view holds that less advantaged children compare themselves with better-off ones and this comparison may lead to ‘unhappiness, stress and alienation’ (Mayer 2002, p. 155) which will ultimately be detrimental to the performance of the student (Jencks & Mayer 1990, Runciman 1966). In this case, the presence of wealthier classmates will have a negative effect on the pupil’s performance at school, while there is no significant (negative) effect on the better-off student.

In between these two extreme views (positive effect/role model versus competition/relative deprivation model), Akerlof (1997) presents a model of social distance whereby the strength of the effect of peers depends on how distant the pupils are in the social spectrum.
People have a tendency to move closer to others – hence those at the bottom move up, those at the top move down. But the benefits from interacting diminish as the social distance between them expands.

The bulk of evidence on within-school peer effects supports the idea of positive effects from a wealthier and better performing class. Yet, some recent studies focusing on endogenous effects (peers’ attainment) and allowing for heterogeneity find that good performers can have a detrimental effect on low achievers (Duflo, Dupas & Kremer forthcoming, Brown n.d.).

The general reduced-form specification of an education production function can be written as

$$a_{ics} = f(X_{ics}, Z_{cs}, X_{(-i)cs}) \tag{1}$$

where $a_{ics}$ is the academic achievement of student $i$ attending classroom $c$ at school $s$; $X_{ics}$ is a vector of individual and family characteristics, such as gender, age, ethnicity, family income; and $Z_{cs}$ includes school or classroom-level variables such as class size, teacher’s experience, level of state funding, among others. The variable of interest here is $X_{(-i)cs}$, a vector of characteristics of classmates of student $i$, excluding the student, such as students’ performance, parental background, household income, ethnicity or gender. In the present study, the peer variable will be a proxy of socioeconomic background.

The standard approach to peer-group effects is to use a linear-in-means model (Manski 1993). Classmates’ characteristics $X_{(-i)cs}$ are included in a linear regression model using the average level of the variable $x$ within the classroom. The linear-in-means model is valid under two key assumptions: (1) that all students are equally sensitive to other people in the classroom; (2) that the particular composition of the classmates is irrelevant. Both these assumptions seem to be at odds with the competition/relative deprivation and social distance views described above.
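The peer variable in a linear-in-means model is the classroom average excluding the pupil in question. The sketch below illustrates that construction under stated assumptions: the linear form $a = \alpha + \gamma x_i + \beta \bar{x}_{(-i)}$ is one common way of writing the model, and the function names and all numeric values ($\alpha$, $\gamma$, $\beta$, the background codes) are hypothetical, not estimates from the paper.

```python
def leave_one_out_means(class_values):
    """For each pupil, the mean of classmates' values, excluding the pupil."""
    n = len(class_values)
    if n < 2:
        raise ValueError("a pupil needs at least one classmate")
    total = sum(class_values)
    return [(total - x) / (n - 1) for x in class_values]


def linear_in_means_prediction(own, peer_mean, alpha, gamma, beta):
    """Predicted score when every pupil shares the same peer coefficient beta.

    Assumption: a = alpha + gamma * own + beta * peer_mean (illustrative form).
    """
    return alpha + gamma * own + beta * peer_mean


# Hypothetical class of four pupils, background coded 1 (lower) to 4 (upper).
books = [1, 1, 2, 4]
peers = leave_one_out_means(books)
print(peers[2])  # 2.0: the third pupil's classmates are 1, 1 and 4

# Hypothetical coefficients, chosen only to make the arithmetic easy to follow.
score = linear_in_means_prediction(own=2, peer_mean=peers[2],
                                   alpha=50.0, gamma=3.0, beta=2.0)
print(score)  # 60.0
```

A useful property of this construction is that the leave-one-out means within a class add up to the class total of the underlying variable, so with a single common coefficient the sum of peer contributions depends only on who is in the school system, not on how pupils are grouped.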
Most importantly, the linear-in-means formulation implies by construction that the precise allocation of children across schools on economic grounds cannot affect the average achievement in the country; it may only affect the extent of the gap in achievement among different sectors of society. Considering exclusively within-school peer effects, a more segregated society will lead to a wider difference in attainment between the worse-off and better-off students than an integrated one, but the society’s average test scores will be the same.

The present paper generalizes the linear-in-means model to investigate alternative ways in which class composition may affect performance. First, I allow for peer effects to be heterogenous across pupils and investigate whether children from different socioeconomic backgrounds are differently sensitive to others. Formally, $\partial PE_i / \partial X_{-i} \neq \partial PE_j / \partial X_{-j}$ for at least some $i, j$, where $PE_i$ is the peer effect on student $i$. Second, the composition effect will weight each classmate’s characteristics differently, depending on their position in the distribution of family backgrounds. The idea is that the effect that two middle-class children have on others might differ from the effect of one rich and one poor child, even if the average classroom ‘wealth’ is identical. In terms of behavior, the idea is that one terribly wild student might be more disruptive than two moderately attentive ones, even if the disruptive pupil is compensated by an extremely studious companion. In other words, composition effects permit that $\partial PE_i / \partial x_j \neq \partial PE_i / \partial x_k$ for at least some $j, k$.

The approach followed in this paper is related to that of Hoxby (2000). That paper estimates the effect of increasing the proportion of children from the different ethnic groups on the same and other ethnic groups’ test scores.
In short, she finds that raising the proportion of black (Hispanic) students in the class has a negative effect on all students but that this negative effect is felt more acutely by other black (Hispanic) students.

4 Estimation strategy and empirical specification

This section explains the specification of the equations to be estimated to test for the existence of heterogenous and composition effects, first separately, and then combined. The section also describes the strategy used to address various identification and estimation concerns when estimating peer effects.

4.1 Specification

The validity of the linear-in-means model depends crucially on two assumptions: first, that all the pupils are equally affected by others’ characteristics and, second, that the marginal effect of peers is linear across the socioeconomic background space. I will estimate peer-group effects using a more flexible formulation, relaxing both these assumptions.

1. Heterogeneous effects: Some people are more receptive than others to people’s influences. A direct way of testing this idea is to permit a heterogenous coefficient on the peer-group variable, according to the recipient’s socioeconomic group. This is done by interacting class mean family background with the student’s own background. The latter component is included through an indicator function for the socioeconomic group the student belongs to.[8] Specifically,

$$a_{ics} = \alpha + \sum_{g_i=1}^{G} \beta_g \, g_i \, Y_{(-i)cs} + \gamma X_i + \eta_s + \epsilon_{ics}, \tag{2}$$

where $g_i$ indicates the social group $g$ of student $i$, $g_i = \{1, \ldots, G\}$, and $Y_{(-i)cs}$ is the average peer family background, excluding $i$. I also include a vector $X_i$ of individual and class characteristics (gender, grade repeater, class size), and school fixed effects $\eta_s$. $\epsilon_{ics}$ is the unobserved individual term.

[8] A similar specification was used in Ammermueller and Pischke (2009). The interaction effect was found to be positive and significant for two of the six European countries under study.
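To see why group-specific coefficients in equation (2) matter for the question of segregation, it helps to compare predicted outcomes under two allocations of the same pupils. The following is a minimal sketch, not the paper's simulation: it keeps only an intercept, a hypothetical own-background term and the peer term $\beta_g Y_{(-i)cs}$, and every coefficient value, along with the function name `predicted_outcomes`, is an assumption chosen for readable arithmetic.

```python
def predicted_outcomes(classes, beta, alpha=50.0, own_effect=2.0):
    """Mean predicted score and upper-minus-lower gap for one allocation.

    `classes` is a list of classes, each a list of group codes
    (1 = lower ... 4 = upper). The code serves both as the pupil's own
    group g_i and, averaged over classmates, as the peer variable.
    `beta` maps a group code to that group's peer coefficient beta_g.
    `alpha` and `own_effect` are hypothetical values, not estimates.
    """
    scores = {}
    for groups in classes:
        class_total = sum(groups)
        for g in groups:
            peer_mean = (class_total - g) / (len(groups) - 1)  # excludes pupil
            score = alpha + own_effect * g + beta[g] * peer_mean
            scores.setdefault(g, []).append(score)

    everyone = [s for group_scores in scores.values() for s in group_scores]
    mean = sum(everyone) / len(everyone)
    top, bottom = max(scores), min(scores)
    gap = (sum(scores[top]) / len(scores[top])
           - sum(scores[bottom]) / len(scores[bottom]))
    return mean, gap


# The same eight pupils, allocated in two different ways.
segregated = [[1, 1, 1, 1], [4, 4, 4, 4]]
integrated = [[1, 1, 4, 4], [1, 1, 4, 4]]

homogeneous = {1: 2.0, 4: 2.0}    # one common beta for both groups
heterogeneous = {1: 1.0, 4: 3.0}  # upper group assumed more sensitive

print(predicted_outcomes(segregated, homogeneous))    # (60.0, 12.0)
print(predicted_outcomes(integrated, homogeneous))    # (60.0, 4.0)
print(predicted_outcomes(segregated, heterogeneous))  # (61.5, 17.0)
print(predicted_outcomes(integrated, heterogeneous))  # (59.5, 9.0)
```

With a common coefficient the mean is identical across allocations and only the gap moves. Once the coefficients differ by group, reallocating pupils changes the mean as well; the direction of that change in this toy case follows entirely from the assumed values of `heterogeneous`.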
It should be noted that the variable used to proxy socioeconomic status (number of books) is ordinal rather than continuous. Averaging an ordinal variable can be contested, since the result may be a poor indicator of the books held by classmates' families. Yet this is the approach generally taken in the literature (for instance, Ammermueller and Pischke, 2009). Although the variable is categorical, its ordinality ensures that higher values of the average do correspond to a higher socioeconomic level of classmates, even when the cardinal differences between means are not precise. I will thus present results using the class average as a benchmark, and show whether the effects of this average differ depending on the family background of the recipient student. The final regression, combining heterogenous and composition effects, will replace the peer mean with a set of variables that do not attach cardinal meaning to the index.

2. Composition effects: The second relaxation of the classic assumption allows the marginal effect of peers on student $i$ to differ depending on their position in the distribution. Using the average characteristics of peers implies that all classmates have the same marginal effect on $i$. Instead, I estimate peer-group effects using three variables representing the proportion of classmates in each socioeconomic group. The baseline category is the 'lower' group and the included categories are the 'lower middle', 'upper middle', and 'upper' groups:

\[ a_{ics} = \alpha + \sum_{g_c=2}^{G} \beta_g \, (\text{Prop in } g_c) + \gamma X_i + \eta_s + \epsilon_{ics}, \qquad (3) \]

where Prop in $g_c$ is the proportion of peers in group $g_c = \{2, \ldots, G\}$. Because the original variable of socioeconomic background is ordinal, strictly speaking I will not be testing for linearity. Instead, I will focus on the monotonicity of the relationship, i.e. as more peers belong to higher categories, the effect should be larger.

3. Heterogenous composition effects: Most likely both effects are at work simultaneously.
Students are differently receptive to others, depending on both their own social group and that of their peers. For instance, it is possible that pupils are more sensitive to changes in classmates who are socially close to them than to those who are socially very distant. The simplest way to allow for heterogeneity in class composition effects is to estimate regression (3) separately for each socioeconomic group, as in Hoxby (2000). The baseline category is the proportion of peers in the same group as the student.9

\[ a_{ics}(g=1) = \alpha_1 + \sum_{g_c=1,\, g_c \neq 1}^{G} \beta_{g1} \, (\text{Prop in } g_c) + \gamma_1 X_i + \eta_s + \epsilon_{ics} \qquad (4) \]

\[ a_{ics}(g=2) = \alpha_2 + \sum_{g_c=1,\, g_c \neq 2}^{G} \beta_{g2} \, (\text{Prop in } g_c) + \gamma_2 X_i + \eta_s + \epsilon_{ics} \qquad (5) \]

\[ a_{ics}(g=3) = \alpha_3 + \sum_{g_c=1,\, g_c \neq 3}^{G} \beta_{g3} \, (\text{Prop in } g_c) + \gamma_3 X_i + \eta_s + \epsilon_{ics} \qquad (6) \]

\[ a_{ics}(g=4) = \alpha_4 + \sum_{g_c=1,\, g_c \neq 4}^{G} \beta_{g4} \, (\text{Prop in } g_c) + \gamma_4 X_i + \eta_s + \epsilon_{ics} \qquad (7) \]

Estimating equations (4)-(7) forms the core of the analysis. The heterogenous composition effects will enable me to explain the relation between segregation and both the mean and the gap in test scores seen in figure 2.

9 To compare coefficients across regressions I also estimate the model in a single regression; this is useful for the simulations in section 6. The implication is that we are forcing the non-$\beta$ coefficients to be common for all groups. The baseline category used is the combination of the student coming from the lower middle group (the most populous group) and the proportion of peers in the lowest group.

\[ a_{ics} = \alpha + \sum_{g_i=1}^{G} \sum_{g_c=2}^{G} \beta_g \, g_i \, (\text{Prop in } g_c) + \gamma X_i + \eta_s + \epsilon_{ics}. \]

4.2 Identification and estimation concerns

The literature on peer-group effects has paid careful attention to two key selection issues that hinder the identification of peer effects (Soetevent 2006). I will describe these concerns and explain the strategy followed in the paper to address them.
Given the relative parsimony of the empirical models, the estimates of the parameters in (2) to (7) might suffer from omitted variable bias due to the failure to control for numerous student and school characteristics correlated with the peer variables.

A first concern relates to the selection of children into schools. Parental choices of school are endogenous to the quality of education provided. The concern is that peer-group effects may simply reflect a tendency for eager parents to send brighter children to high-achieving schools with, say, wealthier peers and better principals. Or, similarly, that well-performing schools accept brighter students, who are disproportionately represented in better-off socioeconomic groups. Ignoring the selection problem can lead to large overestimation of the coefficients on the peer-group variables. The approach followed in this paper is to include school fixed effects in the estimation (Hanushek, Kain, Markman & Rivkin 2003), and thus control for time-invariant characteristics at the school level. Inasmuch as parents' selection of a school on the basis of its reputation is common across all children who attend that school, the school fixed effect also captures individual unobserved characteristics that are common across pupils and affect test scores.

It should be clear that with the inclusion of school fixed effects, econometric identification of peer effects relies on the existence of perturbations in the composition of students across classes within schools. As expected, these variations are small relative to the overall variation across schools, which makes the identification more demanding, but also more reliable.10 To give a sense of the consequences of using school fixed effects, Table 17 in the appendix presents the estimated values of the coefficient on the mean books index using OLS and school fixed effects.
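A minimal Python sketch may help show why school fixed effects can shrink a peer coefficient. It compares a pooled OLS slope with a within-school slope on invented data in which one school has both wealthier peers and a higher baseline score. The function names and the numbers are hypothetical, and a single regressor is assumed for simplicity; this is not the paper's estimation code.

```python
from collections import defaultdict

def pooled_slope(x, y):
    """Plain OLS slope of y on x, ignoring school membership."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def within_school_slope(school, x, y):
    """OLS slope after subtracting school means from x and y
    (the fixed-effects 'within' estimator with one regressor)."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for s, xi, yi in zip(school, x, y):
        acc = sums[s]
        acc[0] += xi
        acc[1] += yi
        acc[2] += 1
    sxx = sxy = 0.0
    for s, xi, yi in zip(school, x, y):
        sum_x, sum_y, n = sums[s]
        dx = xi - sum_x / n
        dy = yi - sum_y / n
        sxx += dx * dx
        sxy += dx * dy
    return sxy / sxx

# Hypothetical data: school B has higher peer SES and a 10-point higher baseline.
school = ["A", "A", "A", "B", "B", "B"]
peer_ses = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
score = [1.0, 2.0, 3.0, 14.0, 15.0, 16.0]

pooled = pooled_slope(peer_ses, score)
within = within_school_slope(school, peer_ses, score)
```

In this toy case the pooled slope absorbs the between-school difference in baselines, while the within-school slope uses only variation across observations inside each school.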
While positive and significant in both cases, the fixed effect coefficient is a quarter of the OLS estimate. This difference is similar to the one that Ammermueller and Pischke (2009) found for the average of six European countries. If the fixed effects estimation is correct, this points to a large overestimation in the OLS results. Interestingly enough, while the peer effect is much reduced, it is still twice as large as the coefficient on own family background. This means that raising the average background of the peers has twice the effect on a student's test score as increasing his own socioeconomic background.

10 Other ways to address the problem of selection (not available with the current data) include: a model of school selection with Heckman's classic selection correction (Ioannides & Zabel 2003, Ginther, Haveman & Wolfe 2000, Kingdon 1996, Kingdon 2006), individual fixed effects estimation with partial treatment of members of groups (Moffitt 2001) and instrumental variable estimation (Evans, Oates & Schwab 1992, Rivkin 2001).

School fixed effects, however, cannot address all sources of bias. Apparent peer effects may still be driven by differences in the quality of instruction or school management across classes, whenever these differences are correlated with the peer variables. This concern would arise if, for instance, richer parents are able to exert pressure on the school for a better teacher or if the school tracks children according to their performance. In other words, there is the concern of selection of students into classes. In terms of the previous equation, the problem stems from the correlation between $b_{cs}$ and peer characteristics $Y_{(-i)cs}$.

To address these concerns, I exploit random variation in peer groups between sixth-grade classrooms within schools. Two-thirds of the pupils in Argentina attend schools that have more than one class in each grade.
The assignment of pupils to a class is determined at the beginning of primary school (at six to seven years of age) and remains unchanged throughout the whole seven years of education. Also, unlike in the United States or the United Kingdom, children in Argentina are not streamed on the basis of their performance year by year. Instead, the initial allocation to classes is done randomly and remains fixed for the entirety of primary education. In other words, $b_{cs}$ and peer characteristics are not systematically related. Combined with school fixed effects, the effect of peers on a student's performance is identified by variation in class composition across classes within a given school.

The strategy of combining school fixed effects with random assignment of students across classes has been used in a number of papers. Ammermueller and Pischke (2009) follow a similar approach for estimating peer effects using PIRLS data on primary school children across six European countries. McEwan (2003) also bases his estimates of peer effects in Chilean secondary schools on school fixed effects, although the argument of randomization tends to be weaker at the secondary level, given that only a fraction of all school-aged children attend this level. Vigdor and Nechyba (2004, 2006) take advantage of randomization in primary schools in North Carolina to estimate peer effects for fifth graders. In all of these cases, however, the peer effects were estimated using the mean of the peer characteristics and no composition or heterogenous peer effects were allowed.11 My approach is also close to Hoxby (2000), who estimates non-linear peer effects with school fixed effects, exploiting the variation across cohorts instead of across classes in the same cohort. Yet she is interested in the effect of peers according to their gender and race instead of socioeconomic background.

11 Ammermueller and Pischke also cast doubts on the validity of the random assignment in McEwan (2003) and in Vigdor and Nechyba (2004, 2006) for different reasons. Sacerdote et al. (2001) is also in a similar vein, exploiting the random assignment of college students to dorms. Related approaches are used by Duflo and Saez (2003), Duflo et al. (forthcoming), Duflo et al. (forthcoming) and Miguel and Kremer (2004), all of which use partial population experiments with random assignment of treatment.

The randomization of students across classes within the school is the key element of the estimation strategy. The identifying assumption is that students share unobserved characteristics at the school level, but not at the level of the classroom. It is not possible to test the randomization directly, but I can test whether the division of observed characteristics across classes in a school deviates significantly from what random assignment would yield.

The procedure employed is similar to that used in Ammermueller and Pischke (2009). First, I test for the independence of assignment of pupils based on their socioeconomic characteristics using Pearson's chi-squared tests, for each of the schools that have more than one class. As shown in table 3, in more than 87 percent of the schools one cannot reject the hypothesis of independence at the five percent significance level for number of books, and in 94 percent of the schools for parental education. In other words, the evidence is consistent with the absence of a systematic assignment of pupils to classes on socioeconomic grounds for a great majority of schools.12 Still, I will estimate all the equations for the complete sample and for a restricted sample that includes only the schools for which the chi-squared test did not reject independence.

12 There is, instead, evidence consistent with the idea that children are allocated across classes within a school based on their age. This resonates with what teachers and principals say in informal conversations about class formation in kindergarten, which may carry over into primary school. Evidence from OECD countries suggests that initial maturity differences can have long-lasting effects on student performance (Bedard & Dhuey 2006), so that even in sixth grade there might be a significant difference in children's test scores due to the age difference. If the allocation of pupils is affected by age, there could be correlated effects that lead to overestimation of the peer effect coefficients. Thus, the regressions will control for the age of the student, even though the variable is expressed in years rather than months.

Table 3: Tests for independence of peer variables and class assignment and classroom resources (following AP2009).

Family background     Sample     Chi-square test:            Teacher's characteristics:
                                 % schools passing test      F-test (p-value)
Number of books       All        87.4                        0.213
                      Public     87.0                        0.421
                      Private    88.8                        0.484
Parental education    All        94.1                        0.577
                      Public     93.7                        0.673
                      Private    95.3                        0.947

Note: The first column corresponds to the proportion of schools for which Pearson's χ2 test cannot reject the null hypothesis of independence between the number of books (parental education) and class assigned within the school. The second column of the table reports the p-value of a joint test from a regression of number of books (parental education in years) on a set of teachers' characteristics. These include teachers' gender, degree obtained and years of experience. Source: ONE 2000, Argentina.

An additional concern is that even when children are randomly allocated across classes, the school may choose to assign teachers based on the performance of pupils. For instance, one could imagine that a principal might allocate the more experienced teacher to a class where there is likely to be more disruptive behavior. Alternatively, teachers with more tenure in a school might be able to influence their assignment to classes based on, for instance, the past performance of pupils, so that the more experienced teacher is matched with the best performing class. If this were the case, the estimates of the peer effects would be biased. Once again, I follow Ammermueller and Pischke and test this by regressing children's socioeconomic characteristics (number of books and years of parental education) on a set of teachers' characteristics, with school fixed effects. The explanatory variables included in the model are teachers' gender, degree obtained and years of experience. The p-values of the F-test of joint significance are reported in the right-hand column of table 3. In all cases, there is no evidence of allocation of teachers based on students' characteristics.

A separate issue worth mentioning is the well-known problem of separately identifying endogenous and exogenous effects. This concern is known in the literature on social interactions as the reflection problem and was introduced by Manski (1993). The standard formulation of peer effects is a linear-in-means model, where the student's achievement is determined by classmates' average achievement and average predetermined characteristics. The estimated coefficient on the first term defines the endogenous peer effect and the coefficient on the second is the contextual effect. The reflection problem is the inability to separate these two effects because of the feedback loop in the endogenous variables. A great number of papers have proposed alternative methods to tackle the reflection problem and identify separately endogenous and exogenous peer-group effects.13 My approach here is, instead, to remain agnostic about which of these effects drives my results.
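To make the reflection problem concrete, a textbook statement of the linear-in-means model and its reduced form is sketched here. The notation ($\beta$ for the endogenous effect, $\gamma$ for the contextual effect, $\delta$ for the own-characteristic effect, $c$ for the class) is an assumption chosen for illustration and is not taken from the paper.

```latex
\[
\begin{aligned}
a_i &= \alpha + \beta\, E[a \mid c] + \gamma\, E[x \mid c] + \delta\, x_i + u_i,
      \qquad E[u_i \mid c, x_i] = 0,\ \beta \neq 1 \\
E[a \mid c] &= \frac{\alpha}{1-\beta} + \frac{\gamma + \delta}{1-\beta}\, E[x \mid c] \\
a_i &= \frac{\alpha}{1-\beta} + \frac{\gamma + \beta\delta}{1-\beta}\, E[x \mid c] + \delta\, x_i + u_i
\end{aligned}
\]
```

The second line takes class-level expectations of the first and solves for $E[a \mid c]$; the third substitutes that result back in. A regression of achievement on the peer mean of $x$ therefore recovers only the composite coefficient $(\gamma + \beta\delta)/(1-\beta)$, not $\beta$ and $\gamma$ separately.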
Rather than estimating the endogenous and the contextual effects separately, my purpose is to estimate the reduced form equation, where the peer effect is often referred to as the general social effect (Moffitt 2001). For the purpose at hand, the study of the effect of increased segregation, estimating general social effects using a reduced form is sufficient.14

13 Manski (1993) suggests finding a variable that affects achievement but not the contextual effect; Brock and Durlauf (2001, 2002) use a binary or multinomial choice model; Katz et al. (2001) take advantage of a group-changing intervention in the Boston area; and again Katz et al. (2001) and Gaviria and Raphael (2001) employ instrumental variables.

14 “To the extent, therefore, that it does not matter for the purpose at hand whether social interactions are of the endogenous or exogenous type, estimation of the reduced form equations (...) is sufficient” (Moffitt 2001, 57).

Finally, standard errors are clustered at the school level to adjust for intra-school correlation, and are robust to heteroscedasticity.

5 Results

In this section I test for the existence of heterogenous effects (differences in the impact of peers according to the socioeconomic background of the recipient student) and composition effects (differences in the impact of the proportion of classmates in each socioeconomic group). I then estimate the combined effect of both these forces. The next section will show how these parameters may explain the relation in aggregate data from figure 2.

Table 4 presents regressions of students' test scores in Spanish on the standard peer-group measure, the mean classmates' books index (columns 1 and 3), and on that measure interacted with the socioeconomic group of the recipient student (columns 2 and 4). The last two regressions correspond to the 'restricted sample', which includes only the schools for which independence of peer variables across classes cannot be rejected.

The mean family background of peers (Class mean SES) has a positive and significant effect on a pupil's test scores. Increasing the average socioeconomic status of peers by one standard deviation (.52) raises the pupil's test score by one and a half points. Column 2 shows that this effect is not constant across the socioeconomic group of the recipient student. (A formal test of equality of coefficients can be found in the appendix, table 13.) Wealthier pupils are more sensitive to the background of their peers. One implication of this result is that children from more disadvantaged origins ($L_i$, $LM_i$) not only perform worse than wealthier students ($UM_i$, $U_i$) because of their own socioeconomic background, but also lose out in the social interaction game, unable to benefit as much from the positive externalities of having wealthy peers. These results also point to the possibility that when rich and poor pupils interact, the losses incurred by the former are greater than the gains to the latter. All in all, the heterogeneity of the peer-group effect calls into question the validity of the standard, linear-in-means model for estimating the impact of social interactions.

Table 4: Test of heterogenous peer-group effects
Dependent variable: Test score in Spanish

                          All                        Restricted sample
                          (1)          (2)           (3)          (4)
Class mean SES            2.814                      2.133
                          (.254)***                  (.297)***
Class mean SES * Li                    1.246                      .553
                                       (.293)***                  (.333)*
Class mean SES * LMi                   2.923                      2.279
                                       (.263)***                  (.304)***
Class mean SES * UMi                   3.744                      3.108
                                       (.277)***                  (.321)***
Class mean SES * Ui                    5.332                      4.694
                                       (.287)***                  (.329)***
Obs.                      389,513      389,513       333,276      333,276
No. of groups, e(N_g)     7,444        7,444         6,463        6,463
R2                        .068         .069          .066         .068
Test diff coefficients

Note: Standard errors are robust and clustered at the school level. Other variables included in the regressions are the student's number of books at home, gender, repeat, class size and age. Source: ONE, 2000. Argentina.
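As a rough check on the magnitudes quoted in this passage, the full-sample point estimates from Table 4 can be scaled by the one-standard-deviation change in class mean SES (.52). This is arithmetic on the reported coefficients only and ignores sampling error.

```latex
\[
\begin{aligned}
\text{Class mean SES (column 1):} \quad & 2.814 \times 0.52 \approx 1.46 \\
L_i \text{ (column 2):} \quad & 1.246 \times 0.52 \approx 0.65 \\
LM_i \text{ (column 2):} \quad & 2.923 \times 0.52 \approx 1.52 \\
UM_i \text{ (column 2):} \quad & 3.744 \times 0.52 \approx 1.95 \\
U_i \text{ (column 2):} \quad & 5.332 \times 0.52 \approx 2.77
\end{aligned}
\]
```

On these point estimates, the same one-standard-deviation rise in peer background is associated with a gain for upper-group pupils roughly four times the gain for lower-group pupils (2.77 against 0.65 points).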
The next set of results (table 5) estimates equation (3) to examine whether the influence of peers on a student's performance is monotonically increasing and linear. The evidence is consistent with the idea that raising the proportion of wealthier classmates (while reducing that of the lower group) increases the test score of the average pupil. Contrary to what might be expected, the largest effect comes from increasing the presence of children from the upper-middle group, not the upper group. This points to a non-linear relationship between peers' family background and a student's performance at school. The difference between the coefficients on the two upper groups (.090 versus .067) is statistically significant at the 0.6 percent level (see table 15 in the appendix).

Table 5: Class composition effect
Dependent variable: Test scores in Spanish

                          (1)          (2)
Prop peers in LM          .052         .044
                          (.006)***    (.007)***
Prop peers in UM          .091         .082
                          (.008)***    (.009)***
Prop peers in U           .068         .047
                          (.008)***    (.009)***
School fixed effects
Obs.                      389,501      333,267
No. of groups, e(N_g)     7,444        6,463
R2                        .069         .067

Note: Standard errors are robust and clustered at the school level. Other variables included in the regressions are the student's number of books at home, gender, repeat, class size and age. Source: ONE, 2000. Argentina.

At worst, the results indicate that there is no gain from having peers from the wealthiest group. At best, they suggest that it is preferable to integrate students from lower socioeconomic backgrounds with upper-middle class children rather than with those from the top of the distribution. In both cases, the conclusion that peer-group effects are non-monotonic remains. In other words, the specific composition of the class in terms of the socioeconomic background of the students matters for the total effect on test scores.

These coefficients, however, estimate the effect for the average child.
It might very well be that class composition has different effects depending on the parental background of the recipient student. I examine this in the next tables.

Tables 6 and 7 present the estimation of four regressions, equations (4) to (7), one for each socioeconomic group of the recipient student. The baseline category is the proportion of peers from the same socioeconomic group. All the estimated coefficients should be interpreted in reference to this category in each column.15

Table 6: Heterogeneous effects of class composition
Dependent variable: Test score in Spanish

                          Lower        Lower middle   Upper middle   Upper
                          (1)          (2)            (3)            (4)
Prop peers in L                        -.060          -.121          -.115
                                       (.007)***      (.014)***      (.016)***
Prop peers in LM          .049                        -.064          -.052
                          (.008)***                   (.012)***      (.013)***
Prop peers in UM          .079         .025                          .017
                          (.013)***    (.009)***                     (.014)
Prop peers in U           .024         .008           -.049
                          (.014)*      (.009)         (.014)***
Obs.                      125,204      152,756        59,056         52,485
No. of groups, e(N_g)     7,225        7,414          6,973          6,735
R2                        .048         .061           .068           .063

Test LM-UM (p-value): .018

Note: Standard errors are robust and clustered at the school level. Other variables included in the regressions are the student's number of books at home, gender, repeat, class size and age. Source: ONE, 2000. Argentina.

From these tables three features emerge: (1) from the point of view of the two lowest groups, there is little or no benefit from mixing with the wealthiest students; (2) as previously noted, the upper-middle group has the strongest effects on others, including themselves; (3) the upper group is statistically indifferent between being with peers of their own group or of the upper-middle one. The implication of these results is that mixing children from different socioeconomic groups, especially the lower and upper-middle categories, will help narrow the observed gaps in performance, while the effect on the average test score is unclear. These results are reminiscent of the social distance model discussed earlier.
Students at the bottom of the distribution perform better when surrounded by others from higher social groups, while those above perform worse, so that the two approach one another. But the tendency for a child at the bottom to perform better diminishes when the social distance is too large.

15 I also include in the appendix a single regression (Table 17) using interaction terms, which will be used in the simulations in the next section.

Table 7: Heterogeneous effects of class composition. Restricted sample
Dependent variable: Test score in Spanish

                          Lower        Lower middle   Upper middle   Upper
                          (1)          (2)            (3)            (4)
Prop peers in L                        -.055          -.114          -.098
                                       (.009)***      (.017)***      (.019)***
Prop peers in LM          .039                        -.077          -.044
                          (.009)***                   (.014)***      (.015)***
Prop peers in UM          .073         .020                          .020
                          (.015)***    (.011)*                       (.016)
Prop peers in U           -.002        -1.12e-06      -.062
                          (.016)       (.011)         (.016)***
Obs.                      107,523      130,721        50,303         44,720
No. of groups, e(N_g)     6,274        6,444          6,052          5,846
R2                        .047         .06            .068           .06

Test LM-UM (p-value): .021

Note: Standard errors are robust and clustered at the school level. Other variables included in the regressions are the student's number of books at home, gender, repeat, class size and age. Source: ONE 2000. Argentina.

Finally, I ran the previous three regressions using alternative definitions of the dependent variable, the main peer variable, and the type of school attended, to check the robustness of the results. The results are summarized in table 8, with regression results in the appendix (tables 18 to 21).

One could argue that the rules of randomization across classes are more relevant in the context of public (state) schools than in private ones. When registering a new student into a school, public-school principals are not meant to request information other than the date of birth and formal education history of the child. Private institutions are not subject to such regulation.
Indeed, some private establishments require candidates to take tests, to provide information on parents' education and occupation, and to include references. Additional information on a candidate's ability and social background can be used to assign pupils to classes. I therefore run the same analysis exclusively for public schools. All the previous results hold.

A second set of robustness analyses involves using alternative measures of socioeconomic status for the student and peers. First, I use parental education, defined as the average of the two parents when information for both is available. The four socioeconomic groups are defined in relation to levels attained: incomplete primary, complete primary, complete secondary and complete tertiary. For the regression testing the existence of heterogeneous effects, I compute the average years of education.16 For reference, I also include in the tables the results using the books variable, but only for observations for which there is also information on parental education. This helps distinguish differences in results arising from the use of alternative proxies of socioeconomic status from those arising from a smaller sample, given the presence of non-random missing values. The second proxy of socioeconomic status is based on assets and access to services at home. I construct an asset index as the first principal component of a factor analysis. The boundaries of the socioeconomic groups are such that the distribution of students is similar to that derived from the number of books, with 60 percent of pupils in the lower two groups.

Table 8: Robustness Analysis. Rows: Public schools; Parental education; Assets (PCA); Math. Columns: Heterogenous effects (increasing); Class composition (not strictly monotonic; UM largest); Combined effects (L-LM vs U; UM vs U).

Table 8 summarizes the results; details can be found in the Appendix. In all cases, the main conclusions remain.
Crucially, they imply that increasing the degree of segregation of children on economic grounds will harm the test scores of worse-off students while increasing the attainment of richer pupils. In other words, the gap in performance would widen. As mentioned before, the effect on the average test score is unclear. For this, I now turn to simulations.

16 The original variable is expressed in terms of levels attained. I impute years of education using data from the Permanent Household Survey (October 2000). For incomplete level categories, I impute using the median value of the category, by province and gender.

6 Simulations

This last section of the paper uses the estimated coefficients from table 6 to compute, for each student, the total test-score gain attributable to peer effects. The aim is to analyze how average performance and the gap between the poorest and the wealthiest group change as the degree of segregation varies. The change in the mean test score will depend on whether the losses incurred by poor students are compensated by the gains accruing to the rich. Two elements are key: (1) the coefficients on peer variables for each group, and (2) the proportion of students in each socioeconomic group in the whole distribution.

I construct a (fictitious) sample of students with the same proportion of pupils in each category of socioeconomic status as found in Argentina. Each student is then assigned randomly to a school of size equal to the median class size in the country (27). In the present exercise, each school has only one class. For each student, I compute the proportion of peers belonging to each socioeconomic group that results from the random assignment.

The random allocation of pupils determines a specific degree of segregation across schools. I use the same segregation index as in the introduction, defined as the proportion of the total variance in number of books that is 'explained' by the variance between schools.
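The segregation index just defined, the between-school share of total variance in the socioeconomic index, can be sketched in Python as follows. The sketch assumes groups coded 1 to 4, the national shares (32, 39, 15, 14) reported in Table 9, and 100 hypothetical schools of 27 pupils each. It evaluates the index only for a random allocation and a fully sorted one; the function name and setup are illustrative and the paper's own allocation procedure may differ.

```python
import random

def segregation_index(ses, school_size):
    """Between-school sum of squares divided by total sum of squares.
    Consecutive blocks of `school_size` pupils are treated as one school."""
    n = len(ses)
    grand_mean = sum(ses) / n
    total_ss = sum((v - grand_mean) ** 2 for v in ses)
    between_ss = 0.0
    for start in range(0, n, school_size):
        block = ses[start:start + school_size]
        block_mean = sum(block) / len(block)
        between_ss += len(block) * (block_mean - grand_mean) ** 2
    return between_ss / total_ss

# Assumed setup: group codes 1-4 with the national shares from Table 9.
shares = {1: 0.32, 2: 0.39, 3: 0.15, 4: 0.14}
n_schools, school_size = 100, 27
n_students = n_schools * school_size

population = []
for group, share in shares.items():
    population += [group] * round(share * n_students)

# Pupils listed in group order fill schools one group at a time: full segregation.
seg_sorted = segregation_index(population, school_size)

# Random allocation: shuffle pupils before cutting the list into schools.
rng = random.Random(0)
shuffled = population[:]
rng.shuffle(shuffled)
seg_random = segregation_index(shuffled, school_size)
```

With these shares every group fills a whole number of schools, so the sorted allocation gives an index of one, while a random allocation gives a value close to zero.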
This particular degree of segregation will be associated with a certain average gain in test scores from peers, and a gap between the average gain for the wealthiest group (upper) and the average gain for the poorest one (lower). I repeat this exercise, reallocating students to schools randomly and computing once again the degree of segregation, the average gain in test scores and the average gap in the gains. I do this fifty times, enough to obtain a sufficient spread in the segregation index.

The results are presented in figure 3 and table 9. The vertical lines in the graph are used as reference and indicate the lowest and highest degree of segregation found in the Argentine provinces. Because these results depend on the original distribution of socioeconomic characteristics of pupils, I also repeat the same exercise for two other distributions, one corresponding to one of the richest districts in the country, the City of Buenos Aires, and a second for one of the poorest provinces, Santiago del Estero.

Two main features emerge from these graphs. First, as the degree of segregation across schools increases, both the average and the gap in test scores rise. This means that, at least in Argentina, there is a trade-off between efficiency (increasing the mean) and equity (decreasing the gap).
Second, the rate of increase is much larger for the gap than for the average test scores. In other words, the equity aspect is significantly more sensitive to changes in the allocation of students on the basis of socioeconomic status than the mean is. Both these results are consistent with the pattern observed across localities in the introduction. Indeed, peer effects alone can account for at least the sign of the relationship observed in the data.

Table 9: Simulated peer effects on test scores in Spanish

                          Full integration   Actual level   Full segregation
National distribution (32, 39, 15, 14)
  mean L                  1.25               0.73           -1.15
  mean LM                 4.83               5.00           6.45
  mean UM                 6.27               7.44           12.85
  mean U                  4.61               6.67           10.83
  overall mean            3.86               4.21           5.55
  gap U - L               3.4                5.9            12.0
City of Buenos Aires distribution (12, 36, 25, 27)
  overall mean            6.70               6.96           8.26
  gap U - L               5.7                7.2            12.0
Santiago del Estero (48, 34, 10, 8)
  overall mean            2.23               2.53           3.79
  gap U - L               1.9                4.5            12.0

Note: Figures in the table represent the marginal effect of peers' background on pupil test scores at various levels of socioeconomic segregation. The effects are calculated using the parameters from table in the appendix. As an example, the upper-left cell shows that with full integration a pupil from the lowest socioeconomic group will receive a positive test score gain of 1.25 through peer effects, whereas with full segregation she or he will suffer a loss of 1.15 points.

Figure 3: Simulation, using national distribution of books. Panels: National; Santiago del Estero; City of Buenos Aires. Horizontal axis: segregation (0 to 1). Vertical axis: gains in test scores. Series: mean; gap U - L. Note: The vertical lines indicate the degree of segregation found in each case.
The variation across schools (with different human, social and physical resources and management) most likely works in the same direction, strengthening the results. Still, even disregarding the (certainly sizeable) differences across schools, there is a trade-off between the gap and mean test scores that is entirely due to the effects that peers have on other pupils.

7 Conclusions and policy implications

This paper has argued that the standard, linear-in-means approach to estimating peer-group effects does not fully account for the consequences of economic segregation. After relaxing two assumptions of the linear-in-means model, I find, first, that wealthier pupils are more sensitive to peer effects. Second, there is evidence that peer effects are non-monotonic, such that the composition of classmates' backgrounds, rather than simply their mean, matters for test scores. Estimates using a more flexible functional form reveal a pattern that is consistent with the hypothesis of social distance. When the social gap between two individuals becomes too wide, the positive externalities for the poorer individual disappear.

My results show that social interactions matter. While this is not the first paper to find significant peer effects, it adds to a small set of studies that can identify causal peer effects that are not contaminated by endogenous sorting of pupils between schools. The size and significance of these estimates in a developing-country school system have important implications for social policy. Segregation exerts an effect on pupils through peer effects that cannot be overcome by equalizing school funding, pupil-teacher ratios or syllabi. In that sense my results give concrete meaning to the U.S. Supreme Court's dictum in 1954 that separate is inherently unequal.

The results also show that segregation is not a zero sum game, but may have aggregate consequences.
The functional form used in previous analyses has largely ruled out the possibility of aggregate consequences from segregation by assumption. Relaxing these assumptions, I find somewhat troubling results, pointing to an equity-efficiency trade-off, with segregation yielding higher average scores and much higher inequality in scores between rich and poor. Nevertheless, the fact that inequality in achievement between rich and poor pupils appears in the simulation to be much more sensitive than mean scores to variation in socioeconomic segregation may provide a cautionary note when formulating education policies.

References

Akerlof, G. A. (1997). Social distance and social decisions, Econometrica 65(5): 1005–27.
Ammermueller, A. & Pischke, J.-S. (2009). Peer effects in European primary schools: Evidence from the Progress in International Reading Literacy Study, Journal of Labor Economics 27(3): 315–347.
Bedard, K. & Dhuey, E. (2006). The persistence of early childhood maturity: International evidence of long-run age effects, Quarterly Journal of Economics 121(4): 1437–1472.
Brock, W. A. & Durlauf, S. N. (2001). Discrete choice with social interactions, Review of Economic Studies 68(2): 235–260.
Brock, W. A. & Durlauf, S. N. (2002). A multinomial-choice model of neighborhood effects, American Economic Review 92(2): 298–303.
Brown, J. (n.d.). Quitters never win: The (adverse) incentive effects of competing with superstars. Kellogg School of Management, Northwestern University.
Castelló-Climent, A. (2008). On the distribution of education and democracy, Journal of Development Economics 87(2): 179–190.
CIPPEC (2004). Los estados provinciales frente a las brechas socio-educativas. Una sociología política de las desigualdades educativas en las provincias argentinas, Área de Política Educativa, Buenos Aires.
Coleman, J. S. et al. (1966). Equality of educational opportunity, U.S. Government Printing Office, Washington, DC.
Duflo, E., Dupas, P. & Kremer, M. (forthcoming). Peer effects, teacher incentives, and the impact of tracking: Evidence from a randomized evaluation in Kenya, American Economic Review. (also NBER Working Paper No. 14475).
Duflo, E., Kremer, M. & Robinson, J. (forthcoming). Nudging farmers to use fertilizer: Evidence from Kenya, American Economic Review. Unpublished Manuscript, Massachusetts Institute of Technology.
Duflo, E. & Saez, E. (2003). The role of information and social interactions in retirement plan decisions: Evidence from a randomized experiment, Quarterly Journal of Economics 118(3): 815–842.
Etchart, M., Gasparini, L., Bohorquez, P., Curia, J., Ferroni, B. & Hontakly, P. (2004). Las políticas compensatorias mejoran la equidad? Plan social educativo, educación primaria.
Evans, W. N., Oates, W. E. & Schwab, R. M. (1992). Measuring peer group effects: A study of teenage behavior, Journal of Political Economy 100(5): 966–991.
Fiszbein, A. (1999). Institutions, service delivery and social exclusion: A case study of the education service in Buenos Aires. LCSHD Paper Series No. 47, Human Development Department, World Bank.
Galiani, S., Gertler, P. & Schargrodsky, E. (2008). School decentralization: Helping the good get better, but leaving the poor behind, Journal of Public Economics 92(10-11): 2106–2120.
Gaviria, A. & Raphael, S. (2001). School-based peer effects and juvenile behavior, Review of Economics and Statistics 83(2): 257–268.
Ginther, D., Haveman, R. & Wolfe, B. (2000). Neighborhood attributes as determinants of children’s outcomes: How robust are the relationships?, Journal of Human Resources 35(4): 603–42.
Gradstein, M. & Justman, M. (2000). Human capital, social capital, and public schooling, European Economic Review 44(4–6): 879–890.
Gradstein, M. & Justman, M. (2002). Education, social cohesion and economic growth, American Economic Review 92(4): 1192–1204.
Hanushek, E. A., Kain, J. F., Markman, J. M. & Rivkin, S. G. (2003). Does peer ability affect student achievement?, Journal of Applied Econometrics 18(5): 527–44.
Hanushek, E. A. & Woessmann, L. (2008). The role of cognitive skills in economic development, Journal of Economic Literature 46(3): 607–68.
Hoxby, C. (2000). Peer effects in the classroom: Learning from gender and race variation. NBER Working Paper No. 7867.
INDEC (2001). Censo Nacional de Población y Vivienda 2001, Instituto Nacional de Estadísticas y Censos, Buenos Aires, Argentina.
Ioannides, Y. M. & Zabel, J. E. (2003). Neighborhood effects and housing demand, Journal of Applied Econometrics 18(1): 563–584.
Jencks, C. & Mayer, S. (1990). Inner-city poverty in the United States, National Academy Press, Washington, D.C., chapter The social consequences of growing up in a poor neighborhood, pp. 111–186.
Jenkins, S., Micklewright, J. & Schnepf, S. (2008). Social segregation in secondary schools: How does England compare with other countries?, Oxford Review of Education 34(1): 21–38.
Judson, R. (1998). Economic growth and investment in education: How allocation matters, Journal of Economic Growth 3(4): 337–59.
Katz, L., Kling, J. & Liebman, J. (2001). Moving to opportunity in Boston: Early results of a randomized mobility experiment, Quarterly Journal of Economics 116(2): 607–654.
Kingdon, G. (1996). The quality and efficiency of public and private schools: A case study of urban India, Oxford Bulletin of Economics and Statistics 58(1): 55–80.
Kingdon, G. (2006). Public-private-partnership in education in India.
Llach, J. J. (2006). El desafío de la equidad educativa. Diagnóstico y propuestas (with collaboration from Schumacher, F., de Canavese, A., Carratú, M. and Gigaglia, M.), Granica, Buenos Aires.
Lopez, R., Thomas, V. & Wang, Y. (1998). Addressing the education puzzle: The distribution of education and economic reform, Policy Research Working Paper Series 2031, The World Bank.
Manski, C. (1993). Identification of endogenous social effects: The reflection problem, The Review of Economic Studies 60(3): 531–42.
Mayer, S. (2002). How economic segregation affects children’s educational attainment, Social Forces 81(1): 153–176.
McEwan, P. (2003). Peer effects on student achievement: Evidence from Chile, Economics of Education Review 22(2): 131–141.
Miguel, E. & Kremer, M. (2004). Worms: Identifying impacts on education and health in the presence of treatment externalities, Econometrica 72(1): 159–217.
Moffitt, R. (2001). Policy interventions, low-level equilibria, and social interactions, in S. Durlauf & P. Young (eds), Social Dynamics, Brookings Institution Press and MIT Press, Washington, DC and London, England, chapter 3, pp. 45–82.
Rivas, A. (2010a). Monitoreo de la Ley de Financiamiento Educativo. Cuarto informe anual 2010, Fundación CIPPEC, Fundación Luminis, Buenos Aires.
Rivas, A. (2010b). Radiografía de la educación argentina, Fundación CIPPEC, Fundación Arcor, Fundación Roberto Noble, Buenos Aires.
Rivkin, S. G. (2001). Tiebout sorting, aggregation and the estimation of peer group effects, Economics of Education Review 20: 201–209.
Runciman, W. (1966). Relative Deprivation and Social Justice: A Study of Attitudes to Social Inequality in Twentieth-Century England, University of California Press, Berkeley, CA.
Sacerdote, B. (2001). Peer effects with random assignment: Results for Dartmouth roommates, Quarterly Journal of Economics 116(2): 681–704.
Soetevent, A. R. (2006). Empirics of the identification of social interactions: An evaluation of the approaches and their results, Journal of Economic Surveys 20(2).
Veleda, C. (2003). Mercados educativos y segregación social. Las clases medias y la elección de la escuela en el Conurbano Bonaerense. Working Paper 1, Centro de Implementación de Políticas Públicas para la Equidad y el Crecimiento (CIPPEC).
Veleda, C. (2005). Efectos segregatorios de la oferta educativa. El caso del Conurbano Bonaerense. Working Paper 5, Centro de Implementación de Políticas Públicas para la Equidad y el Crecimiento (CIPPEC).
Vigdor, J. L. & Nechyba, T. (2004). Peer effects in elementary school: Learning from apparent random assignment. Unpublished manuscript, Department of Economics, Duke University.
Vigdor, J. L. & Nechyba, T. (2006). Schools and the equal opportunities problem, MIT Press, Cambridge, MA, chapter Peer effects in North Carolina public schools.
Wilson, W. J. (1997). The Truly Disadvantaged: The Inner City, the Underclass, and Public Policy, University of Chicago Press, Chicago.

Appendix

Table 10: Segregation index

                          Obs       Mean    Std. Dev.      Min        Max
By provinces
  Segregation index        24      .2384      .0389       .1435      .3261
  Number of students       24     25,579     33,453       2,417    154,542
  Number of schools        24      594.6      584.4          39      2,378
  Number of localities     24      149.6      107.6           3        391
By localities
  Segregation index     1,308      .1656      .1211           0        .84
  Number of students    3,591      169.4      682.4           2     20,707
  Number of schools     3,591        3.9       11.2           1        352

Note: Segregation index is computed for each province (locality) as the variance of test scores between schools over the total variance in test scores in the province (locality). Source: ONE, 2000. Argentina

Table 11: Summary statistics, by type of school.

                                           Total                       Public                      Private
                                    obs.     mean  st. dev      obs.     mean  st. dev      obs.     mean  st. dev
Parental education (years)        300,477   10.72    4.40     231,050   10.13    4.33      69,427   12.68    4.04
Spanish test score                300,477   63.28   19.25     231,050   60.27   19.00      69,427   73.27   16.49
Maths test scores                 288,404   59.81   20.40     220,643   56.98   20.26      67,761   69.03   17.98
Assets index                      294,678   26.53   13.51     225,967   24.67   13.38      68,711   32.66   12.05
Male                              300,477    0.49    0.50     231,050    0.50    0.50      69,427    0.48    0.50
Repeat                            300,477    0.17    0.38     231,050    0.21    0.41      69,427    0.04    0.21
Private school                    300,477    0.23    0.42     231,050    0.00    0.00      69,427    1.00    0.00
Class size                        300,477   26.66    5.64     231,050   26.03    5.40      69,427   28.75    5.91
Peer parents’ education (years)   300,477   10.72    2.31     231,050   10.13    1.97      69,427   12.68    2.30

Source: ONE, 2000. Argentina

Table 12: School fixed effects

Dependent variable:            Full sample              Restricted sample
Spanish test scores          (1)          (2)           (3)          (4)
Class mean SES             11.053        2.893        11.277        2.178
                          (.061)∗∗∗    (.256)∗∗∗    (.159)∗∗∗    (.298)∗∗∗
Own SES                     1.651        1.330         1.664        1.306
                          (.031)∗∗∗    (.032)∗∗∗    (.033)∗∗∗    (.035)∗∗∗
Obs.                      389,513      389,513       333,276      333,276
e(N-g)                                   7,444                      6,463
R2                           .201         .065          .204         .064

Note: Standard errors are robust and clustered at the school level. Other variables included in the regressions are the student’s number of books at home, gender, repeat, class size and age. Source: ONE, 2000. Argentina

Table 13: Test of equality of coefficients, heterogenous effects (from Table 4, column (2))

                  p-value                              F(2, 7468)
         L       LM      UM      U            L        LM       UM       U
L        .      0.000   0.000   0.000         .       50.98    92.95   106.99
LM     0.000     .      0.000   0.000       50.98      .       21.13    42.49
UM     0.000    0.000    .      0.000       92.95    21.13      .       13.50
U      0.000    0.000   0.000    .         106.99    42.49    13.50      .

Table 14: Test of equality of coefficients, heterogenous effects (from Table 4, column (4))

                  p-value                              F(2, 7468)
         L       LM      UM      U            L        LM       UM       U
L        .      0.000   0.000   0.000         .       87.50   118.98   273.56
LM     0.000     .      0.000   0.000       87.50      .       22.53   154.21
UM     0.000    0.000    .      0.000      118.98    22.53      .       62.57
U      0.000    0.000   0.000    .         273.56   154.21    62.57      .
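The note to Table 10 defines the segregation index as the between-school variance of test scores over the total variance. A minimal sketch of that ratio follows. It assumes pupil-weighted population variances (the paper does not state the weighting), and the function name `segregation_index` and the toy scores are hypothetical.

```python
def segregation_index(scores_by_school):
    """Between-school variance of scores divided by total variance.

    scores_by_school maps a school identifier to a list of pupil scores.
    Assumes pupil-weighted population variances (divide by N, not N - 1).
    """
    all_scores = [s for school in scores_by_school.values() for s in school]
    n = len(all_scores)
    grand_mean = sum(all_scores) / n
    total_var = sum((s - grand_mean) ** 2 for s in all_scores) / n
    if total_var == 0:
        return 0.0  # no variation at all, so nothing to attribute to schools
    between_var = sum(
        len(school) * (sum(school) / len(school) - grand_mean) ** 2
        for school in scores_by_school.values()
    ) / n
    return between_var / total_var


# Toy localities: identical schools, fully sorted schools, and a mixed case.
identical = {"a": [50, 70], "b": [50, 70]}
sorted_schools = {"a": [50, 50], "b": [70, 70]}
mixed = {"a": [40, 60], "b": [60, 80]}

for name, data in [("identical", identical), ("sorted", sorted_schools), ("mixed", mixed)]:
    print(name, segregation_index(data))
```

With this definition the index is 0 when all schools share the same mean score and 1 when all variation lies between schools, which matches the range of the locality-level values reported in Table 10 (0 to .84).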
Table 15: Test of equality of coefficients, composition effects (from Table 5, column (2))

                      p-value                          F(2, 7468)
            Prop LM   Prop UM   Prop U       Prop LM   Prop UM   Prop U
Prop LM        .       0.000     0.039          .       26.66     4.26
Prop UM      0.000      .        0.008        26.66      .        7.09
Prop U       0.039     0.008      .            4.26     7.09       .

Table 16: Heterogeneous effects of class compositions. One regression

                                All        Restricted sample
                                (1)              (2)
Li*(Prop peers in L)           -.010            -.008
                              (.003)∗∗∗        (.003)∗∗∗
Li*(Prop peers in LM)           .024             .014
                              (.006)∗∗∗        (.007)∗
Li*(Prop peers in UM)           .052             .052
                              (.010)∗∗∗        (.012)∗∗∗
Li*(Prop peers in U)            .011            -.012
                              (.011)           (.012)
LMi*(Prop peers in L)
LMi*(Prop peers in LM)          .067             .059
                              (.007)∗∗∗        (.008)∗∗∗
LMi*(Prop peers in UM)          .093             .080
                              (.008)∗∗∗        (.010)∗∗∗
LMi*(Prop peers in U)           .073             .057
                              (.008)∗∗∗        (.010)∗∗∗
UMi*(Prop peers in L)           .005             .007
                              (.004)           (.004)
UMi*(Prop peers in LM)          .077             .068
                              (.007)∗∗∗        (.008)∗∗∗
UMi*(Prop peers in UM)          .132             .125
                              (.009)∗∗∗        (.011)∗∗∗
UMi*(Prop peers in U)           .095             .075
                              (.009)∗∗∗        (.010)∗∗∗
Ui*(Prop peers in L)           -.034            -.035
                              (.004)∗∗∗        (.005)∗∗∗
Ui*(Prop peers in LM)           .065             .059
                              (.007)∗∗∗        (.008)∗∗∗
Ui*(Prop peers in UM)           .123             .111
                              (.009)∗∗∗        (.011)∗∗∗
Ui*(Prop peers in U)            .110             .090
                              (.009)∗∗∗        (.010)∗∗∗
Obs.                          389,501          333,267
e(N-g)                          7,444            6,463
R2                               .07             .068

Source: ONE, 2000. Argentina. Standard errors are robust and clustered at the school level. Other variables included in the regressions are number of books at home, gender, repeat, class size and age.

Table 17: School fixed effects: Alternative specifications

Dependent variable:       Public      Public      Math        Math        PEdu        PEdu        Assets      Assets
Spanish test scores        (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)
Class mean SES            9.728       3.265      10.787       2.529       2.206        .208       4.347       1.130
                        (.209)∗∗∗   (.293)∗∗∗   (.179)∗∗∗   (.298)∗∗∗   (.016)∗∗∗   (.055)∗∗∗   (.031)∗∗∗   (.136)∗∗∗
SES                       1.692       1.422       1.951       1.615        .149        .070       1.464        .883
                        (.037)∗∗∗   (.037)∗∗∗   (.033)∗∗∗   (.034)∗∗∗   (.008)∗∗∗   (.009)∗∗∗   (.038)∗∗∗   (.039)∗∗∗
school fixed effects
Obs.                    298,405     298,405     395,906     395,906     288,306     288,306     296,429     296,429
e(N-g)                                5,780                   7,444                   7,442                   7,460
R2                         .14        .068        .163        .042        .171        .064        .204        .063

Note: ‘Public’ uses number of books and only public schools. ‘PEdu’ uses parental education (in years). ‘Assets’ uses an index constructed as the first principal component of a set of variables on durable assets at home. Standard errors are robust and clustered at the school level. Other variables included in the regressions are student’s own group, gender, repeat, class size and age.

Table 18: Heterogenous effects: Alternative specifications

Column labels, (1) to (8): No miss p.edu, Public, Public, Math, PEdu, PEdu, Assets, Assets. Standard errors in parentheses follow each coefficient. Coefficient rows report fewer estimates than there are columns.

Class mean SES:        3.214 (.289)∗∗∗, .216 (.055)∗∗∗, 1.125 (.136)∗∗∗
Class mean SES * Li:   1.342 (.314)∗∗∗, 1.373 (.334)∗∗∗, .320 (.342), -.419 (.074)∗∗∗, .605 (.150)∗∗∗
Class mean SES * LMi:  3.124 (.280)∗∗∗, 3.287 (.303)∗∗∗, 2.630 (.305)∗∗∗, -.059 (.059), 1.251 (.142)∗∗∗
Class mean SES * UMi:  3.781 (.303)∗∗∗, 4.777 (.341)∗∗∗, 3.823 (.319)∗∗∗, .507 (.059)∗∗∗, 1.620 (.145)∗∗∗
Class mean SES * Ui:   5.460 (.310)∗∗∗, 6.769 (.356)∗∗∗, 5.619 (.324)∗∗∗, .961 (.063)∗∗∗, 1.991 (.162)∗∗∗

              (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
Obs.       279,531    348,361    298,405    395,906    288,306    288,306    296,429    296,429
e(N-g)       7,439     10,603      5,780      7,444      7,442      7,442      7,460      7,460
R2            .074       .071       .073       .046       .065       .067       .063       .064

Note: ‘Public’ uses number of books and only public schools. ‘PEdu’ uses parental education (in years). ‘Assets’ uses an index constructed as the first principal component of a set of variables on durable assets at home. Standard errors are robust and clustered at the school level. Other variables included in the regressions are student’s own group, gender, repeat, class size and age.
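Tables 12, 17 and 18 report school fixed-effects regressions of test scores on own SES and class mean SES. The sketch below illustrates the within (fixed-effects) transformation that such a regression relies on: each variable is demeaned by school, and OLS is run on the demeaned data. It is a simplified illustration with two regressors and no additional controls, it does not compute the clustered standard errors reported in the tables, and the function name `within_estimator` and the toy data are hypothetical.

```python
from collections import defaultdict


def within_estimator(rows):
    """OLS of y on (x1, x2) after removing school means from every variable.

    rows: iterable of (school, y, x1, x2). Returns the slopes (b1, b2).
    """
    by_school = defaultdict(list)
    for school, y, x1, x2 in rows:
        by_school[school].append((y, x1, x2))

    # Within transformation: subtract the school mean of each variable.
    demeaned = []
    for obs in by_school.values():
        n = len(obs)
        means = [sum(col) / n for col in zip(*obs)]
        demeaned.extend(tuple(v - m for v, m in zip(o, means)) for o in obs)

    # Solve the 2x2 normal equations on the demeaned data.
    s11 = sum(x1 * x1 for _, x1, _ in demeaned)
    s22 = sum(x2 * x2 for _, _, x2 in demeaned)
    s12 = sum(x1 * x2 for _, x1, x2 in demeaned)
    s1y = sum(x1 * y for y, x1, _ in demeaned)
    s2y = sum(x2 * y for y, _, x2 in demeaned)
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Toy data: two schools with different intercepts and two classes each.
# Scores are generated with made-up slopes 1.5 (own SES) and 3.0 (class mean SES).
design = {
    "A": (10.0, [(1, 2.0), (2, 2.0), (3, 3.0), (1, 3.0)]),
    "B": (-5.0, [(0, 1.0), (2, 1.0), (1, 2.5), (3, 2.5)]),
}
rows = [
    (school, effect + 1.5 * own + 3.0 * class_mean, own, class_mean)
    for school, (effect, pupils) in design.items()
    for own, class_mean in pupils
]
b_own, b_peer = within_estimator(rows)
print(round(b_own, 6), round(b_peer, 6))
```

Because school intercepts are swept out by the demeaning step, the peer coefficient is identified only from variation in class composition between classes of the same school, which is the variation the paper's identification strategy relies on.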
Table 19: Class composition effect: Alternative specifications

                      No miss p.edu    Public        Math         PEdu        Assets
                           (1)           (2)          (3)          (4)          (5)
Prop peers in LM           .054          .056         .042         .027         .027
                         (.006)∗∗∗     (.007)∗∗∗    (.007)∗∗∗    (.008)∗∗∗    (.006)∗∗∗
Prop peers in UM           .094          .096         .073         .043         .055
                         (.008)∗∗∗     (.009)∗∗∗    (.009)∗∗∗    (.008)∗∗∗    (.007)∗∗∗
Prop peers in U            .073          .071         .062         .031         .063
                         (.008)∗∗∗     (.010)∗∗∗    (.009)∗∗∗    (.009)∗∗∗    (.008)∗∗∗
Obs.                     279,531       298,394      395,892      288,288      296,389
e(N-g)                     7,439         5,780        7,444        7,441        7,456
R2                          .073          .072         .045         .065         .063

Note: ‘Public’ uses number of books and only public schools. ‘PEdu’ uses average parental education and is grouped as incomplete primary (lower), complete primary (lower middle), complete secondary (upper middle) and complete tertiary (upper). ‘Assets’ uses an index constructed as the first principal component of a set of variables on durable assets at home and is grouped so that the proportion of pupils in each group is approximately similar to that in the number of books. Standard errors are robust and clustered at the school level. Other variables included in the regressions are student’s own group, gender, repeat, class size and age.

Table 20: Heterogeneous effects of class compositions: Parental education

Columns: Lower, Lower middle, Upper middle, Upper. Standard errors in parentheses follow each coefficient. Each coefficient row reports three estimates.

Prop peers in L:   -.028 (.010)∗∗∗, -.054 (.011)∗∗∗, -.063 (.017)∗∗∗
Prop peers in LM:  .017 (.014), -.021 (.008)∗∗, -.041 (.011)∗∗∗
Prop peers in UM:  .017 (.014), .010 (.007), .007 (.011)
Prop peers in U:   -.010 (.018), -.021 (.009)∗∗, -.005 (.009)

            Lower    Lower middle    Upper middle    Upper
Obs.       40,039      108,228          86,163       53,858
e(N-g)      6,530        7,318           7,381        6,932
R2           .049         .064            .069         .064

Note: Parental education is the average of both parents when present, and is grouped as incomplete primary (lower), complete primary (lower middle), complete secondary (upper middle) and complete tertiary (upper). Standard errors are robust and clustered at the school level. Other variables included in the regressions are student’s own group, gender, repeat, class size and age.
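The composition regressors in Tables 19 and 20 ("Prop peers in L", "LM", "UM", "U") are shares of classmates in each socioeconomic group. The sketch below shows one way such regressors could be built for a single class. It assumes that the pupil's own observation is excluded (a leave-one-out share) and that shares are expressed as fractions between 0 and 1. Neither choice is confirmed by the tables, and the function name `peer_proportions` is hypothetical.

```python
from collections import Counter

GROUPS = ("L", "LM", "UM", "U")


def peer_proportions(class_groups):
    """For each pupil, the share of classmates (excluding the pupil) in each group.

    class_groups: list of group labels, one per pupil in the class.
    Returns one dict per pupil, in the same order as the input.
    """
    n_peers = len(class_groups) - 1
    if n_peers < 1:
        raise ValueError("a class needs at least two pupils to define peers")
    counts = Counter(class_groups)
    return [
        # Subtract the pupil's own membership from the count of their own group.
        {g: (counts[g] - (g == own)) / n_peers for g in GROUPS}
        for own in class_groups
    ]


props = peer_proportions(["L", "L", "LM", "U"])
for own, shares in zip(["L", "L", "LM", "U"], props):
    print(own, shares)
```

Since the four shares sum to one for every pupil, a regression can include at most three of them alongside a constant or fixed effect, which is consistent with one category being left out of each specification in the tables.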
Table 21: Heterogeneous effects of class compositions: Assets’ index (PCA)

Columns: Lower, Lower middle, Upper middle, Upper. Standard errors in parentheses follow each coefficient. Each coefficient row reports three estimates.

Prop peers in L:   -.035 (.008)∗∗∗, -.077 (.010)∗∗∗, -.088 (.020)∗∗∗
Prop peers in LM:  .011 (.009), -.038 (.009)∗∗∗, -.011 (.013)
Prop peers in UM:  .025 (.011)∗∗, .029 (.009)∗∗∗, -.003 (.011)
Prop peers in U:   .037 (.017)∗∗, .048 (.011)∗∗∗, -.002 (.010)

            Lower    Lower middle    Upper middle    Upper
Obs.       75,419       93,332          79,847       47,791
e(N-g)      6,716        7,252           7,128        5,723
R2           .048         .066            .068         .064

Note: ‘Assets’ uses an index constructed as the first principal component of a set of variables on durable assets at home, and is grouped so that the proportion of pupils in each group is approximately similar to that in the number of books. Standard errors are robust and clustered at the school level. Other variables included in the regressions are student’s own group, gender, repeat, class size and age.
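The note to Table 21 says the assets index is grouped so that group sizes roughly match those of the books-based classification. One way to do this is to rank pupils by the index and cut at cumulative target shares, as sketched below. The target shares (32, 39, 15, 14) are taken from the national distribution in Table 9 and are assumed here to be the books-based group shares. The function name `assign_groups`, the rounding rule and the tie-breaking by input order are illustrative choices, not the paper's procedure.

```python
def assign_groups(index_values, shares=(0.32, 0.39, 0.15, 0.14),
                  labels=("L", "LM", "UM", "U")):
    """Rank pupils by an index and cut at cumulative shares.

    Returns one label per pupil, in input order. Ties in the index are
    broken by input position (sorted() is stable).
    """
    n = len(index_values)
    order = sorted(range(n), key=lambda i: index_values[i])

    # Cumulative cut-off ranks, rounded to whole pupils.
    cutoffs, cumulative = [], 0.0
    for share in shares:
        cumulative += share
        cutoffs.append(round(cumulative * n))
    cutoffs[-1] = n  # guard against rounding leaving anyone unassigned

    groups = [None] * n
    k = 0
    for rank, i in enumerate(order):
        while rank >= cutoffs[k]:
            k += 1
        groups[i] = labels[k]
    return groups


# Toy index: 100 pupils with distinct index values, listed in descending order.
toy_index = list(range(99, -1, -1))
toy_groups = assign_groups(toy_index)
print({label: toy_groups.count(label) for label in ("L", "LM", "UM", "U")})
```

With real data, ties in the index and the discreteness of asset counts would make the match to the target shares only approximate, in line with the wording of the table note.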