Michael Bamberger



Many international development agencies and some national governments base future budget planning and
policy decisions on a systematic assessment of the projects and programs in which they have already invested.
Results are assessed through Mid-Term Reviews (MTRs), Implementation Completion Reports (ICRs), or
through more rigorous impact evaluations (IE), all of which require the collection of baseline data before the
project or program begins. The baseline is compared with the MTR, ICR, or the posttest IE measurement to
estimate changes in the indicators used to measure performance, outcomes, or impacts. However, it is often the
case that a baseline study is not conducted, seriously limiting the possibility of producing a rigorous assessment
of project outcomes and impacts. This note¹ discusses the reasons why baseline studies are often not conducted,
even when they are included in the project design and funds have been approved, and describes strategies that
can be used to “reconstruct” baseline data at a later stage in the project or program cycle.



Baseline data can come from the project’s monitoring and evaluation (M&E) system, rapid assessment
studies, surveys commissioned at the start and end of the project, or from secondary data sources.
Whatever the source, the availability of appropriate baseline data is always critical for performance
evaluation, as it is impossible to measure changes without reliable data on the situation before the
intervention began. Despite the importance of collecting good baseline data, there are a number of
reasons why they are frequently not collected, and the purpose of this paper is to present a range of
strategies that can be used for “reconstructing” baseline data when they are not available.
    The strategies for reconstructing baseline data apply to both discrete projects and broader
programs (the term “interventions” is used here to cover both), although they must sometimes be
adapted to the special characteristics of each. Projects often introduce new M&E systems customized
to the project’s specific data needs, but often with significant start-up delays, which can be
problematic for collecting baseline data. In contrast, ongoing programs can often build on existing
M&E and other data collection systems as well as have access to secondary data and sampling frames,
although these systems are often not sufficient for the purposes of evaluation and tend to be
difficult to change. Nongovernmental organizations (NGOs), important development players in many
countries, may face different issues with respect to baseline data for their activities.

Although most interventions plan to collect baseline data for results monitoring and possibly impact
evaluation, often data are not collected or collection is delayed until the intervention has been
underway for some time. The reasons may
include a lack of awareness of the importance of baseline data, a lack of financial resources, or
limited technical expertise. Even when management recognizes its importance, administrative
procedures (for example, recruiting and training M&E staff, purchasing computers, or commissioning
consultants) may create long delays before baseline data can be collected.

M&E systems collect baseline information on indicators for measuring program outputs and outcomes
for the target population. Impact evaluations collect similar information, but from both
beneficiaries and a comparison group. Information is also collected on the social and economic
characteristics of individuals, groups or communities; on contextual factors such as local economic
conditions; and on political and organizational factors that might explain variations in outcomes
and impacts among different project locations.
    The World Bank and other development agencies incorporate this information into a Results-Based
Monitoring and Evaluation System (RBME). Kusek and Rist (2004) describe a 10-step system for
implementing RBME, 3 of which involve the creation of a baseline:
Step 2: Agreeing on the outcomes to monitor and evaluate
Step 3: Selecting key indicators to monitor outcomes and performance
Step 4: Collecting baseline data

This section presents some practical strategies for estimating (“reconstructing”) conditions of the
project, and sometimes also the comparison group, at the time the intervention is launched. Most of
these are economical, relatively simple to apply, and do not require too great an investment of time.

Timing of the baseline
Evaluations often implicitly assume that an intervention only starts to produce impacts after it
officially begins, but, in fact, changes may occur long before this. For example, once it is known
that roads, water supply, or other services are to be provided to certain communities, speculators
may begin to buy land and families may start to make improvements to their property. If the baseline
is not conducted until the official program launch, many of these important changes may not be
captured. Using techniques such as recall or key informant interviews to capture information on
these early changes should be considered.

Using secondary data to reconstruct the baseline
There are many documentary sources that may provide information on the beneficiary population or
comparison groups around the time the intervention began. Censuses covering areas such as
population, agriculture, industry, education, and environment may be available. Other useful sources
are household socioeconomic surveys, the largest of which are the Living Standards Measurement
Surveys (LSMS), which have been conducted in at least 35 countries. When surveys are repeated
periodically, it may be possible to find a reference point close to the intervention launch date.
However, while many surveys have a large enough sample to generate a comparison group, the samples
are often too small or do not contain sufficiently detailed information to generate a sample of the
beneficiary population (particularly when this population is relatively small).
    Ministries of education, health and agriculture, among others, publish annual reports that can
provide baseline reference data, and they can sometimes provide information on particular schools,
health centers, or other facilities in the target areas. Donor agencies, NGOs, and universities also
conduct studies providing useful reference data. Birth and death certificates can be used to examine
life expectancy, family size and common causes of death, while legal documents relating to marriage
and divorce can provide information on, for example, the property rights of women. Mass media also
provide information on issues concerning local schools, clinics, public transport, and so forth that
can provide background information on conditions at the start of the intervention. Box 1 presents
two examples where secondary data were used to reconstruct baseline data for matched project and
comparison groups using propensity score matching.
    There are a number of factors affecting the utility and validity of secondary data sources: the
data cover the wrong reference period; key information is missing; information was not collected
from the right people (for example, only the household head was interviewed); the sample does not
cover the whole population of interest or is too small; or the information is not reliable or
complete. These factors must always be assessed before utilizing any of these sources.

Using administrative data from the intervention
Many interventions collect monitoring and other kinds of administrative data that could be used to
estimate baseline conditions for the target population (box 2). Examples include socioeconomic data
included in the application forms of people, communities, or organizations applying to participate
or receive benefits; planning and feasibility studies; monitoring reports; and administrative
records providing information such as changes in project eligibility criteria or the services
provided to particular beneficiaries. Sometimes the application forms for people not accepted can
provide a comparison group of nonparticipants.
    While administrative data are a potentially valuable source of baseline data, the data are often
not available in a convenient format for analysis. Often the evaluator must work closely with
program staff to ensure that administrative data are collected and filed in a usable format
(discussed further later in this note). Often when the evaluator discovers that the expected
administrative records have vanished or are not organized in a usable format, staff respond “No one
told us that this information would be required for a future evaluation.” Better coordination
between the evaluators and the program staff might have ensured the information would be available.

Recall
Recall techniques ask individuals or groups to provide information on their social or economic
conditions, their access to services, or the conditions of their community at a particular point in
time (for example, project launch) or over a particular period of time. Recall is used in poverty
analysis,
demography, and income expenditure surveys (Deaton and Grosh 2000) to elicit information on behavior
(for example, contraceptive usage or fertility) or economic status (household income or
expenditure). Several comparative studies (for example, Deaton and Grosh [2002]; Belli, Stafford,
and Alwin [2009]) have concluded that recall, when carefully designed and implemented, can be a
useful estimating tool with predictable and, to some extent, controllable errors, and a potentially
valuable way to reconstruct baseline data.
    Recall can be applied through questions in surveys and individual or group interviews (box 3).
In addition to collecting numerical data such as income or farm prices, recall can also be used to
obtain estimates of major changes in the welfare conditions of the household, such as which children
attended a school outside the village before the village school opened and the travel time and costs
of getting there. Families can also provide information on questions such as access to health
facilities and where they previously obtained water and how much it cost.
    Recall always involves a risk of bias due to memory or distortion. Unintentional distortion
occurs when, for example, people romanticize the past (“when I was young there was much less crime
in the community”) or unintentionally adjust their response to what they think the researcher wants
to hear. Intentional distortion occurs when, for example, families are reluctant to admit their
children had not been attending school, or they might underestimate how much they spend on water to
convince planners they are too poor to pay the water charges proposed in a new project. The
reliability of recall data also depends on the nature of the outcome variable being studied. For
example, families will usually be able to recall major events such as a death in the family or
enrollment of a child in school, but it may be more difficult to obtain reliable responses on
nutrition questions or changes in the frequency of diarrhea or other very common ailments.
    A challenge in using recall is the absence of studies providing guidelines for estimating or
adjusting for systematic bias. The most detailed research on this question was conducted on the
recall of expenditures in national household income and expenditure surveys and studies on
fertility. The income and expenditure studies identified some consistent biases that can be used to
adjust estimates: “telescoping,” that is, reporting major expenditures as being more recent than
they actually were, and underestimating small expenditures. Also, men and the better off are more
likely to report they have been sick than are women and poorer people. Other areas where research on
the validity and reliability of recall is available include: substance abuse, adolescent health
research, assessment of stressful events, and time use. Belli, Stafford, and Alwin (2009) report
that the reliability of recall is significantly enhanced when using the calendar method of life
course research (in which topics of interest are linked to critical events in the life course of the
subject: birth, death, marriage, enrollment in school, and changing employment) compared to
conventional recall questions in a structured questionnaire.
    Recall can sometimes provide better self-assessment estimates of behavioral changes and
knowledge (for example, child care and nutrition,
leadership skills) than pre- and post-test comparisons. People often overestimate their behavioral
skills or knowledge before entering a program because they do not understand the tasks being studied
or the required skills. After completing the program, they may have a better understanding of these
behaviors and provide a better assessment of their previous level of competency or knowledge and how
much these have changed (Pratt, McGuigan, and Katzeva 2000).

Key informants
Key informants (box 4) can provide knowledge and experience on a particular agency and the
population it serves, an organization (such as a trade union, women’s group or a gang), or group
(such as mothers with young children, sex workers, or landless farmers). For example, when
evaluating a program to increase secondary school enrollment, key informants could include: school
directors, teachers and other school personnel, parents of children who do and do not attend school,
students, and religious leaders.
    Key informants combine “factual” information with a particular point of view, and it is
important to select informants with differing perspectives. For example, low-income and
higher-income parents may have different opinions on programs to increase school enrollment, as may
those from different ethnic or religious groups.

Group interview techniques for reconstructing baseline data
Focus groups are used in market research and program evaluation to obtain information on
socioeconomic characteristics, attitudes, and behaviors of groups that share common attributes
(Krueger and Casey 2000). Groups, usually with five to eight persons per group, are selected to
cover different economic strata, as well as people who have and have not participated in the project
or who received different services. The group moderator goes systematically through a checklist of
questions making sure each person responds to every question. For the purposes of reconstructing
baseline data, participants could be asked to provide information on, for example, conditions of
their household, group, community, or agricultural production at some point in the past. When
properly designed and implemented, focus groups ensure that all key sectors are sampled and that
responses provide a representative snapshot of each group. However, readers of evaluation reports
should be aware that focus groups are often used in development evaluation as a fast and economical
way to obtain general information on the opinions of the target population with very little
attention to participant selection or ensuring balanced participation in the discussion. Market
research companies make extensive use of focus groups, developing sampling frames to select samples
with the socioeconomic characteristics required by different clients. If funds are available,
contracting a market research company to design and implement focus groups for a program evaluation
could be considered.
    Participatory assessment techniques (PRAs; the abbreviation originally meant “participatory
rural appraisal” and is now used as a generic term) cover all participatory studies in which
communities or groups report on their conditions, problems, and changes over time. Groups can
provide estimates on things such as the volume and quality of water, crop production and sales,
travel time and costs, and time use. PRAs are widely used with poor rural and urban communities with
low literacy levels or where participants have difficulties in expressing complex ideas (such as
changes in environmental conditions). PRAs include construction of charts, maps, or tables where the
group agrees on the placement of familiar objects, such as stones or seeds, on a chart to illustrate
trends, important events, magnitude, or causal patterns. Timelines, trend analysis, historical
transects, seasonal diagrams, and daily activity schedules can be used to assess changes over time
or the situation at the baseline reference point (Kumar 2002).
    These PRAs have several benefits. Respondents may feel more comfortable expressing themselves in
a group with their peers, rather than in a one-on-one interview with an outside researcher. The
group consensus can also provide a cost-effective way to obtain an approximate estimate of average
travel time, volume and quality of water consumed, volume of agricultural production, and average
crop prices rather than having to use a sample survey. Synergistic group interaction also generates
new ideas that might not have come up in one-on-one interviews. There are also potential risks: the
group may be dominated by a few vocal people; participants may defer to politically powerful,
wealthier, or more educated group members; or the group facilitator may inadvertently direct the
group toward certain decisions.

M&E systems often take some time to get established, so there may be a period at the start of the
intervention when monitoring data are not being collected. So when setting up the RBME, a first step
should be to check: What are the key indicators on which baseline data are required? Which
indicators are available and which are missing? Why are the data missing and how easily can the
problems be overcome? Is any important information not being collected during the interim period
before the monitoring system becomes fully operational? All of the techniques for reconstructing
baseline data can be applied to filling in RBM baseline data gaps.
    RBME systems are usually based on a program theory model that includes: how the program is
intended to achieve its objectives, implementation and outcome indicators that should be measured,
key assumptions to be tested, and the time horizon over which different results are to be achieved
(Bamberger, Rugh, and Mabry 2006, chapter 9). Often the program theory model was not in fact defined
or fully articulated at the start of the project. In these cases, the evaluator may need to work
with the implementing agency and other stakeholders to reconstruct the implicit program theory on
which the program is based. Sometimes there is agreement among staff concerning the underlying
theory model and all that is needed may be a short workshop to put this on paper. However, in other
cases, staff may have difficulty articulating the model or there may be disagreements concerning the
purpose of the program, how it will achieve its outcomes, and the critical assumptions on which it
is based.²

There is a wide variety of evaluation designs for estimating project impacts and effects ranging
from strong statistical designs with before-and-after comparisons of project and comparison groups,
to statistically weaker quasi-experimental designs that may not include baseline data on the
comparison or project groups, and nonexperimental designs that do not include a comparison group.
Different baseline reconstruction strategies can be applied to different evaluation designs. For the
weaker quasi-experimental designs and nonexperimental designs where no baseline data have been
collected for the project and/or the comparison group, all of the baseline reconstruction techniques
discussed earlier could be considered. On the other hand, the stronger quasi-experimental and the
experimental designs all include baseline data for both project and control groups. However, in most
cases only quantitative data are collected (for example, the number of students enrolled in school
or patients visiting health centers), and the design would be strengthened by complementing this
with qualitative data such as the quality of services, women’s participation in household decision
making at the time the project began, and how different ethnic groups were received when they
visited health clinics.
    Quantitative and qualitative evaluations rely on different types of data and data collection
procedures. When quantitative researchers collect primary data to reconstruct baselines, they are
likely to incorporate recall questions into a structured questionnaire. In contrast, qualitative
researchers use a wider range of
techniques, including key informants, in-depth individual interviews, focus groups, and PRAs. Both
quantitative and qualitative research designs can benefit from incorporating mixed-method approaches
to baseline reconstruction so as to combine depth of understanding with generalizability of the
findings (Bamberger, Rao, and Woolcock 2010).

Selecting a well-matched baseline comparison group presents special challenges. Participant
selection procedures often result in project participants having special attributes that affect, and
frequently increase, the probability of successful program outcomes. Often these attributes, termed
“unobservables” or “omitted variables,” are not included in the baseline surveys. For example, in a
microcredit program for women, many of the women who are successful in starting or expanding small
businesses might come from households where they have more control over household decision making
than is normally the case in their community, or they may have previous experiences with a small
business. These characteristics might affect project outcomes, but this information will usually not
be included in the baseline data. The following methods could be used to assess the importance of
these omitted variables: key informant interviews (for example, staff of microcredit and other
economic development programs); administrative data from the loan programs; focus groups with women
participants and nonparticipants; in-depth interviews with participants and nonparticipants; and
PRAs.

Even when an agency is strongly committed to setting up an M&E system to generate the baseline data
required for results-based management and impact evaluation, there are often other pressing
staffing, organizational and financial matters, so there will often be considerable delays before
the M&E systems are operational. The following are measures that can be taken to increase the
likelihood that the M&E systems will be in place from the time of program launch:
• Define funding arrangements that avoid long delays in contracting monitoring unit staff and
    commissioning evaluation consultants.
• Begin recruiting M&E staff before intervention launch.
• Arrange for M&E staff to receive basic training before intervention launch.
• Recruit an experienced M&E staffer early. Having staff on board who are familiar with the
    practical and technical problems faced when trying to reconstruct baseline data can avoid many
    of the problems that typically occur when generalist task managers attempt to handle these
    problems themselves.
    There are a number of practical ways to enhance an agency’s ability to generate baseline data.
Using evaluation funds to contract additional administrative staff may remove bottlenecks and
facilitate good quality data collection. In other cases, baseline data on target households,
communities, or organizations such as schools, health clinics, or agricultural cooperatives may not
be organized or archived in a way that facilitates identification of a comparable sample one or two
years later for repeat interviews. Discussions with agency staff at the planning stage could ensure
that valuable data such as application forms that include socioeconomic data on households or
communities applying to participate in a project or program, or feasibility studies for the
selection of roads to be built or upgraded, are not discarded once beneficiaries have been selected
or the sites for road improvements chosen. Effective coordination with agency staff is critical.
    M&E systems compare progress at different points over the life of the project, and “baseline”
data for these comparisons must be collected throughout the life of the project. So it is important
to ensure M&E systems continue to provide good quality data. The following are recommendations that
can help sustain M&E systems:
• Check the budget allocated to effective M&E systems in other organizations and ensure sufficient
    resources are allocated in the present program.
• Ensure that specific and adequate budget line items for M&E are approved and reauthorized when
    necessary in the relevant government budgets.
• Organize workshops for management and policy makers to explain the benefits of good
    M&E data and explain how the costs of both monitoring and evaluation are calculated. Prepare case studies on how M&E systems were organized and used in other projects, and establish contact with these agencies through study tours, videoconferencing, or visits of resource persons.
• Ensure that stakeholders are actively involved in the planning and design of the M&E systems and that the systems respond to their information needs (Patton 2008).
• Use clients’ preferred communication style for presenting evaluation findings so that stakeholders are able to use information generated from the M&E system and are motivated to support the continued collection of the data (Vaughan and Buss 1998; Patton 2008).
• A continuing evaluation capacity development (ECD) program is essential to ensure upgrading of the evaluation skills of agency and consultant staff involved with M&E.

The willingness of agency staff to continue to collect and deliver good quality data to the M&E unit is critical. How can staff be motivated to continue to produce this information month after month and year after year?
• Collection and transmission of M&E data should be simple and rapid.
• Provide evidence to staff that the information they collect is used. Staff should receive regular feedback on issues or questions arising from their data, and they should be asked for further information on examples of successes or unanticipated problems.
• Staff should receive recognition through personal thanks from headquarters, invitation to prepare an article for a newsletter, or a small prize from time to time.
• Provide evidence to staff showing that the data helps improve the quality of the programs. For example, the evaluation of the Uganda Education for All Program made extensive use of monitoring data in the follow-up evaluations at the district level. Local staff reported this was the first time they had seen their data being used and this gave them an incentive to improve the quality of data collection (Bamberger and Kirk 2009).
• Mackay (2007) argues that a strategy of incentives to develop and sustain an M&E system requires carrots (for example, budgetary incentives and greater management autonomy to programs that use M&E well); sticks (for example, laws and regulations mandating M&E or withholding funding from agencies that fail to implement M&E); and sermons (for example, high-level endorsements of M&E importance).

National sample surveys conducted at least once a year on topics such as income and expenditure, access to health or education, or agricultural production provide very valuable baseline data for results-based management and impact evaluation. Household income and expenditure surveys are one example that has proved very valuable. If these surveys can be used in the evaluation of several different development programs, they become very cost-effective and they also can provide a larger and methodologically more rigorous comparison group sample than an individual evaluation could afford. Regularly repeated surveys provide a very valuable longitudinal database that can control for seasonal variation and economic cycles.

The value of these surveys for results-based management and impact evaluation can be greatly enhanced if they are planned with this purpose in mind and in coordination with the agencies and donors who may use the surveys to generate baseline data and comparison groups. Some of the ways to enhance their utility include:
• Ensure the sample is sufficiently large and has a sufficiently broad regional coverage to generate subsamples covering particular target populations with sufficient statistical power to be used for major program evaluations.
• Include, in consultation with social sector agencies, core information on topics such as: school enrollment, access to health services, and participation in major development programs. This would facilitate selecting samples of participants and comparison groups for impact evaluations.
• Include one or more special modules in each round of the survey to cover the needs of a particular evaluation that is being planned.
• Document the master sampling frame to facilitate its use for selecting samples for particular evaluations.
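
The recommendation above on subsample size and statistical power can be made concrete with a rough calculation. The sketch below is an illustration and not part of the original note: it uses the standard normal-approximation formula for comparing two proportions, and the function name, the default 5 percent significance level and 80 percent power, and the optional design-effect multiplier for clustered samples are all assumptions chosen for the example.

```python
from math import ceil
from statistics import NormalDist


def subsample_size_per_group(p_comparison, p_project, alpha=0.05,
                             power=0.80, design_effect=1.0):
    """Approximate respondents needed in EACH group (participants and
    comparison) to detect a difference between two proportions.

    Uses the normal approximation for a two-sided test. All defaults are
    illustrative assumptions, not values taken from the note.
    """
    if p_comparison == p_project:
        raise ValueError("The two proportions must differ.")
    # Critical values for the chosen significance level and power.
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    # Sum of the binomial variances of the two groups.
    variance = (p_comparison * (1 - p_comparison)
                + p_project * (1 - p_project))
    effect = p_project - p_comparison
    n_simple_random = (z_alpha + z_power) ** 2 * variance / effect ** 2
    # Inflate for clustering; design_effect=1.0 means simple random sampling.
    return ceil(n_simple_random * design_effect)


if __name__ == "__main__":
    # Hypothetical indicator: school enrollment of 60 percent in the
    # comparison group versus 70 percent among participants.
    print(subsample_size_per_group(0.60, 0.70))
    print(subsample_size_per_group(0.60, 0.70, design_effect=2.0))
```

A call such as `subsample_size_per_group(0.60, 0.70)` gives the approximate number of households needed in each of the participant and comparison subsamples to detect that change. A smaller expected change, or a clustered survey design, requires a larger subsample, which is why national surveys need to be planned with these evaluations in mind.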
Many of these approaches can only be considered for large and expensive evaluations or for studying issues that are of high priority to government agencies and/or donors. Also, national statistics offices are typically overburdened, so they can only be expected to help out when the program is particularly important or when special funding can be arranged to cover the costs of additional staff for data collection or analysis.

Good quality baseline data that measure the conditions of the target population and the matched comparison group are an essential component of effective monitoring, results-based management, and impact evaluation. Without this reference information, it is very difficult to assess how well a project or program has performed and how effectively it has achieved its objectives or results.

However, many projects and programs fail to collect all of the required baseline data. While some of the reasons for this can be explained by inadequate funding or technical difficulties in collecting the data (particularly for control groups), many of the causes could be at least partially corrected by better management and planning. Many reasons relate to administrative delays in releasing funds and recruiting and training staff and contracting consultants. While administrative procedures (such as those relating to personnel and consultants) are often difficult to change, ways could probably be found to reduce some of these delays. Other issues concern the relatively low priority that is often given to M&E, particularly when there are so many other urgent priorities during the early stages of a project or program.

Even with the best of intentions, these administrative challenges will never be completely resolved and there will continue to be many situations where the collection of baseline monitoring data is delayed and the commissioning of baseline studies for impact evaluations never takes place. Included in this note are a range of strategies, many of them relatively simple and cost-effective, for reconstructing baseline data when necessary. It is recommended that appropriate tools should be built into RBME and impact evaluation systems as contingency tools for reconstructing important baseline data. While some of the statistical techniques such as propensity score matching have been widely used and their strengths and weaknesses are well understood, others such as recall or the systematic use of key informants have often been used in a somewhat ad hoc manner and more work is required to test, refine, and validate the methods. Finally, there are many potentially valuable sources of administrative data from the project itself that tend to be underutilized and more attention should be given to the development and use of these valuable and relatively accessible sources of information.

Michael Bamberger has a PhD in Sociology from the London School of Economics. He worked for 23 years with the World Bank as advisor on monitoring and evaluation to the Urban Development Department, training coordinator for Asia and senior sociologist in the Gender and Development Department. Since retiring in 2001, he has worked as an evaluation consultant and evaluation trainer with 10 United Nations agencies, the World Bank, the Asian Development Bank, and a number of bilateral development agencies and developing country governments. He has published extensively on evaluation and is on the editorial board of several evaluation journals.

1. The author wishes to thank these colleagues from the Poverty Reduction and Equity Group: Jaime Saavedra (Acting Sector Director), Gladys Lopez Acevedo (Senior Economist), Keith Mackay (Consultant), Emmanuel Skoufias (Lead Economist), Philipp Krause (Consultant), and Helena Hwang (Consultant) for comments.
2. See Bamberger, Rugh, and Mabry (2006, 179–82) for a discussion of the different strategies for reconstructing a program theory model.

Bamberger, M., and A. Kirk. 2009. Making Smart Policy: Using Impact Evaluation for Policy Making; Case Studies on Evaluations That Influenced Policy. PREM Thematic Group for Poverty Analysis, Monitoring and Impact Evaluation, Doing Impact Evaluation Series No. 14, World Bank, Washington, DC.
Bamberger, M., J. Rugh, and L. Mabry. 2006. Real World Evaluation: Working under Budget, Time, Data and Political Constraints. Thousand Oaks, CA: Sage Publications.
Belli, F., F. Stafford, and D. Alwin. 2009. Calendar and Time Diary Methods in Life Course Research. Thousand Oaks, CA: Sage Publications.
Bourguignon, F. 2009. “Toward an Evaluation of Evaluation Methods: A Commentary on the Experimental Approach in the Fields of Employment, Work, and Professional Training.” Journal of Development Effectiveness 2 (3): 310–19.
Deaton, A., and M. Grosh. 2000. “Consumption.” In Designing Household Survey Questionnaires for Developing Countries: Lessons from 15 Years of the Living Standards Measurement Study, Vol. 3, ed. M. Grosh and P. Glewwe, 91–134. Washington, DC: World Bank.
Gibson, J. 2006. “Statistical Tools and Estimation Methods for Poverty Measures Based on Cross-Sectional Household Surveys.” In Handbook on Poverty Statistics: Concepts, Methods and Policy Use, 128–205. United Nations.
Krueger, R., and M. Casey. 2000. Focus Groups: A Practical Guide for Applied Research, 3rd Edition. Thousand Oaks, CA: Sage Publications.
Kumar, S. 2002. Methods for Community Participation: A Complete Guide for Practitioners. Rugby, Warwickshire: Practical Action.
Kusek, J., and R. Rist. 2004. Ten Steps to a Results-Based Monitoring and Evaluation System. Washington, DC: World Bank.
Mackay, K. 2007. How to Build M&E Systems to Support Better Government. Washington, DC: World Bank Publications, Independent Evaluation Group.
OED (Operations Evaluation Department). 2005. Maintaining Momentum to 2015? Impact Evaluation of Interventions to Improve Maternal and Child Health and Nutrition in Bangladesh. Washington, DC: World Bank.
Patton, M. 2008. Utilization-Focused Evaluation, 4th Edition. Thousand Oaks, CA: Sage Publications.
Pradhan, M., and L. Rawlings. 2002. “The Impact and Targeting of Social Infrastructure Investments: Lessons from the Nicaraguan Social Fund.” World Bank Economic Review 16 (2): 275–95.
Pratt, C., W. McGuigan, and A. Katzeva. 2000. “Measuring Program Outcomes: Using Retrospective Pretest Methodology.” American Journal of Evaluation 21 (3): 341–49.
Van de Walle, D. 2009. “The Poverty Impact of Rural Roads Projects.” Journal of Development Effectiveness 1 (1): 15–36.
Vaughan, R., and T. Buss. 1998. Communicating Social Science Research to Policymakers. Thousand Oaks, CA: Sage Publications.
White, H. 2006. “Impact Evaluation: Experience of the Independent Evaluation Group of the World Bank.” IEG, Evaluation Capacity Development Series. The World Bank.

Gorgens, M., and J. Z. Kusek. 2010. Making Monitoring and Evaluation Systems Work: A Capacity Development Toolkit. Washington, DC: World Bank.
Khandker, S., G. Koolwal, and H. Samad. 2009. Handbook on Impact Evaluation: Quantitative Methods and Practices. Washington, DC: World Bank.
Pretty, J., I. Guijt, J. Thompson, and I. Scoones. 1995. A Trainer’s Guide for Participatory Learning and Action. London: International Institute for Environment and Development.
Silverman, D. 2004. Qualitative Research: Theory, Method and Practice, 2nd Edition. Thousand Oaks, CA: Sage Publications.
Teddlie, C., and A. Tashakkori. 2008. Foundations of Mixed Methods Research: Integrating Quantitative and Qualitative Approaches in the Social and Behavioral Sciences. Thousand Oaks, CA: Sage Publications.




This note series is intended to summarize good practices and key policy findings on PREM-related topics. The views expressed in the notes are those of the authors and do not necessarily reflect those of the World Bank. PREMnotes are widely distributed to Bank staff and are also available on the PREM Web site (http://www.worldbank.org/prem). If you are interested in writing a PREMnote, email your idea to Madjiguene Seck at mseck@worldbank.org. For additional copies of this PREMnote please contact the PREM Advisory Service at x87736.

                                   This series is for both external and internal dissemination