Policy Research Working Paper 6097

Collecting High Frequency Panel Data in Africa Using Mobile Phone Interviews

Kevin Croke, Andrew Dabalen, Gabriel Demombynes, Marcelo Giugale and Johannes Hoogeveen1

The World Bank, Africa Region, Poverty Reduction and Economic Management Unit, June 2012

Abstract

As mobile phone ownership rates have risen in Africa, there is increased interest in using mobile telephony as a data collection platform. This paper draws on two pilot projects that use mobile phone interviews for data collection in Tanzania and South Sudan. The experience was largely a success. High frequency panel data have been collected on a wide range of topics in a manner that is cost effective, flexible (questions can be changed over time) and rapid. And once households respond to the mobile phone interviews, they tend not to drop out: even after 33 rounds of interviews in the Tanzania survey, respondent fatigue proved not to be an issue. Attrition and non-response have been an issue in the Tanzania survey, but in ways that are related to the way this survey was originally set up and that are fixable. Data and reports from the Tanzania survey are available online and can be downloaded from: www.listeningtodar.org.

Key words: data collection, mobile phone, survey, Listening to Africa
JEL classification: C81, C83, D04
Sector Board: Poverty Board

1 All authors work for the World Bank with the exception of Kevin Croke, who works at the Gates Foundation. Correspondence regarding this paper can be sent to Johannes Hoogeveen: jhoogeveen@worldbank.org. This paper has benefitted from support from the PSIA Trust Fund (TF099681) and the Demand for Good Governance group.

1. Introduction

Timely, high quality information about well-being, service delivery, income, security, health and many other topics is not readily available in Africa. One reason is that such data are typically collected through nationally representative, face-to-face household surveys. Such surveys are expensive and time-consuming and are, for this reason, not implemented very frequently.
Needless to say, there is huge (latent) demand for up-to-date welfare information, and the provision of such data should be an essential cornerstone of any modern statistical system. Decision makers need timely data to monitor the situation in their country. How else can they know, for example, whether reports about a looming crisis are overblown extrapolations based on (newspaper) stories, or signs of an emerging disaster? Statisticians, too, will benefit from more frequent information, for instance to estimate changes in employment or to validate GDP estimates with farmer-based crop forecasts and price information. Program managers stand to benefit from rapid feedback on the success of their activities, while civil society can put representative information on service delivery to good use by demanding better services or improved policies.

The scientific community could equally benefit from high frequency panel surveys. Their availability opens a new field of research and offers opportunities to assess the trajectory of effects in impact evaluations. It would also offer opportunities to make impact evaluations more efficient. McKenzie (2012), for instance, argues that when outcome measures are relatively noisy and weakly auto-correlated, as is the case with business profits, household incomes and expenditures, and episodic health outcomes, impact evaluations that use smaller samples and multiple follow-ups are more efficient than the prototypical baseline and follow-up model (a stylized version of this variance argument is sketched below).

This paper presents an approach to collecting a wide range of data related to household welfare at high frequency and at low cost. The approach combines a standard baseline survey with regular interviews (weekly, every two weeks, monthly) conducted over the mobile phone. During the mobile phone interview a wide range of questions can be asked, including questions that are comparable to those asked in the baseline (to track changes) or questions never asked before, to collect data on emerging issues.

This paper is not the first to suggest that a mobile phone platform can be used to collect high quality panel data. Brian Dillon (2009), for instance, used mobile phones to carry out 14 rounds of interviews (every three weeks) to track how farmer expectations of their upcoming harvest change with time. This paper draws from two mobile phone panel surveys, one implemented in South Sudan and the other in Dar es Salaam, Tanzania. Of these, the survey in Tanzania has been running longest (33 rounds to date), while the survey in South Sudan is the one that operates under the more difficult conditions. These two surveys, though quite successful in their own right, are pilots for a much bigger initiative. The Africa Region of the World Bank intends to roll out mobile phone panel surveys to a large number of countries in Africa in an exercise that data users refer to as 'Listening to Africa' and data producers as 'Meeting the high frequency data challenge'.
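To see the intuition behind McKenzie's argument, consider a stylized calculation (our notation, not McKenzie's exact formulation, assuming equal-sized treatment and control arms of $n$ observations each and $T$ equicorrelated post-treatment rounds). If the outcome has variance $\sigma^2$ and autocorrelation $\rho$ across rounds, the variance of the simple difference-in-means estimator $\hat{\tau}$ based on outcomes averaged over the $T$ rounds is approximately

$$\mathrm{Var}(\hat{\tau}) \approx \frac{2\sigma^2}{n} \cdot \frac{1 + (T-1)\rho}{T}$$

When $\rho$ is close to one, additional rounds add little precision; but when $\rho$ is low, the variance falls almost proportionally with $T$, so several short follow-up rounds on a smaller sample, exactly what a mobile phone panel delivers cheaply, can beat a single large follow-up.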
The structure of the paper is as follows. In section 2 we explain how mobile phone surveys work. Section 3 discusses the Listening to Africa initiative, after which section 4 presents some results from mobile phone panel surveys in Tanzania and South Sudan. Section 5 discusses non-response and attrition, while section 6 discusses the representativeness of the Tanzania survey. Section 7 covers other aspects of data quality. Section 8 discusses experiences with data dissemination and its use for accountability purposes, drawing particularly on the Tanzania experience. Section 9 discusses the costs of mobile phone surveys, after which conclusions are presented in section 10.

2. Mobile phone surveys

Conducting surveys by phone is standard practice in developed countries, but has typically not been done in poor countries because phone ownership rates are too low (especially in the pre-mobile phone era). In Tanzania, for example, just 1% of households own a landline phone (DHS 2010). However, the rapid rise of mobile telephony in Africa has changed this. In Tanzania, mobile phone ownership increased from 9% of all households in 2004-05 to 28% in 2007-08. By 2010, this number had almost doubled again, to 46% of households.2 Unsurprisingly, phone ownership is particularly high in urban areas: it was 28% in 2004-05, increased to 61% in 2007-08 and reached 78% by 2010. In the baseline survey for the mobile phone survey in Dar es Salaam, mobile ownership was found to be as high as 83%. Cell phone ownership is widespread, and poor households too have access to mobile phones. In Tanzania, again, one in every three households in the poorest wealth quintile owns a mobile phone (Figure 1).

2 These figures are from the 2004-05 DHS, the 2007-08 Tanzania HIV/AIDS and Malaria Indicator Survey and the 2010 DHS.

Figure 1: Cell phone ownership in Tanzania in 2010/11, by wealth quintile (poorest: 31%; second: 47%; third: 57%; fourth: 71%; wealthiest: 89%). Source: Tanzania National Panel Survey 2010/11.

In Kenya, the Sub-Saharan country that is leading in terms of mobile phone ownership, the Afrobarometer survey of November 2011 shows that households own on average 2.4 mobile phones and that 80% of Kenyan adults have their own mobile phone. Phones are actively used: only 7% report that they never use a mobile phone, while 81% make at least one daily call using their mobile.3

3 61% send or receive a text message at least once a day and a remarkable 23% send or receive money or pay a bill via mobile phone at least once a day.

With such high rates of mobile phone ownership, representative household surveys using mobile phones become an option. Phone ownership rates above 80% are at or beyond the threshold at which reliable survey research can be conducted: for example, only 80% of US households own a landline, but political polling typically uses landline samples only. The point estimates provided by these surveys are widely considered reliable when corrected by re-weighting.4

4 Blumberg et al., "Wireless Substitution: Early Release of Estimates from the National Health Interview Survey, July-December 2008." Atlanta, GA: Centers for Disease Control. For an example of a reputable survey firm applying these approaches see: http://www.ipsos-na.com/download/pr.aspx?id=11397.

This suggests that phone ownership in Kenya or in urban Tanzania is already high enough for reasonable inferences to be made from surveys that exclusively rely on mobile phones. In many rural settings, mobile phone surveys could equally be used, provided one ensures that a representative sample owns or has access to a mobile phone. This is affordable, as reliable phones can be bought for $20 or less, so respondents who are selected for participation in a mobile phone survey but do not own a phone can be given one. Only respondents living in areas not covered by a mobile phone signal would be left out of such surveys, but even these respondents could be included if, for instance, use was made of a local enumerator who visits the respondents, collects their responses and then finds a place with a cell phone signal to relay the responses to the survey administrators.
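As an illustration of the kind of re-weighting correction mentioned above, the sketch below post-stratifies a phone-only sample so that each stratum reproduces its known population share. This is a minimal sketch under stated assumptions: the file, variable names and strata are hypothetical, not those of any survey cited here.

* Minimal post-stratification sketch (hypothetical names): re-weight a
* phone-only sample so each stratum matches its known population share.
* pop_share holds the stratum's population share, e.g. taken from a
* census or a recent face-to-face survey.
use phone_sample, clear
bysort stratum: gen stratum_n = _N
gen sample_share = stratum_n / _N     // _N here is the full sample size
gen pweight_ps = pop_share / sample_share
svyset [pweight=pweight_ps]
svy: mean has_electricity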
Another reason to consider using mobile phones for household surveys is that mobile phone interviews have been found to produce quality data. Lynn and Kaminska (2011) investigated whether data collected through interviews using mobile phones differ from data collected using landlines. They started by identifying four reasons why the quality of data collected using mobile and fixed phone interviews might differ: line quality, the extent of multi-tasking amongst survey respondents, the extent to which survey respondents are distracted from the task of answering questions, and the extent to which other people are present and able to overhear what the survey respondent is saying. The authors evaluated the extent to which differences in these features affect survey measures by analyzing data from a randomized experiment, in which a sample of people who had both mobile and fixed phones were randomly assigned to be interviewed either on their mobile phone or on their fixed phone. They found only few and small differences in survey measures between mobile phone interviews and fixed phone interviews. The few differences that were found suggest that data quality may be higher with mobile phone interviews. This they attribute to survey respondents having greater control over whether other people are within earshot and whether others can listen in from another line. When other people can hear the responses being given, which may be more likely when responding on a fixed line, respondents may have a tendency to censor their responses to avoid socially undesirable answers.

One aspect to be aware of is that mobile phone panel surveys that give phones to respondents and that incentivize respondents with call credit after the completion of an interview are, by their very nature, an intervention. Particularly for respondents who did not own a phone prior to participation in the survey, the ability to access information and to connect with others changes significantly. Moreover, all respondents are asked to consider aspects of their lives and state facts or opinions about them at a frequency that is much higher than in ordinary (panel) surveys. In future research we intend to explore the degree to which this empowers respondents and changes their behavior. A randomized experiment in which some respondents in the baseline participate in the mobile phone panel and others don't, followed by a second face-to-face interview a year later, would be a good way to assess the impact of participation in a mobile phone panel survey.
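If such an experiment were run, a standard power calculation would inform its sample size. The sketch below is purely illustrative (Stata 13 or later); the 0.2 standard deviation effect size is an assumption for illustration, not an estimate from the surveys discussed here.

* Illustrative power calculation: sample size per arm needed to detect
* a 0.2 SD effect of phone-panel participation on a standardized
* outcome, with 80% power at the 5% significance level.
power twomeans 0 0.2, sd(1) power(0.8) alpha(0.05)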
3. Listening to Africa

There are two ways to set up a representative mobile phone panel survey. One approach relies exclusively on mobile phone interviews and creates a representative sample by calling potential respondents to assess their core characteristics (location, gender, age, education, wealth, etc.) and their willingness to participate in the survey. This requires high mobile phone penetration rates and the availability of a database of telephone numbers from which an unbiased sample can be drawn. Another, more conventional approach does not rely exclusively on phone interviews but combines face-to-face interviews during a baseline survey with mobile phone interviews.

Both approaches are feasible, but given current mobile phone penetration rates (particularly in rural areas and amongst poor households), Listening to Africa prefers the second approach, which makes use of a baseline survey. This baseline will often be a new survey, but households from an existing survey could also be revisited. The latter may seem a good way to reduce cost, but households need to be revisited in any case: to select the respondent (Listening to Africa aims to create representative samples of the adult population, which necessitates randomly identifying a respondent within each household; a minimal sketch of such a selection is given below), to obtain permission for participation, to familiarize respondents with the mobile interview and to agree on a good time for phone interviews. The cost advantage of using an existing survey is therefore likely to be small. The baseline survey is also the time to distribute phones and, in locations with limited access to electricity, solar chargers. Alternatively, a village kiosk owner who provides phone charging services could be contracted to offer free phone charging to participating households.

Mobile phones offer a multitude of opportunities to obtain feedback. Live interviews carried out by a phone enumerator are one way; SMS, WAP, IVR and USSD are other approaches.5 Listening to Africa intends to collect its data using live interviews, whereby enumerators in a call center call respondents, ask the relevant questions and enter the responses into a database using a CATI (computer-assisted telephone interviewing) system. The decision to rely on call centers for mobile phone data collection is informed by experiences with WAP, IVR and USSD in the early stages of the Dar es Salaam mobile phone survey. The flexibility live interviews offer to conduct interviews in different languages and to vary questions from one round to the next, the ability to ask complex questions (which may require explanation), and the possibility to accommodate illiterate respondents and respondents owning low-end phones without internet connectivity make live interviews the technology of choice for most of sub-Saharan Africa. Moreover, good enumerators can build rapport with the respondent, supervisors can re-call (instead of revisit) respondents for quality control purposes, and live interviews offer the opportunity to ask in-depth (qualitative) questions if desired.

5 Smith et al. (2011) provide an overview of different ways to gather data using mobile phones.

Reliance on voice to collect data does not mean that other opportunities offered by mobile phones remain unexploited. Respondents can be alerted that an interview is due through SMS, and following the successful completion of an interview, respondents can receive phone credit that is transferred directly to their mobile phone. Another way to motivate respondents is by keeping them informed about how the data they provide are being used: in Tanzania, for instance, respondents are notified by SMS when newspapers report stories based on information provided by the respondents.

Listening to Africa is not only about collecting quality data. It embraces open data principles and is committed to releasing all (anonymized) data within four weeks of collection. Where possible, Listening to Africa will integrate its data collection into national statistical systems and work with national steering committees to oversee data collection, dissemination and the identification of questions.
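The within-household random selection mentioned above can be implemented in a few lines. The sketch below draws one adult per household from a baseline roster; the file and variable names (baseline_roster, hhid, age) are illustrative assumptions, not the actual baseline variables.

* Minimal sketch: select one adult (18+) per household at random
* from a baseline household roster (hypothetical variable names).
use baseline_roster, clear
set seed 20120601            // fix the seed so the draw is reproducible
keep if age >= 18
gen double u = runiform()
bysort hhid (u): gen byte selected = (_n == 1)
keep if selected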
More information about these or other aspects of Listening to Africa can be obtained from the corresponding author. Finally, by implementing mobile phone panel surveys in many countries in Africa, Listening to Africa aims not only to collect data that are country specific; it equally intends to collect comparable data that allow cross-country comparisons.

4. Selected results from the Tanzania and South Sudan mobile phone panel surveys

In late 2010, the mobile phone survey in South Sudan revisited 1,000 respondents in 10 urban areas covered in 2009 by the National Baseline Household Survey. During the revisit, respondents were identified, mobile phones were handed out (half of them with integrated solar chargers) and agreements were reached on when respondents could best be called. Respondents were called on a monthly basis from a call center in Nairobi staffed by interviewers capable of speaking South Sudan's main languages. Respondents who successfully completed an interview were rewarded with an amount varying from $2 to $4.

The survey in Tanzania visited 550 households in Dar es Salaam in August 2010, administered a new baseline survey, randomly selected an adult respondent from the household roster to be included in the mobile phone panel, and called respondents first on a weekly basis (25 rounds) and later (8 rounds) every two weeks. The survey in Dar es Salaam did not distribute phones; only recently, after round 33, have some phones been distributed to selected respondents. Respondents were rewarded with phone credit varying between $0.17 and $0.42 per successful interview.6 Both surveys are still running.

6 Remarkably, in both South Sudan and Tanzania, the amount of the reward did not have a discernible impact on response rates (see Table 1 for evidence from Tanzania).

The mobile phone survey interview format does not appear to pose major limitations on what can be asked, except that the length of an interview should probably not exceed 20-30 minutes (Dillon's interviews lasted 27 minutes on average; interviews in the Dar es Salaam survey are generally somewhat shorter). An elaborate consumption module or a detailed health module with birth histories is therefore less suited for this type of survey.7 The mobile phone surveys in South Sudan and Tanzania collect information on a wide variety of issues, including health, education, water, security, nutrition, travel times, prices, electricity and governance. The surveys have been used to ask perception questions on topics varying from what respondents considered the most pressing problems to be addressed by the city government to opinions about the draft constitution. They have also been used to collect baseline information for a large scale program on food fortification. One of us, Kevin Croke, used the panel to collect additional data for a research paper, when it turned out that the baseline survey lacked some variables needed to answer a particular research question.

7 This raises another issue for future research: whether it is possible to track changes in consumption by using poverty mapping techniques (Elbers, Lanjouw and Lanjouw 2003) with a set of correlates that is more sensitive to changes in consumption levels than the assets currently used in poverty mapping.

Figure 2: In the last month, how often, if ever, have you or a member of your household gone without enough food to eat? (Monthly shares responding 'Never'; 'Once, twice or several times'; and 'Many times or always', December 2010 through March 2011.) Source: South Sudan mobile phone survey, 2010-11.
Data collected by the mobile phone survey can easily be used to report on a single issue (e.g. only one in ten households experienced power cuts during the seven days prior to the interview), but becomes of greater interest when the same information is tracked over time. For instance, Figure 2 shows how food security in South Sudan improved between December 2010 and March 2011.

By combining information collected during the (short) mobile phone interviews with the more elaborate information collected in the baseline, more meaningful results are obtained: below, results are disaggregated by wealth quintile, using an indicator that was constructed from asset information collected during the face-to-face interviews of the baseline survey. Such information can be useful to monitor the distributional impact of large scale programs, or to track the well-being of the poorest in a society during a crisis.

Figure 3: In the last week, did your child receive any homework? (Share answering yes, by wealth quintile; shares range from 53% to 75% across quintiles.) Source: Tanzania mobile phone survey, 2012.

Because questions can be changed every round of the survey, it is possible to accommodate new data requests or to respond to emerging issues. After Dar es Salaam was hit by major floods in December 2011, for instance, the mobile phone survey asked questions to estimate the fraction of people that had been affected, finding that almost 7 percent of households had been forced to leave their home (Figure 4). Had the sample size been somewhat larger, it would even have been possible to estimate the percentage of affected households that received assistance from the government, providing real-time feedback on an important and salient government activity.

A recent innovation that has been successfully tried in the Dar es Salaam survey is to ask the respondent to pass the phone to someone else in the household, for questions that cannot be answered by the respondent. Figure 5, for instance, presents responses to questions asked to children attending primary school about the presence of their teacher and the use of books while at school.

Figure 4: Questions asked early January 2012 in response to the December floods in Dar es Salaam (water leaked into the home: 14.9%; had to leave the home: 6.7%; still unable to return home: 1.4%). Source: Tanzania mobile phone survey, 2012.

Figure 5: Questions asked to primary school children about teacher presence and use of books while in school (teacher taught through the whole class period: 83%; taught for only part of the period or not at all: 17%; used books yesterday: 54%; did not use books: 46%). Source: Tanzania mobile phone survey, 2012.

Other, as yet untried, approaches can be imagined. For instance, mobile phone interviews could be used to ask screening questions to identify respondents who qualify for in-depth interviews. In this way, qualitative and quantitative research methods can be integrated seamlessly. Or selected respondents could be asked to carry out specific monitoring tasks: What are the prices of certain goods? How much rain fell during the past week? Is the water source in the village functioning? Are specific drugs available at the health facilities? The possibilities are many.
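The disaggregations by baseline wealth quintile shown above amount to a simple merge of a phone round with the baseline file. A minimal sketch, with hypothetical file and variable names:

* Minimal sketch: combine a mobile phone round with baseline wealth
* quintiles and tabulate an indicator by quintile (hypothetical names).
use round_12, clear                       // one row per respondent
merge 1:1 respondent_id using baseline, keepusing(wealth_quintile baseline_weight)
keep if _merge == 3
collapse (mean) child_got_homework [pweight=baseline_weight], by(wealth_quintile)
list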
5. Non-response and attrition

A key challenge for high frequency mobile phone panel surveys is non-response (a respondent participates in some but not all rounds) and attrition (a respondent drops out of the survey completely). Attrition and non-response are challenges for all panel surveys, but may be a particular issue for mobile phone panels because of the number of times respondents are invited to participate in an interview. In this section we rely on data from the Tanzania survey, as this is the longest running mobile phone panel. In considering this survey, it is important to realize that when it was initiated by Twaweza (one of us, Johannes Hoogeveen, worked at Twaweza at the time), a main objective was to explore which technology would be most suited for a nationally representative mobile phone survey and to identify the systems needed to collect, process, analyze and disseminate survey data on a weekly basis. This explains why no mobile phones were distributed and why households without a mobile phone were allowed to drop out of the mobile part of the survey. Bearing this in mind, there is much that can be learned from this survey.

During the Tanzania baseline, households were assigned one of four technologies: interactive voice response (IVR), USSD (an approach allowing direct transmission of questions from a phone company server to the respondent's phone; this technology also works on low-end phones), WAP (web-based mobile phone surveys, suited for high-end phones with internet capability) and voice (a call center).8 Following the baseline and during the first 7 rounds of the mobile phone panel there were numerous problems with the different technologies: the fraction of respondents owning internet-enabled phones turned out to be very low (eliminating WAP), support from the phone company to run USSD was minimal (especially once mobile banking started to claim the available bandwidth), and IVR turned out to be clumsy, as questions had to be broken down to avoid too many response options. Voice did not have any of these drawbacks. Hence, after a relatively short period of time, live phone interviews became the technology of choice, and all those who were reachable and had access to a mobile phone were put through a basic call center, which consisted of a group of enumerators who each had multiple phones (one for each phone network, allowing cheaper within-network calls) and a computer with a standard data entry screen.

8 Because of its limitations, SMS was not considered.

Figure 6: Number of respondents per round, rounds 8 through 33. Source: Tanzania mobile phone survey, 2012.

Following this decision the survey ran for another 18 weekly rounds before it was discontinued by Twaweza.9 Management of the survey was then transferred to the World Bank, which had indicated interest in using the survey to generate feedback for its own programs. The World Bank appointed consultants who were tasked with identifying questions and making the anonymized data publicly available. Consultants were also supported in preparing and publishing (independent) reports on the findings (Kevin Croke, one of the co-authors, is the lead consultant).
The original survey firm, DataVision, was contracted to continue to carry out the mobile phone interviews, and after a gap of four months the survey was restarted. Under this new arrangement, interviews were conducted every two weeks. At the time of writing (February 2012), 7 rounds had been completed.10

9 Based on the experience with mobile phone surveys in Dar es Salaam, Twaweza is currently in the process of setting up a nationwide mobile phone survey.
10 Reports and data produced, including the baseline data, can be obtained from: http://monitor.public-transparency.org/

So what does the Tanzania mobile phone survey tell us about attrition and non-response? On the negative side, there was a large initial burst of attrition. This can largely be attributed to the fact that the survey team did not hand out phones. When the team initially visited and administered the baseline survey to 550 respondents, it was found that 418 of them owned a phone, 69 had a household member who owned a phone, 6 could access a phone through a friend and 57 had no phone or access to a phone. Obviously, owning a phone is different from using someone else's phone, and when the mobile phone survey switched exclusively to live interviews in round 8, it was assessed that 458 respondents could realistically be reached.11

11 Some respondents could not be reached either because their numbers had been captured incorrectly, or because they never seemed to have their phones on.

Figure 7: Number of rounds respondents participated in during the 26 rounds of the Tanzania mobile phone survey (distribution of respondents by the number of rounds, 0 through 26, in which they participated). Source: Tanzania mobile phone survey, 2012.

Between round 8 (when the mobile phone panel began in earnest) and round 26 (before the survey was transferred to the World Bank) an average of 304 respondents, or 66%, participated in the survey. After the survey had been put under World Bank management and oversight was tightened (but after a four month gap in interviews!), the number of respondents increased to 343 (75% of the sample). So after 33 rounds of mobile interviews, the overall non-response rate is 25% of the 458 households in the sample that had access to phones. The rate of attrition, defined as the share of the 458 who did not respond at all to the mobile phone panel, is much lower: only 4%, or 18 out of the 458 households, never responded to a request for a mobile phone interview.

While we are not aware of comparable cases involving mobile phone panels, the rates of non-response and attrition appear comparable to what has been attained by a number of non-mobile phone (i.e. face-to-face) panel surveys. For example, the Cebu Longitudinal Health and Nutrition Survey in the Philippines had almost 66% attrition (Miguel et al. 2008), while Alderman et al. (2001) note that the Bolivian Pre-School Program Evaluation Household Survey had 35% attrition. The Kenya Ideational Change survey had 28% attrition for women (and 41% for couples) over a two-year interval. Panel surveys that revisit respondents after extended intervals often have relatively high attrition; for example, the Kagera Health and Development Survey lost 31% of its respondents between 1994 and 2004.
However, specially designed panels such as the Indonesia Family Life Survey or the Kenya Life Panel Survey, which place high priority on the minimization of attrition (through tracking of migrants, for example), have achieved much lower attrition rates: the Indonesia panel attrition rate was only 9%-13% over 4 separate survey waves (Thomas et al. 2010), the Kenya Life Panel Survey had 17% attrition over seven years (Miguel et al. 2010), and the South Africa KIDS survey had 16% attrition over 5 years. Dillon (2010) achieved an attrition rate of 2%. Over shorter periods of time (comparable to our survey's 1-2 year period), many recent randomized controlled trials have managed to track the vast majority of their beneficiaries from baseline to follow-up.12

12 See Duflo, Glennerster, and Kremer, "Using Randomized Experiments in Development Economics Research: A Toolkit", section 6.4, for a discussion of attrition issues in randomized controlled trials.

If one takes into account that there was a considerable time lag between the baseline survey and the start of the mobile phone interviews, and another four-month lag when management of the survey was transferred to the World Bank, the rates of non-response and attrition are not only relatively low; there is also ample room for improvement. Distributing phones and enhanced enumerator and respondent training should make it feasible to largely avoid the initial reduction in the sample from 550 to 458. Distributing solar chargers, for instance to those with limited access to electricity, would enhance response rates further: those with access to electricity answered in 18.6 rounds on average, versus 16.4 rounds for those without access to electricity. Even the choice of phone provider seems to matter, as those using the premium network responded significantly more often (20.1 times) than those using any of the other networks (16.9 times).

6. Is the Tanzania mobile phone panel survey representative?

Attrition and non-response are particularly problematic when they occur in a non-random manner. If attrition is truly random, then the representativeness of the post-attrition sample is comparable to that of the baseline sample, meaning that while the sample size has decreased (and standard errors have increased), the point estimates of the follow-up survey are still unbiased estimates of the true population mean. If attrition and non-response are non-random but are associated with observable characteristics of respondents which have been recorded in the baseline survey, then it is also a manageable problem, and can be addressed by re-weighting the remaining respondents by the inverse of the probability of attrition (formalized below).13 A final possibility is that attrition is non-random and associated with unobservable characteristics of respondents. In this case, attrition is quite harmful to the representativeness of the survey: since attrition is based on unobservable characteristics, the survey sample cannot be reweighted according to these (unknown) characteristics.14 This is certainly possible in our survey, as in any panel survey, but it is essentially a non-testable assertion. The question that we address here is whether, given the sizeable attrition and non-response in the Tanzania mobile phone survey (at least in comparison to the baseline), the Tanzania panel can still be considered representative. Given our detailed baseline survey, we use regression analysis to assess whether attrition is closely linked to observable demographic and behavioral characteristics, or whether it appears to be largely random.

13 Alderman et al. (2001) suggest that even where attrition is non-random, key parameter estimates are often not affected, using examples from Bolivia, Kenya, and South Africa. Fitzgerald et al. (1998) draw similar conclusions from the US-based Panel Study of Income Dynamics, as does Falaris (2003) with respect to surveys in Peru, Cote d'Ivoire and Vietnam (cited in Thomas et al. 2010).
14 Frankenberg et al. and Beegle, De Weerdt, and Dercon (2008) suggest that attrition in developing country settings is likely to be related to unobservable traits, in part because attrition is often linked to migration.
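Formally, the re-weighting works as follows (our notation; a stylized statement, not necessarily the exact specification used here). Let $\hat{p}_i = \Pr(\text{respond}_i = 1 \mid x_i)$ be respondent $i$'s estimated probability of responding, obtained for instance from a logit of a response indicator on baseline characteristics $x_i$. The adjusted weight scales respondent $i$'s baseline weight $w_i$ up by the inverse of that probability:

$$w_i^{\text{adj}} = \frac{w_i}{\hat{p}_i}$$

The Stata code in Annex 1 implements a smoothed variant of this idea: it replaces each $\hat{p}_i$ by the mean propensity within its propensity decile before inverting.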
Table 1 presents regression analysis of the determinants of attrition. In the three regressions presented below, the dependent variable is the number of rounds (out of 25) in which the household participated.15 Column one presents a model including all 550 households visited in the baseline. In this model, economic status is a significant predictor of survey participation: households without a phone, those using non-premium phone providers and those in the second poorest income quintile are significantly less likely to participate relative to households of median wealth. Unsurprisingly for a survey that did not distribute mobile phones, wealth is found to be correlated with survey participation. In column two, when we restrict the regression to households that were identified as reachable in round 8, we find that the impact of wealth largely disappears. In this regression, location (living in rural Dar es Salaam) and using the premium provider remain significant variables. In column 3, finally, the model is altered to include information about whether the respondent could be reached when the mobile survey started to rely exclusively on live interviews (during round 8). Not being reachable may reflect many things: phone network, access to electricity, phone habits, as well as issues related to enumeration. This specification is included to show that if this mobile phone survey had been able to avoid having 'unreachable' respondents at the start (through better training and protocols, for instance), the remaining sample would have been representative, as no other variables show up as significant. Other observations from these regressions worth noting are that years of education, gender and even the amount given as reward do not explain non-response. The message we take from these regressions is that with phone distribution, and by paying more attention to ensuring a smooth transition from inclusion in the baseline to inclusion in the mobile survey, non-response could have been significantly reduced (a sketch of this type of regression is given below).

15 As will be explained later, in the third regression round 1 is used to determine persistence in non-response. To avoid autocorrelation, this round is omitted from the sum of rounds in which the household participated.
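For concreteness, the sketch below shows how a regression of the type reported in Table 1 could be run. Variable names are hypothetical, and the actual specification also includes provider and reward dummies.

* Minimal sketch of the participation regressions in Table 1
* (hypothetical variable names): the outcome is the number of rounds,
* out of 25, in which the respondent participated.
use baseline_with_participation, clear
regress n_rounds male owns_phone age years_school rural electricity i.wealth_quintile, vce(robust)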
Table 1: Three OLS regressions on participation in the mobile phone survey. Dependent variable is the number of times a respondent participated in the last 25 survey rounds. T-values are reported underneath the coefficients; coefficients with |t| above 2.58 are significant at p < 0.01.

                            Regression 1   Regression 2   Regression 3
D-unreachable in round 1                                  -9.633
                                                          -15.5
D-male                      -1.176         -0.017         -0.315
                            -1.5           0.0            -0.5
D-owns phone                6.003          1.468          0.792
                            6.3            1.4            0.9
Age                         -0.008         -0.017         -0.012
                            -0.3           -0.6           -0.6
Years of schooling          0.048          0.031          0.166
                            0.4            0.2            1.6
D-rural                     -0.640         -1.786         -1.011
                            -0.6           -1.7           -1.2
D-house has electricity     -0.074         -0.130         -0.125
                            -0.1           -0.3           -0.3
D-poorest quintile          -2.126         -0.815         -0.634
                            -1.6           -0.6           -0.6
D-second quintile           -2.181         -1.774         -0.650
                            -1.8           -1.5           -0.7
D-fourth quintile           0.890          1.164          1.048
                            0.7            1.0            1.1
D-wealthiest quintile       -1.086         -0.576         0.097
                            -0.8           -0.5           0.1
D-receives Tshs 300         -0.247         0.150          0.639
                            -0.3           0.2            0.9
D-receives Tshs 400         -1.144         -0.635         -0.823
                            -1.2           -0.7           -1.1
D-Vodacom                   5.755          2.016          0.637
                            4.7            1.7            0.7
D-Tigo                      2.580          -0.907         -0.803
                            2.7            -0.9           -1.0
Constant                    9.313          17.399         19.908
                            4.3            7.5            10.7
Obs                         542            450            450
R-squared (adj)             0.18           0.04           0.38

Figure 8: Changing wealth composition of the sample (percent of respondents in each wealth quintile for the full baseline sample, the mobile survey sample, round 26, and round 26 after reweighting). Source: Tanzania mobile phone survey, 2012.

Another important conclusion is that for this survey to remain representative, it is necessary to reweight responses ex post. Figure 8 illustrates how reweighting is able to restore the survey's representativeness by showing how the changing composition of the sample affects the percentage of households allocated to different wealth quintiles. The first column presents the survey baseline (550 respondents), with the sample divided (by definition) equally among the 5 wealth quintiles: each quintile has exactly 20% of the sample. When we look at the breakdown across the 458 respondents that were included in the mobile phone survey, one notes that poor households are underrepresented. The distribution becomes more skewed towards wealthier households in round 26 (341 respondents). The final set shows what the distribution looks like once the mobile phone sample has been reweighted using the code presented in Annex 1. It shows that the original distribution is essentially restored. One conclusion that we draw from this is that, while we can never control for selection based on unobservables, reweighting based on observables should be a standard procedure after every survey round in a mobile phone survey.

7. Other aspects of data quality

One concern one might have about mobile phone interviews is that the respondent could change from one wave to the next. Evidence from validation data suggests that this does not happen: respondents identified during the baseline are the same as the respondents answering the phone. Figure 9 presents data comparing respondents' ages, 4 rounds into the survey. It demonstrates that, apart from white noise, respondents are the same.

Figure 9: Comparing respondent ages as given in the baseline survey and in the mobile phone survey (scatter of age during the baseline against age during the mobile survey, 4 months after the baseline). Source: Dar es Salaam baseline survey and round 4 of the mobile phone survey.

However, when the same exercise was repeated in round 32, 18 months into the survey, the amount of white noise had increased considerably, suggesting a need for more quality control by the call center to ensure that the respondent selected during the baseline is answering the phone call.
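Such checks are easy to automate. The sketch below compares ages across the baseline and a phone round, reproduces the scatter behind Figures 9 and 10, and flags implausible discrepancies for the call center to follow up. File and variable names are hypothetical.

* Minimal sketch of the respondent-consistency check behind
* Figures 9 and 10 (hypothetical file and variable names).
use round_32, clear
merge 1:1 respondent_id using baseline, keepusing(age_baseline)
keep if _merge == 3
twoway scatter age_baseline age_mobile, xtitle("Age during mobile survey") ytitle("Age during baseline")
* Flag cases where ages differ by more than 3 years for re-contact.
list respondent_id age_baseline age_mobile if abs(age_mobile - age_baseline) > 3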
This episode is a good illustration of how high frequency surveys provide opportunities to identify data quality issues and to correct them in subsequent survey rounds. In this instance, the survey firm has been instructed to be more careful about who the respondent is, and to change the beginning of the interview so as to ensure that the original respondent answers the phone (when available) or to clearly record that another adult is replacing the original household member. In round 35 of the survey, this exercise will be repeated to assess whether the change in protocol has been successful.

A change of respondents, by the way, does not necessarily affect the representativeness of the survey. When questions are asked about household characteristics such as access to electricity or water, for example, any adult should be able to answer. Since insisting that the same respondent always answers the questions is likely to lead to non-response, there is a trade-off between respondent consistency and response rates. In fact, as long as a change of respondent is captured in the data, it could make sense for a mobile phone survey protocol to allow the original respondent to be replaced by another adult household member.

Figure 10: Comparing respondent ages as given in the baseline survey and in the mobile phone survey (scatter of age during the baseline against age during the mobile survey, 18 months after the baseline). Source: Dar es Salaam baseline survey and round 25 of the mobile phone survey.

Finally, it is worth stressing that respondents also make errors. This is illustrated by responses to questions about food fortification in the Tanzania survey. In this round of the panel, baseline information was collected in preparation for a large food fortification program. In Dar es Salaam, almost all salt on sale is fortified (iodized), while wheat flour, maize flour and cooking oil are not. But when asked whether each of these foods was fortified, significant fractions of respondents stated that cooking oil (23%), wheat flour (9%) and maize flour (12%) were fortified, while in fact they are not. In the case of salt, few respondents mistakenly claimed that salt was not fortified (0.3%). Interestingly, no significant differences in errors could be observed between respondents with primary or secondary education, or between respondents who had and had not heard about food fortification. This re-affirms the need to remain vigilant when interpreting survey results.

Figure 11: Responses to questions about food fortification (salt, which is fortified: no 0.3%, yes 48.1%, don't know 51.6%; cooking oil: no 1.9%, yes 22.8%, don't know 75.3%; wheat flour: no 6.0%, yes 9.2%; maize flour: no 7.5%, yes 12.1%; the remainder answered don't know). Source: Tanzania mobile phone survey, 2012.

8. Use of the mobile phone survey data

Ever since management of the Tanzania survey was transferred to the World Bank, active efforts have been made to disseminate the data widely and to ensure that the survey results are used by decision makers or for accountability purposes. Questions are carefully identified for their interest to program managers as well as for their potential use as 'accountability tools'. Once the data are collected, easy to understand, factual reports are prepared presenting the findings. These reports are disseminated through a dedicated website from which all survey data, from the baseline survey as well as from the mobile phone interviews, can be downloaded (www.listeningtodar.org).
The reports are shared by email using a distribution list that includes journalists and other potentially interested parties, and a Twitter account broadcasts the main findings (www.twitter.com/darmobilesurvey).

So does it work? The website itself attracts limited traffic, and few people download the data available on the site. More successful have been attempts to get media attention. Reports produced have been discussed on blogs16 and in academia17, they have been re-broadcast in their entirety18, and they have led to various newspaper articles. It is hard to assess what happens once information is published, but there are indications that the information is 'received' by those responsible for results. For instance, the managing director of Tanzania's electricity company felt compelled to explain to the media why so many households connected to electricity are experiencing power cuts and what his company is doing about them.19 Once published, information travels fast and far. A brief about food price increases was used in a front page article in Tanzania's The Citizen newspaper, was picked up by others such as Rwanda's The New Times,20 and ended up in the World Bank's 2012 Global Monitoring Report. Information also tends to go in unexpected directions. A brief about the limited increase in water connections in Dar es Salaam, despite a large scale investment program, received media attention because of the discrepancy it showed between the data reported by the mobile phone survey and official government statistics.21

16 E.g. http://blog.daraja.org/2012/02/independent-monitoring-of-dars-water.html
17 E.g. http://www.viewtz.com/2012/02/13/world-bank-cause-of-economic-hardships/
18 http://godfreynnko.blogspot.com/2012/02/most-citizens-do-not-know-if-their-food.html
19 http://www.ippmedia.com/frontend/index.php?l=39618 and http://www.ippmedia.com/frontend/index.php?l=39551.
20 http://www.newtimes.co.rw/news/index.php?i=14892&a=11334
21 E.g. http://www.thecitizen.co.tz/sunday-citizen/-/20105-govt-figures-on-access-to-clean-water-inflated

So what lessons can we draw from actively disseminating 8 rounds of data? One lesson is that providing access to the raw data is not sufficient. Say's Law, which suggests that supply creates its own demand, does not seem to hold for the data produced by the Tanzania mobile phone survey. Another lesson is that good analysis and easy-to-access reporting on the data make dissemination much easier. The uptake of the data from the last 8 rounds of interviews is encouraging, especially as much more can be done. The website could be promoted more vigorously, Facebook remains an unexploited tool, press conferences can be organized, the email list could be expanded and the Twitter account could become more active. All this is needed if the objective is to ensure that the data, once collected, are utilized by a wide range of people.

9. Cost effectiveness of mobile phone surveys

How cost effective are mobile phone surveys? Our data give some indication of the marginal cost of a mobile phone survey. The call center contracted to implement 12 survey rounds does so at a rate of $1,400 per round. If one adds the cost of consultants to maintain the website, supervise data collection and analyze the data, the marginal cost per round increases to $2,500. Given that these rounds averaged 343 respondents, this comes to about $4.10-$7.30 per interview. Dillon (2010) notes a relatively similar marginal cost per survey: $6.98.
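The per-interview figures follow directly from these inputs, and combining them with the average questionnaire length reported below gives the cost per question:

$$\frac{\$1{,}400}{343} \approx \$4.08, \qquad \frac{\$2{,}500}{343} \approx \$7.29 \text{ per interview}, \qquad \frac{\$7.29}{17 \text{ questions}} \approx \$0.43 \text{ per question}$$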
In addition to these marginal costs, one needs to include the cost of a baseline, which will often be between $50 and $150 per respondent, depending on the complexity of the survey and the distances that have to be covered. Whether this is cost effective or not depends a lot on the purpose of the survey. The ability to carry out an entire survey in Dar es Salaam and to report on its results for $2,500 is remarkably cost effective. But if one keeps in mind that the typical round in the Dar es Salaam survey asks 17 questions (with a maximum of 44), then the cost per question is about $0.43. This is relatively high, so if the intention is to ask many questions it may be more cost effective to opt for a face-to-face interview.

10. Conclusion

The evidence presented demonstrates that mobile phone panel surveys have great potential to provide rapid feedback and to address existing data gaps at limited expense. Mobile phone panel surveys should not be considered substitutes for household surveys; rather, they will often make use of an existing household survey as a baseline. Moreover, mobile phone surveys are not the right platform for lengthy interviews; when interviews are lengthy, face-to-face interviews are probably more cost effective.

The evidence from the Tanzania and South Sudan surveys suggests that mobile phone surveys can collect quality data in a timely manner. The Tanzania panel survey pointed to the importance of putting in place mechanisms that limit attrition right from the implementation of the baseline. Much of the attrition in the Tanzania survey can be explained by choices made in the organization of the survey (such as the decision not to distribute mobile phones), and the work by Dillon demonstrates that it is feasible to achieve low initial drop-out. Our work suggests that once households are included in the mobile phone survey, they are likely to remain in it: respondent fatigue was not found to be an issue, even after 33 rounds of interviews. The work also suggests that, because of their high frequency, quality control of mobile phone surveys is dynamic, and issues identified in one round can be corrected in the next.

Finally, the success of mobile phone panels will not only be measured by whether relevant, quality data are being produced in a timely manner, but also by how many people actually use the results from the surveys. Our work suggests that making the data publicly available is not sufficient. Analysis and active dissemination are needed to ensure that the data find their way into the public domain.

References

Alderman, Harold, Jere Behrman, Hans-Peter Kohler, John A. Maluccio and Susan Watkins (2001). "Attrition in Longitudinal Household Survey Data." Demographic Research, Max Planck Institute for Demographic Research, 5(4), 79-124.

Baird, Sarah, Joan Hamory, and Edward Miguel (2008). "Tracking, Attrition and Data Quality in the Kenyan Life Panel Survey Round 1." University of California Berkeley: CIDER Working Paper.

Blumberg et al. "Wireless Substitution: Early Release of Estimates from the National Health Interview Survey, July-December 2008." Atlanta, GA: Centers for Disease Control.

Dillon, Brian (2009). "Using Mobile Phones to Conduct Research in Developing Countries." Journal of International Development.

Elbers, Chris, Jean O. Lanjouw, and Peter Lanjouw (2003). "Micro-Level Estimation of Poverty and Inequality." Econometrica, 71(1): 355-64.

Lynn, Peter and Olena Kaminska (2011). "The Impact of Mobile Phones on Survey Measurement Error."
Institute for Social and Economic Research Working Paper No. 2011-7.

McKenzie, David (2012). "Beyond Baseline and Follow-up: The Case for More T in Experiments." Forthcoming, Journal of Development Economics.

Smith, G., MacAuslan, I., Butters, S. and Tromme, M. (2011). "New Technology Enhancing Humanitarian Cash and Voucher Programming." A research report commissioned by CaLP.

Thomas, Duncan, Firman Witoelar, Elizabeth Frankenberg et al. (2010). "Cutting the Costs of Attrition: Results from the Indonesian Family Life Survey." BREAD Working Paper No. 259.

Annex 1: Stata code to reweight the sample

* Construct new weights (iwt2) using participation in round 26.
* week26_dum is 1 if the respondent participated in round 26;
* iwt is the baseline sampling weight.
logit week26_dum poor average rich richest sex age rural voda tigo electricity years_school
* Predicted probability that each respondent participates (response propensity).
predict ps
* Smooth the propensities: group them into deciles and take the decile mean.
xtile deca=ps, nq(10)
bys deca: egen ac=mean(ps)
* Invert the smoothed propensity to obtain the adjustment factor.
replace ac=(1/ac)
* Adjusted weight: baseline weight times the inverse response propensity.
gen iwt2=iwt*ac
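To illustrate how the adjusted weights would then be used, a hypothetical usage sketch follows; the indicator variable is illustrative, not one of the survey's actual variables.

* Hypothetical usage of the adjusted weights: declare the survey
* design with iwt2 as probability weight and estimate a round-26 mean.
svyset [pweight=iwt2]
svy: mean power_cut_last7days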