So, your learning assessment was postponed…
Though many of the issues we face in ensuring children worldwide are in school and learning are not new, COVID-19 has magnified the challenges. The pandemic has impacted nearly all facets of the world's education systems and programs at the same time. Assessing children's learning achievement is one important aspect of education support and reform; it is essential to know whether children are learning. Yet many assessment efforts can no longer take place as planned. How can we gauge learning when our best-laid plans have been upended? While assessments can be delayed for many reasons, this shared challenge presents an opportunity to gather our best thinking and experience, so that we can make informed decisions and potentially improve our practice as a result.
The challenge
Your activity was slated to conduct a student assessment (reading, math, etc.) in the coming months, but schools are closed and learning has slowed or stopped – through the end of the school year in some contexts, spanning terms or semesters in others.
What do you do?
The solution depends on the original purpose of your assessment, as well as other technical and political considerations. In this brief, we lay out those considerations along with a menu of possible solutions.
Start with a clear understanding of the assessment purpose
This is very important. Throughout the scenario planning process, continually return to the purpose of the assessment to ensure that plans address current needs. In some cases, the purpose (and priority) might have shifted along with implementation plans and calendars. Either way, the key is to design a new assessment plan that aligns with the purpose, so ask: "What is the purpose of the assessment now?" It could be that students do not come back to school as expected (or return is inequitable) and learning does not resume as usual, in which case a more in-depth retooling of assessment efforts is needed (beyond the technical alternatives presented here)[1]. Additionally, while our recommendations serve as a useful starting point for scenario planning, specific plans will depend on your evaluation design (e.g., repeated cross-section versus longitudinal, availability of a comparison or control group, etc.).
Key principles as a touchstone – same time next year
We start with some key principles, to be adhered to as the ideal:
- baseline data should be collected before implementation begins;
- evaluation data should be collected at the same time of the year for each administration for comparability (unless the intention is to measure within-year gains or when using a longitudinal design);
- and data should be collected in the target grade, as identified by the assessment's purpose.
In order to determine a best-case scenario, we begin with the premise that the original schedule was purposefully chosen as the optimal timeframe for your data collection, aligned with the purpose. Therefore, we propose collecting data in the same grade(s) and at the same time of year as originally planned, one year later. The discussion that follows addresses cases in which data collection cannot be postponed to the same timeframe the following year.
When delaying until the same time next year is not an option
There are a range of reasons why it may not be feasible to hold off on conducting your assessment until the same time next year. The question is: what is the next best option? Below, we propose important considerations and potential solutions for a few of the most commonly encountered scenarios.
Scenario 1: Conducting a baseline when implementation is also delayed
While COVID-19 school closures may delay your baseline data collection timeline, implementation efforts may be delayed as well. These implementation delays could allow more time to collect a true baseline and should be carefully considered, rather than simply assuming that baseline data collection must occur as soon as schools re-open. This comes down to understanding the purpose of your assessment and recognizing the value of collecting data just prior to the start of implementation.
Scenario 2: Conducting a baseline when the intended time of the school year will be missed
Even if the baseline can take place before implementation, the assessment may no longer be possible at the originally planned point in the school year. For example, let's say you were intending to collect baseline data from grade 2 students at the end of the 2020 school year. Due to school closures, you will not be able to collect data before the start of the 2021 school year.
In this case, it is important to think creatively about solutions that will match the purpose of the data collection and the design of the study. It may be preferable to conduct the baseline data collection at the beginning of the 2021 school year with students at the start of grade 2, followed by midline and endline assessments of students at the end of grade 2 (in ensuing years). This would allow for initial measurement of within-grade gains, while still allowing for accurate reporting on end-of-grade-2 performance at midline and endline. Another possible modification would be to collect data at the beginning of the 2021 school year using grade 3 as a proxy for the end of grade 2, but the proxy-grade approach is not recommended in most scenarios (discussed below).
Scenario 3: Conducting a baseline when implementation has already begun
In some situations, it might be necessary to conduct a baseline after implementation has begun. Depending on the dose and duration of implementation prior to the delayed baseline, this scenario may make it difficult to cleanly estimate the full extent of change that occurs as a result of the program. Impact is likely to be underestimated if, for example, teachers are already using training guidance or program materials are in the classroom by the time the baseline is administered. In this scenario, it is advisable to supplement baseline estimates with implementation fidelity measures that can help gauge the extent to which implementation has already begun (e.g., whether books or new methods are being used in the classroom).
Scenario 4: Conducting a midline/endline for which data are needed as soon as possible
Since the purpose of a midline or endline evaluation is to estimate the gains achieved by a program, it is recommended that these data collections occur either with the same participants or schools as prior time points (e.g., a longitudinal or cohort design) or with new participants at the same point in the school year (e.g., repeated cross-sections). In either case, it is essential to revisit your research questions and determine whether a delayed data collection effort can still address your purpose. Given the shifting timeline, it may be advisable to supplement your midline/endline with monitoring or other data collection efforts (e.g., internal reflection and evaluation).
Considerations for longitudinal or cohort evaluation designs
For a longitudinal design, the implications of delaying a midline or endline are rather straightforward. The later data collection activity will simply provide an estimate for the same students over a longer period of time than originally anticipated (clearly strengthened by the inclusion of a control group). However, it will be important to take school closures and breaks into account, as the longer time span may not equate to additional instructional time (and in some cases will actually mean less instructional time than intended).
Considerations for repeated cross-sectional evaluation designs
With this type of design, comparable estimates are significantly complicated by data collection delays. For example, let's assume that baseline data for your program were collected from grade 2 students at the end of the school year two years ago (e.g., July 2018). Midline or endline data collection was expected to occur at the end of this school year (e.g., July 2020), again with grade 2 students. However, due to school closures, this window will be missed. Due to political imperative (or the end of the program, in the case of an endline), it is also not possible to delay data collection until July 2021 (i.e., the "gold standard"). Therefore, midline/endline data collection will now occur at a different time of year than the baseline, undermining the assumptions behind the originally designed comparison.
As previously mentioned, a potential solution would be to conduct the midline assessment at the beginning of the next school year, using students who had advanced to the next grade (i.e., grade 3 students in September 2020, at the start of the school year). However, this choice has serious limitations. The literature on summer learning loss has shown that students at the end of the school year tend to perform better on reading and math assessments than they do at the start of the following school year. In addition, many low-income countries are susceptible to large demographic shifts between the end of one school year and the beginning of another (stemming from issues such as dropout and large fluctuations in enrolment and attendance at the start of each school year) – and these shifts might be larger and less predictable given the potential impact of COVID-19. Moving up a grade for the assessment will therefore not provide a strong proxy for where students would have been at the end of the prior year. While having a control group does mitigate some of these concerns, progress toward targets will likely still be affected. For those with pre-post-only designs, it is even more essential to revisit the purpose of your assessment and determine whether this strategy meets your needs. In some instances, the best decision may be to forgo the assessment.
Final Thoughts
It is essential to understand the implications of any decision made with regard to delayed data collection efforts, and these decisions cannot be made in isolation. Typically, this scenario planning will include a range of stakeholders, from governments to donors to implementing partners. In some cases, due to political or funding realities, it may not be possible to shift the dates considerably. However, it is essential to plan carefully and thoughtfully, with a focus on the purpose of your evaluation and on what would be lost by forcing data collection at a time that undermines the intended and current purpose. We need to resist the temptation to retrofit some type of "box-checking" assessment simply because it was in the original plan, or to conduct an assessment as soon as schools reopen simply because it seems convenient and/or politically expedient. If the purpose of an assessment has changed, don't be afraid to change the design and timing of the assessment – including considering smaller-scale, qualitative efforts.
COVID-19 presents many challenges to education and learning, but it also provides an opportunity to rethink conventional ways of doing business. When revising assessment schedules, it is important to continually return to the purpose of the data collection and to what is possible NOW, in order to collect meaningful data and ensure an accurate picture of learning.
Co-authored by Tracy Brunette and Jonathan Stern.
This blog has benefitted from input from Kellie Betts, Matthew Jukes, Ben Piper, and Maitri Punjabi.
[1] Here we focus on assessments that have one of the following three (common) purposes: determining baseline estimates; evaluating the impact of a program at a later time point (midline or endline); and monitoring program progress.