Measuring What Works


Educational research has existed for hundreds of years, with people from a wide range of backgrounds using an equally wide variety of methods to try to answer a set of deceptively simple questions: “How do students learn? What contexts, curricula, and teacher behaviors promote student learning?”

While I have spent much of the last few years thinking about research on educational contexts and practices, two articles I read this week sharpened my perspective on the role of research in education. The first was published in the New York Times as part of a special issue, Learning What Works. It describes the Institute of Education Sciences, an office within the federal Department of Education, and its embrace of the “gold standard” of research: the randomized controlled trial. For a long, long time, almost all research in education was small-scale, often qualitative, and rarely rigorous. Full disclosure: I am a full-blown quantitative researcher whose graduate work is currently sponsored by the Institute of Education Sciences, so I am not the least bit impartial in this discussion, but hear me out. I say that the earlier research was not “rigorous” because in educational contexts, as in almost any other real-world setting, a vast number of influences act on students’ learning and teachers’ practices, and it is very difficult to isolate the effect of a particular teacher training program or reading curriculum on the outcome of interest. What works in one environment may fail horribly in another. This is why the randomized trial is trusted across so many fields of research: random assignment is one of the few ways of actually isolating the effect of a treatment on a population. However, random assignment raises all kinds of ethical and logistical problems in many educational contexts, which is partly why the method has been adopted so slowly in education.
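The claim that random assignment isolates a treatment effect can be made concrete with a toy simulation (my own illustration, not from either article; the numbers and the “opt-in” selection story are invented). When stronger students self-select into a program, a naive comparison of participants to non-participants confounds the program’s effect with pre-existing differences; randomizing who gets the program removes that confound.

```python
# Toy simulation (assumed numbers): why random assignment isolates a treatment
# effect while self-selection does not.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
true_effect = 5.0                    # the program truly raises scores by 5 points

prior = rng.normal(50, 10, n)        # confounder: prior achievement

# Self-selected "treatment": stronger students are more likely to opt in.
opt_in = rng.random(n) < 1 / (1 + np.exp(-(prior - 50) / 5))
score_sel = prior + true_effect * opt_in + rng.normal(0, 5, n)
naive_estimate = score_sel[opt_in].mean() - score_sel[~opt_in].mean()

# Randomized assignment: treatment is independent of prior achievement.
assigned = rng.random(n) < 0.5
score_rct = prior + true_effect * assigned + rng.normal(0, 5, n)
rct_estimate = score_rct[assigned].mean() - score_rct[~assigned].mean()

print(f"naive (self-selected) estimate: {naive_estimate:.1f}")  # far above 5.0
print(f"randomized estimate:            {rct_estimate:.1f}")    # close to 5.0
```

The naive comparison badly overstates the effect because the opt-in group started out stronger; the randomized comparison recovers the true effect almost exactly.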

This point brings me to the second article of the week, by Atul Gawande, published in the New Yorker in July 2013. It is a long but fascinating discussion of why certain ideas and innovations take off and are implemented widely in a short period of time, while others languish for decades or are never implemented at all. Gawande follows the trajectories of surgical anesthesia and antiseptics in medicine. Both were discovered in the nineteenth century, and while anesthesia was in routine use across hospitals in the U.S. and Britain within seven years, the use of carbolic acid and other cleansers on hands and wounds during surgery took decades to truly catch on. The article goes into great detail about why certain ideas, including simple, lifesaving solutions to medical problems, are developed by scientists and researchers but never catch on with the general public; I am not going to rehash most of it here (read the article! It’s worth your time). To summarize one of the main points, the ideas that stall tend to “attack problems that are big but, to most people, invisible; and making them work can be tedious, if not outright painful.”

It may be a stretch, but to me that sounds awfully like the problem of how so much educational research is produced without strong research designs, the kind that make it possible to determine whether a program, practice, or curriculum works, and if so, for whom. It has been known for decades, if not centuries, that randomized trials, when available, provide the most definitive results. Before I create a giant uproar, I will add the caveat that other methodologies can provide a great deal of detailed, nuanced information about students and about the characteristics of programs that shape their effectiveness. But to truly test whether something is working, randomized controlled trials are the best approach. Furthermore, when randomized trials are not ethically or logistically feasible, there are other rigorous approaches that can mimic the effect of randomization and support stronger causal inferences than a standard non-randomized treatment-versus-control comparison. Yet either approach can be expensive, difficult, and, for those without training, nearly impossible. Part of the problem is that so many researchers in education receive no training in rigorous methods and are never exposed to the pitfalls of common research practices. Another problem mirrors the one described in the New Yorker article: convincing people who already “know” what the best practice is that it is worth their time to test it. This issue exists not just among researchers but also among educators who trust programs or curricula that have never been tested or are not supported by rigorous evidence. As Joseph Merlino is quoted as saying in the NY Times article, “A lot of districts go by the herd mentality,” citing the example of a Singapore-based math program now in vogue that has never been rigorously compared with other programs and found to be better. “Personal anecdote trumps data.”
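One of the quasi-experimental designs that can stand in for randomization is difference-in-differences; the post doesn’t name a specific design, so this is my own example with invented numbers. If one district adopts a new curriculum and another doesn’t, subtracting each district’s own pre-period scores removes fixed differences between the districts, so long as both would have followed the same trend without the curriculum.

```python
# Toy difference-in-differences sketch (assumed numbers, my own illustration).
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
true_effect = 3.0   # the new curriculum truly raises scores by 3 points

# District A (adopts the curriculum) starts out 5 points stronger than B;
# both districts share a +2 secular trend between the two test dates.
pre_a  = rng.normal(55, 8, n)
pre_b  = rng.normal(50, 8, n)
post_a = rng.normal(55 + 2.0 + true_effect, 8, n)
post_b = rng.normal(50 + 2.0, 8, n)

naive = post_a.mean() - post_b.mean()   # mixes the effect with the baseline gap
did = (post_a.mean() - pre_a.mean()) - (post_b.mean() - pre_b.mean())

print(f"naive post-only comparison: {naive:.1f}")  # ~8: effect + 5-pt gap
print(f"difference-in-differences:  {did:.1f}")    # ~3: isolates the effect
```

The design’s key assumption, parallel trends in the absence of treatment, is built into the simulation; when that assumption fails in real data, the estimate is biased, which is part of why these methods demand careful training.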

But how do we change practice to emphasize rigorous methods? The New Yorker article argues that the key is not big public-awareness campaigns but a one-on-one approach. As Gawande writes, “Simple ‘awareness’ isn’t going to solve anything. We need our sales force and our seven easy-to-remember messages. And in many places around the world the concerted, person-by-person effort of changing norms is under way.” So far, the Institute of Education Sciences (IES) has not done a very good job of getting the word out on why it is so important to rigorously evaluate the programs implemented in our schools. I will be interested to see, over the next decade, whether IES and like-minded educational researchers across the country can use the lessons learned from other fields to promote rigorous research methods in education, or whether these methods remain confined to a few isolated researchers in the ivory tower, with little effect on educational practice across the country.


– Megan


One thought on “Measuring What Works”

  1. Reminds me of a story I just heard: we have almost no experimental data on the question of which charities work. A few people have started Give Directly, on the premise that the best way to help poor people is to simply give them money. They want to run direct experiments – for instance giving one poor village money, and giving another village cows and training, etc., but many people in the aid community seem appalled, one saying “these are people. We can’t experiment with their lives…”
