An IEP is not a SEP…


By definition, standardized tests are designed in such a way that the questions, conditions for administering, scoring procedures, and interpretations are consistent and predetermined.  Look carefully at this description: “consistent,” “predetermined.”

Having met Mr. Duncan some time ago, I am quite disturbed by his recent decision to expand the uses, interpretations, and accountability measures associated with the test scores of Students with Disabilities (not to be confused with the term, “SPED” students).  The reality is that Students with Disabilities have long been exposed to the wonderful world of high-stakes testing.  And do you know what they get labeled?  We talk about these students as a “subgroup,” often “below average,” “below basic.”  And do you know why?  Because many of them enter the world of academic rigor “below the norm,” as measured and defined by psychological and psycho-educational evaluations, eligibility reports, and a host of other nationally normed evaluations.

When we engage in a conversation about the true academic abilities of Students with Disabilities, we have to consider more than their performance on random, out of context reading passages about “Astronauts!” on standardized tests.  We have to consider the whole child.  We have to consider the nature of these assessments, because paper and pencil tests and scantrons are not for everybody.  But Duncan’s decision doesn’t seem to take that into account.

As Special Educators, we fight tirelessly.  Do you know how long it takes to get the student who is three grade levels behind to a point where he is only one grade level behind?  All we have is 180 days…  180 days to rewrite what this student has been thinking about himself for quite some time: “I’m below basic, I’m not good enough” … “inadequate” … “failure.”

I am not saying that Students with Disabilities should not be exposed to standardized testing.  They have been for years.  But what I am saying is that this “standard” decision needs to consider the very nature of an IEP: it’s an “Individualized Education Plan,” not a “Standardized Education Plan.”

How can we as Special Educators work year after year to help students master goals that are individualized only to turn around and say, “I know you can’t add two-digit numbers, but I want you to take this standardized test where half the questions involve adding with two-digit numbers so that I can see where you are.”  Umm, what?

Look, after certain early developmental stages, students recognize and realize when they just can’t do something.  And they know that their teacher knows… Think of what it does to the trust and understanding between a student and teacher when that student has to sit in front of that teacher and repeatedly fail standardized tests.

Mr. Duncan needs to be in the presence of Students with Disabilities who are assessed with DIBELS, for example, an early literacy assessment.  These students’ IEPs may stipulate “extended time,” but DIBELS administration prohibits it.  The students never get close to scoring “benchmark,” and they know it.

Mr. Duncan needs to have a conversation with the students who are overwhelmed on a standardized test of 40 questions.  If these students’ IEPs mandate chunked and tiered assignments, how can we be surprised when they are unable to finish the test?

Mr. Duncan needs to spend time with the student who has limited working memory and processing speed, so that he knows how this student feels when trying to respond to even a single question in a “standard” way.  How can we be surprised if the student quickly bubbles in answers, just hoping to get the process over with quickly?

Mr. Duncan needs to be in a room with a student who cries from anxiety during a high-stakes testing session because he is overwhelmed trying to decode all the words in a non-fiction passage, just so that he can finally get to all the comprehension questions.  True story.

Again, I am not saying that Students with Disabilities should not be exposed to standards-based measures, but I am opposed to Mr. Duncan’s one-size-fits-all approach.  I think the addition of a portfolio assessment, for example, would give us a more robust view of what students are capable of doing.

The reality is that Students with Disabilities sometimes come into our schools with disadvantages that are beyond their control (e.g., Autism Spectrum Disorders).  As Special Educators, it is our job to assure our students that growth is possible, growth matters.  But this must be individualized growth, not standardized growth.

Alexis Mays-Fields is an elementary school Special Education teacher in Washington, D.C.



Last week, Russ Whitehurst (director of the Brown Center on Education Policy at the Brookings Institution in Washington, DC) published an interesting essay proposing a novel approach to standards-based accountability.  I’ll summarize his main point below, but what I’d really like to talk about in this post is something that happens in the eighth paragraph of his several-page essay.  In two sentences, Whitehurst makes a common rhetorical move that I’ll call “bracketing.”  I want to discuss how this move affects our discourse and thinking about educational quality.

Whitehurst expresses concerns about melding the new rigor of the CCSS with the impractical “100% proficiency for 100% of students” approach of NCLB.  His novel solution involves a two-tiered accountability system: states and the federal government would be in charge of minimum competency standards, and schools and districts would take care of anything above and beyond the basics.  I trust this is a fair, if brief, summary of his article.  Now, on to his eighth paragraph…

“Note that my focus is test-based accountability.  Other things I’ll not cover here, such as students’ aspirations and soft skills, are important too.”

In his title (“The Future of Test-Based Accountability”), Whitehurst makes it clear that his discussion is about how to use test scores, not about what test scores tell us (or don’t), how they shape teaching and learning, or their unintended consequences.  These are all concerns that he chooses to “bracket.”

Careful thinking about the complex issues involved in education reform often requires us to set aside (or “bracket”) certain issues in order to narrow our focus and examine a particular issue in depth.  It is certainly defensible, then, for Whitehurst to “bracket” what he considers to be nonessential concerns within the context of his paper.

How does “bracketing” work?  Basically, by contracting the scope of the conversation.  It is a powerful silencing move because it essentially disallows discussion of the “bracketed” topic.  When the same issue is bracketed again and again, in a variety of contexts and discussions, then bringing it up can become difficult… you start to feel like that student who keeps raising his hand to say the same thing.  At some point, you begin to sense that the other students are rolling their eyes and getting irritated, and so you decide just to let it go.

This can become problematic in conversations about educational quality when citizens, policymakers, and researchers habitually (almost reflexively) “bracket” the same set of concerns and – this is the dangerous part – neglect to “un-bracket” them.  It seems to me that this has happened with the very concerns that Whitehurst “brackets,” concerns about the centrality of test scores in our concept of educational quality.

Recently, Arne Duncan announced a shift in federal policy that involves an unprecedented use of special-needs students’ standardized test scores. Predictably, he “brackets” the same issues Whitehurst does.

Next week we’ll hear from a special educator from Washington, DC, who will argue passionately against Duncan’s “bracketing.”  She’ll paint a very real picture of “students’ aspirations and soft skills” and argue that, particularly for special-needs students, the consequences of “bracketing” are just too high.


The Added Cost of Data


As a new contributor, let me just start with a big thank you! The amazing thing about this blog is its willingness to consider all voices and the value it places on the voice of the teacher. ¡Gracias!


This post is a response/addition to “The Cost of Data” written on June 16.

As a classroom teacher, I can very much attest to the ‘cost’ of data in terms of instructional time; however, that is not the only cost of data. As data continues to be used to secure funding, open and close schools, and hire and fire teachers, it’s important to also consider the following costs: the types of assessments collecting the data, and how the data is being used in the context of the day-to-day teaching of children.


In terms of the types of assessments being used, I think that there should be more transparency between the big corporations who profit from selling standards-aligned, PARCC-aligned resources and those who decide to use the new kinds of assessments. While I think PARCC is heading in the right direction with the type of performance tasks that are based in the real world and promote critical thinking and problem solving, there is disturbingly little information on how these assessments will be adapted and used in the primary grades (a shocking trend). After witnessing testing anxiety in my six-year-old students, I believe we have got to change the way we talk about testing and, consequently, data. The tests need to be developmentally appropriate and vetted by the people who actually have to use them: teachers. Amazingly enough, PARCC has posted sample assessments and asked for feedback, though I wish more teachers knew about it so they could actually give feedback.


When teachers don’t understand where the tests came from, the background of their development, or even how they address the standards, the cost is a wealth of information that teachers don’t know how to use to inform instruction or to share with parents about student performance. The cost is not only a rise in student anxiety, but also in teacher anxiety, as teachers teach to and prepare for a test they don’t really understand simply because the results are so important.


This leads me to the ‘how’ of data. In the daily life of teaching, data is used in many ways: to make small groups, to decide what and when to reteach, to differentiate instruction, and to create and monitor interventions. When used correctly, data can be the best tool to tailor instruction for students. I’ve used it myself to engage parents in supporting students at home. I’ve seen it create ‘lightbulb’ moments where parents really see their children for the first time. Yet, there is a dark side to data as well. I have watched administrators sit in ESL or SPED meetings and use data to stereotype and pigeonhole students. I’ve seen it used to bully parents instead of inspire them. I’ve seen it used to scare and intimidate teachers instead of to help them grow. I’ve seen it shared with students in an effort to ‘motivate’ them to do better only to leave them crying and wounded.


Data can be a wonderful tool for making schools and teachers better, but there is an added cost when the assessments are foreign and tied to high stakes and the data isn’t shared in a constructive way. The only way to keep making progress while using data is to have a conversation about it. Kudos, Glo, for keeping the dialogue going 🙂



Should we shoot for the moon, or aim for improvement?


A few weeks ago, I read a post that is still sticking in my craw.

Shooting Bottle Rockets at the Moon: Overcoming the Legacy of Incremental Education Reform

The author, Thomas Kane, argues that we need to stop tinkering and institute more drastic reforms in order to catch up to the highest-performing countries. He has written and researched extensively on teacher evaluation systems, so his voice is an important and informed one.

But I disagree with nearly everything he said.

I found only one area of agreement:  “In education…we do not pause long enough to consult the evidence on expected effect sizes and assemble a list of reforms that could plausibly succeed in achieving our ambitious goals.” Most of us can probably agree that education reformers do not pay enough attention to the relevant evidence, and I think this lack of attention extends beyond the expected effect sizes into things such as the limitations of the evidence base, the generalizability of the findings, and the extent of contradictory evidence.

But the parts that are sticking in my craw are pretty much everything else.

I’ll start with the title and the underlying premise. Kane argues that we have a “legacy of incremental education reform” that needs to be “overcome.” As a classroom teacher for nine years and a researcher and policy analyst for seven more, I nearly choked on reading that headline and I still can’t get over it. What legacy is he talking about? When I was young, education reform lurched from one end of the pendulum to the other, from whole language to phonics, from new math to back to basics, with lots of debate and ideological rancor in between. As I got older, education reform shifted to lofty platitudes with few specifics. Remember Goals 2000? That was the plan that basically said we’re going to fix everything about education by the year 2000. That was followed by No Child Left Behind, then Race to the Top. Do those sound incremental? Currently, our high-profile reform efforts center on the Common Core. Overall, I think the Common Core is a positive step toward deeper conceptual development, but it’s hard to argue that overhauling the standards, curricula, and assessments of 44 states within the space of a couple of years is “incremental” change. It seems much closer to the truth to suggest that we are rushing things just a bit and might want to consider “pausing” enough to move at a more incremental pace in order to implement the Common Core more carefully.

Turns out that Kane is talking about reforms such as “better professional development for teachers, higher teacher salaries, incrementally smaller class sizes, better facilities, stronger broad-band connections for schools, etc.” But those are hardly our highest-profile education reforms.

Second, Kane states that those incremental reforms he mentions aren’t big enough to get us where we need to go, and he then proposes a set of four elements that “could provide the needed thrust.” Two problems with this:

A) It’s not at all clear that those “not-big-enough” reforms that Kane disparages are actually incremental. They are mostly things that we haven’t even tried to do on a large scale. And it’s not even certain that we know how to do them on a large scale. Can anyone point me to an example of a large district or state that has actually implemented “better professional development?”  Does anyone know how to implement better professional development on a large scale? What about $130,000 teacher salaries? If the state of North Dakota, flush with money from natural gas drilling, implemented $150,000 teacher salaries across the board, would anyone call that an “incremental” change? The class size example might even be more pertinent because we had good evidence that lowering class sizes in Tennessee worked, but when that evidence was applied to the Californian context where the policy resulted in hiring many thousands of rookie teachers, the inexperience of all the new teachers appears to have wiped out any benefits that might have been created by the lower class sizes. I don’t think we are in a position to quibble over effect sizes because we are still largely in the dark about the effects themselves.

B) The four “elements” of reform with sufficient “magnitude” and “thrust” are themselves incremental improvements at best. Kane argues that the best of them might produce .045 standard deviations of improvement per year. I don’t buy the evidence for his argument on some of his preferred reforms, but even if he is right, it’s hard to describe .045 standard deviations of improvement as anything more than an “incremental” improvement. Especially when we consider that in the one large random experiment that we have, class size reduction in Tennessee kindergartens resulted in about 0.2 standard deviations of improvement after one year. For those skimming, that means class size reduction produced an improvement more than four times larger than the largest of Kane’s preferred reforms. It is important for me to note that this result faded out somewhat over time, additional years of small class sizes did not add to this effect, and these results have not been replicated in other studies. But then, those other studies are widely considered to be less reliable. So the weight of the evidence might suggest that we should drastically lower class sizes in all U.S. kindergartens.

On to my third big gripe with Mr. Kane. The reforms he proposes are not any bigger in size, nor any better in terms of their evidence base. They’re just more controversial. And that seems to be his true subtext. We can’t be so namby-pamby in education. We gots to start hurting people’s feelings and firing teachers if we want to compete with the Chinese. But of course, the policies he is proposing are controversial for many good reasons, not the least of which is that we really have no idea of what the unintended consequences would be if we were to, say, take his suggestion and not retain (or “fire”) the bottom 25 percent of teachers on value-added measures at tenure time (usually after two to three years of teaching). We have a difficult time recruiting top-notch students into teaching as it is. Will anyone with a modicum of understanding of statistics choose to enter a job knowing that they may be fired after two years based on a measure that has so much noise that a teacher who is rated at the 43rd percentile has a margin of error that ranges from the 15th percentile to the 71st (Corcoran, 2010)?

Another problem with his premise: why should we accept that incremental reform is somehow less than some grandiose promised moon shot? A large part of our ongoing crisis in American education stems from our propensity to lurch from one silver-bullet solution to the next without enough focus to actually make any solution workable. Teachers know this. A continual gripe from teachers is how the district has abandoned last year’s pet reform in favor of a new approach that teachers are expected to quickly master with little to no support, all the while knowing that this new approach is almost sure to be forsaken within mere months. Instead, the international evidence suggests that countries such as Japan have achieved long-term, ongoing growth by providing a structure in which teachers work together to create incremental improvements in instruction. These incremental improvements add up to real learning. It’s hard to see what other types of improvements could really be possible in a field as complex as human learning. So I take back my third big gripe with Mr. Kane. Sort of. It’s ok that he didn’t find any bigger-than-incremental reforms to promote. There aren’t any. But it’s not ok for him to pretend that he has found some giant-sized solutions when he really hasn’t.

And, yes, I’ve got more gripes. Such as, why is closing the gap with China a “necessary goal?” If the Chinese are truly improving their education system (which is, by the way, highly debatable, since there is a lot of evidence that the highly touted results in Shanghai come from testing only a small slice of the best students), we should celebrate that fact and rejoice in the hope that poverty, hunger, and human misery will be substantially reduced. The same is true for the improvement of any country. The growth of other countries is much more likely to lift all of humanity than it is to prove a threat.

We do, however, face real threats. Huge ones. Here are three that spring to mind: Climate change, income inequality, and the rapid pace of technological change that is projected to eliminate 50 percent of all current jobs within a generation. What are we doing to prepare for those threats? Are any of them likely to be met by increasing our PISA scores? Or do we need to begin to focus our educational reform efforts more broadly? Perhaps we should be developing involved citizens who are able to think critically and resolve political disagreements amicably. Or stretching children’s creativity and ability to adapt to new situations?

If those types of reform goals were met, they might very well bring along with them improved PISA scores and a closing of the gap with China. But they might not. And if we were able to end global warming, reduce income inequality, and find new jobs for all of our children, why would we care?

– Kevin