The Muddy Language of Teacher Evaluation


“We need high-quality teacher evaluation systems to make sure that every student is taught by a quality teacher.” 

[Note: For those of you who prefer your commentary in cartoon form, click here.]

We see statements like this everywhere in education reform, and they’re worded in such a way that it’s hard to disagree with them.  Do you want to make sure every student has a quality teacher?  Of course, who doesn’t?  To do this, you need a way to gauge teacher quality, right?  Sure.

But there’s something else going on here.  Something that gets blurred by the vague language of the statement.  Something that has to do with the mechanism behind “making sure that…”

When we evaluate something (our dietary habits, for example, or a car we’re considering buying, or a teacher’s practice), it’s helpful to figure out whether we’re seeking information in order to improve the thing (“formative evaluation”) or to decide whether to keep or reject the thing (“summative evaluation”).

Presumably, when we evaluate our eating habits, we have no intention of stopping eating; rather, we want to see what’s going on, what’s working, what’s not working, what we can change.  That is, we’re carrying out a “formative evaluation.”  But when we go car shopping, we’re looking to make a decision, a definitive, thumbs-up-or-thumbs-down decision; we’re engaged in “summative evaluation.”

What about when we evaluate teachers?  Clearly, we need to evaluate for both of these purposes.  Teachers need information about their practice so that they can reflect on it and improve.  And administrators need information about the quality of work their employees are engaged in, in order to make personnel decisions.

Formative teacher evaluation makes everyone feel warm and fuzzy inside.  We picture dedicated, reflective teachers, studying their craft, honing their skills, becoming the best that they can possibly be and raising up their students in the process.  And summative teacher evaluation is, honestly, a little awkward.  Because, let’s face it, it amounts to firing people.

The clever double-speak that we sensed in the opening quote (which nearly everyone, from every corner of the reform debate, is patently guilty of) is this: there is no acknowledgement of the difference between formative and summative teacher evaluation.  All too often, ambiguous language masks the very real question: How exactly are we going to make sure that every student has a quality teacher?

By improving our existing teaching force?  Or by “firing our way to the top,” as Mr. Duncan so delightfully put it?  The solution is, of course, that we need both formative and summative teacher evaluation.  But unless we recognize that both exist, unless we get behind the blurry language and articulate this distinction, we risk derailing potentially productive policy conversations and descending into sound-bite-ridden shouting matches.  That’s a reform strategy that, I think we can agree, goes absolutely nowhere.



End Homelessness…Leave No Child Behind…


Putting my feet where my words have been, my family and I will be walking in the LA Homewalk this Saturday to raise money to end homelessness.  Here’s the link to our page in case anyone wants to support the cause with a donation.

But I’m a pretty terrible salesperson, and I can’t say that I’m convinced that the goal is doable.  Can we really end homelessness?

I guess we could, if we could all, or enough of us, agree that it was a priority.

But it’s gotten me thinking about all these audacious goals:

  • No Child Left Behind: no child will be below grade level by 2014
  • John Deasy: every student in LA will be taught by a good to great teacher
  • United Way: end homelessness in LA
  • Teach For America: One Day, all children will have an equal opportunity for an excellent education

Used judiciously, lofty goals inspire collective action.  But, used wantonly, they risk insulting the efforts of people who have been struggling to make little differences every day, and disillusioning others who might otherwise want to pitch in and help out.

Don’t Throw the Standards Out With the Bathwater


An essay by Carol Burris was recently published in the Washington Post that discussed elementary school math tests aligned with the Common Core State Standards (CCSS).  I love reading Burris’s commentary.  She’s a talented writer, an astute observer and critic of current education reform policies.  But this essay doesn’t sit quite right with me…

Regardless of whether you are a fan of the CCSS or not, it’s really important to distinguish between the Standards and their implementation.  That is, let’s be clear that there is an important difference between

  • the actual document (you know, the one that teachers are going to keep in a binder on their desk and refer to when they have questions about what their students are supposed to learn during the year)

  • all the other stuff (you know, the fist-banging, hackles-raising, growl-inducing stuff…  like roll-out policies; professional development for teachers; implications for teacher evaluation systems; high-stakes, standardized tests; intended and unintended consequences; implications of inflexible grade demarcations; blah; blah; blah)

I’m afraid Ms. Burris’s essay conflated these two things.  What she did was: 

(1) analyze a particularly heinous example of a CCSS-aligned, first grade math test

(2) trash it (justifiably) for its developmental inappropriateness

(3) parlay this into beef with the actual CCSS document

In a nutshell, I think what Ms. Burris overlooked, and what it would behoove us to remember, is that there are developmentally appropriate ways to assess rigorous, conceptually distinct standards for young students.  These methods are certainly not the norm, but they exist.

In my teaching career, I found two early childhood assessments that were so developmentally appropriate, that were so well designed, that yielded so much immediately useful information for me as a teacher, that I would have chosen to administer them to my students even if they hadn’t been mandated by my district. 

These two assessments were administered one-on-one, involved myriad manipulatives (counters, dice, little plastic bears, flashcards, letter cards, game boards, tokens, geometric blocks, etc.), and were structured as interactive activities.  Honestly, students thought they were games and would argue and clamor to be tested.  And because these assessments were so thoughtfully designed, when the testing window closed, I would have a very clear picture of my students’ current levels of understanding.

My point is this: don’t throw the baby out with the bathwater.  I will be the first in line to voice my concerns about how the Standards are being implemented.  But it’s worth remembering, in any discussion about the CCSS, that at the heart of the debate, behind all the yelling, is a surprisingly traditional, well-articulated standards document that is endorsed by the vast majority of teachers.




When I was a kid in Michigan and we drove over a bump in the road, or someone slipped, or made a mistake, my mom would exclaim, “whoops-a-daisy!”  Just last week I learned the correct spelling.

John Deasy, LAUSD Superintendent, almost resigned last week, but then didn’t.  According to a local paper, he threatened to resign earlier this summer, too.   In September, he began distributing iPads to every student.  A week later, implementation stopped in its tracks.  (Something about shocking revelations that students hacked into the devices to play games, and apparently no one had thought about what to do if the iPads got lost, broken, or stolen.)  In August, the iPad plan was going to cost $1 billion.  In October, cost estimates rose.

What to make of John Deasy?  He speaks earnestly of improving student outcomes.  Test scores and graduation rates are up, suspensions down.  Yet, he keeps rocking the old LAUSD Impala into every iPothole in the road, careening from one controversial decision to the next.

Perhaps it’s time to ask some deeper questions: Why do we keep bouncing from crisis to crisis? How might we change this pattern?

As I’ve thought about these questions, I keep coming back to two things:

First, the time I heard John Deasy speak at UCLA about a year ago.

Second, a paper that Donald Campbell originally wrote in 1971, which, coincidentally, was the year I was born.

First, when I heard John Deasy speak, he started his speech by asking if anyone in the audience was a student of mythology.  I thought he said methodology, so I raised my hand, and then was rather embarrassed when he asked me to comment on the Greek myth he had used as an analogy.  I say this by way of struggling to excuse my silence when he said something that actually did need correcting.  I’ll try to atone for that now.  But I’m also struggling to give you a sense of John Deasy’s charisma.  He’s a good speaker.  He tells interesting stories, and he cares about the things he talks about.  That night, he was talking about improving teaching, improving teacher evaluation, and not accepting excuses.

“I know some people are uncomfortable about this,” Deasy said.  “But if you want ‘uncomfortable,’ come with me to visit some of our 23,000 students who are homeless.  Not roofless, homeless.  These are kids who have no shelter to sleep in.  They are coming to school with incredible challenges in their lives.  And the only hope they have is to graduate career and college ready.  And the only hope they have for that is to be in a class with a good or great teacher, not just one year, but every year.  That’s our challenge.  We need to provide every student with a good or great teacher.”

This is not an exact quote, but it is close.  And he has expressed similar sentiments on other occasions.  23,000 children in LAUSD are homeless, and the solution is…create a teacher evaluation system so that they have better teachers.

He probably didn’t quite mean that a teacher is more important than a home, but it got me thinking.  From a certain perspective, his logic is compelling and even unassailable.  The Superintendent’s primary, perhaps only, responsibility is to ensure that every student gets a good to great education, and the teacher is the person primarily responsible for delivering that.

Yet, when John Deasy wonders why substantial numbers of teachers are suspicious of his plans (a union poll in April found that 16,000 teachers had “no confidence” in him while only 1,600 expressed support), he might also wonder why he speaks of the problem of homeless students as a reason to revamp teacher evaluation, while other area leaders, such as Kobe Bryant, are working on more direct solutions.  There are realistic, research-based plans that hope to provide homes to every child, and every person, in LA.  But no one questioned Dr. Deasy’s rhetorical flourish that day.

Perhaps we’ve reached a point where no one is actually listening to anything.  One side favors reform.  The other side supports teachers.  Winning requires exaggerating your side, and vilifying the other.  Listening not required, even to your own words.

It’s clear that John Deasy is committed to making teacher evaluations tougher and making teachers better.  But what if, fresh from the stress of facing his own rigorous evaluation, he stopped and said out loud,

“Whoops-a-Deasy!  Teacher evaluation in LAUSD needs changing.  But, we can’t ignore the need to end homelessness, fight poverty, and address the many other needs of children and parents in LA.  Schools and teachers can’t fix all our problems by themselves, and a good teacher is no substitute for a good roof.  So this year, I’m going to ask Warren Fletcher, and every teacher, principal, nurse and bus driver, to join me (and Kobe) at Homewalk to raise money to end homelessness…”?

What if, in other words, John Deasy used his high profile position to fight alongside teachers for broader improvements in the lives of children?

And, here’s a second what if: What if we all stopped making exaggerated claims for our preferred solutions in our perpetual debates, and instead admitted that every idea has limitations?  We might then try to design policies that give us some information.

This brings me to Campbell’s 1971 paper.  Campbell argues that social scientists should prepare for, and strive to develop, an “experimenting society,” a society that is “scientific in the fullest sense of the word,” based on “honesty, open criticism, experimentation, willingness to change once-advocated theories in the face of [new] evidence.”  Campbell contrasts such a society with “an earlier use of the term scientific.”  In those early days, before I was born, people apparently applied the word “scientific” to talk about how one scientific theory is judged as true, and then “on the basis of this scientific theory, extrapolations are made” to design optimal programs.  In other words, in the old days, some people used “scientific evidence” solely to justify their own preferred theory.

But what if, having come to believe that every student might need an iPad, we followed Campbell’s advice and continued to test this idea as we implemented it?  What if certain schools were assigned iPads for every student, other schools were assigned Nexus tablets, and other schools were provided with a school nurse or a music teacher?  Is anyone on the School Board or in the Superintendent’s office completely convinced that the students who received iPads would be better off?  Campbell hopes that, “Just as in science objectivity is achieved by the competitive criticism of independent scientists, so too the experimenting society will provide social organizational features making competitive criticism possible at the level of social experimentation.”  An iPad experiment seems like just about the perfect place to start such a society, since no one really knows what the effects of giving every student an iPad will be.

Campbell’s vision could work in other areas as well.  Instead of rhetorical battles over competing theories of teacher evaluation, why not test out the district’s evaluation plan in some schools and UTLA’s plan in others?  (Disclosure: I helped work on UTLA’s plan, so I happen to think it’s a pretty decent one.)

But instead, we still seem stuck in the “earlier use of the term scientific,” stuck in the days before I was even born.  Stuck in a society that pays homage to scientific evidence, but plays hooky on practicing the scientific method.

In fact, it may be worse than that.  We may get the worst of “science” without its benefits.  We are addicted to technocratic solutions, so we rush to embrace sophisticated value added models and smooth iPads as the answers, but we have no patience for scientific debate, so we shout loudly for our preferred solutions and never take the time to figure out what might really be working.

Three more lessons we can take from Dr. Campbell’s birthday present to me: First, he speaks of the real possibility that he is wrong in advocating for an experimenting society: “we should keep open the possibility that we will end up opposing [the idea we are now advocating].”  So we all need to be quick to exclaim, “Whoops-a-Deasy!  My bad.  Let’s try that again in a different way.  What do you think we ought to do?”

Related to that, Campbell lays out the drawbacks of an experimenting society, including the genuine fear of evaluation and the “measurement machinery” that “is understandably feared, and is more to be feared the more elaborate and scientific it appears to be.”  Sounds like a dead ringer for many people’s fears about value added models for teacher evaluation.  And it seems like Dr. Deasy is less than fond of harsh evaluation himself, so perhaps we can all remember to assume the best of intentions in our rivals, even when they seem afraid or suspicious of “reform.”

Finally, he calls out Machiavelli by reminding us that we will never get to the “asymptote of perfection….Ends cannot be used to justify means, for all we can look forward to are means. The means, the transitional steps, must in themselves be improvements.”

So, if all we can be sure of is that neither a new evaluation system nor a new iPad will bring about nirvana, then perhaps we’d better be sure that we decide on the new gadgets to try out through transparent procedures which will at least teach us something in the process.  After all, isn’t this whole thing supposed to be about learning?