Insight
ELA scores — the uncut version
May 21, 2009 12:49 PM
If you are in an elementary or middle school, you’ve probably been talking about the 3rd-to-8th-grade ELA test scores, with their good news for schools that showed large gains, and with piles of new data to digest.
Most of the talk will be about that “bottom line” number, the percentage at Levels 3 and 4. But that number is like the trailer for the movie. It shows the clincher — the big kiss, the final gunfight, or the momentous turning point. (Hey, we try to make these scores sexy.)
But the story behind the trailer — how the characters got to this defining moment — requires at least two hours of seat time. The story is always more complex, less clear-cut, and ultimately more revealing than that “bottom line” snapshot. In it resides a fuller, truer version of events.
The back story
This year, state officials reported the ELA scores in far more detail than ever before. With four years of test results available from new state tests first launched in 2006, the state education commissioner and new chancellor of the Board of Regents believed it was time to examine the scores in two additional ways: looking at actual “scale” scores rather than just percentages at Levels 1-4; and looking at groups of students over time as they move through the grades.
This is not to say that the percentage of students at Levels 3 and 4 is not an important indicator. In a standards-based system such as New York’s, the proportion of students who perform at or above the cutoff level that state testing experts determine is “meeting or exceeding standards” is basic accountability information. And it tells the public whether students are on track. But the cutoff points are somewhat arbitrary, even if the experts say they are scientific. And they give a pretty black-and-white view of what is really a very colorful universe.
These new analyses should help teachers get a more honest and detailed view of how their students and their schools are doing. And in truth, this kind of analysis is not new. It’s been done before — you may also have seen it in ARIS — but it wasn’t widely publicized.
Scale scores: A sober view
Scale scores are the actual numerical grades that students get on the tests, on a scale of about 400 to 800. They are not exactly the number of right answers, since the test makers adjust for the difficulty of each question. But scale scores are more detailed than performance levels, or even than those levels broken into tenths (such as a 2.8 or 3.5). And they work better as an indicator of students’ progress over time.
On the state tests, a scale score of 650 is the cutoff for Level 3.
Judging by scale scores, New York City students have shown steady but moderate progress over the last four years, not the irregular leaps and bounds that the changing percentages at Levels 3 and 4 show. Chart 1 [above] shows the mean (average) scale scores in each grade from 2006 to 2009. Except for some early falloff in grades 3 and 4, the pattern shows that each grade’s average scale score was higher than that grade’s average the year before. For example, 5th-graders in 2006 had an average score of 654, while in 2009 5th-graders averaged 669, an increase of 15 scale points.
There is some very good news here in the middle schools. The 6th, 7th and 8th grades show scale score gains of 16, 18 and 14 points, respectively, over the last four years, far more than the three- and six-point gains in grades 3 and 4. And, in fact, the state education commissioner singled out the middle schools for special recognition when scores were announced.
Scale scores do not correct for score inflation: the possibility still exists that scores rise simply because students are more familiar with the test. But they do correct for the distortions that occur when only levels are used to judge progress. A class whose average went from 649 last year to 651 this year may earn a pizza party, but that small bump over the Level 3 line may not reflect a lot more learning.
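For readers who like to see the arithmetic, here is a rough sketch of that distortion in Python. It uses only numbers from this article: the 650 cutoff, the pizza-party class (649 to 651) and the citywide 5th-grade averages (654 to 669). The function names are made up for illustration, and two simplifications are assumed: that a score of exactly 650 counts as Level 3, and that an average can be treated like a single score, as the pizza-party example does.

```python
# Level 3 cutoff on the state ELA tests, as given in the article.
LEVEL_3_CUTOFF = 650


def meets_level_3(score):
    # Assumption: a score exactly at the cutoff counts as Level 3.
    return score >= LEVEL_3_CUTOFF


def compare(before, after):
    """Summarize a change two ways: in scale points, and as a cutoff crossing."""
    return {
        "scale_gain": after - before,
        "crossed_cutoff": meets_level_3(after) and not meets_level_3(before),
    }


# The pizza-party class: a tiny gain that happens to cross the line.
pizza_party_class = compare(649, 651)

# Citywide 5th-grade averages, 2006 vs. 2009: a large gain, no line crossed.
fifth_grade_citywide = compare(654, 669)

print(pizza_party_class)
print(fifth_grade_citywide)
```

The two-point gain registers as a jump to Level 3, while the 15-point gain registers as no change at all when only the level is reported. That is the gap that scale scores close.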
Pseudo cohorts: A growth story
A bit harder to follow but very important for gaining insight is a so-called “cohort” analysis. This tracks the performance of a group of students, starting when they are in 3rd grade and then following them through 4th, 5th, 6th, etc. The State Education Department presented what it called a “pseudo” cohort analysis this year, following groups of students over four years, while acknowledging that they may not be the exact same kids. Still, this way of looking at scores allows us to observe real growth.
What this analysis showed is quite remarkable. In general, as a group of students moves up the grades, the students do better on their annual tests. This is evidence that schools and teachers are adding value every year, more than simply helping students to get through the grade’s curriculum.
Chart 2 [above] shows this. For example, the Class of 2015 are 6th-graders this year, and they had an average scale score of 662. When they were in 5th grade they averaged a point lower (661); in 4th grade they averaged quite a lot lower (654); though in 3rd grade they were up at 661. So they have mostly trended upward and gained a point overall.
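The Class of 2015 example can be worked through in a few lines of Python. This is only a sketch: the scores are the four averages quoted above, and the variable and function names are invented for illustration, not taken from the State Education Department's analysis.

```python
# Average scale scores for the Class of 2015 by grade, as quoted in the article.
class_of_2015 = [
    (3, 661),
    (4, 654),
    (5, 661),
    (6, 662),
]


def grade_to_grade_changes(cohort):
    """Return (from_grade, to_grade, change) for each consecutive pair of grades."""
    return [
        (prev_grade, next_grade, next_score - prev_score)
        for (prev_grade, prev_score), (next_grade, next_score)
        in zip(cohort, cohort[1:])
    ]


def overall_change(cohort):
    """Change in average scale score from the first grade listed to the last."""
    return cohort[-1][1] - cohort[0][1]


changes = grade_to_grade_changes(class_of_2015)
for from_grade, to_grade, change in changes:
    print(f"Grade {from_grade} to grade {to_grade}: {change:+d}")
print(f"Overall: {overall_change(class_of_2015):+d}")
```

The grade-to-grade view shows the dip in 4th grade and the recovery afterward, which the single "gained a point overall" figure hides.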
Notice, too, that the younger the class at present, the higher its overall scores. The Class of 2015 is doing better than the Class of 2014, which scores higher than the Class of 2013. This year’s 3rd grade is starting at a scale score of 664, much higher than all three previous 3rd grades.
The growth is not uniform. The Class of 2013, which is in 8th grade this year, is down a point after four years. The Class of 2014 recovered this year in 7th grade, but showed declines from 4th to 5th grade and from 5th to 6th. Still, the overall trend is up.
What these numbers cannot tell us, of course, is why the growth occurred. The mayor and chancellor would say it is because of mayoral control. Others find the progress random. But in presenting the results, State Education Commissioner Richard Mills was not in doubt. He pointed out that cities like Syracuse and Buffalo had made even greater gains and did not have mayoral control. Instead, he cited large increases in state education spending, the state’s implementation of a grade-by-grade curriculum linked to the standards, and teachers’ increasing skill at teaching it. Merryl Tisch, the chancellor of the Board of Regents, was equally certain that it was teaching that caused the gains. She told the UFT Spring Conference, “I come here to set the record straight. Scores went up because all of you went to work every day, and worked hard and worked as professionals. You all made it happen.” Second that.

