Data: Is it Reliable? And What do We do with it?

It’s been almost a couple of months since my last post, and I find myself thinking of data again.

Earlier this month, the Gates Foundation released the cumulative findings of its 3-year Measures of Effective Teaching (MET) research project. The report recommends a balanced approach that includes observations and student perception surveys in addition to achievement test scores. If you look at the data in the report, much can be gleaned, yet it’s easy to see that effective teaching is a very complex thing to measure.

Also in the local news this week, teachers from two different Seattle public schools have stated, for various reasons, that they are going to boycott the district’s standardized test, known as the Measures of Academic Progress (MAP).

There are many reasons standardized tests cause anxiety among students, teachers, parents, and school leaders. Often they are used as sorting mechanisms (admission into schools, rating teaching effectiveness, and putting students on a certain track are just a few examples). Yet if one approaches the data from these assessments with more purpose (to set new goals, inform one’s teaching, provide meaningful feedback, or guide learning), these measures can be useful.

Data today is abundant, but is it the right data? How data is collected, analyzed, and interpreted; how reliable it is; and what we do with it can make all the difference. Though the Gates Foundation and those Seattle Public School teachers are doing it differently, I’m glad there are many out there asking these questions.

Is Quantifying Teacher Performance Akin to Flipping A Coin?

Last week, on the way home from school, I tuned into a story on the radio titled “Seattle Releases First Teacher Ratings Based on Student Performance.” Data is great, but if you paid attention to the elections a few weeks ago, there were two kinds of math going on. Nate Silver’s FiveThirtyEight blog correctly predicted 50 out of 50 states. Karl Rove’s analysis of the data left him flummoxed. The difference was that Rove was emotionally attached and eager to win, and for some reason his analysis of the same polls was way off. Silver, by contrast, simply plugged numbers into complex algorithms.

Mathematicians have noted that test scores and teacher performance don’t necessarily have a strong correlation, yet incredible weight and cost are attached to these standardized tests. Math professor John Ewing says, “You might as well look at all the teachers and flip a coin and those that get heads, say, are good, and those that get tails are bad, and it’s not much different from using one year of growth to measure teachers.”

Ewing’s paper, “Mathematical Intimidation: Driven by the Data,” looks at the potential pitfalls of applying value-added measures to teacher evaluation.

As in the election example earlier, we often attach a lot of emotion to the data, which creates a lot of noise. This noise has the potential to lead to bias. When a teacher says, “But I’ve done this for 20 years. I know this works,” it is evident that experience plays an important role. But is there bias involved? During those 20 years, did that teacher ever once control the experiment by not utilizing a particular skill? If so, was the result the same, better, or worse? Without trying to control for various things, how does one really know if what one does works? Is it just a feeling, or is it based on empirical data?

Finally, there are so many things that make a good teacher: relationships with students, high expectations, integrity, care, leadership, collaboration, etc. Yet these are traits that a standardized test can’t measure.

Standardized test scores are a reality and here to stay. As long as graduate schools use test scores as a tool to help with admissions, and undergraduate schools do the same, high schools and middle schools won’t have much of a choice. Elementary schools just follow.

There’s a dark side to this. Children as young as Pre-K are getting tutored in test preparation. Like teachers, students have many amazing strengths and skills. However, just because they struggle with test taking, potential doors may be closed without even giving the child a chance to show the brilliance that lies within.

And what about those 21st Century Skills – Critical Thinking, Communication, Collaboration, Creativity, etc.? Will teachers stop integrating these skills into their teaching in order to meet the demands of the test scores? I hope not.

What is Assessment Literacy?

“Assessment illiterates do not understand how to produce high-quality achievement data and do not evaluate critically the data they use.”

Richard Stiggins, who has been an educational leader in assessment research, wrote that in 1995. He has spent over two decades combatting years of “assessment training neglect for teachers and administrators.” He coined the phrase ‘assessment literacy’ and urges us to use assessments in meaningful ways. A huge focus of his is assessment ‘for’ students rather than ‘of’ students.

When it comes to assessment literacy, there is so much to consider:

  • What exactly are you assessing? A product? A performance? Mastery of a standardized skill?
  • Does your assessment line up with what you’re teaching? This doesn’t mean teaching to the test, but does the scope and sequence of what students are learning align with your assessment?
  • Have you included your students in creating criteria for their assessments? Do you use rubrics? How much is teacher-generated? How much is student-generated?
  • How do you communicate these assessments to students? To parents? To other teachers? Do you do this through portfolios? Report cards? Conferences?
  • What information are you getting from a standardized test? How are you using that information? Is this information used for student improvement? School improvement? Teacher improvement?
  • What does it mean to be 2 standard deviations above the mean? How valid is the assessment?
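
To make that last question a little more concrete: if scores are assumed to follow a normal (bell) curve, which is an assumption and not something every test guarantees, then a score 2 standard deviations above the mean sits at roughly the 98th percentile. Here is a minimal Python sketch of that arithmetic. The mean of 100 and standard deviation of 15 are made-up illustration values, not taken from any real test.

```python
from statistics import NormalDist


def z_score(score, mean, sd):
    """How many standard deviations a score sits above (or below) the mean."""
    return (score - mean) / sd


def percentile_for_z(z):
    """Percent of scores at or below z, ASSUMING scores are normally distributed."""
    return NormalDist().cdf(z) * 100


# Hypothetical scale: mean 100, standard deviation 15 (illustration only).
z = z_score(130, mean=100, sd=15)
print(z)                               # 2.0
print(round(percentile_for_z(z), 1))   # 97.7
```

Whether the normal assumption holds for a particular assessment is exactly the kind of validity question worth asking the test maker.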

I’ve only just scratched the surface, but you can begin to see how complicated assessment for student learning can be. I used to consider myself literate in assessments, knowing that the field would continue to evolve and require me to keep learning. That is, until now.

My students used to be given a standardized test in the fall of each year and we’d get results back in the winter. We could analyze the results, look for trends and gaps in the school, confirm any gaps there might be in student learning, and try to address them. While this isn’t bad, Stiggins noted that instructional decisions based on an assessment that happens once a year do not have the greatest impact on student learning. And what about the students? Were they being given this information as a tool to set new learning targets?

So this year, when our school decided to move to a computer-adaptive assessment that would be administered at least three times a year, I got excited. Not only would the assessment take less than 30 minutes (the old format took about 6 hours over the course of a week), but we’d get data back immediately. Unfortunately, my excitement has turned to frustration, mostly because I can’t make heads or tails of the data. I feel like I’ve become assessment illiterate, but I know that isn’t true.

If I’m going to give my students an assessment at least 3 times a year, I want to know how it aligns with our curriculum and what action my grade-level team can take immediately. Over the course of the year, sure, we can use the data as we had previously, but in that case, why would we subject our students to it multiple times a year? Saying that it gives kids a chance to practice filling in bubbles to prepare them for future standardized tests is an argument I never bought. It clearly doesn’t apply now, as they ‘click’ their selections instead.

This time of year, we always engage the children in an author study unit. How great it would be if we could use data to fine-tune this unit and communicate this to our students. I’ll leave you with another Stiggins quote. This one from a more recent article (2009).

“Let me be clear about my mission here. The arguments I advance do not arise from a desire to end accountability-oriented standardized testing. Such tests do provide opportunities for educators to reflect on what is and is not being achieved. If educators don’t take advantage of these opportunities, it is not the fault of the tests. I will suggest specific ways for users to take far greater advantage of standardized tests in the future. But for assessment to become truly useful, politicians, school leaders, and society in general must come to understand the gross insufficiency of these tests as a basis for assessment for school improvement.”

Atlanta Public Schools Open Amid a Testing Scandal – NYTimes.com

This article from today’s NYTimes is alarming, but not all that surprising to me. What is really being assessed when tests create so much anxiety and pressure on students, parents, teachers, and administrators?

As a teacher, I’ve gone from worrying about how well my students do, to actually focusing on their mistakes. Mistakes actually provide you with a lot more information. Being able to analyze kids’ errors helps me understand and reflect on what I need to change. When I look at a test item, and see that more than half of my class got the item incorrect, it’s a good place to start asking myself why.

Unfortunately, standardized tests are good for expediency, but not always good for learning. The wrong answer doesn’t always provide enough insight. Take 2-digit subtraction with regrouping (borrowing), for example. If a child got the answer wrong, was it due to a misunderstanding of place value? Did the child have a directionality issue? Did they miss a step in the algorithm? Did they simply add by mistake? A good standardized test may include incorrect answers that reveal some of the reasons, but not necessarily all. The only way to know for sure is to observe a child doing the problem and then ask them to explain what they did and why. It’s amazing the kind of insight you can gain from a few simple questions. Furthermore, on a test that provides four possible answers, a student who had no clue still has a one-in-four chance of guessing correctly. The correct answer, on its own, provides very little information.
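
To put rough numbers on the guessing problem, here is a small Python sketch. It assumes every item has four choices and that a student with no clue picks completely at random, so each guess is right with probability 0.25. The 20-item test length is a made-up example, not a description of any particular assessment.

```python
from math import comb


def prob_at_least(k, n, p=0.25):
    """Chance of getting k or more items right out of n by pure guessing,
    where each guess is correct with probability p (binomial model)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_items = 20                       # hypothetical test length
expected_correct = n_items * 0.25  # average number right from guessing alone
print(expected_correct)            # 5.0

# Chance that a pure guesser still gets 8 or more of the 20 items right.
print(prob_at_least(8, n_items))
```

Real students rarely guess at random on every item, so this is only a floor for how much a raw score can overstate what a child actually knows.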

The other problem with some of the standardized tests out there is how long the test-makers take to score and return the results. By the time many schools get them back, it’s well past the point at which they can tell teachers something useful about what to change. With NCLB (No Child Left Behind) and RTTT (Race to the Top), the focus of test scores often becomes, “How did we do?” rather than “What can we learn from this?”

Many test companies are going to computer testing, which I think is great in terms of timeliness, but I wonder how kids 7 and under will do with a mouse. I’d rather the little ones touch their answers on a touch screen, but I suppose a mouse isn’t that far removed from filling in bubbles with a pencil.

This news story isn’t the first of its kind, but I hope it helps change the kind of pressure and anxiety that these tests can place on everyone involved. I’m not opposed to standardized tests; I think there’s a place for them. We have to keep asking, though: what are these tests actually testing, and how can they help us be more effective? I hope that the policy makers behind NCLB and RTTT can learn from their mistakes and make student assessment something that’s actually FOR students and teachers rather than an assessment OF them.

I have one other minor criticism about these tests: they create a mindset of having only one right answer to a problem. While this may be true for a test item, we know that innovation comes from thinking outside the bubble and entertaining many possible solutions to more complex problems. By all means use standardized tests, but also include student interviews, their own reflections and assessments, observations, and the myriad of other assessment tools available.

PDF: Playtime, Downtime, and Family Time

As I mentioned a couple of posts ago, a few colleagues and I were at an incredibly inspiring panel discussion about education, which featured a diverse group of speakers: the Reverend Al Sharpton, Denise Pope, Chester Finn, Kati Haycock, Nick Hanauer, and Tyrone Howard. One thing that struck me was how each said very similar things, yet each clearly had their own focus. This post focuses on Denise Pope’s angle.

Denise Pope, a senior lecturer at Stanford University’s School of Education, stuck to her main issue: that schools today do not foster healthy children, either physically or mentally. She is featured in the movie “Race to Nowhere” and has written the book Doing School: How We Are Creating a Generation of Stressed Out, Materialistic, and Miseducated Students.

I’ve only read parts of it, but here are a few things mentioned in the book:

  • homework has no correlation to success at the lower elementary levels
  • kids today don’t get enough sleep
  • they are more concerned about how to get an “A” than what they are learning
  • they are becoming more disengaged
  • they are more stressed and as a result, she concludes, have a higher rate of weight loss due to not eating, drug abuse (usually the use of stimulants), low self-esteem, and so on.

Pope co-founded Challenge Success to redefine what ‘success’ means. She asked us to imagine our bosses suddenly giving us a timed test about something school related and telling us the stakes were high. Is that really what happens in life? Tasks and learning for students should be authentic and relevant. She remains adamant that standards should be high for all students, but that the way we are going about it is unhealthy for all.

She gave us an acronym to remember: P.D.F. (and it’s not a document)
P = Playtime – kids need unstructured play (well-meaning adults structure their lives too much)
D = Downtime – just chilling
F = Family time

Our school has one half-day in-service devoted to community building. Today we enjoyed playtime, downtime, and family time. I say family because my colleagues are indeed like family to me. It was time well spent.

Here’s an article Pope wrote that’s worth reading.

Pressure Cookers are Designed for Food, not Kids

I just returned from a screening of the documentary film Race to Nowhere. If you haven’t had a chance to see it, I would recommend it to any teacher, parent, administrator, school policy maker, or high school student. This link shows where the nearest screenings are in your area. It’d be great if our school were able to host a screening for parents, teachers, and anyone in our community who wished to view it. There’s a link on that page to request a community screening.

In this country, starting in the 80s with A Nation at Risk, followed in the 2000s by No Child Left Behind, the pressure for all kids to perform at high levels on tests in order to get into college has had an adverse effect on our students’ health and their ability to think critically, find and solve problems, and work well together. After a seven-hour day of school and three to four hours of extracurricular activities, should our kids then tackle five to six hours of homework each night? Many of the examples were those of middle and high school students, but it was painful to watch a family end what was probably already a taxing day arguing about homework. The film reiterated what I’ve read and tried to advocate at my school: that there is no evidence linking homework in elementary school to achievement. The correlation begins in middle school, but after an hour of homework, the correlation disappears. By high school the correlation becomes stronger, but again, after two hours of homework, the correlation drops off significantly.

Many of the AP tests don’t test for critical thinking skills, but rather for a bulk of content. One teacher mentioned there is too much content to realistically learn, so they speed it up. The result is kids relying on cramming and cheating. Sadly, there is an increase in all kinds of stress-related disorders, with the extreme being an increase in teen suicide. It’s hard enough to be a teenager. It was extremely sad to see a parent discuss the suicide of her 13-year-old daughter over a letter grade (the letter grade was a B).

Something I struggled with was watching a teacher who, through her words and tears, was clearly passionate about teaching and cared deeply about her students. However, worn down by the bureaucracy of the system, she couldn’t take it anymore and decided to resign. There are already too few passionate teachers who care so much about what they do. Yet the system is so broken that it makes them leave the profession.

What I liked about this film is that it showed that kids today face many of the same pressures to compete for a place in a ‘decent’ college, regardless of whether they come from impoverished, low-socioeconomic schools or from wealthy suburban or private schools. The pressures trickle down from policy maker to school principal to teacher to student. Not everyone needs to go to an Ivy League school, yet many felt that it was the only choice if they wanted to be successful. What does being successful really mean anyway? The movie mentioned that in Singapore, they offer the top 20% of the graduating class free college tuition – and a stipend – to go into the teaching profession. Here we have to go an extra year and pay for it on our own just to get the basic credentials.

Schools differ in many ways, and whether the school is a specialized public charter or an independent one, the film makes a great case for reducing the stress on kids. Some want to extend the school day and take away recess and art in order to cram more content into kids’ brains. I can still remember the quadratic equation and know what to use it for, but I’ve NEVER used it since learning it in high school. Some other things, like the chemical structure of amino acids, I have completely forgotten. Is either of those things useful to me today? Did they in some way help me think in different ways? Perhaps. Or maybe I just figured out what was going to be on the test. If that’s the case, that’s not learning. Why bother teaching if you’re just going to follow a script?

It made me think of this list from Tony Wagner’s book The Global Achievement Gap. He listed seven essential skills all people need to learn:

  1. Critical Thinking and Problem-Solving
  2. Collaboration across Networks and Leading by Influence
  3. Agility and Adaptability
  4. Initiative and Entrepreneurialism
  5. Effective Oral and Written Communication
  6. Accessing and Analyzing Information
  7. Curiosity and Imagination.

Are those things nurtured, taught, and fostered in schools?  Are they tested?

The movie calls on all stakeholders to be brave and do what they care about, say what they believe in, and take the risk even when doing so may break the rules, go against policy, or seem radical to some. If your heart is in it, and you’re doing it for the students’ benefit (and, for me, if it stays true to the school’s mission), then it’s worth that risk. Those with the power to make decisions shouldn’t expect their employees to interact with students a certain way until they model what that looks like and treat their teachers the same way.

Below are a few related videos, including the film’s trailer and a panel discussion from Stanford on the issues.

If you watch the latter, you will hear that students in Finland (one of the countries that consistently produces top scores) are involved in project-based learning and have their social and emotional needs honored. They don’t ‘cover’ content. Here are some interesting links.

Edutopia

Fair Test

NYTimes article about this film.

This screening was the first of a three-part series hosted by Seattle University. I really liked what the Dean of Education said when introducing the film. The next in the series is a screening of the film “Waiting for Superman” – I can’t wait.

Making Data Beautiful

Making sense of student ERB test scores on a spreadsheet can be daunting for some, and staring at those numbers for a while can make one’s eyes a little blurry. Turning those numbers, or any kind of numerical data, into something more concrete, like a pie chart or bar graph, makes it much easier to read and grasp. Taking it one step further and pairing it with other data could reveal some interesting patterns. For example, when comparing the test scores I mentioned to those of other schools, what if we were able to include data on the size of each school as well? Would the results change? What is the statistical significance when comparing a school with one class per grade to one that might have 10 classes per grade? Does the sample size change the data set in a way that might be interesting? There are many other ways one can think about data, and there has been quite a rise in what is called the infographic: taking the data, adding some design to it, and representing it visually so it is easier to understand.
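
On the sample-size question, here is a rough Python sketch of why a grade’s average is noisier at a small school. It uses the standard error of the mean (the spread of individual scores divided by the square root of the number of students). The score spread and class size below are invented for illustration, and the sketch assumes students’ scores are independent of one another, which real classrooms may not satisfy.

```python
from math import sqrt


def standard_error(sd, n):
    """Standard error of an average taken over n independent scores."""
    return sd / sqrt(n)


student_sd = 10.0   # hypothetical spread of individual scores
class_size = 20     # hypothetical number of students per class

small_school = standard_error(student_sd, 1 * class_size)    # 1 class per grade
large_school = standard_error(student_sd, 10 * class_size)   # 10 classes per grade

print(round(small_school, 2))  # 2.24
print(round(large_school, 2))  # 0.71
```

Under these assumptions, the one-class school’s grade average bounces around about three times as much from chance alone, so a gap between the two schools may say less than it appears to.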

In his TED talk below, David McCandless draws interesting conclusions by pairing complex datasets together. So instead of simply looking at which country has the biggest military budget, he might pair that with each country’s GDP, and suddenly the results are quite different. He also has a blog called Information Is Beautiful that is definitely worth checking out.
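
Here is a small Python sketch of that pairing idea. The three countries and all of their figures are entirely made up for illustration; they are not McCandless’s data or anyone else’s. The point is only that ranking by raw budget and ranking by budget as a share of GDP can give different orders.

```python
# Entirely hypothetical figures (same arbitrary units), for illustration only.
countries = {
    "Country A": {"military_budget": 600.0, "gdp": 15000.0},
    "Country B": {"military_budget": 50.0, "gdp": 500.0},
    "Country C": {"military_budget": 100.0, "gdp": 5000.0},
}


def rank_by(data, key):
    """Return country names sorted from highest to lowest by the given measure."""
    return sorted(data, key=lambda name: key(data[name]), reverse=True)


by_budget = rank_by(countries, lambda c: c["military_budget"])
by_share = rank_by(countries, lambda c: c["military_budget"] / c["gdp"])

print(by_budget)  # ['Country A', 'Country C', 'Country B']
print(by_share)   # ['Country B', 'Country A', 'Country C']
```

The same move applies to school data: a raw score ranking and a ranking adjusted for something like school size may not agree.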