Thursday, May 7, 2015

Show the Data: Cluster Charts

In the last post, we explored the idea of adding bump(s) charts to our rotation of how we communicate our data. It's one way to show all of the data in a particular set. Another one I've been using quite a bit is a cluster chart. Full disclosure here, these are my own take on displaying data---a bastardized heat map, and certainly not based on heavy-duty math like real hierarchical cluster charts. So, really, I'm not sure what to call these...but in my current role, we're finding them to be very useful and I'm just rolling with cluster charts as my category.

This spreadsheet will eat your soul.
I get a lot of spreadsheets sent to me that look like the one on the right. I hate these with a fiery passion for a variety of reasons:
  • Too much "ink" in the data-to-ink ratio. With all of those little boxes, I don't know where to look.
  • And the colors. I feel like a circus came to town. But beyond that, the red and green are not particularly friendly to those with color vision issues...and I do work with some who are color blind. Are we really asking them to try to make decisions on student learning based on this?
  • Not to mention that all of the data is colored in. What's the point?
  • And we have both numbers and colors. I'm not saying that you can't have both...or that they don't serve different purposes...but it's distracting. I'm constantly trying to make sense of the number patterns for each color.
I also think these data aren't useful because of the way they are organized. Alphabetical order is great for gradebooks, but not so much for trying to make sense of the data. Plus, we don't have any context---what if we're missing some signal in the noise? Suppose all of our low-performing students are boys...or in a minority group?

But let's say you are interested in showing both the progress students have made over time, as well as the characteristics of the students involved. We can reorganize the data by ranking the percentages on the second assessment (this is the "cluster" part). Then we can color code some additional information, such as gender or participation in a particular program. I also change the properties of the conditional formatting so that the fill and text are the same color, making the values seem invisible. Finally, I add thick white borders around all of the cells and resize the rows and columns. Here is a small part of the final product:

These are all the students who scored in the top level of our fictional grade 5 winter math assessment. Three of them improved a little, from light blue to a darker blue...others improved from lower on the scale (orange). But when we look at gender and program, another story emerges. Most of the students in the top category are female, not on free or reduced lunch, not in special education, and do not receive additional interventions through a Title I program.

See the difference when we look at students who have scored at the bottom for both fall and winter? Our population is mostly male and nearly everyone participates in one or more federal programs.
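For anyone who would rather script the "cluster" step than sort by hand, here is a minimal sketch of the reorganization described above: rank students by their second (winter) assessment, then carry the demographic and program flags along as simple markers. The student records, field names, and the `#`/`.` markers are all made up for illustration; my actual charts are built in a spreadsheet with conditional formatting, not with this code.

```python
from dataclasses import dataclass


@dataclass
class Student:
    # All fields are hypothetical stand-ins for the columns in the chart.
    name: str
    fall_pct: float
    winter_pct: float
    gender: str
    free_reduced_lunch: bool
    special_ed: bool
    title_i: bool


# Invented records for illustration only (not real student data).
students = [
    Student("Student A", 55.0, 82.0, "F", False, False, False),
    Student("Student B", 30.0, 35.0, "M", True, True, True),
    Student("Student C", 78.0, 90.0, "F", False, False, False),
    Student("Student D", 42.0, 60.0, "M", True, False, True),
]


def cluster_by_score(records):
    """Rank students from highest to lowest winter percentage."""
    return sorted(records, key=lambda s: s.winter_pct, reverse=True)


def flag(value):
    """Stand-in for a filled or empty colored cell."""
    return "#" if value else "."


def render(records):
    """Build one text row per student: scores, gender, then program flags."""
    rows = []
    for s in records:
        programs = flag(s.free_reduced_lunch) + flag(s.special_ed) + flag(s.title_i)
        rows.append(
            f"{s.name:<10} {s.fall_pct:5.1f} {s.winter_pct:5.1f} {s.gender} {programs}"
        )
    return rows


ranked = cluster_by_score(students)
for row in render(ranked):
    print(row)
```

Sorting on the winter score is what pulls similarly performing students next to each other; the flags only become readable as a pattern once that ordering is in place.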

Maybe this representation doesn't necessarily hold any surprises, especially as we factor in free/reduced lunch. Children living in poverty typically do not perform as well as their peers. But one of the things I take away from visualizing the story this way is that we may need different interventions to support these students. Consider Student 39 on the right. He is receiving free or reduced lunch, special education and Title I services...and he's still ranked fourth from the bottom out of nearly 70 students. It doesn't mean that the school (and the student) aren't working as hard as they can. I do think it might mean that there are additional factors at work here that aren't (and can't be) addressed through the school. Perhaps the family is homeless or transient. Maybe the parents are going through a divorce...or the student has some medical issues. These are community-based issues and require different interventions to help close the gap for the student. I won't get up on my left coast soapbox about this right now. I'll just say that we have to work together on behalf of the whole child.

One of the pieces of information that is not represented in the visuals above is the number one item on teacher wishlists when it comes to reporting scores: progress. Sure, we have a bunch of students performing at the lowest level in the picture above, but that doesn't mean that they didn't make some growth.

This time around, I left the gender and program pieces coded the same, but I calculated the percent change between fall and winter and represented those in the leftmost column.
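As a rough sketch of that calculation, assuming "percent change" means the usual relative change (new minus old, divided by old, times 100) rather than a simple difference in percentage points, the growth column and the re-ranking might look like this. The scores below are invented, and `percent_change` is just a name I'm using for illustration.

```python
def percent_change(fall, winter):
    """Relative change from the fall score to the winter score, in percent.

    Returns None when the fall score is 0, since the change is undefined.
    """
    if fall == 0:
        return None
    return (winter - fall) / fall * 100


# Invented (fall, winter) percentages for illustration only.
scores = {
    "Student A": (40.0, 50.0),
    "Student B": (80.0, 84.0),
    "Student C": (20.0, 35.0),
}

# Compute growth for each student, skipping anyone with an undefined change.
growth = {}
for name, (fall, winter) in scores.items():
    change = percent_change(fall, winter)
    if change is not None:
        growth[name] = change

# Cluster by progress instead of by score: biggest gains first.
ranked_by_growth = sorted(growth, key=lambda name: growth[name], reverse=True)

for name in ranked_by_growth:
    print(f"{name:<10} {growth[name]:6.1f}%")
```

Note how a student with a low starting score can land at the top of this ordering even when the winter score is still the lowest in the group.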

Look at Student 39 now. He's 12th from the top. Woo-hoo!

When we consider progress, we start to get a more equitable pattern---everyone is growing, and more often than not, it's our lowest performers who are making the biggest gains, even if they're still in the lowest part of the score breakdown.

By clustering similarly performing students together, either by scores or by progress, we get a much more useful pattern than we do with a spreadsheet that looks like a clown exploded on it. And, more importantly, we can show the data. In a very compact space, I can display everyone's scores and whatever demographic or program information is most relevant. And, I can fit the whole grade level on a single page.

I have no doubt that as we move forward, smarter people than me (I?) will continue to find new charts that help share everything we know about a group. Summary stats and charts will never go away, and they have their own purpose to serve. But sometimes we want the full version, not the CliffsNotes. When we do, bump and cluster charts will be there.

Bonus Round
Want to see just how challenging colored squares can be? Play this online game. The rules are easy: just click on the one square that is different.
