Tuesday, October 18, 2011

How Stacked Bars Stack Up

I'm starting to put together a workshop on Excel dashboards that will be in December. I'm playing around with a few different data sets and options, and have a quandary I'd like your help with: What is the best layout option when using stacked bar charts?

If you don't know what a stacked bar chart is, here is an example:
There is one bar representing 100% of whatever is being measured: numbers of students, percentage of scores, speeds of African and European swallows, and so forth. In the stacked bar chart shown above, there are four categories which have been numbered. Instead of each category having its own private bar, they get together and party in one space. The size of the categories, indicated by different shades of green in the example, shows their proportion of the 100%. It is a simple way to compare the relative sizes of categories...and all without some ridiculous pie chart.
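If it helps to see the arithmetic behind the picture, here is a minimal sketch in Python of how the pieces of one stacked bar get positioned: each category starts where the previous one ended, and the last one ends at 100%. The function name `stack_segments` and the percentages are made up for illustration; they are not the actual test data.

```python
from itertools import accumulate


def stack_segments(shares):
    """Return (start, end) positions for each category in one stacked bar."""
    ends = list(accumulate(shares))   # running total marks where each piece ends
    starts = [0] + ends[:-1]          # each piece starts where the previous one ended
    return list(zip(starts, ends))


# Hypothetical percentages for Levels 1-4 at one grade (not real test data).
levels = [10, 20, 45, 25]
segments = stack_segments(levels)

for level, (start, end) in enumerate(segments, start=1):
    print(f"Level {level}: {start}% to {end}%")
```

Only the first category starts at zero. Every other piece floats on top of its neighbours, which is worth keeping in mind when comparing one bar against another.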

But stacked bar graphs do not have to take things lying down. They can be vertical, too. Which leads me to my problem: When should you use the horizontal format...and when should you use the vertical format? My Google Fu hasn't turned up any rules. Seems like everyone is just letting it all hang out.

There are three layout options shown below. I know I don't have the categories labeled, but I just want to consider layout at the moment. Just FYI, the data represent scores on the 2011 Washington state test for reading. The shades of green, from lightest to darkest, show the percent of students at a school who scored at Levels 1, 2, 3, and 4.

The first layout has a very traditional look:


Labels for each grade are to the left. The graphs are (more or less) placed so that you can make a quick comparison among the grade levels. But is this better?


Same data. Same graphs. Just rotated 90 degrees. I don't know about you, but I find this easier to "read" when looking across the grade levels. It's true that there would be some issues with how closely the graphs can be pushed together due to the labels, but the overall format is okay.

And finally, my least favourite, but deserving of discussion, is this:

Considering that there will be more graphs for mathematics, science, and writing, would it be more important to look at performance for a subject area across the page and review data for a grade level down a vertical axis (not shown here, but imagine there are stacked bar charts for the other subject areas underneath)?

What do you think? Just from the standpoint of layout, which format is most meaningful for you?

4 comments:

  1. For this kind of data it's pretty obvious that the total is 100%, so you don't need to stack the data. Clustered bars may be more useful so you could compare, for example, 4th and 5th graders with a score of 3. (Since the stacking removes the common baseline, these are harder to compare.)

    Since the different grades are in order and are uniformly spaced, I would skip the bars and use a line chart. You can see small differences very clearly as slight slopes in the lines.

  2. Interesting suggestion. I always think about line graphs being used to show trends (and I do have different data that can be incorporated). But it would certainly be another way to have them play with things.

    I know that distribution is going to be a big deal with this crowd. With the upcoming changes to the way standardized testing is evaluated, we're going to need to start looking for growth (i.e. number of kids in Level 3 this year vs. last year). We could use a traditional bar graph for this, but I'm hoping to find something a little more "glanceable."

  3. Well, you answered an important question ("What do I want to show?") when you said that distributions were important, and you will want to start looking at growth.

    You might try the clustered column chart I mentioned, which essentially plots four histograms of score (one per grade):

    http://peltiertech.com/images/2011-10/ClusterColGradeScore.png

    Plotting the other way (switching data rows and columns) you can see growth (look how Ds have dropped from grade 3 to grade 6):

    http://peltiertech.com/images/2011-10/ClusterColScoreGrade.png

    You can use line charts to show distribution (in this case the clustered column might be better):

    http://peltiertech.com/images/2011-10/LineScoreGrade.png

    Line charts can show growth as well (again, maybe the clustered columns work better here):

    http://peltiertech.com/images/2011-10/LineGradeScore.png

    You may want to compare good students vs. underperforming students. One way would be to group higher scores and lower scores. I did this using a stacked bar chart, plotting Cs and Ds to the left and As and Bs to the right:

    http://peltiertech.com/images/2011-10/LikertBarGradeScore.png

    (Feel free to copy these charts to your own server and incorporate them into the comment.)

  4. Thank you!

    I'm also thinking about how these representations will work for different audiences. The group I have in December will need high level views of things. If you work with Seattle schools and have 40K students' worth of data, you may well need different views than a teacher in a classroom. Both are important stakeholders, but the choice of visual could be unique.

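The grouped layout suggested in comment 3 (lower scores to the left, higher scores to the right) comes down to shifting the stacked bar so that the boundary between the two groups sits at zero. Here is a rough sketch in Python of that positioning, assuming Levels 1 and 2 count as the lower group and Levels 3 and 4 as the higher group. The function name `diverging_segments` and the percentages are hypothetical, and this is not necessarily how the linked chart was built.

```python
def diverging_segments(low_groups, high_groups):
    """Return (start, end) positions with low groups left of zero, high groups right.

    low_groups is ordered from lowest score upward, so its last entry ends at zero.
    """
    segments = []
    position = -sum(low_groups)       # start far enough left that the low groups end at zero
    for share in low_groups + high_groups:
        segments.append((position, position + share))
        position += share
    return segments


# Hypothetical percentages (not real test data): Levels 1 and 2, then Levels 3 and 4.
segments = diverging_segments([10, 20], [45, 25])

for level, (start, end) in enumerate(segments, start=1):
    print(f"Level {level}: {start} to {end}")
```

Because every bar shares the zero line, the total below standard and the total at or above standard can each be compared across grades against a common baseline.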