Wednesday, April 29, 2015

Hide and Seek

I want to circle back to an article I wrote a few years ago about my favourite data visualization.

Hierarchical Cluster Analysis by Alex J. Bowers
It shows all of the grades earned by students during their K - 12 journeys in two school districts. I love this chart because it finds a way to show all of the data in a dense, but succinct, format.

In The Visual Display of Quantitative Information, Edward Tufte states, "Above all else show the data." While the quote was applied to a different concept for visualizing data, when I look at the chart above, it rises to the surface of my thinking. Showing the data is no small task, and as educators, we spend a lot of time and energy not doing that. We summarize the data into neat little one-letter grades or one-number test scores. As teachers, we might see a set of scores...but we are the only ones to do so, and we typically view them as numbers, not visual displays. Things hide in numbers and number sets.

But a recent paper shared in the Public Library of Science (PLoS) makes the case that things can be hidden in simple visuals, too.

CC-BY Weissgerber, Milic, Winham, Garovic

The authors of the article Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm assert that the ever-popular bar chart is a summary, and therefore "full data may suggest different conclusions from the summary statistics."  (It reminds me of Anscombe's quartet.) We often claim that pie charts are used to hide data. Et tu, bar charts?
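To make that point concrete, here's a minimal sketch (with invented score sets, not data from the article) of how identical summary statistics can mask very different distributions. A bar chart of the two means would show two identical bars:

```python
import statistics

# Two invented score sets with the same mean but very different shapes.
uniform_scores = [70, 72, 74, 76, 78, 80]   # tightly clustered around 75
bimodal_scores = [55, 56, 57, 93, 94, 95]   # two distinct clusters

# A bar chart of the means collapses both to the same bar...
print(statistics.mean(uniform_scores))   # 75
print(statistics.mean(bimodal_scores))   # 75

# ...but even a glance at the raw values (a univariate scatterplot)
# shows the second set describes two very different groups of students.
print(statistics.stdev(uniform_scores))  # ~3.7
print(statistics.stdev(bimodal_scores))  # ~20.8
```

The means match exactly, so any summary-only display hides exactly the structure Anscombe's quartet warns about.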

I won't claim that the scatterplots and bump charts in the article are ground-breaking, but this paragraph in particular caught my interest (emphasis mine):

The infrequent use of univariate scatterplots, boxplots, and histograms is a missed opportunity. The ability to independently evaluate the work of other scientists is a pillar of the scientific method. These figures facilitate this process by immediately conveying key information needed to understand the authors’ statistical analyses and interpretation of the data. This promotes critical thinking and discussion, enhances the readers’ understanding of the data, and makes the reader an active partner in the scientific process. In contrast, bar and line graphs are “visual tables” that transform the reader from an active participant into a passive consumer of statistical information. Without the opportunity for independent appraisal, the reader must rely on the authors’ statistical analyses and interpretation of the data.

As educators, we might not view our work as a scientific process, but we must engage with our data. I feel pulled between the notion above that we may be oversimplifying our data presentations and some of the research about how an audience likes its data presented---typically in the charts that are most familiar. This is not the Great Divide, mind you. We can bring these two things together with some education in the area of data literacy.

Or perhaps we underestimate our audience. I've introduced cluster maps, bump charts, and box-and-whisker diagrams to various groups this year. The first two required very little explanation. Box-and-whiskers did require a bit more orientation, but I never felt like the group using them struggled with the interpretation. I do think the concept of engagement between the visualization and the reader, as posed by the article, is important. It's a different way to view interaction---a key piece of a good-quality visual. It's not that the visual need be physically interactive...people don't have to be able to click, sort, or filter every chart---but we need to at least cause some thinking about what is presented.

After reading the PLoS article, I'm more convinced than ever that we need to think about when and why we share all the data. Bar and line charts may well be the fast food version of data viz, but we can begin to add to our visual diet by finding ways to show all of the ingredients.

Bonus Round
If you view the article on PLoS, you will have access to two Excel workbooks to help you make the charts presented in the article.

I'll share some of my own attempts to "show the data" in coming posts. Visit bump charts and cluster charts to learn more.

Sunday, April 19, 2015

Looking at Disproportionality

The concept of disproportionality underlies much of the reform movement for education in the United States. Sometimes referred to as the achievement gap or opportunity gap, the basic idea is that outcomes for all students are not equitable. While much of the conversation focuses on race, disproportionality applies to any subgroup: gender, special education, English language learners, free/reduced lunch, and so on. Over the past several years, much of the conversation about disproportionality has focused on student achievement---and test scores in particular. But just as there is more we need to look at in addition to race, there is more to children than representing them as test scores. We can look at disproportionality as it relates to sports or after school activities, student discipline, attendance, and other factors.

If you want to examine disproportionality within your system, there are a few pieces of data that you will need to know. In the example below, I'm going to use gender (male, female) as subgroups. (Note: I realize this is a very heteronormative view of gender. Our data systems need to catch up with our increasing understanding as a society about gender identification; however, right now, most school data systems are set up to capture only the binary male/female, so I'm going to use it as an example.)

First, you need to know the enrollment numbers and percentages for each subgroup. In other words, how many possible students are there who could participate in an athletic program, be subject to suspension/expulsion consequences, fail Algebra, or yes, pass the state test? Many schools report gender as close to 50/50 percent, as one might expect, but variations do exist. Don't assume that you're starting off with equal pools of participants.

Secondly, you need to know the participation numbers and percentages for each subgroup. Just because everyone is eligible to pass the state test doesn't mean that they do. So, how many males/females met the standards? How many in each subgroup were suspended? Enrolled in Physics or Calculus? Turned out for basketball?

In this example, we have a school with 250 males and 275 females, with 50 from each group enrolled in Algebra. Now we need to calculate the disproportionality.

To determine the number of males required to achieve proportionality, we scale the number of participating females by the males' share of enrollment (250/525 ≈ .476) relative to the females' share (275/525 ≈ .524): n males for proportionality = (50 × .476) / .524 ≈ 45.5. The mirror-image equation, (50 × .524) / .476, gives us 55 females needed for proportionality.

Next, we take these two values and compare them with the number of students in each subgroup who are participating. For males this would be 50 - 45.5 = 4.5; for females, 50 - 55 = -5. That -5? It means that we need five more females enrolled in Algebra to achieve proportionality.
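Here's a minimal sketch of that calculation in Python (the function name and structure are my own, not from any published tool). One nice simplification: because the total enrollment cancels out, the ratio of enrollment shares is just the ratio of enrollment counts:

```python
def needed_for_proportionality(enrolled_self, enrolled_other, participating_other):
    """Participants this subgroup would need so the two subgroups
    participate in proportion to enrollment, holding the other
    subgroup's participation fixed.  The enrollment-share ratio
    reduces to enrolled_self / enrolled_other."""
    return participating_other * (enrolled_self / enrolled_other)

# The Algebra example: 250 males and 275 females enrolled, 50 of each in Algebra.
males_needed = needed_for_proportionality(250, 275, 50)    # ~45.5
females_needed = needed_for_proportionality(275, 250, 50)  # ~55

print(50 - males_needed)    # ~4.5: males slightly over-represented
print(50 - females_needed)  # ~-5: five more females needed
```

The same function works for any two subgroups---suspensions, basketball turnout, Calculus enrollment---as long as you have the enrollment and participation counts.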

While it may not be entirely realistic to achieve perfect proportionality within a system for all programs, subgroups, and outcomes, it is still important to review these data to reflect on areas where institutionalized racism or policies may be contributing to disproportionality. Another factor to consider is the size of the subgroups that you are reviewing. For example, if you only have two or three American Indian students in a grade level, it's unreasonable to expect that they are represented in everything---but you should look to be sure that they are represented somewhere among school offerings. In that case, it may be more helpful to use longitudinal data to get an idea of participation over time.

Here's an Excel workbook that allows you to easily compare gender equity in sports programs. I built it a couple of years ago for a program that needed it, based on an idea from Debra Dalgleish. See her site for even more ideas on data entry forms...and feel free to modify mine to suit your needs.

Monday, April 13, 2015

Session Recap: Data Displays as Powerful Feedback

I had the pleasure of presenting at the ASCD annual conference last month. Each year, I stretch myself a little further in making connections between ideas, as well as between technology and content.

My session description: Developing visual literacy is a key skill for student success with Common Core State Standards. Students also need clear feedback about their progress. Using data displays, such as charts and graphs, we can integrate these goals and increase student achievement. In this interactive session, you will learn strategies that increase visual literacy and foster communication. You will also learn to effectively use data collected in classrooms as feedback with students. Both digital and analog tools for organizing and integrating data into lessons will be provided.

Session descriptions are written at least 10 months before the actual presentation happens. This extended timeline can explain why many sessions are not as promised, which is very frustrating for attendees. You pick something from the catalog that looks like it is the most perfect thing ever, only to show up and discover that the presenter has something different in mind. I try to stick as closely as possible to my submitted description, but I admit that I end up taking a little birdwalk here and there. It's hard not to---you learn so much in between submitting a proposal and actually presenting it. For me, a lot of that growth in learning has come from changing jobs this year and getting a much better on-the-ground view of the lack of visual literacy among students and teachers.

Here is the logic model that framed my presentation:

I started the presentation with a brief look at visual communication in general---pictures have been used far longer than text. Then, we talked about how graphics used as feedback have a larger impact on student achievement than nearly any other type of feedback (e.g., marking answers right/wrong). All of this was to build a case for becoming visually literate.

I won't bore you with all the details. I fused together some previous presentation materials and pulled a lot of pieces of this blog in as examples. But if you want a look at things, I have it all stashed on the same wiki as my other resources.

I was slated for 8 a.m. on the last day of the conference---not quite the worst possible time slot, but just about. So, I had a small, but awesome crowd. Lots of comments afterward made me feel good, from one gentleman who said it was the best hour of the entire conference (and asked if there would be a Part II) to another with a very heady offer I'm kicking around.

Proposals for next year are due in a month, so I am already mulling over things to share. I think that I will put in something about using questions to focus data use...and something similar to this year on visual literacy skills. We have to expand our conversation about visual literacy. We work so hard in schools to be literate in other ways. We practice rules for grammar, punctuation, and different forms of writing...all with the goal of improving communication. But for the most part, the visuals developed are junk. And that needs to change.

In the meantime, there is SO much I want to learn. I would love to try and go to the Eyeo Festival next year. Or somehow wrangle an invitation to the Tapestry Conference. I'm feeling a need to get beyond the borders of education and exchange ideas and resources. I continue to do lots of reading and thinking (no matter how quiet I am here) and am always pondering what to learn next. Isn't that what we want for our schools, too?