In the fall of 2016, I set out to tell 10 data stories in 10 months. It was an ambitious but naïve goal. Oh, I suppose I could have done it as long as I told tiny tales (short stories?). That's not what happened. With each story I built, things got both larger and more detailed. The materials I used were of better quality. I learned to dance with The Muse, understanding that it could be weeks or months between the visits that would generate ideas and propel me forward. I shared with others and it has all become so much larger than some little side project. It's become a way of thinking, a discipline, and a path forward into what I want to do next with my life.
So, here we are in January 2020. And 3.5 years after I started this whole thing, we have finally reached the 10th story...but certainly not the end.
This is a story about how well our district serves students who have only attended our schools.
There was a lot of data with this one...the most I have ever tried to represent. And while 18,500 data points might not be overly large if you're working with them in a digital format...trying to show them physically is a time-sucking nightmare. In the end, I think I found an efficient way.
I pulled longitudinal (grades K-12) data for current seniors who have only ever attended a school in our district. There were 175 of them (out of ~600). I wrestled with which data sources to use for quite a while. Part of the challenge is that 13 years is a lot of opportunity for data systems and data collection to change. In the end, I was able to gather attendance, discipline, health room visits, enrollment in special education or the free/reduced lunch program, and performance on state assessments.
Now, out of the 18,500 possible data points, "only" 10,000 or so had something that might require representation. For example, not every student received special services (and those who did were not necessarily served every year), we didn't start using our information system to collect health room visits until these students were in third grade, and state testing doesn't happen at every grade level.
But 10,000 is still a lot. And while two of these areas (special education, free/reduced lunch) were binary, the other four had quite a range. I had to decide what sorts of ranges might be important. For a couple of these (attendance, performance), there were some already established parameters (e.g., "chronically absent" = missing 10% of the days in a year). For discipline and health, I developed some reasonable, but arbitrary, ranges. Although this might be a little indefensible, it also has the benefit of protecting student privacy.
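For anyone who wants to try this kind of bucketing digitally before picking up a paintbrush, here is a minimal sketch in Python. Only the 10% chronic-absence threshold comes from the paragraph above. Treating exactly 10% as chronically absent is an assumption, and the health room cut points and labels are hypothetical placeholders, not the ranges actually used for the pencils.

```python
from bisect import bisect_right

# From the post: "chronically absent" = missing 10% of the days in a year.
CHRONIC_ABSENCE_RATE = 0.10


def is_chronically_absent(days_absent, days_enrolled):
    """Return True when the share of days missed reaches the threshold.

    Assumption: a student at exactly 10% counts as chronically absent.
    """
    if days_enrolled <= 0:
        raise ValueError("days_enrolled must be positive")
    return days_absent / days_enrolled >= CHRONIC_ABSENCE_RATE


# Hypothetical cut points and labels for yearly health room visits.
# The real ranges are not given in the post.
HEALTH_VISIT_CUTS = [1, 5, 10]
HEALTH_VISIT_LABELS = ["none", "1-4", "5-9", "10+"]


def bin_count(count, cuts, labels):
    """Map a raw count to the label of the range it falls into."""
    return labels[bisect_right(cuts, count)]
```

Reporting only the range label, never the raw count, is what gives the privacy benefit mentioned above: two students with 6 and 9 visits end up looking identical.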
I went round and round with this one for a long time. I really wanted to build something like a game board. Along the various stops (grade levels) on the way to the end (graduation), I wanted to share some additional information about each year and the changes to both the school system and the world at large. I still think this is a great idea for a future story, but I just couldn't make it fit this one.
Somewhere along the way, someone shared a link to work by Caren Garfen. Some of her data representations use buttons or dollhouse plates or other tiny objects. I think these are so very interesting. And while I didn't end up moving in that particular direction, it did inspire me to do some sort of series of objects. This is where I finally (!) had my big a-ha. What was a small object that had six sides for my six data sets? Sure, you're probably thinking of dice now...but I thought of a pencil. What a perfect object to use. Pencils are long enough to hold longitudinal data, but small enough that 175 of them will still fit in a reasonable space.
I marked the pencils in 1 cm increments, starting at the eraser, to represent each grade level. I put one to six dots on each side of the pencil (toward the end that could be sharpened) to designate which data set was represented on that side. Then, I used leftover paint from the On the Bubble story to encode the data. This was incredibly tedious and not as precise as I would have liked, but I made it work. I sharpened the pencils that represented students who were on track to graduate at the end of the year.
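The marking scheme above can be written down as a small lookup, which might help anyone planning a similar build. This is only a sketch under stated assumptions: the post does not say which data set got which number of dots, so the side order below is hypothetical, and it assumes kindergarten is the band nearest the eraser with one 1 cm band per grade.

```python
# One 1 cm band per grade level, K through 12 (13 bands in total).
GRADES = ["K"] + [str(g) for g in range(1, 13)]
BAND_CM = 1.0

# Hypothetical assignment of data sets to sides: index + 1 = number of dots.
SIDES = [
    "attendance",
    "performance",
    "discipline",
    "special education",
    "free/reduced lunch",
    "health room visits",
]


def band_position(data_set, grade):
    """Return (dots, start_cm, end_cm) for one data point on a pencil.

    Distances are measured from the eraser end (assumed to be grade K).
    """
    dots = SIDES.index(data_set) + 1
    start = GRADES.index(grade) * BAND_CM
    return dots, start, start + BAND_CM
```

For example, `band_position("attendance", "3")` points to the one-dot side, in the band between 3 cm and 4 cm from the eraser.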
After the pencils were ready, I covered six 24" x 36" cork boards with material. Five of these were designated to hold the pencils and the other for annotations. I used painter's tape at measured intervals to mark the places where cup hooks would be to hold the pencils. I used a drill to create 350 tiny holes in the boards and screwed in the hooks. All that was left was placing the pencils, adding the annotations, and hanging the boards.
This is the most interactive visualization I have ever built. Each pencil can be independently turned so that viewers can investigate and see whatever story they want. I tried to be strategic about the sides of the pencils and which data sets were there. Attendance and performance are next to one another. Health room visits and participation in the free/reduced lunch program are neighbors. A viewer can look at the same side of every pencil at once, or pick a section (e.g., at-risk students) and look for commonalities. There's also a certain gestalt associated with seeing each elementary and how many students started there and are still with us. Even taking mobility out of the equation, there are very definitely some catchment areas where the overall instability is having some sort of impact on the ones who stay.
I thought a lot about labels for this story. For obvious (i.e., legal) reasons, I did not want to add anything to the pencils that would indicate gender, race, etc. I considered some summary data—for example, numbers or percent of females/males. But in the end, I decided to let the students label themselves. I emailed all of them and asked them for their five-word stories. Not all of them wanted to contribute, which was fine, but I liked the idea of pairing how they define themselves vs. how the system sees them. It's been awesome to read what they have to share.
This story took months. I first pulled data in early October...and it was just over three months before the final product was complete. My biggest lesson along the way had to do with figuring out what happens when you have too much data. Just use a sample? Find a different story? Aggregate things? I won't claim I found the best solution, but it has been a very valuable thought experiment.
I also just had to trust the process with this one. There were several times when I wasn't 100% convinced I was on the right track. But I ended up with something incredibly personal, detailed, and very cool. The addition of student input has also been valuable. Kids have said how honored they feel to be represented, but I hope that in the long run, they will think about the imprints they leave behind and how those do or do not represent them accurately. Doing these data stories has certainly left an impression on me over these last few years...and in the next post, I'll share more about how I plan to use them to leave an even bigger impression in our community.
If you'd like to see more pictures or learn more about our data stories project in our school district, please visit the companion page on our district web site.