Last week, we met the most adorable of Exploratory Data Analysis graphs, the Box and Whisker Plot. This week, we begin to move beyond EDA (though not too far, and certainly not all at once).
Looking at what we were doing last week in our Box and Whisker Plot, we see that we had student performance on a single item. While this is interesting enough, it presents a very static picture.
One of the things we like to look for in assessment is whether something has improved after, well, improvements are made. To do that, we need to track our data long-term. In fact, when we start collecting the data, we may not even know precisely how we might use it. Thus, it helps to be a digital hoarder.
Below in Figure 1, I’ve created a fake set of student data. I did this in an Excel file, and I can just append new data from new terms and new students as often as I wish. Suppose I do this for Spring 2014, Fall 2014, and Spring 2015.
This format makes it easy to see that the typical Final Exam Score is improving. Even better, we can see that not only is the median score improving, but the total length of each box plot is getting a bit shorter. This means we’re getting more consistent student performance.
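If you would rather see the numbers behind the boxes, here is a minimal Python sketch. The scores are invented for illustration (they are not the data in my figures), and the helper name five_number_summary is my own. It takes one-row-per-student records, groups them by term, and computes the five values a box plot draws. Note that I chose the "inclusive" quartile method as an assumption; Excel and other tools may compute quartiles slightly differently.

```python
from collections import defaultdict
from statistics import median, quantiles

# Hypothetical records, one row per student: (term, final exam score).
# These numbers are made up for illustration only.
records = [
    ("Spring 2014", 48), ("Spring 2014", 60), ("Spring 2014", 66),
    ("Spring 2014", 70), ("Spring 2014", 75), ("Spring 2014", 83),
    ("Spring 2014", 95),
    ("Fall 2014", 55), ("Fall 2014", 64), ("Fall 2014", 70),
    ("Fall 2014", 74), ("Fall 2014", 78), ("Fall 2014", 84),
    ("Fall 2014", 93),
    ("Spring 2015", 62), ("Spring 2015", 70), ("Spring 2015", 74),
    ("Spring 2015", 78), ("Spring 2015", 81), ("Spring 2015", 86),
    ("Spring 2015", 92),
]

def five_number_summary(scores):
    """Return (min, Q1, median, Q3, max) for a list of scores."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return min(scores), q1, median(scores), q3, max(scores)

# Group scores by term; appending a new term is just adding more rows.
by_term = defaultdict(list)
for term, score in records:
    by_term[term].append(score)

summaries = {term: five_number_summary(s) for term, s in by_term.items()}

for term, (low, q1, med, q3, high) in summaries.items():
    print(f"{term}: median={med}, box length (IQR)={q3 - q1}, "
          f"whisker-to-whisker range={high - low}")
```

With this made-up data, the median rises each term while both the box length and the overall range shrink, which is the pattern described above.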
Right now, these sorts of graphs are just a bit tricky to produce. However, I wanted us to start looking at them now, for two reasons. The first is that the 2016 version of Microsoft Excel is going to have native support for box plots, so I foresee these becoming a great deal more common in the near future. The other reason is that I want to start using them to easily communicate much fancier comparisons than just single-number summaries.
For example, suppose instead of the above graph, I had the one in Figure 3. Each one of these has about the same median score; not much has changed there. If I were reporting data with just a single number, it would read 72%, 73%, 72.5%. But notice how much more consistent each class is getting! Instead of a very tall spread of data, the plot gets shorter with each term, so the high-performing students and the low-performing students end up much closer together!
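To make that concrete, here is a small Python sketch. The scores are hypothetical: I invented them so the medians match the 72, 73, and 72.5 quoted above, and the term labels and the helper name median_and_iqr are placeholders of my own. It reports the median next to the interquartile range (the length of the box), again assuming the "inclusive" quartile method.

```python
from statistics import median, quantiles

# Hypothetical scores, invented so the medians come out to 72, 73, and 72.5.
terms = {
    "Term 1": [40, 55, 65, 72, 80, 90, 100],
    "Term 2": [55, 63, 69, 73, 78, 84, 92],
    "Term 3": [64, 68, 70, 72, 73, 75, 77, 80],
}

def median_and_iqr(scores):
    """Return (median, interquartile range) for a list of scores."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return median(scores), q3 - q1

report = {term: median_and_iqr(scores) for term, scores in terms.items()}

for term, (med, iqr) in report.items():
    print(f"{term}: median={med}, IQR={iqr}")
```

A report that listed only the medians would make these three terms look nearly identical, while the IQR column shows the spread shrinking sharply from one term to the next.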
Please note that I’m not debating whether we actually want class data that looks this way! I do claim, though, that looking at the summary in the box plots can give us a much fuller picture of what our data truly look like, and a much better one than a single point estimate such as a mean/average score.
I’m going to let this conclude our arc of discussion about box and whisker plots. Next week, we’ll look at a couple of other things.