Sunday, December 18, 2016

My year in review

I decided to document 2016 by how much TV I watched.

(Click on the image to enlarge.  Another version available here: )

It all started with a meme on Facebook: NAME YOUR VAGINA using the name of the last movie you watched!  

This led me to scroll through my Netflix viewing activity, a catalog of every episode of every show and movie I've ever watched, listed chronologically by day from the day I created my account. I realized I could remember specific times in my life based on what I was watching: shows I binged alone, shows I watched with a housemate or significant other.

(My vagina is named Moonlight, by the way. I saw that in the theater long after I streamed The Imitation Game.)

My Netflix history made me wonder.  Could I visualize the data, perhaps plot my Netflix viewing over time? Could I see how intensely I binged certain shows? Could I tell when I was single and spending more time at home? Was it obvious when summer came along and I was out of the house more, or when winter came along and the seasonal depression kicked in?

I love data visualization. I have a lot of experience with manipulating data sets in Microsoft Excel, which may not be the most elegant way to do it, but that's where I started. I copied my Netflix viewing history and pasted it into a new Excel workbook (Paste Special, Text).

Initially, I intended to plot how many shows I watched on any given day, so I used the COUNTIF function to count how many individual entries (episodes or movies) I watched on each day of 2016.
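For anyone who would rather script this step than use Excel, here is a minimal Python sketch of the same per-day count. The sample rows and the function name are made up for illustration; the real data was a pasted Netflix history in a spreadsheet.

```python
from collections import Counter
from datetime import date

# Hypothetical sample of (date watched, title) rows, standing in for
# the pasted Netflix viewing history.
history = [
    (date(2016, 1, 2), "Show A: Episode 1"),
    (date(2016, 1, 2), "Show A: Episode 2"),
    (date(2016, 1, 3), "Some Movie"),
]

def entries_per_day(rows):
    """Count entries (episodes or movies) per date, like COUNTIF on the date column."""
    return Counter(day for day, _title in rows)

counts = entries_per_day(history)
print(counts[date(2016, 1, 2)])  # entries watched on January 2
```

A Counter returns 0 for dates with no entries, so days without any viewing still plot as zero.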

Here is the data using a COLUMN plot.

I prefer the AREA plot since it looks prettier and shows sharp changes and clusters of data well.

Realizing it would be more interesting to break the data up by individual show, I visually determined the top 17 shows (ones I obviously binged) and separated the data further using the COUNTIFS function to count how many episodes of each show I watched on any given day.

Here's part of that datasheet.  I colorized it using Conditional Formatting - Color Scales to make the binged shows "pop".

Anything that wasn't a binged TV show (including comedy specials, movies, shows I only watched a few episodes of) fell in the "other" category.
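The COUNTIFS breakdown, including the "other" bucket, could be sketched in Python roughly like this. The show names and the two-item TOP_SHOWS set are placeholders, not the actual 17 shows.

```python
from collections import Counter
from datetime import date

# Placeholder names standing in for the 17 binged shows.
TOP_SHOWS = {"Show A", "Show B"}

# Hypothetical (date watched, show name) rows.
history = [
    (date(2016, 2, 13), "Show A"),
    (date(2016, 2, 13), "Show A"),
    (date(2016, 2, 13), "A Comedy Special"),
    (date(2016, 2, 14), "Show B"),
]

def episodes_per_show_per_day(rows, top_shows):
    """Count entries per (date, show), lumping everything outside top_shows into "Other"."""
    counts = Counter()
    for day, show in rows:
        label = show if show in top_shows else "Other"
        counts[(day, label)] += 1
    return counts

per_show = episodes_per_show_per_day(history, TOP_SHOWS)
print(per_show[(date(2016, 2, 13), "Show A")])
print(per_show[(date(2016, 2, 13), "Other")])
```

Each (date, show) pair corresponds to one cell in the datasheet: one row per day, one column per show.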

When I looked at this initial visualization though, I was bothered that a few months were practically blank, as if I wasn't watching anything. Did I magically get a life? What was I doing in October and November?

It turns out I was binging on my Hulu account.

Hulu account history keeps track of past viewing chronologically as well, but anything older than 4 weeks is labeled by the month only. I binged a season of RuPaul's Drag Race "3 months ago" but I don't know what specific days, or how many episodes I watched in one sitting.

To include the Hulu data in my graph, I couldn't plot it by number of shows per day as I could with the Netflix data, so I averaged the number of episodes I watched over the course of each month. For example, I watched fifty episodes of Project Runway one month ago, which is approximately 1.67 episodes a day for that month. This is why Hulu shows appear as wide columns and not "spikes" like the Netflix data. I feel it's a fair representation given the limited information in the data set. (If anyone knows how to get more specific Hulu history data, please let me know!)

Sidenote: I did some Hulu data manipulation to account for the few times I knew I was out of town and away from any TV entirely, so I excluded those days from any monthly Hulu average.
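The Hulu averaging, with the out-of-town adjustment, comes down to simple arithmetic. Here is a small sketch of it; the function name and the days_away parameter are made up for illustration, and the 30-day month is an assumption implied by the 1.67 figure above.

```python
def daily_average(episodes, days_in_month, days_away=0):
    """Spread a month's episode count evenly over the days spent at home."""
    days_home = days_in_month - days_away
    if days_home <= 0:
        raise ValueError("no days at home to spread the episodes over")
    return episodes / days_home

# Fifty episodes over an assumed 30-day month.
print(round(daily_average(50, 30), 2))  # 1.67

# Hypothetical case: the same month with 5 days out of town.
print(daily_average(50, 30, days_away=5))  # 2.0
```

Excluding travel days raises the average on the remaining days, since the same number of episodes is spread over fewer days.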

After verifying that I indeed watched a lot more TV on the weekends, I decided to distinguish Saturdays and Sundays on the plot.  I created a column plot just displaying weekends to be used as an overlay. They're shown as the vertical stripes in the background of the main plot.
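Marking the weekends for an overlay like this can also be scripted. A minimal sketch using Python's standard datetime module follows; the function names are made up for illustration.

```python
from datetime import date, timedelta

def is_weekend(day):
    """date.weekday() returns 5 for Saturday and 6 for Sunday."""
    return day.weekday() >= 5

def weekend_days(year):
    """List every Saturday and Sunday in the given year."""
    day = date(year, 1, 1)
    result = []
    while day.year == year:
        if is_weekend(day):
            result.append(day)
        day += timedelta(days=1)
    return result

weekends_2016 = weekend_days(2016)
print(is_weekend(date(2016, 12, 18)))  # True: the Sunday this post is dated
```

Plotting a constant-height column on each of these dates gives the background stripes.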

My TV habits during the week in a simple column plot.

Out of curiosity I also plotted up my monthly totals:

December is on track to surpass every previous month. I only have data for the first half of the month, but I've already watched more TV than in all but one of the previous months. I blame the Portland snowpocalypse.
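The monthly totals are the same kind of count as the daily one, grouped by month instead of by day. A rough Python sketch, with made-up sample rows:

```python
from collections import Counter
from datetime import date

# Hypothetical (date watched, title) rows.
history = [
    (date(2016, 11, 5), "Show A"),
    (date(2016, 12, 3), "Show B"),
    (date(2016, 12, 10), "Show B"),
    (date(2016, 12, 11), "Some Movie"),
]

def entries_per_month(rows):
    """Count entries per (year, month) pair."""
    return Counter((day.year, day.month) for day, _title in rows)

monthly = entries_per_month(history)
print(monthly[(2016, 12)])  # entries watched in December 2016
```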

I decided to add a little blip for every Seattle Seahawks game because I watched nearly every game this season on TV, so that's the fourth dataset.  So, the plot at the top of the article is actually four overlaid Excel plots: Netflix data, Hulu data, Seahawks games, and weekend demarcations.

Once I finished creating the data plots, I copied and pasted them into Microsoft PowerPoint and carefully arranged them by dragging the corners to align the axes. I then layered them using simple object arrangement options like Send to Back.  I kept the axis labels and axis titles visible in only one of the plots.

Finally, I added some comments to the plot as text boxes, since certain life events seemed to explain increases or decreases in how much I watched certain shows. For example, my parents came to visit in late September and my dad started binging Narcos. Also, and I totally forgot about this, around Valentine's Day I apparently binged a whole season of a show about an animal rescue dog shelter called "Animal House." Yes, I happened to be single at that time.

Not shown in the chart is anything I watched on cable TV. Shows that could have some significance:
  • CBS This Morning, which I watch most mornings while I get ready for work
  • The Voice, sometimes 
  • Jeopardy, sometimes 
  • Viceland. The whole channel. Seriously any show. I was watching obsessively once I discovered it in July or Aug. Comcast seemed to drop the channel from my basic cable lineup sometime in Nov. I miss it a lot.