Those who follow this blog know that I like all sorts of data. Whether it be tracking the tweets of a company dealing with a public relations issue, or analyzing the social media posts of a public utility company during a blackout, I love finding stories in data.
For the past three years, I’ve been collecting social media data for the top 13 contestants on the television show American Idol. I wanted to know if social media data could predict who was going home each week.
I started by capturing both Facebook and Twitter data. It didn’t take too long to see a correlation between contestants’ Facebook numbers and those being sent home each week. Although the results weren’t perfect, these Facebook-derived predictions proved more accurate than simple chance. For example, in Season 11, when the television audience voted to cut the Top 24 to the Top 10, the data predicted 8 of the top 10. And over the course of the next 13 weeks, the data correctly predicted 18 of 25 bottom-three (bottom two in week 13) candidates, for an accuracy of 72%.
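Those percentages are simple hit rates: correct picks divided by picks made. Here is a minimal Python sketch of that arithmetic, using only the counts quoted above. The function name `hit_rate` is just for illustration and is not part of any tool used to collect the data.

```python
def hit_rate(correct, total):
    """Share of predictions that turned out right, as a fraction of 1."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


# Counts quoted in the text for Season 11.
top_10_cut = hit_rate(8, 10)     # Top 24 cut to Top 10
bottom_picks = hit_rate(18, 25)  # bottom-three picks over the 13 weeks

print(f"Top 10 cut: {top_10_cut:.0%}")      # Top 10 cut: 80%
print(f"Bottom picks: {bottom_picks:.0%}")  # Bottom picks: 72%
```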
By the end of the first season, I had come up with a theory that the correlation to America’s voting had less to do with the total number of Facebook fans and more to do with the number of fans gained the night immediately preceding an elimination show. This method worked in all three seasons with one major caveat: its accuracy dropped as the number of contestants shrank. The larger the pool, the easier it was to choose both the bottom three and who was likely to go home. However, as the talent pool got smaller, the method faltered, as shown by the fact that it chose the wrong winners in two of the three seasons.
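To make the idea concrete, here is a rough Python sketch of how such a ranking could be computed. It assumes two fan-count snapshots per contestant (one taken after the performance show, one just before the elimination show) and assumes that the smallest overnight gain marks the contestant most at risk. The contestant labels and numbers are invented for illustration and are not from the collected data.

```python
def overnight_gains(before, after):
    """Fans gained between the two snapshots, per contestant."""
    return {name: after[name] - before[name] for name in before}


def predict_bottom(before, after, n=3):
    """Return the n contestants with the smallest overnight fan gain."""
    gains = overnight_gains(before, after)
    # Ascending sort: the smallest gain comes first (assumed most at risk).
    ranked = sorted(gains, key=lambda name: gains[name])
    return ranked[:n]


# Hypothetical snapshots for five contestants.
before = {"A": 10_000, "B": 8_000, "C": 12_000, "D": 5_000, "E": 7_500}
after = {"A": 10_900, "B": 8_150, "C": 12_400, "D": 5_050, "E": 8_700}

print(predict_bottom(before, after))  # ['D', 'B', 'C']
```

Note that contestant C has the most total fans in this made-up example but still lands in the predicted bottom three, which is the distinction between total fans and overnight gain.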
Now, with three individual seasons of data under my belt, I thought it would be fun (yeah, I know) to compare the Facebook data from the top three performers from each season, side by side. I created the interactive bubble chart above for you to play with.
Note: Because American Idol ran for different lengths of time each season, I needed to find a way to align the data between seasons. Since all three data sets had the “last 11 weeks of data,” that’s the information that I used. Once the chart was built, however, I could see a flaw that needs to be explained. In its default configuration, Facebook Fans are on the x-axis, “Talking about” on the y-axis, and a number (1901) on the slider. Unfortunately, the Google Motion Chart wants that bottom number in years, yet my data is actually in weeks (1 through 11). So, when you animate the graph, note that 1901 = week 1 for all three seasons, 1902 = week 2 for all three seasons, and so on.
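If you want to translate between the slider value and the real week, the mapping is just an offset of 1900. A small Python sketch follows. The helper names are hypothetical, and the 1 through 11 range check reflects the 11 aligned weeks described above.

```python
BASE_YEAR = 1900  # offset implied by the chart: 1901 = week 1
WEEKS = range(1, 12)  # the 11 aligned weeks, 1 through 11


def week_to_chart_year(week):
    """Convert a week number (1-11) to the 'year' shown on the slider."""
    if week not in WEEKS:
        raise ValueError("week must be between 1 and 11")
    return BASE_YEAR + week


def chart_year_to_week(year):
    """Convert a slider 'year' (1901-1911) back to its week number."""
    week = year - BASE_YEAR
    if week not in WEEKS:
        raise ValueError("year must be between 1901 and 1911")
    return week


for week in (1, 2, 11):
    print(f"week {week} -> {week_to_chart_year(week)}")
```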
One of the things I did expect to see was a year-over-year reduction in the number of Facebook fans, since supposedly the show’s television ratings are falling. The past three years of Nielsen ratings verify that story:
2012 (Season 11) Finale: 20.7 million
2013 (Season 12) Finale: 13.3 million
2014 (Season 13) Finale: 10.4 million
So, when we compare the top three from each season, it’s no surprise that the total number of their Facebook fans has dropped:
2012 (Season 11) Sum of the Top 3 FB Fans: 574,531
2013 (Season 12) Sum of the Top 3 FB Fans: 153,586
2014 (Season 13) Sum of the Top 3 FB Fans: 114,567
Not only is the number of FB fans per contestant dropping, but the season-over-season differences are significant. For example, Season 11’s second-place contestant, Jessica Sanchez, accumulated almost as many Facebook fans in her first week (43,799) as Candice Glover, Season 12’s winner, gained over her entire season (45,219). And except for the socially popular Angie Miller from Season 12, Season 11’s third-place contestant, Joshua Ledet, had more fans than all four finalists from Seasons 12 and 13 (Kree Harrison, Candice Glover, Jena Irene, and Caleb Johnson).
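To put the season-over-season declines in percentage terms, here is a short Python sketch that uses only the finale ratings and top-three fan totals listed above. The function and variable names are mine, chosen for illustration.

```python
def pct_change(old, new):
    """Percentage change from old to new (negative means a decline)."""
    return (new - old) / old * 100


# Figures as listed in the post, keyed by season number.
top3_fans = {11: 574_531, 12: 153_586, 13: 114_567}
finale_viewers_millions = {11: 20.7, 12: 13.3, 13: 10.4}

for label, series in (
    ("Top 3 FB fans", top3_fans),
    ("Finale viewers", finale_viewers_millions),
):
    seasons = sorted(series)
    # Pair each season with the one that follows it.
    for prev, curr in zip(seasons, seasons[1:]):
        change = pct_change(series[prev], series[curr])
        print(f"{label}, Season {prev} -> {curr}: {change:.1f}%")
```

On these figures, the top-three fan totals fell by roughly 73% and then 25%, while finale viewership fell by roughly 36% and then 22%.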
But, as often happens, the data did reveal something unexpected. Although the total number of Facebook fans fell from season to season, the overall engagement of those audiences rose. Season 13’s final three (Alex Preston, Jena Irene, and Caleb Johnson) led all nine contestants in “talk-to-fan” ratios (Facebook’s “Talking About” metric divided by the number of fans). The data suggests that today’s American Idol fans, although fewer in number, are talking about their favorite contestants more.
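The “talk-to-fan” ratio is easy to compute once you have both numbers for a page. Here is a minimal Python sketch. The two pages and their counts are hypothetical and only illustrate how a smaller fan base can still show higher engagement.

```python
def talk_to_fan_ratio(talking_about, fans):
    """Facebook 'Talking About' count divided by the number of fans."""
    if fans <= 0:
        raise ValueError("fans must be positive")
    return talking_about / fans


# Hypothetical pages, not taken from the collected data.
pages = {
    "Big page": {"fans": 200_000, "talking_about": 10_000},
    "Small page": {"fans": 40_000, "talking_about": 8_000},
}

ratios = {
    name: talk_to_fan_ratio(p["talking_about"], p["fans"])
    for name, p in pages.items()
}

# Highest engagement first.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {ratio:.1%}")
```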
Dwindling television viewership may not be the only reason why Facebook numbers are down. The social media landscape is different. For example, Instagram may have drawn some audience from Facebook since it came onto the scene in 2010. The chart above shows the number of Instagram followers for all nine contestants after Season 13 ended. Both Alex (Season 13’s #3) and Caleb (Season 13’s winner) had more Instagram followers than they did Facebook followers.
So, what did I conclude from this three-year experiment?
- The number of Facebook fans gained the night between the performance and elimination night is a good indicator of who is going home for the first five weeks. But as the pool of contestants dwindles, so does the accuracy of the method. The method is ineffective in choosing the winner on Finale night.
- Although the number of Facebook fans is down year-to-year, today’s fans are talking about their favorites more than yesterday’s.
Someone who’d been following my experiment asked if I was disappointed that social data could not help predict the outcome of the contest. Absolutely not! In an age of big data, it’s comforting to know that humans and free will still rule.
Probably the most important knowledge that I gained during these past three seasons was the ability to use motion charts. If you want to play with the motion charts for each season, you can view them here: