A Data Science Central Community
This post was originally published on our Text Analysis blog. It set out to analyze and visualize 4 million Tweets collected during Superbowl XLIX. Not surprisingly, Superbowl XLIX generated a huge amount of chatter on social networks, with Twitter estimating that over 28.4 million posts were made with terms relating to the Superbowl.
At AYLIEN, we collected just under 4 million Tweets from hashtags, handles and keywords we were monitoring. To keep our sample clean, we removed any retweets and spam from the Tweets collected and worked only with those Tweets that were written in English. We were left with about 3.5 million Tweets to play with. Our idea was to collect a sample of Tweets, run them through our Text Analysis API and visualise the results in some interactive graphs.
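The cleaning step described above can be sketched roughly as follows. This is a minimal illustration and not the pipeline we actually ran: the `Tweet` fields (`text`, `lang`, `is_retweet`) are hypothetical, and dropping exact duplicate texts is only a crude stand-in for real spam filtering.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    text: str
    lang: str         # language code such as "en" (hypothetical field)
    is_retweet: bool  # hypothetical flag set at collection time


def clean_sample(tweets):
    """Keep original, English-language Tweets and drop exact duplicates."""
    seen_texts = set()
    kept = []
    for tweet in tweets:
        # Drop retweets, whether flagged or written in the old "RT @user" style.
        if tweet.is_retweet or tweet.text.startswith("RT @"):
            continue
        # Keep English only.
        if tweet.lang != "en":
            continue
        # Crude spam proxy (an assumption): the same text posted repeatedly.
        key = tweet.text.strip().lower()
        if key in seen_texts:
            continue
        seen_texts.add(key)
        kept.append(tweet)
    return kept
```

A filter like this is cheap enough to run over a few million records before any heavier analysis, which is why it makes sense to apply it first.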
First of all, we looked at where most of the Twitter activity was coming from. Not surprisingly, most of it came from the US. Europeans were also quite active and, by the looks of it, were happy to suffer from a lack of sleep at work on the Monday in order to stay up and experience the event live.
The second thing we did was analyse the volume of Tweets over time. We hoped to see how major events before, during and immediately after the game affected how vocal fans were. Lo and behold, it worked.
On top, we have the total volume of Tweets, and in the lower graph we have displayed the Tweets related to certain entities: teams, players, coaches and, of course, Katy Perry.
What’s interesting here is that you can see exactly when the activity kicked off for the pre-game coverage. You can imagine fans settling down for the game and voicing their opinions and predictions on Twitter via their phone or tablet. Throughout the game, there were 3 major spikes in activity: just before the kickoff, at halftime and at the turning point of the game, when the Patriots went ahead 28-24.
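For readers who want to try the volume-over-time idea themselves, here is a small sketch. It assumes each Tweet carries a `datetime` timestamp and uses one-minute buckets; the bucket size and the function names are illustrative choices, not a description of how our graphs were built.

```python
from collections import Counter
from datetime import datetime


def volume_per_minute(timestamps):
    """Count Tweets per one-minute bucket, returned in chronological order."""
    counts = Counter(
        ts.replace(second=0, microsecond=0) for ts in timestamps
    )
    return dict(sorted(counts.items()))


def busiest_minutes(volume, n=3):
    """Return the n buckets with the highest Tweet counts."""
    return sorted(volume, key=volume.get, reverse=True)[:n]
```

Calling `busiest_minutes(volume_per_minute(timestamps))` gives a rough list of candidate spikes, which can then be lined up against events in the game.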
By far the most mentions went to none other than…Katy Perry, who headlined the much-anticipated halftime show. What does this tell us about the Superbowl fans? They love pop music?
Halftime shows aside, there were some interesting constants and spikes in mentions as the game developed. References to the much-loved Tom Brady were pretty constant throughout the game. Pete Carroll, however, only really featured with a spike in activity towards the end of the game (I wonder why that was?). Not immediately after the game, but once the dust had settled and reality had sunk in, there was a huge spike of activity mentioning the Seahawks, perhaps from disgruntled fans expressing their frustration or maybe even loyal fans reiterating their support for a team that was narrowly defeated. Somehow, I doubt it was the latter.
So while the spikes in mentions are interesting, what is more insightful is the context of the Tweets: whether these posts were made in support of teams and individuals or the opposite.
One of the more interesting visualizations we produced was the sentiment intensity graph below:
This displays the polarity (positive or negative) of Tweets which mentioned either team. Perhaps the most interesting event on this graph is the extreme swing in polarity of Tweets mentioning the Seahawks. Right when the Patriots took the lead, the polarity of posts mentioning the Seahawks went from slightly negative to “very negative”. It’s also pretty clear from the drop in positivity that Seahawks fans reacted very negatively to handing over their lead to their opponents.
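A polarity-over-time series for one team can be approximated along these lines. This is a sketch under stated assumptions: each record is a `(timestamp, text, score)` tuple, where `score` is a polarity value between -1.0 (negative) and 1.0 (positive) that has already been produced by some sentiment classifier. That numeric scale, the substring match on the team name and the one-minute buckets are all illustrative and do not describe our API's output format.

```python
from collections import defaultdict
from datetime import datetime


def mean_polarity(records, entity):
    """Average polarity per minute for Tweets whose text mentions `entity`.

    `records` is an iterable of (timestamp, text, score) tuples; `score`
    is assumed to lie between -1.0 and 1.0.
    """
    totals = defaultdict(lambda: [0.0, 0])  # bucket -> [score sum, count]
    needle = entity.lower()
    for ts, text, score in records:
        # Simple case-insensitive substring match (an assumption).
        if needle not in text.lower():
            continue
        bucket = ts.replace(second=0, microsecond=0)
        totals[bucket][0] += score
        totals[bucket][1] += 1
    return {
        bucket: score_sum / count
        for bucket, (score_sum, count) in sorted(totals.items())
    }
```

A sharp drop between consecutive buckets in the returned series is the kind of swing described above for the Seahawks.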
We were also interested in how the polarity of Tweets mentioning certain entities developed throughout the game. We displayed some of the most interesting ones below by focusing on the teams, key players and Pete Carroll.
Tweets mentioning the Patriots had very little negativity associated with them, which was also evident in our first sentiment graph. From this, we can assume that either Patriots fans have a lot more faith in their team or the Patriots are generally liked a lot more by football fans with no tie to either team. Either way, they had a much larger following of supporters than their opponents. Another interesting aspect of this visualisation is how the “hero” Tom Brady stayed in the positive range throughout, while sentiment toward Marshawn Lynch, and Pete Carroll especially, plummeted after the game as fans voiced their opinions on Carroll’s Superbowl-losing decision. Opting not to run the ball with Marshawn Lynch and to go for the touchdown pass instead was a decision that cost him dearly.
These days, the Superbowl is as much about the ads and halftime show as it is about the football. Before the game, we decided to track a few of the bigger-name brands to try and get a feel for who won the ads battle.
Of the 4 brands we followed, Budweiser’s #lostdog campaign dominated, with more than 5 times as many mentions on Twitter as the other brands. We also tracked viewers’ reactions to the advertisements by again analyzing the sentiment of Tweets that referenced the brand.
Budweiser not only had the most mentions, it also had the strongest positive reaction to its ad, as shown below. However, the same can’t be said for T-Mobile’s ad with Kim Kardashian, which was very poorly received by Superbowl fans. But you know what they say: bad publicity is good publicity.
About AYLIEN: We are a Text Analysis company that has built a Text Analysis API, among other products, designed to help developers, data scientists, business people and academics extract meaning from text. You can try out our API here.