Hashtag Myths, Real Time Meme Detection

Building a great list of Twitter followers, and judiciously using #hashtags seem to be the holy grail to grow traffic and reap rewards from social media. A good reference on the subject is the Mashable article How to Get the Most Out of Twitter #Hashtags.

The reality is very different. First, when did you last read a tweet? For most of us, probably long ago. Maybe never. Also, hashtags tend to discourage people from clicking on your link. In practice, even with 6,000 real followers, a tweet will generate 0 clicks at worst and 20 clicks at best. This is true if you count only clicks coming straight from your tweet, ignoring indirect traffic and the multiplier effect described below.

By the way, here's an example of a tweet with two hashtags, in case you've never seen one before:

naXys to Focus on the Study of Complex Systems:

Is it really all that bad?

No, it actually works. First, if your tweet gets re-tweeted, it is not difficult to generate tons of additional clicks that can be attributed to Twitter (and even more from other sources), even for subjects as specialized as data science. The Twitter multiplier can be as high as 3.

For instance, a great article is published in our weekly digest and gets 800 clicks that are directly attributed to the eBlast sent to more than 60,000. After 12 months, the article has been viewed 12,000 times. When you analyze the data, you find that 2,400 of the total visits can be attributed to Twitter (people re-posting your link and sharing it via Twitter).
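The arithmetic behind this example can be sketched in a few lines. This is one reading of the numbers, assuming the multiplier is the ratio of Twitter-attributed visits to direct clicks; the function name `twitter_multiplier` is introduced here for illustration only.

```python
def twitter_multiplier(direct_clicks: int, twitter_visits: int) -> float:
    """Ratio of Twitter-attributed visits to direct clicks (assumed definition)."""
    if direct_clicks <= 0:
        raise ValueError("direct_clicks must be positive")
    return twitter_visits / direct_clicks


direct_clicks = 800      # clicks attributed to the eBlast
total_views = 12_000     # views after 12 months
twitter_visits = 2_400   # visits attributed to Twitter

multiplier = twitter_multiplier(direct_clicks, twitter_visits)
twitter_share = twitter_visits / total_views  # share of all views coming from Twitter

print(f"multiplier: {multiplier:.1f}, Twitter share of views: {twitter_share:.0%}")
```

Under that reading, the 2,400 Twitter visits are three times the 800 direct clicks, and they account for one fifth of the 12,000 views.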

There are also some other positive side effects. Some publishers run web crawlers to discover, fetch and categorize information, and #hashtags are one of the tools they use to detect and classify pieces of news. We are one of them, both as a crawler and as a crawlee. The same publishers also publish lists of top data scientists: their lists are based on who is popular on specific hashtags, whether the tweets in question are generated automatically (syndicated content) or by a human being.

And while a tweet full of special characters is a deterrent for the average Internet user, it is much less so for geeks (if you are reading this article, chances are that you are, at least marginally, a geek). However, you must still comply with the following rules:

  • use no more than 2 hashtags per tweet,
  • use only relevant hashtags,
  • leave at least 50% of your tweets without any hashtag.
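Two of these rules are mechanical enough to check automatically; relevance is not. The sketch below is a hypothetical helper (`check_feed` and its parameters are not from any Twitter tool) that takes a list of tweet texts and reports which ones exceed the hashtag limit and what share carry no hashtag at all.

```python
import re

HASHTAG_RE = re.compile(r"#\w+")


def check_feed(tweets, max_tags=2, min_untagged_share=0.5):
    """Check a list of tweet texts against the two countable rules."""
    counts = [len(HASHTAG_RE.findall(text)) for text in tweets]
    # Tweets carrying more hashtags than allowed.
    over_limit = [text for text, n in zip(tweets, counts) if n > max_tags]
    # Share of tweets with no hashtag; an empty feed trivially complies.
    untagged_share = sum(n == 0 for n in counts) / len(counts) if counts else 1.0
    return {
        "over_limit": over_limit,
        "untagged_share": untagged_share,
        "ok": not over_limit and untagged_share >= min_untagged_share,
    }


feed = [
    "New post on outlier detection #datascience #bigdata #hadoop",  # too many tags
    "Our weekly digest is out",
    "Reading list for the weekend",
    "Join the discussion #abdsc",
]
print(check_feed(feed))
```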

Otherwise, people will stop reading what you post, and you are then faced with the challenge of finding new followers faster than you are losing old ones.

Hashtags are particularly suited to commenting on a webinar as it goes live: INFORMS could use the #INFORMS2013 hashtag for people commenting on their yearly conference. Hashtags are also helpful for promoting or discussing an event. For instance, we created #abdsc to reward bloggers tweeting about us.

Still, it is not so easy to be listed under Top rather than All People on Twitter when posting a tweet. And if you are not in the Top list, your hashtag has no value. In addition, you can be a victim of hashtag hijacking: spammers abusing a popular hashtag to promote their own stuff.

Clicks multiplier achieved through social networks

The above table, based on recent posts tracked via, shows that a blog post can easily see its traffic statistics triple (rightmost column), thanks to other people who like what you wrote and re-tweet it or re-post it on their blog. Note that the fewer the internal clicks, the bigger the multiplier: when we don't do much to promote one of our posts, other bloggers promote it on our behalf. The most extreme case here is a post with one internal click and a multiplier of 53.

Will hashtags work for you? How to know?

An easy way to check whether hashtags provide a lift is to perform some A/B testing. You can use the following protocol:

  • Over a 14-day time period, tweet two links each day: one at noon, one at 6 pm.
  • On the first day, add the hashtag #datascience to your noon tweet. No hashtag at 6 pm.
  • On the second day, do it the other way around: no hashtag at noon, hashtag #datascience at 6 pm.
  • Keep alternating between noon and 6 pm to determine when to add the hashtag.
  • Attach a specific query string to each link, for instance, ?date=0528&hashtag=yes&time=6pm to the URL you are tweeting (before shortening the URL). This is for tracking purposes.
  • Use to shorten your URLs
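The schedule above can be generated rather than maintained by hand. The following is a minimal sketch, assuming a placeholder base URL (`https://example.com/my-post`) and a made-up start date; `build_schedule` is a hypothetical helper, and in practice you would pass a different link for each tweet. It only builds the tagged long URLs; shortening them is left to whichever service you use.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

SLOTS = ("noon", "6pm")


def build_schedule(start: date, base_url: str, days: int = 14):
    """Return one dict per planned tweet, alternating the hashtag slot daily."""
    schedule = []
    for day in range(days):
        current = start + timedelta(days=day)
        # Day 0, 2, 4, ...: hashtag at noon; day 1, 3, 5, ...: hashtag at 6 pm.
        hashtag_slot = SLOTS[day % 2]
        for slot in SLOTS:
            use_hashtag = slot == hashtag_slot
            query = urlencode({
                "date": current.strftime("%m%d"),
                "hashtag": "yes" if use_hashtag else "no",
                "time": slot,
            })
            schedule.append({
                "date": current,
                "time": slot,
                "hashtag": use_hashtag,
                "url": f"{base_url}?{query}",
            })
    return schedule


# Start date and URL are placeholders chosen for illustration.
plan = build_schedule(date(2013, 5, 27), "https://example.com/my-post")
for tweet in plan[:4]:
    print(tweet["url"])
```

Because the hashtag alternates between the two time slots, each slot carries the hashtag on exactly half of the days, which keeps time of day from being confounded with the hashtag itself.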

For example, your tweet (hashtag version) could be something like

Testing Twitter hashtags performance #datascience #hashtagABtest

Note that when you click on, the query string shows up in the reconstructed URL in your browser (after the redirect). What's more, you can monitor how many clicks you get by looking at (same URL, with + added at the end).

Once the test period is over, you can perform a statistical analysis to check if

  1. hashtags boost the number of direct clicks
  2. hashtags boost the multiplier effect discussed earlier

You'll be able to detect whether variations in clicks are explained by time of day, weekend versus weekday, or some other factor such as the link itself, and whether hashtags contribute positively, and to quantify this contribution. I invite all of you to run this A/B test (any way you want; you don't need to follow my protocol) and tweet your results using the #hashtagABtest hashtag.
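One simple way to test the first hypothesis is a paired comparison: each day yields one tweet with the hashtag and one without, so the daily difference in clicks can be tested directly. The sketch below uses an exact sign-flip (permutation) test, which is one possible choice and not necessarily the analysis intended here; it ignores time-of-day and weekday effects beyond what the alternating design already balances. The click counts are placeholders made up for illustration, not measured results.

```python
from itertools import product
from statistics import mean


def paired_sign_flip_test(with_tag, without_tag):
    """Exact one-sided sign-flip test on daily paired click counts.

    Returns (observed mean difference, p-value). The p-value is the share of
    sign assignments whose total difference is at least the observed one.
    """
    if len(with_tag) != len(without_tag):
        raise ValueError("need one pair of counts per day")
    diffs = [a - b for a, b in zip(with_tag, without_tag)]
    observed_total = sum(diffs)
    extreme = 0
    total = 0
    # Under the null hypothesis, each daily difference is equally likely
    # to have either sign, so enumerate all 2**n sign assignments.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if sum(s * d for s, d in zip(signs, diffs)) >= observed_total:
            extreme += 1
    return mean(diffs), extreme / total


# Placeholder click counts for 14 days (made up, for illustration only).
clicks_with_hashtag = [12, 9, 15, 7, 11, 14, 8, 10, 13, 9, 16, 6, 12, 11]
clicks_without_hashtag = [10, 11, 12, 8, 9, 13, 9, 7, 12, 10, 13, 8, 10, 9]

diff, p_value = paired_sign_flip_test(clicks_with_hashtag, clicks_without_hashtag)
print(f"mean daily difference: {diff:.2f}, one-sided p-value: {p_value:.4f}")
```

With 14 days there are only 16,384 sign assignments, so the test can be computed exactly instead of by random sampling. The same function could be applied to the second hypothesis by feeding it indirect (multiplier) clicks instead of direct ones.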

Real time detection of viral tweets

Micro-memes are emergent topics for which a hashtag is created, used widely for a few days, then disappears. How can they be detected, and leveraged? Journalists and publishers like us are particularly interested in this. 

TrendSpottr and Top-Hashtags are websites offering this type of service. But they rely on sample data, a major drawback when trying to discover new, rare events that suddenly spike. Detection is especially difficult since these memes use made-up keywords that never existed before, such as #abdsc or #hashtagABtest.

Also, Top-Hashtags worked with Instagram, but not with Twitter, last time I checked. Note that Facebook does not allow hashtags. Google+ does. For data science, the most interesting large social networks are

  • LinkedIn (more professional),
  • Google+ (our Google+ Big Data community is growing much faster than our much bigger Facebook group),
  • Quora (very geeky) and
  • Twitter.

So being able to identify in real time new trends that are emerging on these networks is important for all of us.

One solution consists of following the most influential people in your domain, and getting alerts each time one of their new hashtags gets 10+ tweets. If a new hashtag from an unknown user starts creating a lot of buzz, these influencers (at least some of them) will quickly notice, and so will you, since you follow the influencers. Just pick the right list of keywords and hashtags to identify top influencers: data science, hadoop, big data, business analytics, visualization, predictive modeling, etc.
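The alerting rule can be sketched as a simple counter over the tweets posted by the accounts you follow. How those tweets are collected is outside the scope of this sketch: the function below just takes a list of tweet texts, and both `new_hashtag_alerts` and the sample data are hypothetical.

```python
import re
from collections import Counter

HASHTAG_RE = re.compile(r"#(\w+)")


def new_hashtag_alerts(tweets, known_hashtags, threshold=10):
    """Return previously unseen hashtags found in at least `threshold` tweets."""
    known = {tag.lower() for tag in known_hashtags}
    counts = Counter()
    for text in tweets:
        # Count each hashtag once per tweet, ignoring case.
        tags = {tag.lower() for tag in HASHTAG_RE.findall(text)}
        counts.update(tags - known)
    return [(tag, n) for tag, n in counts.most_common() if n >= threshold]


# Sample tweets made up for illustration.
sample_tweets = (
    ["Results of my test #hashtagABtest"] * 10
    + ["Hadoop tutorial #datascience"] * 12
    + ["Blogging about analytics #abdsc"] * 3
)
print(new_hashtag_alerts(sample_tweets, known_hashtags={"datascience"}))
```

Hashtags already on your watch list are excluded, so only emerging ones trigger an alert; the threshold of 10 matches the "10+ tweets" rule of thumb above.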

Some lists of top influencers exist (we will publish one soon). We created a list of top tweets a while back, but we need to revise it (click here and check the box just below Analytic News).


On Data Science Central

© 2021   TechTarget, Inc.   Powered by
