Analyzing the Dynamic Evolution of Hashtags on Twitter: a Language-Based Approach

This article examines aspects of the dissemination of hashtags on Twitter, aiming to understand how innovative hashtags propagate in light of linguistic theories.


Hashtags are used in Twitter to classify messages, propagate ideas and also to promote specific topics and people. In this paper, we present a linguistic-inspired study of how these tags are created, used and disseminated by the members of information networks. We study the propagation of hashtags in Twitter grounded on models for the analysis of the spread of linguistic innovations in speech communities, that is, in groups of people whose members linguistically influence each other. Differently from traditional linguistic studies, though, we consider the evolution of terms in a live and rapidly evolving stream of content, which can be analyzed in its entirety. In our experimental results, using a large collection crawled from Twitter, we were able to identify some interesting aspects – similar to those found in studies of (offline) speech – that led us to believe that hashtags may effectively serve as models for characterizing the propagation of linguistic forms, including: (1) the existence of a “preferential attachment process”, that makes the few most common terms ever more popular, and (2) the relationship between the length of a tag and its frequency of use. The understanding of formation patterns of successful hashtags in Twitter can be useful to increase the effectiveness of real-time streaming search algorithms.

By Cunha et al.

(Source: aclweb.org)
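
To make the "preferential attachment process" mentioned in the abstract concrete, here is a minimal sketch of a rich-get-richer simulation. It is not the authors' method: the function name, the `p_new` probability of coining a new tag, and the synthetic `#tagN` names are all assumptions made for illustration.

```python
import random
from collections import Counter


def simulate_tag_usage(n_tweets, p_new=0.05, seed=0):
    """Toy rich-get-richer model of hashtag use (illustrative assumption).

    Each simulated tweet either coins a new tag with probability p_new or
    reuses an existing tag, picked with probability proportional to how
    often that tag has already been used.
    """
    rng = random.Random(seed)
    history = []  # one entry per tag use, so popular tags appear more often
    n_tags = 0
    for _ in range(n_tweets):
        if not history or rng.random() < p_new:
            tag = f"#tag{n_tags}"  # synthetic tag name
            n_tags += 1
        else:
            # Sampling uniformly from past uses is sampling tags
            # in proportion to their current popularity.
            tag = rng.choice(history)
        history.append(tag)
    return Counter(history)


counts = simulate_tag_usage(10_000)
for tag, uses in counts.most_common(5):
    print(tag, uses)
```

In a run like this, a handful of early tags typically collect most of the uses while the majority of tags are used only a few times, which is the qualitative pattern the abstract describes.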

Differences in the Mechanics of Information Diffusion Across Topics: Idioms, Political Hashtags, and Complex Contagion on Twitter

This article analyzes sources of variation in how the most widely used hashtags on Twitter spread within its user population.


There is a widespread intuitive sense that different kinds of information spread differently on-line, but it has been difficult to evaluate this question quantitatively since it requires a setting where many different kinds of information spread in a shared environment. Here we study this issue on Twitter, analyzing the ways in which tokens known as hashtags spread on a network defined by the interactions among Twitter users. We find significant variation in the ways that widely-used hashtags on different topics spread.

Our results show that this variation is not attributable simply to differences in “stickiness,” the probability of adoption based on one or more exposures, but also to a quantity that could be viewed as a kind of “persistence” - the relative extent to which repeated exposures to a hashtag continue to have significant marginal effects. We find that hashtags on politically controversial topics are particularly persistent, with repeated exposures continuing to have unusually large marginal effects on adoption; this provides, to our knowledge, the first large-scale validation of the “complex contagion” principle from sociology, which posits that repeated exposures to an idea are particularly crucial when the idea is in some way controversial or contentious. Among other findings, we discover that hashtags representing the natural analogues of Twitter idioms and neologisms are particularly non-persistent, with the effect of multiple exposures decaying rapidly relative to the first exposure. 

We also study the subgraph structure of the initial adopters for different widely-adopted hashtags, again finding structural differences across topics. We develop simulation-based and generative models to analyze how the adoption dynamics interact with the network structure of the early adopters on which a hashtag spreads.

By Daniel M. Romero, Brendan Meeder, and Jon Kleinberg

(Source: cs.cornell.edu)
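
The "stickiness" and "persistence" quantities in the abstract can be illustrated with a small sketch built on an exposure curve p(k), the fraction of users who adopt a hashtag at their k-th exposure. The formulas below are one plausible reading (stickiness as the peak of the curve, persistence as the area under the curve relative to its bounding rectangle) and are labeled as assumptions; the function names and the toy curves are hypothetical.

```python
def exposure_curve(exposed, adopted):
    """p(k): adopters at the k-th exposure divided by users exposed k times."""
    return [a / e if e else 0.0 for e, a in zip(exposed, adopted)]


def stickiness(p):
    """Assumed measure: the highest adoption probability on the curve."""
    return max(p)


def persistence(p):
    """Assumed measure: area under p(k) relative to its bounding rectangle.

    Values near 1 mean repeated exposures keep mattering; values near 0
    mean the effect decays quickly after the peak.
    """
    if len(p) < 2 or max(p) == 0:
        return 0.0
    area = sum((p[i] + p[i + 1]) / 2 for i in range(len(p) - 1))  # trapezoids
    return area / (max(p) * (len(p) - 1))


# Toy curves, not real data: one flat, one decaying after the first exposure.
flat_curve = [0.02, 0.02, 0.02]
decaying_curve = [0.04, 0.01, 0.0]
print(stickiness(flat_curve), persistence(flat_curve))
print(stickiness(decaying_curve), persistence(decaying_curve))
```

Under these assumed definitions, the decaying curve is stickier (higher peak) but less persistent than the flat one, which shows why the two quantities have to be separated.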

What’s in a Hashtag? Content based Prediction of the Spread of Ideas in Microblogging Communities

Oren Tsur and Ari Rappoport demonstrate that the content of an idea (hashtag) plays an important role in its acceptance by the community.


Current social media research mainly focuses on temporal trends of the information flow and on the topology of the social graph that facilitates the propagation of information. In this paper we study the effect of the content of the idea on the information propagation. We present an efficient hybrid approach based on a linear regression for predicting the spread of an idea in a given time frame. We show that a combination of content features with temporal and topological features minimizes prediction error.

Our algorithm is evaluated on Twitter hashtags extracted from a dataset of more than 400 million tweets. We analyze the contribution and the limitations of the various feature types to the spread of information, demonstrating that content aspects can be used as strong predictors thus should not be disregarded. We also study the dependencies between global features such as graph topology and content features.

(Source: cs.huji.ac.il)
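
As a rough illustration of what "content features" of a hashtag might look like before they are fed into a regression alongside temporal and topological features, the sketch below extracts a few simple properties from the tag text. The specific features and the function name are assumptions for illustration, not the feature set used in the paper.

```python
import re


def content_features(tag):
    """Extract simple, hypothetical content features from a hashtag string."""
    body = tag.lstrip("#")
    # Split camel case, acronyms and digit runs, e.g. "NBAFinals" -> NBA, Finals
    words = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", body)
    return {
        "length": len(body),                            # characters, without '#'
        "n_words": len(words),                          # rough word count
        "has_digits": any(ch.isdigit() for ch in body),
        "all_caps": body.isalpha() and body.isupper(),  # acronym-like tags
    }


for example in ("#FollowFriday", "#ww2", "#NBAFinals"):
    print(example, content_features(example))
```

Each feature dictionary could form one row of a design matrix, with the observed number of uses in a given time frame as the value to predict.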

Visualizing Twitter Activity of April 11, 2012 for Keyword ‘tsunami’

On April 11, 2012 a powerful earthquake of M8.7 was detected off the west coast of northern Sumatra, Indonesia. A tsunami watch was issued across the Indian Ocean region. Soon after, news of the earthquake and tsunami watch started spreading across Twitter.

These visualizations show how news of the tsunami spread across Twitter.
We started monitoring for the keyword ‘tsunami’ around 14:49 Malé time.

Twitter users are represented by points (nodes) and relations by lines (edges).

@infoBMKG was the source with the most retweets during our monitoring period.

Users retweeting from the same source are identified with the same line colors.

A large number of users retweeted from Twitter user @BBCBreaking.

Four hours later, the tsunami watch was called off in Indian Ocean countries.
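
The graph described in this section (users as nodes, retweet relations as edges, one line color per retweeted source) can be sketched in a few lines. This is a minimal illustration and not the tool used to produce the visualizations; the function name, the color palette, and the placeholder user names are assumptions.

```python
from collections import Counter
from itertools import cycle


def build_retweet_graph(retweets, palette=("red", "blue", "green", "orange")):
    """Build nodes, colored edges and a source ranking from retweet pairs.

    retweets: list of (retweeting_user, source_user) pairs.
    Edges that point to the same source share a color.
    """
    nodes = set()
    edges = []
    colors = {}
    next_color = cycle(palette)  # palette repeats if there are many sources
    for retweeter, source in retweets:
        nodes.update((retweeter, source))
        if source not in colors:
            colors[source] = next(next_color)
        edges.append((retweeter, source, colors[source]))
    ranking = Counter(source for _, source in retweets).most_common()
    return nodes, edges, ranking


# Placeholder data, not taken from the monitored stream.
sample = [("u1", "source_a"), ("u2", "source_a"), ("u3", "source_b")]
nodes, edges, ranking = build_retweet_graph(sample)
print(ranking[0])  # the most retweeted source in the sample
```

The ranking returned here corresponds to the kind of observation made above about which account was the most retweeted source during the monitoring period.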

Salience vs. Commitment: Dynamics of Political Hashtags in Russian Twitter

In social media sites, higher levels of mentioning often correlate with higher levels of engagement (e.g., users tweet about a political rally), while false indicators of engagement are rare: if a user wishes to mention a political movement to disagree with it, she will often not use a tag or specific name referring to that movement, but use a variant of it (e.g., a Twitter user who wants Vladimir Putin out of power may use the tag #Putinout instead of #Putin when tweeting about the prime minister and future Russian president).

Barash, V. & Kelly, J. (2012) ‘Salience vs. Commitment: Dynamics of Political Hashtags in Russian Twitter’