Time-Aware Authorship Attribution for Short Text Streams

Our paper "Time-Aware Authorship Attribution for Short Text Streams", with Hosein Azarbonyad, Jaap Kamps, and Maarten Marx, has been accepted as a short paper at the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'15). \o/

Identifying the authors of short texts on Internet or social media based communication systems is an important tool against fraud and cybercrime. Besides the challenges raised by the limited length of these messages, the evolving language and writing styles of their authors make authorship attribution difficult. Most current short text authorship attribution approaches address only the challenge of limited text length. However, neglecting the second challenge may lead to poor attribution performance for authors who change their writing styles.

In this paper, we analyze the temporal changes in word usage by authors of tweets and emails and, based on this analysis, propose an approach to estimate the dynamicity of authors’ word usage. The proposed approach is inspired by time-aware language models and can be employed in any time-unaware authorship attribution method. Our experiments on tweets and the Enron email dataset show that the proposed time-aware authorship attribution approach significantly outperforms baselines that neglect the dynamicity of authors.
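
To give a rough feel for what "time-aware" can mean here, the sketch below weights each word occurrence by how recent its message is before building a smoothed unigram model per author. This is only an illustration and not the estimator from the paper: the exponential decay, the add-mu smoothing, and the names `decayed_counts`, `log_likelihood`, `attribute`, `decay`, and `mu` are all assumptions made for this example.

```python
import math
from collections import defaultdict


def decayed_counts(messages, now, decay):
    """Word counts where each message is weighted by exp(-decay * age).

    messages: list of (timestamp, text) pairs for one author.
    decay = 0 gives ordinary, time-unaware counts.
    """
    counts = defaultdict(float)
    for timestamp, text in messages:
        weight = math.exp(-decay * (now - timestamp))
        for token in text.lower().split():
            counts[token] += weight
    return counts


def log_likelihood(text, counts, vocab_size, mu=1.0):
    """Log-probability of text under an add-mu smoothed unigram model."""
    total = sum(counts.values())
    score = 0.0
    for token in text.lower().split():
        prob = (counts.get(token, 0.0) + mu) / (total + mu * vocab_size)
        score += math.log(prob)
    return score


def attribute(text, history, now, decay=0.1):
    """Return the author whose time-weighted model best explains text.

    history: dict mapping author name -> list of (timestamp, text).
    """
    vocab = set(text.lower().split())
    for messages in history.values():
        for _, message in messages:
            vocab.update(message.lower().split())

    def score(author):
        counts = decayed_counts(history[author], now, decay)
        return log_likelihood(text, counts, len(vocab))

    return max(history, key=score)


# Toy data (invented): the two authors swap phrasing over time.
history = {
    "alice": [(0, "cheers mate"), (10, "kind regards")],
    "bob": [(0, "kind regards"), (10, "cheers mate")],
}
print(attribute("cheers mate", history, now=10, decay=0.5))
```

With `decay=0` both toy authors have identical word counts and cannot be told apart; a positive decay lets the more recent messages dominate each author's model, which is the intuition behind accounting for changing writing styles.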

For more details, please read this paper: