Sources of Evidence for Automatic Indexing of Political Texts

Our paper “Sources of Evidence for Automatic Indexing of Political Texts”, with Hosein Azarbonyad, Jaap Kamps, and Maarten Marx, has been accepted as a short paper at the 37th European Conference on Information Retrieval (ECIR’15). \o/ Political texts on the Web, documenting laws and policies and the process leading to them, are of key importance to […]

Entity Linking by Focusing DBpedia Candidate Entities

Entity Linking (EL) is the task of detecting entity mentions in a text and linking them to the corresponding entries of a Knowledge Base. EL traditionally consists of three major parts: spotting, candidate generation, and candidate disambiguation. The performance of an EL system depends heavily on the accuracy of each individual part. Regarding […]

Authorship Identification Using Dynamic Selection of Features from Probabilistic Feature Set

Our paper “Authorship Identification Using Dynamic Selection of Features from Probabilistic Feature Set”, with Hamed Zamani, Hossein Nasr Esfahani, Pariya Babaie, Samira Abnar, and Azadeh Shakery, has been accepted at the Conference and Labs of the Evaluation Forum (CLEF’14). \o/ Authorship identification was introduced as one of the important problems in the law and journalism […]

Email Management in Multilingual Environments

Today, email has become one of the most prevalent communication media, allowing people to exchange information. The ease of this communication has produced a large volume of emails, causing a problem termed “Email Overloading”. Solving the email overloading problem is now urgent, and “Email Management” has emerged as a new […]

Email Datasets

To evaluate the effectiveness of my proposed conversation thread reconstruction method on multilingual emails, I created several datasets as part of my Master's thesis. Here are brief descriptions of the datasets and links for downloading them: The ConThread-BC3 Corpus Description: The ConThread-BC3 Corpus is a special preparation of a portion of the W3C corpus that […]