Luhn Revisited: Significant Words Language Models

Our paper "Luhn Revisited: Significant Words Language Models", with Hosein Azarbonyad, Jaap Kamps, Djoerd Hiemstra and Maarten Marx, has been accepted as a long paper at the 25th ACM International Conference on Information and Knowledge Management (CIKM'16). \o/ One of the key factors affecting search quality is the fact that our queries are ultra-short statements […]

Mixed-Language and Multilingual Document Processing

Mixed-language and multilingual text is growing rapidly on the Web. Processing this type of data poses additional challenges compared to monolingual information. As part of my MSc thesis, Email Management in Multilingual Environments, I focused on finding an efficient way to measure mixed-language and multilingual document similarity. In order to process multilingual emails, we […]