Working Papers – The Observatorium, Variation of Information and stuff…

Rodrigues, D. (submitted to ECCS’10); The Observatorium – The structure of news: topic monitoring in online media with mutual information

Abstract: Large, real-time text classification systems are becoming a popular topic. We present a method for automatically extracting correlated news from online media using a dynamic similarity graph, with the variation of information as a measure to identify topics, their lifespan and key terms. The presented method has the advantage of requiring no human intervention or training and of having no pre-assigned categories, because the categories emerge from the dynamics of the generated network.
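
For readers who haven't met the variation of information before, here is a minimal sketch of the textbook definition, VI(A, B) = H(A|B) + H(B|A), computed between two groupings of the same items. This is not the code behind the paper: the function name, the label lists and the use of natural logarithms (so the result is in nats) are all assumptions made for illustration.

```python
from collections import Counter
from math import log


def variation_of_information(labels_a, labels_b):
    """Variation of information (in nats) between two labelings.

    Hypothetical helper for illustration only: labels_a[i] and
    labels_b[i] are the groups assigned to item i by two different
    groupings of the same items.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    n = len(labels_a)
    if n == 0:
        return 0.0

    count_a = Counter(labels_a)               # group sizes in A
    count_b = Counter(labels_b)               # group sizes in B
    joint = Counter(zip(labels_a, labels_b))  # sizes of the overlaps

    vi = 0.0
    for (a, b), n_ab in joint.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        # Each overlap contributes to H(A|B) + H(B|A).
        vi -= p_ab * (log(p_ab / p_a) + log(p_ab / p_b))
    return vi


if __name__ == "__main__":
    same = variation_of_information([0, 0, 1, 1], [0, 0, 1, 1])
    unrelated = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])
    print(same, unrelated)
```

Two identical groupings give a distance of zero, and the value grows as the groupings share less information, which is what makes it usable as a distance between partitions.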

And this is the reason why I’ve been a bit away from blogging lately… Now, the next deadline is March 26 for Open Source Intelligence & Web Mining 2010, and then it’s off to Brussels for an ASSYST meeting. It’s going to be a full March…